1Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, 2Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1. *To whom correspondence should be addressed:
The human transcription factor (TF) CGGBP1 (“CGG Binding Protein”) is conserved only in amniotes, and is believed to derive from the zf-BED and Hermes transposase DNA-binding domains (DBDs) of a hAT DNA transposon. Here, we show that TFs with this bipartite domain structure have resulted from dozens of independent hAT domestications in different eukaryotic lineages. CGGBPs display a wide range of sequence specificities, usually including preferences for CGG or CGC trinucleotides, while some bind AT-rich motifs. The CGGBPs are almost entirely non-syntenic, and their protein sequences, DNA binding motifs, and patterns of presence or absence in genomes are uncharacteristic of ancestry via speciation. At least eight CGGBPs in the coelacanth Latimeria chalumnae bind distinct motifs, and the expression of the corresponding genes varies considerably across tissues, indicating two overlapping modes of neofunctionalization.