Web supplement to
"Extensive binding of uncharacterized human transcription factors to genomic dark matter"

Rozita Razavi1*, Ali Fathi1*, Isaac Yellan1*, Alexander Brechalov1*, Kaitlin U. Laverty1,2, Arttu Jolma1, Aldo Hernandez Corchado3, Hong Zheng1, Ally Yang1, Marjan Barazandeh1, Chun Hu1, Ilya Vorontsov4, Zain Patel1, The Codebook Consortium, Ivan Kulakovskiy5, Philipp Bucher6, Quaid Morris2, Hamed S. Najafabadi3,7, and Timothy R. Hughes1**

1Donnelly Centre and Department of Molecular Genetics, 160 College Street, Toronto, ON M5S 3E1 CANADA
2Memorial Sloan Kettering Cancer Center, Rockefeller Research Laboratories, New York, NY 10065, USA
3Victor P. Dahdaleh Institute of Genomic Medicine, 740 Dr. Penfield Avenue, Room 7202, Montréal, Québec, H3A 0G1, Canada
4Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, Moscow, Russia
5Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Russia
6Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
7Department of Human Genetics, McGill University, Montréal, Québec, H3A 0C7, Canada

*these authors contributed equally

**To whom correspondence should be addressed:

Abstract

Most of the human genome is thought to be non-functional, and includes large segments often referred to as “dark matter” DNA. The genome also encodes hundreds of putative and poorly characterized transcription factors (TFs). We determined genomic binding locations of 166 uncharacterized human TFs in living cells. Nearly half of them associated strongly with known regulatory regions such as promoters and enhancers, often at conserved motif matches and co-localizing with each other. Surprisingly, the other half often associated with genomic dark matter, at largely unique sites, via intrinsic sequence recognition. Dozens of these, which we term “Dark TFs”, mainly bind within regions of closed chromatin. Dark TF binding sites are enriched for transposable elements, and are rarely under purifying selection. Some Dark TFs are KZNFs, which contain the repressive KRAB domain, but many are not: the Dark TFs also include known or potential pioneer TFs. Compiled literature information supports that the Dark TFs exert diverse functions ranging from early development to tumor suppression. Thus, our results shed light on a large fraction of previously uncharacterized human TFs and their unappreciated activities within the dark matter genome.

Supplemental Files.