Web supplement to
"Reconstructing the sequence specificities of RNA-binding proteins across eukaryotes"

Alexander Sasse1,2,3,4,*, Debashish Ray2,*, Kaitlin Laverty1,2,4,5,*, Cyrus L. Tam5,6, Mihai Albu2, Hong Zheng2, Olga Lyudovyk5,6, Kate Nie1,2,4, Cedrik Magis7,8, Cedric Notredame7,8, Matthew T. Weirauch9,10,‡, Timothy R. Hughes1,2,‡, Quaid Morris1,2,4,5,6,11,‡

1Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
2Donnelly Centre, University of Toronto, Toronto, ON, Canada
3Department of Computer Science, University of Washington, Seattle, WA, USA
4Vector Institute, Toronto, ON, Canada
5Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
6Graduate Program in Computational Biology and Medicine, Weill-Cornell Graduate School, New York, NY, USA
7Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
8Universitat Pompeu Fabra, Barcelona, Spain
9Center for Autoimmune Genomics and Etiology, Divisions of Allergy & Immunology, Human Genetics, Biomedical Informatics and Developmental Biology, Cincinnati Children’s Hospital, Cincinnati, OH, USA
10Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
11Ontario Institute for Cancer Research, Toronto, ON, Canada

*these authors contributed equally
‡To whom correspondence should be addressed:

Abstract

RNA-binding proteins (RBPs) are key regulators of gene expression. Here, we introduce RBPzoo — a resource of RNAcompete-derived in vitro RNA-binding data for 379 RBPs from 33 diverse eukaryotes. We develop a new method, Joint Protein-Ligand Embedding (JPLE), to map specificity-determining peptides to corresponding RNA motifs for 28,667 RBPs from 690 eukaryotes. We illustrate the broad utility of this resource by inferring post-transcriptional function for 12 eukaryotic RBPs in mRNA stability and reconstructing the evolution of 2,568 RNA motifs. For the latter, we identify a universal set of 19 RNA motifs conserved between plants and metazoa and observe rapid motif evolution arising from whole genome duplications in vertebrate ancestors. RBPzoo represents a powerful resource for the study of gene regulation for any organism with an annotated genome.

Supplementary Data Tables

Array Information, Raw and Processed Data

Z-Scores

Motifs

Code