Introduction
This project aims to establish a non-redundant set of mouse
genes by mapping each known and predicted cDNA to its
location on the latest genome build and grouping
significantly overlapping genes into clusters.
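The clustering step described above can be sketched roughly as follows. This is only an illustration, not the project's actual method: the `Hit` record, the overlap measure (shared length divided by the shorter hit) and the 0.5 threshold are all assumptions, since this page does not define "significantly overlapping". See the documentation for the real criteria.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """One cDNA mapped to the genome (hypothetical record layout)."""
    seq_id: str
    chrom: str
    strand: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared length divided by the length of the shorter hit (assumed measure)."""
    if a.chrom != b.chrom or a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / min(a.end - a.start, b.end - b.start)


def cluster_hits(hits: List[Hit], min_fraction: float = 0.5) -> List[List[Hit]]:
    """Group hits so that any two hits linked by a chain of overlaps share a cluster."""
    parent = list(range(len(hits)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; fine for a sketch, too slow for a full genome.
    for i in range(len(hits)):
        for j in range(i + 1, len(hits)):
            if overlap_fraction(hits[i], hits[j]) >= min_fraction:
                parent[find(i)] = find(j)

    groups: Dict[int, List[Hit]] = {}
    for i, hit in enumerate(hits):
        groups.setdefault(find(i), []).append(hit)
    return list(groups.values())
```

A production version would sort hits by position and sweep along each chromosome instead of comparing every pair.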
Downloads
(Representative) IRC gene list - 48k genes (25 MB)
(Representative) IRC cDNAs Fasta file (90 MB)
(Representative) IRC proteins Fasta file (25 MB)
(Anchor) IRC gene list - 48k genes (25 MB)
(Anchor) IRC cDNAs Fasta file (90 MB)
(Anchor) IRC proteins Fasta file (25 MB)
Full Cluster Results (93 MB)
Full Cluster Results -- Compressed Format (3 MB)
Column Headings -- Lists
Column Headings -- Clusters
"Anchor" refers to the longest ORF in a cluster. Often, the anchor is
a predicted sequence. "Representative" refers to the MGI or RefSeq transcript in a cluster.
If a cluster contains no such sequence, the longest ORF is taken instead.
See the documentation for full details.
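As a rough illustration of the difference between the two lists, the selection rules might look like the sketch below. The `Transcript` record and the identifiers in the usage example are hypothetical. The rule for choosing among several MGI or RefSeq transcripts in one cluster (longest ORF wins) is an assumption, because this page does not say how such ties are resolved.

```python
from typing import List, NamedTuple


class Transcript(NamedTuple):
    """One clustered sequence (hypothetical record layout)."""
    seq_id: str
    source: str      # e.g. "MGI", "RefSeq", "EnsEMBL", "GeneID"
    orf_length: int  # ORF length in nucleotides


CURATED_SOURCES = ("MGI", "RefSeq")


def pick_anchor(cluster: List[Transcript]) -> Transcript:
    """Anchor: the sequence with the longest ORF, whatever its source."""
    return max(cluster, key=lambda t: t.orf_length)


def pick_representative(cluster: List[Transcript]) -> Transcript:
    """Representative: an MGI or RefSeq transcript if present, else the anchor."""
    curated = [t for t in cluster if t.source in CURATED_SOURCES]
    if curated:
        # Assumption: with several curated transcripts, take the longest ORF.
        return max(curated, key=lambda t: t.orf_length)
    return pick_anchor(cluster)


cluster = [
    Transcript("pred_1", "GeneID", 3000),
    Transcript("refseq_1", "RefSeq", 2400),
    Transcript("ens_1", "EnsEMBL", 2700),
]
anchor = pick_anchor(cluster)
representative = pick_representative(cluster)
```

In this example the anchor is the GeneID prediction (longest ORF), while the representative is the RefSeq transcript even though its ORF is shorter.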
IRC - EnsEMBL Lookup Table (obtained by best reciprocal match using megablast)
IRC - RefSeq Lookup Table (obtained by best reciprocal match using megablast)
EnsEMBL Gene - EnsEMBL Transcript Lookup Table (extracted from EnsEMBL FASTA file)
Putative TF (MRK, EnsEMBL, RefSeq, IRC IDs)
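The "best reciprocal match" used for the lookup tables above can be illustrated with a small sketch. It assumes the megablast results have already been parsed into `(query_id, subject_id, score)` tuples. That tuple layout, the function names and the use of a single score to rank hits are assumptions, not a description of the scripts actually used.

```python
from typing import Dict, Iterable, List, Tuple

# (query_id, subject_id, score): assumed layout of already-parsed search results
ScoredHit = Tuple[str, str, float]


def best_hits(hits: Iterable[ScoredHit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best(
    a_vs_b: Iterable[ScoredHit], b_vs_a: Iterable[ScoredHit]
) -> List[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)
```

A sequence whose best hit points at a partner that prefers a different sequence is left out of the table, which is why a lookup table built this way may not cover every IRC gene.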
Original cDNA Databases
MGI (83 MB)
SGP (36 MB)
Riken Fantom (121 MB)
GeneID (39 MB)
RefSeq (57 MB)
EnsEMBL (65 MB)
EnsEMBL-Abinitio (59 MB)
Unigene (126 MB)
Pseudogene (4 MB)
EnsEMBL-Pseudo (1.5 MB)
Documentation (DOC) (HTML)
Frequently Asked Questions (FAQs) (HTML)
Ordered Genome Build 32 can be downloaded by chromosome here.
The MRK_Sequence file from MGI that we used can be downloaded here.