Publications - Supplemental material

Please find below supplemental material corresponding to publications of our group. Currently, we list 132 supplements. If you have problems accessing electronic information, please let us know.

© NOTICE: All documents are copyrighted by the authors. If you would like to use all or a portion of any paper, please contact the author.

This supplement is also available at http://www.bioinf.uni-leipzig.de/publications/supplements/10-010
You may use this URL to cite or link to us.

BIOINF 10-010: Computational discovery of human coding and non-coding transcripts with conserved splice sites

Dominic Rose, Michael Hiller, Katharina Schutt, Jörg Hackermüller, Rolf Backofen, Peter F. Stadler

Supplemental PDF

Additional figures and tables are available as a separate PDF.

LOSS: tree-based (L)og-(O)dds (S)ubstitution (S)cores

Proof-of-concept perl implementation: LOSS.tar.gz

The script evaluates the substitution patterns of multiple FASTA alignments along a phylogenetic tree and assigns a log-odds score to each alignment. Scores above 0 indicate that the alignment contains substitutions typical of real splice sites; negative scores indicate the opposite. Boxplots of the score distributions are given in the supplemental PDF and help to interpret the score further. Keep in mind that the model was trained on the UCSC 44-way multiz alignments with hg18 as the reference.
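To illustrate how a log-odds score of this kind behaves, here is a minimal Python sketch. It is not the LOSS implementation: the function name `log_odds_score`, the toy probability tables, and the representation of substitutions as (parent, child) nucleotide pairs on tree branches are all assumptions made for illustration, and the numbers are not the trained parameters.

```python
import math

def log_odds_score(substitutions, p_splice, p_background):
    """Sum log2 odds over (parent, child) nucleotide pairs.

    Each pair stands for one substitution (or conservation) event on a
    branch of the phylogenetic tree. Hypothetical helper, not part of LOSS.
    """
    score = 0.0
    for pair in substitutions:
        score += math.log2(p_splice[pair] / p_background[pair])
    return score

# Toy probabilities P(child | parent), invented for this example only.
p_splice = {("G", "G"): 0.8, ("G", "A"): 0.2}
p_background = {("G", "G"): 0.4, ("G", "A"): 0.6}

# An alignment column that stays G on two branches, and one that mutates to A.
conserved = [("G", "G"), ("G", "G")]
divergent = [("G", "A"), ("G", "A")]

print(log_odds_score(conserved, p_splice, p_background))
print(log_odds_score(divergent, p_splice, p_background))
```

With these toy tables, the conserved pattern is more likely under the splice-site model and scores above 0, while the divergent pattern scores below 0, matching the interpretation given above.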

Predicted splice sites

The analysis was carried out on hg18. We also provide genomic coordinates for hg19, obtained with UCSC's liftOver tool.
Novel donor (5') splice sites:
  hg18: BED (927,693 entries, p>0.5)
  hg19: BED.mapped (927,660 entries, p>0.5)
Novel acceptor (3') splice sites:
  hg18: BED (2,497,067 entries, p>0.5)
  hg19: BED.mapped (2,496,984 entries, p>0.5)
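The hg19 files list slightly fewer entries than the hg18 files, presumably because liftOver could not map every site. A minimal Python sketch for checking entry counts in a downloaded BED file follows; the function name `count_bed_records` and the toy input are hypothetical, and the sketch assumes that header lines start with `track`, `browser`, or `#`.

```python
import io

def count_bed_records(handle):
    """Count data lines in a BED stream, skipping blank and header lines."""
    n = 0
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        n += 1
    return n

# Toy BED content, invented for illustration (not taken from the supplement).
toy_bed = "track name=example\nchr1\t100\t102\tsite1\nchr1\t200\t202\tsite2\n"
print(count_bed_records(io.StringIO(toy_bed)))

# Differences between the entry counts listed above for hg18 and hg19.
lost_donor = 927_693 - 927_660
lost_acceptor = 2_497_067 - 2_496_984
print(lost_donor, lost_acceptor)
```

For a real file, pass an open file handle instead of the `io.StringIO` object.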

Splice site derived exons

Novel exons (which passed the EST-SVM):
  hg18: BED FASTA (8,832 entries, p>0.5)
  hg19: BED.mapped (8,829 entries, p>0.5)
Putative coding exons:
  hg18: BED FASTA (938 entries, p>0.5)
  hg19: BED.mapped (937 entries, p>0.5)
Putative non-coding exons:
  hg18: BED FASTA (7,894 entries, p>0.5)
  hg19: BED.mapped (7,892 entries, p>0.5)

Inferred gene structures

Predicted genes (exon-cluster):
  hg18: GFF (336 genes, 734 exons, p>0.5)
  hg19: GFF.mapped (336 genes, 734 exons, p>0.5)
Putative coding genes:
  hg18: GFF (48 genes, 114 exons, p>0.5)
  hg19: GFF.mapped (48 genes, 114 exons, p>0.5)
Putative non-coding genes:
  hg18: GFF (241 genes, 503 exons, p>0.5)
  hg19: GFF.mapped (241 genes, 503 exons, p>0.5)

Bulk download

Download all custom tracks:
  hg18: hg18.tar.gz (55 MB)
  hg19: hg19.tar.gz (57 MB)

UCSC Genome Browser links

Selected custom tracks at the UCSC Genome Browser (EST-SVM exons, exon-cluster):
  hg18: hg18.customTracks
  hg19: hg19.customTracks