NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

Otter das


The otter_das servers for Havana manual annotations

The otter_das / Havana Gene Models server

Address

http://das.sanger.ac.uk/das/otter_das

Description

This DAS server is set up on top of the Otter database that collects the data curated by the manual annotators. It returns all genes and transcripts with individual exons, introns and UTRs.

The database stores data in mixed coordinate systems. Where needed, the otter_das server transforms the data into the NCBI_36 chromosomal coordinate system, so all data is returned in NCBI_36 coordinates.
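Because every response is in NCBI_36 chromosomal coordinates, a region request can be written directly in those coordinates. The following is a minimal sketch of building such a request URL. It assumes the standard DAS 1.x `features` command with a `segment=chromosome:start,stop` query, which comes from the DAS specification rather than from this page; the function name `features_url` is hypothetical.

```python
from urllib.parse import quote

# Server address as given on this page.
BASE_URL = "http://das.sanger.ac.uk/das/otter_das"


def features_url(chromosome, start, stop, base_url=BASE_URL):
    """Build a DAS 'features' request URL for one NCBI_36 region.

    Assumption: the server accepts the usual DAS 1.x query form
    'features?segment=<chromosome>:<start>,<stop>'.
    """
    if start > stop:
        raise ValueError("start must not be greater than stop")
    segment = f"{chromosome}:{start},{stop}"
    # Keep ':' and ',' readable; escape anything else unusual.
    return f"{base_url}/features?segment={quote(segment, safe=':,')}"
```

For example, `features_url("22", 19173435, 19173436)` yields a URL for the region used in the sample response on this page. Whether the server is still reachable is not something this sketch checks.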

Entry in the DAS-Registry

Display in DAS clients

To enable this DAS server in Ensembl, follow one of these two links:

http://tinyurl.com/32yv37

http://preview.tinyurl.com/32yv37

Format

(Subject to change)

<DASGFF>
 <GFF version="1.01" href="http://das.sanger.ac.uk:80/das/otter_das/features">
   <SEGMENT id="22" version="1.0" start="19173435" stop="19173436">
     <FEATURE id="OTTHUMT00000320045" label="OTTHUMT00000320045">
       <TYPE id="UTR_protein_coding_KNOWN" category="transcription">UTR_protein_coding_KNOWN</TYPE>
       <START>19180047</START>
       <END>19180170</END>
       <METHOD id="KNOWN">KNOWN</METHOD>
       <SCORE>1</SCORE>
       <PHASE>-1</PHASE>
       <ORIENTATION>-1</ORIENTATION>
       <TARGET id="XXbac-B562F10.9-001" start="19125806" stop="19180170">XXbac-B562F10.9-001</TARGET>
       <GROUP id="OTTHUMT00000320045" type="OTTHUMG00000150778">
       <NOTE>kelch-like 22 (Drosophila)</NOTE>
       <LINK href="http://vega.sanger.ac.uk/Homo_sapiens/transview?transcript=OTTHUMT00000320045">show in vega transcript view</LINK>
       </GROUP>
     </FEATURE>
   </SEGMENT>
 </GFF>
</DASGFF>
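A response in this format can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the example above. The function name `parse_features` and the output field names are hypothetical, and reading the GROUP `type` attribute as the gene identifier is an assumption based only on the `OTTHUMG` prefix in the example. Since the format is marked as subject to change, treat this as an illustration rather than a stable parser.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response shown above.
SAMPLE = """<DASGFF>
 <GFF version="1.01" href="http://das.sanger.ac.uk:80/das/otter_das/features">
   <SEGMENT id="22" version="1.0" start="19173435" stop="19173436">
     <FEATURE id="OTTHUMT00000320045" label="OTTHUMT00000320045">
       <TYPE id="UTR_protein_coding_KNOWN" category="transcription">UTR_protein_coding_KNOWN</TYPE>
       <START>19180047</START>
       <END>19180170</END>
       <ORIENTATION>-1</ORIENTATION>
       <GROUP id="OTTHUMT00000320045" type="OTTHUMG00000150778">
         <NOTE>kelch-like 22 (Drosophila)</NOTE>
       </GROUP>
     </FEATURE>
   </SEGMENT>
 </GFF>
</DASGFF>"""


def parse_features(xml_text):
    """Return one dict per FEATURE element in a DASGFF document."""
    root = ET.fromstring(xml_text)
    records = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            group = feature.find("GROUP")
            records.append({
                "chromosome": segment.get("id"),
                "transcript_id": feature.get("id"),
                "type": feature.findtext("TYPE"),
                "start": int(feature.findtext("START")),
                "end": int(feature.findtext("END")),
                # Kept as text because the encoding may vary between servers.
                "orientation": feature.findtext("ORIENTATION"),
                # Assumption: GROUP/@type carries the gene identifier.
                "gene_id": group.get("type") if group is not None else None,
                "note": group.findtext("NOTE") if group is not None else None,
            })
    return records
```

Calling `parse_features(SAMPLE)` returns a single record for transcript OTTHUMT00000320045 on chromosome 22.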

The otter_das_pep / Havana Translations server

Address

http://das.sanger.ac.uk/das/otter_das_pep

Description

This DAS server is also set up on top of the Otter database that collects the data curated by the manual annotators. It returns the translations of transcripts if available.

Entry in the DAS-Registry

Format

<DASSEQUENCE>
 <SEQUENCE id="OTTHUMT00000320620" start="1" stop="288" moltype="Protein" version="1.0">
   XRPHLQNQWRKRKMTTWSCLVAMIVSGVITAVWAVRAAPIWRSQVKQKMRIGKQGNCRPPRCI...
 </SEQUENCE>
</DASSEQUENCE>

The otter_das_trans / Havana Transcripts server

Address

http://das.sanger.ac.uk/das/otter_transcripts

Description

This DAS server is also set up on top of the Otter database that collects the data curated by the manual annotators. It returns the DNA sequence of transcripts.

Entry in the DAS-Registry

Format

<DASSEQUENCE>
  <SEQUENCE id="OTTHUMT00000320620" start="1" stop="864" moltype="DNA" version="2007-10-25T14:14:18+0100">
    GAGACCCCATCTTCAGAACCAATGGAGGAAGAGGAAGATGACGACTTGGAGCTGTTTGGTGGCTATGATAGTTTCCGGAGTTATAACAGCAGTGTGGGCAGTGAG
  </SEQUENCE>
</DASSEQUENCE>
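Both sequence servers (translations and transcripts) return the same DASSEQUENCE layout, so one small reader can handle either. The sketch below strips whitespace from the sequence text and compares its length with the span declared by the `start` and `stop` attributes. The function name `parse_sequences`, the output field names, and the tiny sample document (including its identifier) are made up for illustration; the examples printed on this page are truncated, so they would be reported as incomplete.

```python
import xml.etree.ElementTree as ET

# Made-up miniature response; not real server output.
SAMPLE = """<DASSEQUENCE>
  <SEQUENCE id="EXAMPLE_ID" start="1" stop="12" moltype="DNA" version="1.0">
    GAGACC
    CCATCT
  </SEQUENCE>
</DASSEQUENCE>"""


def parse_sequences(xml_text):
    """Return one dict per SEQUENCE element in a DASSEQUENCE document."""
    root = ET.fromstring(xml_text)
    results = []
    for seq in root.iter("SEQUENCE"):
        # Remove line breaks and indentation inside the element text.
        residues = "".join((seq.text or "").split())
        declared = int(seq.get("stop")) - int(seq.get("start")) + 1
        results.append({
            "id": seq.get("id"),
            "moltype": seq.get("moltype"),
            "sequence": residues,
            "declared_length": declared,
            # True only if the text length matches the start/stop span.
            "complete": len(residues) == declared,
        })
    return results
```

The length check is a cheap way to notice truncated downloads before using the sequence further.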

Script access

These scripts use the DAS servers to fetch the most up-to-date HAVANA annotations from the otter database:

* dump_translations_from_region.pl.gz: script to fetch translations of otter transcripts from a given region
* genes_das2gff.pl.gz: script to fetch genes and transcripts from a given region and print them in GFF format (version: 2009-01-14)

They are compressed with the Unix gzip program. Perl modules required:

To verify the output of genes_das2gff.pl, compare the number of transcripts and genes with the expected numbers (counted directly in the database) in this list. To count the transcripts in the output file, you can run something like:

grep 'ID=OTTHUMT' filename.gff | cut -f 9 | cut -d';' -f1 | cut -d'=' -f2 | sort -u | wc -l
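The same count can be sketched in Python, which may be easier to extend (for example, to count genes as well). This approximates the shell pipeline under the same assumption the pipeline makes: the ninth tab-separated column starts with an attribute of the form `ID=OTTHUMT...`. The function name `count_transcripts` is hypothetical, and the exact GFF layout written by genes_das2gff.pl is not documented here.

```python
def count_transcripts(lines):
    """Count distinct transcript IDs in GFF lines.

    Mirrors: grep 'ID=OTTHUMT' | cut -f 9 | cut -d';' -f1
             | cut -d'=' -f2 | sort -u | wc -l
    """
    ids = set()
    for line in lines:
        if "ID=OTTHUMT" not in line:
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a full GFF record
        first_attribute = columns[8].split(";")[0]
        parts = first_attribute.split("=")
        if len(parts) >= 2:
            ids.add(parts[1])
    return len(ids)
```

A typical call would be `with open("filename.gff") as handle: print(count_transcripts(handle))`, where `filename.gff` stands for the script's output file as in the shell command.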