NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

Controlled vocabularies for ENCODE data reporting


This page lists the controlled vocabulary (CV) terms used for ENCODE data submission. Every controlled vocabulary term referenced in the ENCODE metadata submission files (DAFs and DDFs) must be registered before the data is submitted to the DCC.

The registered terms section at the bottom of this page displays the contents of the CV. The current file can be downloaded here: CV File.

To register a new term, add it to the appropriate table below and inform your data wrangler. The DCC will register your entry, and it will shortly appear in the controlled vocabulary (CV) file.
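The registration requirement above can be checked before submission. The following is a minimal sketch, not the DCC validator: the `REGISTERED` dictionary and the `unregistered_terms` function are hypothetical names, and the terms in it are only the example rows shown on this page (whether each one is actually registered is an assumption), not the full CV file.

```python
# Hypothetical registry: term type -> set of registered terms.
# Values are example rows from this page, NOT the complete CV.
REGISTERED = {
    "control": {"Control"},
    "fragSize": {"1k"},
    "localization": {"nucleus"},
    "sex": {"M", "F", "B", "U"},
}


def unregistered_terms(metadata, registered=REGISTERED):
    """Return sorted (type, term) pairs from metadata that are not registered."""
    missing = []
    for term_type, term in metadata.items():
        # An unknown term type has no registered terms at all.
        if term not in registered.get(term_type, set()):
            missing.append((term_type, term))
    return sorted(missing)


# Illustrative metadata, as it might be collected from a DDF row.
example = {"localization": "nucleus", "fragSize": "2k", "sex": "F"}
print(unregistered_terms(example))  # [('fragSize', '2k')]
```

Any pair reported by such a check would need to be added to the appropriate table on this page and passed to your data wrangler before the submission can validate.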

Antibodies

Antibodies are registered here.

Cell Types

Human cell types are registered here.

Mouse cell lines and tissues are registered here.

Controls

Term: control Now distinct from Input.

Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Term | Description | Added By
Control | This data represents a control being compared with the other tracks in the set. | DCC
Harvard | Input library prepared at Harvard | Snyder

Registered Controls

The following special control terms have been registered to the ENCODE validator, and can be used in data submissions: <embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=control{scrolling=auto}{height=420}</embedurl>
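Every registered-terms listing on this page is served by the same CGI, with only the `type` parameter changing. As a small sketch, the URL for a given term type could be built as follows. The helper name `vocab_url` is hypothetical, the code only constructs the address without fetching it, and the hgwdev server may no longer be reachable from this archived copy.

```python
from urllib.parse import urlencode

# Base address as it appears in the embedded listings on this page.
BASE = "http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab"


def vocab_url(term_type):
    """Build the listing URL for one CV term type (hypothetical helper)."""
    return BASE + "?" + urlencode({"type": term_type})


for term_type in ("control", "fragSize", "localization"):
    print(vocab_url(term_type))
```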

Fragment Size

Term: fragSize
Updated: 2008-10-07
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Fragment Size | Description
1k | 1kb DNA fragments

Registered fragSize

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=fragSize{scrolling=auto}{height=170}</embedurl>

Grants

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=grant{scrolling=auto}{height=600}</embedurl>


Labs

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=lab{scrolling=auto}{height=600}</embedurl>

Localizations

Term: localization Also known as Subcellular localizations or Cell components.
Updated: 2008-07-23
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Localization | Description | termId | Source
nucleus | Large membrane bound part of cell containing chromosomes and the bulk of the cell's DNA | GO:0005634 | [1]

Registered localizations

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=localization{scrolling=auto}{height=300}</embedurl>



Map Algorithms

Term: mapAlgorithm
Updated: 2008-07-23
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Algorithm | Description
erng3 | erange v3.0
erng32a | erange v3.2.0 alpha

Registered Map Algorithms

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=mapAlgorithm{scrolling=auto}{height=320}</embedurl>

Phases

Term: phase
The following phases were defined by the Stam lab.
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Phase | Description | Added By
G1 | flow sorted, G1 phase | Stam

Registered Phases

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=phase{scrolling=auto}{height=120}</embedurl>

Promoters

Term: promoter
The following promoters are used by the Elnitski (NHGRI) Negative Regulatory Elements (NRE) data type.
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Promoter | Description | Added By
G-gamma | promoter for the G-gamma globin genes | Elnitski

Registered Promoters

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=promoter{scrolling=auto}{height=150}</embedurl>

Protocols

Term: protocol
The following protocols are used by HudsonAlpha. Updated: 2009-12-09
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Protocol | Description | Added By
PCR1x | 1 cycle of PCR | Myers
Yale_TFBS(Struhl) | Cells grown using the Yale TFBS (Struhl) growth protocol. | Struhl
v042211.1 | Faster ChIP protocol & AMPure XP size selection for ChIP-seq. | Myers
v042211.2 | Faster ChIP protocol & gel size selection for ChIP-seq. | Myers

Registered Protocols

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=protocol{scrolling=auto}{height=300}</embedurl>

Read Types

Term: readType
Updated: 2012-03-20
Enter new terms into the following table of yet to be registered terms:

ReadType | Description
1x41 | Single 41 nt reads
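
The example term suggests a `<count>x<length>` naming pattern. The sketch below expands such a term into a description. It rests on an assumption generalized from the single row above: that a leading `1` means single reads and a leading `2` means paired reads. The function name `parse_read_type` is hypothetical, and registered terms with other shapes would be rejected by it.

```python
import re


def parse_read_type(read_type):
    """Expand a '<count>x<length>' term into a description (assumed pattern)."""
    match = re.fullmatch(r"([12])x(\d+)", read_type)
    if match is None:
        raise ValueError(f"unrecognized readType: {read_type!r}")
    count, length = int(match.group(1)), int(match.group(2))
    # Assumption: 1 = single reads, 2 = paired reads.
    layout = "Single" if count == 1 else "Paired"
    return f"{layout} {length} nt reads"


print(parse_read_type("1x41"))  # Single 41 nt reads
```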


Registered readTypes

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=readType{scrolling=auto}{height=280}</embedurl>


RNA extracts

Term: rnaExtract
Updated: 2008-07-23
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

RNA extract | Description
longPolyA | Poly(A)+ RNA longer than 200 nt

Registered RNA extracts

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=rnaExtract{scrolling=auto}{height=330}</embedurl>

Sequence Platforms

Term: seqPlatform
The following platforms are defined by Gene Expression Omnibus (GEO) for data submission.

Registered Sequence Platforms

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=seqPlatform{scrolling=auto}{height=200}</embedurl>


Sex

Term | Description
M | Male
F | Female
B | Both
U | Unknown
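
Because the sex vocabulary is a small closed set, it can be represented as a simple lookup. This is an illustrative sketch only: the names `SEX_TERMS` and `describe_sex` are hypothetical, and accepting lowercase input or surrounding whitespace is a convenience assumed here, not something this page specifies.

```python
# The four sex terms and descriptions listed in the table above.
SEX_TERMS = {"M": "Male", "F": "Female", "B": "Both", "U": "Unknown"}


def describe_sex(code):
    """Return the description for a sex term, or raise if it is not listed."""
    key = code.strip().upper()  # assumed leniency, not part of the CV
    if key not in SEX_TERMS:
        raise ValueError(f"not a listed sex term: {code!r}")
    return SEX_TERMS[key]


print(describe_sex("F"))  # Female
```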



Treatments

Term: treatment
The following treatments were defined by Yale/UCDavis/Harvard, HudsonAlpha and Duke/UNC/UT.
Enter new terms into the following table of yet to be registered terms (which includes one already registered term as an example):

Treatment | Description | Added By
serum_free_media (EXAMPLE) | grown in serum free media (EXAMPLE) | Crawford (EXAMPLE)

Registered Treatments

<embedurl>http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=treatment{scrolling=auto}{height=1000}</embedurl>