NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

January 2011 Data Freeze


The current ENCODE data freeze date is January 21, 2011. The data submission pipeline is currently closed and is expected to re-open on Wednesday, February 9.

Goals and Policy

This freeze is the cutoff for data to be included in the upcoming analysis paper, and it applies specifically to Human ENCODE data. Mouse ENCODE submissions made during the freeze will be processed by the DCC, but they are not officially part of the freeze and are not covered by the freeze management plan described here. This will be a 'hard freeze', managed similarly to the January 2010 mid-course evaluation freeze. The data submission pipeline will close at midnight PST on the freeze date and will remain closed for an estimated 2 weeks so that wranglers can focus on vetting, configuring, and associating metadata with freeze submissions.

Freeze Progress

Latest spreadsheet of submitted experiments: Media:EncodeExperimentsHg19.2010-02-09.xlsx

Summary as of February 3, 2011:

Data_Type	Total
Affy Exon-array	118
ChIP-seq	793
CNV	61
Combined	7
DNase-DGF	23
DNase-seq	75
FAIRE-seq	20
Genes	2
Methyl 27	62
Methyl RRBS	86
MethylSeq	44
Proteogenomics	2
RIP-chip	24
RIP-seq	8
RNA-chip	26
RNA-seq	177
SwitchGear	1
Grand Total	1578
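
A quick way to cross-check the summary is to add up the per-type rows and compare the result with the Grand Total line. The sketch below is illustrative only: the dictionary is transcribed by hand from the table above, and the names `COUNTS`, `STATED_GRAND_TOTAL`, and `check_total` are hypothetical, not part of any DCC tool. A nonzero difference would mean that the rows listed on this page do not account for every experiment in the stated total.

```python
# Per-data-type experiment counts, transcribed from the summary table above.
COUNTS = {
    "Affy Exon-array": 118,
    "ChIP-seq": 793,
    "CNV": 61,
    "Combined": 7,
    "DNase-DGF": 23,
    "DNase-seq": 75,
    "FAIRE-seq": 20,
    "Genes": 2,
    "Methyl 27": 62,
    "Methyl RRBS": 86,
    "MethylSeq": 44,
    "Proteogenomics": 2,
    "RIP-chip": 24,
    "RIP-seq": 8,
    "RNA-chip": 26,
    "RNA-seq": 177,
    "SwitchGear": 1,
}

# Value of the "Grand Total" line as it appears on the page.
STATED_GRAND_TOTAL = 1578


def check_total(counts, stated_total):
    """Return (sum of the listed rows, stated total minus that sum)."""
    row_sum = sum(counts.values())
    return row_sum, stated_total - row_sum


row_sum, difference = check_total(COUNTS, STATED_GRAND_TOTAL)
print(f"rows listed: {len(COUNTS)}")
print(f"sum of rows: {row_sum}")
print(f"stated total minus sum of rows: {difference}")
```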

Expected Submissions

  • Expected Submissions contains 3 Google spreadsheets (one each for RNA, ChIP, and Other data types) where labs should list the specifics of any further datasets that they plan to submit during the freeze. The spreadsheets should be filled out by Dec 1, 2010. This will help the DCC plan for the large flow of data and make sure all of the appropriate controlled vocabulary is registered.

RNA tracks

[Embedded spreadsheet: expected RNA submissions]

ChIP tracks

Stanford ChIP expected submissions are in: File:Snyder Jan15 2011freezedatasets 12 7 10.xls

DNase tracks

[Embedded spreadsheet: expected DNase submissions]

Other tracks

[Embedded spreadsheet: expected Other submissions]

Data Approval Calls

  • Approval Calls is for scheduling approval teleconferences for data submitted during the freeze. Labs are urged to review their data carefully in the browser as soon as their wrangler notifies them that it is ready for review. Review should focus on data quality, completeness, and metadata correctness, not on the cosmetics of the browser display. During the call, the status of each dataset (approved for analysis or revoked) will be finalized. These calls should be scheduled for the week of Feb 7 - Feb 11, 2011. Labs should enter their proposed call times by January 21, 2011.

Labs are asked to fill in proposed times and attendees in the table below.

Lab Wrangler Lab approver(s) Dates/times proposed by lab Scheduled date/time Results
Broad Venkat, Tim Noam Shoresh, Chuck Epstein
Duke Tim Terry Furey Done for AffyExon,
Caltech Cricket Georgi Marinov Feb 11 11:00 PST or Feb 10 anytime PST
HudsonAlpha Venkat, Cricket Flo Pauli (HudsonAlpha) 1:30 pm PST Feb 7
BU and NHGRI Brian, Cricket Tom Tullius, Steve Parker, Elliott Margulies DONE In pushQ
Transcriptome Cricket Carrie Davis (CSHL) 9:00 am PST Feb 9
UMass-Dekker Kate Bryan Lajoie
UW Venkat John, Raj, Richard Feb 8, 1:00 PST In pushQ
SUNY Cricket Scott and Frank 10 am PST Feb 9 or 10 am PST Feb 8 ???
SYDH Venkat, Cricket Steve Landt, Philip Cayting 1 PM PST Feb 9
"UNC/BSU" Melissa Morgan Giddings, Jainab Khatun 10AM PST Feb 22