NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

2011-01-21 DCC Progress


January 2011 freeze

The January 2011 data freeze ends TODAY, January 21, 2011. Important dates for this freeze are:

  • January 21: Pipeline closes at midnight PST, followed by 1 week of intensive wrangler processing
  • Feb 7-11: Data approval calls between production labs and DCC

Approval calls are scheduled here: January_2011_Data_Freeze#Data_Approval_Calls

The detailed freeze plan and expected submission spreadsheets are here: January_2011_Data_Freeze.

Data submission and track status

Pipeline reporting currently shows a total of 1354 experiments validated on hg19 (229 submitted this month). A spreadsheet of hg19 submissions as reported by the pipeline is: Media:EncodeExperimentsHg19.2010-01-20.xlsx

Some new datatypes submitted for this freeze that are not yet sufficiently processed to appear in reporting are:

 GIS ChiaPET (Ruan): 6 experiments: Pol2 in K562, HeLa-S3, NB4, HCT-116 and MCF-7, and ER in MCF-7
 UChicago TFBS (White): 7 experiments in K562: JunB, JunD, HDAC8, Fos, GATA2, NR4A1 and Control (tagged w/ eGFP)
 SYDH Nucleosome (Snyder): 1 experiment in K562

Reports from the DCC wranglers regarding hg19 track status are here:

Data release

The DCC quality group is currently reviewing the following hg19 and mm9 tracks for release to the public:

  • Riken CAGE
  • Caltech RNA-seq
  • Stan/Yale TFBS (mm9)
  • UW Histone

The next tracks for review will include:

  • SUNY Tiling
  • GIS RNA PET
  • UW Affy Exon

Prompt responses from production groups to any issues arising during the Q/A process will help ensure a speedy release.

ENCODE-related data

Recent data releases on hg19/GRCh37 from the UCSC Browser group are listed here:

http://genome.ucsc.edu/goldenPath/releaseLog.html#hg19

Other News

  • Early public access to ENCODE data: UCSC is completing deployment of a Preview Browser server (http://genome-preview.ucsc.edu). This server will be a mirror of genome-test, synchronized on a weekly basis.
  • UCSC is updating the ENCODE portal to provide a more user-friendly interface, based on input from PIs and the ENCODE101 manuscript review. The current version (still under development) is viewable on genome-test.

Mouse ENCODE

Data Submissions

  • As of this reporting, a total of 87 experiments have been submitted and validated by the Mouse ENCODE groups. The latest spreadsheet of Mouse ENCODE experiments submitted to the DCC is: File:MouseExperiments.2011-01-20.xlsx

Data Release

  • The SYDH TFBS track (w/o CH12) is currently under review by the Q/A group at the DCC, with release expected next week. This will be the first Mouse ENCODE track to be released on the UCSC public site. The other Mouse ENCODE tracks on the test server will become publicly available on the UCSC Preview site (http://genome-preview.ucsc.edu), which will be announced soon.

Other News

  • As part of the ENCODE portal update in progress, a Mouse Data Summary page is under development. The initial version will host a snapshot of the submissions spreadsheet. We plan to replace this with a more graphical, matrix-oriented interface. The planning spreadsheet is also linked to this page.
  • Work is in progress at the DCC to submit Mouse ENCODE data to GEO. GEO is setting up a Mouse ENCODE project to support this.

Information needed from Labs

  • CH12 data is on hold until metadata is complete (the DCC needs to be advised which sex to register for the cell line).
  • Documentation is needed from these labs:
 Hardison:  TFBS and Histone tracks
 Stam:  DNase track
 Ren: TFBS and Histone tracks