NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

Integration paper new



Final Submission version (Updated 21st June 2012)

ReSubmission version (Updated 4th April 2012)

Submitted version



Draft versions of figures, tables and other supplementary information.

latest Figures 23rd November

Figure Ppt mockup Full version Supplementary Document Comments
Figure 1 Fig1.pptx New File:Fig1A.pdf File:Fig1B.pdf Word version "ENCODE has lots of data" : Preliminary version of figure
Figure 2 Fig2_overview Updated 4th Nov

Component PDFs here: [1]

Draft (google doc) Multi-panel selection figure: Luke Ward, Javier Herrero
  • A: Mammalian constraint vs Heterozygosity
  • B: Something about Primate specific elements
  • C: Motif IC content vs conservation
  • D: Scatter plot of Motif correlation to conservation
Figure 3 Fig 3 pdf
Figure 4 Fig4.pptx Fig4AC.pdf Fig4B.pdf Fig4D.pdf Supplementary_Info_Fig4.doc (promoter centric integration) Xianjun Dong, Chao Cheng
  • A: Distribution of histone modifications over promoters, in LCP and HCP
  • B: Distribution of TF density over promoters, in LCP, HCP, TATA and something else pulled out?
  • C: Quantitative model fit for LCP and HCP; histones
  • D: Quantitative model fit for TFs?

Preliminary version

Figure 5 Fig5.pptx , Fig5.pdf Individual panel pdfs (tar.gz) Updated 16th Nov Final google doc , Word doc, Data+code package Updated 10th Nov Patterns of chromatin modifications at transcription factor binding sites. (tf_anchored) Anshul Kundaje, Ewan Birney
Figure 6 Fig6.pptx Fig 6 source pdf Preliminary draft (not final) First figure to go to Darryl Leja! Well done, Kevin Yip! Revised to be colour blindness friendly.
Figure 7 Fig7.pptx No longer needed Google doc [2] Patterns of chromatin modifications at other genomic features
  • A. Exons
  • B. Model of exon inclusion from Histone modifications
  • C. repeat sequences (repeat_anchored) Ian Dunham/Ewan Birney. Likely to be similar to TF line above.
Figure 8 (Now Figure 7) Fig8.pptx

panel A panel B (part 1) panel B (part 2) panel C panel D

Word version genome Segmentation Steve Wilder/Michael Hoffman/Jason Ernst
  • A. Illustrative region with both segmentations and joint call
  • B. Association vs different types of elements
  • C. RNA segmentation; confusion plot of RNA vs Chromatin if it makes sense
  • D. Methylation is referred to in the text.
Figure 9 (Now Figure 8) Fig9.pptx

Tar file of high resolution pngs Media:fig9.tar.gz

SOM analysis. Preliminary draft from Ali.
Figure 10 (Now Figure 9) Fig10.pptx New Google doc [3] (Experimental_validation) : Chris/Ewan
  • Experimental design figure
  • Fish
  • Mouse
  • Enrichment over background of different methods vs different assays
Figure 11 (Now Figure 10) Fig11.pptx Updated 16th Nov Fig11 panel A Fig11 panel B Fig11 panel B (pdf) Fig11 panel C Fig11 panel C (pdf) Fig11 panel D Fig11 panel D (pdf) Supplementary_Info_Allele_Specific.docx Allele specific information: Bob Altshuler
  • A. Example locus
  • B. Example assay combination
  • C. Pairwise assay correlation
  • D. Pol2-->Cytoplasmic RNA progression
Figure 12 (Now Figure 11) Fig12new.pptx New Panel C horizontal Figure 12: (personal_genome) Personal genomes/rare disease. Joel/Mark

  • A. Example region with assay recalled on personal haplotypes
  • B. Diagram indicating the new set of variants prioritised by ENCODE
  • C. Exclusion of Somatic variants in Cancer

Figure 13 (Now Figure 12) Fig13_GWAS_composite_v7.pdf New, coloring in panel d now reflects p<0.01, removed some tracks from panel e. Fig13d_GWAS_TFandDHSpeaks_v8.pdf Panel d only Fig13e_CrohnsChr5desertV2.pdf Panel e only SupplFig_GWAS_SNPs_partitions_v2 New uses same datasets as main figure; SupplFig_GWASenrichDiffCells_v2 New random samplings shown as boxplots; Supplementary Info on GWAS and ENCODE New


What does ENCODE add to the study of GWAS association (Ross/Belinda/Weisheng Wu/Bob H/Ali Mortazavi/Marc Schaub/others)
  • a. Fraction of GWAS SNPs in Regulatory Elements
  • b. Enrichment of GWAS SNPs in function-associated segments
  • c. Enrichment of GWAS SNPs in Functional Segments by SOM
  • d. Clustering of Disease phenotypes with TF OSs and with DHSs (by cell-type)
  • e. Example locus: 5q13.1 gene desert implicated in regulating PTGER4

The Supplementary Info applies to this Figure, the two Supplementary Figures and the Supplementary Table Xa and Xb.

Table 1: Encode TF Class Summary word version New Word version Based on the Factorbook metatable processed using 'Metatable_Factorbook.tsv'
TableX2: Summary of histone modifications
TableX3: Summary of ENCODE Combined Segmentation States word version New
Supplementary Table 1: Encode TF Categorisation sets excel version New Based on the Factorbook metatable processed using 'Metatable_Factorbook.tsv'
Supplementary Table 2: Encode Data sets excel version New Based on downloads from the encode-test hg19 site. File counts are given in brackets; experiment counts group files by dccAccession. Provisional draft of table. Supplementary details to follow. What format should this be in?
Supplementary Table 3: Main ENCODE element counts and element lengths excel version New Generated by counting elements from AWG processed element files for ChIP-seq (spp for TFBS or MACS for histone modifications), DNase-seq (FDR 0.01 peaks; see Locations_of_ENCODE_Data), FAIRE-seq (lab calls), and long and short RNA calls (IDR; see Locations_of_ENCODE_Data). Provisional draft of table. Supplementary details to follow.
Supplementary Table 4: ENCODE Gene Annotation Statistics (GENCODE). excel version New From Gencode (GRCP001)
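The file and experiment counts described for Supplementary Table 2 could be reproduced along the following lines. This is only a sketch: the record layout, the `dataType` field name and the accession values are invented for illustration; only the use of dccAccession to group files into experiments comes from the note above.

```python
from collections import defaultdict

def count_files_and_experiments(file_records):
    """Return {data_type: (n_experiments, n_files)}.

    Files sharing a dccAccession are treated as one experiment.
    """
    files_per_type = defaultdict(int)
    accessions_per_type = defaultdict(set)
    for rec in file_records:
        files_per_type[rec["dataType"]] += 1
        accessions_per_type[rec["dataType"]].add(rec["dccAccession"])
    return {
        data_type: (len(accessions_per_type[data_type]), n_files)
        for data_type, n_files in files_per_type.items()
    }

# Made-up records; "dataType" and the accession strings are hypothetical.
example_records = [
    {"fileName": "a_rep1.bam", "dataType": "ChipSeq", "dccAccession": "ACC_A"},
    {"fileName": "a_rep2.bam", "dataType": "ChipSeq", "dccAccession": "ACC_A"},
    {"fileName": "b_rep1.bam", "dataType": "ChipSeq", "dccAccession": "ACC_B"},
    {"fileName": "c_rep1.bam", "dataType": "DnaseSeq", "dccAccession": "ACC_C"},
]
summary = count_files_and_experiments(example_records)
for data_type, (n_experiments, n_files) in sorted(summary.items()):
    # Experiments first, files in brackets, as in the table note.
    print(f"{data_type}: {n_experiments} ({n_files})")
```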
Saturation curve fitting DNase1


Weibull distribution fitting of element (max 5000 bp) saturation of DNase1 and CTCF. The best fit suggests saturation counts of 4,099,340 for DNase1 and 181,186 for CTCF.
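One way such a saturation fit could be set up is sketched below. The parameterisation (a Weibull CDF scaled by a saturation count s), the grid-search fitting and the synthetic data are all assumptions for illustration; the page does not say how the actual fit was performed, and the numbers here are unrelated to the DNase1 and CTCF values above.

```python
import math

def weibull_saturation(x, s, lam, k):
    """Scaled Weibull CDF: s * (1 - exp(-(x / lam) ** k)); s is the saturation count."""
    return s * (1.0 - math.exp(-((x / lam) ** k)))

def fit_saturation(xs, ys, lams, ks):
    """Least-squares fit by grid search over (lam, k).

    For each (lam, k) pair the best s has a closed form,
    s = sum(y * f) / sum(f * f), where f is the unscaled Weibull CDF.
    Returns (s, lam, k) for the pair with the smallest squared error.
    """
    best = None
    for lam in lams:
        for k in ks:
            f = [1.0 - math.exp(-((x / lam) ** k)) for x in xs]
            denom = sum(v * v for v in f)
            if denom == 0.0:
                continue
            s = sum(y * v for y, v in zip(ys, f)) / denom
            sse = sum((y - s * v) ** 2 for y, v in zip(ys, f))
            if best is None or sse < best[0]:
                best = (sse, s, lam, k)
    return best[1], best[2], best[3]

# Synthetic, noise-free curve: x = number of samples added, y = cumulative element count.
true_s, true_lam, true_k = 1000.0, 20.0, 0.8
xs = list(range(1, 41))
ys = [weibull_saturation(x, true_s, true_lam, true_k) for x in xs]

lams = [5.0 * i for i in range(1, 21)]  # candidate scale values 5 .. 100
ks = [0.1 * i for i in range(1, 21)]    # candidate shape values 0.1 .. 2.0
s_hat, lam_hat, k_hat = fit_saturation(xs, ys, lams, ks)
print(f"estimated saturation count: {s_hat:.1f}")
```

The grid search keeps the sketch dependency-free; on real data a proper non-linear least-squares routine would be the more usual choice.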
Supplementary Table XXX: TF Coassociation list excel version New From NCP002, GRCP018
Supplementary Table X. GWAS-ENCODE Associations

Table Xa. GWAS phenotypes (1 row per phenotype) and associated SNPs that overlap with TF occupied segments and DNase peaks in the ENCODE cell types;

Table Xb. Individual GWAS SNPs and their overlaps with TF occupied segments in the ENCODE cell types;

Table Xc. Individual GWAS SNPs and their overlaps with DNase peaks in the ENCODE cell types;

These tables list GWAS SNPs, the phenotype with which they are associated, and (a) TF occupied segments or (b) DNase peaks that contain them. Table Xc has a single phenotype per row, and the number of SNP-TF overlaps is given in each cell (like in Fig 13d).
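A per-phenotype overlap table of this kind could be assembled roughly as follows. The tuple layouts, the SNP and segment names and the 0-based half-open coordinate convention are assumptions made for this sketch, not a description of how the tables were actually generated.

```python
from collections import defaultdict

def overlap_counts(snps, segments):
    """Count SNP-segment overlaps per phenotype.

    snps:     iterable of (snp_id, phenotype, chrom, pos)
    segments: iterable of (label, chrom, start, end), half-open [start, end)
    Returns {phenotype: {label: n_overlaps}}; phenotypes with no overlap are omitted.
    """
    by_chrom = defaultdict(list)
    for label, chrom, start, end in segments:
        by_chrom[chrom].append((start, end, label))

    table = defaultdict(lambda: defaultdict(int))
    for snp_id, phenotype, chrom, pos in snps:
        # Linear scan per chromosome; adequate for a sketch, slow for genome-scale data.
        for start, end, label in by_chrom[chrom]:
            if start <= pos < end:
                table[phenotype][label] += 1
    return {phenotype: dict(cells) for phenotype, cells in table.items()}

# Made-up SNPs and TF occupied segments.
snps = [
    ("snp1", "PhenoA", "chr1", 150),
    ("snp2", "PhenoA", "chr1", 500),
    ("snp3", "PhenoB", "chr2", 75),
    ("snp4", "PhenoB", "chr3", 10),
]
segments = [
    ("TF1", "chr1", 100, 200),
    ("TF2", "chr1", 140, 160),
    ("TF1", "chr2", 50, 100),
]
counts = overlap_counts(snps, segments)
for phenotype in sorted(counts):
    print(phenotype, counts[phenotype])
```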