


5C Documentation

Spatial Organization of Genomes


Although the DNA of chromosomes is a linear sequence, the living genome does not function in a linear fashion. This is most clearly illustrated by the fact that genes are often regulated by elements that can be located far away along the genome sequence. Recent evidence shows that regulatory elements can act over large genomic distances by engaging in direct physical interactions with target genes, resulting in the formation of chromatin loops. Based on these observations we have proposed that the spatial organization of the genome resembles a three-dimensional network that is driven by physical associations between genes and regulatory elements, both in cis (along the same chromosome) and in trans (between different chromosomes) (Dekker, Nat. Methods 2006).

Long-range chromatin looping interactions can be detected using the Chromosome Conformation Capture (3C) technology (Dekker et al. 2002). 3C employs formaldehyde cross-linking to covalently link interacting chromatin segments in intact cells. Cells are subsequently lysed and chromatin is digested and then ligated under dilute conditions so that cross-linked fragments are preferentially ligated. As a result a genome-wide library is formed composed of ligation products that each correspond to specific long-range interactions. Ligation products can be detected by PCR. 3C is particularly suited for analysis of relatively small genomic regions (up to a few hundred kilobases) to study interactions between candidate genomic elements.

To dramatically increase the throughput of 3C, the 3C carbon copy (5C) method was developed (Dostie et al., 2006; Dostie and Dekker, 2007). The 5C method greatly increases the scale of chromatin interaction detection by replacing the PCR detection step of 3C with ligation-mediated amplification (LMA). The major advantage of LMA is that it can be performed at a very high level of multiplexing: a single assay can use thousands of primers, which in combination can detect millions of chromatin interactions (ligation junctions) in parallel. The LMA step effectively "copies" 3C ligation products into much smaller 5C ligation products that correspond precisely to ligation junctions formed during the 3C procedure. The products of the multiplexed LMA reaction constitute the 5C library. The composition of the 5C library is determined by high-throughput DNA sequencing.
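As a rough illustration of how the composition of a sequenced 5C library might be tallied, the sketch below counts ligation junctions from paired-end reads. It assumes each read end has already been mapped to a primer name and that only forward-reverse pairs are valid 5C products; the function name and the primer names (`F1`, `R1`, and so on) are hypothetical and are not part of the ENCODE pipeline.

```python
from collections import Counter

def count_interactions(read_pairs, forward_primers, reverse_primers):
    """Tally 5C junctions from read pairs already mapped to primer names."""
    counts = Counter()
    for a, b in read_pairs:
        # Assumption: a valid 5C product joins one forward and one reverse primer.
        if a in forward_primers and b in reverse_primers:
            counts[(a, b)] += 1
        elif b in forward_primers and a in reverse_primers:
            # Same junction read in the opposite orientation.
            counts[(b, a)] += 1
        # Forward-forward and reverse-reverse pairs are skipped.
    return counts

# Toy data with hypothetical primer names.
forward = {"F1", "F2"}
reverse = {"R1"}
pairs = [("F1", "R1"), ("R1", "F1"), ("F2", "R1"), ("F1", "F2")]
counts = count_interactions(pairs, forward, reverse)
```

Keying each junction as (forward, reverse) means both read orientations of the same ligation product are counted together.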


The aim of this pilot study is to generate a “connectivity map” between all genes and regulatory elements within the 44 ENCODE PILOT regions.

In the current scheme, 5C primers were constructed for all HindIII restriction fragments in the 44 ENCODE PILOT regions. Reverse primers were designed on fragments containing the promoter (TSS) of any annotated gene. Forward primers were designed on all other fragments. This design allows every TSS to be interrogated against all other restriction fragments, thus generating a connectivity map between all TSSs and regulatory elements.
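The fragment-level logic of this design can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual design tool: it digests a toy sequence in silico at HindIII sites (recognition sequence AAGCTT, cut after the first A) and labels each fragment "reverse" if it contains a TSS and "forward" otherwise. The function names, the toy sequence, and the TSS position are all hypothetical.

```python
import re

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence; cut is A^AGCTT

def hindiii_fragments(seq):
    """Return half-open (start, end) coordinates of HindIII fragments."""
    cuts = [m.start() + 1 for m in re.finditer(HINDIII_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]

def assign_primers(fragments, tss_positions):
    """Label TSS-containing fragments 'reverse' and all others 'forward'."""
    design = []
    for start, end in fragments:
        has_tss = any(start <= p < end for p in tss_positions)
        design.append((start, end, "reverse" if has_tss else "forward"))
    return design

# Toy sequence with two HindIII sites and one hypothetical TSS at position 12.
seq = "GGGGAAGCTTCCCCCCAAGCTTTTTT"
fragments = hindiii_fragments(seq)
design = assign_primers(fragments, tss_positions=[12])
```

A real design would additionally place each primer at a fragment end and apply the uniqueness and melting-temperature filters described in the text.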

For ENCODE PILOT regions not containing a gene (i.e. ENr313), an alternating primer design was used.
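One plausible reading of this design, offered here as an assumption rather than the documented scheme, is that forward and reverse primers are placed on consecutive restriction fragments in alternation, since a gene-free region has no TSS fragments to anchor the reverse primers. A minimal sketch, in which the function name and the choice of starting with a forward primer are arbitrary:

```python
def alternating_design(n_fragments):
    """Assign primer directions to consecutive fragments in alternation.

    Assumption: even-indexed fragments get forward primers and
    odd-indexed fragments get reverse primers.
    """
    return ["forward" if i % 2 == 0 else "reverse" for i in range(n_fragments)]

layout = alternating_design(5)
```

Under this reading, every forward fragment can still be paired with every reverse fragment, so interactions are sampled across the whole region.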

Primers were selected for relative uniqueness using a custom 15-mer frequency table as well as BLAST. A custom hexamer barcode was added to each primer to ensure that its sequence was unique within the primer pool being used. Primers were also selected for appropriate Tm and GC content and were modified at the 3’ end where necessary.
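The kinds of checks described above can be illustrated with a small sketch. It is not the custom pipeline used for this data set: the k-mer table is built from whatever background sequences are supplied, the Tm estimate uses the simple Wallace rule (an assumption, and only a rough guide for short oligos), and all function names are hypothetical. The toy example uses 3-mers for readability, whereas the text describes a 15-mer table.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def kmer_table(sequences, k=15):
    """Count every k-mer occurring in the background sequences."""
    table = Counter()
    for s in sequences:
        s = s.upper()
        for i in range(len(s) - k + 1):
            table[s[i:i + k]] += 1
    return table

def max_kmer_frequency(primer, table, k=15):
    """Highest background count among the primer's k-mers (lower is more unique)."""
    primer = primer.upper()
    return max((table[primer[i:i + k]] for i in range(len(primer) - k + 1)), default=0)

# Toy example with k=3 and made-up sequences.
table = kmer_table(["ACGTACGT", "TTTTACGA"], k=3)
worst = max_kmer_frequency("GACGG", table, k=3)
```

A candidate primer could then be rejected when its worst k-mer count, GC fraction, or estimated Tm falls outside chosen thresholds; the thresholds themselves are not given in the source.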

In the ENCODE manual pick set, 3,095 primers were used: 439 reverse primers and 2,656 forward primers. With this primer scheme, 1,165,984 possible interactions (439 × 2,656) are detectable, of which 89,620 are cis interactions (within the same ENm region) and 1,076,364 are trans interactions (between different ENm regions). Currently, data for two biological replicates have been generated for ENCODE Tier I cells (GM12878 and K562), spanning 14 ENCODE manual regions and 1 gene desert region (ENr313), using high-throughput paired-end sequencing on the Illumina GA2 platform.
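The interaction counts quoted above follow from pairing every forward primer with every reverse primer. The sketch below checks that arithmetic and shows, on a toy primer-to-region assignment, how pairs could be split into cis and trans. The function name and the toy assignment are hypothetical; the real per-region primer counts are not given in this document.

```python
from itertools import product

# Figures quoted in the text.
n_reverse = 439
n_forward = 2656
total_pairs = n_reverse * n_forward
cis_quoted = 89_620
trans_quoted = 1_076_364

def count_cis_trans(forward_regions, reverse_regions):
    """Count forward-reverse primer pairs within and between regions."""
    cis = trans = 0
    for f_region, r_region in product(forward_regions, reverse_regions):
        if f_region == r_region:
            cis += 1
        else:
            trans += 1
    return cis, trans

# Toy assignment: the region of each forward and each reverse primer.
toy_cis, toy_trans = count_cis_trans(
    ["ENm001", "ENm001", "ENm002"],  # three forward primers
    ["ENm001", "ENm002"],            # two reverse primers
)
```

The quoted cis and trans counts sum to the forward × reverse product, which is consistent with every detectable interaction being a forward-reverse primer pair.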

A full suite of web-based 5C tools was developed for the design of 5C experiments and for the analysis and visualization of 5C data. These tools allow users to design a 5C experiment for any given locus or species and guide them through the primer layout and filtering processes. Once an experiment has been designed, a full spectrum of analysis, integration and visualization tools becomes available.


Data was produced by the Dekker Lab at UMASS Medical School, Worcester, MA.
Bryan Lajoie, Amartya Sanyal, Ye Zhan and Job Dekker.
More information can be found on the Dekker Lab website.


Lajoie, B.R., van Berkum, N.L., Sanyal, A. and Dekker, J. (2009). My5C: web tools for chromosome conformation capture studies. Nat. Methods, 6(10): 690-691.

Dostie, J. and Dekker, J. (2007). Mapping networks of physical interactions between genomic elements using 5C technology. Nature Protocols, 2(4): 988-1002.

Dostie, J., Richmond, T.A., Arnaout, R.A., Selzer, R.R., Lee, W.L., Honan, T.A., Rubio, E.D., Krumm, A., Lamb, J., Nusbaum, C., Green, R.D. and Dekker, J. (2006). Chromosome Conformation Capture Carbon Copy (5C): A massively parallel solution for mapping interactions between genomic elements. Genome Research, 16(10): 1299-1309.

Dekker, J. (2006). The three C's of chromosome conformation capture: controls, controls, controls. Nat. Methods, 3(1): 17-21.

Dekker, J., Rippe, K., Dekker, M. and Kleckner, N. (2002). Capturing Chromosome Conformation. Science, 295: 1306-1311.