NOTE! This is a read-only copy of the ENCODE2 wiki.
Please go to the ENCODE3 wiki for current information.

Small RNA Seq Hannon Lab

CONTACT INFO

These libraries were constructed by "Vihra Sotirova" <sotirova@cshl.edu> in the Hannon lab.


SUBMITTED RNA TYPES THIS PROTOCOL APPLIES TO

All ENCODE Transcriptome small RNA datasets


PROTOCOL DETAILS

This protocol is used by the Hannon lab to generate all of its small RNA (20-200 nt) ENCODE libraries. It produces stranded, single-end reads. It captures both capped 5' ends (insofar as the caps are removed by Tobacco Acid Pyrophosphatase) and native 5' monophosphate ends. The 20-200 nt RNA is not fragmented further during library generation, so read densities are higher at the 5' ends of transcripts.

LIBRARY CONSTRUCTION:

STEP 1: rRNA Elimination

Deplete rRNA from the sample with RiboMinus according to Invitrogen's directions, supplemented with custom LNA probes.

(Provide LNA probe sequences).

STEP 2: Tobacco Acid Pyrophosphatase 5' Cap Removal leaving 5' monophosphates.

STEP 3: C-tailing (polyA kit, Ambion AM1350)

STEP 4: 5' RNA Linker Ligation

STEP 5: First Strand Synthesis

STEP 6: PCR Amplification

STEP 7: Removal of RNA

STEP 8: PCR Clean-up

STEP 9: Library Assessment

OLIGO SEQUENCES:

 5' Linker-1: rCrGrArCrUrGrGrArGrCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA
 -or- (depending on dataset)
 5' Linker-2: rCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA
 RT: 5′ TCGCGAGCGGCCGCGGGGGGGGGGGGGGG 3′
 PCR: 5′ CAAGCAGAAGACGGCATACGATCGCGAGCGGCCGCGGGGGG 3′
 PCR: 5′ AATGATACGGCGACCACCGACACGAGGACACTGACATGGACTGAAGGAGTAGAAA 3′
 Seq: 5′ CACGAGGACACTGACATGGACTGAAGGAGTAGAAA 3′
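
The oligos listed above appear to share sequence with one another, and a short script can make those relationships explicit. The following is a minimal Python sketch, not part of the published protocol. It converts the rN-notation linkers to DNA and uses plain string comparison to check how the sequencing primer, the linkers, and the PCR primers overlap. The names `rna_to_dna`, `longest_suffix_prefix`, `PCR_PRIMER_A`, and `PCR_PRIMER_B` are illustrative labels introduced here (the two PCR primers are simply taken in the order listed), not names used by the Hannon lab.

```python
# Oligo sequences copied from the OLIGO SEQUENCES list.
LINKER_1 = "rCrGrArCrUrGrGrArGrCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA"
LINKER_2 = "rCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA"
RT_PRIMER = "TCGCGAGCGGCCGCGGGGGGGGGGGGGGG"
PCR_PRIMER_A = "CAAGCAGAAGACGGCATACGATCGCGAGCGGCCGCGGGGGG"  # first PCR primer listed
PCR_PRIMER_B = "AATGATACGGCGACCACCGACACGAGGACACTGACATGGACTGAAGGAGTAGAAA"  # second PCR primer listed
SEQ_PRIMER = "CACGAGGACACTGACATGGACTGAAGGAGTAGAAA"


def rna_to_dna(oligo):
    """Convert rN notation (e.g. 'rCrGrA') to a DNA string (e.g. 'CGA')."""
    return oligo.replace("r", "").replace("U", "T")


def longest_suffix_prefix(a, b):
    """Return the longest string that is both a suffix of a and a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""


# Does the sequencing primer have the same sequence as 5' Linker-2?
print("Seq == Linker-2 (as DNA):", rna_to_dna(LINKER_2) == SEQ_PRIMER)

# Does 5' Linker-1 end with that same sequence?
print("Linker-1 ends with Seq:", rna_to_dna(LINKER_1).endswith(SEQ_PRIMER))

# Does the second PCR primer carry the sequencing primer at its 3' end?
print("PCR primer B ends with Seq:", PCR_PRIMER_B.endswith(SEQ_PRIMER))

# How much of the first PCR primer's 3' end matches the start of the RT primer?
print("PCR primer A / RT overlap:", longest_suffix_prefix(PCR_PRIMER_A, RT_PRIMER))
```

With the sequences as listed, the first three comparisons come out true. If that reading is right, the sequencing primer ends exactly where the 5' linker ends, so the first sequenced base would be expected to correspond to the 5' end of the RNA insert, which is consistent with the stranded reads described under PROTOCOL DETAILS.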

SEQUENCING: It is essential to perform the on-flow-cell enrichment at 65 °C when sequencing C-tailed libraries; this greatly increases the number of reads. The reads are 36 nt in length.

MAPPING:

Sequence reads were quality-filtered using the standard Illumina pipeline (GERALD). Identical reads were collapsed while retaining their multiplicity information. The 3′ sequencing adaptor was removed from the reads using a custom clipper program, which aligned the adaptor sequence to the short reads, allowing up to 2 mismatches and no indels. The trimmed sequences were aligned to the human genome (NCBI build 36, hg18) using Nexalign (Lassmann et al., not published). The alignment parameters were tuned to tolerate up to 2 mismatches and no indels. We allow up to 100 hits for multimapping reads. The best (how is best defined?) hit(s) were reported in a BED-like format.
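
The collapsing and adaptor-clipping steps described above can be illustrated with a small Python sketch. This is not the lab's custom clipper, whose details are not given here; it is a toy version written under stated assumptions. The adaptor sequence `ADAPTOR` is hypothetical (the real 3′ adaptor is not listed on this page), the `min_overlap` parameter and the rule of clipping at the leftmost acceptable match are assumptions, and the toy reads are chosen so the result is easy to check by eye. The sketch does keep the two constraints stated in the text: up to 2 mismatches and no indels (an ungapped comparison).

```python
from collections import Counter

# Hypothetical adaptor for illustration only; not the protocol's real adaptor.
ADAPTOR = "GCCGGCGCCG"


def count_mismatches(a, b):
    """Count positions at which two equal-length strings differ (no indels)."""
    return sum(x != y for x, y in zip(a, b))


def clip_adaptor(read, adaptor=ADAPTOR, max_mismatches=2, min_overlap=6):
    """Remove the adaptor from the 3' end of a read.

    Scans left to right for the first position where the rest of the read
    (or a full adaptor length) matches the start of the adaptor with at most
    max_mismatches mismatches. min_overlap is an assumed safeguard against
    clipping on very short, uninformative matches at the read end.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adaptor), len(read) - start)
        if overlap < min_overlap:
            break
        window = read[start:start + overlap]
        if count_mismatches(window, adaptor[:overlap]) <= max_mismatches:
            return read[:start]
    return read  # no acceptable adaptor match found


def collapse_and_clip(reads):
    """Collapse identical reads, then clip each unique sequence once.

    Returns (collapsed, trimmed): both map sequence -> multiplicity.
    """
    collapsed = Counter(reads)
    trimmed = Counter()
    for seq, multiplicity in collapsed.items():
        trimmed[clip_adaptor(seq)] += multiplicity
    return collapsed, trimmed


# Toy data: an 18-nt insert followed by the hypothetical adaptor.
INSERT = "TATTAATATATTAATTAT"
reads = [
    INSERT + ADAPTOR,                        # exact adaptor
    INSERT + ADAPTOR,                        # identical read, collapsed with the first
    INSERT + "GCCGACGCCG",                   # adaptor with one mismatch
    "ATTATATAATTTATAATATTAATATATTTAATTATA",  # 36-nt read with no adaptor
]

collapsed, trimmed = collapse_and_clip(reads)
for seq, multiplicity in trimmed.items():
    print(seq, multiplicity)
```

Collapsing before clipping means each distinct raw sequence is scanned only once, and the multiplicities are summed again after clipping because reads that differ only in adaptor mismatches can trim to the same insert.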


VALIDATION and QUALITY CONTROL

One sample is first run on a single Illumina lane as a test. If the results pass QC for cluster density (criteria?) and quality scores (criteria?), the sample is run on 7 additional Illumina lanes, which completes the validation process. The resulting files are then analyzed with a custom-made pipeline and submitted to Barcelona.


PUBLICATIONS

This protocol has been published in PMID: 19169241.