Small RNA Seq Hannon Lab
These libraries were constructed by "Vihra Sotirova" <firstname.lastname@example.org> in the Hannon lab.
SUBMITTED RNA TYPES THIS PROTOCOL APPLIES TO
All ENCODE Transcriptome small RNA datasets
This protocol is used by the Hannon lab to generate all the small RNA (20-200 nt) ENCODE libraries. It generates stranded single-end reads. It captures both capped 5' ends (insofar as the cap is removed by Tobacco Acid Pyrophosphatase) and native 5' monophosphate ends. The 20-200 nt RNA is not fragmented further during library generation, hence we get higher read densities at the 5' ends of transcripts.
LIBRARY CONSTRUCTION:
STEP 1: rRNA Elimination
Deplete rRNA with RiboMinus according to Invitrogen's directions, supplemented with custom LNA probes.
(Provide LNA probe sequences).
STEP 2: Tobacco Acid Pyrophosphatase 5' Cap Removal leaving 5' monophosphates.
STEP 3: C-tailing (polyA kit, Ambion AM1350)
STEP 4: 5' RNA Linker Ligation
STEP 5: First Strand Synthesis
STEP 6: PCR Amplification
STEP 7: Removal of RNA
STEP 8: PCR Clean-up
STEP 9: Library Assessment
5' Linker-1: rCrGrArCrUrGrGrArGrCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA
-or- (depending on dataset)
5' Linker-2: rCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA
RT: 5′ TCGCGAGCGGCCGCGGGGGGGGGGGGGGG 3′
PCR: 5′ CAAGCAGAAGACGGCATACGATCGCGAGCGGCCGCGGGGGG 3′
PCR: 5′ AATGATACGGCGACCACCGACACGAGGACACTGACATGGACTGAAGGAGTAGAAA 3′
Seq: 5′ CACGAGGACACTGACATGGACTGAAGGAGTAGAAA 3′
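The oligo sequences above can be cross-checked against each other. The sketch below is only an illustrative consistency check, not part of the lab's pipeline: it converts the "rN" ribonucleotide notation of the 5' linkers to a DNA-alphabet string and tests whether each linker, and the longer PCR primer, ends with the sequencing primer, which is what one would expect if reads are to start at the first base after the linker. The function and constant names (ribo_to_dna, check_oligos, PCR_PRIMER_5P) are hypothetical.

```python
import re

# Sequences copied from the oligo list above.
LINKER_1 = ("rCrGrArCrUrGrGrArGrCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGr"
            "ArGrUrArGrArArA")
LINKER_2 = "rCrArCrGrArGrGrArCrArCrUrGrArCrArUrGrGrArCrUrGrArArGrGrArGrUrArGrArArA"
PCR_PRIMER_5P = "AATGATACGGCGACCACCGACACGAGGACACTGACATGGACTGAAGGAGTAGAAA"
SEQ_PRIMER = "CACGAGGACACTGACATGGACTGAAGGAGTAGAAA"


def ribo_to_dna(oligo):
    """Convert 'rCrArU...' notation to a DNA-alphabet string (U becomes T)."""
    bases = re.findall(r"r([ACGU])", oligo)
    # Rebuild the notation from the parsed bases to reject stray characters.
    if "r" + "r".join(bases) != oligo:
        raise ValueError("unexpected characters in oligo notation")
    return "".join(bases).replace("U", "T")


def check_oligos():
    """Report whether each 5' oligo ends with the sequencing primer."""
    return {
        "linker1_ends_with_seq_primer": ribo_to_dna(LINKER_1).endswith(SEQ_PRIMER),
        "linker2_ends_with_seq_primer": ribo_to_dna(LINKER_2).endswith(SEQ_PRIMER),
        "pcr_primer_ends_with_seq_primer": PCR_PRIMER_5P.endswith(SEQ_PRIMER),
    }


if __name__ == "__main__":
    for name, ok in check_oligos().items():
        print(name, ok)
```

A failed check would most likely point to a transcription error in the oligo list rather than to a problem with the protocol itself.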
SEQUENCING: It is essential to perform the on-flow-cell enrichment at 65 °C when sequencing C-tailed libraries. This greatly increases the number of reads. The reads are 36 nt in length.
Sequence reads underwent quality filtering using the standard Illumina pipeline (GERALD). Identical reads were collapsed while maintaining their multiplicity information. The 3′ sequencing adaptor was removed from the reads using a custom clipper program, which aligned the adaptor sequence to the short reads, allowing up to 2 mismatches and no indels. The trimmed sequences were aligned to the human genome (NCBI build 36, hg18) using Nexalign (Lassmann et al., not published). The alignment parameters were tuned to tolerate up to 2 mismatches and no indels. We allow up to 100 hits for a multimapping read. The best (how is best defined?) hit(s) were reported in a BED-like format.
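The collapsing and adaptor-clipping steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's custom clipper: the adaptor sequence is passed in as a parameter because it is not given here, the min_overlap threshold for a partial adaptor at the 3′ end of a read is an assumption, and the function names are hypothetical. Only the limits of 2 mismatches and no indels come from the description.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads, keeping each sequence's multiplicity."""
    return Counter(reads)


def clip_adaptor(read, adaptor, max_mismatches=2, min_overlap=6):
    """Trim the read at the leftmost ungapped adaptor match.

    The adaptor is slid along the read without gaps (no indels). A match
    is accepted when the overlapping bases differ at no more than
    max_mismatches positions. An adaptor running off the 3' end of the
    read still counts if at least min_overlap bases overlap (assumed value).
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adaptor), len(read) - start)
        window = read[start:start + overlap]
        mismatches = sum(r != a for r, a in zip(window, adaptor))
        if mismatches <= max_mismatches:
            return read[:start]
    return read  # no adaptor found; keep the read as is


def collapse_and_clip(reads, adaptor):
    """Collapse reads, clip each unique sequence once, and re-merge counts."""
    clipped = Counter()
    for seq, count in collapse_reads(reads).items():
        clipped[clip_adaptor(seq, adaptor)] += count
    return clipped
```

Collapsing before clipping means each distinct sequence is clipped only once, and the counts are merged again afterwards because different raw reads can yield the same trimmed insert. With short overlaps, a 2-mismatch tolerance can produce spurious matches, so the overlap threshold would need tuning on real data.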
VALIDATION and QUALITY CONTROL
One sample is run on a single Illumina lane as a test run. If the results pass the QC for cluster density (criteria?) and quality scores (criteria?), the sample is run on an additional 7 Illumina lanes, which completes the validation process. The resulting files are then analyzed through a custom-made pipeline and submitted to Barcelona.
This protocol has been published in PMID: 19169241.