Next Generation Sequencing: An Overview
DNA Sequencing • Refers to determining the order of nucleotides (G, A, T, and C) in a stretch of DNA • Useful in biotechnology research and discovery, diagnostics, and forensics.
What is Sequencing? How does it work? DNA sequencing = determining the nucleotide sequence (A, T, G, and C) of the DNA of a gene. We are currently using the Sanger method.
Sanger Sequencing • Utilizes dideoxynucleotide triphosphates to terminate DNA chain elongation. • Molecules are separated by gel or capillary electrophoresis, and the dye-labelled terminators are detected. • Can be used to interrogate the sequence of single samples. • 96 samples can be run in SBS format with 24-36 runs per day. A single instrument can generate 1-2 million bases per day.
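The throughput figures above can be cross-checked with a little arithmetic. The sketch below works out the average read length implied by 96 samples per run, 24-36 runs per day, and 1-2 million bases per day. Pairing the low figures together and the high figures together is an assumption made here, not something the slide states.

```python
# Figures quoted above for a single capillary instrument.
SAMPLES_PER_RUN = 96
RUNS_PER_DAY = (24, 36)                  # low and high end of the range
BASES_PER_DAY = (1_000_000, 2_000_000)   # low and high end of the range


def implied_read_length(bases_per_day: int, runs_per_day: int) -> float:
    """Average bases each sample must yield to reach the daily total."""
    reads_per_day = SAMPLES_PER_RUN * runs_per_day
    return bases_per_day / reads_per_day


# Assumption: low runs/day pairs with low bases/day, and high with high.
low = implied_read_length(BASES_PER_DAY[0], RUNS_PER_DAY[0])
high = implied_read_length(BASES_PER_DAY[1], RUNS_PER_DAY[1])
print(f"Implied read length: {low:.0f} to {high:.0f} bases per sample")
```

Under that pairing the quoted numbers are consistent with reads of a few hundred bases per sample.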
Sanger Sequencing Pipeline (workflow figure) • Stages: Library Construction, Picking & Growth, Template Prep, Seq. Setup / Cycling, CleanUp, 3730xl, Analysis • Plasmids: AxyPrep Mag Plasmid • PCR Amplicons: PCR Cycling, AxyPrep Mag PCR Clean up, AxyPrep Mag PCR Normalizer • Output of template prep: Ready to Sequence Templates • Sequencing reaction clean-up: AxyPrep Mag DyeClean
From Slab Gels to Single Molecule Sequencers • Late 1980s: Slab gel sequencers using radioactive isotopes and later fluorescence chemistry (10 Kb per 4 hr run). • Late 1990s: Capillary sequencers (50 Kb per 1 hr run). • 2005: Massively parallel pyrosequencing (20 MB per 5 hr run). • 2007: Sequencing by synthesis (1 GB per 5 day run). • 2010: Single molecule sequencing (100 GB per 5 day run). • 2013: Human genome in 15 minutes.
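To compare the generations on one scale, the per-run figures in the timeline can be converted to bases per hour. This is a rough sketch that assumes Kb, MB, and GB mean thousands, millions, and billions of bases, and that a "day" of run time is 24 hours. The 2013 entry is left out because the slide gives no output figure for it.

```python
# (label, bases per run, run duration in hours), taken from the timeline above.
TIMELINE = [
    ("Late 1980s slab gel", 10e3, 4),
    ("Late 1990s capillary", 50e3, 1),
    ("2005 pyrosequencing", 20e6, 5),
    ("2007 sequencing by synthesis", 1e9, 5 * 24),
    ("2010 single molecule", 100e9, 5 * 24),
]


def bases_per_hour(bases_per_run: float, run_hours: float) -> float:
    """Average sequencing rate over one run."""
    return bases_per_run / run_hours


rates = {label: bases_per_hour(bases, hours) for label, bases, hours in TIMELINE}
for label, rate in rates.items():
    print(f"{label:30s} {rate:>15,.0f} bases/hour")

# Fold increase between the first and last entries of the timeline.
fold = rates["2010 single molecule"] / rates["Late 1980s slab gel"]
print(f"Increase from slab gel to single molecule: about {fold:,.0f}-fold")
```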
Next Gen Sequencing • Employs micro- and nanotechnologies to reduce the size of sample components, which reduces reagent costs and enables massively parallel sequencing reactions. • Highly multiplexed, allowing simultaneous sequencing and analysis of millions of DNA fragments over multiple cycles.
Next Gen v. Sanger
Traditional Sequencing vs. Next Generation Sequencing: Data Throughput • 1 x Illumina GAII vs. 200+ 3730xl • Days vs. years • The sequencing landscape is changing
Next Generation Sequencing Platforms
Second Generation Sequencing Throughput
Illumina Genome Analyzer IIx • Sequencing by synthesis using reversible fluorescent dye terminators using clonal single molecule array; • 35 to 50 base reads length • 138 to 336 million shotgun reads per run; • 4.5 to 36 GB per 2 to 9.5 day run
Roche GS FLX • Sequencing by synthesis using chemiluminescence detection; in vitro sample prep; • 400 to 500 base reads • 1 million fragments of DNA in parallel on picotitre plate • 400 MB per 10 hr run
Life Technologies SOLiD 3 Plus • Sequencing by ligation; • 35 to 2 by 50 base reads; • 500 million to 1 billion shotgun reads per 2 slide run; • reference sequence required • 12 to 48 GB per 3.5 to 14 day 2-slide run.
Differentiating Next Gen technologies • Common steps: sheared DNA, library construction, clonal template amplification, sequencing
Illumina • Library construction • Clonal amplification via bridge amplification • Massively parallel sequencing-by-synthesis of DNA clusters
454 • Library construction • Clonal amplification with emulsion PCR and enrichment • Massively parallel pyrosequencing of bead bound DNA templates
SOLiD • Library construction • Clonal amplification with emulsion PCR and enrichment • Massively parallel ligation-based sequencing of bead bound DNA templates
Comparison of Next Gen Technologies
Library Construction • GS FLX: Fragment, Mate-Paired • Genome Analyzer: Fragment, Mate-Paired, Paired-End • SOLiD: Fragment, Mate-Paired
Sequencing Chemistry • GS FLX: Sequencing by Synthesis • Genome Analyzer: Sequencing by Synthesis • SOLiD: Sequencing by Ligation
DNA Support • GS FLX: 25-35 µm bead • Genome Analyzer: Flow cell surface • SOLiD: 1 µm bead
Amplification • GS FLX: Emulsion PCR • Genome Analyzer: Cluster amplification • SOLiD: Emulsion PCR
Sequencing Reaction Surface • GS FLX: High density well-plate • Genome Analyzer: 8-channel flow cell • SOLiD: Single slide imaged in panels
Illumina Genome Analyzer (GA)
Illumina Genome Analyzer (GA, GAII, GAIIx) • DNA libraries bound to an 8-channel flowcell • Sequencing by synthesis using reversible terminators • Detection of fluorescently tagged bases • Read lengths of up to 2 by 100 bases • Associated instruments: cBot, Cluster station, Paired end module
Illumina workflow • Library construction: 1-4 days depending on application • Cluster amplification (Cluster station): automated, approx 1 hour to set up, 5 hours run time • Sequencing (GAII and Paired End module): approx 2-9 days depending on read length and number of reads
Illumina GA - Cluster Generation • Flow cell, reagents and samples are loaded onto the cluster station • Aspirates samples and reagents into the flow cell • Automates the formation of amplified clonal clusters from single DNA molecules • Approx 5 hours run time, 1 hour hands-on time
Template amplification: no beads, no emulsions “Cluster generation” (walk-away)
Illumina GA - Cluster Generation (cont)
Illumina GA - Sequencing • Flowcell and reagents loaded onto Genome Analyzer. • 1-2 hour loading time • Walkaway automation • 2 - 9 days run time depending on read length
Sequencing By Synthesis (SBS) • Cycle 1: Add sequencing reagents; first base incorporated; remove unincorporated bases; detect signal; deblock and defluor • Cycle 2-n: Add sequencing reagents and repeat
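The cycle described above can be sketched as a toy simulation. This is an idealised illustration, not the instrument's base-calling software: it assumes exactly one reversible terminator is incorporated per cycle and that the signal is read without error. The function name and the example template are made up for the sketch.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the bases called while copying `template`, one base per cycle.

    `template` lists the template bases in the order the polymerase meets them.
    """
    read = []
    for position in range(min(cycles, len(template))):
        # Add sequencing reagents: only the terminator complementary to the
        # current template base is incorporated, and it blocks further extension.
        incorporated = COMPLEMENT[template[position]]
        # Remove unincorporated bases, then detect the signal: the dye colour
        # identifies which base was added.
        read.append(incorporated)
        # Deblock and defluor: the next loop iteration can now extend by one base.
    return "".join(read)


print(sequence_by_synthesis("TGCTAC", cycles=4))  # four cycles give four bases
```

Because each cycle adds a single blocked base, a run of identical bases is read one base at a time, and the read length equals the number of cycles.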
Sequencing by Synthesis - 4 Fluors • Synthesis of 2nd strand • Four nucleotides per cycle
Genome Analyzer IIx - Specifications
Roche Genome Sequencer GS FLX
Roche GS FLX (GS20, FLX, Titanium) • Uses pyrosequencing, a process that uses chemiluminescence for detection. • Detection of the light signal generated by luciferase when complementary bases are incorporated into the sequencing strand. • Libraries are bound to beads which are deposited onto a PicoTiterPlate (PTP) for sequencing. • Long read lengths up to 400 bp. • Run time approx 10 hours.
GS FLX Workflow
DNA Library Preparation (1 to 3 days) • Fragment DNA through nebulization or other means. • Attach adapters to DNA fragments. • Prepare single-stranded DNA library with adapters. • Recently, a rapid protocol has been introduced which eliminates the need for making the adapter-ligated DNA fragments single stranded.
emPCR and enrichment (1.5 days) • Water-in-oil emulsion • Fixes adapter-ligated fragments to small DNA-capture beads • Purification of amplified DNA colonies on beads
Sequencing (1 day) • DNA beads placed in pico-titre plate device. • Uses pyrosequencing chemistry • Detection using a CCD camera • 10 hr sequencing run
Emulsion PCR Mix PCR aqueous phase into a water-in-oil (w/o) emulsion and carry out emulsion PCR
Enrichment Beads with amplified DNA are purified using magnetic enrichment beads. Approximately 1/3 of beads have a product.
GS FLX: Bead deposition • Empty PicoTiter slide • DNA beads are loaded into the wells of the PTP. • DNA beads are packed into wells with surrounding beads and sequencing enzymes.
GS FLX: Instrument Loading • Genome is loaded into a PicoTiterPlate. • Load PicoTiterPlate into instrument. • Load reagents in a single rack. • Sequence entire genome at once, in real-time.
GS FLX: Sequencing-by-synthesis • Simultaneous sequencing of millions of DNA library molecules in a pico-titre plate. • Pyrophosphate signal generation upon complementary nucleotide incorporation; dark otherwise. • Each DNA capture bead contains millions of copies of a single clonal fragment.
GS FLX Sequencing-by-synthesis • Anneal primer. • Repeated dNTP flow sequence: G T C A • Process continues until the user-defined number of nucleotide flow cycles is completed.
Advantages/Disadvantages
• Advantages – Q20 read lengths of 400 bases (99% accuracy at the 400th base and higher for preceding bases) – Significantly higher throughput compared to Sanger sequencing – Does not rely on cloning efficiency – DNA libraries can be barcoded and separated during data analysis.
• Disadvantages – Difficulty getting through homopolymers – Each run is expensive and hence not ideal for re-sequencing applications compared to the Genome Analyzer and/or SOLiD
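The link between "Q20" and "99% accuracy" follows from the standard Phred definition, Q = -10 log10(P_error). A minimal sketch of the conversion in both directions, with function names chosen only for illustration:

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Probability that a base call with Phred quality `q` is correct."""
    return 1 - 10 ** (-q / 10)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred quality corresponding to a given per-base accuracy."""
    return -10 * math.log10(1 - accuracy)


for q in (10, 20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.1%} accuracy")
```

Each step of 10 in Q divides the error rate by ten, so Q20 corresponds to one expected error in 100 bases.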
Life Technologies SOLiD
Life Technologies SOLiD • Sequencing by Oligo Ligation and Detection • Libraries are bound to beads which are covalently attached to a glass support surface after emulsion PCR • Uses fluorescently labelled oligomers • Di-base encoding • Read lengths up to 2 by 50 bp • Up to 8 samples/slide
Emulsion PCR
Enrichment • Templated beads are captured on large polystyrene beads coated with P2 • Centrifuge in glycerol gradient • Supernatant: captured beads with templates • Pellet: beads with no template
Bead Deposition • 3’-end modification • Beads attached to glass surface in a random array
Slide deposition and installation
Sequencing by Ligation
Sequence Data Analysis • 4 dyes encode the 16 possible 2-base combinations • Each base is interrogated by two probes in two different ligation reactions • Dual interrogation helps discriminate errors (random or systematic) from true polymorphisms (SNPs) • Data is best analyzed in color space, which leverages the di-base advantages
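How four dyes can cover 16 dinucleotides, and why dual interrogation separates errors from SNPs, is easier to see in code. The sketch below uses the commonly described two-base scheme in which each base gets a 2-bit code and the color of a dinucleotide is the XOR of its two codes. The numeric color labels 0-3 and the function names are assumptions for illustration and are not tied to specific dyes.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode(sequence: str) -> list:
    """Color-space encoding: one color per overlapping pair of bases."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode(first_base: str, colors: list) -> str:
    """Recover the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


reference = "ATGGC"
variant = "ATCGC"  # hypothetical SNP at the third base
ref_colors = encode(reference)
var_colors = encode(variant)
changed = [i for i, (r, v) in enumerate(zip(ref_colors, var_colors)) if r != v]

print(ref_colors, var_colors)
print("Colors changed by the SNP:", changed)
print(decode("A", ref_colors))
```

A real single-base change alters two adjacent colors because that base takes part in two overlapping pairs. A single isolated color change is therefore more likely a measurement error than a polymorphism.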
SOLiD 3 Plus Specifications
Multiplexing
What is multiplexing? • Multiplexing: a method to analyze multiple biological samples in a single run. • Barcodes are unique sequence identifiers added to samples during library construction. • Once barcodes are added, multiple libraries can be pooled together for emulsion PCR/cluster generation and sequencing. • Sequence data is then analyzed and each read is traced back to its source sample.
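Tracing pooled reads back to their source can be sketched as a simple lookup on the barcode. This is a minimal illustration, not any vendor's demultiplexing tool: it assumes every read starts with its barcode, all barcodes have the same length, and only exact matches are accepted. The barcode sequences, sample names, and reads are invented for the example.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by sample using the barcode at the start of each read.

    `barcodes` maps barcode sequence -> sample name. Reads whose prefix
    matches no barcode are collected under "unassigned".
    """
    barcode_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_length], read[barcode_length:]
        sample = barcodes.get(tag, "unassigned")
        by_sample[sample].append(insert)  # keep only the sequence after the barcode
    return dict(by_sample)


barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical barcodes
pooled_reads = ["ACGTGGGTTT", "TGCAAAACCC", "ACGTCCCGGG", "GGGGTTTTAA"]
print(demultiplex(pooled_reads, barcodes))
```

Production pipelines usually tolerate a mismatch or two in the barcode. Exact matching keeps the sketch short.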
Multiplexing • Simpler workflow, ease-of-use • Lower running costs • Higher number of samples per run
Standard Protocol • 8 Samples, 8 Libraries, 8 Emulsions
Multiplexing Protocol • 16 Samples, 16 Libraries, 1 Emulsion • 128 Samples, 128 Libraries, 8 Emulsions
Why is Multiplexing Important? Next generation DNA sequencing generates massive amounts of sequence data. Currently, more data is generated per library than is required. To make use of this capacity, researchers multiplex several libraries into a single lane. However, generating the libraries is a bottleneck, and most researchers are not able to make enough of them. SPRIworks will relieve this bottleneck by enabling researchers to make more DNA libraries faster.
Summary of Next Generation Sequencing Platforms
Examples of Next Generation Sequencing Applications • De novo sequencing: “De novo sequencing is the initial sequencing that results in the primary genetic sequence of organisms. A detailed genetic analysis of an organism is possible only after de novo sequencing has been performed.” - Applied Biosystems website • Re-sequencing: looks for variation between strains or individuals • cDNA sequencing (fragmented cDNAs): sequencing of transcribed regions • Amplicon/PCR sequencing: could be targeted re-sequencing or possibly genome sequencing (e.g. viral).