31 Dec

A new genome sequencing method from Complete Genomics.

This DNA ligation-based approach to genome sequencing is new to me, although SOLiD has been using ligation for a while now. The following paper will appear in the first issue of Science in 2010.

Science. 2009 Nov 5. [Epub ahead of print]

Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays.

Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, Carnevali P, Nazarenko I, Nilsen GB, Yeung G, Dahl F, Fernandez A, Staker B, Pant KP, Baccash J, Borcherding AP, Brownley A, Cedeno R, Chen L, Chernikoff D, Cheung A, Chirita R, Curson B, Ebert JC, Hacker CR, Hartlage R, Hauser B, Huang S, Jiang Y, Karpinchyk V, Koenig M, Kong C, Landers T, Le C, Liu J, McBride CE, Morenzoni M, Morey RE, Mutch K, Perazich H, Perry K, Peters BA, Peterson J, Pethiyagoda CL, Pothuraju K, Richter C, Rosenbaum AM, Roy S, Shafto J, Sharanhovich U, Shannon KW, Sheppy CG, Sun M, Thakuria JV, Tran A, Vu D, Zaranek AW, Wu X, Drmanac S, Oliphant AR, Banyai WC, Martin B, Ballinger DG, Church GM, Reid CA.

Complete Genomics, Inc., 2071 Stierlin Court, Mountain View, CA 94043, USA.

Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation (cPAL) chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs (DNBs). We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high-accuracy, affordable cost of $4,400 for sequencing consumables and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies.

PMID: 19892942

Based on the company's online material and my reading of the paper, I am really amazed by a technology that assembles a genome sequence from these 10-nt base-pairing reads!


Four adapters are used in the figure, which can generate an 80-nt read. In the online supplemental data, the authors indicate that six adapters can be used for a 120-nt read per DNA nanoball.
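The read-length numbers above are consistent with up to 10 bases being read on each side of every adapter. That is my assumption from the figures quoted in this post, not something I have checked against the supplement, and the function name below is made up for illustration. A minimal sketch of the arithmetic:

```python
def read_length_per_dnb(num_adapters, bases_per_side=10):
    """Upper bound on bases read per DNA nanoball.

    Assumption (hypothetical model): every adapter exposes two anchor
    sites, one on each side, and up to `bases_per_side` bases are read
    outward from each site.
    """
    sides_per_adapter = 2
    return num_adapters * sides_per_adapter * bases_per_side


if __name__ == "__main__":
    for adapters in (4, 6):
        print(adapters, "adapters ->", read_length_per_dnb(adapters), "nt")
```

With four adapters this gives 80 nt and with six adapters 120 nt, matching the numbers quoted above.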


To generate the DNA nanoball for sequencing, the phage phi 29 DNA polymerase is used!

J Biol Chem. 1989 May 25;264(15):8935-40.
Highly efficient DNA synthesis by the phage phi 29 DNA polymerase. Symmetrical mode of DNA replication.

Blanco L, Bernad A, Lázaro JM, Martín G, Garmendia C, Salas M.

Centro de Biología Molecular (Consejo Superior de Investigaciones Científicas), Universidad Autónoma de Madrid, Spain.

The results presented in this paper indicate that the phi 29 DNA polymerase is the only enzyme required for efficient synthesis of full length phi 29 DNA with the phi 29 terminal protein, the initiation primer, as the only additional protein requirement. Analysis of phi 29 DNA polymerase activity in various in vitro DNA replication systems indicates that two main reasons are responsible for the efficiency of this minimal system: 1) the phi 29 DNA polymerase is highly processive in the absence of any accessory protein; 2) the polymerase itself is able to produce strand displacement coupled to the polymerization process. Using primed M13 DNA as template, the phi 29 DNA polymerase is able to synthesize DNA chains greater than 70 kilobase pairs. Furthermore, conditions that increase the stability of secondary structure in the template do not affect the processivity and strand displacement ability of the enzyme. Thus, the catalytic properties of the phi 29 DNA polymerase are appropriate for a phi 29 DNA replication mechanism involving two replication origins, strand displacement and continuous synthesis of both strands. The enzymology of phi 29 DNA replication would support a symmetrical model of DNA replication.

PMID: 2498321


The DNA nanoballs are positioned in a defined pattern for optimized imaging and analysis.


They use 10-nt base pairing instead of the previous 6- or 7-nt base pairing; I am not sure whether any trick is involved. The four-color sequencing is based on the following dye-labeled probes:
5’‐pNNNANNNNN‐Quasar 670
5’‐pNNNGNNNNN‐Quasar 570
5’‐pNNNCNNNNN‐Cal fluor red 610
5’‐pNNNTNNNNN‐fluorescein
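To make the four-color idea concrete, here is a small sketch of how a base call could be made from the four probes listed above: each probe is degenerate (N) everywhere except one fixed position, and the dye that lights up identifies the base at that position. The function names and the "brightest channel wins" rule are my own simplifications, not the company's actual base caller, and whether the reported base is the probe base or its complement depends on the strand convention.

```python
# Dye -> base carried by the probe at the interrogated position,
# taken from the four probes listed in this post.
DYE_TO_BASE = {
    "Quasar 670": "A",
    "Quasar 570": "G",
    "Cal fluor red 610": "C",
    "fluorescein": "T",
}


def probe_pattern(base, position, length=9):
    """Build a degenerate probe with `base` fixed at a 1-based position."""
    return "N" * (position - 1) + base + "N" * (length - position)


def call_base(intensities):
    """Return the probe base for the brightest dye (naive assumption)."""
    brightest_dye = max(intensities, key=intensities.get)
    return DYE_TO_BASE[brightest_dye]


if __name__ == "__main__":
    # The probes shown above fix position 4 of a 9-mer.
    print(probe_pattern("A", 4))
    # Hypothetical intensities for one nanoball spot in one cycle.
    spot = {"Quasar 670": 0.1, "Quasar 570": 0.9,
            "Cal fluor red 610": 0.2, "fluorescein": 0.05}
    print(call_base(spot))
```

Changing the `position` argument gives the probe pool for a different base, which is how one pool per position can read each base independently.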


I still do not quite understand the "mate gap", which is about 370 nt long.


Just like all other next-generation sequencing methods, it relies heavily on advanced math and enormous computing power.

The advantages of this unchained, patterned DNA nanoball sequencing method, in the authors' own words, are:

* Because the DNA Nanoball sequencing substrates are produced by rolling-circle replication (33) in a uniform-temperature, solution-phase reaction with high template concentrations (>20 billion per ml), this system avoids substantial selection bottlenecks and nonclonal DNBs. This circumvents the stochastic inefficiencies of approaches that require precise titration of template concentrations for in situ clonal amplification in emulsion (9, 14, 29) or bridge PCR (6, 19).

* Our patterned arrays include high-occupancy and high-density nanoarrays self-assembled on photolithography-patterned, solid-phase substrates through electrostatic adsorption of solution-phase DNBs and yield a high proportion of informative pixels. This results in several hundred reaction sites in the compact (~300-nm diameter) DNB that produce bright signals useful for rapid imaging of the sequences ... in high image efficiency and reduced reagent consumption that enable high sequencing throughput per instrument

* Both sequencing by synthesis (SBS) and sequencing by ligation (SBL) use chained reads, wherein the substrate for cycle N + 1 is dependent on the product of cycle N; consequently, errors may accumulate over multiple cycles and data quality may be affected by errors (especially incomplete extensions) occurring in previous cycles. The independent, unchained nature of cPAL avoids error accumulation and tolerates low-quality bases in otherwise high-quality reads, thereby decreasing reagent costs. The average sequencing consumables cost for these three genomes was under $4400

* This recursive process can be implemented in batches of 96 samples and extended by inserting additional adapters to read 120 bases or more per DNB. The current read length is comparable to other massively parallel sequencing technologies (6–12).
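The point about chained versus unchained reads in the list above can be illustrated with a toy model. Assume, purely for illustration, that in every cycle a fraction p of the template copies in a spot fails to extend. In a chained read those failures carry over and compound; in an unchained read every base is assayed from a fresh start. The functions and the value of p are hypothetical and not taken from the paper.

```python
def chained_in_phase_fraction(cycle, p_incomplete):
    """Fraction of copies still in phase at a given cycle when each
    cycle builds on the product of the previous one (toy model)."""
    return (1.0 - p_incomplete) ** cycle


def unchained_in_phase_fraction(cycle, p_incomplete):
    """Fraction of copies giving the correct signal when each base is
    assayed independently, so earlier failures do not carry over."""
    return 1.0 - p_incomplete


if __name__ == "__main__":
    p = 0.01  # hypothetical per-cycle failure rate
    for cycle in (1, 10, 35, 70):
        chained = chained_in_phase_fraction(cycle, p)
        unchained = unchained_in_phase_fraction(cycle, p)
        print(f"cycle {cycle:3d}: chained {chained:.3f}, unchained {unchained:.3f}")
```

In this model the chained signal decays geometrically with the cycle number while the unchained signal stays flat, which is the intuition behind "avoids error accumulation".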
