US20090075343A1 - Selection of dna adaptor orientation by nicking - Google Patents

Selection of DNA adaptor orientation by nicking

Publication number
US20090075343A1
Authority
US
United States
Prior art keywords
adaptor
library constructs
adaptors
nucleic acid
target nucleic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/934,695
Inventor
Andrew Sparks
Steven Huang
Radoje Drmanac
Arnold Oliphant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Complete Genomics Inc
Original Assignee
Complete Genomics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Complete Genomics Inc
Priority to US11/934,695
Assigned to COMPLETE GENOMICS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: OLIPHANT, ARNOLD, DRMANAC, RADOJE, HUANG, STEVEN, SPARKS, ANDREW
Publication of US20090075343A1
Priority to US12/573,697 (US8518640B2)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease

Definitions

  • Embodiments described and claimed herein address the foregoing and other situations by providing methods to provide repeated cycles of nucleic acid cleavage and ligation to insert multiple DNA adaptors into a population of circular target DNAs at defined positions and orientations with respect to one another.
  • the resulting multi-adaptor constructs are then used in massively-parallel nucleic acid sequencing techniques.
  • the described technology provides in one aspect a method for enriching for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising: obtaining target nucleic acids; ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site; ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; circularizing the second library constructs; nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; and subjecting the library constructs to circle dependant amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified linearly, thereby enriching for orientation of the second adaptor with respect to the first adaptor in the nucleic
  • the first library constructs are circularized between the ligating processes, and, in other aspects of the methods, the first library constructs are cut with a restriction endonuclease after being circularized.
  • the first adaptor is ligated to the target nucleic acid as two adaptor arms; and in some aspects, the second adaptor is ligated to the first library construct as two adaptor arms.
  • the first and second adaptors comprise Type IIs endonuclease recognition sites.
  • the strands are nicked with a nickase; in other aspects, uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase.
  • Additional aspects of the technology provide methods for enriching for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising: obtaining target nucleic acids; ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site and a Type IIs endonuclease recognition site; circularizing the first library constructs; cutting the first library constructs with a Type IIs endonuclease to produce linearized first library constructs; ligating a second adaptor to the linearized first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; circularizing the second library constructs; nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; and subjecting the library constructs to circle dependant amplification, wherein the strands with no
  • Yet other aspects of the technology provide methods for enriching or selecting for orientation of two or more adaptors with respect to one another in nucleic acid library constructs comprising: (a) obtaining target nucleic acids; (b) ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site; (c) ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; (d) circularizing the second library constructs; (e) nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; (f) subjecting the library constructs to circle dependant amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified linearly, thereby enriching
  • the first library constructs are circularized between the ligating processes. In other aspects of the methods, the first library constructs are cut with a restriction endonuclease after being circularized. In yet other aspects of the methods, the first adaptor is ligated to the target nucleic acid as two adaptor arms; and in some aspects, one or more of the second and subsequently-added adaptors are ligated as two adaptor arms. Also, in some aspects the first and second adaptors comprise Type IIs endonuclease recognition sites. In some aspects, the strands are nicked with a nickase; in other aspects, uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase.
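  • The selection principle in the aspects above can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the claimed method: each adaptor is reduced to the strand ("top" or "bottom") that carries its nickable site, and the function names `nicked_strands` and `amplification_mode` are hypothetical. A circular construct whose nickable sites all fall on one strand keeps the other strand intact, which the text describes as being amplified exponentially; a construct nicked on both strands is amplified only linearly.

```python
from itertools import product

STRANDS = ("top", "bottom")


def nicked_strands(nick_site_strands):
    """Return the set of strands that receive at least one nick."""
    return set(nick_site_strands)


def amplification_mode(nick_site_strands):
    """Classify a circular construct after the nicking step.

    nick_site_strands lists, for each adaptor, the strand carrying that
    adaptor's nickable site (an illustrative encoding of orientation).
    """
    intact = set(STRANDS) - nicked_strands(nick_site_strands)
    # An intact (un-nicked) strand remains a closed circular template.
    return "exponential" if intact else "linear"


# Enumerate the four possible placements for two adaptors.
for first, second in product(STRANDS, repeat=2):
    print(first, second, amplification_mode((first, second)))
```

  • In this encoding, the two same-strand placements are the ones enriched by amplification, and the same check extends unchanged to constructs with three or more adaptors.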
  • kits for selecting for desired orientations of multiple adaptors in library constructs comprising a first double-stranded adaptor comprising a first Type IIs restriction endonuclease recognition site and a nickable site; a second double-stranded adaptor comprising a first Type IIs restriction endonuclease recognition site and a nickable site; and primers complementary to each strand of the first and second adaptors.
  • FIG. 1 is a simplified flow diagram of an overall method for sequencing nucleic acids using the processes of the claimed invention.
  • FIG. 2 is a schematic representation of one aspect of a method for assembling adaptor/target nucleic acid library constructs.
  • FIG. 3 is a schematic illustration of a basic adaptor insertion process.
  • FIG. 4 is a schematic illustration of one aspect of a DNA array employing multi-adaptor nucleic acid library constructs.
  • FIG. 5 is a schematic illustration of the components that may be used in an exemplary sequencing-by-ligation technique.
  • FIG. 6 is a schematic illustration of an insertion of a second adaptor relative to a first adaptor in a nucleic acid library construct.
  • FIG. 7 is a schematic representation of components of an exemplary adaptor useful for selecting insertion orientation.
  • FIG. 8 is a schematic representation of adaptor insertion allowing subsequent circularization of the target/adaptor construct.
  • FIG. 9 is a schematic illustration of a process where a desired orientation of a second adaptor relative to a first adaptor is selected using nicking of the adaptor.
  • FIG. 10 is a schematic representation of a nicking process for selecting constructs where four adaptors are inserted into a target nucleic acid in a desired orientation.
  • Adaptor refers to an engineered construct comprising “adaptor elements” where one or more adaptors may be interspersed within target nucleic acid in a library construct.
  • the adaptor elements or features included in any adaptor vary widely depending on the use of the adaptors, but typically include sites for restriction endonuclease recognition and/or cutting, sites for primer binding (for amplifying the library constructs) or anchor primer binding (for sequencing the target nucleic acids in the library constructs), nickable sites, and the like.
  • adaptors are engineered so as to comprise one or more of the following: 1) a length of about 20 to about 250 nucleotides, or about 40 to about 100 nucleotides, or less than about 60 nucleotides, or less than about 50 nucleotides; 2) features so as to be ligated to the target nucleic acid as two “arms”; 3) different and distinct anchor binding sites at the 5′ and the 3′ ends of the adaptor for use in sequencing of adjacent target nucleic acid; and 4) one or more restriction sites.
  • Amplicon means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides that are replicated from one or more starting sequences. Amplicons may be produced by a variety of amplification reactions, including but not limited to polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplification, circle dependant amplification and like reactions (see, e.g., U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159; 5,210,015; 6,174,670; 5,399,491; 6,287,824 and 5,854,033; and US Pub. No. 2006/0024711).
  • CDR: circle dependant replication
  • the primer(s) used may be of a random sequence (e.g., one or more random hexamers) or may have a specific sequence to select for amplification of a desired product. Without further modification of the end product, CDR often results in the creation of a linear construct having multiple copies of a strand of the circular template in tandem, i.e. a linear, single-stranded concatemer of multiple copies of a strand of the template.
  • CDA: circle dependant amplification
  • the primers used may be of a random sequence (e.g., random hexamers) or may have a specific sequence to select for amplification of a desired product. CDA results in the formation of a set of concatemeric double-stranded fragments.
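  • The tandem-repeat product of circle dependant replication can be sketched in a few lines. This is an illustration only, assuming a DNA alphabet of A, C, G and T and a primer that starts copying at the first base of the string used to represent the circle; `reverse_complement` and `rolling_circle_product` are hypothetical helper names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle, copies):
    """Return a linear single-stranded concatemer copied from a circle.

    Each pass of the polymerase around the circular template adds one
    more copy of the strand complementary to the template, so the
    product is `copies` tandem repeats of the reverse complement.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return reverse_complement(circle) * copies


print(rolling_circle_product("AACG", 3))
```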
  • “Complementary” or “substantially complementary” refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double-stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid.
  • Complementary nucleotides are, generally, A and T (or A and U), or C and G.
  • Two single-stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the other strand, usually at least about 90% to about 95%, and even about 98% to about 100%.
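  • The percentage criterion above can be made concrete with a short calculation. The sketch below is a simplification and rests on assumptions not in the text: the two strands are the same length, are aligned end to end in antiparallel orientation, and no insertions or deletions are considered. The names `percent_paired` and `substantially_complementary` are hypothetical, and the default threshold of 80% is the lower bound quoted above.

```python
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_paired(strand, other):
    """Percent of positions in `strand` that pair with antiparallel `other`.

    Both strands are written 5' to 3'; `other` is reversed so that
    position i of `strand` faces its partner. Ungapped comparison only.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    partner = other[::-1]
    matches = sum((a, b) in PAIRS for a, b in zip(strand, partner))
    return 100.0 * matches / len(strand)


def substantially_complementary(strand, other, threshold=80.0):
    """Apply the 'at least about 80%' criterion to an ungapped alignment."""
    return percent_paired(strand, other) >= threshold


print(percent_paired("ACGTACGTAC", "GTACGTACGT"))
```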
  • Duplex means at least two oligonucleotides or polynucleotides that are fully or partially complementary and which undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed.
  • “Annealing” and “hybridization” are used interchangeably to mean formation of a stable duplex.
  • Perfectly matched in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double-stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand.
  • a “mismatch” in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick basepairing.
  • Hybridization refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide.
  • the resulting (usually) double-stranded polynucleotide is a “hybrid” or “duplex.”
  • “Hybridization conditions” will typically include salt concentrations of less than about 1M, more usually less than about 500 mM and may be less than about 200 mM.
  • a “hybridization buffer” is a buffered salt solution such as 5× SSPE, or other such buffers known in the art.
  • Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., and more typically greater than about 30° C., and typically in excess of 37° C.
  • Hybridizations are usually performed under stringent conditions, i.e., conditions under which a probe will hybridize to its target subsequence but will not hybridize to other, non-complementary sequences.
  • Stringent conditions are sequence-dependent and are different in different circumstances. For example, longer fragments may require higher hybridization temperatures for specific hybridization than short fragments.
  • the combination of parameters is more important than the absolute measure of any one parameter alone.
  • Generally, stringent conditions are selected to be about 5° C. lower than the Tm for the specific sequence at a defined ionic strength and pH.
  • Exemplary stringent conditions include a salt concentration of at least 0.01M to no more than 1M sodium ion concentration (or other salt) at a pH of about 7.0 to about 8.3 and a temperature of at least 25° C.
  • Conditions of 5× SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA at pH 7.4) and a temperature of 30° C. are suitable for allele-specific probe hybridizations.
  • “Ligation” means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g., oligonucleotides and/or polynucleotides, in a template-driven reaction.
  • the nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically.
  • ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5′ carbon of a terminal nucleotide of one oligonucleotide and a 3′ carbon of another nucleotide.
  • Template driven ligation reactions are described in the following references: U.S. Pat. Nos. 4,883,750; 5,476,930; 5,593,826; and 5,871,921.
  • “Microarray” or “array” refers to a solid phase support having a surface, preferably but not exclusively a planar or substantially planar surface, which carries an array of sites containing nucleic acids such that each site of the array comprises identical copies of oligonucleotides or polynucleotides and is spatially defined and not overlapping with other member sites of the array; that is, the sites are spatially discrete.
  • the array or microarray can also comprise a non-planar interrogatable structure with a surface such as a bead or a well.
  • the oligonucleotides or polynucleotides of the array may be covalently bound to the solid support, or may be non-covalently bound.
  • “Random array” or “random microarray” refers to a microarray where the identity of the oligonucleotides or polynucleotides is not discernable, at least initially, from their location but may be determined by a particular operation on the array, such as by sequencing, hybridizing decoding probes or the like. See, e.g., U.S. Pat. Nos. 6,396,995; 6,544,732; 6,401,267; and 7,070,927; WO publications WO 2006/073504 and 2005/082098; and US Pub Nos. 2007/0207482 and 2007/0087362.
  • Nucleic acid refers generally to at least two nucleotides covalently linked together.
  • a nucleic acid generally will contain phosphodiester bonds, although in some cases nucleic acid analogs may be included that have alternative backbones such as phosphoramidite, phosphorodithioate, or methylphosphoroamidite linkages; or peptide nucleic acid backbones and linkages.
  • Other analog nucleic acids include those with bicyclic structures including locked nucleic acids, positive backbones, non-ionic backbones and non-ribose backbones. Modifications of the ribose-phosphate backbone may be done to increase the stability of the molecules; for example, PNA:DNA hybrids can exhibit higher stability in some environments.
  • Primer means an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3′ end along the template so that an extended duplex is formed.
  • the sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Primers usually are extended by a DNA polymerase.
  • Probe means generally an oligonucleotide that is complementary to an oligonucleotide or target nucleic acid under investigation. Probes used in certain aspects of the claimed invention are labeled in a way that permits detection, e.g., with a fluorescent or other optically-discernable tag.
  • Sequence determination in reference to a target nucleic acid means determination of information relating to the sequence of nucleotides in the target nucleic acid. Such information may include the identification or determination of partial as well as full sequence information of the target nucleic acid. The sequence information may be determined with varying degrees of statistical reliability or confidence. In one aspect, the term includes the determination of the identity and ordering of a plurality of contiguous nucleotides in a target nucleic acid starting from different nucleotides in the target nucleic acid.
  • Target nucleic acid means a nucleic acid from a gene, a regulatory element, genomic DNA, cDNA, RNAs including mRNAs, rRNAs, siRNAs, miRNAs and the like and fragments thereof.
  • a target nucleic acid may be a nucleic acid from a sample, or a secondary nucleic acid such as a product of an amplification reaction.
  • Tm is commonly defined as the temperature at which half of the population of double-stranded nucleic acid molecules becomes dissociated into single strands.
  • the equation for calculating the Tm of nucleic acids is well known in the art.
  • Tm = 81.5 + 16.6(log10[Na+]) + 0.41(%[G+C]) − 675/n − 1.0m, when a nucleic acid is in aqueous solution having cation concentrations of 0.5 M or less, the (G+C) content is between 30% and 70%, n is the number of bases, and m is the percentage of base pair mismatches (see e.g., Sambrook J et al., “Molecular Cloning, A Laboratory Manual”, 3rd Edition, Cold Spring Harbor Laboratory Press (2001)).
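  • As a worked example, the estimate Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) − 675/n − 1.0m can be evaluated directly. The sketch below is illustrative only; the function names, the argument units (molar sodium, percentages from 0 to 100) and the range checks are assumptions chosen to mirror the validity conditions stated with the equation. The second helper applies the rule of thumb, given earlier, that stringent conditions sit about 5° C. below the Tm.

```python
import math


def melting_temperature(na_molar, gc_percent, n_bases, mismatch_percent=0.0):
    """Estimate Tm in degrees Celsius from the salt-adjusted GC formula.

    na_molar         sodium ion concentration in mol/L (0 < value <= 0.5)
    gc_percent       G+C content as a percentage (30 to 70)
    n_bases          number of bases in the duplex
    mismatch_percent percentage of mismatched base pairs
    """
    if not 0.0 < na_molar <= 0.5:
        raise ValueError("formula assumes a cation concentration of 0.5 M or less")
    if not 30.0 <= gc_percent <= 70.0:
        raise ValueError("formula assumes a G+C content between 30% and 70%")
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 675.0 / n_bases
            - 1.0 * mismatch_percent)


def stringent_temperature(tm, offset=5.0):
    """Return a hybridization temperature about `offset` degrees below Tm."""
    return tm - offset


tm = melting_temperature(na_molar=0.1, gc_percent=50.0, n_bases=25)
print(round(tm, 1), round(stringent_temperature(tm), 1))
```

  • For a 25-base duplex with 50% G+C in 0.1 M sodium and no mismatches, the terms are 81.5 − 16.6 + 20.5 − 27.0, giving an estimated Tm of about 58.4° C. and a stringent hybridization temperature of about 53.4° C.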
  • FIG. 1 is a simplified flow diagram of an overall method 100 for sequencing nucleic acids using the processes of the claimed invention.
  • creation of a target molecule for sequencing is accomplished by extracting and preparing target nucleic acids 110 (e.g., fractionating, shearing or cleaving), constructing a library with the sheared target nucleic acids using engineered adaptors 120 , replicating the library constructs to form amplified library constructs (e.g., forming DNA nanoballs through circle dependant replication) 130 , and sequencing the amplified target nucleic acids.
  • the target nucleic acids for some aspects are derived from genomic DNA.
  • 10-100 genome-equivalents of DNA preferably are obtained to ensure that the population of target DNA fragments covers the entire genome.
  • the target genomic DNA is isolated using conventional techniques, for example as disclosed in Sambrook and Russell, Molecular Cloning: A Laboratory Manual .
  • the target genomic DNA is then fragmented to a desired size by conventional techniques including enzymatic digestion, shearing, or sonication.
  • Fragment size of the target nucleic acid can vary depending on the source target nucleic acid and the library construction methods used, but typically ranges from 50 nucleotides in length to over 11 kb in length, including 200-700 nucleotides in length, 400-600 nucleotides in length, 450-550 nucleotides in length, or 4 kb to over 10 kb in length.
  • the target nucleic acids comprise mRNAs or cDNAs.
  • the target DNA is created using isolated transcripts from a biological sample. Isolated mRNA may be reverse transcribed into cDNAs using conventional techniques, again as described in Genome Analysis: A Laboratory Manual Series (Vols. I-IV) or Molecular Cloning: A Laboratory Manual.
  • a library is constructed using the fragmented target nucleic acids.
  • Library construction will be discussed in detail infra; briefly, the library constructs are assembled by inserting adaptor molecules at a multiplicity of sites throughout each target nucleic acid fragment.
  • the interspersed adaptors permit acquisition of sequence information from multiple sites in the target nucleic acid consecutively or simultaneously.
  • the interspersed adaptors are inserted at intervals within a contiguous region of the target nucleic acids at predetermined positions. The intervals may or may not be equal.
  • the spacing between interspersed adaptors may be known only to an accuracy of one to a few nucleotides.
  • the spacing of the adaptors is known, and the orientation of each adaptor relative to other adaptors in the library constructs is known.
  • the library constructs are amplified and, in some aspects, are replicated to form DNA nanoballs.
  • the library constructs (the target nucleic acids with the interspersed adaptors) are replicated in such a way so as to form single-stranded DNA concatemers of each library construct, each concatemer comprising multiple linear tandem repeats of the library construct.
  • Single-stranded DNA concatemers under conventional conditions form random coils in a manner known in the art (e.g., see Edvinsson (2002), “On the size and shape of polymers and polymer complexes,” Dissertation 696 (University of Uppsala)).
  • Concatemeric DNA randomly coiled forms nanoballs (also termed “DNA nanoballs”, “nucleic acid nanoballs” or “DNBs”).
  • the DNBs formed in process 130 are sequenced.
  • the DNBs are randomly arrayed on a planar surface.
  • the DNBs may be covalently or noncovalently attached to the planar surface.
  • the target nucleic acids within each DNB are then sequenced by iterative interrogation using sequencing-by-synthesis techniques and/or sequencing-by-ligation techniques.
  • FIG. 2 is a schematic representation of one aspect of a method for assembling adaptor/target nucleic acid library constructs.
  • genomic DNA 202 is isolated and fragmented 203 into target nucleic acids 204 using standard techniques as described briefly above.
  • the fragmented target nucleic acids 204 are then repaired so that the 5′ and 3′ ends of each strand are flush or blunt ended.
  • each fragment is “A-tailed” with a single A added to the 3′ end of each strand of the fragmented target nucleic acids using a non-proofreading polymerase 205 .
  • a first and second arm of a first adaptor is then ligated to each target nucleic acid, producing a target nucleic acid with adaptor arms ligated to each end 206 .
  • the adaptor arms are “T tailed” to be complementary to the A tailing of the target nucleic acid, facilitating ligation of the adaptor arms in a known orientation.
  • the invention provides adaptor ligation to each fragment in a manner that minimizes the creation of intra- or intermolecular ligation artefacts. This is desirable because random fragments of target nucleic acids forming ligation artefacts with one another create false proximal genomic relationships between target nucleic acid fragments, complicating the sequence alignment process.
  • the aspect shown in FIG. 2 shows step 205 as a combination of blunt end repair and an A tail addition. This preferred aspect using both A tailing and T tailing to attach the adaptor to the DNA fragments prevents random intra- or inter-molecular associations of adaptors and fragments, which reduces artefacts that would be created from self-ligation, adaptor-adaptor or fragment-fragment ligation.
  • various other methods can be implemented to prevent formation of ligation artefacts of the target nucleic acids and the adaptors, as well as orient the adaptor arms with respect to the target nucleic acids, including using complementary NN overhangs in the target nucleic acids and the adaptor arms, or employing blunt end ligation with an appropriate target nucleic acid to adaptor ratio to optimize single fragment nucleic acid/adaptor arm ligation ratios.
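  • The reason A tailing and T tailing suppress ligation artefacts can be shown with a small compatibility check. This is a sketch under a deliberately narrow assumption: every end is reduced to its single-base 3′ overhang, and two ends are treated as ligatable only when those overhangs are complementary. The names `can_ligate`, `FRAGMENT_END` and `ADAPTOR_ARM_END` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def can_ligate(overhang_a, overhang_b):
    """Two single-base 3' overhangs are compatible only if complementary."""
    return COMPLEMENT[overhang_a] == overhang_b


FRAGMENT_END = "A"     # A-tailed target nucleic acid
ADAPTOR_ARM_END = "T"  # T-tailed adaptor arm

pairings = {
    "fragment-adaptor": can_ligate(FRAGMENT_END, ADAPTOR_ARM_END),
    "fragment-fragment": can_ligate(FRAGMENT_END, FRAGMENT_END),
    "adaptor-adaptor": can_ligate(ADAPTOR_ARM_END, ADAPTOR_ARM_END),
}

for name, ok in pairings.items():
    print(name, "compatible" if ok else "blocked")
```

  • Under this model only the fragment-to-adaptor pairing is compatible, which matches the stated purpose of reducing self-ligation, adaptor-adaptor and fragment-fragment products.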
  • the linear target nucleic acid 206 is circularized, a process that will be discussed in detail infra, resulting in a circular library construct 208 comprising target nucleic acid and an adaptor. Note that the circularization process results in bringing the first and second arms of the first adaptor together to form a contiguous adaptor sequence in the circular construct.
  • the circular construct is amplified, such as by circle dependant amplification, using, e.g., random hexamers and φ29 or helicase.
  • target nucleic acid/adaptor structure 206 may remain linear, and amplification may be accomplished by PCR primed from sites in the adaptor arms.
  • the amplification 209 preferably is a controlled amplification process and uses a high fidelity, proof-reading polymerase, resulting in a sequence-accurate library of amplified target nucleic acid/adaptor constructs where there is sufficient representation of the genome or one or more portions of the genome being queried.
  • the first adaptor comprises two Type IIs restriction endonuclease recognition sites, positioned such that the target nucleic acid outside the recognition sequence (and outside of the adaptor) is cut 210 .
  • the arrows around structure 210 indicate the recognition sites and the site of restriction.
  • EcoP15, a Type IIs restriction endonuclease, is used to cut the library constructs. Note that in the aspect shown in FIG. 2 , a portion of each library construct mapping to a portion of the target nucleic acid will be cut away from the construct (the portion of the target nucleic acid between the arrow heads in structure 210 ).
  • the linear construct 212 , like the fragmented target nucleic acid 204 , is treated by conventional methods to become blunt or flush ended; A tails comprising a single A are added to the 3′ ends of the linear library construct using a non-proofreading polymerase; and first and second arms of a second adaptor are ligated to the ends of the linearized library construct by A-T tailing and ligation 213 .
  • the resulting library construct comprises the structure seen at 214 , with the first adaptor interior to the ends of the linear construct, with target nucleic acid flanked on one end by the first adaptor, and on the other end by either the first or second arm of the second adaptor.
  • the double-stranded linear library constructs are treated so as to become single-stranded 216 , and the single-stranded library constructs 216 are then ligated 217 to form single-stranded circles of target nucleic acid interspersed with two adaptors 218 .
  • the ligation/circularization process of 217 is performed under conditions that optimize intramolecular ligation.
  • the single-stranded, circularized library constructs 218 are amplified by circle dependant replication 219 to form DNA nanoballs 220 .
  • Circle dependant replication is performed, e.g., using specific primers where the amplification product displaces its own tail, producing linear, tandem single-stranded copies of [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] library constructs. As the tandem copies begin to multiply, the library constructs begin to coil and form secondary structures, ultimately forming DNA nanoballs.
  • Each library construct contains in some aspects between about ten and about 5000 copies, or from about 250 copies to about 2500 copies, of the [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] repeats, and preferably contains about 500 to about 1200 copies of the [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] repeats.
  • the resulting DNA nanoballs 220 are clonal populations of DNA in discrete structures, which can then be arrayed and sequenced (process not shown).
  • FIG. 3 is a simplified schematic illustration showing the cyclical nature of the basic adaptor insertion process 300 where two, three, four, five or more adaptors can be inserted into a target nucleic acid.
  • a fragmented target nucleic acid is shown at 302 .
  • Process 303 provides adaptor arm to target nucleic acid ligation (as was described with some detail in the discussion of the aspect shown in FIG. 2 ), resulting in a linear target nucleic acid with first and second adaptor arms of a first adaptor ligated onto its ends 304 .
  • the adaptor arms are then ligated to one another in an intramolecular reaction that results in a circularization of the target nucleic acid/adaptor library construct 306 .
  • the library construct is then amplified 307 resulting in a population comprising a plurality of copies of each target nucleic acid/adaptor library construct 308 .
  • These library constructs 308 are then cleaved 309 (for example, by restriction with a Type IIs restriction endonuclease recognizing one or more sites in the adaptor and cutting in the target nucleic acid sequence), and the cycle continues to add second, third, fourth or more adaptors.
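  • The cyclical character of the insertion process can be sketched as a loop over ligation, circularization and cleavage. The model below is a toy with several assumptions that go beyond the text: a circle is stored as a list of segments whose last element is the newest adaptor; cleavage is a single cut a fixed number of bases (`reach`) into the target segment that follows that adaptor, so no target sequence is lost (unlike the two-site cut of FIG. 2); and the amplification step is omitted because it does not change the structure. The names `circularize`, `cleave` and `insert_adaptors` are hypothetical.

```python
def circularize(linear, adaptor):
    """Ligate adaptor arms to both ends and join them to close the circle.

    The returned list is read circularly; its last element is the adaptor.
    """
    return linear + [adaptor]


def cleave(circle, reach):
    """Cut the target `reach` bases downstream of the newest adaptor.

    The first list element is the target segment that follows the newest
    adaptor around the circle; the cut opens the circle inside it.
    """
    first, rest = circle[0], circle[1:]
    if reach <= 0 or reach >= len(first):
        raise ValueError("reach must fall inside the downstream target segment")
    # Linear molecule: rest of the segment ... newest adaptor, first `reach` bases.
    return [first[reach:]] + rest + [first[:reach]]


def insert_adaptors(target, n_adaptors, reach):
    """Return the circular construct after n_adaptors insertion cycles."""
    if n_adaptors < 1:
        raise ValueError("n_adaptors must be at least 1")
    linear = [target]
    for i in range(1, n_adaptors + 1):
        circle = circularize(linear, f"Ad{i}")
        if i < n_adaptors:
            linear = cleave(circle, reach)
    return circle


print(insert_adaptors("AAACCCGGGTTT", 3, 3))
```

  • Reading the result circularly, the target bases stay in their original order while each new adaptor lands a fixed distance from the previous one, which is the property of defined positions that the cycle is meant to provide.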
  • FIG. 4 is a schematic illustration of one aspect of a DNA array 400 employing multi-adaptor nucleic acid library constructs.
  • the multi-adaptor nucleic acid library constructs in the form of DNA nanoballs (DNBs) are seen at 402 .
  • DNBs are arrayed on a planar matrix 404 having discrete sites 406 .
  • the DNBs 402 may be fixed to the discrete sites by a variety of techniques, including covalent attachment and non-covalent attachment.
  • the surface of the matrix 406 may comprise attached capture oligonucleotides that form complexes, e.g., double-stranded duplexes, with a segment of an adaptor component of the DNB.
  • capture oligonucleotides may comprise oligonucleotide clamps, or like structures, that form triplexes with adaptor oligonucleotides (see, e.g., U.S. Pat. No. 5,473,060).
  • the surface of the array matrix 406 may have reactive functionalities that react with complementary functionalities on the DNBs to form a covalent linkage (see, e.g., Beaucage (2001), Current Medicinal Chemistry 8:1213-1244). Once the DNBs are arrayed, the adaptors interspersed in the target nucleic acids are used to acquire sequence information of the target nucleic acids.
  • a variety of sequencing methodologies may be used with multi-adaptor nucleic acid library constructs, including but not limited to hybridization methods as disclosed in U.S. Pat. Nos. 6,864,052; 6,309,824; 6,401,267; sequencing-by-synthesis methods as disclosed in U.S. Pat. Nos. 6,210,891; 6,828,100, 6,833,246; 6,911,345; Margulies, et al. (2005), Nature 437:376-380 and Ronaghi, et al. (1996), Anal. Biochem. 242:84-89; and ligation-based methods as disclosed in U.S. Pat. No. 6,306,597; and Shendure et al. (2005) Science 309:1728-1739, all of which are incorporated by reference in their entirety.
  • cPAL: combinatorial probe-anchor ligation reaction.
  • cPAL comprises cycling of the following steps: First, an anchor is hybridized to a first adaptor in the DNBs (typically immediately at the 5′ or 3′ end of one of the adaptors). Enzymatic ligation reactions are then performed with the anchor to a fully degenerate probe population of, e.g., 8-mer probes that are labeled, e.g., with fluorescent dyes.
  • Probes may have a length, e.g., about 6-20 bases, or, preferably, about 7-12 bases.
  • the population of 8-mer probes that is used is structured such that the identity of one or more of its positions is correlated with the identity of the fluorophore attached to that 8-mer probe.
  • a set of fluorophore-labeled probes for identifying a base immediately adjacent to an interspersed adaptor may have the following structure: 3′-F1-NNNNNNAp, 3′-F2-NNNNNNGp, 3′-F3-NNNNNNCp and 3′-F4-NNNNNNTp (where “p” is a phosphate available for ligation).
  • a set of fluorophore-labeled 7-mer probes for identifying a base three bases into a target nucleic acid from an interspersed adaptor may have the following structure: 3′-F1-NNNNANNp, 3′-F2-NNNNGNNp, 3′-F3-NNNNCNNp and 3′-F4-NNNNTNNp.
  • the fluorescent signal provides the identity of that base.
  • the anchor:8-mer probe complexes are stripped and a new cycle is begun.
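The probe sets described above follow a regular pattern: one defined base at the query position, degenerate bases (N) elsewhere, and one fluorophore per defined base. A minimal sketch that generates such a set is shown here. The function name `probe_set` and the convention of counting the query position from the ligatable (phosphate) end are assumptions for illustration; the F1 to F4 labels follow the example structures in the text.

```python
# Fluorophore label assigned to each defined base, as in the example sets.
FLUOROPHORES = {"A": "F1", "G": "F2", "C": "F3", "T": "F4"}

def probe_set(length, query_pos):
    """Build the four probe patterns for one query position.

    query_pos counts from the ligatable (phosphate) end of the probe,
    so 1 means the base immediately adjacent to the anchor.
    """
    if not 1 <= query_pos <= length:
        raise ValueError("query_pos must lie within the probe")
    probes = {}
    for base, dye in FLUOROPHORES.items():
        pattern = "N" * (length - query_pos) + base + "N" * (query_pos - 1)
        probes[dye] = f"3'-{dye}-{pattern}p"
    return probes

for probe in probe_set(7, 3).values():
    print(probe)
```

Changing only `query_pos` between cycles corresponds to moving the interrogated base further into the target nucleic acid while the anchor stays the same.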
  • With T4 DNA ligase, accurate sequence information can be obtained as far as six bases or more from the ligation junction, allowing access to at least 12 bp per adaptor (six bases from both the 5′ and 3′ ends), for a total of 48 bp per 4-adaptor DNB, 60 bp per 5-adaptor DNB and so on.
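The read-length arithmetic stated here (six bases from each end of each adaptor) can be written out as a short calculation. The function name `bases_per_dnb` is illustrative only.

```python
def bases_per_dnb(n_adaptors, bases_per_side=6):
    """Bases readable per DNB when each adaptor is read from both ends."""
    return n_adaptors * 2 * bases_per_side

for n in (1, 4, 5):
    print(n, "adaptor(s):", bases_per_dnb(n), "bp")
```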
  • FIG. 5 is a schematic illustration of the components that may be used in an exemplary sequencing-by-ligation technique.
  • a construct 500 is shown with a stretch of target nucleic acid to be analyzed interspersed with three adaptors, with the 5′ end of the stretch shown at 502 and the 3′ end shown at 504 .
  • the target nucleic acid portions are shown at 506 and 508 , with adaptor 1 shown at 501 , adaptor 2 shown at 503 and adaptor 3 shown at 505 .
  • anchor A 1 ( 510 ), which binds to the 3′ end of adaptor 1 ( 501 ) and is used to sequence the 5′ end of target nucleic acid 506 ;
  • anchor A 2 ( 512 ), which binds to the 5′ end of adaptor 2 ( 503 ) and is used to sequence the 3′ end of target nucleic acid 506 ;
  • anchor A 3 ( 514 ), which binds to the 3′ end of adaptor 2 ( 503 ) and is used to sequence the 5′ end of target nucleic acid 508 ;
  • anchor A 4 ( 516 ), which binds to the 5′ end of adaptor 3 ( 505 ) and is used to sequence the 3′ end of target nucleic acid 508 .
  • the 8-mer probes are structured differently. Specifically, a single position within each 8-mer probe is correlated with the identity of the fluorophore with which it is labeled. Additionally, the fluorophore molecule is attached to the opposite end of the 8-mer probe relative to the end targeted to the ligation junction. For example, in the graphic shown here, the anchor 530 is hybridized such that its 3′ end is adjacent to the target nucleic acid. To query a position five bases into the target nucleic acid, a population of degenerate 8-mer probes shown here at 518 may be used. The query position is shown at 532 .
  • this correlates with the fifth nucleic acid from the 5′ end of the 8-mer probe, which is the end of the 8-mer probe that will ligate to the anchor.
  • the 8-mer probes are individually labeled with one of four fluorophores, where Cy5 is correlated with A ( 522 ), Cy3 is correlated with G ( 524 ), Texas Red is correlated with C ( 526 ), and FITC is correlated with T ( 528 ).
  • cPAL or other sequencing-by-ligation approaches may be selected depending on various factors such as the volume of sequencing desired, the type of labels employed, the number of different adaptors used within each library construct, the number of bases being queried per cycle, how the DNBs are attached to the surface of the array, the desired speed of sequencing operations, signal detection approaches and the like.
  • four fluorophores were used and a single base was queried per cycle. It should, however, be recognized that eight or sixteen fluorophores or more may be used per cycle, increasing the number of bases that can be identified during any one cycle.
  • The degenerate probes (in FIG. 5, 8-mer probes) can be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, chemiluminescent moieties, and the like.
  • Many comprehensive reviews of methodologies for labeling DNA and constructing DNA adaptors provide guidance applicable to constructing oligonucleotide probes of the present invention. Such reviews include Kricka (2002), Ann. Clin. Biochem., 39: 114-129; and Haugland (2006), Handbook of Fluorescent Probes and Research Chemicals, 10th Ed. (Invitrogen/Molecular Probes, Inc., Eugene); Keller and Manak (1993), DNA Probes, 2nd Ed. (Stockton Press, New York, 1993); and Eckstein (1991), Ed., Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford); and the like.
  • one or more fluorescent dyes are used as labels for the oligonucleotide probes. Labeling can also be carried out with quantum dots, as disclosed in the following patents and patent publications, incorporated herein by reference: U.S. Pat. Nos. 6,322,901; 6,576,291; 6,423,551; 6,251,303; 6,319,426; 6,426,513; 6,444,143; 5,990,479; 6,207,392; 2002/0045045; 2003/0017264; and the like.
  • fluorescent nucleotide analogues readily incorporated into the degenerate probes include, for example, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red, the Cy fluorophores, the Alexa Fluor® fluorophores, the BODIPY® fluorophores and the like. FRET tandem fluorophores may also be used.
  • suitable labels for detection oligonucleotides may include fluorescein (FAM), digoxigenin, dinitrophenol (DNP), dansyl, biotin, bromodeoxyuridine (BrdU), hexahistidine (6×His), phospho-amino acids (e.g. P-tyr, P-ser, P-thr) or any other suitable label.
  • Image acquisition may be performed by methods known in the art, such as use of the commercial imaging package Metamorph.
  • Data extraction may be performed by a series of binaries written in, e.g., C/C++, and base-calling and read-mapping may be performed by a series of Matlab and Perl scripts.
  • For each base in a target nucleic acid to be queried (for example, for 12 bases, reading 6 bases in from both the 5′ and 3′ ends of each target nucleic acid portion of each DNB), a hybridization reaction, a ligation reaction, imaging and a primer stripping reaction are performed.
  • Each field of view (“frame”) is imaged with four different wavelengths corresponding to the four fluorescent, e.g., 8-mers used. All images from each cycle are saved in a cycle directory, where the number of images is 4× the number of frames (for example, if a four-fluorophore technique is employed). Cycle image data may then be saved into a directory structure organized for downstream processing.
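As a worked example of the image bookkeeping just described, the sketch below counts images per cycle and per run. The frame and cycle counts used in the example are hypothetical, and the function names are illustrative.

```python
def images_per_cycle(frames, channels=4):
    """One image per frame per wavelength (four for a four-fluorophore scheme)."""
    return frames * channels

def images_per_run(frames, cycles, channels=4):
    """Total images when every queried base gets its own imaging cycle."""
    return images_per_cycle(frames, channels) * cycles

# Hypothetical run: 100 frames, 12 queried bases (12 cycles).
print(images_per_cycle(100))
print(images_per_run(100, 12))
```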
  • Data extraction typically requires two types of image data: bright field images to demarcate the positions of all DNBs in the array; and sets of fluorescence images acquired during each sequencing cycle.
  • the data extraction software identifies all objects with the brightfield images, then for each such object, computes an average fluorescence value for each sequencing cycle. For any given cycle, there are four data-points, corresponding to the four images taken at different wavelengths to query whether that base is an A, G, C or T. These raw base-calls are consolidated, yielding a discontinuous sequencing read for each DNB. The next task is to match these sequencing reads against a reference genome.
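The base-calling step described above can be sketched as follows, assuming the simplest possible rule: for each cycle, call the base whose channel has the highest average fluorescence. The rule, the function names and the intensity values are assumptions for illustration; the dye-to-base assignment is the one given in the discussion of FIG. 5. A production pipeline would also normalize channels and filter low-confidence calls.

```python
# Dye-to-base assignment taken from the FIG. 5 example.
CHANNEL_TO_BASE = {"Cy5": "A", "Cy3": "G", "TexasRed": "C", "FITC": "T"}

def call_base(intensities):
    """Pick the base whose channel shows the highest average fluorescence."""
    channel = max(intensities, key=intensities.get)
    return CHANNEL_TO_BASE[channel]

def call_read(cycles):
    """Consolidate per-cycle calls into one read for a single DNB."""
    return "".join(call_base(cycle) for cycle in cycles)

# Hypothetical averaged intensities for one DNB over two cycles.
cycles = [
    {"Cy5": 900.0, "Cy3": 110.0, "TexasRed": 80.0, "FITC": 60.0},
    {"Cy5": 70.0, "Cy3": 95.0, "TexasRed": 85.0, "FITC": 840.0},
]
print(call_read(cycles))
```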
  • a reference table may be compiled using existing sequencing data on the organism of choice.
  • human genome data can be accessed through the National Center for Biotechnology Information at ftp.ncbi.nih.gov/refseq/release, or through the J. Craig Venter Institute at http://www.jcvi.org/researchhuref/. All or a subset of human genome information can be used to create a reference table for particular sequencing queries.
  • specific reference tables can be constructed from empirical data derived from specific populations, including genetic sequence from humans with specific ethnicities, geographic heritage, religious or culturally-defined populations, as the variation within the human genome may slant the reference data depending upon the origin of the information contained therein.
  • parallel sequencing of the target nucleic acids in the DNBs on a random array is performed by combinatorial sequencing-by-hybridization (cSBH), as disclosed by Drmanac in U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267.
  • first and second sets of oligonucleotide probes are provided, where each set has member probes that comprise oligonucleotides having every possible sequence for the defined length of probes in the set. For example, if a set contains probes of length six, then it contains 4096 (4^6) probes.
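The size of a complete probe set follows directly from enumerating every sequence of the given length over four bases. A minimal sketch, with an illustrative function name:

```python
from itertools import product

def all_probes(length):
    """Every possible sequence of the given length over A, C, G and T."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]

hexamers = all_probes(6)
print(len(hexamers))  # equals 4**6
```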
  • first and second sets of oligonucleotide probes comprise probes having selected nucleotide sequences designed to detect selected sets of target polynucleotides. Sequences are determined by hybridizing one probe or pool of probes, hybridizing a second probe or a second pool of probes, ligating probes that form perfectly matched duplexes on their target nucleic acids, identifying those probes that are ligated to obtain sequence information about the target nucleic acid sequence, repeating the steps until all the probes or pools of probes have been hybridized, and determining the nucleotide sequence of the target nucleic acid from the sequence information accumulated during the hybridization and identification processes.
  • parallel sequencing of the target nucleic acids in the DNBs is performed by sequencing-by-synthesis techniques as described in U.S. Pat. Nos. 6,210,891; 6,828,100, 6,833,246; 6,911,345; Margulies, et al. (2005), Nature 437:376-380 and Ronaghi, et al. (1996), Anal. Biochem. 242:84-89.
  • modified pyrosequencing in which nucleotide incorporation is detected by the release of an inorganic pyrophosphate and the generation of photons, is performed on the DNBs in the array using sequences in the adaptors for binding of the primers that are extended in the synthesis.
  • FIG. 6 is a schematic illustration of an insertion of a second adaptor relative to a first adaptor in a nucleic acid library construct.
  • Process 600 begins with circular library construct 602 , having an inserted first adaptor 610 .
  • First adaptor 610 has a specific orientation, with a rectangle identifying the “outer strand” of the first adaptor and a diamond identifying the “inner strand” of the first adaptor (Ad1 orientation 610 ).
  • a Type IIs restriction endonuclease site in the first adaptor 610 is indicated by the tail of arrow 601 , and the site of cutting is indicated by the arrow head.
  • Process 603 comprises cutting with the Type IIs restriction endonuclease, ligating first and second adaptor arms of a second adaptor, and recircularization.
  • the second adaptor can be inserted in two different ways relative to the first adaptor.
  • the oval is inserted into the circle's outer strand with the rectangle, and the bowtie is inserted into the circle's inner strand with the diamond (Ad2 orientation 620 ).
  • the oval is inserted into the circle's inner strand with the diamond and the bowtie is inserted into the circle's outer strand with the rectangle (Ad2 orientation 630 ).
  • FIG. 7 is a schematic representation of components of an exemplary adaptor useful for selecting insertion orientation.
  • a basic schematic of an adaptor is shown at 700 .
  • the adaptor comprises a 5′ arm 701 , a double-stranded region 702 and a 3′ arm 703 . Both the 5′ and the 3′ arms have a “T tail” 704 and a Type IIs restriction endonuclease site 705 (here, EcoP15).
  • the binding region 702 is the region where the two arms of the adaptor come together to be ligated in the circularization process ( 305 of FIG. 3 ).
  • Structure 710 is the 5′ arm of adaptor 700 .
  • T tail 704 and the EcoP15 site 705 are shown, as well as the 5′ arm 701 and the binding region 712 .
  • Structure 720 is the 3′ arm of adaptor 700 . Note the T tail 704 and the EcoP15 site 705 , as well as the 3′ arm 703 and the binding region 722 . In the 5′ arm, the binding region 712 is complementary to the binding region 722 of the 3′ arm.
  • Because the aspects of the claimed invention work optimally when library constructs are of a desired size and limited target nucleic acid sequence, it is preferred that throughout the library construction process the circularization reactions occur intramolecularly; that is, that the separate constructs of the library that are generated in the library construct assembly cycle (as shown in FIG. 3) do not ligate to one another. Also, it is preferred that only one set of adaptor arms for each adaptor used in the library construction process be included per target nucleic acid/adaptor construct. Thus, blocking oligos 717 and 727 are used to block the binding regions 712 and 722, respectively.
  • Blocker oligonucleotide 717 is complementary to binding sequence 716, and blocker oligonucleotide 727 is complementary to binding sequence 726.
  • the underlined bases are ddC and the bolded font bases are phosphorylated.
  • Blocker oligonucleotides 717 and 727 are not covalently bound to the adaptor arms, and can be “melted off” after ligation of the adaptor arms to the library construct and before circularization; further, the dideoxy nucleotide (here, ddC or alternatively a different non-ligatable nucleotide) prevents ligation of blocker to adaptor.
  • the blocker oligo-adaptor arm hybrids contain a gap of one or more bases between the adaptor arm and the blocker to reduce ligation of blocker to adaptor.
  • the blocker/binding region hybrids have Tm values of about 37° C. to enable easy melting of the blocker sequences prior to tail to tail ligation (circularization).
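For a rough sense of how the melting temperature of a short blocker/binding region hybrid depends on sequence, one common rule of thumb for short oligonucleotides (the Wallace rule) estimates Tm as 2° C. per A or T plus 4° C. per G or C. The sketch below applies that rule. It is an approximation that ignores salt and oligonucleotide concentration, it is not stated in this application, and the example sequence is hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C for a short oligo: 2 per A/T plus 4 per G/C."""
    oligo = oligo.upper()
    at = sum(oligo.count(base) for base in "AT")
    gc = sum(oligo.count(base) for base in "GC")
    if at + gc != len(oligo):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2 * at + 4 * gc

# Hypothetical 12-base blocker sequence.
print(wallace_tm("ATGCATGCATAT"))
```

Under this rule, shortening a blocker or raising its A/T content lowers the estimate, which is one way to keep blockers easy to melt off before circularization.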
  • Adaptor structure 730 is a schematic of the final adaptor, where N is an unspecified base, a numeral “1” specifies bases added to disrupt the palindrome (i.e., the EcoP15 site is flanked by A's to isolate the 6-base palindrome formed by the EcoP15 sites on the two arms of the adaptor), numeral “2” specifies bases that correspond to the ddC in the blocker oligonucleotides, numeral “3” specifies the EcoP15 site (CTGCTG) and numeral “4” specifies the T bases designated for TA ligation to the A tailed target nucleic acid.
  • the adaptor shown as 900 and detailed at 930 would, in some aspects, be appropriate for a first adaptor to be added in the construction of a library. Adaptors added subsequently would, in some aspects, have a single Type IIs restriction endonuclease site rather than two sites, and, in some aspects, the Type IIs restriction endonuclease sites in each adaptor would be different from one another.
  • Type IIs restriction endonucleases include, but are not limited to, Eco57M I, Mme I, Acu I, Bpm I, BceA I, Bbv I, BciV I, BpuE I, BseM II, BseR I, Bsg I, BsmF I, BtgZ I, Eci I, EcoP15 I, Fok I, Hga I, Hph I, Mbo II, Mnl I, SfaN I, TspDT I, TspDW I, Taq II, and the like.
  • the adaptors when assembled have a total length of about 50 nucleotides.
  • the adaptors are ligated to the target nucleic acid as two adaptor arms, where each adaptor arm comprises two adaptor oligos (the two complementary strands) and one blocker oligo.
  • the 5′ ends of all four adaptor arm oligos are phosphorylated to support ligation to the insert and tail-to-tail ligation of 5′ to 3′ adaptor arms.
  • the 5′ and 3′ adaptor arms have 3′ overhangs at the adaptor-target nucleic acid ligation junctions, to enable ligation to an A-tailed insert, and to suppress head-to-head adaptor arm ligation.
  • the 5′ and 3′ adaptor arms have Type IIs restriction endonuclease recognition sites oriented to enable cleavage of the adjacent target nucleic acid.
  • the adaptor construct shown in FIG. 7 would be, in some aspects, appropriate for a first adaptor to be inserted into a library construct because it contains two Type IIs restriction endonuclease recognition sites. Subsequently inserted adaptors would, in some aspects, comprise a single Type IIs restriction endonuclease recognition site oriented to enable cleavage of the adjacent target nucleic acid. Additionally, in preferred aspects, the 5′ and 3′ adaptor arms have anchor primer binding sites to enable sequencing of adjacent target nucleic acids. The anchor primer binding sites in some aspects overlap with the respective Type IIs restriction endonuclease recognition site(s); however, in other aspects the anchor primer binding sites do not overlap with the Type IIs restriction endonuclease recognition site(s).
  • FIG. 8 is a schematic representation of adaptor insertion allowing subsequent circularization of the target/adaptor construct.
  • the portion of the library construct seen in FIG. 8 is adaptor-centric, showing target nucleic acid at 802 and 812 , a 5′ adaptor arm at 804 , a 5′ adaptor arm blocking oligo at 806 , a 3′ adaptor arm at 810 , and a 3′ adaptor arm blocking oligo at 808 .
  • the T tail of the adaptor arms 804 and 810 and the A tail of the target nucleic acids 802 and 812 are indicated.
  • the adaptor arms are ligated to the target nucleic acid resulting in target nucleic acid/5′ adaptor arm structure 814 , and target nucleic acid/3′ adaptor arm structure 816 , with blocking oligos 806 and 808 still hybridized to the target nucleic acid/adaptor arm structures.
  • the blockers are removed by melting, and, in preferred aspects under dilute conditions to favor intramolecular ligation of process 805 .
  • the resulting structure is seen at 818 .
  • FIG. 8 illustrates the process of adaptor arm ligation featuring blocking oligos; however, other methods may be used to block ligation that creates concatemers of adaptor arms or of library constructs, including using adaptor arms that comprise a restriction site, preferably a site for a restriction endonuclease that cuts asymmetrically, such as Ava I.
  • the adaptor arms may comprise one or more uracil bases that can be selectively cleaved using uracil-DNA glycosylase enzyme (Krokan et al., 1997), with the resulting fragments then being melted off in the same way the blocker oligo is melted off.
  • FIG. 9 is a schematic illustration of a process where a desired orientation of a second adaptor relative to a first adaptor is selected using nicking of the first and second adaptors.
  • FIG. 9 shows a circularized library construct 902 comprising one adaptor having an orientation such that a rectangle is on the “outer strand” and a diamond is on the “inner strand.”
  • Further construct 902 comprises a Type IIs restriction endonuclease recognition site that cuts in the adjacent target nucleic acid sequence (noted by the arrow, where the arrowhead indicates the site of the cut). Once cut, the circular library construct is linearized.
  • first and second arms of a second adaptor are ligated onto the linear library construct and the first and second arms of the second adaptor are then ligated to provide an intact second adaptor and a circular library construct.
  • the second adaptor can be inserted in two different orientations relative to the first adaptor.
  • the oval of the second adaptor is on the “outer strand” with the rectangle of the first adaptor and the bowtie of the second adaptor is on the “inner strand” with the diamond of the first adaptor.
  • the oval of the second adaptor is on the “inner strand” with the diamond of the first adaptor and the bowtie of the second adaptor is on the “outer strand” with the rectangle of the first adaptor.
  • the diamond and the bowtie portions of the first and second adaptors, respectively, comprise a nickase enzyme digestion site or other nickable site.
  • Nickases are endonucleases that recognize a specific recognition sequence in double stranded DNA, and cut one strand at a specific location relative to said recognition sequence, thereby giving rise to single-stranded breaks in duplex DNA.
  • Nickases include but are not limited to Nb.BsrDI, Nb.BsmI, Nt.BbvCI, Nb.BbvCI, Nb.BtsI and Nt.BstNBI.
  • uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase.
  • the diamond and bowtie portions of the first and second adaptors comprise nickase digestion sites such that treatment of the desired 904 and undesired 906 library constructs with nickase ( 907 ) results in two nicks in the double-stranded library constructs.
  • In the desired orientation 904, the “outer strand” remains circular, and both nicks are sustained by the “inner strand,” resulting in two linear portions of nucleic acid.
  • In the undesired orientation 906, both strands are nicked once, resulting in a circular construct with both strands nicked.
  • the library constructs are subjected to circle dependent amplification (911), where only the intact, circular constructs that survived the nicking will be efficiently amplified.
  • library constructs having the desired orientation 904 are selected by nicking and subsequent amplification.
  • Library constructs in the undesired orientation 906 will not be amplified or will be amplified significantly less efficiently than the desired library constructs.
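The selection logic of FIG. 9 can be modeled by tracking which strand of the circular construct carries each adaptor's nickable site. The sketch below is a simplified bookkeeping model, not an enzymatic simulation: a strand is treated as an amplifiable circle only if it carries no nick. The function names and the "inner"/"outer" string labels are assumptions for illustration.

```python
def nicks_per_strand(nick_strands):
    """Count nicks on each strand.

    nick_strands lists, for each adaptor, the strand ("inner" or "outer")
    that carries that adaptor's nickable site.
    """
    counts = {"outer": 0, "inner": 0}
    for strand in nick_strands:
        counts[strand] += 1
    return counts

def intact_strands(nick_strands):
    """Strands left as closed circles after nicking (no nicks at all)."""
    counts = nicks_per_strand(nick_strands)
    return [strand for strand, n in counts.items() if n == 0]

desired = ["inner", "inner"]    # both nickable sites on the inner strand
undesired = ["inner", "outer"]  # second adaptor inserted the other way
print(intact_strands(desired))
print(intact_strands(undesired))
```

In this model the desired orientation keeps one closed circle as template, while the undesired orientation leaves none, which matches the outcome described for constructs 904 and 906.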
  • FIG. 10 is a schematic representation of a library construction process 1000 where four adaptors are inserted into a target nucleic acid in a desired orientation driven by nicking.
  • First genomic DNA 1002 is fragmented 1003 as described previously.
  • the resulting fragmented genomic DNA 1003 is then treated to polish the ends, A's are added to the 3′ ends of the genomic DNA fragments, and first and second arms of a first adaptor are ligated to the genomic DNA fragments in process 805 to produce library construct 1006 .
  • the first and second arms of the first adaptor are then ligated, resulting in a circularized library construct 1008 .
  • Circularized library construct 1008 is then subjected to circle dependent amplification 1009, which exponentially amplifies circularized library construct 1008.
  • the collection of amplified, circularized library constructs are then cut with EcoP15 1011 , where the first adaptor has two recognition sites (represented by the two arrows), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
  • the processes are then repeated with the resulting linearized library construct 1012 . That is, library construct 1012 is treated to polish the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a second adaptor are ligated to the library constructs in process 1013 to produce library construct 1014 . The first and second arms of a second adaptor are then ligated to the ends of the library constructs, the constructs are circularized and are nicked in process 1015 . The resulting library constructs have the second adaptor inserted into the construct in two different orientations relative to the first adaptor, desired orientation 1016 and undesired orientation 1018 .
  • In desired orientation 1016, the double-stranded circularized construct is nicked twice in one strand and not nicked in the other strand.
  • In undesired orientation 1018, each strand is nicked once.
  • In circle dependent amplification 1019, the one remaining circular strand will be amplified, and the strands that were linearized as the result of the nicking will not be amplified (or will be amplified at a very low efficiency).
  • the resulting population of amplified, circularized library constructs 1020 are then cut with Bpm I 1021 , where the second adaptor has one Bpm I recognition site (represented by the one arrow), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
  • Library construct 1022 is treated to blunt the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a third adaptor are ligated to the library constructs in process 1023 to produce library construct 1024 .
  • the first and second arms of the third adaptor are then ligated to the ends of the library constructs, and the constructs are circularized and are nicked in process 1025 .
  • the resulting library constructs have the third adaptor inserted into the construct in two different orientations relative to the first and second adaptors, desired orientation 1026 and undesired orientation 1028 .
  • In desired orientation 1026, the double-stranded circularized construct is nicked three times in one strand and not nicked in the other strand.
  • In undesired orientation 1028, one strand is nicked twice and one strand is nicked once.
  • The combined population of desired 1026 and undesired 1028 constructs is then subjected to circle dependent amplification 1029, where the one remaining circular strand will be amplified, and the strands that were linearized as the result of the nicking will not be amplified (or will be amplified at a very low efficiency).
  • the resulting population of amplified, circularized library constructs 1030 are then cut with Acu I 1031 , where the third adaptor has one Acu I recognition site (represented by the one arrow), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
  • Library construct 1032 is treated to blunt the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a fourth adaptor are ligated to the library constructs in process 1033 to produce library construct 1034 .
  • the first and second arms of the fourth adaptor are then ligated to the ends of the library constructs, the constructs are circularized and are nicked in process 1035 .
  • the resulting library constructs have the fourth adaptor inserted into the construct in two different orientations relative to the first, second and third adaptors, desired orientation 1036 and undesired orientation 1038 .
  • In desired orientation 1036, the double-stranded circularized construct is nicked four times in one strand and not nicked in the other strand.
  • In undesired orientation 1038, one strand is nicked three times and one strand is nicked once.
  • The combined population of desired 1036 and undesired 1038 constructs is then subjected to circle dependent replication 1039, where the one remaining circular strand will be replicated linearly, and the resulting concatemer is allowed to form DNBs 1040.
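The nick counts quoted for each cycle of FIG. 10 follow one pattern, provided that all previously inserted adaptors are already in the desired orientation (which is what the preceding selection rounds are meant to achieve). A short sketch that reproduces the counts is given below; the function name `nick_tally` and the tuple layout are illustrative.

```python
def nick_tally(n_adaptors, newest_desired):
    """Return (nicks on the nicked strand, nicks on the other strand).

    Assumes every earlier adaptor is already in the desired orientation,
    so only the newest adaptor can place a nick on the other strand.
    """
    if newest_desired:
        return (n_adaptors, 0)
    return (n_adaptors - 1, 1)

for n in (2, 3, 4):
    print(n, "adaptors:", "desired", nick_tally(n, True),
          "undesired", nick_tally(n, False))
```

Only a tally with zero nicks on one strand leaves a closed circle for circle dependent replication, so the undesired orientation is selected against at every cycle.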
  • A tailing: Samples of 100 ng of fragmented genomic DNA were prepared in Thermopol buffer, with dATP and Taq polymerase added. The samples were then incubated at 70° C. for 60 minutes and cooled to 4° C. The samples were then purified by Qiagen MinElute columns.
  • Adaptor annealing: The A tailed fragmented genomic DNA samples were mixed with T tailed adaptors and blocking oligos in a buffer containing NaCl, Tris and EDTA. The samples were then heated to 95° C. for 5 minutes and then allowed to cool to room temperature.
  • Adaptor ligation: The annealed adaptor/genomic DNA samples were mixed with HB ligation buffer and T4 ligase. The samples were then incubated at 14° C. for two hours, 70° C. for 10 minutes (to inactivate the T4 enzyme and remove the blocking oligos) and cooled to 4° C. The samples were then purified by Qiagen MinElute columns.
  • Adaptor circularization: The linear fragmented genomic DNAs now flanked by first and second arms of an adaptor were circularized by incubation in Epicentre buffer and T4 ligase at 14° C. for 14 hours. The samples were then heat inactivated at 70° C. for 10 minutes and then cooled to 4° C.
  • Model System for Orientation Selection by Nicking: A model system was used to optimize selection of adaptor orientation. First, genomic DNA samples were amplified using primers with various nick sites including a Sph I site. Once amplified, Sph I was used to cut the amplified DNA to expose ends for circularization. The restricted amplified DNA was cut with a restriction endonuclease, blunt ended and A tailed. The A tailed products were ligated to adaptor arms and circularized in Epicentre buffer, 25 mM ATP and T4 ligase, with a 2 hour incubation at 14° C., and a subsequent heat denaturation. The samples were then purified by Qiagen MinElute columns. The circularized constructs were then cut with a restriction endonuclease, blunt ended and A tailed. The A tailed products were ligated to a second set of adaptor arms (containing an Ava I site) and circularized.
  • the circularized constructs were nicked in NEB buffer with a nickase enzyme (Nt.BstNBI, Nb.BsrDI or Nb.BsmI) at 55° C. for 1.5 hours, with a subsequent 20 minute inactivation.
  • the nicked products were then amplified by circle dependent amplification (four hour incubation at 30° C.).
  • the circularized products were then cut with Ava I and an aliquot was removed to confirm the adaptor orientation.

Abstract

Aspects described and claimed herein provide methods to insert multiple DNA adaptors into a population of circular target DNAs at defined positions and orientations with respect to one another using nicking reactions. The resulting multi-adaptor constructs are then used in massively-parallel nucleic acid sequencing techniques.

Description

  • This application claims priority to U.S. Provisional Application 60/864,992 filed Nov. 9, 2006.
  • BACKGROUND
  • Large-scale sequence analysis of genomic DNA is central to understanding a wide range of biological phenomena related to health and disease in humans and in economically important plants and animals. The need for low-cost, high-throughput sequencing and re-sequencing has led to the development of new approaches to sequencing that employ parallel analysis of many target DNA fragments simultaneously. Improvements to sequencing methods and increases in the amount and quality of data from such methods are of great value in the art.
  • SUMMARY
  • Embodiments described and claimed herein address the foregoing and other situations by providing methods that use repeated cycles of nucleic acid cleavage and ligation to insert multiple DNA adaptors into a population of circular target DNAs at defined positions and orientations with respect to one another. The resulting multi-adaptor constructs are then used in massively-parallel nucleic acid sequencing techniques.
  • The described technology provides in one aspect a method for enriching for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising: obtaining target nucleic acids; ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site; ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; circularizing the second library constructs; nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; and subjecting the library constructs to circle dependant amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified linearly, thereby enriching for orientation of the second adaptor with respect to the first adaptor in the nucleic acid library constructs.
  • In some aspects of the methods, the first library constructs are circularized between the ligating processes, and, in other aspects of the methods, the first library constructs are cut with a restriction endonuclease after being circularized. In yet other aspects of the methods, the first adaptor is ligated to the target nucleic acid as two adaptor arms; and in some aspects, the second adaptor is ligated to the first library construct as two adaptor arms. Also, in some aspects the first and second adaptors comprise Type IIs endonuclease recognition sites. In some aspects, the strands are nicked with a nickase, in other aspects, uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase.
  • Additional aspects of the technology provide methods for enriching for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising: obtaining target nucleic acids; ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site and a Type IIs endonuclease recognition site; circularizing the first library constructs; cutting the first library constructs with a Type IIs endonuclease to produce linearized first library constructs; ligating a second adaptor to the linearized first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; circularizing the second library constructs; nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; and subjecting the library constructs to circle dependant amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified linearly, thereby enriching for orientation of the second adaptor with respect to the first adaptor in the nucleic acid library constructs. In some aspects, the first adaptor is ligated to the target nucleic acid as two adaptor arms; and in yet other aspects, the second adaptor is ligated to the first library constructs as two adaptor arms.
  • Yet other aspects of the technology provide methods for enriching or selecting for orientation of two or more adaptors with respect to one another in nucleic acid library constructs comprising: (a) obtaining target nucleic acids; (b) ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site; (c) ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site; (d) circularizing the second library constructs; (e) nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nick on the other strand; (f) subjecting the library constructs to circle dependant amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified linearly, thereby enriching for orientation of the second adaptor with respect to the first adaptor in the nucleic acid library constructs; and (g) repeating processes (b) through (f) until a desired number of adaptors have been inserted into the nucleic acid library constructs.
  • In some aspects of these methods, the first library constructs are circularized between the ligating processes. In other aspects of the methods, the first library constructs are cut with a restriction endonuclease after being circularized. In yet other aspects of the methods, the first adaptor is ligated to the target nucleic acid as two adaptor arms; and in some aspects, one or more of the second and subsequently-added adaptors are ligated as two adaptor arms. Also, in some aspects the first and second adaptors comprise Type IIs endonuclease recognition sites. In some aspects, the strands are nicked with a nickase, in other aspects, uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase.
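  • The selection logic in the aspects above can be sketched as a toy model. This is only an illustration under stated assumptions, not the claimed method: the `+`/`-` orientation labels, the mapping of orientation to nicked strand, and all function names are hypothetical. The sketch assumes that an adaptor in `+` orientation carries its nickable site on the top strand and an adaptor in `-` orientation carries it on the bottom strand, so a construct keeps one un-nicked strand only when every adaptor has the same orientation.

```python
from itertools import product


def nicked_strands(orientations):
    """Return the set of strands ('top'/'bottom') that carry a nick.

    Assumption (hypothetical model): '+' puts the nickable site on the
    top strand, '-' puts it on the bottom strand.
    """
    return {"top" if o == "+" else "bottom" for o in orientations}


def has_intact_strand(orientations):
    """True if at least one strand of the circular construct has no nick."""
    return len(nicked_strands(orientations)) < 2


def classify(orientations):
    """Label a construct by its expected fate in circle dependant amplification."""
    return "enriched" if has_intact_strand(orientations) else "depleted"


if __name__ == "__main__":
    # Enumerate every orientation combination for a two-adaptor construct.
    for pair in product("+-", repeat=2):
        print("".join(pair), classify(pair))
```

  • In this model, constructs whose adaptors share an orientation retain an intact strand and are labeled as enriched, while constructs with opposed adaptors are nicked on both strands and are labeled as depleted.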
  • Also in some aspects, amplicons made by selectively nicking one strand of each of two adaptors in a library construct are provided, as are libraries comprising a multiplicity (five or more) of such amplicons. In other aspects, kits are provided for selecting for desired orientations of multiple adaptors in library constructs comprising a first double-stranded adaptor comprising a first Type IIs restriction endonuclease recognition site and a nickable site; a second double-stranded adaptor comprising a first Type IIs restriction endonuclease recognition site and a nickable site; and primers complementary to each strand of the first and second adaptors.
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims.
  • BRIEF DESCRIPTIONS OF THE DRAWINGS
  • FIG. 1 is a simplified flow diagram of an overall method for sequencing nucleic acids using the processes of the claimed invention.
  • FIG. 2 is a schematic representation of one aspect of a method for assembling adaptor/target nucleic acid library constructs.
  • FIG. 3 is a schematic illustration of a basic adaptor insertion process.
  • FIG. 4 is a schematic illustration of one aspect of a DNA array employing multi-adaptor nucleic acid library constructs.
  • FIG. 5 is a schematic illustration of the components that may be used in an exemplary sequencing-by-ligation technique.
  • FIG. 6 is a schematic illustration of an insertion of a second adaptor relative to a first adaptor in a nucleic acid library construct.
  • FIG. 7 is a schematic representation of components of an exemplary adaptor useful for selecting insertion orientation.
  • FIG. 8 is a schematic representation of adaptor insertion allowing subsequent circularization of the target/adaptor construct.
  • FIG. 9 is a schematic illustration of a process where a desired orientation of a second adaptor relative to a first adaptor is selected using nicking of the adaptor.
  • FIG. 10 is a schematic representation of a nicking process for selecting constructs where four adaptors are inserted into a target nucleic acid in a desired orientation.
  • DEFINITIONS
  • The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polymer array synthesis, hybridization and ligation of polynucleotides, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds. (1999), Genome Analysis: A Laboratory Manual Series (Vols. I-IV); Weiner, Gabriel, Stephens, Eds. (2007), Genetic Variation: A Laboratory Manual; Dieffenbach, Dveksler, Eds. (2003), PCR Primer: A Laboratory Manual; Bowtell and Sambrook (2003), DNA Microarrays: A Molecular Cloning Manual; Mount (2004), Bioinformatics: Sequence and Genome Analysis; Sambrook and Russell (2006), Condensed Protocols from Molecular Cloning: A Laboratory Manual; and Sambrook and Russell (2002), Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) W.H. Freeman, New York N.Y.; Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; and Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.
  • Note that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “an agent” refers to one agent or mixtures of agents, and reference to “the method of administration” includes reference to equivalent steps and methods known to those skilled in the art, and so forth.
  • Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing devices, formulations and methodologies which are described in the publication and which might be used in connection with the presently described invention.
  • Where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
  • In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.
  • “Adaptor” refers to an engineered construct comprising “adaptor elements” where one or more adaptors may be interspersed within target nucleic acid in a library construct. The adaptor elements or features included in any adaptor vary widely depending on the use of the adaptors, but typically include sites for restriction endonuclease recognition and/or cutting, sites for primer binding (for amplifying the library constructs) or anchor primer binding (for sequencing the target nucleic acids in the library constructs), nickable sites, and the like. In some aspects, adaptors are engineered so as to comprise one or more of the following: 1) a length of about 20 to about 250 nucleotides, or about 40 to about 100 nucleotides, or less than about 60 nucleotides, or less than about 50 nucleotides; 2) features so as to be ligated to the target nucleic acid as two “arms”; 3) different and distinct anchor binding sites at the 5′ and the 3′ ends of the adaptor for use in sequencing of adjacent target nucleic acid; and 4) one or more restriction sites.
  • “Amplicon” means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides that are replicated from one or more starting sequences. Amplicons may be produced by a variety of amplification reactions, including but not limited to polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplification, circle dependant amplification and like reactions (see, e.g., U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159; 5,210,015; 6,174,670; 5,399,491; 6,287,824 and 5,854,033; and US Pub. No. 2006/0024711).
  • “Circle dependant replication” or “CDR” refers to multiple displacement amplification of a double-stranded circular template using one or more primers annealing to the same strand of the circular template to generate products representing only one strand of the template. In CDR, no additional primer binding sites are generated and the amount of product increases only linearly with time. The primer(s) used may be of a random sequence (e.g., one or more random hexamers) or may have a specific sequence to select for amplification of a desired product. Without further modification of the end product, CDR often results in the creation of a linear construct having multiple copies of a strand of the circular template in tandem, i.e. a linear, single-stranded concatamer of multiple copies of a strand of the template.
  • “Circle dependant amplification” or “CDA” refers to multiple displacement amplification of a double-stranded circular template using primers annealing to both strands of the circular template to generate products representing both strands of the template, resulting in a cascade of multiple-hybridization, primer-extension and strand-displacement events. This leads to an exponential increase in the number of primer binding sites, with a consequent exponential increase in the amount of product generated over time. The primers used may be of a random sequence (e.g., random hexamers) or may have a specific sequence to select for amplification of a desired product. CDA results in a set of concatemeric double-stranded fragments.
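  • The contrast between linear growth (as in CDR) and exponential growth (as in CDA) can be illustrated with a toy calculation. This is a minimal sketch, not a kinetic model from this document: the idea of discrete "rounds", the per-round rates, and the function names are all assumptions chosen for illustration.

```python
def linear_yield(rounds, copies_per_round=1.0):
    """Relative product after `rounds` when product grows linearly.

    Assumption: each round adds a fixed number of copies per template.
    """
    return 1.0 + copies_per_round * rounds


def exponential_yield(rounds, fold_per_round=2.0):
    """Relative product after `rounds` when product grows exponentially.

    Assumption: each round multiplies the product by a fixed factor.
    """
    return fold_per_round ** rounds


def enrichment(rounds, copies_per_round=1.0, fold_per_round=2.0):
    """Ratio of exponentially to linearly amplified product."""
    return exponential_yield(rounds, fold_per_round) / linear_yield(
        rounds, copies_per_round
    )


if __name__ == "__main__":
    for r in (1, 5, 10):
        print(r, linear_yield(r), exponential_yield(r), round(enrichment(r), 1))
```

  • Under these assumed rates, the exponentially amplified species quickly dominates the mixture, which is the basis for using amplification as a selection step.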
  • “Complementary” or “substantially complementary” refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double-stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single-stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the other strand, usually at least about 90% to about 95%, and even about 98% to about 100%.
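  • As a rough illustration of the percentage thresholds above, the following sketch scores base pairing between two strands. It is a simplification: it assumes two strands of equal length, both written 5′ to 3′, paired antiparallel with no insertions or deletions, whereas the definition above allows optimal alignment with gaps. The function names and the 80% default threshold parameter are illustrative.

```python
# Watson-Crick pairs, including A:U for RNA strands.
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C"), ("A", "U"), ("U", "A")}


def percent_complementary(strand_a, strand_b):
    """Percent of positions that base pair when the strands are antiparallel.

    Both strands are given 5'->3'. Assumption: equal length, no gaps.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so position i of a faces position i of b
    paired = sum((x, y) in PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


def substantially_complementary(strand_a, strand_b, threshold=80.0):
    """True if pairing meets the threshold (80% is the lowest figure cited)."""
    return percent_complementary(strand_a, strand_b) >= threshold


if __name__ == "__main__":
    print(percent_complementary("ATGC", "GCAT"))
    print(substantially_complementary("AAAAAAAAAA", "TTTTTTTTTG"))
```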
  • “Duplex” means at least two oligonucleotides or polynucleotides that are fully or partially complementary and which undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed. The terms “annealing” and “hybridization” are used interchangeably to mean formation of a stable duplex. “Perfectly matched” in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double-stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand. A “mismatch” in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick basepairing.
  • “Hybridization” refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. The resulting (usually) double-stranded polynucleotide is a “hybrid” or “duplex.” “Hybridization conditions” will typically include salt concentrations of less than about 1M, more usually less than about 500 mM and may be less than about 200 mM. A “hybridization buffer” is a buffered salt solution such as 5×SSPE, or other such buffers known in the art. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., and more typically greater than about 30° C., and typically in excess of 37° C. Hybridizations are usually performed under stringent conditions, i.e., conditions under which a probe will hybridize to its target subsequence but will not hybridize to other, noncomplementary sequences. Stringent conditions are sequence-dependent and are different in different circumstances. For example, longer fragments may require higher hybridization temperatures for specific hybridization than short fragments. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents, and the extent of base mismatching, the combination of parameters is more important than the absolute measure of any one parameter alone. Generally stringent conditions are selected to be about 5° C. lower than the Tm for the specific sequence at a defined ionic strength and pH. Exemplary stringent conditions include a salt concentration of at least 0.01M to no more than 1M sodium ion concentration (or other salt) at a pH of about 7.0 to about 8.3 and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA at pH 7.4) and a temperature of 30° C. are suitable for allele-specific probe hybridizations.
  • “Ligation” means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g., oligonucleotides and/or polynucleotides, in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5′ carbon terminal nucleotide of one oligonucleotide with a 3′ carbon of another nucleotide. Template driven ligation reactions are described in the following references: U.S. Pat. Nos. 4,883,750; 5,476,930; 5,593,826; and 5,871,921.
  • “Microarray” or “array” refers to a solid phase support having a surface, preferably but not exclusively a planar or substantially planar surface, which carries an array of sites containing nucleic acids such that each site of the array comprises identical copies of oligonucleotides or polynucleotides and is spatially defined and not overlapping with other member sites of the array; that is, the sites are spatially discrete. The array or microarray can also comprise a non-planar interrogatable structure with a surface such as a bead or a well. The oligonucleotides or polynucleotides of the array may be covalently bound to the solid support, or may be non-covalently bound. Conventional microarray technology is reviewed in, e.g., Schena, Ed. (2000), Microarrays: A Practical Approach (IRL Press, Oxford). As used herein, “random array” or “random microarray” refers to a microarray where the identity of the oligonucleotides or polynucleotides is not discernable, at least initially, from their location but may be determined by a particular operation on the array, such as by sequencing, hybridizing decoding probes or the like. See, e.g., U.S. Pat. Nos. 6,396,995; 6,544,732; 6,401,267; and 7,070,927; WO publications WO 2006/073504 and 2005/082098; and US Pub Nos. 2007/0207482 and 2007/0087362.
  • “Nucleic acid”, “oligonucleotide”, “polynucleotide”, “oligo” or grammatical equivalents used herein refers generally to at least two nucleotides covalently linked together. A nucleic acid generally will contain phosphodiester bonds, although in some cases nucleic acid analogs may be included that have alternative backbones such as phosphoramidite, phosphorodithioate, or methylphosphoroamidite linkages; or peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with bicyclic structures including locked nucleic acids, positive backbones, non-ionic backbones and non-ribose backbones. Modifications of the ribose-phosphate backbone may be done to increase the stability of the molecules; for example, PNA:DNA hybrids can exhibit higher stability in some environments.
  • “Primer” means an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3′ end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Primers usually are extended by a DNA polymerase.
  • “Probe” means generally an oligonucleotide that is complementary to an oligonucleotide or target nucleic acid under investigation. Probes used in certain aspects of the claimed invention are labeled in a way that permits detection, e.g., with a fluorescent or other optically-discernable tag.
  • “Sequence determination” in reference to a target nucleic acid means determination of information relating to the sequence of nucleotides in the target nucleic acid. Such information may include the identification or determination of partial as well as full sequence information of the target nucleic acid. The sequence information may be determined with varying degrees of statistical reliability or confidence. In one aspect, the term includes the determination of the identity and ordering of a plurality of contiguous nucleotides in a target nucleic acid starting from different nucleotides in the target nucleic acid.
  • “Target nucleic acid” means a nucleic acid from a gene, a regulatory element, genomic DNA, cDNA, RNAs including mRNAs, rRNAs, siRNAs, miRNAs and the like and fragments thereof. A target nucleic acid may be a nucleic acid from a sample, or a secondary nucleic acid such as a product of an amplification reaction.
  • As used herein, the term “Tm” is commonly defined as the temperature at which half of the population of double-stranded nucleic acid molecules becomes dissociated into single strands. The equation for calculating the Tm of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation $T_m = 81.5 + 16.6\,\log_{10}[\mathrm{Na}^+] + 0.41\,(\%[\mathrm{G+C}]) - 675/n - 1.0\,m$, when a nucleic acid is in aqueous solution having cation concentrations of 0.5 M or less, the (G+C) content is between 30% and 70%, n is the number of bases, and m is the percentage of base pair mismatches (see e.g., Sambrook J et al., “Molecular Cloning, A Laboratory Manual”, 3rd Edition, Cold Spring Harbor Laboratory Press (2001)). Other references include more sophisticated computations, which take structural as well as sequence characteristics into account for the calculation of Tm (see also, Anderson and Young (1985), Quantitative Filter Hybridization, Nucleic Acid Hybridization, and Allawi and SantaLucia (1997), Biochemistry 36:10581-94).
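  • The simple Tm estimate described above can be computed as in the following sketch. It assumes the standard reading of the formula in which the salt, (G+C), length and mismatch terms are summed as 81.5 + 16.6·log10[Na+] + 0.41·(%G+C) − 675/n − 1.0·m. The function name and argument names are illustrative, and the validity checks mirror the stated conditions (cation concentration of 0.5 M or less, (G+C) content between 30% and 70%).

```python
import math


def estimate_tm(sequence, na_molar, mismatch_percent=0.0):
    """Estimate Tm in degrees C from sequence, [Na+] (mol/L) and % mismatches.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 675/n - 1.0*m
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    if not 0 < na_molar <= 0.5:
        raise ValueError("estimate applies to cation concentrations of 0.5 M or less")
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / n
    if not 30.0 <= gc_percent <= 70.0:
        raise ValueError("estimate applies to (G+C) content between 30% and 70%")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 675.0 / n
        - 1.0 * mismatch_percent
    )


if __name__ == "__main__":
    # A 20-mer with 50% (G+C) in 0.1 M Na+, perfectly matched.
    print(round(estimate_tm("ATGC" * 5, 0.1), 2))
```

  • Because the estimate depends on the logarithm of the salt concentration and on 1/n, it is most sensitive to salt and length for short oligonucleotides in dilute buffers.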
  • DETAILED DESCRIPTION
  • Technology is described herein for providing nucleic acid constructs having interspersed adaptors inserted in a desired orientation with respect to one another for use in large scale sequencing methods. Many adaptor insertion methods developed to date do not allow control of the orientation of newly inserted adaptors vis-à-vis previously inserted adaptors. The inability to control the orientation of adaptors with respect to one another can have a number of undesired consequences. The presence of adaptors in both orientations in a population of target nucleic acid/adaptor library constructs may require multiple sequencing primers in each sequencing reaction to enable sequencing regardless of the orientation of a given adaptor. In addition, analysis of sequence data collected from multiple adaptors of unspecified orientation may require either determination of the orientation of each adaptor or consideration of all possible combinations of adaptor orientation during assembly.
  • Overview of Sequencing Approaches for Use with the Present Invention
  • FIG. 1 is a simplified flow diagram of an overall method 100 for sequencing nucleic acids using the processes of the claimed invention. Generally, creation of a target molecule for sequencing is accomplished by extracting and preparing target nucleic acids 110 (e.g., fractionating, shearing or cleaving), constructing a library with the sheared target nucleic acids using engineered adaptors 120, replicating the library constructs to form amplified library constructs (e.g., forming DNA nanoballs through circle dependant replication) 130, and sequencing the amplified target nucleic acids 140.
  • In process 110 of method 100, the target nucleic acids for some aspects are derived from genomic DNA. In some aspects such as whole genome sequencing, 10-100 genome-equivalents of DNA preferably are obtained to ensure that the population of target DNA fragments covers the entire genome. The target genomic DNA is isolated using conventional techniques, for example as disclosed in Sambrook and Russell, Molecular Cloning: A Laboratory Manual. The target genomic DNA is then fragmented to a desired size by conventional techniques including enzymatic digestion, shearing, or sonication. Fragment size of the target nucleic acid can vary depending on the source target nucleic acid and the library construction methods used, but typically ranges from 50 nucleotides in length to over 11 kb in length, including 200-700 nucleotides in length, 400-600 nucleotides in length, 450-550 nucleotides in length, or 4 kb to over 10 kb in length. Alternatively, in some aspects, the target nucleic acids comprise mRNAs or cDNAs. In specific embodiments, the target DNA is created using isolated transcripts from a biological sample. Isolated mRNA may be reverse transcribed into cDNAs using conventional techniques, again as described in Genome Analysis: A Laboratory Manual Series (Vols. I-IV) or Molecular Cloning: A Laboratory Manual.
  • In process 120 of method 100, a library is constructed using the fragmented target nucleic acids. Library construction will be discussed in detail infra; briefly, the library constructs are assembled by inserting adaptor molecules at a multiplicity of sites throughout each target nucleic acid fragment. The interspersed adaptors permit acquisition of sequence information from multiple sites in the target nucleic acid consecutively or simultaneously. In some aspects, the interspersed adaptors are inserted at intervals within a contiguous region of the target nucleic acids at predetermined positions. The intervals may or may not be equal. In some aspects, the spacing between interspersed adaptors may be known only to an accuracy of one to a few nucleotides. In other aspects, the spacing of the adaptors is known, and the orientation of each adaptor relative to other adaptors in the library constructs is known.
  • In process 130 of method 100, the library constructs are amplified and, in some aspects, are replicated to form DNA nanoballs. In such a process, the library constructs (the target nucleic acids with the interspersed adaptors) are replicated in such a way so as to form single-stranded DNA concatemers of each library construct, each concatemer comprising multiple linear tandem repeats of the library construct. Single-stranded DNA concatemers under conventional conditions (in buffers, e.g., TE, SSC, SSPE or the like) form random coils in a manner known in the art (e.g., see Edvinsson (2002), “On the size and shape of polymers and polymer complexes,” Dissertation 696 (University of Uppsala)). Randomly coiled concatemeric DNA forms nanoballs (also termed “DNA nanoballs”, “nucleic acid nanoballs” or “DNBs”).
  • In process 140 of method 100, the DNBs formed in process 130 are sequenced. In some aspects, the DNBs are randomly arrayed on a planar surface. The DNBs may be covalently or noncovalently attached to the planar surface. The target nucleic acids within each DNB are then sequenced by iterative interrogation using sequencing-by-synthesis techniques and/or sequencing-by-ligation techniques.
  • FIG. 2 is a schematic representation of one aspect of a method for assembling adaptor/target nucleic acid library constructs. DNA, such as genomic DNA 202, is isolated and fragmented 203 into target nucleic acids 204 using standard techniques as described briefly above. The fragmented target nucleic acids 204 are then repaired so that the 5′ and 3′ ends of each strand are flush or blunt ended. Following this reaction, each fragment is “A-tailed” with a single A added to the 3′ end of each strand of the fragmented target nucleic acids using a non-proofreading polymerase 205. Also as part of process 205, a first and second arm of a first adaptor are then ligated to each target nucleic acid, producing a target nucleic acid with adaptor arms ligated to each end 206. In one aspect, the adaptor arms are “T tailed” to be complementary to the A tailing of the target nucleic acid, facilitating ligation of the adaptor arms in a known orientation.
  • In a preferred embodiment, the invention provides adaptor ligation to each fragment in a manner that minimizes the creation of intra- or intermolecular ligation artefacts. This is desirable because random fragments of target nucleic acids forming ligation artefacts with one another create false proximal genomic relationships between target nucleic acid fragments, complicating the sequence alignment process. The aspect shown in FIG. 2 shows step 205 as a combination of blunt end repair and an A tail addition. This preferred aspect using both A tailing and T tailing to attach the adaptor to the DNA fragments prevents random intra- or inter-molecular associations of adaptors and fragments, which reduces artefacts that would be created from self-ligation, adaptor-adaptor or fragment-fragment ligation.
  • As an alternative to A tailing, various other methods can be implemented to prevent formation of ligation artefacts of the target nucleic acids and the adaptors, as well as orient the adaptor arms with respect to the target nucleic acids, including using complementary NN overhangs in the target nucleic acids and the adaptor arms, or employing blunt end ligation with an appropriate target nucleic acid to adaptor ratio to optimize single fragment nucleic acid/adaptor arm ligation ratios.
  • In process 207, the linear target nucleic acid 206 is circularized, a process that will be discussed in detail infra, resulting in a circular library construct 208 comprising target nucleic acid and an adaptor. Note that the circularization process results in bringing the first and second arms of the first adaptor together to form a contiguous adaptor sequence in the circular construct. In process 209, the circular construct is amplified, such as by circle dependant amplification, using, e.g., random hexamers and φ29 or helicase. Alternatively, target nucleic acid/adaptor structure 206 may remain linear, and amplification may be accomplished by PCR primed from sites in the adaptor arms. The amplification 209 preferably is a controlled amplification process and uses a high fidelity, proof-reading polymerase, resulting in a sequence-accurate library of amplified target nucleic acid/adaptor constructs where there is sufficient representation of the genome or one or more portions of the genome being queried.
  • In aspects herein, the first adaptor comprises two Type IIs restriction endonuclease recognition sites, positioned such that the target nucleic acid outside the recognition sequence (and outside of the adaptor) is cut 210. The arrows around structure 210 indicate the recognition sites and the site of restriction. In process 211, EcoP15, a Type IIs restriction endonuclease, is used to cut the library constructs. Note that in the aspect shown in FIG. 2, a portion of each library construct mapping to a portion of the target nucleic acid will be cut away from the construct (the portion of the target nucleic acid between the arrow heads in structure 210). Restriction of the library constructs with EcoP15 in process 211 results in a library of linear constructs containing the first adaptor, with the first adaptor “interior” to the ends of the linear construct 212. The resulting linear library construct will have a size defined by the distance between the endonuclease recognition sites and the endonuclease restriction site plus the size of the adaptor. In process 213, the linear construct 212, like the fragmented target nucleic acid 204, is treated by conventional methods to become blunt or flush ended, A tails comprising a single A are added to the 3′ ends of the linear library construct using a non-proofreading polymerase and first and second arms of a second adaptor are ligated to ends of the linearized library construct by A-T tailing and ligation 213. The resulting library construct comprises the structure seen at 214, with the first adaptor interior to the ends of the linear construct, with target nucleic acid flanked on one end by the first adaptor, and on the other end by either the first or second arm of the second adaptor.
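  • The size relationship described above (recognition-site-to-cut distance plus adaptor size) can be sketched as simple arithmetic. This is a minimal illustration only: the function names, the 25-base cut reach, the 1-base site-to-junction offset and the 50-base adaptor length are assumed placeholder values rather than figures specified for this step, and the stagger between the cuts on the two strands is ignored.

```python
def retained_target_bases(cut_reach, site_to_junction):
    """Target bases kept on one side of the adaptor after a Type IIs cut.

    cut_reach: distance from the recognition site to the cut (assumed value).
    site_to_junction: adaptor bases lying between the recognition site and
    the adaptor/target junction (assumed value).
    """
    return max(0, cut_reach - site_to_junction)


def linear_construct_length(adaptor_len, cut_reach, left_offset, right_offset):
    """Length of the linear construct: adaptor plus retained target on each side."""
    left = retained_target_bases(cut_reach, left_offset)
    right = retained_target_bases(cut_reach, right_offset)
    return left + adaptor_len + right


# Illustrative placeholder numbers only.
print(linear_construct_length(adaptor_len=50, cut_reach=25, left_offset=1, right_offset=1))
```

  • With these placeholder values the construct keeps 24 target bases on each side of a 50-base adaptor, for 98 bases in total.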
  • In process 215, the double-stranded linear library constructs are treated so as to become single-stranded 216, and the single-stranded library constructs 216 are then ligated 217 to form single-stranded circles of target nucleic acid interspersed with two adaptors 218. The ligation/circularization process of 217 is performed under conditions that optimize intramolecular ligation.
  • Next, in the two-adaptor aspect shown in FIG. 2, the single-stranded, circularized library constructs 218 are amplified by circle dependant replication 219 to form DNA nanoballs 220. Circle dependant replication is performed, e.g., using specific primers where the amplification product displaces its own tail, producing linear, tandem single-stranded copies of [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] library constructs. As the tandem copies begin to multiply, the library constructs begin to coil and form secondary structures, ultimately forming DNA nanoballs. Each library construct contains in some aspects between about ten to about 5000 copies, or from about 250 copies to about 2500 copies of the [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] repeats, and preferably contains about 500 to about 1200 copies of the [target nucleic acid/adaptor 1/target nucleic acid/adaptor 2] repeats. The resulting DNA nanoballs 220, then, are clonal populations of DNA in discrete structures, which can then be arrayed and sequenced (process not shown).
  • FIG. 3 is a simplified schematic illustration showing the cyclical nature of the basic adaptor insertion process 300 where two, three, four, five or more adaptors can be inserted into a target nucleic acid. A fragmented target nucleic acid is shown at 302. Process 303 provides adaptor arm to target nucleic acid ligation (as was described with some detail in the discussion of the aspect shown in FIG. 2), resulting in a linear target nucleic acid with first and second adaptor arms of a first adaptor ligated onto its ends 304. The adaptor arms are then ligated to one another in an intramolecular reaction that results in a circularization of the target nucleic acid/adaptor library construct 306. The library construct is then amplified 307 resulting in a population comprising a plurality of copies of each target nucleic acid/adaptor library construct 308. These library constructs 308 are then cleaved 309 (for example, by restriction with a Type IIs restriction endonuclease recognizing one or more sites in the adaptor and cutting in the target nucleic acid sequence), and the cycle continues to add second, third, fourth or more adaptors.
  • FIG. 4 is a schematic illustration of one aspect of a DNA array 400 employing multi-adaptor nucleic acid library constructs. The multi-adaptor nucleic acid library constructs in the form of DNA nanoballs (DNBs) are seen at 402. DNBs are arrayed on a planar matrix 404 having discrete sites 406. The DNBs 402 may be fixed to the discrete sites by a variety of techniques, including covalent attachment and non-covalent attachment. In one embodiment, the surface of the matrix 406 may comprise attached capture oligonucleotides that form complexes, e.g., double-stranded duplexes, with a segment of an adaptor component of the DNB. In other embodiments, capture oligonucleotides may comprise oligonucleotide clamps, or like structures, that form triplexes with adaptor oligonucleotides (see, e.g., U.S. Pat. No. 5,473,060). In another embodiment, the surface of the array matrix 406 may have reactive functionalities that react with complementary functionalities on the DNBs to form a covalent linkage (see, e.g., Beaucage (2001), Current Medicinal Chemistry 8:1213-1244). Once the DNBs are arrayed, the adaptors interspersed in the target nucleic acids are used to acquire sequence information of the target nucleic acids. A variety of sequencing methodologies may be used with multi-adaptor nucleic acid library constructs, including but not limited to hybridization methods as disclosed in U.S. Pat. Nos. 6,864,052; 6,309,824; 6,401,267; sequencing-by-synthesis methods as disclosed in U.S. Pat. Nos. 6,210,891; 6,828,100, 6,833,246; 6,911,345; Margulies, et al. (2005), Nature 437:376-380 and Ronaghi, et al. (1996), Anal. Biochem. 242:84-89; and ligation-based methods as disclosed in U.S. Pat. No. 6,306,597; and Shendure et al. (2005) Science 309:1728-1739, all of which are incorporated by reference in their entirety.
  • In one aspect, the DNBs described herein, particularly those with inserted and interspersed adaptors, are used in sequencing by combinatorial probe-anchor ligation reaction (cPAL) (see U.S. Ser. No. 11/679,124, filed Feb. 24, 2007). In brief, cPAL comprises cycling of the following steps: First, an anchor is hybridized to a first adaptor in the DNBs (typically immediately at the 5′ or 3′ end of one of the adaptors). Enzymatic ligation reactions are then performed with the anchor to a fully degenerate probe population of, e.g., 8-mer probes that are labeled, e.g., with fluorescent dyes. Probes may have a length, e.g., about 6-20 bases, or, preferably, about 7-12 bases. At any given cycle, the population of 8-mer probes that is used is structured such that the identity of one or more of its positions is correlated with the identity of the fluorophore attached to that 8-mer probe. For example, when 7-mer sequencing probes are employed, a set of fluorophore-labeled probes for identifying a base immediately adjacent to an interspersed adaptor may have the following structure: 3′-F1-NNNNNNAp, 3′-F2-NNNNNNGp, 3′-F3-NNNNNNCp and 3′-F4-NNNNNNTp (where “p” is a phosphate available for ligation). In yet another example, a set of fluorophore-labeled 7-mer probes for identifying a base three bases into a target nucleic acid from an interspersed adaptor may have the following structure: 3′-F1-NNNNANNp, 3′-F2-NNNNGNNp, 3′-F3-NNNNCNNp and 3′-F4-NNNNTNNp. To the extent that the ligase discriminates for complementarity at that queried position, the fluorescent signal provides the identity of that base.
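  • The probe-set structure described above can be illustrated with a short sketch that writes out the four labeled, degenerate probes for a chosen query position. The function name `probe_set` and the parameter `query_offset` are hypothetical; the label-to-base pairing (F1 with A, F2 with G, F3 with C, F4 with T) follows the examples given above, and an ASCII apostrophe stands in for the prime symbol.

```python
BASES = "AGCT"  # order matches the F1..F4 labels in the examples above


def probe_set(length, query_offset):
    """Return the four labeled probe patterns for one query position.

    length: probe length in bases (e.g., 7).
    query_offset: 1 for the base at the phosphorylated (ligating) end,
    i.e., immediately adjacent to the adaptor; 3 for three bases in, etc.
    """
    if not 1 <= query_offset <= length:
        raise ValueError("query_offset must lie within the probe")
    probes = {}
    for i, base in enumerate(BASES, start=1):
        body = ["N"] * length
        body[length - query_offset] = base  # fixed, fluorophore-correlated base
        probes[f"F{i}"] = f"3'-F{i}-{''.join(body)}p"
    return probes


for label, probe in probe_set(7, 3).items():
    print(label, probe)
```

  • Calling `probe_set(7, 1)` reproduces the patterns for the base immediately adjacent to the adaptor, and `probe_set(7, 3)` reproduces those for a base three positions into the target nucleic acid.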
  • After performing the ligation and four-color imaging, the anchor:8-mer probe complexes are stripped and a new cycle is begun. With T4 DNA ligase, accurate sequence information can be obtained as far as six bases or more from the ligation junction, allowing access to at least 12 bp per adaptor (six bases from both the 5′ and 3′ ends), for a total of 48 bp per 4-adaptor DNB, 60 bp per 5-adaptor DNB and so on.
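  • The read-length arithmetic above can be written out as a one-line calculation. This is only a sketch; the function name and the default of six bases per side are illustrative, the latter taken from the T4 DNA ligase example in the preceding paragraph.

```python
def bases_read_per_dnb(num_adaptors, bases_per_side=6):
    """Total bases accessible per DNB, reading from both ends of every adaptor."""
    return num_adaptors * 2 * bases_per_side


for n in (4, 5):
    print(n, "adaptors:", bases_read_per_dnb(n), "bp")
```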
  • FIG. 5 is a schematic illustration of the components that may be used in an exemplary sequencing-by-ligation technique. A construct 500 is shown with a stretch of target nucleic acid to be analyzed interspersed with three adaptors, with the 5′ end of the stretch shown at 502 and the 3′ end shown at 504. The target nucleic acid portions are shown at 506 and 508, with adaptor 1 shown at 501, adaptor 2 shown at 503 and adaptor 3 shown at 505. Four anchors are shown: anchor A1 (510), which binds to the 3′ end of adaptor 1 (501) and is used to sequence the 5′ end of target nucleic acid 506; anchor A2 (512), which binds to the 5′ end of adaptor 2 (503) and is used to sequence the 3′ end of target nucleic acid 506; anchor A3 (514), which binds to the 3′ end of adaptor 2 (503) and is used to sequence the 5′ end of target nucleic acid 508; and anchor A4 (516), which binds to the 5′ end of adaptor 3 (505) and is used to sequence the 3′ end of target nucleic acid 508.
  • Depending on which position that a given cycle is aiming to interrogate, the 8-mer probes are structured differently. Specifically, a single position within each 8-mer probe is correlated with the identity of the fluorophore with which it is labeled. Additionally, the fluorophore molecule is attached to the opposite end of the 8-mer probe relative to the end targeted to the ligation junction. For example, in the graphic shown here, the anchor 530 is hybridized such that its 3′ end is adjacent to the target nucleic acid. To query a position five bases into the target nucleic acid, a population of degenerate 8-mer probes shown here at 518 may be used. The query position is shown at 532. In this case, this correlates with the fifth nucleic acid from the 5′ end of the 8-mer probe, which is the end of the 8-mer probe that will ligate to the anchor. In the aspect shown in FIG. 5, the 8-mer probes are individually labeled with one of four fluorophores, where Cy5 is correlated with A (522), Cy3 is correlated with G (524), Texas Red is correlated with C (526), and FITC is correlated with T (528).
  • Many different variations of cPAL or other sequencing-by-ligation approaches may be selected depending on various factors such as the volume of sequencing desired, the type of labels employed, the number of different adaptors used within each library construct, the number of bases being queried per cycle, how the DNBs are attached to the surface of the array, the desired speed of sequencing operations, signal detection approaches and the like. In the aspect shown in FIG. 5 and described herein, four fluorophores were used and a single base was queried per cycle. It should, however, be recognized that eight or sixteen fluorophores or more may be used per cycle, increasing the number of bases that can be identified during any one cycle. The degenerate probes (in FIG. 5, 8-mer probes) can be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, chemiluminescent moieties, and the like. Many comprehensive reviews of methodologies for labeling DNA and constructing DNA adaptors provide guidance applicable to constructing oligonucleotide probes of the present invention. Such reviews include Kricka (2002), Ann. Clin. Biochem., 39: 114-129; and Haugland (2006), Handbook of Fluorescent Probes and Research Chemicals, 10th Ed. (Invitrogen/Molecular Probes, Inc., Eugene); Keller and Manak (1993), DNA Probes, 2nd Ed. (Stockton Press, New York, 1993); and Eckstein (1991), Ed., Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford); and the like.
  • In one aspect, one or more fluorescent dyes are used as labels for the oligonucleotide probes. Labeling can also be carried out with quantum dots, as disclosed in the following patents and patent publications, incorporated herein by reference: U.S. Pat. Nos. 6,322,901; 6,576,291; 6,423,551; 6,251,303; 6,319,426; 6,426,513; 6,444,143; 5,990,479; 6,207,392; 2002/0045045; 2003/0017264; and the like. Commercially available fluorescent nucleotide analogues readily incorporated into the degenerate probes include, for example, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red, the Cy fluorophores, the Alexa Fluor® fluorophores, the BODIPY® fluorophores and the like. FRET tandem fluorophores may also be used. Other suitable labels for detection oligonucleotides may include fluorescein (FAM), digoxigenin, dinitrophenol (DNP), dansyl, biotin, bromodeoxyuridine (BrdU), hexahistidine (6×His), phospho-amino acids (e.g. P-tyr, P-ser, P-thr) or any other suitable label.
  • Imaging acquisition may be performed by methods known in the art, such as use of the commercial imaging package Metamorph. Data extraction may be performed by a series of binaries written in, e.g., C/C++, and base-calling and read-mapping may be performed by a series of Matlab and Perl scripts. As described above, for each base in a target nucleic acid to be queried (for example, for 12 bases, reading 6 bases in from both the 5′ and 3′ ends of each target nucleic acid portion of each DNB), a hybridization reaction, a ligation reaction, imaging and a primer stripping reaction are performed. To determine the identity of each DNB in an array at a given position, after performing the biological sequencing reactions, each field of view (“frame”) is imaged with four different wavelengths corresponding to the four fluorescently labeled probes, e.g., 8-mers, used. All images from each cycle are saved in a cycle directory, where the number of images is 4× the number of frames (for example, if a four-fluorophore technique is employed). Cycle image data may then be saved into a directory structure organized for downstream processing.
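  • As a rough illustration of the image bookkeeping described above, the number of images per cycle directory is the number of frames multiplied by the number of fluorophores. The function names and the frame count of 100 are hypothetical values chosen only for the example.

```python
def images_per_cycle(num_frames, num_fluorophores=4):
    """Images saved in one cycle directory: one per frame per wavelength."""
    return num_frames * num_fluorophores


def total_images(num_cycles, num_frames, num_fluorophores=4):
    """Images accumulated over a run with one queried base per cycle."""
    return num_cycles * images_per_cycle(num_frames, num_fluorophores)


# Hypothetical run: 100 frames per array, 12 queried bases.
print(images_per_cycle(100))
print(total_images(12, 100))
```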
  • Data extraction typically requires two types of image data: bright field images to demarcate the positions of all DNBs in the array; and sets of fluorescence images acquired during each sequencing cycle. The data extraction software identifies all objects in the bright field images, then for each such object, computes an average fluorescence value for each sequencing cycle. For any given cycle, there are four data-points, corresponding to the four images taken at different wavelengths to query whether that base is an A, G, C or T. These raw base-calls are consolidated, yielding a discontinuous sequencing read for each DNB. The next task is to match these sequencing reads against a reference genome.
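  • One simple way to turn the four per-cycle data-points into a raw base-call is to pick the channel with the highest average fluorescence. The sketch below assumes this brightest-channel rule, an A, G, C, T channel order, and made-up intensity values; it is not the base-calling procedure of the scripts mentioned above, which may involve normalization or other corrections.

```python
CHANNELS = ("A", "G", "C", "T")  # assumed channel order


def call_base(intensities):
    """Return the base whose channel has the highest average fluorescence."""
    if len(intensities) != len(CHANNELS):
        raise ValueError("expected one intensity per channel")
    brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
    return CHANNELS[brightest]


def call_read(cycles):
    """Consolidate per-cycle calls into one raw read for a single DNB."""
    return "".join(call_base(c) for c in cycles)


# Made-up intensities for one DNB over three cycles.
example = [(0.9, 0.1, 0.2, 0.1), (0.1, 0.2, 0.8, 0.3), (0.2, 0.1, 0.1, 0.7)]
print(call_read(example))
```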
  • Information regarding the reference genome may be stored in a reference table. A reference table may be compiled using existing sequencing data on the organism of choice. For example human genome data can be accessed through the National Center for Biotechnology Information at ftp.ncbi.nih.gov/refseq/release, or through the J. Craig Venter Institute at http://www.jcvi.org/researchhuref/. All or a subset of human genome information can be used to create a reference table for particular sequencing queries. In addition, specific reference tables can be constructed from empirical data derived from specific populations, including genetic sequence from humans with specific ethnicities, geographic heritage, religious or culturally-defined populations, as the variation within the human genome may slant the reference data depending upon the origin of the information contained therein.
  • In an alternative aspect of the claimed invention, parallel sequencing of the target nucleic acids in the DNBs on a random array is performed by combinatorial sequencing-by-hybridization (cSBH), as disclosed by Drmanac in U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267. In one aspect, first and second sets of oligonucleotide probes are provided, where each set has member probes that comprise oligonucleotides having every possible sequence for the defined length of probes in the set. For example, if a set contains probes of length six, then it contains 4096 (4^6) probes. In another aspect, first and second sets of oligonucleotide probes comprise probes having selected nucleotide sequences designed to detect selected sets of target polynucleotides. Sequences are determined by hybridizing one probe or pool of probes, hybridizing a second probe or a second pool of probes, ligating probes that form perfectly matched duplexes on their target nucleic acids, identifying those probes that are ligated to obtain sequence information about the target nucleic acid sequence, repeating the steps until all the probes or pools of probes have been hybridized, and determining the nucleotide sequence of the target nucleic acid from the sequence information accumulated during the hybridization and identification processes.
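  • The size of a complete probe set follows directly from the probe length: with four bases per position, a set of length k has 4^k members. A minimal sketch, with a hypothetical function name, enumerates such a set and confirms the count for length six.

```python
from itertools import product


def all_probes(length, alphabet="ACGT"):
    """Enumerate every possible oligonucleotide sequence of the given length."""
    return ["".join(p) for p in product(alphabet, repeat=length)]


hexamers = all_probes(6)
print(len(hexamers))           # size of the complete length-six set
print(hexamers[0], hexamers[-1])
```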
  • In yet another alternative aspect, parallel sequencing of the target nucleic acids in the DNBs is performed by sequencing-by-synthesis techniques as described in U.S. Pat. Nos. 6,210,891; 6,828,100, 6,833,246; 6,911,345; Margulies, et al. (2005), Nature 437:376-380 and Ronaghi, et al. (1996), Anal. Biochem. 242:84-89. Briefly, modified pyrosequencing, in which nucleotide incorporation is detected by the release of an inorganic pyrophosphate and the generation of photons, is performed on the DNBs in the array using sequences in the adaptors for binding of the primers that are extended in the synthesis.
  • Adaptor Insertion and Structure
  • The inability to control the orientation of adaptors with respect to one another can have a number of undesired consequences. The presence of adaptors in both orientations in a population of target nucleic acid/adaptor library constructs requires the use of two different anchor oligos in each sequencing reaction to enable sequencing regardless of the orientation of a given adaptor. In addition, sequencing of adaptors of unspecified orientation requires either determination of the orientation of each adaptor (adding at least one additional round of hybridization and scanning to the sequencing process) or consideration of all possible combinations of adaptor orientation during assembly of sequencing reads from adaptors in the same target nucleic acid/adaptor construct.
  • FIG. 6 is a schematic illustration of an insertion of a second adaptor relative to a first adaptor in a nucleic acid library construct. Again Process 600 begins with circular library construct 602, having an inserted first adaptor 610. First adaptor 610 has a specific orientation, with a rectangle identifying the “outer strand” of the first adaptor and a diamond identifying the “inner strand” of the first adaptor (Ad1 orientation 610). A Type IIs restriction endonuclease site in the first adaptor 610 is indicated by the tail of arrow 601, and the site of cutting is indicated by the arrow head. Process 603 comprises cutting with the Type IIs restriction endonuclease, ligating first and second adaptor arms of a second adaptor, and recircularization. As can be seen in the resulting library constructs 604 and 606, the second adaptor can be inserted in two different ways relative to the first adaptor. In the desired orientation 604, the oval is inserted into the circle's outer strand with the rectangle, and the bowtie is inserted into the circle's inner strand with the diamond (Ad2 orientation 620). In the undesired orientation the oval is inserted into the circle's inner strand with the diamond and the bowtie is inserted into the circle's outer strand with the rectangle (Ad2 orientation 630).
  • FIG. 7 is a schematic representation of components of an exemplary adaptor useful for selecting insertion orientation. A basic schematic of an adaptor is shown at 700. The adaptor comprises a 5′ arm 701, a double-stranded region 702 and a 3′ arm 703. Both the 5′ and the 3′ arms have a “T tail” 704 and a Type IIs restriction endonuclease site 705 (here, EcoP15). The binding region 702 is the region where the two arms of the adaptor come together to be ligated in the circularization process (305 of FIG. 3). Structure 710 is the 5′ arm of adaptor 700. Again, T tail 704 and the EcoP15 site 705 are shown, as well as the 5′ arm 701 and the binding region 712. Structure 720 is the 3′ arm of adaptor 700. Note the T tail 704 and the EcoP15 site 705, as well as the 3′ arm 703 and the binding region 722. In the 5′ arm, the binding region 712 is complementary to the binding region 722 of the 3′ arm.
  • Because the aspects of the claimed invention work optimally when library constructs are of a desired size and limited target nucleic acid sequence, it is preferred that throughout the library construction process the circularization reactions occur intramolecularly. That is, that the separate constructs of the library that are generated in the library construct assembly cycle (as shown in FIG. 3) do not ligate to one another. Also, it is preferred that only one set of adaptor arms for each adaptor used in the library construction process be included per target nucleic acid/adaptor construct. Thus, blocking oligos 717 and 727 are used to block the binding regions 712 and 722, respectively. Blocker oligonucleotide 717 is complementary to binding sequence 716, and blocker oligonucleotide 727 is complementary to binding sequence 726. In the schematic illustrations of the 5′ adaptor arm and the 3′ adaptor arm, the underlined bases are ddC and the bolded font bases are phosphorylated. Blocker oligonucleotides 717 and 727 are not covalently bound to the adaptor arms, and can be “melted off” after ligation of the adaptor arms to the library construct and before circularization; further, the dideoxy nucleotide (here, ddC or alternatively a different non-ligatable nucleotide) prevents ligation of blocker to adaptor. In addition or as an alternative, in some aspects, the blocker oligo-adaptor arm hybrids contain a gap of one or more bases between the adaptor arm and the blocker to reduce ligation of blocker to adaptor. In some aspects, the blocker/binding region hybrids have Tms of about 37° C. to enable easy melting of the blocker sequences prior to tail to tail ligation (circularization).
  • Adaptor structure 730 is a schematic of the final adaptor, where N is an unspecified base, a numeral “1” specifies bases added to disrupt the palindrome (i.e., the EcoP15 site is flanked by A's to isolate the 6-base palindrome formed by the EcoP15 sites on the two arms of the adaptor), numeral “2” specifies bases that correspond to the ddC in the blocker oligonucleotides, numeral “3” specifies the EcoP15 site (CTGCTG) and numeral “4” specifies the T bases designated for TA ligation to the A tailed target nucleic acid. The adaptor shown as 700 and detailed at 730 would, in some aspects, be appropriate for a first adaptor to be added in the construction of a library. Adaptors added subsequently would, in some aspects, have a single Type IIs restriction endonuclease site rather than two sites, and, in some aspects, the Type IIs restriction endonuclease sites in each adaptor would be different from one another. Exemplary Type IIs restriction endonucleases include, but are not limited to, Eco57M I, Mme I, Acu I, Bpm I, BceA I, Bbv I, BciV I, BpuE I, BseM II, BseR I, Bsg I, BsmF I, BtgZ I, Eci I, EcoP15 I, Fok I, Hga I, Hph I, Mbo II, Mnl I, SfaN I, TspDT I, TspDW I, Taq II, and the like.
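  • To illustrate how recognition sites of this kind might be located in a candidate adaptor sequence, the sketch below searches both strands for the EcoP15 site given above (CTGCTG). The function names and the short example sequence are hypothetical; the sequence is not the adaptor of FIG. 7.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site="CTGCTG"):
    """Return sorted (position, strand) pairs for every occurrence of the site.

    '+' means the site reads 5'->3' on the given strand; '-' means its
    reverse complement does, i.e., the site lies on the opposite strand.
    Assumes a non-palindromic site, as is the case for CTGCTG.
    """
    hits = []
    for strand, motif in (("+", site), ("-", revcomp(site))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical sequence with one site on each strand.
print(find_sites("AACTGCTGAATTTTCAGCAGAA"))
```

  • A first adaptor of the kind described would be expected to show two hits in opposite orientations, while subsequently inserted adaptors with a single site would show one.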
  • In some aspects, the adaptors when assembled have a total length of about 50 nucleotides. As shown above, in some aspects, the adaptors are ligated to the target nucleic acid as two adaptor arms, where each adaptor arm comprises two adaptor oligos (the two complementary strands) and one blocker oligo. As shown, the 5′ ends of all four adaptor arm oligos are phosphorylated to support ligation to the insert and tail-to-tail ligation of 5′ to 3′ adaptor arms. As shown, the 5′ and 3′ adaptor arms have 3′ overhangs at the adaptor-target nucleic acid ligation junctions, to enable ligation to an A-tailed insert, and to suppress head-to-head adaptor arm ligation. Also as shown, the 5′ and 3′ adaptor arms have Type IIs restriction endonuclease recognition sites oriented to enable cleavage of the adjacent target nucleic acid.
  • Again, the adaptor construct shown in FIG. 7 would be, in some aspects, appropriate for a first adaptor to be inserted into a library construct because it contains two Type IIs restriction endonuclease recognition sites. Subsequently inserted adaptors would, in some aspects, comprise a single Type IIs restriction endonuclease recognition site oriented to enable cleavage of the adjacent target nucleic acid. Additionally, in preferred aspects, the 5′ and 3′ adaptor arms have anchor primer binding sites to enable sequencing of adjacent target nucleic acids. The anchor primer binding sites in some aspects overlap with the respective Type IIs restriction endonuclease recognition site(s); however, in other aspects the anchor primer binding sites do not overlap with the Type IIs restriction endonuclease recognition site(s).
  • FIG. 8 is a schematic representation of adaptor insertion allowing subsequent circularization of the target/adaptor construct. The portion of the library construct seen in FIG. 8 is adaptor-centric, showing target nucleic acid at 802 and 812, a 5′ adaptor arm at 804, a 5′ adaptor arm blocking oligo at 806, a 3′ adaptor arm at 810, and a 3′ adaptor arm blocking oligo at 808. The T tail of the adaptor arms 804 and 810 and the A tail of the target nucleic acids 802 and 812 are indicated. In process 801, the adaptor arms are ligated to the target nucleic acid resulting in target nucleic acid/5′ adaptor arm structure 814, and target nucleic acid/3′ adaptor arm structure 816, with blocking oligos 806 and 808 still hybridized to the target nucleic acid/adaptor arm structures. In process 803, the blockers are removed by melting, and, in preferred aspects under dilute conditions to favor intramolecular ligation of process 805. The resulting structure is seen at 818. FIG. 8 illustrates the process of adaptor arm ligation featuring blocking oligos; however, other methods may be used to block ligation-creating concatemers of adaptor arms or of library constructs, including using adaptor arms that comprise a restriction site, preferably a site for a restriction endonuclease that cuts asymmetrically, such as Ava I. Alternatively, the adaptor arms may comprise one or more uracil bases that can be selectively cleaved using uracil-DNA glycosylase enzyme (Krokan et al, 1997) with the resulting fragments then being melted off in the same way the blocker oligo is melted off.
  • Selection of Adaptor Orientation by Nicking
  • FIG. 9 is a schematic illustration of a process where a desired orientation of a second adaptor relative to a first adaptor is selected using nicking of the first and second adaptors. FIG. 9 shows a circularized library construct 902 comprising one adaptor having an orientation such that a rectangle is on the “outer strand” and a diamond is on the “inner strand.” Further construct 902 comprises a Type IIs restriction endonuclease recognition site that cuts in the adjacent target nucleic acid sequence (noted by the arrow, where the arrowhead indicates the site of the cut). Once cut, the circular library construct is linearized. In process 903, first and second arms of a second adaptor are ligated onto the linear library construct and the first and second arms of the second adaptor are then ligated to provide an intact second adaptor and a circular library construct. However, as can be seen in FIG. 9, the second adaptor can be inserted in two different orientations relative to the first adaptor. In the desired orientation library construct 904, the oval of the second adaptor is on the “outer strand” with the rectangle of the first adaptor and the bowtie of the second adaptor is on the “inner strand” with the diamond of the first adaptor. In the undesired orientation library construct 906, the oval of the second adaptor is on the “inner strand” with the diamond of the first adaptor and the bowtie of the second adaptor is on the “outer strand” with the rectangle of the first adaptor. At this juncture there is no simple way to separate the desired library constructs 904 from the undesired library constructs 906.
  • However, in the aspect shown in FIG. 9, the diamond and the bowtie portions of the first and second adaptors, respectively, comprise a nickase enzyme digestion site or other nickable site. Nickases are endonucleases that recognize a specific recognition sequence in double stranded DNA, and cut one strand at a specific location relative to said recognition sequence, thereby giving rise to single-stranded breaks in duplex DNA. Nickases include but are not limited to Nb.BsrDI, Nb.BsmI, Nt.BbvCI, Nb.BbvCI, Nb.BtsI and Nt.BstNBI. As an alternative to using a nickase site and a nickase, in other aspects uracil is incorporated into one strand of an adaptor and nicking is accomplished by using uracil-DNA glycosylase. The diamond and bowtie portions of the first and second adaptors comprise nickase digestion sites such that treatment of the desired 904 and undesired 906 library constructs with nickase (907) results in two nicks in the double-stranded library constructs. In the desired orientation 904, the “outer strand” remains circular, and both nicks are sustained by the “inner strand” resulting in two linear portions of nucleic acid. However, in the undesired orientation 906, both strands are nicked once, resulting in a circular construct with both strands nicked. Once nicked, the library constructs are subjected to circle dependant amplification (911), where only the intact, circular constructs that survived the nicking will be efficiently amplified. Thus, library constructs having the desired orientation 904 are selected by nicking and subsequent amplification. Library constructs in the undesired orientation 906 will not be amplified or will be amplified significantly less efficiently than the desired library constructs.
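  • The selection logic described above can be modeled as a count of nicks per strand: a construct is efficiently amplified only if at least one strand carries no nick and therefore remains an intact circle. The sketch below is a simplified model with hypothetical names, in which each adaptor contributes exactly one nick and each orientation is recorded as a boolean (True when the nickable portion of that adaptor lies on the inner strand).

```python
def nicks_per_strand(orientations):
    """Count nicks on each strand after nickase treatment.

    orientations: list of booleans, one per adaptor; True means the
    adaptor's nickable portion is on the inner strand.
    """
    inner = sum(1 for on_inner in orientations if on_inner)
    return {"inner": inner, "outer": len(orientations) - inner}


def survives_selection(orientations):
    """True if one strand is left un-nicked, i.e., still an intact circle."""
    return min(nicks_per_strand(orientations).values()) == 0


desired = [True, True]      # both nickable portions on the inner strand
undesired = [True, False]   # second adaptor inserted in the other orientation

print(nicks_per_strand(desired), survives_selection(desired))
print(nicks_per_strand(undesired), survives_selection(undesired))
```

  • In this model the desired two-adaptor construct has two nicks on the inner strand and none on the outer strand, so it survives, whereas the undesired construct has one nick on each strand and does not. The same functions extend to constructs with three or more adaptors.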
  • FIG. 10 is a schematic representation of a library construction process 1000 where four adaptors are inserted into a target nucleic acid in a desired orientation driven by nicking. First, genomic DNA 1002 is fragmented 1003 as described previously. The resulting fragmented genomic DNA 1003 is then treated to polish the ends, A's are added to the 3′ ends of the genomic DNA fragments, and first and second arms of a first adaptor are ligated to the genomic DNA fragments in process 1005 to produce library construct 1006. The first and second arms of the first adaptor are then ligated, resulting in a circularized library construct 1008. Circularized library construct 1008 is then subjected to circle dependant amplification 1009, which exponentially amplifies circularized library construct 1008. The collection of amplified, circularized library constructs are then cut with EcoP15 1011, where the first adaptor has two recognition sites (represented by the two arrows), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
  • The processes are then repeated with the resulting linearized library construct 1012. That is, library construct 1012 is treated to polish the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a second adaptor are ligated to the library constructs in process 1013 to produce library construct 1014. The first and second arms of the second adaptor are then ligated to the ends of the library constructs, and the constructs are circularized and nicked in process 1015. The resulting library constructs have the second adaptor inserted into the construct in two different orientations relative to the first adaptor, desired orientation 1016 and undesired orientation 1018. Note that in desired orientation 1016, the double-stranded circularized construct is nicked twice in one strand and not nicked in the other strand. In the undesired orientation 1018, each strand is nicked once. The combined population of desired constructs 1016 and undesired constructs 1018 is then subjected to circle dependent amplification 1019, where the one remaining circular strand will be amplified, and the strands that were linearized as the result of the nicking will not be amplified (or will be amplified at a very low efficiency). The resulting population of amplified, circularized library constructs 1020 is then cut with Bpm I 1021, where the second adaptor has one Bpm I recognition site (represented by the one arrow), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
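The enrichment that follows from amplifying only the un-nicked circular strand can be pictured with a toy Python model. This is an illustrative sketch under assumed kinetics, not data or a model from the specification. It assumes intact circles grow geometrically per arbitrary "period" while fully nicked constructs gain only one copy-equivalent per period. The function name, the fold-per-period value and the starting amounts are all hypothetical.

```python
def desired_fraction(desired0, undesired0, periods, fold_per_period=2.0):
    """Fraction of product derived from desired-orientation constructs.

    Toy assumptions (not from the specification):
      - constructs with an intact circular strand grow geometrically;
      - constructs nicked on both strands grow only linearly.
    """
    desired = desired0 * fold_per_period ** periods
    undesired = undesired0 * (1 + periods)
    return desired / (desired + undesired)


# Starting from an equal mixture of both orientations.
for periods in (0, 3, 10):
    print(periods, round(desired_fraction(1.0, 1.0, periods), 4))
```

Under these assumptions the desired orientation quickly dominates the amplified population, which is the qualitative behavior the selection relies on. Real yields depend on the amplification chemistry and on residual amplification of nicked templates.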
  • The processes are then repeated with the resulting linearized library construct 1022. Library construct 1022 is treated to blunt the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a third adaptor are ligated to the library constructs in process 1023 to produce library construct 1024. The first and second arms of the third adaptor are then ligated to the ends of the library constructs, and the constructs are circularized and nicked in process 1025. The resulting library constructs have the third adaptor inserted into the construct in two different orientations relative to the first and second adaptors, desired orientation 1026 and undesired orientation 1028. Note that in desired orientation 1026, the double-stranded circularized construct is nicked three times in one strand and not nicked in the other strand. In undesired orientation 1028, one strand is nicked twice and one strand is nicked once. The combined population of desired constructs 1026 and undesired constructs 1028 is then subjected to circle dependent amplification 1029, where the one remaining circular strand will be amplified, and the strands that were linearized as the result of the nicking will not be amplified (or will be amplified at a very low efficiency). The resulting population of amplified, circularized library constructs 1030 is then cut with Acu I 1031, where the third adaptor has one Acu I recognition site (represented by the one arrow), and the arrowheads indicate the sites where the circularized library construct is cut (within the target nucleic acid).
  • Again, the processes are repeated with the resulting linearized library construct 1032. Library construct 1032 is treated to blunt the ends, A's are added to the 3′ ends of the DNA fragments, and first and second arms of a fourth adaptor are ligated to the library constructs in process 1033 to produce library construct 1034. The first and second arms of the fourth adaptor are then ligated to the ends of the library constructs, and the constructs are circularized and nicked in process 1035. The resulting library constructs have the fourth adaptor inserted into the construct in two different orientations relative to the first, second and third adaptors, desired orientation 1036 and undesired orientation 1038. Note that in desired orientation 1036, the double-stranded circularized construct is nicked four times in one strand and not nicked in the other strand. In undesired orientation 1038, one strand is nicked three times and one strand is nicked once. The combined population of desired constructs 1036 and undesired constructs 1038 is then subjected to circle dependent replication 1039, where the one remaining circular strand will be replicated linearly, and the resulting concatemer is allowed to form DNBs 1040.
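The nick counts stated for each round of FIG. 10 can be checked with a short bookkeeping sketch in Python. It is an illustration only. The "+"/"-" encoding, the strand labels and the function names are assumptions introduced here: "+" means an adaptor sits in the same orientation as the first adaptor (its nickable site falls on the "inner" strand), and "-" means it is inverted (its nickable site falls on the "outer" strand).

```python
def nick_counts(orientations):
    """Count nicks per strand for a circular construct.

    `orientations` is a string such as "+++-", one character per adaptor,
    with the first adaptor defining "+" by convention.
    """
    inner = sum(1 for o in orientations if o == "+")
    return {"inner": inner, "outer": len(orientations) - inner}


def has_intact_circular_strand(orientations):
    """True if at least one strand carries no nick and so stays circular."""
    return min(nick_counts(orientations).values()) == 0


# Rounds adding the second, third and fourth adaptors. Earlier rounds are
# assumed to have already selected the desired orientation.
for n in (2, 3, 4):
    desired = "+" * n
    undesired = "+" * (n - 1) + "-"
    print(n, nick_counts(desired), has_intact_circular_strand(desired))
    print(n, nick_counts(undesired), has_intact_circular_strand(undesired))
```

For the fourth adaptor this reproduces the counts given in the text: four nicks on one strand and none on the other in the desired orientation, versus three and one in the undesired orientation, where no circular template survives.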
  • EXAMPLES
  • A Tailing: Samples of 100 ng of fragmented genomic DNA were prepared in Thermopol buffer, with dATP and Taq polymerase added. The samples were then incubated at 70° C. for 60 minutes and cooled to 4° C. The samples were then purified by Qiagen MinElute columns.
  • Adaptor annealing: The A tailed fragmented genomic DNA samples were mixed with T tailed adaptors and blocking oligos in a buffer containing NaCl, Tris and EDTA. The samples were then heated to 95° C. for 5 minutes and then allowed to cool to room temperature.
  • Adaptor ligation: The annealed adaptor/genomic DNA samples were mixed with HB ligation buffer and T4 ligase. The samples were then incubated at 14° C. for two hours, 70° C. for 10 minutes (to inactivate the T4 enzyme and remove the blocking oligos) and cooled to 4° C. The samples were then purified by Qiagen MinElute columns.
  • Adaptor circularization: The linear fragmented genomic DNAs, now flanked by first and second arms of an adaptor, were circularized by incubation in Epicentre buffer and T4 ligase at 14° C. for 14 hours. The samples were then heat inactivated at 70° C. for 10 minutes and then cooled to 4° C.
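The incubation steps of the four procedures above can be collected into a small data structure, for example to total the stated hold times. This is a sketch for orientation only. The `Hold` class and `timed_minutes` function are hypothetical names, and holds whose duration is not stated in the text (the cooling steps to 4° C.) are recorded as `None` rather than guessed. The annealing cool-down to room temperature is omitted because neither a temperature nor a time is given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hold:
    step: str                  # procedure the hold belongs to
    temp_c: float              # temperature in degrees Celsius
    minutes: Optional[float]   # None = duration not stated in the text


PROTOCOL = [
    Hold("A tailing", 70, 60),
    Hold("A tailing", 4, None),
    Hold("Adaptor annealing", 95, 5),
    Hold("Adaptor ligation", 14, 120),
    Hold("Adaptor ligation", 70, 10),   # inactivates ligase, removes blocking oligos
    Hold("Adaptor ligation", 4, None),
    Hold("Adaptor circularization", 14, 14 * 60),
    Hold("Adaptor circularization", 70, 10),
    Hold("Adaptor circularization", 4, None),
]


def timed_minutes(protocol):
    """Sum the hold times that are explicitly stated."""
    return sum(h.minutes for h in protocol if h.minutes is not None)


print(timed_minutes(PROTOCOL))  # 1045 minutes of stated incubation time
```

Column purification steps and hands-on time are not included in this total.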
  • Model System for Orientation Selection by Nicking: A model system was used to optimize selection of adaptor orientation. First, genomic DNA samples were amplified using primers with various nick sites including a Sph I site. Once amplified, Sph I was used to cut the amplified DNA to expose ends for circularization. The restricted amplified DNA was cut with a restriction endonuclease, blunt ended and A tailed. The A tailed products were ligated to adaptor arms and circularized in Epicentre buffer, 25 mM ATP and T4 ligase, with a 2 hour incubation at 14° C., and a subsequent heat denaturation. The samples were then purified by Qiagen MinElute columns. The circularized constructs were then cut with a restriction endonuclease, blunt ended and A tailed. The A tailed products were ligated to a second set of adaptor arms (containing an Ava I site) and circularized.
  • Next, the circularized constructs were nicked in NEB buffer with a nickase enzyme (Nt.BstNBI, Nb.BsrDI or Nb.BsmI) at 55° C. for 1.5 hours, with a subsequent 20 minute inactivation. The nicked amplification products were then amplified by circle dependent amplification (four hour incubation at 30° C.). The circularized products were then cut with Ava I and an aliquot was removed to confirm the adaptor orientation.
  • The present specification provides a complete description of the methodologies, systems and/or structures and uses thereof in example aspects of the presently-described technology. Although various aspects of this technology have been described above with a certain degree of particularity, or with reference to one or more individual aspects, those skilled in the art could make numerous alterations to the disclosed aspects without departing from the spirit or scope of the technology hereof. Since many aspects can be made without departing from the spirit and scope of the presently described technology, the appropriate scope resides in the claims hereinafter appended. Other aspects are therefore contemplated. Furthermore, it should be understood that any operations may be performed in any order, unless explicitly claimed otherwise or a specific order is inherently necessitated by the claim language. It is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative only of particular aspects and is not limiting to the embodiments shown. Changes in detail or structure may be made without departing from the basic elements of the present technology as defined in the following claims.

Claims (24)

1. A method for enriching or selecting for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising:
obtaining target nucleic acids;
ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickable site;
ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickable site;
circularizing the second library constructs;
nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nicks on the other strand; and
subjecting the library constructs to circle dependent amplification, wherein the strands without nicks will be amplified exponentially and the strands with nicks will be amplified non-exponentially, thereby enriching for library constructs having a desired orientation of the second adaptor with respect to the first adaptor.
2. The method of claim 1, wherein the first library constructs are circularized between the ligating processes.
3. The method of claim 2, wherein the first library constructs are cut with a restriction endonuclease after being circularized.
4. The method of claim 1, wherein the first adaptor is ligated to the target nucleic acid as two adaptor arms.
5. The method of claim 1, wherein the second adaptor is ligated to the first library constructs as two adaptor arms.
6. The method of claim 1, wherein the first and second adaptors further comprise Type IIs endonuclease recognition sites.
7. A method for enriching or selecting for orientation of two adaptors with respect to one another in nucleic acid library constructs comprising:
(a) obtaining target nucleic acids;
(b) ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein one strand of the first adaptor comprises a first nickase site;
(c) ligating a second adaptor to the first library constructs to produce second library constructs, wherein one strand of the second adaptor comprises a second nickase site;
(d) circularizing the second library constructs;
(e) nicking the second library constructs with a nickase to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nicks on the other strand; and
(f) subjecting the library constructs to circle dependent amplification, wherein the strands with no nicks will be amplified exponentially and the strands with nicks will be amplified non-exponentially, thereby enriching for orientation of the second adaptor with respect to the first adaptor in the nucleic acid library constructs.
8. The method of claim 7, wherein processes (b) through (f) are repeated until a desired number of adaptors have been inserted into the nucleic acid library constructs.
9. The method of claim 7, wherein the first adaptor is ligated to the target nucleic acid as two adaptor arms.
10. The method of claim 7, wherein the second adaptor is ligated to the first library constructs as two adaptor arms.
11. An amplicon made by amplification of a circular library construct comprising target nucleic acid interspersed with a plurality of adaptors, wherein at least one of the plurality of adaptors has a desired orientation with respect to at least one other of the plurality of adaptors.
12. The amplicon of claim 11, wherein each of the plurality of adaptors has a desired orientation with respect to at least one other of the plurality of adaptors.
13. The amplicon of claim 11, wherein one or more of the adaptors comprises a restriction endonuclease recognition site.
14. The amplicon of claim 13, wherein the restriction endonuclease recognition site is a Type IIs restriction endonuclease recognition site.
15. The amplicon of claim 11, wherein each adaptor of the plurality of adaptors further comprises a different anchor primer binding site at a 5′ and 3′ end of each of the plurality of adaptors.
16. A multiplicity of amplicons of circular library constructs, wherein each amplicon comprises target nucleic acid interspersed with a plurality of adaptors, wherein at least one of the plurality of adaptors has a desired orientation with respect to at least one other of the plurality of adaptors.
17. The multiplicity of amplicons of claim 16, wherein each of the plurality of adaptors in each amplicon has a desired orientation with respect to at least one other of the plurality of adaptors.
18. The multiplicity of amplicons of claim 16, wherein the target nucleic acid is genomic DNA, cDNA or RNA, and wherein the multiplicity of amplicons comprises substantially all of genomic DNA, cDNA or RNA of interest.
19. The multiplicity of amplicons of claim 16, wherein one or more of the adaptors comprises a restriction endonuclease recognition site.
20. The multiplicity of amplicons of claim 19, wherein the restriction endonuclease recognition site is a Type IIs restriction endonuclease recognition site.
21. The multiplicity of amplicons of claim 16, wherein each adaptor of the plurality of adaptors further comprises a different anchor primer binding site at a 5′ and 3′ end of each of the plurality of adaptors.
22. A kit for selecting for desired orientations of multiple adaptors in library constructs, wherein said kit comprises:
a) a first double-stranded adaptor, wherein said first adaptor comprises a recognition site for a first Type IIs restriction endonuclease and a nickable site;
b) a second double-stranded adaptor, wherein said second adaptor comprises a restriction site for a second Type IIs restriction endonuclease and a nickable site; and
c) primers complementary to both strands of each of said first and second adaptors.
23. The kit for selecting for desired orientation of multiple adaptors in library constructs of claim 22, further comprising: Type IIs restriction endonucleases, polymerases or dNTPs.
24. A method for enriching or selecting for orientation of two adaptors with respect to one another in library constructs comprising:
obtaining target nucleic acids;
ligating a first adaptor to the target nucleic acids to produce first library constructs, wherein the undesired strand of the first adaptor comprises a first nickable site;
ligating a second adaptor to the first library constructs to produce second library constructs, wherein the undesired strand of the second adaptor comprises a second nickable site;
circularizing the second library constructs;
nicking the second library constructs to form a mixture of library constructs with nicks on both strands and library constructs with nicks on one strand and no nicks on the other strand; and
amplifying the circularized second library constructs with primers that bind to the desired strand of the first adaptor and the desired strand of the second adaptor, wherein the desired strands without nicks will be amplified exponentially and the strands with nicks will be amplified non-exponentially, thereby enriching for library constructs having a desired orientation of the second adaptor with respect to the first adaptor.
US11/934,695 2006-11-09 2007-11-02 Selection of dna adaptor orientation by nicking Abandoned US20090075343A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/934,695 US20090075343A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by nicking
US12/573,697 US8518640B2 (en) 2007-10-29 2009-10-05 Nucleic acid sequencing and process

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US86499206P 2006-11-09 2006-11-09
US11/934,695 US20090075343A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by nicking

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/934,697 Continuation-In-Part US20090111705A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by hybrid capture

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/927,356 Continuation-In-Part US7910354B2 (en) 2006-10-27 2007-10-29 Efficient arrays of amplified polynucleotides

Publications (1)

Publication Number Publication Date
US20090075343A1 true US20090075343A1 (en) 2009-03-19

Family

ID=39365402

Family Applications (5)

Application Number Title Priority Date Filing Date
US11/934,697 Abandoned US20090111705A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by hybrid capture
US11/934,703 Abandoned US20090111706A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by amplification
US11/934,695 Abandoned US20090075343A1 (en) 2006-11-09 2007-11-02 Selection of dna adaptor orientation by nicking
US11/938,106 Abandoned US20080171331A1 (en) 2006-11-09 2007-11-09 Methods and Compositions for Large-Scale Analysis of Nucleic Acids Using DNA Deletions
US11/938,096 Active 2030-03-13 US9334490B2 (en) 2006-11-09 2007-11-09 Methods and compositions for large-scale analysis of nucleic acids using DNA deletions

Country Status (2)

Country Link
US (5) US20090111705A1 (en)
WO (2) WO2008070375A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5710000A (en) * 1994-09-16 1998-01-20 Affymetrix, Inc. Capturing sequences adjacent to Type-IIs restriction sites for genomic library mapping
US6287824B1 (en) * 1998-09-15 2001-09-11 Yale University Molecular cloning using rolling circle amplification
US20060024711A1 (en) * 2004-07-02 2006-02-02 Helicos Biosciences Corporation Methods for nucleic acid amplification and sequence determination
US7232656B2 (en) * 1998-07-30 2007-06-19 Solexa Ltd. Arrayed biomolecules and their use in sequencing

Family Cites Families (173)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4318846A (en) 1979-09-07 1982-03-09 Syva Company Novel ether substituted fluorescein polyamino acid compounds as fluorescers and quenchers
US4469863A (en) 1980-11-12 1984-09-04 Ts O Paul O P Nonionic nucleic acid alkyl and aryl phosphonates and processes for manufacture and use thereof
US4994373A (en) * 1983-01-27 1991-02-19 Enzo Biochem, Inc. Method and structures employing chemically-labelled polynucleotide probes
US4605735A (en) * 1983-02-14 1986-08-12 Wakunaga Seiyaku Kabushiki Kaisha Oligonucleotide derivatives
US4719179A (en) 1984-11-30 1988-01-12 Pharmacia P-L Biochemicals, Inc. Six base oligonucleotide linkers and methods for their use
US4883750A (en) 1984-12-13 1989-11-28 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
US5235033A (en) * 1985-03-15 1993-08-10 Anti-Gene Development Group Alpha-morpholino ribonucleoside derivatives and polymers thereof
US5034506A (en) * 1985-03-15 1991-07-23 Anti-Gene Development Group Uncharged morpholino-based polymers having achiral intersubunit linkages
US4757141A (en) * 1985-08-26 1988-07-12 Applied Biosystems, Incorporated Amino-derivatized phosphite and phosphate linking agents, phosphoramidite precursors, and useful conjugates thereof
US5091519A (en) * 1986-05-01 1992-02-25 Amoco Corporation Nucleotide compositions with linking groups
US5151507A (en) 1986-07-02 1992-09-29 E. I. Du Pont De Nemours And Company Alkynylamino-nucleotides
US6270961B1 (en) 1987-04-01 2001-08-07 Hyseq, Inc. Methods and apparatus for DNA sequencing and DNA identification
US5124246A (en) 1987-10-15 1992-06-23 Chiron Corporation Nucleic acid multimers and amplified nucleic acid hybridization assays using same
US4886741A (en) 1987-12-09 1989-12-12 Microprobe Corporation Use of volume exclusion agents for the enhancement of in situ hybridization
DE3813278A1 (en) 1988-01-12 1989-07-20 Boehringer Mannheim Gmbh METHOD FOR DETECTING NUCLEIC ACIDS
US5354657A (en) 1988-01-12 1994-10-11 Boehringer Mannheim Gmbh Process for the highly specific detection of nucleic acids in solid
US5216141A (en) 1988-06-06 1993-06-01 Benner Steven A Oligonucleotide analogs containing sulfur linkages
US5066580A (en) 1988-08-31 1991-11-19 Becton Dickinson And Company Xanthene dyes that emit to the red of fluorescein
DE3836656A1 (en) * 1988-10-27 1990-05-03 Boehringer Mannheim Gmbh NEW DIGOXIGENINE DERIVATIVES AND THEIR USE
US5091302A (en) * 1989-04-27 1992-02-25 The Blood Center Of Southeastern Wisconsin, Inc. Polymorphism of human platelet membrane glycoprotein iiia and diagnostic and therapeutic applications thereof
US6416952B1 (en) * 1989-06-07 2002-07-09 Affymetrix, Inc. Photolithographic and other means for manufacturing arrays
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
US5800992A (en) 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US6346413B1 (en) 1989-06-07 2002-02-12 Affymetrix, Inc. Polymer arrays
US5744101A (en) * 1989-06-07 1998-04-28 Affymax Technologies N.V. Photolabile nucleoside protecting groups
US5366860A (en) 1989-09-29 1994-11-22 Applied Biosystems, Inc. Spectrally resolvable rhodamine dyes for nucleic acid sequence determination
US5188934A (en) * 1989-11-14 1993-02-23 Applied Biosystems, Inc. 4,7-dichlorofluorescein dyes as molecular probes
US5427930A (en) * 1990-01-26 1995-06-27 Abbott Laboratories Amplification of target nucleic acids using gap filling ligase chain reaction
CA2036946C (en) 1990-04-06 2001-10-16 Kenneth V. Deugau Indexing linkers
GB9009980D0 (en) 1990-05-03 1990-06-27 Amersham Int Plc Phosphoramidite derivatives, their preparation and the use thereof in the incorporation of reporter groups on synthetic oligonucleotides
US5386023A (en) * 1990-07-27 1995-01-31 Isis Pharmaceuticals Backbone modified oligonucleotide analogs and preparation thereof through reductive coupling
US5602240A (en) * 1990-07-27 1997-02-11 Ciba Geigy Ag. Backbone modified oligonucleotide analogs
JP3080178B2 (en) 1991-02-18 2000-08-21 東洋紡績株式会社 Method for amplifying nucleic acid sequence and reagent kit therefor
US5426180A (en) 1991-03-27 1995-06-20 Research Corporation Technologies, Inc. Methods of making single-stranded circular oligonucleotides
JP3085409B2 (en) 1991-03-29 2000-09-11 東洋紡績株式会社 Method for detecting target nucleic acid sequence and reagent kit therefor
US5474796A (en) 1991-09-04 1995-12-12 Protogene Laboratories, Inc. Method and apparatus for conducting an array of chemical reactions on a support surface
US6589726B1 (en) * 1991-09-04 2003-07-08 Metrigen, Inc. Method and apparatus for in situ synthesis on a solid support
PT534858E (en) * 1991-09-24 2000-09-29 Keygene Nv SELECTIVE AMPLIFICATION OF RESTRICTION FRAGMENTS: A GENERAL METHOD FOR DNA FINGERPRINTING
US5632957A (en) 1993-11-01 1997-05-27 Nanogen Molecular biological diagnostic systems including electrodes
US5644048A (en) 1992-01-10 1997-07-01 Isis Pharmaceuticals, Inc. Process for preparing phosphorothioate oligonucleotides
US5403708A (en) 1992-07-06 1995-04-04 Brennan; Thomas M. Methods and compositions for determining the sequence of nucleic acids
GB9214873D0 (en) 1992-07-13 1992-08-26 Medical Res Council Process for categorising nucleotide sequence populations
US6261808B1 (en) * 1992-08-04 2001-07-17 Replicon, Inc. Amplification of nucleic acid molecules via circular replicons
US5834202A (en) * 1992-08-04 1998-11-10 Replicon, Inc. Methods for the isothermal amplification of nucleic acid molecules
WO1994003624A1 (en) 1992-08-04 1994-02-17 Auerbach Jeffrey I Methods for the isothermal amplification of nucleic acid molecules
US5714320A (en) * 1993-04-15 1998-02-03 University Of Rochester Rolling circle synthesis of oligonucleotides and amplification of select randomized circular oligonucleotides
US6096880A (en) 1993-04-15 2000-08-01 University Of Rochester Circular DNA vectors for synthesis of RNA and DNA
US6077668A (en) 1993-04-15 2000-06-20 University Of Rochester Highly sensitive multimeric nucleic acid probes
US5473060A (en) 1993-07-02 1995-12-05 Lynx Therapeutics, Inc. Oligonucleotide clamps having diagnostic applications
EP0706531A4 (en) 1993-07-02 1998-11-25 Lynx Therapeutics Inc Synthesis of branched nucleic acids
US6401267B1 (en) 1993-09-27 2002-06-11 Radoje Drmanac Methods and compositions for efficient nucleic acid sequencing
DE69433487T2 (en) 1993-09-27 2004-11-25 Arch Development Corp., Chicago METHODS AND COMPOSITIONS FOR EFFICIENT NUCLEIC ACID SEQUENCING
US5654419A (en) 1994-02-01 1997-08-05 The Regents Of The University Of California Fluorescent labels and their use in separations
SE9400522D0 (en) 1994-02-16 1994-02-16 Ulf Landegren Method and reagent for detecting specific nucleotide sequences
US5637684A (en) * 1994-02-23 1997-06-10 Isis Pharmaceuticals, Inc. Phosphoramidate and phosphorothioamidate oligomeric compounds
US5641658A (en) * 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US6013445A (en) * 1996-06-06 2000-01-11 Lynx Therapeutics, Inc. Massively parallel signature sequencing by ligation of encoded adaptors
US6654505B2 (en) 1994-10-13 2003-11-25 Lynx Therapeutics, Inc. System and apparatus for sequential processing of analytes
FR2726286B1 (en) 1994-10-28 1997-01-17 Genset Sa SOLID PHASE NUCLEIC ACID AMPLIFICATION PROCESS AND REAGENT KIT USEFUL FOR CARRYING OUT SAID PROCESS
US5866337A (en) 1995-03-24 1999-02-02 The Trustees Of Columbia University In The City Of New York Method to detect mutations in a nucleic acid using a hybridization-ligation procedure
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US5648245A (en) * 1995-05-09 1997-07-15 Carnegie Institution Of Washington Method for constructing an oligonucleotide concatamer library by rolling circle replication
AU5763296A (en) 1995-05-12 1996-11-29 Novartis Ag Sensor platform and method for the parallel detection of a plurality of analytes using evanescently excited luminescence
US5854033A (en) 1995-11-21 1998-12-29 Yale University Rolling circle replication reporter systems
DE69612013T2 (en) 1995-11-21 2001-08-02 Univ Yale New Haven UNIMOLECULAR SEGMENT AMPLIFICATION AND DETERMINATION
ATE238434T1 (en) 1995-12-05 2003-05-15 Jorn Erland Koch A CASCADED DUPLICATION REACTION OF NUCLEIC ACIDS
US5800996A (en) 1996-05-03 1998-09-01 The Perkin Elmer Corporation Energy transfer dyes with enchanced fluorescence
US5847162A (en) 1996-06-27 1998-12-08 The Perkin Elmer Corporation 4, 7-Dichlororhodamine dyes
US5851804A (en) 1996-05-06 1998-12-22 Apollon, Inc. Chimeric kanamycin resistance gene
US5869245A (en) 1996-06-05 1999-02-09 Fox Chase Cancer Center Mismatch endonuclease and its use in identifying mutations in targeted polynucleotide strands
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
US5916750A (en) 1997-01-08 1999-06-29 Biogenex Laboratories Multifunctional linking reagents for synthesis of branched oligomers
US6309824B1 (en) 1997-01-16 2001-10-30 Hyseq, Inc. Methods for analyzing a target nucleic acid using immobilized heterogeneous mixtures of oligonucleotide probes
US6297006B1 (en) 1997-01-16 2001-10-02 Hyseq, Inc. Methods for sequencing repetitive sequences and for determining the order of sequence subfragments
US5994068A (en) 1997-03-11 1999-11-30 Wisconsin Alumni Research Foundation Nucleic acid indexing
ATE269908T1 (en) * 1997-04-01 2004-07-15 Manteia S A METHOD FOR SEQUENCING NUCLEIC ACIDS
US5888737A (en) * 1997-04-15 1999-03-30 Lynx Therapeutics, Inc. Adaptor-based sequence analysis
US20040229221A1 (en) 1997-05-08 2004-11-18 Trustees Of Columbia University In The City Of New York Method to detect mutations in a nucleic acid using a hybridization-ligation procedure
CA2792122C (en) 1997-07-07 2015-09-08 Medical Research Council In vitro sorting method
CA2305449A1 (en) 1997-10-10 1999-04-22 President & Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6322901B1 (en) 1997-11-13 2001-11-27 Massachusetts Institute Of Technology Highly luminescent color-selective nano-crystalline materials
US5990479A (en) 1997-11-25 1999-11-23 Regents Of The University Of California Organo Luminescent semiconductor nanocrystal probes for biological applications and process for making and using such probes
US6207392B1 (en) 1997-11-25 2001-03-27 The Regents Of The University Of California Semiconductor nanocrystal probes for biological applications and process for making and using such probes
US6136537A (en) 1998-02-23 2000-10-24 Macevicz; Stephen C. Gene expression analysis
EP1985714B1 (en) 1998-03-25 2012-02-29 Olink AB Method and kit for detecting a target molecule employing at least two affinity probes and rolling circle replication of padlock probes
US6004755A (en) 1998-04-07 1999-12-21 Incyte Pharmaceuticals, Inc. Quantitative microarray hybridizaton assays
US6284497B1 (en) 1998-04-09 2001-09-04 Trustees Of Boston University Nucleic acid arrays and methods of synthesis
US6255469B1 (en) * 1998-05-06 2001-07-03 New York University Periodic two and three dimensional nucleic acid structures
US6316229B1 (en) 1998-07-20 2001-11-13 Yale University Single molecule analysis target-mediated ligation of bipartite primers
WO2000006770A1 (en) * 1998-07-30 2000-02-10 Solexa Ltd. Arrayed biomolecules and their use in sequencing
US6232067B1 (en) 1998-08-17 2001-05-15 The Perkin-Elmer Corporation Adapter directed expression analysis
US6653077B1 (en) 1998-09-04 2003-11-25 Lynx Therapeutics, Inc. Method of screening for genetic polymorphism
US6251303B1 (en) * 1998-09-18 2001-06-26 Massachusetts Institute Of Technology Water-soluble fluorescent nanocrystals
US6326144B1 (en) 1998-09-18 2001-12-04 Massachusetts Institute Of Technology Biological applications of quantum dots
US6426513B1 (en) 1998-09-18 2002-07-30 Massachusetts Institute Of Technology Water-soluble thiol-capped nanocrystals
US6235502B1 (en) 1998-09-18 2001-05-22 Molecular Staging Inc. Methods for selectively isolating DNA using rolling circle amplification
US6534293B1 (en) * 1999-01-06 2003-03-18 Cornell Research Foundation, Inc. Accelerating identification of single nucleotide polymorphisms and alignment of clones in genomic sequencing
DE60042775D1 (en) 1999-01-06 2009-10-01 Callida Genomics Inc IMPROVED SEQUENCING BY HYBRIDIZATION THROUGH THE USE OF PROBE MIXTURES
GB9901475D0 (en) 1999-01-22 1999-03-17 Pyrosequencing Ab A method of DNA sequencing
US6514768B1 (en) * 1999-01-29 2003-02-04 Surmodics, Inc. Replicable probe array
ES2324513T3 (en) * 1999-03-18 2009-08-10 Complete Genomics As CLONING AND PRODUCTION PROCEDURES OF FRAGMENT CHAINS WITH LEGIBLE INFORMATION CONTENT.
AU4701200A (en) 1999-05-07 2000-11-21 Quantum Dot Corporation A method of detecting an analyte using semiconductor nanocrystals
EP1190100B1 (en) 1999-05-20 2012-07-25 Illumina, Inc. Combinatorial decoding of random nucleic acid arrays
US6573369B2 (en) * 1999-05-21 2003-06-03 Bioforce Nanosciences, Inc. Method and apparatus for solid state molecular analysis
US7501245B2 (en) 1999-06-28 2009-03-10 Helicos Biosciences Corp. Methods and apparatuses for analyzing polynucleotide sequences
US6818395B1 (en) * 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
US6472156B1 (en) 1999-08-30 2002-10-29 The University Of Utah Homogeneous multiplex hybridization analysis by color and Tm
US7244559B2 (en) 1999-09-16 2007-07-17 454 Life Sciences Corporation Method of sequencing a nucleic acid
US6274320B1 (en) 1999-09-16 2001-08-14 Curagen Corporation Method of sequencing a nucleic acid
US7211390B2 (en) 1999-09-16 2007-05-01 454 Life Sciences Corporation Method of sequencing a nucleic acid
WO2001023610A2 (en) 1999-09-29 2001-04-05 Solexa Ltd. Polynucleotide sequencing
US6297016B1 (en) 1999-10-08 2001-10-02 Applera Corporation Template-dependent ligation with PNA-DNA chimeric probes
EP1238105A2 (en) * 1999-12-02 2002-09-11 Molecular Staging Inc. Generation of single-strand circular dna from linear self-annealing segments
US6500620B2 (en) 1999-12-29 2002-12-31 Mergen Ltd. Methods for amplifying and detecting multiple polynucleotides on a solid phase support
GB0002389D0 (en) 2000-02-02 2000-03-22 Solexa Ltd Molecular arrays
US6221603B1 (en) * 2000-02-04 2001-04-24 Molecular Dynamics, Inc. Rolling circle amplification assay for nucleic acid analysis
EP1255865B1 (en) 2000-02-07 2007-04-18 Illumina, Inc. Nucleic acid detection method using universal priming
US6913884B2 (en) * 2001-08-16 2005-07-05 Illumina, Inc. Compositions and methods for repetitive use of genomic DNA
EP1990428B1 (en) 2000-02-07 2010-12-22 Illumina, Inc. Nucleic acid detection methods using universal priming
WO2001062982A2 (en) 2000-02-25 2001-08-30 Mosaic Technologies, Inc. Methods for multi-stage solid phase amplification of nucleic acids
US20020004204A1 (en) * 2000-02-29 2002-01-10 O'keefe Matthew T. Microarray substrate with integrated photodetector and methods of use thereof
US6413722B1 (en) 2000-03-22 2002-07-02 Incyte Genomics, Inc. Polymer coated surfaces for microarray applications
US6649138B2 (en) 2000-10-13 2003-11-18 Quantum Dot Corporation Surface-modified semiconductive and metallic nanoparticles having enhanced dispersibility in aqueous media
US6576291B2 (en) 2000-12-08 2003-06-10 Massachusetts Institute Of Technology Preparation of nanocrystallites
WO2002050310A2 (en) 2000-12-20 2002-06-27 The Regents Of The University Of California Rolling circle amplification detection of rna and dna
DK1370690T3 (en) 2001-03-16 2012-07-09 Kalim Mir Arrays and methods for using them
ATE556845T1 (en) 2001-07-20 2012-05-15 Life Technologies Corp LUMINESCENT NANOPARTICLES AND THEIR PRODUCTION
US7297778B2 (en) 2001-07-25 2007-11-20 Affymetrix, Inc. Complexity management of genomic DNA
GB2378245A (en) 2001-08-03 2003-02-05 Mats Nilsson Nucleic acid amplification method
GB2382137A (en) 2001-11-20 2003-05-21 Mats Gullberg Nucleic acid enrichment
US7011945B2 (en) 2001-12-21 2006-03-14 Eastman Kodak Company Random array of micro-spheres for the analysis of nucleic acids
US20040002090A1 (en) * 2002-03-05 2004-01-01 Pascal Mayer Methods for detecting genome-wide sequence variations associated with a phenotype
AUPS298102A0 (en) 2002-06-13 2002-07-04 Nucleics Pty Ltd Method for performing chemical reactions
US20050019776A1 (en) * 2002-06-28 2005-01-27 Callow Matthew James Universal selective genome amplification and universal genotyping system
AU2003267583A1 (en) 2002-09-19 2004-04-08 The Chancellor, Master And Scholars Of The University Of Oxford Molecular arrays and single molecule detection
US7459273B2 (en) 2002-10-04 2008-12-02 Affymetrix, Inc. Methods for genotyping selected polymorphism
JP4395133B2 (en) 2002-12-20 2010-01-06 カリパー・ライフ・サイエンシズ・インク. Single molecule amplification and detection of DNA
US6977153B2 (en) 2002-12-31 2005-12-20 Qiagen Gmbh Rolling circle amplification of RNA
AU2004254552B2 (en) * 2003-01-29 2008-04-24 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
JP2006517798A (en) 2003-02-12 2006-08-03 イェニソン スベンスカ アクティエボラーグ Methods and means for nucleic acid sequences
AU2004214891B2 (en) 2003-02-26 2010-01-07 Complete Genomics, Inc. Random array DNA analysis by hybridization
WO2005029040A2 (en) * 2003-09-18 2005-03-31 Parallele Biosciences, Inc. System and methods for enhancing signal-to-noise ratios of microarray-based measurements
GB0324456D0 (en) 2003-10-20 2003-11-19 Isis Innovation Parallel DNA sequencing methods
EP2202322A1 (en) * 2003-10-31 2010-06-30 AB Advanced Genetic Analysis Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
US7169560B2 (en) 2003-11-12 2007-01-30 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
US20050136414A1 (en) 2003-12-23 2005-06-23 Kevin Gunderson Methods and compositions for making locus-specific arrays
WO2005073410A2 (en) 2004-01-28 2005-08-11 454 Corporation Nucleic acid amplification with continuous flow emulsion
GB0402895D0 (en) 2004-02-10 2004-03-17 Solexa Ltd Arrayed polynucleotides
EP2248911A1 (en) 2004-02-19 2010-11-10 Helicos Biosciences Corporation Methods and kits for analyzing polynucleotide sequences
CA2557841A1 (en) 2004-02-27 2005-09-09 President And Fellows Of Harvard College Polony fluorescent in situ sequencing beads
US20050214840A1 (en) 2004-03-23 2005-09-29 Xiangning Chen Restriction enzyme mediated method of multiplex genotyping
GB2413796B (en) 2004-03-25 2006-03-29 Global Genomics Ab Methods and means for nucleic acid sequencing
US20050260609A1 (en) 2004-05-24 2005-11-24 Lapidus Stanley N Methods and devices for sequencing nucleic acids
US20070117104A1 (en) 2005-11-22 2007-05-24 Buzby Philip R Nucleotide analogs
WO2006007207A2 (en) 2004-05-25 2006-01-19 Helicos Biosciences Corporation Methods and devices for nucleic acid sequence determination
US7276720B2 (en) 2004-07-19 2007-10-02 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
US20060012793A1 (en) 2004-07-19 2006-01-19 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
WO2006073504A2 (en) 2004-08-04 2006-07-13 President And Fellows Of Harvard College Wobble sequencing
GB0422551D0 (en) 2004-10-11 2004-11-10 Univ Liverpool Labelling and sequencing of nucleic acids
CA2588122A1 (en) 2004-11-16 2006-05-26 Helicos Biosciences Corporation Tirf single molecule analysis and method of sequencing nucleic acids
JP2008526877A (en) 2005-01-05 2008-07-24 エージェンコート パーソナル ジェノミクス Reversible nucleotide terminator and use thereof
EP1844162B1 (en) 2005-02-01 2014-10-15 Applied Biosystems, LLC Method for identifying a sequence in a polynucleotide
SG162795A1 (en) 2005-06-15 2010-07-29 Callida Genomics Inc Single molecule arrays for genetic and chemical analysis
EP1907591A2 (en) 2005-07-28 2008-04-09 Helicos Biosciences Corporation Consecutive base single molecule sequencing
WO2007021944A2 (en) 2005-08-11 2007-02-22 The J. Craig Venter Institute In vitro recombination method
US7666593B2 (en) 2005-08-26 2010-02-23 Helicos Biosciences Corporation Single molecule sequencing of captured nucleic acids
WO2007087310A2 (en) 2006-01-23 2007-08-02 Population Genetics Technologies Ltd. Nucleic acid analysis using sequence tokens
WO2007092538A2 (en) 2006-02-07 2007-08-16 President And Fellows Of Harvard College Methods for making nucleotide probes for sequencing and synthesis
SG10201405158QA (en) 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
CN101432439B (en) * 2006-02-24 2013-07-24 考利达基因组股份有限公司 High throughput genome sequencing on DNA arrays
CA2647786A1 (en) 2006-03-14 2007-09-20 Genizon Biosciences Inc. Methods and means for nucleic acid sequencing
AU2007237909A1 (en) 2006-04-19 2007-10-25 Applied Biosystems, Llc. Reagents, methods, and libraries for gel-free bead-based sequencing
US20090111705A1 (en) 2006-11-09 2009-04-30 Complete Genomics, Inc. Selection of dna adaptor orientation by hybrid capture

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5710000A (en) * 1994-09-16 1998-01-20 Affymetrix, Inc. Capturing sequences adjacent to Type-IIs restriction sites for genomic library mapping
US20060223097A1 (en) * 1994-09-16 2006-10-05 Affymetrix, Inc. Capturing sequences adjacent to type-IIS restriction sites for genomic library mapping
US7232656B2 (en) * 1998-07-30 2007-06-19 Solexa Ltd. Arrayed biomolecules and their use in sequencing
US6287824B1 (en) * 1998-09-15 2001-09-11 Yale University Molecular cloning using rolling circle amplification
US20060024711A1 (en) * 2004-07-02 2006-02-02 Helicos Biosciences Corporation Methods for nucleic acid amplification and sequence determination

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8785127B2 (en) 2003-02-26 2014-07-22 Callida Genomics, Inc. Random array DNA analysis by hybridization
US11414702B2 (en) 2005-06-15 2022-08-16 Complete Genomics, Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
US10351909B2 (en) 2005-06-15 2019-07-16 Complete Genomics, Inc. DNA sequencing from high density DNA arrays using asynchronous reactions
US9944984B2 (en) 2005-06-15 2018-04-17 Complete Genomics, Inc. High density DNA array
US8609335B2 (en) 2005-10-07 2013-12-17 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
US20090118488A1 (en) * 2006-02-24 2009-05-07 Complete Genomics, Inc. High throughput genome sequencing on DNA arrays
US20090203551A1 (en) * 2007-11-05 2009-08-13 Complete Genomics, Inc. Methods and Oligonucleotide Designs for Insertion of Multiple Adaptors Employing Selective Methylation
US7901890B2 (en) * 2007-11-05 2011-03-08 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
US20090176652A1 (en) * 2007-11-06 2009-07-09 Complete Genomics, Inc. Methods and Oligonucleotide Designs for Insertion of Multiple Adaptors into Library Constructs
US7897344B2 (en) * 2007-11-06 2011-03-01 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
US20090263872A1 (en) * 2008-01-23 2009-10-22 Complete Genomics Inc. Methods and compositions for preventing bias in amplification and sequencing reactions
US20110015864A1 (en) * 2009-02-03 2011-01-20 Halpern Aaron L Oligomer sequences mapping
US20100286925A1 (en) * 2009-02-03 2010-11-11 Halpern Aaron L Oligomer sequences mapping
US8615365B2 (en) 2009-02-03 2013-12-24 Complete Genomics, Inc. Oligomer sequences mapping
US20100287165A1 (en) * 2009-02-03 2010-11-11 Halpern Aaron L Indexing a reference sequence for oligomer sequence mapping
US8731843B2 (en) 2009-02-03 2014-05-20 Complete Genomics, Inc. Oligomer sequences mapping
US8738296B2 (en) 2009-02-03 2014-05-27 Complete Genomics, Inc. Indexing a reference sequence for oligomer sequence mapping
US20110004413A1 (en) * 2009-04-29 2011-01-06 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
EP2511843A2 (en) 2009-04-29 2012-10-17 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
US8725422B2 (en) 2010-10-13 2014-05-13 Complete Genomics, Inc. Methods for estimating genome-wide copy number variations
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US11835437B2 (en) 2011-11-02 2023-12-05 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
WO2013166517A1 (en) 2012-05-04 2013-11-07 Complete Genomics, Inc. Methods for determining absolute genome-wide copy number variations of complex tumors
WO2014145820A2 (en) 2013-03-15 2014-09-18 Complete Genomics, Inc. Multiple tagging of long dna fragments
EP3741872A1 (en) 2013-03-15 2020-11-25 Complete Genomics, Inc. Multiple tagging of long dna fragments
US10726942B2 (en) 2013-08-23 2020-07-28 Complete Genomics, Inc. Long fragment de novo assembly using short reads
US20180044668A1 (en) * 2014-10-14 2018-02-15 Bgi Shenzhen Co., Limited Mate pair library construction
WO2023028478A3 (en) * 2021-08-26 2023-04-06 Illumina, Inc. Methods and compositions for detecting genomic methylation

Also Published As

Publication number Publication date
US20080171331A1 (en) 2008-07-17
US20080213771A1 (en) 2008-09-04
WO2008058282A3 (en) 2008-10-30
US20090111705A1 (en) 2009-04-30
US20090111706A1 (en) 2009-04-30
WO2008070375A3 (en) 2008-10-02
WO2008070375A2 (en) 2008-06-12
US9334490B2 (en) 2016-05-10
WO2008058282A2 (en) 2008-05-15

Similar Documents

Publication Publication Date Title
US20090075343A1 (en) Selection of dna adaptor orientation by nicking
US7897344B2 (en) Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
US7901890B2 (en) Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
US10662473B2 (en) Methods and compositions for efficient base calling in sequencing reactions
US8298768B2 (en) Efficient shotgun sequencing methods
US11155813B2 (en) Semi-random barcodes for nucleic acid analysis
US11705217B2 (en) Sequencing using concatemers of copies of sense and antisense strands
US8999677B1 (en) Method for differentiation of polynucleotide strands
AU2010330936B2 (en) Restriction enzyme based whole genome sequencing
US20110189679A1 (en) Compositions and methods for whole transcriptome analysis
US11168367B2 (en) Flexible and high-throughput sequencing of targeted genomic regions
AU2015202111A1 (en) Compositions and methods for nucleic acid sequencing
Fairchild Definition of the yeast transcriptome using next-generation RNA sequencing
CA2480320A1 (en) Analysis of mixtures of nucleic acid fragments and of gene expression

Legal Events

Date Code Title Description
AS Assignment
Owner name: COMPLETE GENOMICS, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DRMANAC, RADOJE;SPARKS, ANDREW;HUANG, STEVEN;AND OTHERS;REEL/FRAME:021779/0091;SIGNING DATES FROM 20081028 TO 20081031

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION