US20020192662A1 - Method for determining the nucleotide sequence of a polynucleotide - Google Patents

Method for determining the nucleotide sequence of a polynucleotide

Info

Publication number
US20020192662A1
Authority
US
United States
Prior art keywords
nucleotide
nucleic acid
oligonucleotide
sequence
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/962,537
Inventor
Michael Boyce-Jacino
Yu-Hui Rogers
Philip Goelet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US09/962,537
Publication of US20020192662A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6874: Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation

Definitions

  • the present invention relates to nucleic acid chemistry, and more specifically to a method for determining the nucleotide sequence of a polynucleotide.
  • the invention further relates to apparatus and kits that embody or employ such a method.
  • Determining the nucleotide sequence of a polynucleotide has substantial utility in medicine, forensics, biomedical research, and in the determination of paternity and identity.
  • Several methods for determining the nucleotide sequence of a polynucleotide have been identified.
  • the sequence of a DNA molecule is obtained through the extension of an oligonucleotide primer which is hybridized to the nucleic acid molecule being sequenced.
  • four separate primer extension reactions are conducted.
  • a DNA polymerase is added along with the four nucleotide triphosphates needed to polymerize DNA.
  • Each of the reactions is carried out in the additional presence of a 2′,3′ dideoxy derivative of the A, T, C, or G nucleoside triphosphates.
  • Such derivatives differ from conventional nucleotide triphosphates in that they lack a hydroxyl residue at the 3′ position of deoxyribose.
  • the absence of the 3′ hydroxyl group causes them to be incapable of forming a phosphodiester bond with a succeeding nucleotide triphosphate.
  • the incorporation of a dideoxy derivative results in the termination of the extension reaction. Since the dideoxy derivatives are present in lower concentrations than their corresponding, conventional nucleotide triphosphate analogs, the net result of each of the four reactions is to produce a set of nested oligonucleotides each of which is terminated by the particular dideoxy derivative used in the reaction.
  • By subjecting the reaction products of each of the extension reactions to electrophoresis, it is possible to obtain a series of four “ladders.” Since the position of each “rung” of the ladder is determined by the size of the molecule, and since such size is determined by the incorporation of the dideoxy derivative, the appearance and location of a particular “rung” can be readily translated into the sequence of the extended primer. Thus, through an electrophoretic analysis, the sequence of the extended primer can be determined.
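    The ladder-reading logic described above can be illustrated with a short Python sketch. It is an idealized model, not part of the patent text: it assumes that every position of the extended strand is terminated in at least some molecules, and it ignores gel resolution limits. The function names `sanger_ladders` and `read_ladders` are hypothetical.

```python
# Idealized model of dideoxy (Sanger) sequencing. All names are illustrative.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_ladders(template_5to3):
    """Return, for each dideoxy base, the lengths of the terminated products.

    The primer is extended 5'->3' while the template is read 3'->5', so the
    newly made strand is the reverse complement of the template.
    """
    new_strand = "".join(COMPLEMENT[b] for b in reversed(template_5to3))
    ladders = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A product of this length ends in the dideoxy form of `base`,
        # so it appears as a rung in that base's lane.
        ladders[base].append(length)
    return ladders


def read_ladders(ladders):
    """Read the four lanes from the shortest rung to the longest."""
    by_length = {n: base for base, lengths in ladders.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


lanes = sanger_ladders("GATTC")
extended = read_ladders(lanes)
```

    Reading the rungs in order of increasing length recovers the sequence of the extended primer, which is the reverse complement of the template.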
  • One deficiency of the dideoxy-mediated sequencing method is the need to optimize the ratio of dideoxy nucleoside triphosphates to conventional nucleoside triphosphates in the chain-extension/chain-termination reactions. Such adjustments are needed in order to maximize the amount of information which can be obtained from each primer. Additionally, the efficiency of dideoxy nucleotide incorporation in a particular target molecule is partially dependent upon the primary and secondary structures of the target.
  • the dideoxy-mediated method thus requires single-stranded templates, specific oligonucleotide primers, and high quality preparations of a DNA polymerase (typically the Klenow fragment of E. coli DNA polymerase I). Initially, these requirements delayed the widespread use of the method. However, with the ready availability of synthetic primers, and the availability of bacteriophage M13 and phagemid vectors (Maniatis, T., et al., Molecular Cloning, a Laboratory Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), herein incorporated by reference), the dideoxy-mediated chain termination method is now extensively employed.
  • the Maxam-Gilbert method of DNA sequencing is a degradative method.
  • a fragment of DNA is labeled at one end and partially cleaved in four separate chemical reactions, each of which is specific for cleaving the DNA molecule at a particular base (G or C) or at a particular type of base (A/G, C/T, or A>C).
  • the effect of such reactions is to create a set of nested molecules whose lengths are determined by the locations of a particular base along the length of the DNA molecule being sequenced.
  • the nested reaction products are then resolved by electrophoresis, and the end-labeled molecules are detected, typically by autoradiography when a ³²P label is employed. Four single lanes are typically required in order to determine the sequence.
  • the Maxam-Gilbert method thus uses simple chemical reagents which are readily available. Nevertheless, the dideoxy-mediated method has several advantages over the Maxam-Gilbert method.
  • the Maxam-Gilbert method is extremely laborious and requires meticulous experimental technique. In contrast, the Sanger method may be employed on larger nucleic acid molecules.
  • the sequence is obtained from the original DNA molecule, and not from an enzymatic copy.
  • the method can be used to sequence synthetic oligonucleotides, and to analyze DNA modifications such as methylation, etc. It can also be used to study both DNA secondary structure and protein-DNA interactions. Indeed, it has been readily employed in the identification of the binding sites of DNA binding proteins.
  • Both the dideoxy-mediated method and the Maxam-Gilbert method of DNA sequencing require the prior isolation of the DNA molecule which is to be sequenced.
  • the sequence information is obtained by subjecting the reaction products to electrophoretic analysis (typically using polyacrylamide gels).
  • a sample is applied to a lane of a gel, and the various species of nested fragments are separated from one another by their migration velocity through the gel.
  • the number of nested fragments which can be separated in a single lane is approximately 200-300 regardless of whether the Sanger or the Maxam-Gilbert method is used.
  • Those of great skill in the art can separate up to 600 fragments in a single lane.
  • the sequence of the entire molecule is obtained by orienting and ordering the sequence data obtained from each fragment.
  • the target DNA molecule may be ordered by restriction mapping using the methods described above, and the discrete restriction fragments sequenced.
  • the target molecule may be sequenced by sequencing nested sets of deletions which begin at one of its ends. The use of such nested fragments progressively brings more and more remote regions of the target DNA into range for sequencing.
  • sequence information obtained from a particular target molecule can be used to prepare a primer which can then be used in a subsequent sequencing reaction in order to obtain additional sequence information.
  • a directed sequence analysis of a target DNA molecule often requires substantial a priori information regarding the sequence.
  • directional sequencing is quite arduous.
  • microsequencing methods determine the identity of only a single nucleotide at a “predetermined” site. Such methods have particular utility in determining the presence and identity of polymorphisms in a target polynucleotide.
  • the GBA™ (Genetic Bit Analysis) method disclosed by Goelet, P. et al. is a particularly useful microsequencing method.
  • the nucleotide sequence information surrounding a predetermined site of interrogation is used to design an oligonucleotide primer that is complementary to the region immediately adjacent to, but not including, the predetermined site.
  • the target DNA template is selected from the biological sample and hybridized to the interrogating primer.
  • This primer is extended by a single labeled dideoxynucleotide using DNA polymerase in the presence of at least two, and most preferably all four chain terminating nucleoside triphosphate precursors.
  • Such deoxynucleotide misincorporation events may be due to the Km of the DNA polymerase for the mispaired deoxy-substrate being comparable, in some sequence contexts, to the relatively poor Km of even a correctly base paired dideoxy-substrate (Kornberg, A., et al., In: DNA Replication, Second Edition (1992), W. H. Freeman and Company, New York; Tabor, S. et al., Proc. Natl. Acad. Sci. (U.S.A.) 86:4076-4080 (1989)). This effect would contribute to the background noise in the polymorphic site interrogation.
  • Mundy, C. R. discusses alternative microsequencing methods for determining the identity of the nucleotide present at a particular polymorphic site.
  • Mundy's methods employ a specialized exonuclease-resistant nucleotide derivative.
  • a primer complementary to the allelic sequence immediately 3′ to the polymorphic site is permitted to hybridize to a target molecule obtained from a particular animal or human. If the polymorphic site on the target molecule contains a nucleotide that is complementary to the particular exonuclease-resistant nucleotide derivative present, then that derivative will be incorporated by a polymerase onto the end of the hybridized primer.
  • Such incorporation renders the primer resistant to exonuclease, and thereby permits its detection. Since the identity of the exonuclease-resistant derivative of the sample is known, a finding that the primer has become resistant to exonucleases reveals that the nucleotide present in the polymorphic site of the target molecule was complementary to that of the nucleotide derivative used in the reaction.
  • the Mundy method has the advantage that it does not require the determination of large amounts of extraneous sequence data. It has the disadvantages of destroying the amplified target sequences and unmodified primer, and of being extremely sensitive to the rate of polymerase incorporation of the specific exonuclease-resistant nucleotide being used.
  • the target DNA template is usually prepared by a DNA amplification reaction, such as the PCR, that uses a high concentration of deoxynucleoside triphosphates, the natural substrates of DNA polymerases. These monomers will compete in the subsequent extension reaction with the dideoxynucleoside triphosphates. Therefore, following the PCR, an additional purification step is required to separate the DNA template from the unincorporated dNTPs. Because it is a solution-based method, the unincorporated dNTPs are difficult to remove and the method is not suited for high volume testing.
  • Macevicz (U.S. Pat. No. 5,002,867) describes a method for determining nucleic acid sequence via hybridization with multiple mixtures of oligonucleotide probes.
  • the sequence of a target polynucleotide is determined by permitting the target to sequentially hybridize with sets of probes having an invariant nucleotide at one position, and variant nucleotides at other positions.
  • the Macevicz method determines the nucleotide sequence of the target by hybridizing the target with a set of probes, and then determining the number of sites that at least one member of the set is capable of hybridizing to the target (i.e. the number of “matches”). This procedure is repeated until each member of a set of probes has been tested.
  • each lane of a sequencing gel can resolve only about 300 different fragments.
  • multiple sequencing gels are often needed. This, in turn, limits the amount of new sequence information which can be readily obtained per day.
  • a substantial number of technically demanding and time consuming steps must be performed.
  • the sequencing of large DNA molecules requires the use of multiple sequencing gels each having a large number of lanes.
  • the electrophoretic analysis step in the sequencing process thus comprises a significant limitation to the amount of sequence information which can be obtained and the rate with which it can be processed.
  • the invention provides a solid phase sequencing method for determining the sequence of nucleic acid molecules (either DNA or RNA).
  • the invention provides a method for determining the nucleotide sequence of a nucleic acid molecule which comprises the steps of:
  • step (b) permitting the removal of any non-hybridized nucleotide residues from the 3′ terminus of the hybridized oligonucleotides, so as to form a truncated primer oligonucleotide whose 3′ terminus is hybridized to the nucleic acid molecule, and then permitting polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of the hybridized truncated oligonucleotide, wherein for each hybridized truncated oligonucleotide being so extended, the incorporated nucleotide residue is complementary to the nucleotide residue immediately 5′ to the nucleotide residue of the nucleic acid molecule that is hybridized with that truncated oligonucleotide's 3′ terminal nucleotide residue; then performing step (D);
  • each array position contains a primer oligonucleotide that is capable of hybridizing to a region of the nucleic acid molecule, and/or wherein in step (C), at least some array positions contain nucleic acid molecules hybridized to oligonucleotides whose 3′ terminal nucleotide is not hybridized to the nucleic acid molecule, and wherein step (C)(1) is conducted for such oligonucleotides.
  • Either a Thermosequenase class polymerase or a Klenow class polymerase may be employed in the method.
  • the invention particularly includes the embodiments in which the array is a random oligonucleotide array, and in which the array is a nested oligonucleotide array (especially one containing oligonucleotide members having all possible permutations of nucleotides over a region of from 1 to 20 bases).
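    To give a sense of scale for an array holding every possible sequence over a region of a given length, the following Python sketch enumerates all oligonucleotides of length n. It is an illustration only; the helper name `all_oligos` is hypothetical, and the text does not say how such an array would be designed in practice.

```python
from itertools import product


def all_oligos(n):
    """Every DNA sequence of length n over the alphabet A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=n)]


trimers = all_oligos(3)           # 4**3 sequences, from "AAA" to "TTT"
count_for_20_bases = 4 ** 20      # size of a complete 20-base set
```

    The count grows as 4 to the power n, so a complete set is practical to enumerate only for short regions; at 20 bases it already exceeds 10^12 members.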
  • the invention is particularly adapted for conducting the method in the presence of at least four chain terminator nucleotide species, at least one of which is labeled, and more preferably wherein all of the chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present.
  • the invention particularly provides a method of sequence determination for genomic DNA of a human or non-human mammal, and is especially adapted for use in determining the sequence of DNA suspected to contain a genetic variation associated with a disease (e.g., cancer or cystic fibrosis), and in which the method is employed to determine whether the DNA contains the variation.
  • the oligonucleotides are immobilized onto the solid support, such as plastic or glass.
  • the invention also provides a kit for determining the sequence of a nucleic acid molecule which comprises a solid support containing an array of spaced apart receptacles for oligonucleotides, each receptacle containing a different primer oligonucleotide.
  • the kit may additionally contain at least four chain terminator nucleotide species, at least one of which is labeled.
  • a highly preferred kit contains at least four chain terminator nucleotide species, wherein all of the chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present.
  • the kit is particularly suited for determining the nucleotide sequence of DNA suspected to contain a genetic variation associated with a disease, and to provide a determination of nucleotide sequence sufficient to determine whether the DNA contains the variation.
  • FIG. 1 shows the result of a nested GBA™ (N-GBA™) experiment.
  • the present invention provides a method of sequencing which provides the advantages of micro- and nano-sequencing and has the ability to sequence polynucleotide regions.
  • the method employs ordered arrays of linear primers that are capable of hybridizing to a target molecule and reporting the identity of the single nucleotide that is present in the hybridized molecule immediately 5′ to the 3′ terminus of the primer.
  • the invention permits one to ascertain the complete nucleotide sequence of a target polynucleotide.
  • the method of sequence analysis the nature of the primer array.
  • the most preferred method of the present invention employs a modification of the GBA™ method of analyzing a predetermined site as the means for accomplishing sequence analysis.
  • the GBA™ method can be conducted in a variety of ways. In particular, such interrogation can be accomplished via a polymerase-mediated analysis or by a ligase-mediated analysis.
  • oligonucleotide having a defined sequence is bound to a solid support, especially a microtiter dish.
  • a sample, suspected to contain the target molecule, or an amplification product thereof, is placed in contact with the support, and any target molecules present are permitted to hybridize to the bound oligonucleotide.
  • an oligonucleotide having a sequence that is complementary to an immediately distal sequence of a polymorphism is prepared using the above-described methods (and preferably that of Nikiforov, T. (U.S. patent application Ser. No. 08/005,061, herein incorporated by reference)).
  • the terminus of the oligonucleotide is attached to the solid support, as described, for example by Goelet, P. et al. (PCT Application WO 92/15712), such that the 3′-end of the oligonucleotide can serve as a substrate for primer extension.
  • the immobilized primer is then incubated in the presence of a DNA molecule (preferably a genomic DNA molecule) having a single nucleotide polymorphism whose immediately 3′-distal sequence is complementary to that of the immobilized primer.
  • dNTP (i.e., dATP, dCTP, dGTP, or dTTP)
  • chain terminating nucleotide triphosphate derivatives (such as a dideoxy derivative)
  • Where the polymorphic site is such that only two or three alleles exist (such that only two or three species of dNTPs, respectively, could be incorporated into the primer extension product), the presence of unusable nucleotide triphosphate(s) in the reaction is immaterial.
  • a single dideoxynucleotide is added to the 3′-terminus of the primer.
  • the identity of that added nucleotide is determined by, and is complementary to, the nucleotide of the polymorphic site of the polymorphism.
  • the nucleotide of the polymorphic site is thus determined by assaying which of the set of labeled nucleotides has been incorporated onto the 3′-terminus of the bound oligonucleotide by a primer-dependent polymerase. Most preferably, where multiple dideoxynucleotide derivatives are simultaneously employed, different labels will be used to permit the differential determination of the identity of the incorporated dideoxynucleotide derivative.
  • the identity of the nucleotide of the polymorphic site is determined using a polymerase/ligase-mediated process.
  • an oligonucleotide primer is employed, that is complementary to the immediately 3′-distal invariant sequence of the polynucleotide being analyzed.
  • a second oligonucleotide is tethered to the solid phase via its 3′-end. The sequence of this oligonucleotide is complementary to the 5′-proximal sequence of the predetermined site being analyzed, but is incapable of hybridizing to the oligonucleotide primer.
  • the oligonucleotides are incubated in the presence of DNA containing the single nucleotide polymorphism that is to be analyzed, and at least one 2′-deoxynucleotide 5′-triphosphate.
  • the incubation reaction further includes a DNA polymerase and a DNA ligase.
  • the tethered and soluble oligonucleotides are thus capable of hybridizing to the same strand of the single nucleotide polymorphism under analysis.
  • the sequence considerations cause the two oligonucleotides to hybridize to the proximal and distal sequences of the polynucleotide site that flank the predetermined site; the hybridized oligonucleotides are thus separated by a “gap” of a single nucleotide at the precise position of the predetermined site.
  • a deoxynucleotide triphosphate that is complementary to the nucleotide of the polymorphic site permits the creation of a ligatable substrate.
  • the ligation reaction immobilizes the deoxynucleotide and the previously soluble primer oligonucleotide to the solid support.
  • the identity of the polymorphic site that was opposite the “gap” can then be determined by any of several means.
  • the deoxynucleotide of the reaction is labeled, and its detection thus reveals the identity of the complementary nucleotide of the predetermined site.
  • Several different deoxynucleotides may (and preferably will) be present, each differentially labeled.
  • separate reactions can be conducted, each with a different deoxynucleotide.
  • the deoxynucleotides are unlabeled and a labeled dideoxynucleotide is employed, and the second, soluble oligonucleotide is labeled.
  • reaction that contains the complementary nucleotide permits the ligatable substrate to form, and is detected by detecting the immobilization of the previously soluble oligonucleotide.
  • the sensitivity of nucleic acid hybridization detection assays may be increased by altering the manner in which detection is reported or signaled to the observer.
  • assay sensitivity can be increased through the use of detectably labeled reagents.
  • Kourilsky et al. (U.S. Pat. No. 4,581,333) describe the use of enzyme labels to increase sensitivity in a detection assay. Fluorescent labels (Prober, J. et al. Science 238:336-340 (1987); Albarella et al., EP 144914), chemical labels (Sheldon III et al., U.S. Pat. No.
  • it is preferable to employ fluorescent, and more preferably chromogenic (especially enzyme) labels, such that the identity of the incorporated nucleotide can be determined in an automated, or semi-automated manner using a spectrophotometer.
  • GBA™ was developed as a solid-phase single nucleotide polymorphism genotyping method based on single-base extension of an interrogation primer across a target base of interest.
  • a solid-phase array can be manufactured in a standardized way with quality control, thereby ensuring that variation in performance of the test is more a factor of input DNA quality and less of operator expertise.
  • the present invention extends this method to an N-GBA™ format, with the complementary interrogation primers nested at one (or more) base intervals across the target sequence, and thus enables detailed sequence analysis of a complex target DNA sequence.
  • the N-GBA™ method of the present invention is ideally suited to analysis of intermediate length (10-100 base) DNA target sequences.
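    The nesting of interrogation primers at one-base intervals can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the target is given 5′→3′, each primer is the exact reverse complement of the target region it hybridizes to, and the function names `reverse_complement` and `nested_primers` as well as the `length` and `step` parameters are hypothetical rather than taken from the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def nested_primers(target, length, step=1):
    """Tile primers of a fixed length across a target written 5'->3'.

    Each entry is (interrogated_index, primer). The primer hybridizes to
    target[start:start + length], and its 3' terminus pairs with
    target[start], so a single-base extension reports target[start - 1].
    """
    primers = []
    for start in range(1, len(target) - length + 1, step):
        site = target[start:start + length]
        primers.append((start - 1, reverse_complement(site)))
    return primers


tiling = nested_primers("ACGTTGCA", length=5)
```

    With a step of one, successive primers interrogate successive target positions, which is what lets the array report a contiguous stretch of sequence rather than a single predetermined site.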
  • Application of the N-GBA™ method in a Sequence Confirmation/composition ANalysis (SCAN™)-chip prototype (a miniaturized array of interrogation primers on a glass slide) is the most preferred embodiment of the method, and permits standardized (through manufacture of the oligonucleotide interrogation primer arrays), lower cost (through miniaturization of the test) and accurate (through use of the GBA™ biochemistry) scanning for p53 mutations.
  • the solid-phase format of the present invention also provides advantages in processing, since reagents can be added by hand at small scales, or by robots on a larger scale, without changes to the test.
  • the size of the arrays can be controlled as well, so that the advantages of miniaturization can be realized: thus a 30 μl PCR reaction can be hybridized simultaneously to hundreds or thousands of oligonucleotides in an array only a few millimeters in diameter. In this way, processing can be performed at a “macro” scale, using standard pipetters, and information extracted at a “micro” or “nano” scale using fluorescent imaging.
  • the methods of the present invention exploit the use of primer extension biochemistry for nucleotide-by-nucleotide analysis and its application to a solid-phase oligonucleotide array format.
  • the addition of primer extension to solid-phase analysis adds significant increases in test accuracy and differential sensitivity over hybridization-based approaches while exploiting the advantages of solid-phase-based testing over gel-based tests.
  • This strategy of nesting the GBA™ across a region of interest eliminates any need to “expect” (i.e., guess in advance) a particular mutation. Nesting eliminates any need to limit analysis to a specific nucleotide.
  • Current GBA™ detection technology is a “two-result” system (distinguishing wild-type from mutant).
  • An additional innovation of the preferred embodiments of the present invention involves the use of a “four-result” system, which, by parallel detection of all four possible DNA bases for each site in the sequence, provides enhanced accuracy. With this innovation, the change of any nucleotide in the target region to any other possible base will be detectable in a base-specific fashion, thus any mutation in a proposed target hot-spot will be identifiable, including novel mutations.
  • this is accomplished by separating the arrays into four identical array spots to which PCR or other amplified product can hybridize equally.
  • the GBA™ extension is thus preferably divided into four reaction mixes, each containing a different haptenated dideoxynucleotide triphosphate (ddNTP).
  • the four spots represent the four possible bases: G, A, T and C, and incorporation of each possible base can be evaluated for each oligonucleotide in the array and from this the sequence composition of the target fragment deduced.
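    The “four-result” read-out described above can be sketched in Python. The sketch assumes that each nested primer yields one signal intensity per base (one from each of the four spots), that primers are ordered in the direction of extension, and that a base is called only when its signal clearly exceeds the runner-up. The `min_ratio` threshold and both function names are hypothetical; the text does not specify a calling rule.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def call_bases(signals, min_ratio=2.0):
    """Call the incorporated base for each primer from its four spot signals.

    `signals` holds one dict per nested primer, mapping G, A, T and C to an
    intensity. Ambiguous positions are reported as "N".
    """
    calls = []
    for spot in signals:
        ranked = sorted(spot.items(), key=lambda item: item[1], reverse=True)
        (best, top), (_, second) = ranked[0], ranked[1]
        calls.append(best if top > 0 and top >= min_ratio * second else "N")
    return "".join(calls)


def target_from_calls(calls):
    """Incorporated bases are complementary to the target, read antiparallel."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(calls))


signals = [
    {"G": 0.90, "A": 0.10, "T": 0.05, "C": 0.10},
    {"G": 0.10, "A": 0.80, "T": 0.10, "C": 0.05},
    {"G": 0.40, "A": 0.50, "T": 0.00, "C": 0.00},
]
incorporated = call_bases(signals)
target = target_from_calls(incorporated)
```

    Because all four bases are scored at every position, a change to any other base shows up as a different call rather than merely as the loss of the expected signal.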
  • the SCAN™-chip format, utilizing N-GBA™ biochemistry, will thus enable: highly accurate mutation detection due to the sensitivity of primer extension to hybridization mismatch at the 3′ (extended) end of the interrogation primer; increased informativeness since the mutation is detected in a highly localized fashion; relatively standardizable and simple testing due to the SCAN™ format; and cost-effectiveness due to miniaturization of the arrays.
  • the target polynucleotide (i.e., the nucleic acid molecule that is to be sequenced)
  • each array position of a spatially separated array of oligonucleotide primers in single-stranded form under conditions sufficient to permit hybridization to occur.
  • an array of oligonucleotides is said to be “spatially separated” if an oligonucleotide of one sequence is separated from an oligonucleotide of another sequence.
  • each oligonucleotide species of the array is provided to a separate microtiter well.
  • each oligonucleotide species of the array is provided to a distinct region of a surface, such as a glass slide, etc.
  • the term “array” is intended to define a two dimensional or three dimensional matrix having a definition of X,Y or X,Y,Z, such that, for example, at array position 1,1 a particular oligonucleotide is found; an oligonucleotide of different sequence is found at array position 1,2 or 2,1, etc.
  • the oligonucleotide found at each array position is defined and known in advance of any reaction.
  • each oligonucleotide of each array position is selected such that it will be shorter in length than the target polynucleotide being sequenced. Most preferably, such oligonucleotides will be less than 30 bases in length, and most preferably less than 10 bases. Oligonucleotides of 5 bases in length are preferred.
  • When an oligonucleotide of N residues hybridizes to the target polynucleotide, its 3′ terminus (residue N) will hybridize to a nucleotide of the target polynucleotide, and can be extended via a template-dependent polymerization reaction to incorporate an “interrogation nucleotide” as residue N+1 of that oligonucleotide.
  • the identity of the “interrogation nucleotide” is dependent upon (and is complementary to) the nucleotide species of the target polynucleotide that is present immediately 5′ adjacent to the nucleotide that hybridizes to the 3′ terminus of the oligonucleotide, prior to the polymerization reaction.
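    The rule that fixes the identity of the interrogation nucleotide can be written out as a small Python sketch. It assumes perfect complementarity between the oligonucleotide and a single site on the target, with both sequences written 5′→3′; the function name `interrogation_nucleotide` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def interrogation_nucleotide(target, primer):
    """Return the base added as residue N+1 of a hybridized primer.

    The primer pairs antiparallel with the target, so its 3' terminus sits
    opposite the 5'-most base of the hybridized region. The next template
    base lies immediately 5' of that position on the target.
    """
    site = reverse_complement(primer)
    start = target.find(site)
    if start < 0:
        raise ValueError("primer does not hybridize to the target")
    if start == 0:
        raise ValueError("no template base 5' of the primer's 3' terminus")
    return COMPLEMENT[target[start - 1]]


added = interrogation_nucleotide("ACGTTGCA", "TGCAA")
```

    Here the primer hybridizes to TTGCA, the template base immediately 5′ of that region is G, and the incorporated chain terminator is therefore C.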
  • Each array position additionally contains more than one different nucleotide species, such that nucleotide species are present that are complementary to at least two, and in the most preferred embodiment, all four of the nucleotide species of DNA (i.e., adenosine, cytosine, thymidine and guanosine, designated A, C, T and G, respectively).
  • the nucleotide species present are “chain terminator” nucleotides.
  • Although such nucleotide species can be incorporated onto the 3′ terminus of an oligonucleotide by a DNA polymerase, the resultant extended oligonucleotide cannot be further extended by a polymerase, even in the presence of non-terminator nucleotides.
  • the most preferred chain terminator nucleotide species of the present invention are 2′,3′-dideoxynucleoside 5′-triphosphates.
  • the chain terminator nucleotide species are detectably labeled, such that an extension reaction that results in the incorporation of a nucleotide complementary to one of the nucleotide species of DNA can be distinguished from an extension reaction that results in the incorporation of a nucleotide complementary to a different nucleotide species of DNA.
  • Any of the conventionally used radioisotopic, enzymatic, fluorescent or chemiluminescent labels may be used in accordance with the methods of the present invention.
  • haptenic labels such as biotin or other labels such as ligands, antigens, etc. may be used. Suitable labels are disclosed, for example, by Kourilsky et al. (U.S. Pat. No.
  • a polymerase, and suitable salts and buffers are also provided to each array position.
  • the reaction conditions are maintained such that the oligonucleotides stably and specifically hybridize to the target polynucleotide, and so that the 3′-termini of the oligonucleotides are extended by addition of a single chain terminator nucleotide (i.e., the interrogation nucleotide).
  • stable hybridization refers to a hybridization that has a Tm greater than the temperature under which the interrogation assay is to be run (generally 20-40° C.).
  • specific hybridization denotes that the length and/or sequence complexity of the oligonucleotides involved in the hybridization are sufficient to preclude non-desired spurious hybridization (as might occur, for example, between sequences that are only partially complementary).
  • the hybridization is usually carried out for 15 to 30 minutes at room temperature in a solution containing 1.5 M NaCl and 10 mM EDTA. Other hybridization conditions can alternatively be used.
  • the sequence of the immobilized oligonucleotide is selected such that it will hybridize to the invariant sequence that flanks the polymorphic site of the polymorphism that is to be interrogated.
  • ligase/polymerase mediated GBA™ interrogation method is to be employed, the methods of Nikiforov et al. (U.S. patent application Ser. No. 08/192,631, herein incorporated by reference) are preferably employed.
  • the oligonucleotides present at each array position are immobilized to the solid surface of the array support.
  • a support may be a microtiter dish, test tube array, etched glass surface, etc.
  • the nature of the oligonucleotide array may vary depending upon the amount of prior sequence information available concerning the target molecule.
  • the array is “non-random.”
  • a “non-random” oligonucleotide array is a set of oligonucleotides whose members do not contain all possible permutations of nucleotides.
  • a non-random array is preferably employed when determining the nucleotide sequence of a polynucleotide for which some a priori sequence information is available.
  • non-random arrays would be employed in sequencing those genes of a patient for which the sequence of “normal” alleles had been previously determined.
  • a “random” array of oligonucleotides is a set of oligonucleotides whose members do contain all possible permutations of nucleotides.
  • a random array is preferably employed when determining the nucleotide sequence of a polynucleotide for which little or no a priori sequence information is available.
  • Primer design is preferably facilitated through the use of the GBA™ Primer 1.0 program (Molecular Tool, Inc.). Primer stability (measured in kcal/mol) and potential sequence-based sources of noise are evaluated by this program.
  • a number of sequence-based features can lead to GBA™ noise for a particular target site.
  • the most common source of noise is template-independent noise (TIN) and results from self-priming by the GBA™ primer.
  • GBA™ primers may be modified by a base substitution with C3 linker or by shortening the primer at the 5′ end without sacrificing hybridization stability of the template strand.
  • N-GBA™ primers which complement the target sequence and are staggered by one base will be designed according to the standard GBA™ primer design strategies described above.
  • An example of N-GBA™ primer design was shown in a model study described in the relevant experience section.
  • the oligonucleotide array can comprise a set of non-random nested oligonucleotides.
  • the nested primer array will contain all possible divergent sequences over the region whose sequence is to be determined.
  • the maximum number of primers needed to determine the sequence of N nucleotides is given by the equation: $\sum_{n=1}^{N} 4^{\,n-1}$
  • the maximum number of sequences needed to obtain the sequence of even a relatively small region rises rapidly when non-random arrays are employed, the method is not preferred when more extensive sequencing is desired. For example, a maximum of 349,525 primers would be needed to obtain 10 nucleotides of sequence information by this method. Hence, for obtaining such (or even more extensive) sequence information, the random array method described below is preferably employed.
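The primer count quoted above can be checked numerically. The sketch below assumes the sum is read as 4^0 + 4^1 + ... + 4^(N-1), which reproduces the stated figure of 349,525 primers for N = 10; the function name is illustrative only.

```python
def max_nested_primers(n):
    """Maximum number of primers for n positions: sum of 4**(i-1), i = 1..n."""
    return sum(4 ** (i - 1) for i in range(1, n + 1))

def max_nested_primers_closed_form(n):
    """Closed form of the same geometric series: (4**n - 1) / 3."""
    return (4 ** n - 1) // 3

# The count grows roughly fourfold with each additional position.
counts = {n: max_nested_primers(n) for n in (1, 2, 5, 10)}
```

For ten positions both forms give 349,525, which is why the non-random nested approach is reserved for short regions.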
  • each oligonucleotide of a set as a GBA™ primer in a GBA™ reaction (either in the presence of three unlabeled terminator nucleotides and one labeled chain terminator nucleotide or in the presence of four differentially labeled chain terminator nucleotides), it is possible to determine the nucleotide sequence of the particular nucleic acid molecule of a sample that is complementary to the set of primers.
  • the GBA™ reaction exploits the ability of the 3′ terminus of the GBA™ primer to hybridize to the target molecule being interrogated.
  • This characteristic of the present invention permits sequence determinations with far fewer primers, depending upon the class of polymerase being employed in the GBA™ reaction.
  • One class, typified by the Klenow fragment of E. coli DNA polymerase I (Klenow class), possesses 3′ to 5′ exonuclease activity, and is able to correct 3′ base mismatches in the extended primer.
  • A second class, typified by the thermostable polymerase Thermosequenase (USB) (Thermosequenase class), does not possess 3′ to 5′ exonuclease activity, and is thus unable to correct 3′ base mismatches in the extended primer.
  • Polymerases of either class can be employed in accordance with the present invention. The characteristics of polymerases are shown in Table 2.
  • Thermosequenase class polymerases do not possess 3′ to 5′ exonuclease activity, unless a priori sequence information is available, it is preferable to employ each oligonucleotide in a nested set of all possible permutations. Nevertheless, in many circumstances incomplete sets of oligonucleotides may be employed in concert with Thermosequenase class polymerases.
  • SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:22 were employed to sequence a target having a sequence other than GTTT at positions 25-28, one or more of the oligonucleotides would fail to hybridize its 3′ terminus to the target, and minimal nucleotide incorporation would result.
  • GBATM reactions on a particular target molecule led to the incorporation of a label for SEQ ID NO:1 and SEQ ID NO:2, but not for SEQ ID NO:6 or SEQ ID NO:22 would indicate that the nucleotides at positions 27-28 were not Ts.
  • such an observation of impaired incorporation is a useful indication that the sequence of the target molecule differs from that of the reference allele. As such, this embodiment is useful in identity and paternity analysis, and in genetic screening.
  • Since Klenow class polymerases can correct mismatches as well as extend primers, when such polymerases are employed in the GBA™ reaction, incorporation of label may reflect primer repair as well as primer extension.
  • The use of Klenow class polymerases in the present invention has a salient advantage. Instead of needing to provide all permutations of the sequence to be determined, one need provide only one oligonucleotide for each position to be determined. Thus, to determine the sequence of positions 26-29 in the example shown above, one would need to provide at most 4 oligonucleotides (i.e.
  • an oligonucleotide such as SEQ ID NO:1 ending at position 26
  • an oligonucleotide such as SEQ ID NO:2 ending at position 27
  • an oligonucleotide such as SEQ ID NO:6 ending at position 28
  • an oligonucleotide such as SEQ ID NO:22 ending at position 29.
  • when Klenow class polymerases are employed, two possibilities exist with respect to such an array: a particular nucleotide may become labeled by extension, or it may become labeled by primer mismatch repair. In general, only a single unambiguous sequence will be obtained.
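The difference between the two polymerase classes can be illustrated with a deliberately simplified model. It assumes that only 3′-terminal mismatches matter, that a Klenow class enzyme trims every mismatched 3′ base before adding one terminator, and that a Thermosequenase class enzyme gives no signal when the 3′ terminus is unpaired. The function name and the toy sequences are hypothetical.

```python
def gba_result(primer, sense, polymerase):
    """Predict the labeled base for one primer under a simplified model.

    primer     -- primer sequence, 5'->3'
    sense      -- strand complementary to the target, 5'->3', aligned so that
                  primer[0] lies opposite sense[0]; must be longer than primer
    polymerase -- "klenow" (3'->5' exonuclease) or "thermosequenase" (none)
    Returns (base or None, description).
    """
    matched = len(primer)
    # Walk back from the 3' terminus over bases that are not base-paired.
    while matched > 0 and primer[matched - 1] != sense[matched - 1]:
        matched -= 1
    if matched == len(primer):
        return sense[len(primer)], "extension"
    if polymerase == "klenow":
        # Mismatched 3' bases are removed; the terminator fills the first gap.
        return sense[matched], "repair"
    return None, "no signal"

# Toy example: the primer's 3'-terminal C is mismatched against G.
repaired = gba_result("ACGTC", "ACGTGA", "klenow")
blocked = gba_result("ACGTC", "ACGTGA", "thermosequenase")
extended = gba_result("ACGTG", "ACGTGA", "thermosequenase")
```

Under this model a label from a Klenow class reaction reports either the next position (extension) or an earlier position (repair), which is why the known primer sequences must be consulted when assigning positions.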
  • Table 3 gives the results that would be obtained from the use of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:22 to evaluate a particular target molecule having the sequence CATGCG at nucleotide positions 25-30. TABLE 3 Position No. SEQ ID Nucleotide Sequence of Nucleotide NO.
  • the incorporation of G reflects the removal of the 3′ terminal C residue, and the incorporation of a G (as the nucleotide complementary to the C at position 25 in the target).
  • the incorporation of G reflects the removal of the 3′ terminal A and C residues, and the incorporation of a G (as the nucleotide complementary to the C at position 25 in the target).
  • the incorporation of T reflects the hybridization of the 3′ terminus of the primer to the target, and the extension of the primer by one nucleotide (C, the nucleotide complementary to the C at position 25 in the target.
  • the incorporation of C reflects the removal of the 3′ terminal A and A residues, and the incorporation of a C (as the nucleotide complementary to the G at position 28 in the target).
  • sequence assignments flow from the known rules of base pairing.
  • the incorporation of G, G, C and C could not mean that positions 26-29 of the target were CCGG, because such a nucleotide sequence is incompatible with the (known) sequence of the 3′ terminus of SEQ ID NO:22.
  • consideration of the known sequences of the oligonucleotides reveals the nucleotide position being reported by a particular nucleotide of the array.
  • such consideration is facilitated by performing separate sequence determinations with both a nucleotide array and its complement (such that the sequences of both strands of a target molecule are obtained).
  • Table 4 illustrates typical oligonucleotide arrays by displaying sets of oligonucleotides sufficient to permit sequence analysis of exon 23 of the human BRCA1 gene (E23) and cystic fibrosis (CF) (at the locus of nucleotide 549) genes with a Klenow class polymerase.
  • Thermosequenase class polymerases are the preferred polymerases of the present invention. Since such polymerases do not repair mismatches, they are preferably used in embodiments in which oligonucleotides having all possible permutations of 3′ sequence are provided, or more preferably, in embodiments in which two oligonucleotide arrays are employed (one complementary to one strand, and the other complementary to the second strand).
  • the non-random nested array method described above is predicated on providing the target molecule with hybridization oligonucleotides that possess the exact sequence of the target
  • the random nested array method is predicated on deriving sequence information from the pattern of oligonucleotides of the array that are extended in the GBA™ reaction as well as from the identity of the nucleotide added to each extended oligonucleotide.
  • an array of oligonucleotide primers is employed.
  • the lengths of the primers are most preferably uniform, and can vary from 6-20 nucleotides in length.
  • For an array of N nucleotides there are $4^{N}$ possible sequence permutations.
  • each oligonucleotide can (if hybridized to target in a GBA™ reaction) be extended by one nucleotide, the use of an array of random primers of N nucleotides in length can generate sequence information for $4^{N+1}$ nucleotides.
  • an array of 4,096 oligonucleotides (comprising a random permutation of all possible 6-mers) could simultaneously sequence 16,384 bases of a target molecule.
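The array sizes quoted above follow directly from enumerating every N-mer. The sketch below builds such an array for 6-mers and assigns each oligonucleotide an x,y position; the square-grid layout and the function name are assumptions made for illustration, not a layout described in the specification.

```python
from itertools import product

def build_random_array(n, bases="ACGT"):
    """Return {oligo: (x, y)} for every possible n-mer on a square grid.

    4**n oligonucleotides fit exactly on a (2**n) x (2**n) grid; the grid
    shape is a hypothetical choice.
    """
    oligos = ["".join(p) for p in product(bases, repeat=n)]
    side = 2 ** n
    return {oligo: (i % side, i // side) for i, oligo in enumerate(oligos)}

array = build_random_array(6)
positions = len(array)        # 4**6 array positions
max_bases = 4 * positions     # each oligo may gain one of four bases: 4**7
```

With N = 6 this gives 4,096 positions and an upper bound of 16,384 reported bases, matching the figures in the text.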
  • the random array method may be illustrated as follows. An array of all possible 6-mers is prepared such that the x,y array location and sequence of each oligonucleotide of the array is known. Each array position is incubated with the same target polynucleotide, and a GBA™ reaction is conducted for each array position. These parallel (or sequential) reactions lead to the formation of a sequence signature consisting of array positions whose oligonucleotides have not been extended, and those whose oligonucleotides have been extended by addition of A, T, C or G.
  • One array position for an extended oligonucleotide is selected at random (although, in a preferred automated mode, multiple positions may be processed in parallel).
  • the sequence of the extended oligonucleotide at the selected array position is determined using the oligonucleotide's initial (predefined) 6-mer sequence and the identity of the labeled nucleotide added to the oligonucleotide's 3′ terminus in the GBA™ reaction. This determination defines a second 6-mer oligonucleotide (consisting of nucleotides 2-7 of the selected oligonucleotide). The array location of this second 6-mer position is identified, and the extension product formed by the oligonucleotide at that array position is determined. Such sequence information defines a third 6-mer oligonucleotide (consisting of nucleotides 2-7 of the second selected oligonucleotide). In like manner, the entire sequence stored in the array can be deduced.
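The walk from one extended 6-mer to the next can be sketched as follows. The sketch assumes an idealized signature in which every extended oligonucleotide gained exactly one unambiguous base and no 6-mer occurs twice in the region; `signature_from_sequence` is a hypothetical helper that fabricates such a signature from a known strand purely so the walk can be exercised.

```python
def signature_from_sequence(seq, k=6):
    """Hypothetical helper: map each k-mer in seq to the base that follows it."""
    return {seq[i:i + k]: seq[i + k] for i in range(len(seq) - k)}

def walk_array(signature, start):
    """Chain extended oligos: append the added base, then look up bases 2..k+1."""
    k = len(start)
    sequence = start
    current = start
    visited = {start}
    while current in signature:
        sequence += signature[current]   # base added at the 3' terminus
        current = sequence[-k:]          # the next k-mer (nucleotides 2..k+1)
        if current in visited:           # guard against cycling on repeats
            break
        visited.add(current)
    return sequence

# Example strand assembled from the nested primers quoted in this document.
strand = "GGTTATATCATTCTTACATAAAGGACACT"
signature = signature_from_sequence(strand)
rebuilt = walk_array(signature, "GGTTAT")
```

Starting at the first 6-mer, the walk rebuilds the full strand. Repeated 6-mers or ambiguous extensions would need extra bookkeeping that this sketch omits.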
  • a salient feature of the use of the GBA™ reaction in accordance with the methods of the present invention is the capacity to miniaturize such methods, resulting in a savings of space, reagents, and time, and providing increased throughput and reliability.
  • a microminiaturized analysis format is employed.
  • a microminiaturized reaction is one conducted in a reaction volume of greater than 50 µl, but less than 200 µl, and most preferably less than 100 µl.
  • Such analysis is most preferably conducted in 96-well microtiter plates, using the indirect fluorescent colorimetry method of Huang, Z. et al. (Anal Biochem 207:32-39 (1992)), and the use of liquid handling robots to deliver reagents.
  • a preferred format involves designing the GBA™ primers so that they are associated with biotinylated spacer arms sufficient to permit them to become bound to a glass or plastic support (such as a glass slide, etc.).
  • This attachment approach has the advantage of high specificity and results in minimal nonspecific backgrounds during attachment and hybridization.
  • a preferred glass slide support for oligonucleotide immobilization has wells of exposed glass surrounded by a hydrophobic Teflon coating (Cel-line Associates, Inc.). The plates have 12 wells (7 mm in diameter), and are designed such that solutions can be dispensed with standard, multichannel pipetting instruments, and signals can be read on existing plate readers. Avidin will be covalently attached onto a glass-slide using our proprietary attachment chemistry.
  • a 50 µl solution of 0.4 µM biotinylated oligonucleotide will then be added to each well, and incubated for 2 hrs, then rinsed with TNTw (10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.05% Tween-20).
  • a nanominiaturized analysis format is employed.
  • a nanominiaturized reaction is one conducted in a reaction volume of less than 50 µl, and most preferably less than 10 µl
  • the support will be an etched glass plate that will hold several hundred to several thousand nanowells (0.1-5 µl volume per well), such that entire arrays can be evaluated simultaneously.
  • the determination of the result of the GBA™ reaction will most preferably be performed via automated processing using, for example, a pixel-by-pixel CCD camera equipped to distinguish the labels of the nucleotides being employed. Detection of the extension may be accomplished using a variety of labels; however, two detection schemes are preferred: (i) direct fluorescence detection on glass, and (ii) enzyme-mediated fluorescence detection.
  • [0108] was interrogated with a set of primers that had been immobilized on to the surface of a 96 well microtiter plate in order to type the central five bases (shown in boldface) of the “wild-type” sequence (AGTGT) and of a single-base “mutant” sequence (AGGGT).
  • the primers used had the following sequences:
    SEQ ID NO:48 (Primer 1) 5′ GGTTATATCATTCTTACATAAAGG 3′
    SEQ ID NO:49 (Primer 2) 5′ GTTATATCATTCTTACATAAAGGA 3′
    SEQ ID NO:50 (Primer 3) 5′ TTATATCATTCTTACATAAAGGAC 3′
    SEQ ID NO:51 (Primer 4) 5′ TATATCATTCTTACATAAAGGACA 3′
    SEQ ID NO:52 (Primer 5) 5′ ATATCATTCTTACATAAAGGACAC 3′
  • the wild-type and mutant templates were hybridized to the immobilized GBA™ primers, and the 3′ ends of the GBA™ primers were extended by a single fluorescently labeled chain terminator ddNTP by either Klenow or Thermosequenase. Enzyme-mediated fluorescence signals were obtained using the Cytofluor II fluorescent plate reader. The results of the experiment are shown in Table 5.
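The five primers quoted above form a nested set staggered by one base, so each primer's expected wild-type extension is simply the 3′-terminal base of the next primer. The short check below illustrates this; the helper names are illustrative, and the expected extension of Primer 5 cannot be derived from the primers alone.

```python
PRIMERS = [
    "GGTTATATCATTCTTACATAAAGG",  # SEQ ID NO:48 (Primer 1)
    "GTTATATCATTCTTACATAAAGGA",  # SEQ ID NO:49 (Primer 2)
    "TTATATCATTCTTACATAAAGGAC",  # SEQ ID NO:50 (Primer 3)
    "TATATCATTCTTACATAAAGGACA",  # SEQ ID NO:51 (Primer 4)
    "ATATCATTCTTACATAAAGGACAC",  # SEQ ID NO:52 (Primer 5)
]
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def is_staggered_by_one(primers):
    """True if each primer equals the previous one shifted 3' by one base."""
    return all(nxt[:-1] == prev[1:] for prev, nxt in zip(primers, primers[1:]))

def implied_extensions(primers):
    """Expected base added to each primer except the last (matched target)."""
    return [nxt[-1] for nxt in primers[1:]]

def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))

extensions = "".join(implied_extensions(PRIMERS))   # bases for Primers 1-4
template_bases = reverse_complement(extensions)     # as read on the template
```

Primers 1-4 are expected to gain A, C, A and C, which corresponds to GTGT on the template strand, the last four bases of the wild-type AGTGT.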
  • a 50 µl solution of 0.4 µM biotinylated oligonucleotide was added to each well, incubated for 2 hrs (1.5 M NaCl, 10 mM EDTA, and 0.5 µM target synthetic template strands), and then rinsed with TNTw (10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.05% Tween-20).
  • GBA™ biochemistry (Nikiforov, T. T. et al., Nucl Acids Res 22:4167-4175 (1994); Nikiforov, T. T. et al., PCR Methods and Apps 3:285-291 (1994), both herein incorporated by reference) was used to analyze the synthetic templates; each synthetic template was split into four different wells, and each well was treated with extension mix containing all the extension reaction components, exonuclease free Klenow fragment of E. coli polymerase, and each of four fluorescein-labeled ddNTPs and co-ddNTPs. Enzyme-mediated fluorescence signals were obtained using the Cytofluor II fluorescent plate reader. Synthetic template 1 was designed to give a GBA™ signal in base A, and synthetic template 2 was designed to give a GBA™ signal in base G.
  • the GBA™ extension reactions are detected using the enzyme-mediated fluorescence precipitation method (Huang, Z. et al., Anal Biochem 207:32-39 (1992); Huang, Z. et al., J Histochem Cytochem 41:313-317 (1993)).
  • the glass slides containing the fluorescein GBA™ signal are incubated for about 30 minutes with anti-fluorescein alkaline phosphatase solution under a blocking condition commonly used in ELISA or histochemical procedures. After washing, a droplet of an alkaline phosphatase fluorogenic precipitating substrate solution (Molecular Probes) is applied to either individual reaction wells or the entire slide.
  • the GBA™ signal can be immediately visualized under a conventional fluorescent microscope equipped with a 360 nm excitation filter and a 530 nm emission filter, or quantitated by a fluorescence microtiter plate scanner (Cytofluor II) equipped with the same filter set.
  • Table 10 reveals that Primers 1-3 were extended as expected to yield extension products A, C, and T, respectively for both wild-type and AG deletion target molecules.
  • the presence of wild-type target results in the extension of Primer 4 with a C residue (consistent with the results obtained above; see Table 7).
  • the presence of the wild-type target causes Primers 5 and 6 to be extended by T and A, respectively (see Table 7).
  • the presence of the AG deletion target causes Primer 4 to be extended by an A (consistent with the result shown in Table 9). Consistent with the fact that the target mixture is 1:1 wild-type:mutant, the signals of A and C addition for Primer 4 are approximately equal.
  • Neither Primer 5 nor Primer 6 is extended when hybridized to the AG deletion target because their 3′ terminal nucleotides would not be base-paired with the AG deletion target mutant.
  • the failure of Primer 5 to be extended when hybridized to the AG mutant reflects the relatively lower binding avidity of the polymerase for Primer 5:mutant duplexes as compared to Primer 5:wild-type duplexes (in which there would be no mismatch).
  • Table 12 shows the result of this experiment with respect to the “+” strand of this target molecule.
  • TABLE 12. Nested GBA™ Reaction Using “+” Template Strand of CF Gene At Locus 549
    Primer Used   A Signal   A TIN   C Signal   C TIN   G Signal   G TIN   T Signal   T TIN
    Nucleotide Extended Using Exo-Klenow
    CF549+1       0.25       0.09    2.99       0.09    0.20       0.09    0.29       0.09
    CF549+2       3.49       0.10    0.74       0.10    0.19       0.10    0.25       0.10
    CF549+3       1.01       0.31    3.23       0.12    0.26       0.13    0.38       0.12
    CF549+4       0.47       0.19    0.89       0.16    0.56       0.25    2.76       0.14
    CF549+5       0.30       0.11    2.97       0.12    0.21       0.13    0.39       0.12
    Nucleotide Extended Using Klenow
    CF549+1       0.43       0.15    3.37       0.10    0.29       0.15    0.82       0.64
    CF549+2       3.
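A base call can be read from each complete Exo-Klenow row of Table 12 by taking the strongest signal and comparing it with its TIN value. The sketch below uses the tabulated numbers; treating signal divided by TIN as the signal:noise figure is an assumption made here for illustration, not a definition given in the text.

```python
# (signal, TIN) per base, copied from the Exo-Klenow rows of Table 12.
EXO_KLENOW = {
    "CF549+1": {"A": (0.25, 0.09), "C": (2.99, 0.09), "G": (0.20, 0.09), "T": (0.29, 0.09)},
    "CF549+2": {"A": (3.49, 0.10), "C": (0.74, 0.10), "G": (0.19, 0.10), "T": (0.25, 0.10)},
    "CF549+3": {"A": (1.01, 0.31), "C": (3.23, 0.12), "G": (0.26, 0.13), "T": (0.38, 0.12)},
    "CF549+4": {"A": (0.47, 0.19), "C": (0.89, 0.16), "G": (0.56, 0.25), "T": (2.76, 0.14)},
    "CF549+5": {"A": (0.30, 0.11), "C": (2.97, 0.12), "G": (0.21, 0.13), "T": (0.39, 0.12)},
}

def call_base(row):
    """Return (base with the highest signal, that signal divided by its TIN)."""
    base = max(row, key=lambda b: row[b][0])
    signal, tin = row[base]
    return base, signal / tin

calls = {primer: call_base(row) for primer, row in EXO_KLENOW.items()}
called_sequence = "".join(calls[p][0] for p in sorted(calls))
```

For these five rows the strongest signals are C, A, C, T and C, each well above its template-independent noise.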
  • the p53 gene encompasses an approximately 19 kilobase stretch, comprising 11 exons (393 codons), of chromosome region 17p13.105-p12. Characterized as a tumor antigen in 1979, then as an oncogene, and finally as a tumor suppressor gene, p53 has received increasing study in cancer research. Mutations in the p53 gene are the single most common genetic alteration in human cancers and generally result in loss of function of the protein. The p53 protein's apparent role in regulating cell growth and apoptosis suggests it is a core protein in determination of tumorigenesis, with mutations in p53 being part of the cascade necessary for the development of many tumors.
  • Three quarters of colon cancers and half of lung and breast cancers have been reported to contain p53 mutations (Levine, A. J., Canc. Surveys 12:59-79 (1992); herein incorporated by reference). Since more than 100,000 additional cases of each of these cancers are diagnosed each year, the potential application of p53 analysis is significant both clinically and commercially. The majority of p53 mutations are missense (ranging from 75% to more than 90%), tightly clustered between codons 118 and 309, the DNA binding region of the protein. Amino acids 175, 248, 249, 273, and 282 account for 40% of the total reported missense mutations, and the predominance of these so-called “hot-spots” varies depending on the tissue of origin of the cancer.
  • p53 is an excellent target for development of a sequence composition/confirmation analysis tool such as SCAN™.
  • Nested GBA™ primers were designed for all DNA bases in a hot-spot (codons 272-282) of the target p53 gene.
  • FIG. 2 displays the four mutational hot-spot regions of p53 gene with the wild-type and known representative mutant sequences of codons 272 to 282 (region D) highlighted.
  • three synthetic templates are designed to match three DNA samples, each containing a mutation in either codon 273, 275, or 281.
  • Two additional synthetic templates are designed to be representative of a deletion mutation (codons 266 and 267 deleted) and an insertion mutation (C insertion at codon 280).
  • TargEx™ is a method developed to quantitatively convert double-stranded PCR product into single-stranded DNA by selectively degrading one of the strands with bacteriophage T7 gene 6 exonuclease (Nikiforov, T. T. et al., Nucl Acids Res 22:4167-4175 (1994); Nikiforov, T. T. et al., PCR Methods and Apps 3:285-291 (1994)).
  • PCR product amplified from human genomic DNA using one fluorescein-labeled, phosphorothioated PCR primer and one unmodified primer is treated with T7 gene 6 exonuclease (U.S. Biochemical) at a final concentration of 2 U/µl PCR (diluted in buffer supplied by manufacturer). After 1 hr of incubation at room temperature, NaCl and EDTA are added to a concentration of 1.5 M and 10 mM, respectively, to stop the exonuclease digestion. The mixture is then applied to the immobilized GBA™ primer for subsequent hybridization and extension. After extension, the standard ABI fluorescent cycle sequencing system is used to analyze the reaction.
  • the 5′ ends of the primers are specifically attached to glass slides to form a SCAN™ array.
  • Synthetic oligonucleotide templates corresponding to portions of the target hot-spot and containing various known mutations are used to test the array and the GBA™ biochemistry to demonstrate that robust, unambiguous (low noise and background) data can be obtained from such an analysis.
  • Permutations of the standard GBA™ biochemistry, in particular the use of different DNA polymerases, are evaluated to ensure optimal signal:noise (S:N) characteristics for all 4 nucleotides in the feasibility test system.
  • Primer pairs will be qualified by amplification of human genomic DNA at a concentration of 12.5 µg/ml in 30 µl reactions in 96-well V-bottom polycarbonate plates (Costar). The final concentration of the reaction mixture will be 400 µM each dNTP, 50 mM KCl, 10 mM Tris HCl (pH 8.5), 1.5 mM MgCl₂, 0.5 µM each primer, 2.5 ng/µl DNA, and 0.025 U/µl Taq DNA polymerase (Perkin-Elmer). Each reaction will be overlaid with 30 µl mineral oil and cycled in a BioIII thermocycler (Sun BioScience Inc., Branford, Conn.).
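The quantities per 30 µl reaction implied by these final concentrations can be worked out with simple dilution arithmetic. The sketch assumes that 12.5 µg/ml (equivalently 12.5 ng/µl) is the concentration of the genomic DNA stock, which is one reading of the text; the function names are illustrative.

```python
REACTION_VOLUME_UL = 30.0

def stock_volume_ul(stock_conc, final_conc, reaction_volume_ul=REACTION_VOLUME_UL):
    """Volume of stock needed (C1 * V1 = C2 * V2); both concentrations in the same units."""
    return final_conc * reaction_volume_ul / stock_conc

def amount_per_reaction(final_conc_per_ul, reaction_volume_ul=REACTION_VOLUME_UL):
    """Total amount in one reaction, given a concentration expressed per microliter."""
    return final_conc_per_ul * reaction_volume_ul

dna_stock_ul = stock_volume_ul(stock_conc=12.5, final_conc=2.5)   # ng/ul in both
dna_ng = amount_per_reaction(2.5)        # ng of genomic DNA per reaction
taq_units = amount_per_reaction(0.025)   # units of Taq per reaction
primer_pmol = amount_per_reaction(0.5)   # 0.5 uM equals 0.5 pmol/ul
```

On these assumptions each reaction contains about 6 µl of DNA stock (75 ng), 0.75 U of Taq polymerase and 15 pmol of each primer.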
  • the performance of the Nested GBA™ method is assessed using PCR-amplified genomic DNA as the target for analysis. At least two overlapping PCR primer pairs are designed and tested on wild-type and mutant-containing genomic DNAs (five total), and the resultant PCR products tested by N-GBA™ on the SCAN™ arrays produced in Example 5. The PCR products will be evaluated for hybridization and extension efficiencies relative to the synthetic templates of Example 5 to ensure that analysis of PCR products is equally robust.
  • Table 15 shows that Thermosequenase did not extend primers whose 3′ terminal nucleotides were not base-paired to the template. In contrast, Klenow and Exo-Klenow were both able to incorporate label onto the 3′ terminus of 3′ terminally mismatched primers, consistent with the data presented above.

Abstract

The invention provides a kit and a solid phase sequencing method for determining the sequence of nucleic acid molecules, and particularly the sequence of regions of nucleic acid molecules in which variation from the normal sequence is associated with disease.

Description

    FIELD OF THE INVENTION
  • The present invention relates to nucleic acid chemistry, and more specifically to a method for determining the nucleotide sequence of a polynucleotide. The invention further relates to apparatus and kits that embody or employ such a method. [0001]
  • BACKGROUND OF THE INVENTION
  • The determination of the nucleotide sequence of a polynucleotide has substantial utility in medicine, forensics, biomedical research, and in the determination of paternity and identity. Several methods for determining the nucleotide sequence of a polynucleotide have been identified. [0002]
  • I. Nucleic Acid Sequencing [0003]
  • Initial attempts to determine the sequence of a DNA molecule were extensions of techniques which had been initially developed to permit the sequencing of RNA molecules (Sanger, F., J. Mol. Biol. 13:373 (1965); Brownlee, G. G. et al., J. Mol. Biol. 34:379 (1968)). Such methods involved the specific cleavage of DNA into smaller fragments by (1) enzymatic digestion (Robertson, H. D. et al., Nature New Biol. 241:38 (1973); Ziff, E. B. et al., Nature New Biol. 241:34 (1973)); (2) nearest neighbor analysis (Wu, R., et al., J. Mol. Biol. 57:491 (1971)), and (3) the “Wandering Spot” method (Sanger, F., Proc. Natl. Acad. Sci. (U.S.A.) 70:1209 (1973)). [0004]
  • The most commonly used methods of nucleic acid sequencing are the “dideoxy-mediated chain termination method,” also known as the “Sanger Method” (Sanger, F., et al., J. Molec. Biol. 94:441 (1975); Prober, J. et al. Science 238:336-340 (1987)) and the “chemical degradation method,” also known as the “Maxam-Gilbert method” (Maxam, A. M., et al., Proc. Natl. Acad. Sci. (U.S.A.) 74:560 (1977), both references herein incorporated by reference). [0005]
  • A. Dideoxy-Mediated Chain Termination Method Of DNA Sequencing [0006]
  • In the dideoxy-mediated or “Sanger” chain termination method of DNA sequencing, the sequence of a DNA molecule is obtained through the extension of an oligonucleotide primer which is hybridized to the nucleic acid molecule being sequenced. In brief, four separate primer extension reactions are conducted. In each reaction, a DNA polymerase is added along with the four nucleotide triphosphates needed to polymerize DNA. Each of the reactions is carried out in the additional presence of a 2′,3′ dideoxy derivative of the A, T, C, or G nucleoside triphosphates. Such derivatives differ from conventional nucleotide triphosphates in that they lack a hydroxyl residue at the 3′ position of deoxyribose. Thus, although they can be incorporated by a DNA polymerase into the newly synthesized primer extension, the absence of the 3′ hydroxyl group causes them to be incapable of forming a phosphodiester bond with a succeeding nucleotide triphosphate. Thus, the incorporation of a dideoxy derivative results in the termination of the extension reaction. Since the dideoxy derivatives are present in lower concentrations than their corresponding, conventional nucleotide triphosphate analogs, the net result of each of the four reactions is to produce a set of nested oligonucleotides each of which is terminated by the particular dideoxy derivative used in the reaction. By subjecting the reaction products of each of the extension reactions to electrophoresis, it is possible to obtain a series of four “ladders.” Since the position of each “rung” of the ladder is determined by the size of the molecule, and since such size is determined by the incorporation of the dideoxy derivative, the appearance and location of a particular “rung” can be readily translated into the sequence of the extended primer. Thus, through an electrophoretic analysis, the sequence of the extended primer can be determined. [0007]
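The way four ladders translate into a sequence can be sketched in a few lines. This is a simplified model under stated assumptions: every position yields a terminated fragment, fragment length is measured from the end of the primer, and gel resolution is perfect. The function names are illustrative.

```python
def sanger_ladders(extended):
    """Simulate the four termination reactions.

    `extended` is the strand synthesized beyond the primer, 5'->3'. Each
    reaction (one per dideoxy base) yields fragments ending wherever that
    base occurs; the rung position is the fragment length.
    """
    ladders = {base: [] for base in "ACGT"}
    for length, base in enumerate(extended, start=1):
        ladders[base].append(length)
    return ladders

def read_ladders(ladders):
    """Read the gel from shortest to longest rung across all four lanes."""
    rungs = sorted(
        (length, base) for base, lengths in ladders.items() for length in lengths
    )
    return "".join(base for _, base in rungs)

ladders = sanger_ladders("GATTACA")
sequence = read_ladders(ladders)
```

Reading rungs in order of increasing length across the four lanes recovers the extended-primer sequence, which is the complement of the template.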
  • One deficiency of the dideoxy-mediated sequencing method is the need to optimize the ratio of dideoxy nucleoside triphosphates to conventional nucleoside triphosphates in the chain-extension/chain-termination reactions. Such adjustments are needed in order to maximize the amount of information which can be obtained from each primer. Additionally, the efficiency of dideoxy nucleotide incorporation in a particular target molecule is partially dependent upon the primary and secondary structures of the target. [0008]
  • The dideoxy-mediated method thus requires single-stranded templates, specific oligonucleotide primers, and high quality preparations of a DNA polymerase (typically the Klenow fragment of E. coli DNA polymerase I). Initially, these requirements delayed the widespread use of the method. However, with the ready availability of synthetic primers, and the availability of bacteriophage M13 and phagemid vectors (Maniatis, T., et al., Molecular Cloning, a Laboratory Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), herein incorporated by reference), the dideoxy-mediated chain termination method is now extensively employed. [0009]
  • B. The Maxam-Gilbert Method Of DNA Sequencing [0010]
  • The Maxam-Gilbert method of DNA sequencing is a degradative method. In this procedure, a fragment of DNA is labeled at one end and partially cleaved in four separate chemical reactions, each of which is specific for cleaving the DNA molecule at a particular base (G or C) or at a particular type of base (A/G, C/T, or A>C). As in the above-described dideoxy method, the effect of such reactions is to create a set of nested molecules whose lengths are determined by the locations of a particular base along the length of the DNA molecule being sequenced. The nested reaction products are then resolved by electrophoresis, and the end-labeled molecules are detected, typically by autoradiography when a ³²P label is employed. Four single lanes are typically required in order to determine the sequence. [0011]
  • The Maxam-Gilbert method thus uses simple chemical reagents which are readily available. Nevertheless, the dideoxy-mediated method has several advantages over the Maxam-Gilbert method. The Maxam-Gilbert method is extremely laborious and requires meticulous experimental technique. In contrast, the Sanger method may be employed on larger nucleic acid molecules. [0012]
  • Significantly, in the Maxam-Gilbert method the sequence is obtained from the original DNA molecule, and not from an enzymatic copy. For this reason, the method can be used to sequence synthetic oligonucleotides, and to analyze DNA modifications such as methylation, etc. It can also be used to study both DNA secondary structure and protein-DNA interactions. Indeed, it has been readily employed in the identification of the binding sites of DNA binding proteins. [0013]
  • Methods for sequencing DNA using either the dideoxy-mediated method or the Maxam-Gilbert method are widely known to those of ordinary skill in the art. Such methods are, for example, disclosed in Maniatis, T., et al., Molecular Cloning, a Laboratory Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), and in Zyskind, J. W., et al., Recombinant DNA Laboratory Manual, Academic Press, Inc., New York (1988), both herein incorporated by reference. [0014]
  • Both the dideoxy-mediated method and the Maxam-Gilbert method of DNA sequencing require the prior isolation of the DNA molecule which is to be sequenced. The sequence information is obtained by subjecting the reaction products to electrophoretic analysis (typically using polyacrylamide gels). Thus, a sample is applied to a lane of a gel, and the various species of nested fragments are separated from one another by their migration velocity through the gel. The number of nested fragments which can be separated in a single lane is approximately 200-300 regardless of whether the Sanger or the Maxam-Gilbert method is used. Those of great skill in the art can separate up to 600 fragments in a single lane. Thus, in order to sequence large DNA molecules, it is necessary to fragment the molecule, and to sequence the fragments in separate lanes of the sequencing gel. The sequence of the entire molecule is obtained by orienting and ordering the sequence data obtained from each fragment. [0015]
  • Two approaches have been employed by those of skill in this art to accomplish this goal. In a random or shotgun sequencing approach, sequence data is collected by subcloning fragments of the target DNA molecule. No attempt is initially made to determine the linear orientation or order of the subclones with respect to the intact target DNA molecule. Instead, the accumulated data are stored and ultimately arranged into order by a computer (Staden, R., Nucleic Acids Res. 14:217 (1986); Anderson, S. et al., Nature 290:457 (1981); Gingeras, T. R., J. Biol. Chem. 257:13475 (1982); Sanger, F. et al., J. Mol. Biol. 162:729 (1982), and Baer, R. et al., Nature 310:207 (1984)). As will be appreciated, such random shotgun approaches often result in the multiple sequencing of the same oligonucleotide fragment, and thus are often inefficient in terms of time and materials. [0016]
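  • The computer-based ordering step of the shotgun approach can be illustrated with a minimal sketch. The Python below uses a simple greedy suffix-prefix overlap merge; this is an illustrative assumption, not the algorithm of any of the references cited above. The function names (overlap, greedy_assemble), the min_len parameter and the example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    # Drop reads that are wholly contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# The same fragment appears twice, as often happens in random subcloning.
example_reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC", "GCGTACGTT"]
contigs = greedy_assemble(example_reads)
```

  • The duplicate read contributes no new information, which reflects the redundancy noted above. A greedy merge of this kind can misassemble repeated sequences, so it should be read only as a sketch of the ordering problem.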
  • In contrast, directed approaches have been employed in which sequences of the target DNA are obtained in a systematic fashion. For example, the target DNA molecule may be ordered by restriction mapping using the methods described above, and the discrete restriction fragments sequenced. Alternatively, the target molecule may be sequenced by sequencing nested sets of deletions which begin at one of its ends. The use of such nested fragments progressively brings more and more remote regions of the target DNA into range for sequencing. Lastly, sequence information obtained from a particular target molecule can be used to prepare a primer which can then be used in a subsequent sequencing reaction in order to obtain additional sequence information. As will be perceived, a directed sequence analysis of a target DNA molecule often requires substantial a priori information regarding the sequence. Moreover, for large target molecules (of sizes on the order of kilobases) such as would be encountered in the sequencing of eukaryotic (and in particular, mammalian) chromosomes, directional sequencing is quite arduous. [0017]
  • II. Microsequencing and GBA™ Genetic Analysis [0018]
  • In contrast to the “Sanger Method” and the “Maxam-Gilbert method,” which identify the sequence of all of the nucleotides of a target polynucleotide, “microsequencing” methods determine the identity of only a single nucleotide at a “predetermined” site. Such methods have particular utility in determining the presence and identity of polymorphisms in a target polynucleotide. [0019]
  • The GBA™ Genetic Bit Analysis method disclosed by Goelet, P. et al. (WO 92/15712, herein incorporated by reference) is a particularly useful microsequencing method. In GBA™, the nucleotide sequence information surrounding a predetermined site of interrogation is used to design an oligonucleotide primer that is complementary to the region immediately adjacent to, but not including, the predetermined site. The target DNA template is selected from the biological sample and hybridized to the interrogating primer. This primer is extended by a single labeled dideoxynucleotide using DNA polymerase in the presence of at least two, and most preferably all four chain terminating nucleoside triphosphate precursors. [0020]
  • Additional, primer-guided, nucleotide incorporation procedures for assaying polymorphic sites in DNA have also been described (Kornher, J. S. et al., Nucl. Acids Res. 17:7779-7784 (1989); Sokolov, B. P., Nucl. Acids Res. 18:3671 (1990); Syvänen, A. -C., et al., Genomics 8:684-692 (1990); Kuppuswamy, M. N. et al., Proc. Natl. Acad. Sci. (U.S.A.) 88:1143-1147 (1991); Prezant, T. R. et al., Hum. Mutat. 1:159-164 (1992); Ugozzoli, L. et al., GATA 9:107-112 (1992); Nyrén, P. et al., Anal. Biochem. 208:171-175 (1993); and Wallace, WO89/10414). These methods differ from Genetic Bit™ Analysis in that they all rely on the incorporation of labeled deoxynucleotides to discriminate between bases at a polymorphic site. In such a format, since the signal is proportional to the number of deoxynucleotides incorporated, polymorphisms that occur in runs of the same nucleotide can result in signals that are proportional to the length of the run (Syvänen, A. -C., et al., Amer. J. Hum. Genet. 52:46-59 (1993)). Such a range of locus-specific signals could be more complex to interpret, especially for heterozygotes, compared to the simple, ternary (2:0, 1:1, or 0:2) class of signals produced by the GBA™ method. In addition, for some loci, incorporation of an incorrect deoxynucleotide can occur even in the presence of the correct dideoxynucleotide (Kornher, J. S. et al., Nucl. Acids Res. 17:7779-7784 (1989)). Such deoxynucleotide misincorporation events may be due to the Km of the DNA polymerase for the mispaired deoxy-substrate being comparable, in some sequence contexts, to the relatively poor Km of even a correctly base paired dideoxy-substrate (Kornberg, A., et al., In: DNA Replication, Second Edition (1992), W. H. Freeman and Company, New York; Tabor, S. et al., Proc. Natl. Acad. Sci. (U.S.A.) 86:4076-4080 (1989)). This effect would contribute to the background noise in the polymorphic site interrogation. [0021]
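  • The dependence of signal on run length can be made concrete with a small sketch. It assumes an idealized reaction in which only one labeled nucleotide species is supplied and the polymerase never misincorporates. The function name labeled_residues_added and its arguments are hypothetical, and `extension` denotes the bases that would be added to the primer, in order of synthesis.

```python
def labeled_residues_added(extension, labeled_base, chain_terminator):
    """Count labeled residues added to a primer in an idealized reaction.

    extension        -- bases the polymerase would add, in order of synthesis
    labeled_base     -- the single labeled nucleotide species supplied
    chain_terminator -- True for a dideoxynucleotide, False for a deoxynucleotide
    """
    added = 0
    for base in extension:
        if base != labeled_base:
            break  # the required nucleotide is absent, so extension stops
        added += 1
        if chain_terminator:
            break  # a terminator cannot be extended further
    return added


# A polymorphic site that lies at the start of a run of three identical bases.
deoxy_signal = labeled_residues_added("AAAG", "A", chain_terminator=False)
dideoxy_signal = labeled_residues_added("AAAG", "A", chain_terminator=True)
```

  • Under these assumptions the deoxynucleotide format reports three labeled residues for the run, whereas the chain terminator format reports exactly one, regardless of the length of the run.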
  • Mundy, C. R. (U.S. Pat. No. 4,656,127) discusses alternative microsequencing methods for determining the identity of the nucleotide present at a particular polymorphic site. Mundy's methods employ a specialized exonuclease-resistant nucleotide derivative. A primer complementary to the allelic sequence immediately 3′ to the polymorphic site is permitted to hybridize to a target molecule obtained from a particular animal or human. If the polymorphic site on the target molecule contains a nucleotide that is complementary to the particular exonuclease-resistant nucleotide derivative present, then that derivative will be incorporated by a polymerase onto the end of the hybridized primer. Such incorporation renders the primer resistant to exonuclease, and thereby permits its detection. Since the identity of the exonuclease-resistant derivative of the sample is known, a finding that the primer has become resistant to exonucleases reveals that the nucleotide present in the polymorphic site of the target molecule was complementary to that of the nucleotide derivative used in the reaction. The Mundy method has the advantage that it does not require the determination of large amounts of extraneous sequence data. It has the disadvantages of destroying the amplified target sequences and unmodified primer, and of being extremely sensitive to the rate of polymerase incorporation of the specific exonuclease-resistant nucleotide being used. [0022]
  • Cohen, D. et al. (French Patent 2,650,840; PCT Appln. No. WO91/02087) discuss a solution-based method for determining the identity of the nucleotide of a polymorphic site. As in the Mundy method of U.S. Pat. No. 4,656,127, a primer is employed that is complementary to allelic sequences immediately 3′- to a polymorphic site. The method determines the identity of the nucleotide of that site using labeled dideoxynucleotide derivatives, which, if complementary to the nucleotide of the polymorphic site will become incorporated onto the terminus of the primer. [0023]
  • In contrast to the method of Cohen et al. (French Patent 2,650,840; PCT Appln. No. WO91/02087) the GBA™ method of Goelet, P. et al. can be conducted as a heterogeneous phase assay, in which the primer or the target molecule is immobilized to a solid phase. It is thus easier to perform, and more accurate than the method discussed by Cohen. The method of Cohen has the significant disadvantage of being a solution-based extension method that uses labeled dideoxynucleoside triphosphates. In the Cohen method, the target DNA template is usually prepared by a DNA amplification reaction, such as the PCR, that uses a high concentration of deoxynucleoside triphosphates, the natural substrates of DNA polymerases. These monomers will compete in the subsequent extension reaction with the dideoxynucleoside triphosphates. Therefore, following the PCR, an additional purification step is required to separate the DNA template from the unincorporated dNTPs. Because it is a solution-based method, the unincorporated dNTPs are difficult to remove and the method is not suited for high volume testing. [0024]
  • III. Sequencing Via Hybridization To Ordered Oligonucleotide Arrays [0025]
  • In response to the difficulties encountered in employing gel electrophoresis to analyze sequences, alternative methods have been developed. Macevicz (U.S. Pat. No. 5,002,867), for example, describes a method for determining nucleic acid sequence via hybridization with multiple mixtures of oligonucleotide probes. In accordance with such method, the sequence of a target polynucleotide is determined by permitting the target to sequentially hybridize with sets of probes having an invariant nucleotide at one position, and variant nucleotides at other positions. The Macevicz method determines the nucleotide sequence of the target by hybridizing the target with a set of probes, and then determining the number of sites that at least one member of the set is capable of hybridizing to the target (i.e. the number of “matches”). This procedure is repeated until each member of a set of probes has been tested. [0026]
  • IV. Limitation Of Conventional Methods [0027]
  • Several factors may limit the use of conventional methods in the analysis of the nucleotide sequence of a target molecule. Typically, each lane of a sequencing gel can resolve only about 300 different fragments. Thus, in order to determine the nucleotide sequence of a large DNA molecule, multiple sequencing gels are often needed. This, in turn, limits the amount of new sequence information which can be readily obtained per day. For a large nucleic acid molecule, a substantial number of technically demanding and time consuming steps must be performed. In particular, since the above-described techniques are capable of analyzing only one set of nested oligonucleotides per sample, the sequencing of large DNA molecules requires the use of multiple sequencing gels each having a large number of lanes. The electrophoretic analysis step in the sequencing process thus comprises a significant limitation to the amount of sequence information which can be obtained and the rate with which it can be processed. [0028]
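  • The scale of this limitation can be estimated with simple arithmetic. The sketch below assumes about 300 resolvable fragments per lane and four lanes per sequenced fragment (one per base-specific reaction), and it ignores the overlap between fragments that is needed to order them, so it gives a lower bound. The function name gel_lanes_needed and the 10,000-base example are hypothetical.

```python
import math


def gel_lanes_needed(target_length, fragments_per_lane=300, lanes_per_read=4):
    """Return (reads, lanes) needed to cover a target once, with no overlap."""
    reads = math.ceil(target_length / fragments_per_lane)
    return reads, reads * lanes_per_read


# A hypothetical 10,000-base target.
reads, lanes = gel_lanes_needed(10_000)
```

  • For this example the estimate is 34 fragments and 136 lanes before any allowance for overlapping or repeated reads.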
  • In summary, a method which would permit accurate nucleotide sequencing without gel analysis would be highly desirable. Indeed, for the analysis of very large genomes, such as the human genome, the development of such methods may be essential. [0029]
  • SUMMARY OF THE INVENTION
  • The invention provides a solid phase sequencing method for determining the sequence of nucleic acid molecules (either DNA or RNA). In detail, the invention provides a method for determining the nucleotide sequence of a nucleic acid molecule which comprises the steps of: [0030]
  • (A) arraying a set of nested primer oligonucleotides onto a solid support, each array position containing a different array member having a predetermined sequence; [0031]
  • (B) incubating oligonucleotides of the array in the presence of a preparation of the nucleic acid molecules, a polymerase and at least one chain terminator nucleotide; wherein the incubation is under conditions sufficient to permit DNA hybridization to occur between the oligonucleotides of the incubation and the nucleic acid molecules; wherein the incubation is conducted in the substantial absence of any non-chain terminator nucleotides; [0032]
  • (C) (1) in the case wherein the 3′ terminal nucleotide of an oligonucleotide is hybridized to the nucleic acid molecule, permitting oligonucleotides hybridized to nucleic acid molecules to be extended by polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of the hybridized oligonucleotide, wherein for each hybridized oligonucleotide being so extended, the incorporated nucleotide residue is complementary to the nucleotide residue immediately 5′ to the nucleotide residue of the nucleic acid molecule that is hybridized with that oligonucleotide's 3′ terminal nucleotide residue; then performing step (D); [0033]
  • (2) in the case wherein the 3′ terminal nucleotide of an oligonucleotide is not hybridized to the nucleic acid molecule, either: [0034]
  • (a) not permitting oligonucleotides hybridized to nucleic acid molecules to be extended by polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of the hybridized oligonucleotide, or [0035]
  • (b) permitting the removal of any non-hybridized nucleotide residues from the 3′ terminus of the hybridized oligonucleotides, so as to form a truncated primer oligonucleotide whose 3′ terminus is hybridized to the nucleic acid molecule, and then permitting polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of the hybridized truncated oligonucleotide, wherein for each hybridized truncated oligonucleotide being so extended, the incorporated nucleotide residue is complementary to the nucleotide residue immediately 5′ to the nucleotide residue of the nucleic acid molecule that is hybridized with that truncated oligonucleotide's 3′ terminal nucleotide residue; then performing step (D); [0036]
  • (D) determining, at each array position at which an oligonucleotide has incorporated a single chain terminator nucleotide residue, the identity of the incorporated chain terminator nucleotide residue; and [0037]
  • (E) determining the nucleotide sequence of the nucleic acid molecule from the determined identity of the incorporated nucleotide of primer oligonucleotides of the array, and known sequence of the oligonucleotide at each array position. [0038]
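  • Steps (A) through (E) can be sketched as a short simulation. The Python below is an idealized model and rests on several assumptions: primers hybridize only with a perfect match, each primer site occurs once in the target, and the truncation alternative of step (C)(2)(b) is not modeled. All function names, the primer length of 5 and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def nested_primers(reference, k):
    """Step (A): primers of length k tiled at one-base intervals."""
    copy_strand = reverse_complement(reference)
    return [copy_strand[i:i + k] for i in range(len(copy_strand) - k)]


def extend(primer, target):
    """Steps (B)-(D): return the single terminator added, or None."""
    pos = target.find(reverse_complement(primer))
    if pos <= 0:
        return None  # no perfect hybridization, or no template base to copy
    # The template base immediately 5' to the hybridized site is copied.
    return target[pos - 1].translate(COMPLEMENT)


def read_array(primers, target):
    return [extend(p, target) for p in primers]


def reconstruct(primers, calls):
    """Step (E): combine known primer sequence with the incorporated bases."""
    if any(c is None for c in calls):
        return None
    return reverse_complement(primers[0] + "".join(calls))


reference = "GATTACAGGCTA"
primers = nested_primers(reference, k=5)
wild_type_calls = read_array(primers, reference)
mutant_calls = read_array(primers, "GATCACAGGCTA")  # T changed to C
```

  • In this model the wild-type calls reproduce the reference, with the 3′-terminal five bases of the target inferred from the known sequence of the first primer. For the altered target, the primer that abuts the changed base incorporates G instead of A, and the primers that overlap the changed base give no signal because they no longer hybridize perfectly.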
  • The invention particularly concerns embodiments in which each array position contains a primer oligonucleotide that is capable of hybridizing to a region of the nucleic acid molecule, and/or wherein in step (C), at least some array positions contain nucleic acid molecules hybridized to oligonucleotides whose 3′ terminal nucleotide is not hybridized to the nucleic acid molecule, and wherein step (C)(1) is conducted for such oligonucleotides. [0039]
  • Either a Thermosequenase class polymerase or a Klenow class polymerase may be employed in the method. [0040]
  • The invention particularly includes the embodiments in which the array is a random oligonucleotide array, and in which the array is a nested oligonucleotide array (especially one containing oligonucleotide members having all possible permutations of nucleotides over a region of from 1 to 20 bases). [0041]
  • The invention is particularly adapted for conducting the method in the presence of at least four chain terminator nucleotide species, at least one of which is labeled, and more preferably wherein all of the chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present. [0042]
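  • The size of an array containing all possible permutations grows as 4 raised to the length of the region. The following sketch enumerates such a set for a short region using the Python standard library; the function name all_oligos is hypothetical.

```python
from itertools import product


def all_oligos(n):
    """Every possible DNA oligonucleotide of length n, in alphabetical order."""
    return ["".join(bases) for bases in product("ACGT", repeat=n)]


pentamers = all_oligos(5)
array_sizes = {n: 4 ** n for n in (1, 5, 10, 20)}
```

  • A complete set of 5-base oligonucleotides has 1,024 members, while a complete set of 20-base oligonucleotides would have 4^20 (about 1.1 × 10^12) members, so complete enumeration is practical only toward the short end of the stated range.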
  • The invention particularly provides a method of sequence determination for genomic DNA of a human or non-human mammal, and is especially adapted for use in determining the sequence of DNA suspected to contain a genetic variation associated with a disease (e.g., cancer or cystic fibrosis), and in which the method is employed to determine whether the DNA contains the variation. [0043]
  • In a preferred embodiment of the method, the oligonucleotides are immobilized onto the solid support (such as plastic or glass). [0044]
  • The invention also provides a kit for determining the sequence of a nucleic acid molecule which comprises a solid support containing an array of spaced apart receptacles for oligonucleotides, each receptacle containing a different primer oligonucleotide. The kit may additionally contain at least four chain terminator nucleotide species, at least one of which is labeled. A highly preferred kit contains at least four chain terminator nucleotide species, wherein all of the chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present. [0045]
  • The kit is particularly suited for determining the nucleotide sequence of DNA suspected to contain a genetic variation associated with a disease, and to provide a determination of nucleotide sequence sufficient to determine whether the DNA contains the variation. [0046]
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 shows the result of a nested GBA™ (N-GBA™) experiment. [0047]
  • FIG. 2 shows the four major p53 mutational hot-spot regions, which contain most cited p53 mutations and are indicated by the black bars marked A-D: A=codons 132-143, B=codons 174-179, C=codons 236-258 and D=codons 272-282 (del=deletion; ins=insertion). [0048]
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The present invention provides a method of sequencing which provides the advantages of micro- and nano-sequencing and has the ability to sequence polynucleotide regions. In brief, the method employs ordered arrays of linear primers that are capable of hybridizing to a target molecule and reporting the identity of the single nucleotide that is present in the hybridized molecule immediately 5′ to the 3′ terminus of the primer. By employing a suitable array of such primers, the invention permits one to ascertain the complete nucleotide sequence of a target polynucleotide. There are thus two central aspects to the present invention: the method of sequence analysis, and the nature of the primer array. [0049]
  • I. GBA™ Sequence Analysis [0050]
  • The most preferred method of the present invention employs a modification of the GBA™ method of analyzing a predetermined site as the means for accomplishing sequence analysis. The GBA™ method can be conducted in a variety of ways. In particular, such interrogation can be accomplished via a polymerase-mediated analysis or by a ligase-mediated analysis. [0051]
  • A. Polymerase-Mediated Analysis [0052]
  • The polymerase-mediated analysis is more fully described by Goelet, P. et al. (PCT Application WO92/15712, herein incorporated by reference). In this assay, a purified oligonucleotide having a defined sequence (complementary to an immediate proximal or distal sequence of a polymorphism) is bound to a solid support, especially a microtiter dish. A sample, suspected to contain the target molecule, or an amplification product thereof, is placed in contact with the support, and any target molecules present are permitted to hybridize to the bound oligonucleotide. [0053]
  • In one preferred embodiment, an oligonucleotide having a sequence that is complementary to an immediately distal sequence of a polymorphism is prepared using the above-described methods (and preferably that of Nikiforov, T. (U.S. patent application Ser. No. 08/005,061, herein incorporated by reference)). The terminus of the oligonucleotide is attached to the solid support, as described, for example, by Goelet, P. et al. (PCT Application WO 92/15712), such that the 3′-end of the oligonucleotide can serve as a substrate for primer extension. [0054]
  • The immobilized primer is then incubated in the presence of a DNA molecule (preferably a genomic DNA molecule) having a single nucleotide polymorphism whose immediately 3′-distal sequence is complementary to that of the immobilized primer. Preferably, such incubation occurs in the complete absence of any dNTP (i.e. dATP, dCTP, dGTP, or dTTP), but only in the presence of one or more chain terminating nucleotide triphosphate derivatives (such as a dideoxy derivative), and under conditions sufficient to permit the incorporation of such a derivative onto the 3′-terminus of the primer. As will be appreciated, where the polymorphic site is such that only two or three alleles exist (such that only two or three species of dNTPs, respectively, could be incorporated into the primer extension product), the presence of unusable nucleotide triphosphate(s) in the reaction is immaterial. In consequence of the incubation, and the use of only chain terminating nucleotide derivatives, a single dideoxynucleotide is added to the 3′-terminus of the primer. The identity of that added nucleotide is determined by, and is complementary to, the nucleotide of the polymorphic site of the polymorphism. [0055]
  • In this embodiment, the nucleotide of the polymorphic site is thus determined by assaying which of the set of labeled nucleotides has been incorporated onto the 3′-terminus of the bound oligonucleotide by a primer-dependent polymerase. Most preferably, where multiple dideoxynucleotide derivatives are simultaneously employed, different labels will be used to permit the differential determination of the identity of the incorporated dideoxynucleotide derivative. [0056]
  • B. Polymerase/Ligase-Mediated Analysis [0057]
  • In an alternative embodiment, the identity of the nucleotide of the polymorphic site is determined using a polymerase/ligase-mediated process. As in the above embodiment, an oligonucleotide primer is employed that is complementary to the immediately 3′-distal invariant sequence of the polynucleotide being analyzed. A second oligonucleotide is tethered to the solid phase via its 3′-end. The sequence of this oligonucleotide is complementary to the 5′-proximal sequence of the predetermined site being analyzed, but is incapable of hybridizing to the oligonucleotide primer. [0058]
  • These oligonucleotides are incubated in the presence of DNA containing the single nucleotide polymorphism that is to be analyzed, and at least one 2′-deoxynucleotide 5′-triphosphate. The incubation reaction further includes a DNA polymerase and a DNA ligase. [0059]
  • The tethered and soluble oligonucleotides are thus capable of hybridizing to the same strand of the single nucleotide polymorphism under analysis. The sequence considerations cause the two oligonucleotides to hybridize to the proximal and distal sequences of the polynucleotide site that flank the predetermined site; the hybridized oligonucleotides are thus separated by a “gap” of a single nucleotide at the precise position of the predetermined site. [0060]
  • The presence of a polymerase and a deoxynucleotide triphosphate that is complementary to the nucleotide of the polymorphic site permits the creation of a ligatable substrate: the primer is extended with the complementary deoxynucleotide across the gap, and the extended primer can then be ligated to the immobilized oligonucleotide complementary to the distal sequence. The ligation reaction immobilizes the deoxynucleotide and the previously soluble primer oligonucleotide to the solid support. [0061]
  • The identity of the polymorphic site that was opposite the “gap” can then be determined by any of several means. In a preferred embodiment, the deoxynucleotide of the reaction is labeled, and its detection thus reveals the identity of the complementary nucleotide of the predetermined site. Several different deoxynucleotides may (and preferably will) be present, each differentially labeled. Alternatively, separate reactions can be conducted, each with a different deoxynucleotide. In an alternative sub-embodiment, the deoxynucleotides are unlabeled and a labeled dideoxynucleotide is employed, and the second, soluble oligonucleotide is labeled. Separate reactions are conducted, each using a different unlabeled dideoxynucleotide. The reaction that contains the complementary nucleotide permits the ligatable substrate to form, and is detected by detecting the immobilization of the previously soluble oligonucleotide. [0062]
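  • The single-nucleotide gap geometry of the polymerase/ligase-mediated analysis can be sketched as follows. The model assumes perfect-match hybridization and that each oligonucleotide site occurs once in the target; it does not distinguish which oligonucleotide is tethered and which is soluble. The function name gap_filling_nucleotide and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gap_filling_nucleotide(target, oligo_a, oligo_b):
    """Return the nucleotide that fills a one-base gap between two oligos.

    Returns None if either oligo fails to hybridize or if the hybridized
    oligos are not separated by exactly one target nucleotide.
    """
    site_a = target.find(reverse_complement(oligo_a))
    site_b = target.find(reverse_complement(oligo_b))
    if site_a < 0 or site_b < 0:
        return None
    first, second = sorted([(site_a, len(oligo_a)), (site_b, len(oligo_b))])
    gap_index = first[0] + first[1]  # first target base not covered
    if second[0] != gap_index + 1:
        return None  # no gap, or a gap longer than one nucleotide
    return target[gap_index].translate(COMPLEMENT)


target = "ACGTTGCAAGGCT"
# Oligos complementary to target[0:5] and target[6:11]; target[5] is the gap.
filled = gap_filling_nucleotide(target, "AACGT", "CCTTG")
```

  • Here the predetermined site is the G at the gap, so the only nucleotide that creates a ligatable substrate is C. If separate reactions are run with each nucleotide, only the reaction containing C would immobilize the soluble oligonucleotide.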
  • C. Signal-Amplification [0063]
  • The sensitivity of nucleic acid hybridization detection assays may be increased by altering the manner in which detection is reported or signaled to the observer. Thus, for example, assay sensitivity can be increased through the use of detectably labeled reagents. A wide variety of such signal amplification methods have been designed for this purpose. Kourilsky et al. (U.S. Pat. No. 4,581,333) describe the use of enzyme labels to increase sensitivity in a detection assay. Fluorescent labels (Prober, J. et al. Science 238:336-340 (1987); Albarella et al., EP 144914), chemical labels (Sheldon III et al., U.S. Pat. No. 4,582,789; Albarella et al. U.S. Pat. No. 4,563,417), modified bases (Miyoshi et al., EP 119448), etc. have also been used in an effort to improve the efficiency with which hybridization can be observed. [0064]
  • It is preferable to employ fluorescent, and more preferably chromogenic (especially enzyme) labels, such that the identity of the incorporated nucleotide can be determined in an automated, or semi-automated manner using a spectrophotometer. [0065]
  • D. Use of GBA™ Analysis in the Methods of the Present Invention [0066]
  • GBA™ was developed as a solid-phase single nucleotide polymorphism genotyping method based on single-base extension of an interrogation primer across a target base of interest. In contrast to gel-based testing, a solid-phase array can be manufactured in a standardized way with quality control, thereby ensuring that variation in performance of the test is more a factor of input DNA quality and less of operator expertise. The present invention extends this method to an N-GBA™ format, with the complementary interrogation primers nested at one (or more) base intervals across the target sequence, and thus enables detailed sequence analysis of a complex target DNA sequence. While GBA™ is well suited to single-base interrogations, the N-GBA™ method of the present invention is ideally suited to analysis of intermediate length (10-100 base) DNA target sequences. Application of the N-GBA™ method in a Sequence Confirmation/composition ANalysis (SCAN™)-chip prototype (a miniaturized array of interrogation primers on a glass slide) is the most preferred embodiment of the method, and permits standardized (through manufacture of the oligonucleotide interrogation primer arrays), lower cost (through miniaturization of the test) and accurate (through use of the GBA™ biochemistry) scanning for p53 mutations. [0067]
  • The solid-phase format of the present invention also provides advantages in processing, since reagents can be added by hand at small scales, or by robots on a larger scale, without changes to the test. The size of the arrays can be controlled as well, so that the advantages of miniaturization can be realized: thus a 30 μl PCR reaction can be hybridized simultaneously to hundreds or thousands of oligonucleotides in an array only a few millimeters in diameter. In this way, processing can be performed at a “macro” scale, using standard pipetters, and information extracted at a “micro” or “nano” scale using fluorescent imaging. These advantages provide a lower cost test having much more reproducible performance. Unlike methods that rely on hybridization as the method of analysis, the methods of the present invention exploit the use of primer extension biochemistry for nucleotide-by-nucleotide analysis and its application to a solid-phase oligonucleotide array format. The addition of primer extension to solid-phase analysis adds significant increases in test accuracy and differential sensitivity over hybridization-based approaches while exploiting the advantages of solid-phase-based testing over gel-based tests. [0068]
  • This strategy of nesting the GBA™ across a region of interest eliminates any need to “expect” (i.e., guess in advance) a particular mutation. Nesting eliminates any need to limit analysis to a specific nucleotide. Current GBA™ detection technology is a “two-result” system (distinguishing wild-type from mutant). An additional innovation of the preferred embodiments of the present invention involves the use of a “four-result” system, which, by parallel detection of all four possible DNA bases for each site in the sequence, provides enhanced accuracy. With this innovation, the change of any nucleotide in the target region to any other possible base will be detectable in a base-specific fashion, thus any mutation in a proposed target hot-spot will be identifiable, including novel mutations. [0069]
  • In one embodiment, this is accomplished by separating the arrays into four identical array spots to which PCR or other amplified product can hybridize equally. The GBA™ extension is thus preferably divided into four reaction mixes, each containing a different haptenated dideoxynucleotide triphosphate (ddNTP). The four spots represent the four possible bases: G, A, T and C, and incorporation of each possible base can be evaluated for each oligonucleotide in the array and from this the sequence composition of the target fragment deduced. The SCAN™-chip format, utilizing N-GBA™ biochemistry, will thus enable: highly accurate mutation detection due to the sensitivity of primer extension to hybridization mismatch at the 3′ (extended) end of the interrogation primer; increased informativeness since the mutation is detected in a highly localized fashion; relatively standardizable and simple testing due to the SCAN™ format; and cost-effectiveness due to miniaturization of the arrays. [0070]
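  • The four-result readout for a single interrogation primer can be sketched as a simple comparison of the signals from the four spots. The relative threshold of 0.5, the signal values and the function name call_bases are assumptions made for illustration; the source does not specify a calling rule.

```python
def call_bases(signals, threshold=0.5):
    """Call the base or bases incorporated by one primer.

    signals   -- background-corrected signal for each of the G, A, T, C spots
    threshold -- fraction of the strongest signal needed to call a base
                 (an assumed value)
    """
    peak = max(signals.values())
    if peak <= 0:
        return ""  # no incorporation detected at any spot
    return "".join(sorted(base for base, value in signals.items()
                          if value / peak >= threshold))


homozygous = call_bases({"G": 1.10, "A": 0.05, "T": 0.04, "C": 0.02})
heterozygous = call_bases({"G": 1.00, "A": 0.90, "T": 0.02, "C": 0.03})
no_signal = call_bases({"G": 0.0, "A": 0.0, "T": 0.0, "C": 0.0})
```

  • With these hypothetical signals the first primer is called as G, the second as a mixture of A and G (as would be expected for a heterozygous site), and the third gives no call. Because every base is scored at every primer, any substitution is reported in a base-specific fashion rather than simply as a departure from wild-type.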
  • In accordance with the methods of the present invention, the target polynucleotide (i.e., the nucleic acid molecule that is to be sequenced) is provided to each array position of a spacially separated array of oligonucleotide primers in single-stranded form, under conditions sufficient to permit hybridization to occur. As used herein, an array of oligonucleotides is said to be “spacially separated” if an oligonucleotide of one sequence is separated from an oligonucleotide of another sequence. In the microminiaturized method described below, each oligonucleotide species of the array is provided to a separate microtiter well. In contrast, in the nanominiaturiuzed method, each oligonucleotide species of the array is provided to a distinct region of a surface, such as a glass slide, etc. As used herein, the term “array” is intended to define a two dimensional or three dimensional matrix having a definition of X,Y or X,Y,Z, such that, for example, at [0071] array position 1,1 a particular oligonucleotide is found; an oligonucleotide of different sequence is found at array position 1,2 or 2,1, etc. For each array, the oligonucleotide found at each array position is defined and known in advance of any reaction.
  • The sequence of each oligonucleotide of each array position is selected such that it will be shorter in length than the target polynucleotide being sequenced. Preferably, such oligonucleotides will be less than 30 bases in length, and most preferably less than 10 bases. Oligonucleotides of 5 bases in length are preferred. As such, if an oligonucleotide of N residues hybridizes to the target polynucleotide, its 3′ terminus (residue N) will hybridize to a nucleotide of the target polynucleotide, and can be extended via a template-dependent polymerization reaction to incorporate an “interrogation nucleotide” as residue N+1 of that oligonucleotide. The identity of the “interrogation nucleotide” is dependent upon (and is complementary to) the nucleotide species of the target polynucleotide that is present immediately 5′ adjacent to the nucleotide that hybridizes to the 3′ terminus of the oligonucleotide, prior to the polymerization reaction. [0072]
  • Each array position additionally contains more than one different nucleotide species, such that nucleotide species are present that are complementary to at least two, and in the most preferred embodiment, all four of the nucleotide species of DNA (i.e., adenosine, cytosine, thymidine and guanosine, designated A, C, T and G, respectively). The nucleotide species present are “chain terminator” nucleotides. Although such nucleotide species can be incorporated onto the 3′ terminus of an oligonucleotide by a DNA polymerase, the resultant extended oligonucleotide cannot be further extended by a polymerase, even in the presence of non-terminator nucleotides. The most preferred chain terminator nucleotide species of the present invention are 2′,3′-dideoxynucleoside 5′-triphosphates. The chain terminator nucleotide species are detectably labeled, such that an extension reaction that results in the incorporation of a nucleotide complementary to one of the nucleotide species of DNA can be distinguished from an extension reaction that results in the incorporation of a nucleotide complementary to a different nucleotide species of DNA. Any of the conventionally used radioisotopic, enzymatic, fluorescent or chemiluminescent labels may be used in accordance with the methods of the present invention. In lieu of such labels, haptenic labels, such as biotin or other labels such as ligands, antigens, etc. may be used. Suitable labels are disclosed, for example, by Kourilsky et al. (U.S. Pat. No. 4,581,333), Prober et al. (Science 238:336-340 (1987)), Albarella et al. (EP 144914), Sheldon III et al. (U.S. Pat. No. 4,582,789), Albarella et al. (U.S. Pat. No. 4,563,417), and Miyoshi et al. (EP 119448). [0073]
  • It is, however, preferred to employ the enzyme-mediated fluorescence precipitation method (Huang, Z. et al., Anal Biochem 207:32-39 (1992), herein incorporated by reference). In this method of detection, a fluorogenic signal is determined by precipitation at a localized reaction site. This novel detection chemistry actually combines the powers of enzymatic amplification, rapid in situ product precipitation, high contrast of fluorescence signal over (glass) background, and quantitation of fluorescent signal. The method thus provides greater sensitivity than direct fluorescence detection and is operationally compatible with a high density oligonucleotide glass array format. [0074]
  • A polymerase, and suitable salts and buffers are also provided to each array position. The reaction conditions are maintained such that the oligonucleotides stably and specifically hybridize to the target polynucleotide, and so that the 3′ termini of the oligonucleotides are extended by the addition of a single chain terminator nucleotide (i.e., the interrogation nucleotide). As used herein, “stable” hybridization refers to a hybridization that has a Tm greater than the temperature under which the interrogation assay is to be run (generally 20-40° C.). The term “specific” hybridization denotes that the length and/or sequence complexity of the oligonucleotides involved in the hybridization are sufficient to preclude non-desired spurious hybridization (as might occur, for example, between sequences that are only partially complementary). The hybridization is usually carried out for 15 to 30 minutes at room temperature in a solution containing 1.5 M NaCl and 10 mM EDTA. Other hybridization conditions can alternatively be used. The sequence of the immobilized oligonucleotide is selected such that it will hybridize to the invariant sequence that flanks the polymorphic site of the polymorphism that is to be interrogated. [0075]
  • If the ligase/polymerase mediated GBA™ interrogation method is to be employed, the methods of Nikiforov et al. (U.S. patent application Ser. No. 08/192,631, herein incorporated by reference) are preferably employed. [0076]
  • Most preferably, the oligonucleotides present at each array position are immobilized to the solid surface of the array support. Such a support may be a microtiter dish, test tube array, etched glass surface, etc. [0077]
  • II. Nature of the Oligonucleotide Array [0078]
  • The nature of the oligonucleotide array may vary depending upon the amount of prior sequence information available concerning the target molecule. In one embodiment of the invention, the array is “non-random.” As used herein, a “non-random” oligonucleotide array is a set of oligonucleotides whose members do not contain all possible permutations of nucleotides. A non-random array is preferably employed when determining the nucleotide sequence of a polynucleotide for which some a priori sequence information is available. Thus, for example, non-random arrays would be employed in sequencing those genes of a patient for which the sequence of “normal” alleles had been previously determined. In contrast, a “random” array of oligonucleotides is a set of oligonucleotides whose members do contain all possible permutations of nucleotides. A random array is preferably employed when determining the nucleotide sequence of a polynucleotide for which little or no a priori sequence information is available. [0079]
  • Primer design is preferably facilitated through the use of the GBA™ Primer 1.0 program (Molecular Tool, Inc.). Primer stability (measured in −kcal/mol) and potential sequence-based sources of noise are evaluated by this program. A number of sequence-based features can lead to GBA™ noise for a particular target site. The most common source of noise is template-independent noise (TIN), which results from self-priming by the GBA™ primer. To eliminate TIN, GBA™ primers may be modified by a base substitution with C3 linker or by shortening the primer at the 5′ end without sacrificing hybridization stability of the template strand. In the N-GBA™ system, a set of GBA™ primers which complement the target sequence and are staggered by one base will be designed according to the standard GBA™ primer design strategies described above. An example of N-GBA™ primer design was shown in a model study described in the relevant experience section. [0080]
  • A. Non-Random Nested Arrays [0081]
  • In circumstances in which the part of the sequence of the target molecule (or of a normal or reference sequence) has been previously determined, the oligonucleotide array can comprise a set of non-random nested oligonucleotides. [0082]
  • In the simplest embodiment, the nested primer array will contain all possible divergent sequences over the region whose sequence is to be determined. The maximum number of primers needed to determine the sequence of N nucleotides is given by the equation: [0083]
    $$\sum_{n=1}^{N} 4^{\,n-1}$$
  • Because the maximum number of primers needed to obtain the sequence of even a relatively small region rises rapidly when non-random arrays are employed, the method is not preferred when more extensive sequencing is desired. For example, a maximum of 349,525 primers would be needed to obtain 10 nucleotides of sequence information by this method. Hence, for obtaining such (or even more extensive) sequence information, the random array method described below is preferably employed. [0084]
  • Thus, to sequence four nucleotides in the simplest embodiment, a set of (1+4+16+64=) 85 primers would be needed. This aspect of the invention is illustrated in Table 1, which shows the sequences of four arrays of nested non-random 25-mer oligonucleotides (“oligos”), comprising all possible permutations of sequence on the final 3 nucleotides. The extent of nesting shown in Table 1 is one nucleotide; however, the array oligonucleotides can be nested by more than one nucleotide if desired. By using each oligonucleotide of a set as a GBA™ primer in a GBA™ reaction (either in the presence of three unlabeled terminator nucleotides and one labeled chain terminator nucleotide or in the presence of four differentially labeled chain terminator nucleotides), it is possible to determine the nucleotide sequence of the particular nucleic acid molecule of a sample that is complementary to the set of primers. [0085]
  • In some circumstances fewer primers may be employed. For example, if it were known that only one of two nucleotide candidates were possible at position 27 (e.g., either A or C, but not T or G), only (1+2+8=) 11 primers would be needed to sequence the three nucleotide positions of any particular target molecule. [0086]
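  • The primer counts quoted above can be reproduced with a short calculation. The following is a minimal sketch rather than part of the disclosed method: the function name `primers_needed` and the convention of passing one candidate count per variable 3′ position are assumptions introduced for illustration.

```python
def primers_needed(candidates_per_position):
    """Count nested primers: one anchor primer plus every combination of
    candidate bases over each successively longer run of variable positions.
    (Hypothetical helper, not taken from the source.)"""
    total = 1      # the single primer that reads the first unknown position
    variants = 1   # number of distinct 3' tails at the current nesting depth
    for count in candidates_per_position:
        variants *= count
        total += variants
    return total


# Four unknown positions, any base possible at each: 1 + 4 + 16 + 64
print(primers_needed([4, 4, 4]))   # 85
# Ten unknown positions, any base possible at each
print(primers_needed([4] * 9))     # 349525
# Three positions, with one variable position limited to two candidates: 1 + 2 + 8
print(primers_needed([2, 4]))      # 11
```

  • With four candidates at every position the loop evaluates the sum of 4^(n−1) for n = 1 to N, so the count grows roughly fourfold with each additional position to be sequenced.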
    TABLE 1
    SEQ ID NO   Nucleotide Sequence of Positions 1-25   Position Sequenced
    1           CTTGTGCTGACTTACCAGATGGGAC               26
    2           TTGTGCTGACTTACCAGATGGGACA               27
    3           TTGTGCTGACTTACCAGATGGGACC
    4           TTGTGCTGACTTACCAGATGGGACT
    5           TTGTGCTGACTTACCAGATGGGACG
    6           TGTGCTGACTTACCAGATGGGACAA               28
    7           TGTGCTGACTTACCAGATGGGACAC
    8           TGTGCTGACTTACCAGATGGGACAT
    9           TGTGCTGACTTACCAGATGGGACAG
    10          TGTGCTGACTTACCAGATGGGACCA
    11          TGTGCTGACTTACCAGATGGGACCC
    12          TGTGCTGACTTACCAGATGGGACCT
    13          TGTGCTGACTTACCAGATGGGACCG
    14          TGTGCTGACTTACCAGATGGGACTA
    15          TGTGCTGACTTACCAGATGGGACTC
    16          TGTGCTGACTTACCAGATGGGACTT
    17          TGTGCTGACTTACCAGATGGGACTG
    18          TGTGCTGACTTACCAGATGGGACGA
    19          TGTGCTGACTTACCAGATGGGACGC
    20          TGTGCTGACTTACCAGATGGGACGT
    21          TGTGCTGACTTACCAGATGGGACGG
    22          GTGCTGACTTACCAGATGGGACAAA               29
  • However, and as discussed above, the GBA™ reaction exploits the ability of the 3′ terminus of the GBA™ primer to hybridize to the target molecule being interrogated. This characteristic of the present invention permits sequence determinations with far fewer primers, depending upon the class of polymerase being employed in the GBA™ reaction. In general, there are two classes of polymerases. One class, typified by the Klenow fragment of E. coli DNA polymerase I (Klenow class), possesses 3′ to 5′ exonuclease activity, and is able to correct 3′ base mismatches in the extended primer. The second class, typified by the thermostable polymerase Thermosequenase (USB) (Thermosequenase class), does not possess 3′ to 5′ exonuclease activity, and is thus unable to correct 3′ base mismatches in the extended primer. Polymerases of either class can be employed in accordance with the present invention. The characteristics of polymerases are shown in Table 2. [0087]
    TABLE 2
                       3′ to 5′ Exonuclease   Ability to Correct   Possible Outcome
    Enzyme             Activity               Mismatch             (Signal:Noise)
    Klenow Fragment    Strong                 Strong               Low
    Exo(−) Klenow      None                   Moderate             Moderate
    Sequenase          None                   Moderate             Moderate
    AmpliTaq           None                   Weak                 High
    Bst Polymerase     None                   Weak                 High
    Thermosequenase    None                   Weak                 High
  • Since Thermosequenase class polymerases do not possess 3′ to 5′ exonuclease activity, unless a priori sequence information is available, it is preferable to employ each oligonucleotide in a nested set of all possible permutations. Nevertheless, in many circumstances incomplete sets of oligonucleotides may be employed in concert with Thermosequenase class polymerases. For example, if SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:22 were employed to sequence a target having a sequence other than GTTT at positions 25-28, one or more of the oligonucleotides would fail to hybridize its 3′ terminus to the target, and minimal nucleotide incorporation would result. Hence a result indicating that GBA™ reactions on a particular target molecule led to the incorporation of a label for SEQ ID NO:1 and SEQ ID NO:2, but not for SEQ ID NO:6 or SEQ ID NO:22 would indicate that the nucleotides at positions 27-28 were not Ts. In one embodiment, such an observation of impaired incorporation is a useful indication that the sequence of the target molecule differs from that of the reference allele. As such, this embodiment is useful in identity and paternity analysis, and in genetic screening. [0088]
  • In contrast, since Klenow class polymerases can correct mismatches as well as extend primers, when such polymerases are employed in the GBA™ reaction, incorporation of label may reflect primer repair as well as primer extension. Thus, the use of Klenow class polymerases in the present invention has a salient advantage. Instead of needing to provide all permutations of the sequence to be determined, one need provide only one oligonucleotide for each position to be determined. Thus, to determine the sequence of positions 26-29 in the example shown above, one would need to provide at most 4 oligonucleotides (i.e., an oligonucleotide, such as SEQ ID NO:1, ending at position 26; an oligonucleotide, such as SEQ ID NO:2, ending at position 27; an oligonucleotide, such as SEQ ID NO:6, ending at position 28; and an oligonucleotide, such as SEQ ID NO:22, ending at position 29). [0089]
  • Thus, when Klenow class polymerases are employed, two possibilities exist with respect to such an array: a particular nucleotide may become labeled by extension, or it may become labeled by primer mismatch repair. In general, only a single unambiguous sequence will be obtained. For example, Table 3 gives the results that would be obtained from the use of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:6 and SEQ ID NO:22 to evaluate a particular target molecule having the sequence CATGCG at nucleotide positions 25-30. [0090]
    TABLE 3
                                                     Position No. of
    SEQ ID                                           Nucleotide Sequenced   Nucleotide
    NO.      Nucleotide Sequence of Positions 1-25   by Array               Reported
    23       CCGTACTCCCATCTCGTAAGTCAGCACAAG
    1        CTTGTGCTGACTTACCAGATGGGAC               25                     C
    2        TTGTGCTGACTTACCAGATGGGACA               25                     G
    6        TGTGCTGACTTACCAGATGGGACAA               28                     C
    22       GTGCTGACTTACCAGATGGGACAAA               28                     C
  • In the case of SEQ ID NO:1, the incorporation of G reflects the removal of the 3′ terminal C residue, and the incorporation of a G (as the nucleotide complementary to the C at position 25 in the target). In the case of SEQ ID NO:2, the incorporation of G reflects the removal of the 3′ terminal A and C residues, and the incorporation of a G (as the nucleotide complementary to the C at position 25 in the target). In the case of SEQ ID NO:6, the incorporation of T reflects the hybridization of the 3′ terminus of the primer to the target, and the extension of the primer by one nucleotide (C, the nucleotide complementary to the C at position 25 in the target). In the case of SEQ ID NO:22, the incorporation of C reflects the removal of the 3′ terminal A and A residues, and the incorporation of a C (as the nucleotide complementary to the G at position 28 in the target). [0091]
  • Such sequence assignments flow from the known rules of base pairing. In the above example, the incorporation of G, G, C and C could not mean that positions 26-29 of the target were CCGG, because such a nucleotide sequence is incompatible with the (known) sequence of the 3′ terminus of SEQ ID NO:22. In a similar manner, consideration of the known sequences of the oligonucleotides reveals the nucleotide position being reported by a particular nucleotide of the array. In a preferred embodiment, such consideration is facilitated by performing separate sequence determinations with both a nucleotide array and its complement (such that the sequences of both strands of a target molecule are obtained). [0092]
  • Table 4 illustrates typical oligonucleotide arrays by displaying sets of oligonucleotides sufficient to permit sequence analysis, with a Klenow class polymerase, of exon 23 of the human BRCA1 gene (E23) and of the cystic fibrosis (CF) gene (at the locus of nucleotide 549). In the Table, the sign (+/−) indicates the strand of the target being sequenced; the number (1-6) indicates the position of the target being interrogated. [0093]
    TABLE 4
    SEQ ID NO  Oligo  Nucleotide Sequence
    24 E23+1 CTTGTGCTGACTTACCAGATGGGAC
    25 E23+2  TTGTGCTGACTTACCAGATGGGACA
    26 E23+3   GTGCTGACTTACCAGATGGGACAC
    27 E23+4    TGCTGACTTACCAGATGGGACACT
    28 E23+5     TGCTGACTTACCAGATGGGACACTC
    29 E23+6      GCTCACTTACCAGATGGGACSCTCT
    30 E23−1 GTCATTATGCTATGCAGAAAATCT
    31 E23−2  TCATTAATGCTATGCAGAAATCTT
    32 E23−3   CATTAATGCTATGCAGAAATCTTA
    33 E23−4    ATTAATGCTATCCAGAAATCTTAG
    34 E23−5     TTAATGCTATGCAGAAATCTTAGA
    35 E23−6      TATGCTATGCAGAAAATCTTAGAG
    36 CF549+1 AAAGAAATTCTTGCTCGTTGACCTC
    37 CF549+2  AAGAAATTCTTGCTCGTTGACCTCC
    38 CF549+3   AGAAATTCTTGCTCGTTGACCTCCA
    39 CF549+4    GAAATTCTTGCTCGTTGACCTCCAC
    40 CF549+5     AAATTCTTGCTCGTTGACCTCCACT
    41 CF549−1 TTCTTGGAGAAGGTGGAATCACACT
    42 CF549−2  TCTTGGAGAAGGTGGAATCACACTG
    43 CF549−3   CTTGGAGAAGGTGGAATCACACTGA
    44 CF549−4    TTGGAGAAGGTGGAATCACACTGAG
    45 CF549−5     TGGAGAAGGTGGAATCACACTGAGT
  • As will be recognized, the use of a Klenow class polymerase permits sequence determinations using far fewer than the maximum number of oligonucleotides that would otherwise be required. Nevertheless, because repair of mismatches may complicate analysis, Thermosequenase class polymerases are the preferred polymerases of the present invention. Since such polymerases do not repair mismatches, they are preferably used in embodiments in which oligonucleotides having all possible permutations of 3′ sequence are provided, or more preferably, in embodiments in which two oligonucleotide arrays are employed (one complementary to one strand, and the other complementary to the second strand). [0094]
  • B. Random Nested Arrays [0095]
  • Whereas the non-random nested array method described above is predicated on providing the target molecule with hybridization oligonucleotides that possess the exact sequence of the target, the random nested array method is predicated on deriving sequence information from the pattern of oligonucleotides of the array that are extended in the GBA™ reaction as well as from the identity of the nucleotide added to each extended oligonucleotide. [0096]
  • In the method, an array of oligonucleotide primers is employed. The lengths of the primers are most preferably uniform, and can vary from 6-20 nucleotides in length. For an array of oligonucleotides of N nucleotides in length, there are 4^N possible sequence permutations. However, because each oligonucleotide can (if hybridized to target in a GBA™ reaction) be extended by one nucleotide, the use of an array of random primers of N nucleotides in length can generate sequence information for 4^(N+1) nucleotides. Hence, an array of 4,096 oligonucleotides (comprising a random permutation of all possible 6-mers) could simultaneously sequence 16,384 bases of a target molecule. [0097]
  • The random array method may be illustrated as follows. An array of all possible 6-mers is prepared such that the x,y array location and sequence of each oligonucleotide of the array is known. Each array position is incubated with the same target polynucleotide, and a GBA™ reaction is conducted for each array position. These parallel (or sequential) reactions lead to the formation of a sequence signature consisting of array positions whose oligonucleotides have not been extended, and those whose oligonucleotides have been extended by addition of A, T, C or G. [0098]
  • One array position for an extended oligonucleotide is selected at random (although, in a preferred automated mode, multiple positions may be processed in parallel). The sequence of the extended oligonucleotide at the selected array position is determined using the oligonucleotide's initial (predefined 6-mer sequence) and the identity of the labeled nucleotide added to the oligonucleotide's 3′ terminus in the GBA™ reaction. This determination defines a second 6-mer oligonucleotide (consisting of nucleotides 2-7 of the selected oligonucleotide). The array location of this second 6-mer position is identified, and the extension product formed by the oligonucleotide at that array position is determined. Such sequence information defines a third 6-mer oligonucleotide (consisting of nucleotides 2-7 of the second selected oligonucleotide). In like manner, the entire sequence stored in the array can be deduced. [0099]
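  • The chaining procedure just described can be sketched in a few lines. The code below is an idealized illustration under stated assumptions: every 6-mer occurs at most once in the region, hybridization and extension are error-free, and the "signature" is simulated from a known strand rather than measured. The function names `simulate_signature` and `walk`, and the use of SEQ ID NO:1 as demonstration data, are choices made for this example only.

```python
from itertools import product


def simulate_signature(read_strand, k):
    """Idealized array readout: map each k-mer to the base it gains, or None
    if the oligonucleotide at that array position is not extended."""
    signature = {"".join(p): None for p in product("ACGT", repeat=k)}
    for i in range(len(read_strand) - k):
        signature[read_strand[i:i + k]] = read_strand[i + k]
    return signature


def walk(signature, start):
    """Chain extended k-mers: the last k bases of each extended primer
    identify the next array position to consult."""
    k = len(start)
    sequence, current, visited = start, start, {start}
    while signature.get(current) is not None:
        sequence += signature[current]
        current = sequence[-k:]
        if current in visited:   # repeated k-mer: stop rather than loop forever
            break
        visited.add(current)
    return sequence


read_strand = "CTTGTGCTGACTTACCAGATGGGAC"   # SEQ ID NO:1, used only as demo data
signature = simulate_signature(read_strand, k=6)
rebuilt = walk(signature, read_strand[:6])
print(len(signature), rebuilt == read_strand)   # 4096 True
```

  • This sketch walks in one direction only from a chosen starting position; a starting position selected at random, as in the text, would also require the corresponding backward walk, and repeated 6-mers in a real target would make some steps ambiguous.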
  • A salient feature of the use of the GBA™ reaction in accordance with the methods of the present invention is the capacity to miniaturize such methods, resulting in a savings of space, reagents, and time, and providing increased throughput and reliability. [0100]
  • III. Microminiaturized Analysis Method [0101]
  • In one embodiment, a microminiaturized analysis format is employed. As used herein, a microminiaturized reaction is one conducted in a reaction volume of greater than 50 μl, but less than 200 μl, and most preferably less than 100 μl. Such analysis is most preferably conducted in 96 well microtiter well plates, using the indirect fluorescent colorimetry method of Huang, Z. et al. (Anal Biochem 207:32-39 (1992)), and the use of liquid handling robots to deliver reagents. [0102]
  • A preferred format involves designing the GBA™ primers so that they are associated with biotinylated spacer arms sufficient to permit them to become bound to a glass or plastic support (such as a glass slide, etc.). This attachment approach has the advantage of high specificity and results in minimal nonspecific backgrounds during attachment and hybridization. A preferred glass slide support for oligonucleotide immobilization has wells of exposed glass surrounded by a hydrophobic Teflon coating (Cel-line Associates, Inc.). The plates have 12 wells (7 mm in diameter), and are designed such that solutions can be dispensed with standard, multichannel pipetting instruments, and signals can be read on existing plate readers. Avidin will be covalently attached onto a glass slide using our proprietary attachment chemistry. A 50 μl solution of 0.4 μM biotinylated oligonucleotide will then be added to each well, and incubated for 2 hrs, then rinsed with TNTw (10 mM Tris-HCl, pH7.5, 150 mM NaCl, 0.05% Tween-20). [0103]
  • IV. Nanominiaturized Analysis Method [0104]
  • In an alternate embodiment, a nanominiaturized analysis format is employed. As used herein, a nanominiaturized reaction is one conducted in a reaction volume of less than 50 μl, and most preferably less than 10 μl. In a preferred nanominiaturized embodiment, the support will be an etched glass plate that will hold several hundred to several thousand nanowells (0.1-5 μl volume per well), such that entire arrays can be evaluated simultaneously. The determination of the result of the GBA™ reaction will most preferably be performed via automated processing using, for example, a pixel by pixel CCD camera equipped to distinguish the labels of the nucleotides being employed. Detection of the extension may be accomplished using a variety of labels; however, two detection schemes are preferred: i) direct fluorescence detection on glass, and ii) enzyme-mediated fluorescence detection. [0105]
  • Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified. [0106]
  • EXAMPLES
  • Example 1 Nested GBA™ Analysis
  • In order to demonstrate the biochemical feasibility of adapting GBA™ technology to determine all 4 bases at each nucleotide position within a string of sequence, the following N-GBA™ experiment was conducted. A target polynucleotide having the sequence: [0107]
    SEQ ID NO:46
    (Wild-type) 5′ CCAGAAGAAA GGGCCTTCAC AGTGTCCTTT
    ATGTAAGAAT GATATAACC-3′
    or
    SEQ ID NO:47
    (Mutant) 5′ CCAGAAGAAA GGGCCTTCAC AGGGTCCTTT
    ATGTAAGAAT GATATAACC-3′
  • was interrogated with a set of primers that had been immobilized onto the surface of a 96 well microtiter plate in order to type the central five bases (shown in boldface) of the “wild-type” sequence (AGTGT) and of a single-base “mutant” sequence (AGGGT). The primers used had the following sequence: [0108]
    SEQ ID NO:48 (Primer 1) 5′ GGTTATATCATTCTTACATAAAGG 3′
    SEQ ID NO:49 (Primer 2) 5′ GTTATATCATTCTTACATAAAGGA 3′
    SEQ ID NO:50 (Primer 3) 5′ TTATATCATTCTTACATAAAGGAC 3′
    SEQ ID NO:51 (Primer 4) 5′ TATATCATTCTTACATAAAGGACA 3′
    SEQ ID NO:52 (Primer 5) 5′ ATATCATTCTTACATAAAGGACAC 3′
  • Two commercially available DNA polymerases, the Klenow fragment of E. coli DNA polymerase I and the thermostable Thermosequenase (USB), were used for the single-base extension reaction. Primers were immobilized onto a polystyrene plate via cationic detergent (Octyldimethylamine) promoted passive adsorption (Nikiforov, T. T. et al., Anal Biochem 227:201-209 (1995)) at defined locations. The wild-type and mutant templates were hybridized to the immobilized GBA™ primers, and the 3′ ends of the GBA™ primers were extended by a single fluorescent labeled chain terminator ddNTP by either Klenow or Thermosequenase. Enzyme-mediated fluorescence signals were obtained using the Cytofluor II fluorescent plate reader. The results of the experiment are shown in Table 5. [0109]
  • As shown in Table 5, the final colorimetric readouts from the extensions of Klenow fragment and Thermosequenase with the matching primer set and wild-type template were consistent with the true base sequence. When the mutant template was present, however, the two DNA polymerases gave quite different readout patterns. Klenow, known for its 3′ to 5′ exonuclease activity, was able to correct the 3′ base mismatches of Primers 4 and 5 with the mutant template and extend only the C base from these primers. On the other hand, Thermosequenase could not repair and extend at any of these mismatches, resulting in a lack of signal for both Primers 4 and 5. Either enzyme could produce very distinct and differential patterns of colorimetric readout for the wild-type and mutant templates, demonstrating the use of this N-GBA™ approach to screen for mutations. [0110]
    TABLE 5
                                                  Base Extended
    Polymerase        Template    Primer Used   A      G      T      C
    Klenow            Wild-type   Primer 1      2.0    0.43   0.15   0.24
                                  Primer 2      0.94   0.42   0.16   2.0
                                  Primer 3      2.0    0.30   0.15   0.79
                                  Primer 4      0.50   0.15   0.19   1.9
                                  Primer 5      0.28   0.14   1.7    0.55
                      Mutant      Primer 1      0.96   0.45   0.15   0.16
                                  Primer 2      0.88   0.43   0.17   1.1
                                  Primer 3      0.55   0.33   0.23   1.8
                                  Primer 4      0.26   0.16   0.10   1.3
                                  Primer 5      0.25   0.15   0.11   1.3
    Thermosequenase   Wild-type   Primer 1      2.2    0.28   0.11   0.14
                                  Primer 2      0.33   0.18   0.15   2.1
                                  Primer 3      2.1    0.16   0.12   0.22
                                  Primer 4      0.20   0.11   0.15   2.1
                                  Primer 5      0.15   0.12   2.2    0.16
                      Mutant      Primer 1      1.2    0.19   0.11   0.12
                                  Primer 2      0.23   0.19   0.18   1.3
                                  Primer 3      0.22   0.14   0.13   1.5
                                  Primer 4      0.10   0.10   0.10   0.16
                                  Primer 5      0.10   0.10   0.15   0.14
  • Overall, these data reveal two important points: 1) Thermosequenase reduced template dependent noise, as reflected in its better S:N ratios when compared to Klenow, and 2) Thermosequenase did not extend at a non-specific base (i.e., it stopped when the primer overlapped the non-specific base), thus clearly indicating a mismatch which can be used to locate the position of the mutation. These advantages suggest that the exonuclease-free Thermosequenase enzyme is better suited to the N-GBA™ technology; however, since Thermosequenase does not give false data at a mismatch, SCAN™ must also be performed from the other strand to determine the sequence that follows the mutation. [0111]
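  • The qualitative Thermosequenase pattern in Table 5 can be reproduced with a simple model. The sketch below assumes an idealized exonuclease-free polymerase that extends a primer only when the whole primer is perfectly complementary to the template; it ignores partial hybridization, signal intensity and noise. The function names are hypothetical, and the sequences are transcribed from SEQ ID NOs: 46-52, with the wild-type template written so that it differs from the mutant at a single base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def single_base_extension(template, primer):
    """Return the terminator base added to `primer`, or None when the primer
    has no perfectly matched site (idealized exonuclease-free polymerase)."""
    site = reverse_complement(primer)   # template stretch the primer pairs with
    start = template.find(site)
    if start <= 0:   # -1: no perfect match; 0: no template base left to copy
        return None
    # The next template base lies immediately 5' of the hybridization site.
    return COMPLEMENT[template[start - 1]]


WILD_TYPE = "CCAGAAGAAAGGGCCTTCACAGTGTCCTTTATGTAAGAATGATATAACC"
MUTANT = "CCAGAAGAAAGGGCCTTCACAGGGTCCTTTATGTAAGAATGATATAACC"
PRIMERS = [
    "GGTTATATCATTCTTACATAAAGG",   # Primer 1
    "GTTATATCATTCTTACATAAAGGA",   # Primer 2
    "TTATATCATTCTTACATAAAGGAC",   # Primer 3
    "TATATCATTCTTACATAAAGGACA",   # Primer 4
    "ATATCATTCTTACATAAAGGACAC",   # Primer 5
]

wild_calls = [single_base_extension(WILD_TYPE, p) for p in PRIMERS]
mutant_calls = [single_base_extension(MUTANT, p) for p in PRIMERS]
print(wild_calls)     # ['A', 'C', 'A', 'C', 'T']
print(mutant_calls)   # ['A', 'C', 'C', None, None]
```

  • In this model the first primer that reports a base different from the wild-type call (Primer 3) marks the mutated position, and the primers that overlap it (Primers 4 and 5) give no call, which is why the opposite strand is needed to read beyond the mutation.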
  • Example 2 Nested GBA™ Analysis On Glass Slides
  • The feasibility of conducting nested GBA™ (N-GBA™) reactions on glass slides was evaluated. For this purpose, 25-mer GBA™ primers were 5′ specifically attached onto the surface of a glass slide via avidin-biotin affinity interactions. The glass slides had wells of exposed glass surrounded by hydrophobic Teflon coating (Cel-line Associates, Inc.). The 12 wells were 7 mm in diameter, and were designed such that solutions could be dispensed with standard, multichannel pipetting instruments, and signals could be read on existing plate readers. A 50 μl solution of 0.4 μM biotinylated oligonucleotide was added to each well, incubated for 2 hrs (1.5 mM NaCl, 10 mM EDTA, and 0.5 μM target synthetic template strands), and then rinsed with TNTw (10 mM Tris-HCl, pH7.5, 150 mM NaCl, 0.05% Tween-20). [0112]
  • GBA™ biochemistry (Nikiforov, T. T. et al., Nucl Acids Res 22:4167-4175 (1994); Nikiforov, T. T. et al., PCR Methods and Apps 3:285-291 (1994), both herein incorporated by reference) was used to analyze the synthetic templates; each synthetic template was split into four different wells, and each well was treated with extension mix containing all the extension reaction components, exonuclease-free Klenow fragment of E. coli polymerase, and each of four fluorescein-labeled ddNTPs and co-ddNTPs. Enzyme-mediated fluorescence signals were obtained using the Cytofluor II fluorescent plate reader. Synthetic template 1 was designed to give a GBA™ signal in base A, and synthetic template 2 was designed to give a GBA™ signal in base G. [0113]
  • The GBA™ extension reactions are detected using the enzyme-mediated fluorescence precipitation method (Huang, Z. et al., Anal Biochem 207:32-39 (1992); Huang, Z. et al., J Histochem Cytochem 41:313-317 (1993)). The glass slide containing the fluorescein GBA™ signal is incubated for about 30 minutes with anti-fluorescein alkaline phosphatase solution under a blocking condition commonly used in ELISA or histochemical procedures. After washing, a droplet of an alkaline phosphatase fluorogenic precipitating substrate solution (Molecular Probes) is applied to either individual reaction wells or the entire slide. Following a 15 minute incubation and wash, the GBA™ signal can be immediately visualized under a conventional fluorescent microscope equipped with a 360 nm excitation filter and a 530 nm emission filter, or quantitated by a fluorescence microtiter plate scanner (Cytofluor II) equipped with the same filter set. [0114]
  • The results of this experiment are shown in FIG. 1. The results were as expected: both templates gave strong signals in the correct bases with virtually no noise observed in the other bases (the S:N ratio ranged from 14.2 to 28). This experiment demonstrated the feasibility of performing GBA™ biochemistry on glass, and of detecting the GBA™ signal by sensitive enzyme-mediated fluorescence detection using a commercially available fluorescent plate reader, the Cytofluor II. The high quality of the results strongly suggests that the proposed N-GBA™ biochemistry should perform very well on a glass surface when combined with the enzyme-mediated fluorescence detection, and puts us on the path towards a low-cost miniaturizable GBA™ processing format. [0115]
  • Example 3 Nested GBA™ Analysis of the BRCA1 Gene
  • The feasibility of utilizing the nested GBA™ (N-GBA™) approach to accurately identify mutations in exon 23 of the human BRCA1 gene was evaluated. [0116]
  • Mutations in the human BRCA1 gene have been implicated in familial breast cancer. In particular, a mutation located at position 354-359 of the normal (wild-type) sequence (TAGAGT) has been correlated with familial breast cancer. Primers having the sequences SEQ ID NOs: 24-29 and 30-35 were used to sequence sample BRCA1 genes (Table 6). [0117]
    TABLE 6
    SEQ ID NO Nucleotide Sequence of E23
    53 Oligo TCTTAGAGTGTCCCATCTGGTAAGTCAGCACAAG
    24 E23+1 CTTGTGCTGACTTACCAGATGGAC
    25 E23+2  TTGTGCTGACTTACCAGATGGACA
    26 E23+3   TGTGCTGACTTACCAGATGGACAC
    27 E23+4    GTGCTGACTTACCAGATGGACACT
    28 E23+5     TGCTGACTTACCAGATGGACACTC
    29 E23+6      GCTGACTTACCAGATGGACACTCT
    SEQ ID NO Nucleotide Sequence
    54 Oligo GACACTCTAAGATTTTCTGCATAGCATTAATGAC
    30 E23−1 GTCATTAATGCTATGCAGAAAATCT
    31 E23−2  TCATTAATGCTATGCAGAAAATCTT
    32 E23−3   CATTAATGCTATGCAGAAAATCTTA
    33 E23−4    ATTAATGCTATGCAGAAAATCTTAG
    34 E23−5     TTAATGCTATGCAGAAAATCTTAGA
    35 E23−6      TAATGCTATGCAGAAAATCTTAGAG
  • [0118]
  • Nested GBA™ reactions were performed using Klenow and exo-Klenow polymerases and fluorescein-labeled ddNTPs. The results of this experiment are shown in Table 7. [0119]
    TABLE 7
    Nested GBA™ Reaction Using “+” Template Strand of E23 of BRCA1
                  A             C             G             T
    Primer Used   Signal  TIN   Signal  TIN   Signal  TIN   Signal  TIN
    Nucleotide Extended Using Exo-Klenow
    E23+1         3.07    0.15  0.93    0.12  0.38    0.19  0.38    0.12
    E23+2         0.94    0.14  2.97    0.15  0.35    0.16  0.45    0.14
    E23+3         0.42    0.11  0.41    0.10  0.26    0.11  2.28    0.10
    E23+4         0.49    0.15  3.21    0.16  0.38    0.34  0.75    0.16
    E23+5         0.37    0.14  0.49    0.14  0.31    0.14  2.70    0.13
    E23+6         3.13    0.22  0.40    0.18  0.62    2.82  0.62    0.16
    Nucleotide Extended Using Klenow
    E23+1         3.34    0.38  1.38    0.21  0.81    0.21  0.41    0.15
    E23+2         1.55    0.35  2.96    0.15  0.58    0.19  0.44    0.17
    E23+3         1.04    0.38  1.17    0.13  0.52    0.16  3.08    0.12
    E23+4         1.03    0.57  3.35    0.13  0.69    0.45  1.29    0.22
    E23+5         0.57    0.12  1.54    0.15  0.47    0.31  3.34    0.19
    E23+6         3.36    0.31  0.88    0.17  1.41    1.64  1.12    0.18
  • The results shown in Table 7 thus show that Klenow and Exo-Klenow gave the same sequence (ACTCTA) for the primer extension, thereby indicating that the “+” strand of the E23 locus being sequenced had the complementary sequence (5′ TAGAGT 3′). To confirm this result, a nested GBA™ reaction was performed using the “−” template strand of E23 of BRCA1. The results of this experiment are shown in Table 8. [0120]
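  • The way a sequence is read from Table 7 can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not part of the disclosed method: the base with the highest signal is called for each nested primer, and the extended sequence is then reverse-complemented to give the template sequence. The names `TABLE7_EXO_KLENOW`, `call_bases` and `reverse_complement` are hypothetical; the numbers are the Exo-Klenow signal values of Table 7.

```python
# Signal values (A, C, G, T) for primers E23+1 .. E23+6,
# copied from the Exo-Klenow rows of Table 7 (TIN columns omitted).
TABLE7_EXO_KLENOW = [
    (3.07, 0.93, 0.38, 0.38),
    (0.94, 2.97, 0.35, 0.45),
    (0.42, 0.41, 0.26, 2.28),
    (0.49, 3.21, 0.38, 0.75),
    (0.37, 0.49, 0.31, 2.70),
    (3.13, 0.40, 0.62, 0.62),
]

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_bases(rows):
    """Call, for each primer, the base whose signal is highest."""
    return "".join(BASES[max(range(4), key=lambda i: row[i])] for row in rows)


def reverse_complement(seq):
    """Return the reverse complement of a sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


extended = call_bases(TABLE7_EXO_KLENOW)   # bases added to primers 1..6
template = reverse_complement(extended)    # sequence of the "+" strand
print(extended, template)
```

    Taking the maximum signal is the simplest possible calling rule; it ignores the TIN columns and would need a cutoff to handle primers that are not extended at all.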
    TABLE 8
    Nested GBA™ Reaction Using “−” Template Strand of E23 of BRCA1
                  A             C             G             T
    Primer Used   Signal  TIN   Signal  TIN   Signal  TIN   Signal  TIN
    Nucleotide Extended Using Exo-Klenow
    E23−1         0.54    0.08  0.28    0.11  0.43    0.22  2.00    0.12
    E23−2         2.04    0.09  0.19    0.08  0.31    0.08  0.68    0.08
    E23−3         0.69    0.11  0.16    0.10  3.29    0.09  0.48    0.08
    E23−4         3.15    0.36  0.18    0.14  0.44    0.10  0.69    0.09
    E23−5         0.26    0.14  0.17    0.09  2.34    0.10  1.00    1.37
    E23−6         0.38    0.10  0.24    0.09  0.47    0.09  2.57    0.24
    Nucleotide Extended Using Klenow
    E23−1         0.47    1.24  0.78    0.12  0.66    0.37  3.43    0.70
    E23−2         1.78    0.12  0.26    0.08  0.24    0.1   0.96    0.11
    E23−3         1.48    0.16  0.31    0.09  3.42    0.24  0.97    0.51
    E23−4         3.30    0.22  0.29    0.08  0.80    0.15  0.72    0.30
    E23−5         0.53    0.22  0.17    0.09  2.29    0.11  2.57    1.85
    E23−6         0.73    0.14  0.25    0.11  0.95    0.17  3.48    0.42
  • The results shown in Table 8 thus show that Klenow and Exo-Klenow gave the same sequence (TAGAGT) for the primer extension, thereby indicating that the “−” strand of the E23 locus being sequenced had the complementary sequence (5′ ACTCTA 3′). [0121]
  • In order to demonstrate the ability of the present invention to discern mutations in the E23 locus, two additional experiments were performed. In the first experiment, a target strand having a deletion of the AG nucleotides was analyzed. In the second, a mixture of the normal and AG deletion target templates was made and analyzed via the nested GBA™ method. The first experiment thus discerns the profile that would be presented by an individual having a homozygous mutation in the E23 locus, while the second experiment analyzes the profile that would be presented by an individual having heterozygosity in this region. The results of these experiments are shown in Tables 9 and 10. [0122]
    TABLE 9
    Nested GBA™ Reaction Using “AG” Deletion in “+” Template Strand of E23 of BRCA1
              Nucleotide Extended
    Primer    A Signal   C Signal   G Signal   T Signal
    E23+1     2.77       0.13       0.11       0.1
    E23+2     0.22       2.45       0.11       0.11
    E23+3     0.12       0.13       0.1        1.51
    E23+4     2.53       0.1        0.1        0.11
    E23+5     1.53       0.1        0.11       0.12
    E23+6     0.24       0.11       0.61       0.13
  • The results shown in Table 9 define an extended sequence for this sample of ACTAA, thereby indicating that the “+” strand of the E23 locus being sequenced had the complementary sequence (5′ TTAGT 3′) (see SEQ ID NO:53). The observed sequence is explained as follows: Primers 1-3 sequence bases that precede the deletion, and hence report the wild-type sequence (ACT). Primer 4, which ends just before the deletion, reports the first nucleotide of the target strand that follows the deletion (i.e., A). Primer 5, when hybridized to the deletion target, ends with a one-base mismatch, which is removed by the polymerase. The truncated hybridized Primer 5 then sequences the same nucleotide as that sequenced by Primer 4. Primer 6, which has a two-base mismatch, is not extended in the reaction. [0123]
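  • Reading Table 9 requires one extra step compared with a simple "highest signal wins" rule: a primer whose best signal stays near background (Primer 6) should yield no call. The sketch below illustrates this with a cutoff of 1.0, which is an assumption chosen for this example and is not stated in the source; `TABLE9`, `THRESHOLD` and `call_base` are likewise hypothetical names.

```python
# Signals (A, C, G, T) for primers E23+1 .. E23+6 from Table 9.
TABLE9 = {
    "E23+1": (2.77, 0.13, 0.11, 0.10),
    "E23+2": (0.22, 2.45, 0.11, 0.11),
    "E23+3": (0.12, 0.13, 0.10, 1.51),
    "E23+4": (2.53, 0.10, 0.10, 0.11),
    "E23+5": (1.53, 0.10, 0.11, 0.12),
    "E23+6": (0.24, 0.11, 0.61, 0.13),
}

BASES = "ACGT"
THRESHOLD = 1.0  # assumed cutoff separating extension from background


def call_base(signals, threshold=THRESHOLD):
    """Return the called base, or None if no signal reaches the cutoff."""
    best = max(range(4), key=lambda i: signals[i])
    return BASES[best] if signals[best] >= threshold else None


calls = {primer: call_base(signals) for primer, signals in TABLE9.items()}
extended = "".join(base for base in calls.values() if base is not None)
print(calls)
print(extended)
```

    Under this assumed cutoff, Primers 1-5 give calls and Primer 6 gives none, which matches the interpretation given for Table 9.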
    TABLE 10
    Nested GBA™ Reaction Using A Mixture Of Templates: “AG” Deletion And Normal “+” Template Strand of E23 of BRCA1
              Nucleotide Extended
    Primer    A Signal   C Signal   G Signal   T Signal
    E23+1     3.31       0.39       0.34       0.36
    E23+2     0.5        2.87       0.39       0.41
    E23+3     0.29       0.35       0.33       1.51
    E23+4     1.97       2.53       0.38       0.4
    E23+5     0.51       0.38       0.53       1.64
    E23+6     3.21       0.4        0.41       0.43
  • Table 10 reveals that Primers 1-3 were extended as expected to yield extension products A, C, and T, respectively, for both wild-type and AG deletion target molecules. The presence of wild-type target results in the extension of Primer 4 with a C residue (consistent with the results obtained above; see Table 7). Similarly, the presence of the wild-type target causes Primers 5 and 6 to be extended by T and A, respectively (see Table 7). The presence of the AG deletion target causes Primer 4 to be extended by an A (consistent with the result shown in Table 9). Consistent with the fact that the target mixture is 1:1 wild-type:mutant, the signals of A and C addition for Primer 4 are approximately equal. Neither Primer 5 nor Primer 6 is extended when hybridized to the AG deletion target because their 3′ terminal nucleotides would not be base-paired with the AG deletion target mutant. The failure of Primer 5 to be extended when hybridized to the AG mutant reflects the relatively lower binding avidity of the polymerase for Primer 5:mutant duplexes as compared to Primer 5:wild-type duplexes (in which there would be no mismatch). [0124]
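  • A mixed (heterozygous) position shows up in Table 10 as a primer with two bases above background. The following sketch flags such positions; it assumes a cutoff of 1.0 (an illustrative value, not taken from the source) and uses hypothetical names (`TABLE10`, `bases_above`).

```python
# Signals (A, C, G, T) for primers E23+1 .. E23+6 from Table 10
# (1:1 mixture of wild-type and AG deletion templates).
TABLE10 = {
    "E23+1": (3.31, 0.39, 0.34, 0.36),
    "E23+2": (0.50, 2.87, 0.39, 0.41),
    "E23+3": (0.29, 0.35, 0.33, 1.51),
    "E23+4": (1.97, 2.53, 0.38, 0.40),
    "E23+5": (0.51, 0.38, 0.53, 1.64),
    "E23+6": (3.21, 0.40, 0.41, 0.43),
}


def bases_above(signals, threshold=1.0):
    """List every base whose signal reaches the (assumed) cutoff."""
    return [base for base, s in zip("ACGT", signals) if s >= threshold]


calls = {primer: bases_above(signals) for primer, signals in TABLE10.items()}
mixed = [primer for primer, bases in calls.items() if len(bases) > 1]

# Ratio of the two signals at the mixed position (A from the deletion
# template, C from the wild-type template).
a_signal, c_signal = TABLE10["E23+4"][0], TABLE10["E23+4"][1]
ratio = a_signal / c_signal
print(mixed, calls["E23+4"], round(ratio, 2))
```

    Only Primer 4 carries two calls (A and C) under this cutoff, and the A:C signal ratio is about 0.78, in keeping with the description of the two signals as approximately equal.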
  • Example 4 Nested GBA™ Analysis of the CF Gene
  • The feasibility of utilizing the nested GBA™ (N-GBA™) approach to accurately identify mutations in the nucleotide 549 locus of the human CF gene was also evaluated. [0125]
  • Thus, a set of primers (shown in Table 11) was prepared, and used in a nested GBA™ reaction to sequence a locus of the cystic fibrosis gene (CF) around nucleotide 549. [0126]
    TABLE 11
    SEQ ID NO   Name      Nucleotide Sequence
    55          Oligo     CTGAGTGGAGGTCAACGAGCAAGAATTTCTTT
    36          CF549+1   AAAGAAATTCTTGCTCGTTGACCTC
    37          CF549+2    AAGAAATTCTTGCTCGTTGACCTCC
    38          CF549+3     AGAAATTCTTGCTCGTTGACCTCCA
    39          CF549+4      GAAATTCTTGCTCGTTGACCTCCAC
    40          CF549+5       AAATTCTTGCTCGTTGACCTCCACT
    SEQ ID NO   Name      Nucleotide Sequence
    56          Oligo     TCCACTCAGTGTGATTCCACCTTCTCCAAGAA
    41          CF549−1   TTCTTGGAGAAGGTGGAATCACACT
    42          CF549−2    TCTTGGAGAAGGTGGAATCACACTG
    43          CF549−3     CTTGGAGAAGGTGGAATCACACTGA
    44          CF549−4      TTGGAGAAGGTGGAATCACACTGAG
    45          CF549−5       TGGAGAAGGTGGAATCACACTGAGT
  • Table 12 shows the result of this experiment with respect to the “+” strand of this target molecule. [0127]
    TABLE 12
    Nested GBA™ Reaction Using “+” Template Strand of CF Gene At Locus 549
                  A             C             G             T
    Primer Used   Signal  TIN   Signal  TIN   Signal  TIN   Signal  TIN
    Nucleotide Extended Using Exo-Klenow
    CF549+1       0.25    0.09  2.99    0.09  0.20    0.09  0.29    0.09
    CF549+2       3.49    0.10  0.74    0.10  0.19    0.10  0.25    0.10
    CF549+3       1.01    0.31  3.23    0.12  0.26    0.13  0.38    0.12
    CF549+4       0.47    0.19  0.89    0.16  0.56    0.25  2.76    0.14
    CF549+5       0.30    0.11  2.97    0.12  0.21    0.13  0.39    0.12
    Nucleotide Extended Using Klenow
    CF549+1       0.43    0.15  3.37    0.10  0.29    0.15  0.82    0.64
    CF549+2       3.45    0.36  1.38    0.10  0.33    0.22  0.48    0.14
    CF549+3       1.52    0.13  3.60    0.11  0.36    0.18  0.51    0.11
    CF549+4       1.41    0.17  1.87    0.13  0.92    0.26  3.48    0.15
    CF549+5       0.60    0.11  3.22    0.12  0.28    0.15  0.59    0.11
  • As indicated in Table 12, both Klenow and Exo-Klenow gave nested GBA™ extension products of C, A, C, T and C, respectively, for primers CF549+1 through CF549+5. The deduced sequence for the 549 locus of the target is therefore GAGTG, as expected. The results obtained above were confirmed by performing a nested GBA™ reaction using the “−” CF strand. The results of this experiment are presented in Table 13. [0128]
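  • The relationship between the Table 11 target, its nested primers, and the extension products reported for Table 12 can be reproduced with a short sketch. It assumes only Watson-Crick pairing: each primer is a 25-base window of the strand complementary to the target, shifted by one base per array position, and the expected extension product is the next base of that complementary strand. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def nested_primers(template, primer_length, count):
    """Tile `count` primers, each shifted by one base, along the strand
    complementary to `template`; pair each primer with the single
    nucleotide expected to be added to its 3' terminus."""
    complement_strand = reverse_complement(template)
    primers = []
    for offset in range(count):
        primer = complement_strand[offset:offset + primer_length]
        next_base = complement_strand[offset + primer_length]
        primers.append((primer, next_base))
    return primers


# SEQ ID NO: 55, the "+" strand oligonucleotide target of Table 11.
TEMPLATE = "CTGAGTGGAGGTCAACGAGCAAGAATTTCTTT"

primers = nested_primers(TEMPLATE, primer_length=25, count=5)
extended = "".join(base for _, base in primers)
print(primers[0][0])                  # sequence of the first primer
print(extended)                       # expected extension products
print(reverse_complement(extended))   # deduced target sequence
```

    The five generated primers coincide with CF549+1 through CF549+5 of Table 11, the expected extension products read CACTC, and their reverse complement is GAGTG.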
    TABLE 13
    Nested GBA™ Reaction Using “−” Template Strand of CF Gene At Locus 549
                  A             C             G             T
    Primer Used   Signal  TIN   Signal  TIN   Signal  TIN   Signal  TIN
    Nucleotide Extended Using Exo-Klenow
    CF549−1       0.37    0.11  0.28    0.15  3.09    0.10  0.45    0.13
    CF549−2       3.32    0.11  0.25    0.13  0.52    0.13  0.27    0.12
    CF549−3       1.65    0.12  0.19    0.19  3.88    0.13  1.34    0.19
    CF549−4       0.69    0.09  0.11    0.12  1.16    0.10  2.97    0.09
    CF549−5       1.10    0.12  0.56    0.31  3.45    0.54  1.45    0.15
    Nucleotide Extended Using Klenow
    CF549−1       1.31    0.13  0.77    0.53  3.37    0.17  1.21    0.21
    CF549−2       3.52    0.13  0.74    0.63  1.36    0.15  0.89    0.21
    CF549−3       1.65    0.12  0.19    0.19  3.88    0.13  1.34    0.19
    CF549−4       0.69    0.09  0.11    0.12  1.16    0.10  2.97    0.09
    CF549−5       1.10    0.12  0.56    0.31  3.45    0.54  1.45    0.15
  • As indicated in Table 13, both Klenow and Exo-Klenow gave nested GBA™ extension products of G, A, G, T and G, respectively, for primers CF549−1 through CF549−5. The deduced sequence for the 549 locus of the target is therefore CACTC, as expected. Klenow, Exo-Klenow and Sequenase were compared for their ability to serve as the polymerase in the nested GBA™ reaction shown in Example 13. The enzymes gave equivalent N-GBA™ results. [0129]
  • Example 5 Nested GBA™ Analysis of Hot Spots in the p53 Gene
  • The feasibility of utilizing the nested GBA™ (N-GBA™) approach to accurately detect p53 mutations was evaluated. [0130]
  • The p53 gene encompasses an approximately 19 kilobase stretch, comprising 11 exons (393 codons), of chromosome region 17p13.105-p12. Characterized as a tumor antigen in 1979, then as an oncogene, and finally as a tumor suppressor gene, p53 has received increasing study in cancer research. Mutations in the p53 gene are the single most common genetic alteration in human cancers and generally result in loss of function of the protein. The p53 protein's apparent role in regulating cell growth and apoptosis suggests it is a core protein in determination of tumorigenesis, with mutations in p53 being part of the cascade necessary for the development of many tumors. Three quarters of colon cancers and half of lung and breast cancers have been reported to contain p53 mutations (Levine, A. J., Canc. Surveys 12:59-79 (1992); herein incorporated by reference). Since more than 100,000 additional cases of each of these cancers are diagnosed each year, the potential application of p53 analysis is significant both clinically and commercially. The majority of p53 mutations are missense (ranging from 75% to more than 90%), tightly clustered between codons 118 and 309, the DNA binding region of the protein. Amino acids 175, 248, 249, 273, and 282 account for 40% of the total reported missense mutations, and the predominance of these so-called “hot-spots” varies depending on the tissue of origin of the cancer. The diversity and dispersion of clinically relevant mutations pose a significant challenge to the development of routine detection strategies. Because of the high prevalence of p53 mutations in a wide variety of common cancers and the large number of potential mutations in a defined gene region, p53 is an excellent target for development of a sequence composition/confirmation analysis tool such as SCAN™. [0131]
  • Nested GBA™ primers were designed for all DNA bases in a hot-spot (codons 272-282) of the target p53 gene. FIG. 2 displays the four mutational hot-spot regions of the p53 gene with the wild-type and known representative mutant sequences of codons 272 to 282 (region D) highlighted. Specifically, three synthetic templates are designed to match three DNA samples, each containing a mutation in either codon 273, 275, or 281. Two additional synthetic templates are designed to be representative of a deletion mutation (codons 266 and 267 deleted) and an insertion mutation (C insertion at codon 280). [0132]
  • One PCR primer for each primer set will have four phosphorothioate linkages at its 5′ end in preparation for TargEx™ treatment. TargEx™ is a method developed to quantitatively convert double-stranded PCR product into single-stranded DNA by selectively degrading one of the strands with bacteriophage T7 gene 6 exonuclease (Nikiforov, T. T. et al., Nucl Acids Res 22:4167-4175 (1994); Nikiforov, T. T. et al., PCR Methods and Apps 3:285-291 (1994)). Specifically, PCR product amplified from human genomic DNA using one fluorescein-labeled, phosphorothioated PCR primer and one unmodified primer is treated with T7 gene 6 exonuclease (U.S. Biochemical) at a final concentration of 2 U/μl PCR (diluted in buffer supplied by manufacturer). After 1 hr of incubation at room temperature, NaCl and EDTA are added to concentrations of 1.5 M and 10 mM, respectively, to stop the exonuclease digestion. The mixture is then applied to the immobilized GBA™ primer for subsequent hybridization and extension. After extension, the standard ABI fluorescent cycle sequencing system is used to analyze the reaction. [0133]
  • The 5′ ends of the primers are specifically attached to glass slides to form a SCAN™ array. Synthetic oligonucleotide templates corresponding to portions of the target hot-spot and containing various known mutations are used to test the array and the GBA™ biochemistry to demonstrate that robust, unambiguous (low noise and background) data can be obtained from such an analysis. Permutations of the standard GBA™ biochemistry, in particular the use of different DNA polymerases, are evaluated to ensure optimal signal:noise (S:N) characteristics for all 4 nucleotides in the feasibility test system. [0134]
  • Primer pairs will be qualified by amplification of human genomic DNA at a concentration of 12.5 μg/ml in 30 μl reactions in 96 well V-bottom polycarbonate plates (Costar). The final concentration of the reaction mixture will be 400 μM each dNTP, 50 mM KCl, 10 mM Tris HCl (pH 8.5), 1.5 mM MgCl2, 0.5 μM each primer, 2.5 ng/μl DNA, and 0.025 U/μl Taq DNA polymerase (Perkin-Elmer). Each reaction will be overlaid with 30 μl mineral oil and cycled in a BioIII thermocycler (Sun BioScience Inc., Branford Conn.). Following an initial two minute denaturation step at 94° C., 35 cycles will be carried out, each consisting of denaturation (1 min at 94° C.), annealing (2 min at 55° C.), and extension (3 min at 72° C.). Ten μl of PCR product will be run on 15% non-denaturing polyacrylamide gels at 40 W for 40 min to analyze yield. The amplification products will be quantified by comparison with multiple dilutions of a Mass Marker (BRL). [0135]
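  • As a worked example of the quantities implied by this protocol, the per-reaction amounts and the programmed cycling time can be computed from the stated final concentrations and the 30 μl reaction volume. The sketch below is plain arithmetic; the variable names are hypothetical, and the cycling time counts only the programmed hold times (temperature ramping is not considered).

```python
REACTION_VOLUME_UL = 30  # stated reaction volume in microliters

# Final concentrations stated in the protocol.
TAQ_U_PER_UL = 0.025
DNA_NG_PER_UL = 2.5
PRIMER_UM = 0.5    # 1 uM equals 1 pmol per microliter
DNTP_UM = 400

per_reaction = {
    "Taq polymerase (U)": TAQ_U_PER_UL * REACTION_VOLUME_UL,
    "genomic DNA (ng)": DNA_NG_PER_UL * REACTION_VOLUME_UL,
    "each primer (pmol)": PRIMER_UM * REACTION_VOLUME_UL,
    "each dNTP (nmol)": DNTP_UM * REACTION_VOLUME_UL / 1000,
}

# Thermal program: 2 min initial denaturation, then 35 cycles of
# 1 min denaturation + 2 min annealing + 3 min extension.
INITIAL_DENATURATION_MIN = 2
CYCLES = 35
CYCLE_MIN = 1 + 2 + 3
total_minutes = INITIAL_DENATURATION_MIN + CYCLES * CYCLE_MIN

for name, amount in per_reaction.items():
    print(f"{name}: {amount:g}")
print(f"programmed cycling time: {total_minutes} min")
```

    Each 30 μl reaction therefore contains 0.75 U of Taq polymerase, 75 ng of genomic DNA, 15 pmol of each primer and 12 nmol of each dNTP, and the program runs for 212 minutes of hold time.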
  • Example 6 Use of Nested GBA™ Primers on PCR-Generated Templates
  • The performance of the Nested GBA™ method is assessed using PCR-amplified genomic DNA as the target for analysis. At least two overlapping PCR primer pairs are designed and tested on wild-type and mutant-containing genomic DNAs (five total), and the resultant PCR products tested by N-GBA™ on the SCAN™ arrays produced in Example 5. The PCR products will be evaluated for hybridization and extension efficiencies relative to the synthetic templates of Example 5 to ensure that analysis of PCR products is equally robust. [0136]
  • Example 7 Analysis of Primer Extension at Position of 3′ Terminal Nucleotide Mismatch
  • An experiment was performed in order to determine the capacity of various polymerases to extend a primer having a mismatch at its 3′ terminus. Two 6-mer primers were prepared and were separately hybridized to each member of a set of four template molecules whose sequences differed only in the identity of the 6th nucleotide, as shown in Table 14. In Table 14, “X” denotes the 3′ terminal nucleotide of the primer; “Y” denotes the nucleotide of the template that is opposite to “X” when the primer and template are hybridized to one another. [0137]
    TABLE 14
    SEQ ID NO.   Molecule       Sequence      Nucleotide X   Nucleotide Y
                 Primer 4748    TATGGC        C
    57           Template T1    CGGTTACCATA                  A
    58           Template T2    CGGTTCCCATA                  C
    59           Template T3    CGGTTGCCATA                  G
    60           Template T4    CGGTTTCCATA                  T
                 Primer 4749    TATGGA        A
    57           Template T1    CGGTTACCATA                  A
    58           Template T2    CGGTTCCCATA                  C
    59           Template T3    CGGTTGCCATA                  G
    60           Template T4    CGGTTTCCATA                  T
  • Thus, for each primer, the efficiency and capacity of extension were determined using four parallel reactions, of which three comprised efforts to extend a mismatched 3′ terminus, and one comprised a control in which the 3′ terminal nucleotide of the primer was correctly base paired. Extension was determined by GBA™ reaction. [0138]
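  • The design of this experiment can be made concrete with a small sketch built on the Table 14 sequences. For each primer/template pair it determines whether the primer's 3′ terminal nucleotide (X) pairs with the opposing template nucleotide (Y), and which labeled nucleotide would be expected under two simplified models: direct extension of a matched primer, or extension after removal of the mismatched 3′ nucleotide. The second model is an assumption made for illustration (it corresponds to a polymerase with 3′→5′ exonuclease activity); function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Sequences from Table 14, written 5'->3'.
PRIMERS = {"4748": "TATGGC", "4749": "TATGGA"}
TEMPLATES = {
    "T1": "CGGTTACCATA",
    "T2": "CGGTTCCCATA",
    "T3": "CGGTTGCCATA",
    "T4": "CGGTTTCCATA",
}


def predict_extension(primer, template):
    """Return (matched, expected_base) for a primer annealed to the
    3' end of the template.

    matched       -- True if X (primer 3' terminus) pairs with Y.
    expected_base -- if matched, the complement of the template base
                     immediately 5' of Y; otherwise the complement of Y,
                     assuming the mismatched X is first removed.
    """
    n = len(primer)
    x = primer[-1]
    y = template[-n]              # template base opposite X
    upstream = template[-n - 1]   # template base immediately 5' of Y
    matched = COMPLEMENT[x] == y
    return matched, COMPLEMENT[upstream] if matched else COMPLEMENT[y]


predictions = {
    (p_name, t_name): predict_extension(primer, template)
    for p_name, primer in PRIMERS.items()
    for t_name, template in TEMPLATES.items()
}
matched_pairs = [pair for pair, (ok, _) in predictions.items() if ok]
print(matched_pairs)
for pair, result in predictions.items():
    print(pair, result)
```

    Only 4748/T3 (C:G) and 4749/T4 (A:T) are matched controls in this model; the remaining six pairs are the mismatched test cases.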
  • The results of this experiment are shown in Table 15, with respect to four polymerases: “K” (Klenow), “exo-K” (exo-Klenow), “Bst” (Bst polymerase) and “Therm” (Thermosequenase). The data are expressed in optical density units. [0139]
  • Table 15 shows that Thermosequenase did not extend primers whose 3′ terminal nucleotides were not base paired to the template. In contrast, Klenow and Exo-Klenow were both able to incorporate label onto the 3′ terminus of 3′ terminally mismatched primers, consistent with the data presented above. [0140]
    TABLE 15
                              Labeled              DNA Polymerase Employed
    Primer/Template   X:Y     Nucleotide Present   K       exo-K   Therm   Bst
    4748/T1           C:A     A                    0.70    1.80    0.40    0.55
                              C                    0.30    0.60    0.27    0.25
                              G                    0.80    0.30    0.32    0.32
                              T                    2.20    0.60    0.45    0.25
    4748/T2           C:C     A                    0.70    0.60    0.40    0.28
                              C                    0.30    0.65    0.25    0.21
                              G                    1.60    0.30    0.47    0.37
                              T                    0.50    0.30    0.25    0.20
    4748/T3           C:G     A                    2.70    2.80    2.10    2.80
                              C                    1.00    0.60    0.40    0.25
                              G                    0.90    0.20    0.30    0.20
                              T                    0.50    0.30    0.22    0.18
    4748/T4           C:T     A                    2.50    2.10    0.75    1.10
                              C                    0.30    0.70    0.25    0.21
                              G                    1.30    0.40    0.35    0.30
                              T                    0.50    0.30    0.15    0.20
    4749/T1           A:A     A                    0.80    0.50    0.40    0.30
                              C                    0.30    0.50    0.30    0.30
                              G                    1.30    0.50    0.40    0.30
                              T                    2.10    0.70    0.50    0.30
    4749/T2           A:C     A                    1.00    1.10    0.30    0.40
                              C                    0.30    0.40    0.30    0.30
                              G                    2.30    0.70    0.50    0.50
                              T                    0.60    0.40    0.75    0.20
    4749/T3           A:G     A                    0.90    0.50    0.50    0.40
                              C                    3.00    0.90    0.70    0.30
                              G                    1.40    0.60    0.40    0.40
                              T                    0.70    0.40    0.30    0.20
    4749/T4           A:T     A                    3.00    3.00    2.00    2.40
                              C                    0.30    0.30    0.30    0.20
                              G                    1.20    0.50    0.30    0.30
                              T                    0.60    0.30    0.30    0.20
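  • One way to summarize Table 15 is to count, for each polymerase, how many of the six mismatched primer/template pairs show any labeled nucleotide above a cutoff. The sketch below does this with a cutoff of 1.0 optical density units. That cutoff is an assumption chosen for illustration, and the counts change if a different value is used; the names `TABLE15`, `POLYMERASES` and `extended_mismatches` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
POLYMERASES = ("K", "exo-K", "Therm", "Bst")

# {pair: (X, Y, {labeled nucleotide: signals in POLYMERASES order})},
# values copied from Table 15.
TABLE15 = {
    "4748/T1": ("C", "A", {"A": (0.70, 1.80, 0.40, 0.55), "C": (0.30, 0.60, 0.27, 0.25),
                           "G": (0.80, 0.30, 0.32, 0.32), "T": (2.20, 0.60, 0.45, 0.25)}),
    "4748/T2": ("C", "C", {"A": (0.70, 0.60, 0.40, 0.28), "C": (0.30, 0.65, 0.25, 0.21),
                           "G": (1.60, 0.30, 0.47, 0.37), "T": (0.50, 0.30, 0.25, 0.20)}),
    "4748/T3": ("C", "G", {"A": (2.70, 2.80, 2.10, 2.80), "C": (1.00, 0.60, 0.40, 0.25),
                           "G": (0.90, 0.20, 0.30, 0.20), "T": (0.50, 0.30, 0.22, 0.18)}),
    "4748/T4": ("C", "T", {"A": (2.50, 2.10, 0.75, 1.10), "C": (0.30, 0.70, 0.25, 0.21),
                           "G": (1.30, 0.40, 0.35, 0.30), "T": (0.50, 0.30, 0.15, 0.20)}),
    "4749/T1": ("A", "A", {"A": (0.80, 0.50, 0.40, 0.30), "C": (0.30, 0.50, 0.30, 0.30),
                           "G": (1.30, 0.50, 0.40, 0.30), "T": (2.10, 0.70, 0.50, 0.30)}),
    "4749/T2": ("A", "C", {"A": (1.00, 1.10, 0.30, 0.40), "C": (0.30, 0.40, 0.30, 0.30),
                           "G": (2.30, 0.70, 0.50, 0.50), "T": (0.60, 0.40, 0.75, 0.20)}),
    "4749/T3": ("A", "G", {"A": (0.90, 0.50, 0.50, 0.40), "C": (3.00, 0.90, 0.70, 0.30),
                           "G": (1.40, 0.60, 0.40, 0.40), "T": (0.70, 0.40, 0.30, 0.20)}),
    "4749/T4": ("A", "T", {"A": (3.00, 3.00, 2.00, 2.40), "C": (0.30, 0.30, 0.30, 0.20),
                           "G": (1.20, 0.50, 0.30, 0.30), "T": (0.60, 0.30, 0.30, 0.20)}),
}


def extended_mismatches(threshold=1.0):
    """Count mismatched pairs with any signal at or above `threshold`."""
    counts = dict.fromkeys(POLYMERASES, 0)
    for x, y, signals in TABLE15.values():
        if COMPLEMENT[x] == y:
            continue  # matched control, not a mismatch case
        for i, name in enumerate(POLYMERASES):
            if max(row[i] for row in signals.values()) >= threshold:
                counts[name] += 1
    return counts


counts = extended_mismatches()
print(counts)
```

    With the assumed cutoff, Klenow exceeds it for all six mismatched pairs, Exo-Klenow for three, Bst for one, and Thermosequenase for none.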
  • While the invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth and as follows in the scope of the appended claims. [0141]
  • 1 60
    25 base pairs nucleic acid single linear DNA (genomic) YES 1 CTTGTGCTGA CTTACCAGAT GGGAC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 2 TTGTGCTGAC TTACCAGATG GGACA 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 3 TTGTGCTGAC TTACCAGATG GGACC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 4 TTGTGCTGAC TTACCAGATG GGACT 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 5 TTGTGCTGAC TTACCAGATG GGACG 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 6 TGTGCTGACT TACCAGATGG GACAA 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 7 TGTGCTGACT TACCAGATGG GACAC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 8 TGTGCTGACT TACCAGATGG GACAT 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 9 TGTGCTGACT TACCAGATGG GACAG 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 10 TGTGCTGACT TACCAGATGG GACCA 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 11 TGTGCTGACT TACCAGATGG GACCC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 12 TGTGCTGACT TACCAGATGG GACCT 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 13 TGTGCTGACT TACCAGATGG GACCG 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 14 TGTGCTGACT TACCAGATGG GACTA 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 15 TGTGCTGACT TACCAGATGG GACTC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 16 TGTGCTGACT TACCAGATGG GACTT 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 17 TGTGCTGACT TACCAGATGG GACTG 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 18 TGTGCTGACT TACCAGATGG GACGA 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 19 TGTGCTGACT TACCAGATGG GACGC 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 20 TGTGCTGACT TACCAGATGG GACGT 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 21 TGTGCTGACT TACCAGATGG GACGG 25
    25 base pairs nucleic acid single linear DNA (genomic) YES 22 GTGCTGACTT ACCAGATGGG ACAAA 25
    30 base pairs nucleic acid single linear DNA (genomic) YES 23 CCGTACTCCC ATCTGGTAAG TCAGCACAAG 30
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 24 CTTGTGCTGA CTTACCAGAT GGGAC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 25 TTGTGCTGAC TTACCAGATG GGACA 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 26 TGTGCTGACT TACCAGATGG GACAC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 27 GTGCTGACTT ACCAGATGGG ACACT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 28 TGCTGACTTA CCAGATGGGA CACTC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 29 GCTGACTTAC CAGATGGGAC ACTCT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 30 GTCATTAATG CTATGCAGAA AATCT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 31 TCATTAATGC TATGCAGAAA ATCTT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 32 CATTAATGCT ATGCAGAAAA TCTTA 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 33 ATTAATGCTA TGCAGAAAAT CTTAG 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 34 TTAATGCTAT GCAGAAAATC TTAGA 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 35 TAATGCTATG CAGAAAATCT TAGAG 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 36 AAAGAAATTC TTGCTCGTTG ACCTC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 37 AAGAAATTCT TGCTCGTTGA CCTCC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 38 AGAAATTCTT GCTCGTTGAC CTCCA 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 39 GAAATTCTTG CTCGTTGACC TCCAC 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 40 AAATTCTTGC TCGTTGACCT CCACT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 41 TTCTTGGAGA AGGTGGAATC ACACT 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 42 TCTTGGAGAA GGTGGAATCA CACTG 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 43 CTTGGAGAAG GTGGAATCAC ACTGA 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 44 TTGGAGAAGG TGGAATCACA CTGAG 25
    25 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 45 TGGAGAAGGT GGAATCACAC TGAGT 25
    49 base pairs nucleic acid single linear DNA (genomic) YES 46 CCAGAAGAAA GGGCCTTCAC AGTGTCCTTT ATGTAAGAAT GATATAACC 49
    49 base pairs nucleic acid single linear DNA (genomic) YES 47 CCAGAAGAAA GGGCCTTCAC AGGGTCCTTT ATGTAAGAAT GATATAACC 49
    24 base pairs nucleic acid single linear DNA (genomic) YES 48 GGTTATATCA TTCTTACATA AAGG 24
    24 base pairs nucleic acid single linear DNA (genomic) YES 49 GTTATATCAT TCTTACATAA AGGA 24
    24 base pairs nucleic acid single linear DNA (genomic) YES 50 TTATATCATT CTTACATAAA GGAC 24
    24 base pairs nucleic acid single linear DNA (genomic) YES 51 TATATCATTC TTACATAAAG GACA 24
    24 base pairs nucleic acid single linear DNA (genomic) YES 52 ATATCATTCT TACATAAAGG ACAC 24
    34 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 53 TCTTAGAGTG TCCCATCTGG TAAGTCAGCA CAAG 34
    34 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS EXON 23 BRCA1 GENE 54 GACACTCTAA GATTTTCTGC ATAGCATTAA TGAC 34
    32 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 55 CTGAGTGGAG GTCAACGAGC AAGAATTTCT TT 32
    32 base pairs nucleic acid single linear cDNA NO HOMO SAPIENS CYSTIC FIBROSIS GENE 56 TCCACTCAGT GTGATTCCAC CTTCTCCAAG AA 32
    11 base pairs nucleic acid single linear DNA (genomic) YES 57 CGGTTACCAT A 11
    11 base pairs nucleic acid single linear DNA (genomic) YES 58 CGGTTCCCAT A 11
    11 base pairs nucleic acid single linear DNA (genomic) YES 59 CGGTTGCCAT A 11
    11 base pairs nucleic acid single linear DNA (genomic) YES 60 CGGTTTCCAT A 11

Claims (24)

What is claimed is:
1. A method for determining the nucleotide sequence of a nucleic acid molecule which comprises the steps of:
(A) arraying a set of nested primer oligonucleotides onto a solid support, each array position containing a different array member having a predetermined sequence;
(B) incubating oligonucleotides of said array in the presence of a preparation of said nucleic acid molecules, a polymerase and at least one chain terminator nucleotide; wherein said incubation is under conditions sufficient to permit DNA hybridization to occur between the oligonucleotides of said incubation and said nucleic acid molecules; wherein said incubation is conducted in the substantial absence of any non-chain terminator nucleotides;
(C) (1) in the case wherein the 3′ terminal nucleotide of an oligonucleotide is hybridized to said nucleic acid molecule, permitting oligonucleotides hybridized to nucleic acid molecules to be extended by polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of said hybridized oligonucleotide, wherein for each hybridized oligonucleotide being so extended, said incorporated nucleotide residue is complementary to the nucleotide residue immediately 5′ to the nucleotide residue of the nucleic acid molecule that is hybridized with that oligonucleotide's 3′ terminal nucleotide residue; then performing step (D);
(2) in the case wherein the 3′ terminal nucleotide of an oligonucleotide is not hybridized to said nucleic acid molecule, either:
(a) not permitting oligonucleotides hybridized to nucleic acid molecules to be extended by polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of said hybridized oligonucleotide, or
(b) permitting the removal of any non-hybridized nucleotide residues from the 3′ terminus of said hybridized oligonucleotides, so as to form a truncated primer oligonucleotide whose 3′ terminus is hybridized to said nucleic acid molecule, and then permitting polymerase-mediated incorporation of a single chain terminator nucleotide residue onto the 3′ terminus of said hybridized truncated oligonucleotide, wherein for each hybridized truncated oligonucleotide being so extended, said incorporated nucleotide residue is complementary to the nucleotide residue immediately 5′ to the nucleotide residue of the nucleic acid molecule that is hybridized with that truncated oligonucleotide's 3′ terminal nucleotide residue; then performing step (D);
(D) determining, at each array position at which an oligonucleotide has incorporated a single chain terminator nucleotide residue, the identity of the incorporated chain terminator nucleotide residue; and
(E) determining the nucleotide sequence of said nucleic acid molecule from the determined identity of the incorporated nucleotide of primer oligonucleotides of said array, and known sequence of the oligonucleotide at each array position.
2. The method of claim 1, wherein each array position contains a primer oligonucleotide that is capable of hybridizing to a region of said nucleic acid molecule.
3. The method of claim 1, wherein said polymerase is a Thermosequenase class polymerase.
4. The method of claim 1, wherein said polymerase is a Klenow class polymerase.
5. The method of claim 4, wherein in step (C), at least some array positions contain nucleic acid molecules hybridized to oligonucleotides whose 3′ terminal nucleotide is not hybridized to the nucleic acid molecule, and wherein step (C)(1) is conducted for such oligonucleotides.
6. The method of claim 1, wherein said array is a random oligonucleotide array.
7. The method of claim 1, wherein said array is a nested oligonucleotide array.
8. The method of claim 7, wherein said nested array contains oligonucleotide members having all possible permutations of nucleotides over a region of from 1 to 20 bases.
9. The method of claim 1, wherein said method is conducted in the presence of at least four chain terminator nucleotide species, at least one of which is labeled.
10. The method of claim 9, wherein all of said chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present.
11. The method of claim 1, wherein said nucleic acid molecule is a DNA molecule.
12. The method of claim 1, wherein said nucleic acid molecule is RNA.
13. The method of claim 1, wherein said method is performed both on said nucleic acid molecule, and on a complement of said nucleic acid molecule.
14. The method of claim 11, wherein said DNA is genomic DNA of a human or non-human mammal.
15. The method of claim 11, wherein said DNA is human genomic DNA.
16. The method of claim 15, wherein said DNA is suspected to contain a genetic variation associated with a disease, and said method is employed to determine whether said DNA contains said variation.
17. The method of claim 16, wherein said disease is cancer or cystic fibrosis.
18. The method of claim 1, wherein said oligonucleotides are immobilized onto said solid support.
19. The method of claim 18, wherein said support is plastic or glass.
20. A kit for determining the sequence of a nucleic acid molecule which comprises a solid support containing an array of spaced apart receptacles for oligonucleotides, each receptacle containing a different primer oligonucleotide.
21. The kit of claim 20, wherein each array receptacle additionally contains at least four chain terminator nucleotide species, at least one of which is labeled.
22. The kit of claim 20, wherein all of said chain terminator nucleotide species are labeled, and wherein the label of any such species can be distinguished from the label of any other species present.
23. The kit of claim 20, wherein said kit determines the nucleotide sequence of DNA suspected to contain a genetic variation associated with a disease, and wherein said kit permits sufficient determination of nucleotide sequence to determine whether said DNA contains said variation.
24. The kit of claim 23, wherein said disease is cancer or cystic fibrosis.
US09/962,537 1996-03-19 2001-09-24 Method for determining the nucleotide sequence of a polynucleotide Abandoned US20020192662A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/962,537 US20020192662A1 (en) 1996-03-19 2001-09-24 Method for determining the nucleotide sequence of a polynucleotide

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US61690696A 1996-03-19 1996-03-19
US09/522,494 US6294336B1 (en) 1996-03-19 2000-03-09 Method for analyzing the nucleotide sequence of a polynucleotide by oligonucleotide extension on an array
US09/962,537 US20020192662A1 (en) 1996-03-19 2001-09-24 Method for determining the nucleotide sequence of a polynucleotide

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/522,494 Continuation US6294336B1 (en) 1996-03-19 2000-03-09 Method for analyzing the nucleotide sequence of a polynucleotide by oligonucleotide extension on an array

Publications (1)

Publication Number Publication Date
US20020192662A1 (en) 2002-12-19

Family

ID=24471465

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/522,494 Expired - Fee Related US6294336B1 (en) 1996-03-19 2000-03-09 Method for analyzing the nucleotide sequence of a polynucleotide by oligonucleotide extension on an array
US09/962,537 Abandoned US20020192662A1 (en) 1996-03-19 2001-09-24 Method for determining the nucleotide sequence of a polynucleotide

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/522,494 Expired - Fee Related US6294336B1 (en) 1996-03-19 2000-03-09 Method for analyzing the nucleotide sequence of a polynucleotide by oligonucleotide extension on an array

Country Status (3)

Country Link
US (2) US6294336B1 (en)
AU (1) AU2320597A (en)
WO (1) WO1997035033A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7645596B2 (en) 1998-05-01 2010-01-12 Arizona Board Of Regents Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US7666593B2 (en) 2005-08-26 2010-02-23 Helicos Biosciences Corporation Single molecule sequencing of captured nucleic acids
US7897345B2 (en) 2003-11-12 2011-03-01 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
US7981604B2 (en) 2004-02-19 2011-07-19 California Institute Of Technology Methods and kits for analyzing polynucleotide sequences
US9096898B2 (en) 1998-05-01 2015-08-04 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5885775A (en) * 1996-10-04 1999-03-23 Perseptive Biosystems, Inc. Methods for determining sequences information in polynucleotides using mass spectrometry
US6482590B1 (en) * 1996-12-20 2002-11-19 Aventis Behring Gmbh Method for polynucleotide amplification
WO1998051823A1 (en) 1997-05-16 1998-11-19 Mosaic Technologies Electrophoretic analysis of molecules using immobilized probes
JP2002505846A (en) 1997-11-06 2002-02-26 モザイク テクノロジーズ Multiple sequential polynucleotide displacement reactions for signal amplification and processing
US6322968B1 (en) * 1997-11-21 2001-11-27 Orchid Biosciences, Inc. De novo or “universal” sequencing array
US6218118B1 (en) 1998-07-09 2001-04-17 Agilent Technologies, Inc. Method and mixture reagents for analyzing the nucleotide sequence of nucleic acids by mass spectrometry
US6150105A (en) 1998-08-20 2000-11-21 Genetic Assays, Inc. Methods of screening nucleic acids for nucleotide variations
US6238927B1 (en) 1998-10-05 2001-05-29 Mosaic Technologies, Incorporated Reverse displacement assay for detection of nucleic acid sequences
EP1165839A2 (en) * 1999-03-26 2002-01-02 Whitehead Institute For Biomedical Research Universal arrays
WO2001025485A2 (en) * 1999-10-06 2001-04-12 Amersham Biosciences Corp Method for detecting mutations using arrayed primer extension
WO2001090417A2 (en) * 2000-05-19 2001-11-29 Eragen Biosciences, Inc. Materials and methods for detection of nucleic acids
US6977161B2 (en) * 2000-10-14 2005-12-20 Eragen Biosciences, Inc. Solid support assay systems and methods utilizing non-standard bases
EP3034627B1 (en) 2000-10-06 2019-01-30 The Trustees of Columbia University in the City of New York Massive parallel method for decoding dna and rna
US9708358B2 (en) 2000-10-06 2017-07-18 The Trustees Of Columbia University In The City Of New York Massive parallel method for decoding DNA and RNA
US20040038206A1 (en) * 2001-03-14 2004-02-26 Jia Zhang Method for high throughput assay of genetic analysis
EP1572878A4 (en) 2001-06-29 2007-03-21 Dnaprint Genomics Inc Compositions and methods for inferring a response to a statin
US20030087298A1 (en) * 2001-11-02 2003-05-08 Roland Green Detection of hybridization on oligonucleotide microarray through covalently labeling microarray probe
US20030170637A1 (en) * 2002-03-06 2003-09-11 Pirrung Michael C. Method of analyzing mRNA splice variants
US20030235827A1 (en) * 2002-06-25 2003-12-25 Orchid Biosciences, Inc. Methods and compositions for monitoring primer extension and polymorphism detection reactions
CA2491117A1 (en) * 2002-06-28 2004-01-08 Orchid Biosciences, Inc. Methods and compositions for analyzing compromised samples using single nucleotide polymorphism panels
EP1576174A4 (en) * 2002-08-02 2007-07-25 Orchid Cellmark Inc Methods and compositions for genotyping
US7192700B2 (en) * 2002-12-20 2007-03-20 Orchid Cellmark Inc. Methods and compositions for conducting primer extension and polymorphism detection reactions
KR20040035248A (en) * 2002-10-19 2004-04-29 제노프라 주식회사 Method for identification and analysis of certain molecules using single strand nucleic acid array and nucleic acid ligands to e. coli therefor
US20040126765A1 (en) * 2002-12-27 2004-07-01 Adams Craig W. Method and compositions for sequencing nucleic acid molecules
ATE417129T1 (en) * 2003-01-17 2008-12-15 Eragen Biosciences Inc NUCLEIC ACID AMPLIFICATION WITH NON-STANDARD BASES
CA2521127C (en) 2003-04-01 2014-05-27 Eragen Biosciences, Inc. Polymerase inhibitor and method of using same
DE10325098B3 (en) * 2003-06-03 2004-12-02 IPK-Institut für Pflanzengenetik und Kulturpflanzenforschung Procedure for SNP analysis on biochips with oligonucleotide areas
US7972783B2 (en) * 2003-11-24 2011-07-05 Branhaven LLC Method and markers for determining the genotype of horned/polled cattle
WO2005081776A2 (en) * 2004-01-30 2005-09-09 Eragen Biosciences, Inc. Materials and methods for the detection of sars
US20060073486A1 (en) * 2004-10-01 2006-04-06 Sana Theodore R Multiple array substrates and methods for using the same
US20090124993A1 (en) 2005-02-17 2009-05-14 Burkly Linda C Treating neurological disorders
WO2006101913A2 (en) * 2005-03-18 2006-09-28 Eragen Biosciences, Inc. Methods for detecting multiple species and subspecies of neiserria
CN101189023B (en) 2005-03-31 2013-01-30 通用医疗公司 Monitoring and modulating hgf/hgfr activity
CA2919210C (en) 2005-06-07 2019-03-05 Luminex Corporation Methods for detection and typing of nucleic acids
US20070264694A1 (en) * 2006-04-07 2007-11-15 Eragen Biosciences, Inc. Use of non-standard bases and proximity effects for gene assembly and conversion of non-standard bases to standard bases during dna synthesis
EP2084296B1 (en) * 2006-09-29 2015-08-05 Agendia N.V. High-throughput diagnostic testing using arrays
EP2092074A4 (en) * 2006-11-02 2010-06-09 Univ Yale Assessment of oocyte competence
US7883869B2 (en) 2006-12-01 2011-02-08 The Trustees Of Columbia University In The City Of New York Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators
WO2008074101A2 (en) 2006-12-21 2008-06-26 Agriculture Victoria Services Pty Limited Artificial selection method and reagents
RU2009146054A (en) * 2007-05-14 2011-06-20 Инсайт Дженетикс, Инк. (Us) METHODS FOR SCREENING NUCLEIC ACIDS FOR THE IDENTIFICATION OF SINGLE NUCLEOTIDE VARIATIONS
US20100203051A1 (en) 2007-05-31 2010-08-12 The University Of Queensland Diagnostic markers for ankylosing spondylitis and uses thereof
WO2008153804A2 (en) * 2007-05-31 2008-12-18 Monsanto Technology Llc Soybean polymorphisms and methods of genotyping
US9115163B2 (en) 2007-10-19 2015-08-25 The Trustees Of Columbia University In The City Of New York DNA sequence with non-fluorescent nucleotide reversible terminators and cleavable label modified nucleotide terminators
EP4310194A2 (en) 2007-10-19 2024-01-24 The Trustees of Columbia University in the City of New York Design and synthesis of cleavable fluorescent nucleotides as reversible terminators for dna sequencing by synthesis
GB2457300A (en) 2008-02-08 2009-08-12 Norwegian School Of Veterinary Diagnosis of cone-rod dystrophy
CN102639712A (en) * 2009-10-23 2012-08-15 卢米耐克斯公司 Amplification primers with non-standard bases for increased reaction specificity
GB201418718D0 (en) * 2014-10-21 2014-12-03 Cambridge Epigenetix Ltd Improved nucleic acid re-sequencing using a reduced number of identified bases
JP7377544B2 (en) 2017-05-17 2023-11-10 マイクロバイオ・プロプライエタリー・リミテッド Biomarkers and their use
EP3692049A4 (en) * 2017-10-04 2021-06-23 Centrillion Technology Holdings Corporation Method and system for enzymatic synthesis of oligonucleotides

Family Cites Families (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4307189A (en) 1980-03-21 1981-12-22 Malon Kit Method for the quantitative determination of terminal deoxynucleotidyl transferase in biological samples
FI63596C 1981-10-16 1983-07-11 Orion Yhtymae Oy Microbial diagnostic method based on sandwich hybridization of nucleic acids, and combinations of reagents used in the method
GB8311018D0 (en) 1983-04-22 1983-05-25 Amersham Int Plc Detecting mutations in dna
US4883750A (en) 1984-12-13 1989-11-28 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
GB8432118D0 (en) 1984-12-19 1985-01-30 Malcolm A D B Sandwich hybridisation technique
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4760017A (en) 1985-12-23 1988-07-26 E. I. Du Pont De Nemours And Company Arabinonucleic acid probes for DNA/RNA assays
US4968602A (en) 1986-03-05 1990-11-06 Molecular Diagnostics, Inc. Solution-phase single hybridization assay for detecting polynucleotide sequences
EP0238332A2 (en) 1986-03-19 1987-09-23 Cetus Corporation Liquid hybridization method and kit for detecting the presence of nucleic acid sequences in samples
US4851331A (en) 1986-05-16 1989-07-25 Allied Corporation Method and kit for polynucleotide assay including primer-dependant DNA polymerase
GB8612087D0 (en) 1986-05-19 1986-06-25 Ici Plc Hybridisation probes
AU622104B2 (en) 1987-03-11 1992-04-02 Sangtec Molecular Diagnostics Ab Method of assaying of nucleic acids, a reagent combination and kit therefore
IL85551A0 (en) 1987-04-01 1988-08-31 Miles Inc Rapid hybridization assay and reagent system used therein
CA1317535C (en) 1987-06-30 1993-05-11 Nanibhushan Dattagupta Assay of sequences using amplified genes
FI80476C (en) 1987-10-09 1990-06-11 Orion Yhtymae Oy Improved hybridization process, used in the process and reagent packaging
GB8810400D0 (en) * 1988-05-03 1988-06-08 Southern E Analysing polynucleotide sequences
IE61148B1 (en) 1988-03-10 1994-10-05 Ici Plc Method of detecting nucleotide sequences
ATE108491T1 (en) 1988-03-18 1994-07-15 Baylor College Medicine MUTATION DETECTION BY COMPETITIVE OLIGONUCLEOTIDE PRIMING.
SE8801070D0 (en) 1988-03-23 1988-03-23 Pharmacia Ab METHOD FOR IMMOBILIZING A DNA SEQUENCE ON A SOLID SUPPORT
US5002867A (en) 1988-04-25 1991-03-26 Macevicz Stephen C Nucleic acid sequence determination by multiple mixed oligonucleotide probes
AU3694689A (en) 1988-04-28 1989-11-24 Mark H. Skolnick Amplified sequence polymorphisms (asps)
US4962020A (en) 1988-07-12 1990-10-09 President And Fellows Of Harvard College DNA sequencing
EP0425563B1 (en) 1988-07-20 1996-05-15 David Segev Process for amplifying and detecting nucleic acid sequences
AU629845B2 (en) 1988-08-30 1992-10-15 Abbott Laboratories Detection and amplification of target nucleic acid sequences
GB8822228D0 (en) 1988-09-21 1988-10-26 Southern E M Support-bound oligonucleotides
CA2002076A1 (en) 1988-11-21 1990-05-21 Brent A. Burdick Diagnostic kit and method using a solid phase capture means for detecting nucleic acids
GB8827160D0 (en) 1988-11-21 1988-12-29 Apothekernes Lab Detection & quantitative determination of rna & dna
DK175170B1 (en) 1988-11-29 2004-06-21 Sangtec Molecular Diagnostics Method and Reagent Combination to Determine Nucleotide Sequences
US5856092A (en) 1989-02-13 1999-01-05 Geneco Pty Ltd Detection of a nucleic acid sequence or a change therein
DK0457824T4 (en) 1989-02-13 2004-10-11 Geneco Pty Ltd Detection of a nucleic acid sequence or a change therein
WO1990011372A1 (en) 1989-03-21 1990-10-04 Collaborative Research, Inc. Multiplex dna diagnostic test
US5744101A (en) 1989-06-07 1998-04-28 Affymax Technologies N.V. Photolabile nucleoside protecting groups
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
FR2650840B1 (en) 1989-08-11 1991-11-29 Bertin & Cie RAPID DETECTION AND / OR IDENTIFICATION OF A SINGLE BASED ON A NUCLEIC ACID SEQUENCE, AND ITS APPLICATIONS
IL97222A (en) 1990-02-16 1995-08-31 Orion Yhtymae Oy Method and reagent for determining specific nucleotide variations
US6013431A (en) 1990-02-16 2000-01-11 Molecular Tool, Inc. Method for determining specific nucleotide variations by primer extension in the presence of mixture of labeled nucleotides and terminators
US5200314A (en) 1990-03-23 1993-04-06 Chiron Corporation Polynucleotide capture assay employing in vitro amplification
US5846710A (en) 1990-11-02 1998-12-08 St. Louis University Method for the detection of genetic diseases and gene sequence variations by single nucleotide primer extension
US6004744A (en) 1991-03-05 1999-12-21 Molecular Tool, Inc. Method for determining nucleotide identity through extension of immobilized primer
US5888819A (en) 1991-03-05 1999-03-30 Molecular Tool, Inc. Method for determining nucleotide identity through primer extension
WO1992016657A1 (en) 1991-03-13 1992-10-01 E.I. Du Pont De Nemours And Company Method of identifying a nucleotide present at a defined position in a nucleic acid
US5605662A (en) 1993-11-01 1997-02-25 Nanogen, Inc. Active programmable electronic devices for molecular biological analysis and diagnostics
US6153379A (en) 1993-06-22 2000-11-28 Baylor College Of Medicine Parallel primer extension approach to nucleic acid sequence analysis
SE501439C2 (en) 1993-06-22 1995-02-13 Pharmacia Lkb Biotech Method and apparatus for analyzing polynucleotide sequences
US6156501A (en) 1993-10-26 2000-12-05 Affymetrix, Inc. Arrays of modified nucleic acid probes and methods of use
US5429807A (en) 1993-10-28 1995-07-04 Beckman Instruments, Inc. Method and apparatus for creating biopolymer arrays on a solid support surface
US5846719A (en) 1994-10-13 1998-12-08 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US5763263A (en) 1995-11-27 1998-06-09 Dehlinger; Peter J. Method and apparatus for producing position addressable combinatorial libraries

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7645596B2 (en) 1998-05-01 2010-01-12 Arizona Board Of Regents Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US10214774B2 (en) 1998-05-01 2019-02-26 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US10208341B2 (en) 1998-05-01 2019-02-19 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9957561B2 (en) 1998-05-01 2018-05-01 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9725764B2 (en) 1998-05-01 2017-08-08 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9096898B2 (en) 1998-05-01 2015-08-04 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9212393B2 (en) 1998-05-01 2015-12-15 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9458500B2 (en) 1998-05-01 2016-10-04 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9540689B2 (en) 1998-05-01 2017-01-10 Life Technologies Corporation Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US9657344B2 (en) 2003-11-12 2017-05-23 Fluidigm Corporation Short cycle methods for sequencing polynucleotides
US9012144B2 (en) 2003-11-12 2015-04-21 Fluidigm Corporation Short cycle methods for sequencing polynucleotides
US7897345B2 (en) 2003-11-12 2011-03-01 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
US7981604B2 (en) 2004-02-19 2011-07-19 California Institute Of Technology Methods and kits for analyzing polynucleotide sequences
US9868978B2 (en) 2005-08-26 2018-01-16 Fluidigm Corporation Single molecule sequencing of captured nucleic acids
US7666593B2 (en) 2005-08-26 2010-02-23 Helicos Biosciences Corporation Single molecule sequencing of captured nucleic acids

Also Published As

Publication number Publication date
WO1997035033A1 (en) 1997-09-25
AU2320597A (en) 1997-10-10
US6294336B1 (en) 2001-09-25

Similar Documents

Publication Publication Date Title
US6294336B1 (en) Method for analyzing the nucleotide sequence of a polynucleotide by oligonucleotide extension on an array
EP0820524B1 (en) Detecting dna sequence variations
CN111032881B (en) Accurate and large-scale parallel quantification of nucleic acids
US6972174B2 (en) Method for detecting single nucleotide polymorphisms (SNP's) and point mutations
US8975019B2 (en) Deducing exon connectivity by RNA-templated DNA ligation/sequencing
US20020094525A1 (en) Methods for the detection of multiple single nucleotide polymorphisms in a single reaction
AU8162498A (en) Methods for the detection of multiple single nucleotide polymorphisms in a single reaction
US20110003301A1 (en) Methods for detecting genetic variations in dna samples
EP1256632A2 (en) High throughput polymorphism screening
CA2295996A1 (en) High-throughput screening method for identification of genetic mutations or disease-causing microorganisms using segmented primers
WO1992016657A1 (en) Method of identifying a nucleotide present at a defined position in a nucleic acid
US6692915B1 (en) Sequencing a polynucleotide on a generic chip
US20040126765A1 (en) Method and compositions for sequencing nucleic acid molecules
EP4060050B1 (en) Highly sensitive methods for accurate parallel quantification of nucleic acids
EP4060049B1 (en) Methods for accurate parallel quantification of nucleic acids in dilute or non-purified samples
AU6676898A (en) Nucleic acid analysis methods
US20030077584A1 (en) Methods and compositons for bi-directional polymorphism detection
JP3499795B2 (en) Gene analysis method
US20100285970A1 (en) Methods of sequencing nucleic acids
Galbiati et al. High‐throughput mutational screening for beta‐thalassemia by single‐nucleotide extension
EP4332235A1 (en) Highly sensitive methods for accurate parallel quantification of variant nucleic acids
Ronaghi et al. Single nucleotide polymorphisms: discovery, detection and analysis
Park et al. DNA Microarray‐Based Technologies to Genotype Single Nucleotide Polymorphisms
JP2024035109A (en) Methods for accurate parallel detection and quantification of nucleic acids
Li et al. Tag/anti-tag liquid-phase primer extension array: a flexible and versatile genotyping platform

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION