WO1992014844A1 - Consensus sequence primed polymerase chain reaction method for fingerprinting genomes - Google Patents

Consensus sequence primed polymerase chain reaction method for fingerprinting genomes

Info

Publication number
WO1992014844A1
Authority
WO
WIPO (PCT)
Prior art keywords
primer
pcr
trna
species
sequence
Prior art date
Application number
PCT/US1992/001491
Other languages
French (fr)
Inventor
Michael Mcclelland
John T. Welsh
Original Assignee
California Institute Of Biological Research
Priority date
Filing date
Publication date
Application filed by California Institute Of Biological Research
Publication of WO1992014844A1

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • C12Q1/6858Allele-specific amplification
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • C12Q1/686Polymerase chain reaction [PCR]
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria

Definitions

  • This invention is directed toward a method of identifying segments of nucleic acid characteristic of a particular genome by generating a set of discrete DNA amplification products characteristic of the genome.
  • This set of discrete DNA products can generate a fingerprint that can be used to identify the genome.
  • RNA sequences (Woese, in "Evolution in Procaryotes" (Schleifer and Stackebrandt, Eds., Academic Press, London, 1986))
  • strain-specific fluorescent oligonucleotides (DeLong et al., Science, 243, 1360-1363 (1989); Amann et al., J. Bact., 172, 762-770 (1990))
  • PCR polymerase chain reaction
  • DNA markers genetically linked to a selected trait can be used for diagnostic procedures.
  • the DNA markers commonly used are restriction fragment length polymorphisms (RFLPs).
  • Polymorphisms useful in genetic mapping are those polymorphisms that segregate in populations.
  • RFLPs have been detected by hybridization methodology (e.g., Southern blot), but such techniques are time-consuming and inefficient.
  • Alternative methods include assays for polymorphisms using PCR.
  • the PCR method allows amplification of a selected region of DNA by providing two DNA primers, each of which is complementary to a portion of one strand within the selected region of DNA. These primers are used to hybridize to the separated strands within the region of DNA sought to be amplified, forming DNA molecules that are partially single-stranded and partially double-stranded. The double-stranded regions are then extended by the action of DNA polymerase, forming completely double-stranded molecules. These double-stranded molecules are then denatured and the denatured single strands are rehybridized to the primers. Repetition of this process through a number of cycles results in the generation of DNA strands that correspond in sequence to the region between the originally used primers.
  • PCR primer pairs can be used to identify genes characteristic of a particular species or even strain. PCR also obviates the need for cloning in order to compare the sequences of genes from related organisms, allowing the very rapid construction of phylogenies based on DNA sequence. For epidemiological purposes, specific primers to informative pathogenic features can be used in conjunction with PCR to identify pathogenic organisms.
  • PCR is a very powerful method for amplifying DNA
  • conventional PCR procedures require the use of at least two separate primers complementary to specific regions of the genome to be amplified. This requirement means that primers cannot be prepared unless the target DNA sequence information is available, and the primers must be "custom built" for each location within the genome of each species or strain whose DNA is to be amplified.
  • the neurofibromatosis gene is a recent example of this strategy (Xu et al., Cell 62:599-608 (1990)).
  • the genetic map is a useful framework upon which to assemble partially completed arrays of clones. In the short term, it is likely that arrays of human genomic clones such as cosmids or yeast artificial chromosomes (YACs, Burke et al., Science. 236:806-812 (1987)) will form disconnected contigs that can be oriented relative to each other with probes that are on the genetic map or the in situ map (Lichter et al..
  • VNTRs Variable Number Tandem Repeats
  • VNTR consensus sequences may be used to display a fingerprint.
  • VNTR fingerprints have been used to assign polymorphisms in the mouse (Julier et al., Proc. Natl. Acad. Sci. USA. 87:4585-4589 (1990)), but these polymorphisms must be cloned to be of use in application to restriction mapping or contig assembly.
  • VNTR probes are useful in the mouse because a large number of crosses are likely to be informative at a particular position.
  • the mouse offers the opportunity to map in interspecific crosses which have a high level of polymorphism relative to most other inbred lines.
  • a dense genetic map of DNA markers would facilitate cloning genes that have been mapped genetically in the mouse. Cloning such genes would be aided by the identification of very closely linked DNA polymorphisms. About 3000 mapped DNA polymorphisms are needed to provide a good probability of one polymorphism being within 500 kb of the gene. To place so many DNA markers on the map it is desirable to have a fast and cost-effective genetic mapping strategy.
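The figure of about 3000 markers can be checked with a rough calculation. The sketch below assumes that markers fall uniformly at random on a genome of 3 x 10^9 base pairs (an assumed round figure for a mammalian genome) and asks how likely it is that at least one marker lands within 500 kb on either side of a gene. The function name and the uniform-placement model are illustrative assumptions, not part of the described method.

```python
def prob_marker_within(n_markers, window_bp, genome_bp):
    """Probability that at least one of n_markers uniformly placed markers
    falls within window_bp on either side of a fixed locus."""
    # Fraction of the genome covered by the two-sided window around the gene.
    hit = min(1.0, 2 * window_bp / genome_bp)
    # Complement of "every marker misses the window".
    return 1.0 - (1.0 - hit) ** n_markers


# Assumed inputs: 3000 markers, 500 kb window, 3 x 10^9 bp genome.
p = prob_marker_within(n_markers=3000, window_bp=500_000, genome_bp=3_000_000_000)
print(f"{p:.3f}")
```

Under these assumptions the probability comes out at roughly 0.63, which is consistent with the description of 3000 markers giving "a good probability" of a closely linked polymorphism.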
  • consensus sequence primed polymerase chain reaction provides a distinctive variation of the PCR technique by employing "consensus" sequence polynucleotide primers as defined herein.
  • CP-PCR consensus sequence primed polymerase chain reaction
  • the CP-PCR method is suitable for the rapid identification and classification of organisms throughout the plant, prokaryotic or eukaryotic kingdoms and for the generation of polymorphisms suitable for genetic mapping of eukaryotes. Only a small sample of biological material is needed, and knowledge of the target DNA sequence to be identified is not required. In addition, reagents specific for a given species are not required.
  • CP-PCR is a method for generating a set of discrete DNA products ("amplification products") characteristic of a genome by priming target nucleic acid obtained from a genome with at least one single-stranded primer to form primed nucleic acid.
  • the primed nucleic acid is then amplified by performing at least one cycle, and preferably at least 10 cycles, of polymerase chain reaction (PCR) amplification to generate a set of discrete DNA amplification products characteristic of the genome.
  • the genome to which the CP-PCR method is applied can be a viral genome; a bacterial genome, including Staphylococcus and Streptococcus; a plant genome, including rice, maize, or soybean; or an animal genome, including a human genome. It can also be a genome of a cultured cell line.
  • the cultured cell line can be a chimeric cell line with at least one human chromosome in a non-human background i.e., a hybrid cell line.
  • the CP-PCR method can be used to identify an organism as a species of a genus of bacteria, for example, Staphylococcus, from a number of different species. Similarly, the method can be used to determine the strain to which an isolate of the genus Streptococcus belongs, by comparing the DNA amplification products produced by CP-PCR for the isolate to the patterns produced from known strains with the same primer.
  • the CP-PCR method can also be used to verify the assignment of a bacterial isolate to a species by comparing the CP-PCR fingerprint from the isolate with the CP-PCR fingerprints produced by known bacterial species with the same primer.
  • the primer is chosen as described herein to maximize interspecific difference of the discrete DNA amplification products.
  • the target nucleic acid of the genome can be DNA, RNA or polynucleotide molecules. If the CP-PCR method is used to characterize RNA, the method also preferably includes the step of extending the primed RNA with an enzyme having reverse transcriptase activity to produce a hybrid DNA-RNA molecule, and priming the DNA of the hybrid with an arbitrary single-stranded primer.
  • the enzyme with reverse transcriptase activity can be avian myeloblastosis virus reverse transcriptase or Moloney leukemia virus reverse transcriptase.
  • the discrete DNA amplification products produced by the CP-PCR method can be manipulated in a number of ways. For example, they can be separated in a medium capable of separating DNA fragments by size, such as a polyacrylamide or agarose gel, in order to produce a fingerprint of the amplification products as separated bands. Additionally, at least one separated band can be isolated from the fingerprint and reamplified by conventional PCR. The isolated separated band can also be cleaved with a restriction endonuclease. The reamplified fragments can then be isolated and cloned in a bacterial host. The isolated band or reamplified fragments can be sequenced.
  • Consensus primers, particularly structural RNA consensus primers, are also contemplated, as are kits containing the primers in combination with control genomic DNA for typing isolated genomes.
  • Figure 1 shows the CP-PCR patterns produced by using isolates representing five different species of Staphylococcus, and illustrates the differences apparent between species, as described in Example 2.
  • PCR was performed using the primers T5A in group a, T3A in group b, or T5A plus T3A in group c, at 50°C.
  • Each numbered lane consists of three adjacent lanes having 80, 16 or 3.2 ng of template.
  • Lane 1: S. haemolyticus CC12J2.
  • Lane 2: S. hominis 27844.
  • Lane 3: S. warneri CPB10E2.
  • Lane 4: S. cohnii JL143.
  • Lane 5: S. aureus ISP-8.
  • Figure 2 shows the CP-PCR patterns produced by using forty strains of bacteria from three different genera, and illustrates the differences detectable between the strains and the general similarity of the patterns from the same species, as described in Example 2.
  • PCR was performed using the primers T5A plus T3A at 50°C with 100 ng of template.
  • the templates in lanes 1 to 17 contain Streptococcus DNAs.
  • Lanes 18 and 19 contain Enterococcus DNAs.
  • Lanes 20 to 40 contain Staphylococcus DNAs. See Table 1 for the strains used in each lane.
  • Figure 3 shows the CP-PCR patterns produced by using genomes from species across the three kingdoms and illustrates the existence of polymorphisms, as described in Example 2.
  • the reaction was performed using 50 ng of template under the standard PCR conditions.
  • the low temperature annealing step was 50°C.
  • Lanes 1 to 9 used the primers T5A and T3A.
  • Lanes 10 to 19 used T5B and T3A. See Table 3 for the strains used in each lane.
  • Figure 4 shows fingerprints generated by tDNA-PCR in various Staphylococcus species. Strains from five species of Staphylococcus, listed in Table 3, were fingerprinted. Examples from each species are shown. PCR was performed using the primers T5A and T3B at 50°C with 1 ng of template. The products were resolved on a 5% acrylamide, 50% urea gel. The arrow on the right indicates the region of the gel from which products discussed in this paper were recovered for sequencing.
  • Lanes 1 and 2: Strains ATCC 27844 (type strain) and ATCC 27846 of Staphylococcus hominis, both isolated from humans. Lanes 3 and 4: Strains CPB10E2 and GAD473 of S. warneri, isolated from a Cercopithecus monkey and bush baby, respectively.
  • Lanes 5 and 6: Strains ATCC 8432 and ATCC 15564 of S. aureus, isolated from a bird and a phage typing strain isolated from humans, respectively.
  • Lanes 7 and 8: Strains CC12J2 and PAY9F2 of S. haemolyticus, isolated from a mangabey and chimpanzee, respectively.
  • Lane 9: Strain JL143 of S. cohnii subspecies urealyticum, isolated from a human.
  • Figure 5 shows the sequence of the tRNA(iMet)-tRNA(Asp)-tRNA(Phe) region from three species of Staphylococcus. The sequences shown are from the polymorphisms detected in Figure 4. All genes were sequenced from a minimum of two separate clones and in both strands. The coding strand from one region of a larger clone is shown in each case. Intergenic spacers are in bold. The sequences are compared to the equivalent genes in Bacillus subtilis (Vold, Structure and Organization of Genes for Transfer Ribonucleic Acid in Bacillus subtilis.
  • Figure 6 shows the results of PCR amplifying the tRNA(iMet)-tRNA(Phe) intergenic spacer from five species of Staphylococcus. Strains from five species of Staphylococcus, listed in Table 3, were investigated. Examples from each species are shown.
  • panel A: high stringency PCR was performed using the primers StaphiMet3 and StaphAsp5 with 1 ng of template DNA and 1,000 ng of human DNA.
  • the primers were StaphAsp3 and StaphPhe5.
  • Lanes 1 and 2: Strains ATCC 27844 (type strain) and ATCC 27846 of Staphylococcus hominis, both isolated from humans.
  • Lanes 3 and 4: Strains CPB10E2 and GAD473 of S. warneri, isolated from a Cercopithecus monkey and bush baby, respectively.
  • Lanes 5 and 6: Strains ATCC 6538 and Sau3A of S. aureus, both isolated from humans.
  • Lanes 7 and 8: Strains AW263 and ATCC 29970 (type strain) of S. haemolyticus, both isolated from humans.
  • Lane 9: Strain CM89 of S. cohnii subsp. urealyticum, isolated from a human.
  • This invention relates to a method for generating a set of discrete DNA amplification products characteristic of a genome.
  • This set of discrete DNA amplification products can be resolved by techniques such as gel electrophoresis, producing a distinctive pattern, known as a "fingerprint", that can be used to identify the genome.
  • This method uses a distinctive and novel variation of the polymerase chain reaction (PCR) technique that employs one or more consensus primers based on a consensus sequence described herein and is therefore designated the "consensus sequence primed polymerase chain reaction" ("CP-PCR") method.
  • the method of the invention involves the following steps:
  • the amplification products produced by the invention can be used to assemble genetic maps for genome analysis. Each of these steps is discussed in detail below.
  • the method of the present invention is particularly well suited to the generation of discrete DNA amplification products from nucleic acids obtained from genomes of all sizes from 5 x 10^4 nucleotide bases (viruses) to 3 x 10^9 bases and greater (animals and plants).
  • Nucleic acids as that term is used herein means that class of molecules including single-stranded and double-stranded deoxyribonucleic acid (DNA), ribonucleic acid (RNA) and polynucleotides.
  • the CP-PCR method can be applied to such economically important plants as rice, maize, and soybean. It can also be applied to the human genome and to the genome of a cultured cell line.
  • the cultured cell line can be chimeric with at least one human chromosome in an otherwise non-human background.
  • the non-human background can be rodent, such as mouse or Chinese hamster.
  • the DNA amplification products can be used to determine that an unidentified sample of an organism such as from a bacterium belongs to the genus Staphylococcus and can be used further to determine to which species and/or strain of that genus the organism belongs.
  • Genomic DNA is used in an art recognized manner to refer to a population of DNA that comprises the complete genetic component of a species.
  • genomic DNA comprises the complete set of genes present in a preselected species.
  • the complete set of genes in a species is also referred to as a genome.
  • genomic DNA can vary in complexity, and in number of nucleic acid molecules. In higher organisms, genomic DNA is organized into discrete nucleic acid molecules (chromosomes).
  • a genome is significantly less complex than for a species high in the evolutionary scale.
  • E. coli is estimated to contain approximately 2.4 x 10^9 grams per mole of haploid genome
  • man contains about 7.4 x 10^12 grams per mole of haploid genome.
  • Genomic DNA is typically prepared by bulk isolation of the total population of high molecular weight nucleic acid molecules present in a biological material derived from a single member of a species. Genomic DNA can be prepared from a tissue sample, from a whole organism or from a sample of cells derived from the organism.
  • Exemplary biological materials for preparing mammalian genomic DNA include a sample of blood, muscle or skin cells, tissue biopsy or cells cultured from tissue. Methods for isolating high molecular weight DNA are well known. See, for example, Maniatis et al., in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y. (1982); and U.S. Patent No. 4,800,159 to Mullis et al.
  • Rendering the nucleic acids of the genome accessible to priming requires that the nucleic acids be available for base-pairing by primers and that DNA polymerases and other enzymes that act on the primer-template complex can do so without interference.
  • the nucleic acids must be substantially free of protein that would interfere with priming or the PCR process, especially active nuclease, as well as being substantially free of nonprotein inhibitors of polymerase action such as heavy metals.
  • nucleic acids in a condition accessible to priming.
  • methods involve treatment of cells or other nucleic acid-containing structures, such as virus particles, with a protease such as proteinase K or pronase and a strong detergent such as sodium dodecyl sulfate (“SDS”) or sodium lauryl sarcosinate (“Sarkosyl”) to lyse the cells.
  • This nucleic acid is then precipitated with ethanol and redissolved as needed. (See Example 1, infra).
  • a small portion (~0.5 mm^2) of a single bacterial colony can be removed with a 200-µL automatic pipette tip and suspended in 5 µL of TE (0.01 M Tris-HCl, pH 8.0, 1 mM EDTA) in a plastic microfuge tube and boiled for 5 minutes. After the sample is boiled, the debris is pelleted by centrifugation. The CP-PCR method can then be performed directly on the nucleic acids present in the supernatant sample after appropriate dilution.
  • a primer for use in this invention is a consensus sequence polynucleotide primer, or consensus primer.
  • a consensus primer is a polynucleotide having a nucleotide sequence that comprises a region at its
  • the related genes from which a consensus primer is derived are a class of genes that occur as a cluster within the genome.
  • Clusters of related genes are known to occur for a variety of gene families, any of which are suitable as a source of related genes for deriving a consensus primer for use in the present invention.
  • Gene clusters are regions of a genome in which related genes are organized within a single nucleic acid molecule of the genome, i.e., are genetically linked. Gene clusters comprise two domains: (1) the nucleotide sequences that define each of the related genes that are members of the cluster, and (2) the nucleotide sequences that define the spacer region between each member of the related genes of the cluster.
  • the spacer region of a gene cluster is more variable in nucleotide sequence than the nucleotide sequence defining a member of a cluster.
  • Variability in the spacer regions of a gene cluster provides the polymorphisms that produce a fingerprint by the present methods which is characteristic of the organism being analyzed.
  • Variability in spacer regions can be manifest by differences in actual sequence, by differences in spacer length between members of the cluster, and even by differences in overall organization of the members of a cluster.
  • Organization of related genes in a cluster can vary both in the linear order of the members of the cluster on the nucleic acid molecule defining the cluster and in the orientation of each member of the cluster relative to one another.
  • Typical and preferred gene clusters are the structural RNA families, namely the family of genes that encode transfer RNA (tRNA) molecules, and the family of genes that encode ribosomal RNA (rRNA) molecules known as 28S, 16S and 5S rRNAs.
  • Other gene clusters are the linked genetic elements of an operon.
  • the tRNA gene cluster is particularly preferred and is exemplary of the general methods described herein.
  • Although the sequence of the primer can vary widely, so long as it comprises a consensus sequence at its 3' terminus, some guidelines to primer selection are found in Innis and Gelfand, "Optimization of PCRs," in PCR Protocols: A Guide to Methods and Applications (M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White, eds., Academic Press, New York, 1990), pp. 3-12, incorporated herein by this reference.
  • the primer typically has 50 to 60% G + C composition and is free of runs of three or more consecutive C's or G's at the 3'-end or of palindromic sequences, although having a (G + C)-rich region near the 3'-end may be desirable.
  • These guidelines are general and intended to be nonlimiting. Additionally, in many applications it is desirable to avoid primers with a T at the 3' end because such primers can prime relatively efficiently at mismatches, creating a degree of mismatching greater than desired, and affect the background amplification.
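As a rough illustration, the general guidelines above can be turned into a simple screening function. This is a minimal sketch under stated assumptions: the function name `primer_warnings` is hypothetical, "runs of three or more consecutive C's or G's at the 3'-end" is read as "the last three bases are all C or G", and "palindromic" is simplified to "the primer equals its own reverse complement". The example sequences are invented for the demonstration and are not primers from this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_warnings(primer):
    """Return a list of guideline violations for a primer written 5'->3'."""
    primer = primer.upper()
    warnings = []
    gc = sum(base in "GC" for base in primer) / len(primer)
    if not 0.50 <= gc <= 0.60:
        warnings.append("G+C content outside 50-60%")
    # Assumed reading of the guideline: the last three bases are all C or G.
    if len(primer) >= 3 and all(base in "GC" for base in primer[-3:]):
        warnings.append("run of three C/G at 3' end")
    # Simplified palindrome test: the primer is its own reverse complement.
    if primer == reverse_complement(primer):
        warnings.append("palindromic sequence")
    if primer.endswith("T"):
        warnings.append("T at 3' end")
    return warnings


# Hypothetical sequences, for illustration only.
print(primer_warnings("ACGTTGCAGCATCGGATCAG"))  # passes every check
print(primer_warnings("AATTAATTAATT"))          # fails three checks
```

Because the guidelines are explicitly nonlimiting, a screen like this is best treated as a list of warnings rather than a hard filter.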
  • the CP-PCR method is based on the rationale that for any preselected gene cluster, which comprises at least two related and genetically linked genes, there is a spacer region between the linked genes which is variable and contains nucleotide sequence differences when compared to the same region from a different subspecies, species, genus, family or other evolutionary division of organisms having members of the gene cluster.
  • the consensus primer is selected to amplify one or more specific primer extension products that contain the nucleotide sequence of one or more of the spacer regions between two genes of a cluster.
  • the consensus primer amplifies DNA segments containing spacer regions because the primer is selected to provide a 3' terminus for primer extension that "points" the direction of primer extension across the spacer region.
  • the consensus sequence of the primer is selected so that there is sufficient homology with a consensus region within the individual members of a gene cluster that the primers can be expected to anneal to many consensus sequences contained within a variety of the members of a gene cluster. Some of these will be within a few hundred base pairs of each other and on opposite strands, thereby satisfying the requirements for PCR amplification. Thus, the sequences between these consensus sequence positions will be PCR amplifiable. The extent to which sequences amplify will depend on the efficiency of priming at each pair of primer annealing sites.
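The requirement that two annealing sites lie close together on opposite strands can be sketched in code. The example below is a simplified model, not the method itself: it looks only for exact matches (real consensus priming tolerates mismatches), it ignores priming efficiency, and the function names, the `max_len` cutoff, and the toy sequences are all illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return start indices of every (possibly overlapping) exact match."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def predicted_products(top_strand, fwd_primer, rev_primer, max_len=1000):
    """Sizes of products expected from exact-match priming on opposite strands."""
    fwd_sites = find_all(top_strand, fwd_primer)
    # The reverse primer pairs with the top strand wherever the top strand
    # reads as the primer's reverse complement.
    rev_sites = find_all(top_strand, reverse_complement(rev_primer))
    sizes = []
    for i in fwd_sites:
        for j in rev_sites:
            size = j + len(rev_primer) - i
            # Keep pairs where the reverse site lies downstream and in range.
            if j >= i + len(fwd_primer) and size <= max_len:
                sizes.append(size)
    return sorted(sizes)


# Toy template with two "gene pairs" separated by spacers of 30 and 50 bases.
fwd, rev = "ACGTAC", "TTGCAG"
genome = ("TT" + fwd + "A" * 30 + reverse_complement(rev) + "CC"
          + fwd + "G" * 50 + reverse_complement(rev))
print(predicted_products(genome, fwd, rev, max_len=100))  # [42, 62]
```

The two product sizes differ only because the spacers differ in length, which is the kind of variation a fingerprint is meant to display.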
  • Because the sequence of the primer is selected to contain some degree of homology with a consensus sequence with respect to the target nucleic acid sequence of the genome, a substantial degree of hybridization between the DNA strands of the primer and the target nucleic acids of the genome is expected to occur.
  • “Substantial degree of hybridization” is defined herein to mean in the context of a primer extension reaction thermocycle in which primer annealing occurs, that the hybridizing conditions favor annealing of homologous nucleotide sequences under "high stringency conditions".
  • the hybridizing conditions can be carried out under low stringency or intermediate stringency conditions so that up to 10% of the nucleotide bases of a primer sequence are paired with inappropriate non-complementary bases in the target nucleic acid, e.g., a guanine base in the primer is paired with an adenine base in the target nucleic acid.
  • the phrase "internal mismatching" in its various grammatical forms refers to non-complementary nucleotide bases in the primer, relative to a template to which it is hybridized, that occur between the 5'-terminal most and 3'-terminal most bases of the primer that are complementary to the template. Thus, 5'-terminal and/or 3'-terminal non-complementary bases are not "internally mismatched" bases.
  • a "substantial degree of internal mismatching" is such that at least 6.5% of the nucleotide bases of the primer sequence are paired with inappropriate bases in the target nucleic acid.
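The definition of internal mismatching can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the target site is written in the same sense as the primer so that equal letters stand for correctly paired bases, and the example sequences are invented.

```python
def internal_mismatch_fraction(primer, site):
    """Fraction of primer bases that are internal mismatches.

    `site` is the target sequence written in the same sense as the primer,
    so equal letters at a position mean the primer base pairs correctly.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    matched = [i for i, (p, s) in enumerate(zip(primer, site)) if p == s]
    if not matched:
        return 0.0  # no paired bases, so nothing lies between paired ends
    first, last = matched[0], matched[-1]
    # Count only mismatches between the outermost correctly paired bases.
    internal = sum(primer[i] != site[i] for i in range(first, last + 1))
    return internal / len(primer)


def is_substantial(primer, site, threshold=0.065):
    """True if internal mismatching reaches the 6.5% threshold."""
    return internal_mismatch_fraction(primer, site) >= threshold


primer = "ACGTACGTACGTACGTACGT"  # hypothetical 20-mer
# One terminal (5') mismatch plus one internal mismatch.
site_a = "T" + primer[1:10] + "C" + primer[11:]
# Two internal mismatches.
site_b = primer[:5] + "T" + primer[6:10] + "C" + primer[11:]
print(internal_mismatch_fraction(primer, site_a), is_substantial(primer, site_a))
print(internal_mismatch_fraction(primer, site_b), is_substantial(primer, site_b))
```

For a 20-base primer, one internal mismatch is 5% and falls below the threshold, while two internal mismatches are 10% and exceed it; the terminal mismatch in the first case is not counted.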
  • the genome may be primed with a single consensus primer, a combination of two or more primers or a mixture of heterogeneous primers, each individual primer in the mixture having a different, but related sequence.
  • the consensus primer is about 10 to about 50 nucleotide bases long, and more preferably, about 17 to about 40 bases long.
  • the primer can be of any sequence so long as it comprises a consensus sequence as defined herein.
  • the primer can have sequence redundancies reducing the occurrence of mismatches.
  • both the template and the primer are DNA.
  • the template can also be single-stranded RNA molecules, for example messenger RNA, in which case an enzyme with reverse transcriptase activity, such as avian myeloblastosis virus (AMV) reverse transcriptase or Moloney murine leukemia virus (Mo-MLV) reverse transcriptase, is used to generate a hybrid DNA-RNA molecule with an arbitrary primer or a poly T primer. The DNA strand of this hybrid DNA-RNA molecule is then used as the starting material for AP-PCR.
  • the primer can also be a single-stranded ribonucleotide of the appropriate length, which is extended at its 3'-hydroxyl terminus by reverse transcriptase, forming a double-stranded molecule in which one strand is partially DNA and partially RNA.
  • tRNA genes occur in multiple copies dispersed throughout the genome in most species and tend to be clustered. McBride et al., Genomics, 5:561-573 (1989). In E. coli, there are estimated to be at least 100 tRNA genes, of which about 30 are mapped. Jinks-Robertson et al., in "E. coli and S. typhimurium", Neidhardt, ed., ASM Press, Washington, pp. 1358-1385 (1989).
  • About half of the mapped genes are in seven clusters of two to seven genes per cluster, with the spacing between genes ranging from 10 to 200 base pairs.
  • the genes are generally arranged in a head to tail fashion and are, at least in some cases, organized into operons.
  • In Bacillus subtilis, Photobacterium phosphoreum and Spiroplasma, the genes are more tightly clustered. See Vold, Microbiol. Rev., 49:71-90 (1985); Giroux et al., J. Bacteriol., 170:5601-5606 (1988); and Rogers et al., Israel J. Med. Sci., 20:768-772 (1984), respectively.
  • In B. subtilis there are two main clusters consisting of 16 and 21 tRNA genes.
  • One operon in P. phosphoreum has eight genes and five tRNA Pro pseudogenes, all in less than 1,500 base pairs.
  • In the human nuclear genome there are estimated to be 1300 tRNA genes and a large number of tRNA pseudogenes. At least seven clusters are on seven different chromosomes. McBride et al., Genomics, 5:561-573 (1989).
  • the tRNA genes are not in operons, being oriented in all possible directions within clusters.
  • Organelle genomes are much smaller than nuclear genomes but nevertheless encode tRNA genes for a more redundant genetic code.
  • the very small (circa 16,000 bp) animal mitochondrial DNA has 22 tRNA genes, some of which are closely spaced. Fox, Ann. Rev. Genet., 21:67-91 (1987).
  • Chloroplast and mitochondrial DNA from plants are generally more complex than those in animals and often have more tRNA genes.
  • the 121,024 base pair chloroplast genome of liverwort has 36 tRNA genes, a few of which are clustered.
  • the yeast mitochondrial genome has at least 25 tRNA genes in 78,000 base pairs. Ohyama et al., Nature, 322:572-574 (1986).
  • Consensus tRNA primers for use in the present methods were developed using the known tRNA sequences available in nucleotide sequence databases. Presently, over 500 tRNA sequences are known.
  • a primer sequence is a consensus sequence if at least five nucleotides at the 3' end of the primer are a perfect match in homology to at least several members of a set of possible members of the family of genes in the cluster, together with extensive homology in the rest of the primer as described herein.
  • the match is to at least 10 percent of the members where the set comprises at least one tRNA gene for each natural amino acid (i.e., a set size of 21 tRNA genes). More preferably, the match is to at least 30 percent, and still more preferably at least 50 percent, of the members of a set comprising 21 members.
  • a consensus sequence contains the most common nucleotide of the set at any given position or the next most common nucleotide of the set that occurs at that same position in excess of 10 percent, and more preferably 30 percent, of the members where the set comprises 21 members.
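By way of illustration only, the column-wise rule described above (the most common nucleotide at each position, or the next most common nucleotide when it occurs in more than 10 percent of the set) can be sketched in Python. The function names, the toy alignment, and the assumption that the sequences are already aligned to equal length are hypothetical and are not taken from the specification.

```python
from collections import Counter


def consensus_columns(aligned, minor_threshold=0.10):
    """For each alignment column, list the most common base, followed by the
    runner-up base if its frequency exceeds minor_threshold.

    Assumption: `aligned` holds pre-aligned sequences of equal length.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n = len(aligned)
    columns = []
    for position in zip(*aligned):
        ranked = Counter(position).most_common()
        allowed = [ranked[0][0]]
        if len(ranked) > 1 and ranked[1][1] / n > minor_threshold:
            allowed.append(ranked[1][0])
        columns.append(allowed)
    return columns


def consensus_string(aligned):
    """Consensus built from the single most common base in each column."""
    return "".join(col[0] for col in consensus_columns(aligned))


# Toy alignment (hypothetical data, not real tRNA sequences).
toy = ["GGAGC", "GGAGT", "GGTGC", "GCAGC"]
print(consensus_string(toy))
print(consensus_columns(toy))
```

Raising `minor_threshold` to 0.30 corresponds to the more preferred 30 percent cutoff mentioned above.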
  • a tRNA consensus sequence can be defined in terms of a subset of tRNA genes, with the objective being to design a primer that produces a fingerprint that is more specific to a particular genome.
  • the consensus can be limited to a family of tRNA genes for a particular amino acid, such as phenylalanine, i.e., tRNA-Phe, where the tRNA- Phe genes are from a single genome, or from a family of related species, or related genera.
  • a consensus sequence was developed by comparing the tRNA genes for a complete set of natural amino acids (21 members) from the genome of the bacteria Bacillus.
  • the consensus although derived from Bacillus, produced consensus primers that would produce fingerprints in species of organisms across the kingdoms including plant, bacteria and animal.
  • the selection of the consensus sequence is not critical to the general method of the present invention, so long as the sequence represents a consensus in the manner defined herein. However, the selection of a particular consensus sequence, and the extent of homology contained in the sequence will alter the particular fingerprint observed for a particular genome.
  • T5A also has at least 10 of the remaining 19 nucleotide bases perfectly matched with those 8 tRNA genes.
  • Consensus primer T5A has the nucleotide base sequence shown in Table 1 and is derived from the complement of a consensus sequence at the 5' terminal region of the Bacillus tRNA genes. Therefore, the consensus primer faces out from a tRNA gene in the 5' to 3' direction across a spacer region located upstream from any tRNA gene having a sequence match to which T5A can hybridize.
  • Consensus primers T5B, T3A and T3B were fashioned in a manner similar to that described above for T5A. Namely, tRNA genes were aligned for all 21 amino acid tRNA genes, and "best match" consensus sequences were developed.
  • T5B like T5A, faces out and upstream from the 5' end of the consensus sequence, whereas T3A and T3B represent consensus sequences at the 3' end of the tRNA gene facing out and downstream from the 3' end of the tRNA gene and will primer extend across the spacer region downstream from a tRNA gene.
  • tRNA consensus primers T5A, T5B, T3A and T3B are shown in Table 1 below and are written in the direction of 5' to 3'.
  • the consensus primers shown on Table 1 are particularly preferred and are used as exemplary of the methods of the present invention.
  • consensus tRNA primers can be designed so long as the resulting sequence contains a minimum of a three base perfect match, and preferably at least a 5 base perfect match, at the 3' end of the consensus primer, together with a minimum of at least a 50 percent match with the next 15 bases of the primer adjacent to the perfect match region of the primer, when the consensus sequence is compared to at least ten of the 21 amino acid tRNA genes for a given genome.
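The design criterion above can be expressed as a simple check, sketched here in Python under stated assumptions: each candidate target site is supplied as a same-strand sequence already aligned to the primer and of equal length, and the helper names and test strings are hypothetical. The defaults follow the minimum values recited above (a three base perfect match at the 3' end, at least 50 percent identity over the next 15 bases, and at least ten matching tRNA genes).

```python
def matches_site(primer, site, perfect_3p=3, window=15, min_fraction=0.5):
    """True if `primer` satisfies the 3'-end rule against one aligned site.

    Assumption: `site` is written 5' to 3' on the same strand as the primer
    and has the same length, so positions compare one to one.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    # The last `perfect_3p` bases must match exactly.
    if primer[-perfect_3p:] != site[-perfect_3p:]:
        return False
    # Identity over the `window` bases immediately 5' of the perfect match.
    end = len(primer) - perfect_3p
    start = max(0, end - window)
    if end == start:
        return True
    identical = sum(a == b for a, b in zip(primer[start:end], site[start:end]))
    return identical / (end - start) >= min_fraction


def is_consensus_primer(primer, sites, min_sites=10, **rule):
    """True if the primer matches at least `min_sites` of the given sites."""
    return sum(matches_site(primer, s, **rule) for s in sites) >= min_sites
```

Passing `perfect_3p=5` applies the preferred five base perfect match instead of the three base minimum.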
  • a consensus tRNA primer can have a nucleotide sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', which sequences are derived from the tRNA primers utilized in Example 2.
  • These consensus primers have a length of at least fifteen nucleotides, and additionally have about 30 percent homology, and preferably at least fifty percent homology, to a consensus tRNA sequence in the fifteen nucleotides located at the 3' terminus of the primer.
  • a consensus primer can also be a nucleotide sequence based on another consensus primer of this invention but being progressively truncated at the 5' end.
  • a single tRNA consensus primer can provide the requisite "pair" of primers for a PCR amplification product to form in those cases where separate primer molecules having the same consensus sequence independently hybridize to opposite polarity strands of the target genome.
  • This independent hybridization is possible for several reasons.
  • the target genome can contain tRNA genes in opposite relative orientations on the genome so that two tRNA genes each provide a consensus sequence such that the resulting hybridized primers "face out" from their respective tRNA genes and at the same time "face toward" each other.
  • both the 5' and 3' ends of a tRNA gene exhibit some degree of homology, which is well known and contributes to the classical "cloverleaf" structure of a tRNA molecule.
  • the same homology that allows for the 5' and 3' ends of a tRNA molecule to self-hybridize is available to hybridize to a consensus primer selected to have homology to both a 3' and 5' end of the tRNA gene.
  • Exemplary of single consensus tRNA primers useful to produce a characteristic fingerprint are the primers T5A and T3A, each described in Example 1.
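The geometry by which a single primer supplies both members of a primer "pair" can be sketched as follows. This is a simplified illustration, not the disclosed procedure: it requires exact matches of the whole primer (CP-PCR itself tolerates mismatches), treats the genome as a single top-strand string, and uses hypothetical function names and a made-up test sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Start indices of every (possibly overlapping) occurrence of pattern."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def single_primer_products(top_strand, primer, max_len=2000):
    """Predicted product lengths when one primer anneals in facing
    orientation on opposite strands (exact matches only, an assumption)."""
    # Primer sequence found on the top strand: extension runs rightward.
    rightward = find_all(top_strand, primer)
    # Reverse complement found on the top strand: extension runs leftward.
    leftward = find_all(top_strand, reverse_complement(primer))
    products = []
    for f in rightward:
        for r in leftward:
            length = r + len(primer) - f
            if r >= f + len(primer) and length <= max_len:
                products.append(length)
    return sorted(products)
```

Two sites in the same orientation yield no product in this model, which is why oppositely oriented genes, or homology between the 5' and 3' ends of the gene, are needed for a single primer to generate a fingerprint.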
  • the tRNA gene cluster is a preferred gene cluster for practicing the fingerprint characterization methods described herein. Due to the nature of evolutionary genetics and the size and organization of tRNA gene clusters, tRNA polymorphisms observed by the present fingerprinting methods are variances due to differences in spacer length and sequence content rather than differences in overall cluster organization.
  • the results using consensus tRNA primers indicate that the fingerprint method of CP-PCR is most useful for identification of a particular genus, although in some cases consensus tRNA primers can be used to distinguish species. See, for example, Figure 1.
  c. Conserved Ribosomal RNA Primers
  • The gene cluster formed by the ribosomal RNA (rRNA) gene family is less complex than the tRNA gene cluster (having fewer members) and is considerably better conserved than the tRNA gene cluster, both at the level of organization of the cluster and at the level of the sequences within the rRNA genes.
  • the rRNA cluster is comprised of three major species, the 28s, the 16s and the 5s rRNA genes.
  • the "consensus" in the case of the rRNA cluster is not a similar nucleotide sequence common to the members of the cluster, as was the case for the tRNA gene family members. Rather, the "consensus" is found between species. For example, the 5' end of the 28s rRNA gene is evolutionarily conserved between all genera across the bacterial kingdom to the extent that consensus primers can be defined from the 28s rRNA gene that produce family specific, order specific and even kingdom specific fingerprints when using consensus rRNA primers in the present invention.
  • polymorphisms observed using consensus rRNA primers are at the level of family, order and higher evolutionary categories than the polymorphisms observed for consensus tRNA primers. This is primarily due to the fact that rRNA gene clusters are smaller and more evolutionarily conserved than tRNA clusters.
  d. Use of Mixtures of Primers
  • primers are constructed to avoid self-priming internally and the creation of artifacts.
  • a heterogeneous mixture of primers may contain some primers that match with the consensus sequences on target nucleic acids in a manner that provides a more distinct fingerprint.
  • the use of such primers may allow the initial priming steps to be performed at a higher temperature (higher stringency) or might allow a consistency of pattern over a wider range of template concentrations.
  • the primers are used simultaneously in the same CP-PCR reaction. These combinations provide a very different pattern from that produced by each primer alone. See, for example, the difference observed in Figure 1 between group a, group b and group c utilizing one, one, and two primers, respectively. Therefore, a combination of primers provides a different fingerprint than is generated by using each individual primer alone. When primers are used in such combinations, only primer pairs that do not produce a primer artifact can be used.
  • a mixture of primers comprises two or more primers where the nucleotide sequences of each primer are substantially identical, except that a few nucleotide bases differ at a single position or at two positions.
  • These mixtures contain consensus primers that each individually exhibit consensus matches with different subsets of members of a gene cluster.
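A mixture of primers that differ at one or two positions can be written compactly as a single degenerate sequence and expanded into its individual members. The sketch below uses the standard IUPAC nucleotide ambiguity codes as a notational convenience; that notation, the function name, and the example sequences are assumptions made for illustration and are not recited in the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer_mixture(degenerate):
    """List every concrete primer encoded by a degenerate primer string."""
    choices = (IUPAC[base] for base in degenerate.upper())
    return ["".join(bases) for bases in product(*choices)]


# One ambiguous position gives a two-member mixture; two give four members.
print(expand_primer_mixture("ACRT"))
print(len(expand_primer_mixture("AYGR")))
```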
  • the use of two primers produces a distinct pattern that is typically more complex and therefore more characteristic of any given genome than the use of either primer alone.
  • the quantity of the nucleic acid genome used in the CP-PCR amplification depends on the complexity of the particular genome used.
  • Simple genomes such as bacterial genomes have a genome size of less than about 5 million base pairs (5 megabases).
  • Complex genomes, such as Oryza sativa species (rice), have a genome size of about 700-1000 megabases.
  • Other complex genomes such as maize or humans have a genome size of about 3000 megabases.
  • the amount of simple genome nucleic acid used as template is from about 10 pg to about 250 ng, preferably from about 30 pg to about 7.5 ng. Most preferred is an amount of simple genome nucleic acid template of about 1 ng.
  • the amount of nucleic acid of a complex genome used as a template is from about 250 ng to about 0.8 ng. More preferably, the amount of nucleic acid of a complex genome used as template is from about 51 ng to about 0.8 ng. Most preferred are amounts of complex genome nucleic acid template of about 50 ng to about 10 ng.
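The dependence of template mass on genome complexity can be made concrete by converting mass into approximate genome copies. The following Python sketch assumes double-stranded DNA with an average mass of about 650 g/mol per base pair, a common approximation that is not stated in the specification; the function name is hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0  # assumed average for double-stranded DNA


def genome_copies(mass_ng, genome_size_bp):
    """Approximate number of genome copies in `mass_ng` nanograms of DNA."""
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (genome_size_bp * GRAMS_PER_MOL_PER_BP)


# 1 ng of a 5 megabase bacterial genome versus 50 ng of a 3000 megabase genome.
print(f"{genome_copies(1, 5e6):.2e}")
print(f"{genome_copies(50, 3e9):.2e}")
```

Under these assumptions, 1 ng of a 5 megabase genome holds on the order of 10^5 genome copies, while 50 ng of a 3000 megabase genome holds on the order of 10^4, which is consistent with larger genomes calling for more template mass.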
  • the priming step is carried out as part of the PCR amplification process, and the conditions under which it is performed are discussed below under "Performance of PCR."
  D. Performance of PCR
  • the present invention utilizes an amplification method where the single-stranded template is hybridized with a primer or primers to form a primer-template hybridization product or products.
  • a hybridization reaction admixture is prepared by admixing effective amounts of a primer, a template nucleic acid and other components compatible with a hybridization reaction. Templates of the present methods can be present in any form, with respect to purity and concentration, compatible with the hybridization reaction.
  • the hybridization reaction mixture is maintained under hybridizing conditions for a time period sufficient for the primer(s) to hybridize to the templates to form a hybridization product, i.e., a complex containing primer and template nucleic acid strands.
  • hybridizing conditions, when used with a maintenance time period, indicates subjecting the hybridization reaction admixture, in the context of the concentrations of reactants and accompanying reagents in the admixture, to time, temperature and pH conditions sufficient to allow the primer(s) to anneal with the template, typically to form a nucleic acid duplex.
  • Such time, temperature and pH conditions required to accomplish hybridization depend, as is well known in the art, on the length of the primer to be hybridized, the degree of complementarity between the primer and the template, the guanine and cytosine content of the polynucleotide, the stringency of hybridization desired, and the presence of salts or additional reagents in the hybridization reaction admixture as may affect the kinetics of hybridization.
  • Methods for optimizing hybridization conditions for a given hybridization reaction admixture are well known in the art.
  • primer refers to a polynucleotide, whether purified from a nucleic acid restriction digest or produced synthetically which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a template is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase, reverse transcriptase and the like, and at a suitable temperature and pH.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the agents for polymerization.
  • the exact lengths of the primers will depend on many factors, including temperature and the source of primer.
  • a polynucleotide primer typically contains from about 8 to about 30 or more nucleotides, although it can contain fewer nucleotides. As few as 8 nucleotides in a polynucleotide primer have been reported as effective for use. Studier et al., Proc. Natl. Acad. Sci. USA, 86:6917-21 (1989). Short primer molecules generally require lower temperatures to form sufficiently stable hybridization complexes with template to initiate primer extension.
  • the primers used herein are selected to be "substantially" complementary to the different strands of each specific sequence to be synthesized or amplified. This means that the primer must contain at its 3' terminus a nucleotide sequence sufficiently complementary to nonrandomly hybridize with its respective template. Therefore, the primer sequence need not reflect the exact sequence of the template.
  • a non-complementary polynucleotide can be attached to the 5' end of the primer, with the remainder of the primer sequence being substantially complementary to the strand.
  • Such noncomplementary polynucleotides might code for an endonuclease restriction site or a site for protein binding.
  • noncomplementary bases or longer sequences can be interspersed into the primer, provided the primer sequence has sufficient complementarity with the sequence of the strand to be synthesized or amplified to non-randomly hybridize therewith and thereby form an extension product under polynucleotide synthesizing conditions.
  • Sommer et al., Nuc. Acid Res., 17:6749 (1989) report that primers having as little as a 3 nucleotide exact match at the 3' end of the primer were capable of specifically initiating primer extension products, although less nonspecific hybridization occurs when the primer contains more nucleotides at the 3' end having exact complementarity with the template sequence.
  • a substantially complementary primer as used herein must contain at its 3' end at least 3 nucleotides having exact complementarity to the template sequence.
  • a substantially complementary primer preferably contains at least 8 nucleotides, more preferably at least 18 nucleotides, and still more preferably at least 24 nucleotides, at its 3' end having the aforementioned complementarity. Still more preferred are primers whose entire nucleotide sequence has exact complementarity with the template sequence.
  • the choice of a primer's nucleotide sequence depends on factors such as the distance from the region coding for the desired specific nucleic acid sequence present in a nucleic acid of interest and its hybridization site on the nucleic acid relative to any second primer to be used.
  • the primer is preferably provided in single-stranded form for maximum efficiency, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide.
  • Primers can be prepared by a variety of methods including de novo chemical synthesis and derivation of nucleic acid fragments from native nucleic acid sequences existing as genes, or parts of genes, in a genome, plasmid, or other vector, such as by restriction endonuclease digest of larger double-stranded nucleic acids and strand separation or by enzymatic synthesis using a nucleic acid template.
  • De novo chemical synthesis of a primer can be conducted using any suitable method, such as, for example, the phosphotriester or phosphodiester methods. See Narang et al., Meth. Enzymol., 68:90 (1979); U.S. Patent No. 4,356,270; Itakura et al., Ann. Rev. Biochem., 53:323-56 (1989); and Brown et al., Meth. Enzymol., 68:109 (1979).
  • Derivation of a primer from nucleic acids involves the cloning of a nucleic acid into an appropriate host by means of a cloning vector, replication of the vector and therefore multiplication of the amount of the cloned nucleic acid, and then the isolation of subfragments of the cloned nucleic acids.
  • for a description of subcloning nucleic acid fragments, see Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, pp. 390-401 (1982); and see U.S. Patents No. 4,416,988 and No. 4,403,036.
  • the primed template is used to produce a strand of nucleic acid having a nucleotide sequence complementary to the template, i.e., a template-complement.
  • the template is subjected to a first primer extension reaction by treating (contacting) the template with a (first) primer.
  • the primer is capable of initiating a primer extension reaction by non-randomly hybridizing (annealing) to a template nucleotide sequence, preferably at least about 8 nucleotides in length and more preferably at least about 20 nucleotides in length. This is accomplished by mixing an effective amount of the primer with the template and an effective amount of nucleic acid synthesis inducing agent to form a primer extension reaction admixture.
  • the admixture is maintained under polynucleotide synthesizing conditions for a time period, which is typically predetermined, sufficient for the formation of a primer extension reaction product.
  • the primer extension reaction is performed using any suitable method.
  • polynucleotide synthesizing conditions are those wherein the reaction occurs in a buffered aqueous solution, preferably at a pH of 7-9, most preferably about 8.
  • a molar excess for genomic nucleic acid, usually about 10^6:1 primer:template
  • a large molar excess is preferred to improve the efficiency of the process.
  • for polynucleotide primers of about 20 to 25 nucleotides in length, a typical ratio is in the range of 50 ng to 1 µg, preferably 250 ng, of primer per 100 ng to 500 ng of mammalian genomic DNA.
  • the deoxyribonucleotide triphosphates (dNTPs) dATP, dCTP, dGTP, and dTTP are also admixed to the primer extension reaction admixture in amounts adequate to support the synthesis of primer extension products; the amounts depend on the size and number of products to be synthesized.
  • the resulting solution is heated to about 90°C-100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period the solution is allowed to cool to room temperature, which is preferable for primer hybridization.
  • an appropriate agent for inducing or catalyzing the primer extension reaction is added, and the reaction is allowed to occur under conditions known in the art.
  • the synthesis reaction may occur at from room temperature up to a temperature above which the inducing agent no longer functions efficiently.
  • the temperature is generally no greater than about 40°C unless the polymerase is heat-stable.
  • the inducing agent may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes.
  • Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase, T7 DNA polymerase, recombinant modified T7 DNA polymerase described by Tabor et al., U.S. Patent Nos. 4,942,130 and 4,946,786, other available DNA polymerases, reverse transcriptase, and other enzymes, including heat-stable enzymes, which will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each nucleic acid strand.
  • Heat-stable DNA polymerases are particularly preferred as they are stable in a most preferred embodiment in which PCR is conducted in a single solution in which the temperature is cycled.
  • Representative heat-stable polymerases are the DNA polymerases isolated from Bacillus stearothermophilus (Bio-Rad), Thermus thermophilus (FINZYME, ATCC #27634), Thermus species (ATCC #31674), Thermus aquaticus strain TV 11518 (ATCC 25105), Sulfolobus acidocaldarius, described by Bukhrashvili et al., Biochim. Biophys. Acta, 1008:102-7 (1989) and by Elie et al., Biochim. Biophys. Acta, 951:261-7 (1988), and Thermus filiformis (ATCC #43280).
  • Taq DNA polymerase available from a variety of sources including Perkin-Elmer-Cetus (Norwalk, CT), Promega (Madison, WI) and Stratagene (La Jolla, CA), and AmpliTaq DNA polymerase, a recombinant Thermus aquaticus Taq DNA polymerase available from Perkin-Elmer-Cetus and described in U.S. Patent No. 4,889,818.
  • the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths. There may be inducing agents, however, which initiate synthesis at the 5' end and proceed in the above direction, using the same process as described above.
  • the primer extension reaction product is then subjected to a second primer extension reaction by treating it with a second polynucleotide primer having a preselected nucleotide sequence.
  • the second primer is capable of initiating the second reaction by hybridizing to a nucleotide sequence, preferably at least about 8 nucleotides in length and more preferably at least about 20 nucleotides in length, found in the first product. This is accomplished by mixing the second primer, preferably a predetermined amount thereof, with the first reaction product, preferably a predetermined amount thereof, to form a second primer extension reaction admixture.
  • the admixture is maintained under polynucleotide synthesizing conditions for a time period, which is typically predetermined, sufficient for the formation of a second primer extension reaction product.
  • the first and second primer extension reactions are the first and second primer extension reactions in a polymerase chain reaction (PCR).
  • PCR is carried out by simultaneously cycling, i.e., performing in one admixture, the above described first and second primer extension reactions, each cycle comprising polynucleotide synthesis followed by separation of the double-stranded polynucleotides formed.
  • PCR is preferably performed using a distinguishable variation of the standard protocol as described in U.S. Patent No. 4,683,192, No. 4,683,202, No. 4,800,159 and No. 4,965,188 to Mullis et al., and No. 4,889,818 to Gelfand et al., and in the Innis & Gelfand reference described above, employing only one primer.
  • the principles of the PCR process have been described under "Background of the Invention," supra.
  • the DNA polymerase used in CP-PCR is the thermostable DNA polymerase purified from Thermus aquaticus and known as Taq I. However, other heat-stable DNA polymerases can be used.
  • a PCR thermocycle is the changing of a PCR admixture from a first temperature to another temperature and then back to the first temperature. That is, it is cycling the temperature of the PCR admixture within (up and down through) a range of temperatures.
  • the change in temperature is not linear with time, but contains periods of slow or no temperature change and periods of rapid temperature change, the former corresponding to, depending on the temperature, a hybridization (annealing), primer extension or denaturation phase, and the latter to temperature transition phases.
  • PCR amplification is performed by repeatedly subjecting the PCR admixture to a PCR temperature gradient where the gradient includes temperatures where the hybridization, primer extension and denaturation reactions occur.
  • Preferred PCR temperature gradients are from about 35°C to about 94°C, from about 40°C to about 94°C, and from about 50°C to about 94°C.
  • cycles of PCR are performed under high stringency conditions using a consensus primer of this invention.
  • these cycles are generally performed so as to have the following phases: 94°C for 30 seconds for denaturation, the high stringency annealing temperature for 30 seconds, and 72°C for 2 minutes for extension.
  • thermostable DNA polymerases can be used, in which case the denaturation, high stringency annealing, and extension temperatures are adjusted according to the thermostability of the particular DNA polymerase.
  • the high stringency annealing temperature is about the melting temperature of the double-stranded DNA formed by annealing, about 35°C to about 65°C, generally greater than about 55°C, and preferably about 60°C.
  • at least one initial cycle of PCR is conducted under low stringency annealing conditions as a first step, followed by a second step which comprises the thermocycles described above under high stringency conditions.
  • the annealing temperature in the second step is greater than the annealing temperature in the first step.
  • the annealing temperature in the second step is lower for shorter primers, because the melting temperature of short double-stranded helices is decreased. Conversely, it is higher for longer primers.
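The relationship between primer length and melting temperature can be illustrated with the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This rule is a widely used approximation for short oligonucleotides and is introduced here only as an illustration; the specification does not name a particular formula, and the function name and test sequences are hypothetical.

```python
def wallace_tm(primer):
    """Estimate melting temperature (deg C) with the Wallace rule:
    2 degrees per A or T plus 4 degrees per G or C.

    Assumption: a rough guide for short oligonucleotides only; it ignores
    salt concentration, mismatches, and nearest-neighbour effects.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Same base composition, different lengths: the shorter primer melts lower.
print(wallace_tm("ACGT" * 2))  # 8-mer
print(wallace_tm("ACGT" * 5))  # 20-mer
```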
  • At least one initial cycle of PCR is performed, starting with the at least one consensus primer and the genomic nucleic acids to be amplified.
  • with Taq I polymerase, the initial cycle(s) of PCR are performed under "low stringency annealing conditions".
  • stringency refers to the degree of mismatch tolerated during hybridization of the primer and template; the higher the stringency, the less mismatch is tolerated.
  • one to five cycles of amplification are performed under these conditions. These cycles are generally performed so as to have the following phases: 94°C for 5 minutes to denature, 5 minutes at the low stringency annealing temperature, and 72°C for 5 minutes for extension. More preferably, one to four or one to three low stringency amplification cycles are performed. Most preferred are one to two low stringency amplification cycles.
  • the low stringency annealing temperature can be from about 30°C to about 55°C, preferably from about 35°C to about 55°C, and more preferably from about 40°C to about 48°C.
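The two-step thermocycling scheme (a few low stringency cycles followed by high stringency cycles) can be laid out as a simple program of hold steps. The sketch below takes the phase temperatures and durations from the description above; the default of 30 high stringency cycles, the particular annealing temperatures chosen, and all names are assumptions for illustration. Ramp times between holds are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    phase: str
    temp_c: float
    seconds: int


def cp_pcr_program(low_cycles=2, low_anneal_c=40.0,
                   high_cycles=30, high_anneal_c=60.0):
    """Build the list of hold steps for a two-step CP-PCR run.

    Assumption: high_cycles=30 is an illustrative value only.
    """
    if not 1 <= low_cycles <= 5:
        raise ValueError("one to five low stringency cycles are described")
    if high_anneal_c <= low_anneal_c:
        raise ValueError("second-step annealing must be at a higher temperature")
    low = [Step("denature", 94.0, 300),
           Step("anneal", low_anneal_c, 300),
           Step("extend", 72.0, 300)]
    high = [Step("denature", 94.0, 30),
            Step("anneal", high_anneal_c, 30),
            Step("extend", 72.0, 120)]
    return low * low_cycles + high * high_cycles


def hold_minutes(program):
    """Total time spent at hold temperatures, in minutes."""
    return sum(step.seconds for step in program) / 60


program = cp_pcr_program()
print(len(program), hold_minutes(program))
```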
  • thermostable DNA polymerase describing the thermostable DNA polymerase of Thermus aquaticus.
  • thermostable DNA polymerases present in any of the thermophilic bacteria is well known and described in U.S. Patent No. 4,889,818.
  • CP-PCR conditions will depend upon the particular thermostable DNA polymerase used for the amplification reaction and are typically optimized for that particular thermostable DNA polymerase.
  • Effective amounts of the primer(s) and target nucleic acid are admixed in an aqueous PCR buffer that includes an effective amount of an inducing agent and an effective amount of each dNTP.
  • the buffer typically contains an effective amount of Taq, 50 mM KCl, 10 mM Tris-HCl, pH 8.4, 4 mM MgCl2, and 100 µg/ml gelatin.
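For preparing a reaction at the final concentrations recited above, the volume of each stock solution follows from C1·V1 = C2·V2. The Python sketch below is illustrative only: the stock concentrations and the 50 µl reaction volume are assumed values that do not appear in the specification, and the names are hypothetical.

```python
# Final concentrations from the description (mM, except gelatin in ug/ml).
FINAL = {"KCl": 50.0, "Tris-HCl pH 8.4": 10.0, "MgCl2": 4.0, "gelatin": 100.0}

# Assumed stock concentrations in the same units (hypothetical values).
ASSUMED_STOCKS = {"KCl": 1000.0, "Tris-HCl pH 8.4": 1000.0,
                  "MgCl2": 100.0, "gelatin": 10000.0}


def stock_volumes_ul(reaction_ul, final=FINAL, stocks=ASSUMED_STOCKS):
    """Volume (ul) of each stock needed for one reaction, from C1*V1 = C2*V2."""
    return {name: final[name] * reaction_ul / stocks[name] for name in final}


# Assumed 50 ul reaction volume.
for name, volume in stock_volumes_ul(50.0).items():
    print(f"{name}: {volume} ul")
```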
  • Each deoxyribonucleoside triphosphate i.e., A, T, G and C
  • the extent to which any particular sequence can be amplified by CP-PCR depends on three general factors: (1) the frequency of priming at flanking sites; (2) the ability of the DNA polymerase used, typically Taq polymerase from Thermus aquaticus, to extend the template completely; and (3) the total number of productive cycles.
  E. Comparison of the DNA Amplification Products With Those Produced From Known Genomes
  • If the object of the performance of the CP-PCR method is to identify the genome from which the discrete products were produced, the DNA amplification products (fingerprints) obtained from a sample are compared with the amplification products resulting from the performance of CP-PCR on nucleic acids isolated from known genera, species, subspecies and/or strains using the same primer or mixture of primers, in separate reactions.
  • the samples selected for comparison depend on the expected identification of the test (isolate) organism of unknown genome.
  • identification of an organism of an unknown bacterial genome can be narrowed down by means of the site of infection or other clinical factors.
  • the presence of a wound infection may suggest that the test organism is a member of the genus Staphylococcus.
  • the unknown organism might be Staphylococcus.
  • various species of Staphylococcus could be screened simultaneously as in a panel of preselected DNA samples, such as S. haemolyticus, S. hominis, S. aureus, S. warneri and S. cohnii; or multiple strains for each species could be used.
  • the unknown organism might be a strain of Streptococcus.
  • the samples selected for comparison are various predetermined and identified strains of Streptococcus. If the unknown organism is a bacterium of enteric origin, various strains of Escherichia, Klebsiella, Enterobacter, Serratia, Salmonella, Shigella, Proteus and Providencia are used. Additional bacterial genera of clinical relevance could also be included in the panel, such as Clostridium or Pseudomonas.
  • CP-PCR can be used effectively to reveal a prior misassignment of a strain. Strains that have been assigned to the wrong species are very rapidly uncovered by the CP-PCR method. Typically, when CP-PCR is used to verify the assignment of a bacterial isolate to a species, the primer is chosen to maximize interspecific difference of the discrete DNA amplification products generated by CP-PCR. Primers for this application typically exclude regions substantially complementary to regions of DNA highly conserved between the species being studied.
  • the comparison between the CP-PCR products of the organism of unknown genome and those produced from known genomes is typically performed by separating the discrete DNA amplification products in an apparatus containing a medium capable of separating DNA fragments by size in order to produce a "fingerprint" of the amplification products as separated bands, and then comparing the fingerprint patterns.
  • the fingerprint patterns are diagnostic of the genus, species, and/or strain to which the test organism of unknown genome belongs.
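The comparison of an unknown fingerprint with reference fingerprints is described above as a comparison of band patterns. One way such a comparison could be scored numerically, offered only as an illustrative sketch, is to match bands of similar size and compute a Dice coefficient. The 5 percent size tolerance, the greedy matching, the function names, and the band sizes below are all assumptions and are not part of the disclosed method.

```python
def shared_bands(a, b, tolerance=0.05):
    """Count bands in `a` that pair one-to-one with a band in `b` whose size
    differs by no more than `tolerance` (as a fraction of the larger band)."""
    remaining = sorted(b)
    shared = 0
    for size in sorted(a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]  # each reference band is used only once
                break
    return shared


def dice_similarity(a, b, tolerance=0.05):
    """Dice coefficient: 2 * shared / (bands in a + bands in b)."""
    if not a and not b:
        return 1.0
    return 2 * shared_bands(a, b, tolerance) / (len(a) + len(b))


def best_match(unknown, references, tolerance=0.05):
    """Name of the reference fingerprint most similar to the unknown."""
    return max(references,
               key=lambda name: dice_similarity(unknown, references[name], tolerance))


# Hypothetical band sizes in base pairs.
unknown = [150, 310, 520, 800]
references = {"ref_A": [152, 305, 515, 790], "ref_B": [200, 400, 520, 1000]}
print(best_match(unknown, references))
```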
  • separation is carried out by electrophoresis, for example, using gel electrophoresis on agarose or polyacrylamide gels to display the resulting DNA products for visual examination.
  • Many protocols for electrophoresis are known in the art; see U.S. Patent No. 4,729,947 and B.
  • the production of a fingerprint for comparison typically comprises the steps of (1) applying the set of discrete DNA segments produced in a PCR reaction into a channel of the medium in the separating apparatus and (2) separating the discrete DNA segments according to size (size-separating) into bands within the channel to form a fingerprint of the DNA segments characteristic of the genome.
  • One such representative technique is electrophoresis through 5% polyacrylamide containing 50% urea. The concentration of acrylamide is varied according to the size of the products to be resolved.
  • Commercially available size markers typically derived from the digestion of a plasmid or phage of known sequence with a restriction enzyme are added to the gel.
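Size markers run alongside the samples allow the size of each band to be estimated from its migration distance. A common approximation, assumed here and not recited in the specification, is that the logarithm of fragment size varies linearly with migration distance between neighbouring markers. The function name and the marker values in the example are hypothetical.

```python
import math


def estimate_size(distance, markers):
    """Estimate fragment size (bp) from migration distance.

    `markers` is a list of (migration_distance, size_bp) pairs with distinct
    distances. log10(size) is interpolated linearly between the two markers
    that flank `distance` (an assumed approximation).
    """
    points = sorted(markers)
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance <= d2:
            fraction = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + fraction * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the range of the size markers")


# Hypothetical ladder: a 1000 bp marker at 10 mm and a 100 bp marker at 20 mm.
ladder = [(10.0, 1000), (20.0, 100)]
print(round(estimate_size(15.0, ladder), 1))
```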
  • the individual bands present in the fingerprint are detected by various techniques, such as ethidium bromide staining.
  • At least one of the deoxyribonucleotide triphosphate monomers used in the second stage of the reaction can be radioactive, allowing detection of the bands of the fingerprint by autoradiography, or the primer itself can be radioactively labeled by treatment with an appropriate kinase.
  • fluorescent nucleotides can be incorporated and detection carried out by means of fluorescence.
  • Isolated separated fragments can be cleaved with a restriction endonuclease capable of generating polymorphisms, such as TaqI or MspI. Separated fragments produced by CP-PCR and resolved on gels can also be isolated from the gel and reamplified in a conventional PCR procedure to increase the quantity of the isolated band. Isolated fragments can, if desired, be cloned in a bacterial host, typically a strain of Escherichia coli capable of preserving the integrity of any genetically unstable DNA structures such as long, direct and inverted repeats. Such cloned bands then can be sequenced by well-known, conventional techniques, such as the Sanger dideoxynucleotide sequencing technique or the Maxam-Gilbert chemical cleavage sequencing technique.
  • For many procedures, such as the preparation of DNA probes, it is not necessary either to clone or recut the DNA fragments amplified by CP-PCR and isolated from the gel. Such fragments can be used as probes after further amplification by conventional PCR during which radioactive nucleotides are incorporated in the amplified fragments.
  • Staphylococcus is a human pathogen and frequently responsible for serious infections occurring in surgical patients. Accordingly, rapid identification of Staphylococcus species is particularly important in a clinical setting.
  • the discrete DNA amplification products produced from the sample of DNA from the test organism are compared with the DNA amplification products produced from known Staphylococcus species when the same primer is used.
  • CP-PCR can be used to identify particular strains of Streptococcus.
  • the discrete DNA amplification products produced from the test organism of unknown strain are compared with the DNA amplification products produced from DNA of known Streptococcus strains when the same primer is used.
  • Streptococcus is also an important human pathogen, causing potentially severe infections of the skin and mucous membranes, and its rapid identification is clinically important.
  • the DNA sequences that represent polymorphisms differing from individual to individual of a species obtained from application of the CP-PCR method of the invention are useful in genetic mapping of eukaryotes, including plants such as maize and soybeans, animals, and humans.
  • CP-PCR can be used to reveal polymorphisms based on the CP-PCR fingerprint.
  • polymorphisms are particularly useful for genetic mapping.
  • the polymorphisms generated can be correlated with other markers such as restriction fragment length polymorphisms (RFLPs), which in turn have been linked to genetic markers of known function.
  • An RFLP is a detectable difference in the cleavage pattern of DNA from different individuals of a particular species when that DNA is cleaved with a particular restriction endonuclease. Such differences arise when a mutation affects the sequence cut by the enzyme, removing a site previously present or adding a new site.
  • CP-PCR can be used to track genetic differences in rice, with a 600-megabase haploid genome (Example 2) and in maize, with a 3000-megabase haploid genome (Example 2) .
  • Maize has a genomic complexity comparable to that of the human genome. Similar results are expected with soybeans.
  • the heterozygosity of the maize genome has been estimated to be about 0.05.
  • Each primer used in the CP-PCR method can probably detect more than one polymorphism between strains at that level of heterozygosity.
  • Phenotypes can be scored in a number of ways, including morphological features and molecular features, such as electrophoretic mobility of proteins and variations in intensity of proteins on two-dimensional gels (Higginbotham et al., "The Genetic Characterization of Inbred Lines of Maize (Zea mays L.) Using Two Dimensional Protein Profiles,"
  • the CP-PCR map will automatically orient itself with respect to the genetic map.
  • Such physical linkage can be studied by pulsed field electrophoresis (PFE).
  • restriction endonucleases making rare cuts, PFE, and Southern blotting to maize or soybean DNA and probing with genetically linked CP-PCR probes
  • the size of the physical region for large fragments of chromosomes isolated by PFE can be compared with the rate of recombination.
  • Analogous techniques can be employed for mapping the mouse or human genome. This is of interest because recombination is not equal throughout the genome.
  • the CP-PCR method is particularly suitable for this purpose because a great many markers can, in principle, be identified for an area of interest.
  • the number of individual progeny from crosses that can be inspected and the amount of polymorphism in each marker determine the accuracy with which markers can be mapped.
  • the segregation of polymorphisms revealed by the CP-PCR method in the context of the RFLPs that are already mapped improves the ability to measure genetic distance between them.
  • Computer programs are available for genetic linkage analysis, including LIPED (Ott, Amer. J. Human Genet. 28:528-529 (1976)) for two-point linkage analysis, and ILINK and CILINK from the LINKAGE package (Lathrop et al., Proc. Natl. Acad. Sci. USA.
  • RFLPs that have been linked to interesting genetic markers can be correlated with the CP-PCR map.
  • tightly linked flanking RFLP markers have been found for the Mdm1 gene on chromosome 6S in maize. This gene is involved in resistance to Maize Dwarf Mosaic Virus (MDMV) (McMullen & Louie, Mol. Plant-Microbe Interactions 2, 309 (1989)).
  • RFLPs themselves can be generated from CP-PCR fingerprints.
  • TaqI restriction endonuclease, which recognizes the site TCGA, will cleave CP-PCR products in which there is at least one TaqI site. If a TaqI site is present in one of the CP-PCR fingerprint products in some individuals but not in others, there will be a difference in the fingerprint of TaqI digested DNA from these individuals. This allows the detection of TaqI RFLPs from CP-PCR patterns.
  • Such TaqI RFLPs are among the most common RFLPs known in the genome because the TaqI recognition site contains the hypermutable dinucleotide CpG.
  • MspI digests, which cut at the recognition site CCGG, can be used to detect the relatively abundant MspI polymorphisms.
  • RFLPs can be either mapped directly in families by genetic mapping or cut out of gels and amplified with radioactively labeled deoxyribonucleoside triphosphates, such as α-labeled triphosphates, in conventional PCR to use them to probe Southern blots of the appropriately cleaved human DNAs.
  • the extracted fragments can be recut with the same enzyme following extraction.
  • the bands isolated from CP-PCR fingerprints can be cloned and sequenced. Preferably, such bands should be cloned in Sure E. coli (Stratagene, Cloning Systems, San Diego, California) to preserve the integrity of terminal repeats.
  • the CP-PCR method of the invention permits genetic mapping of DNA polymorphisms in mammals without having to first identify RFLP probes.
  • Each polymorphic band in the fingerprint produced by the method represents a heritable characteristic. No clones must be made or plasmids purified. Polymorphisms can be generated by almost any primer selected.
  • the technique requires less than 1/100 of the amount of genomic DNA per lane compared to that needed to prepare a Southern blot for conventional RFLP analysis.
  • the method can use ethidium detection, fluorescent detection or only small amounts of labeled bases relative to Southern hybridization.
  • CP-PCR generated DNA polymorphisms can be isolated directly from gels and reamplified to use as probes in "genome walking" or restriction mapping strategies without cloning. Sequencing of some of these polymorphisms will also not require cloning.
  • One approach for using the CP-PCR method in human genetics can produce products assignable to the human fragment in a somatic cell hybrid. As long as the recipient is the same for a set of hybrids, the products that will be different from a non-hybrid control CP-PCR will be the human fragments. Such bands would assign the human fragment on the genetic map if the band was already genetically assigned. Also, such bands can be isolated from the gel and used to make a DNA probe.
  • reagents described herein e.g., nucleic acids such as primers, vectors, and the like
  • the reagents described herein can be packaged in kit form.
  • the term "package" refers to a solid matrix or material customarily utilized in such a kit system in the form of an enclosure that is capable of holding within fixed limits one or more of the reagent components for use in a method of the present invention.
  • Such materials include glass and plastic (e.g., polyethylene, polypropylene and polycarbonate) bottles, vials, paper, plastic and plastic-foil laminated envelopes and the like.
  • a package can be a glass vial used to contain the appropriate quantities of polynucleotide primer(s), genomic DNA, vectors, restriction enzyme(s), DNA polymerase, DNA ligase, or a combination thereof.
  • An aliquot of each component sufficient to perform at least one PCR thermocycle will be provided in each container.
  • Kits useful for producing a primer extension product for amplification of a specific nucleic acid sequence using a primer extension reaction methodology also typically include, in separate containers within the kit, dNTPs where N is adenine, thymine, guanine and cytosine, and other like agents for performing primer extension reactions.
  • the reagent species of any system described herein can be provided in solution, as a liquid dispersion or as a substantially dry powder, e.g., the primers may be provided in lyophilized form.
  • the present invention contemplates a kit for typing an isolate of organism comprising an enclosure containing, in separate containers, at least one consensus primer, preferably a structural RNA consensus primer, and at least one genomic DNA sample for use as a control in a typing method of this invention.
  • a panel of genomic DNA samples derived from predetermined species is included, as described herein.
  • the consensus primers and the genomic DNA for use in a kit for typing an isolate of organism are comprised as previously described.
  • the kit can further contain, in one or more separate enclosures, one or more panels of genomic DNA representative of groups of species, combined in a manner to allow comparison of subspecies within a species, of species within a genus, of genera within families, and the like, for determining the location of an isolate organism on the evolutionary scale.
  • the clear lysates were extracted with phenol and then chloroform; the DNA was then precipitated with ethanol.
  • the precipitated DNA was dissolved in TE, and its final concentration was estimated by agarose gel electrophoresis and ethidium bromide staining.
  • Genomic DNA was isolated from the other species shown in Table 2 by the same detergent lysis and phenol extraction protocol described above.
  • PCR reaction admixtures were prepared in a volume of 50 µL containing 1X Taq polymerase buffer (Stratagene Cloning Systems, San Diego) adjusted to 4 mM with MgCl2, 0.2 mM of each deoxyribonucleotide triphosphate, 1.25 units Taq polymerase, 1 µM consensus primer (or primers), 50 µCi alpha-[32P] dCTP, and template DNA at various quantities from 100 ng to 3.2 ng as indicated.
  • the reactions were overlaid with oil and cycled forty times through the following temperature profile: 94°C for 30 seconds for denaturation, 50°C for thirty seconds for annealing of primer, and 72°C for two minutes for extension.
  • the result of this set of PCR cycles was the formation of a discrete set of amplified DNA segments (primer extension products).
  • the resulting products were resolved by electrophoresis in 1X TBE through 5% acrylamide-50% urea and visualized by autoradiography using Kodak X-Omat™ AR film with an intensifying screen at -70°C for 6 hours.
  • CP-PCR patterns were very similar between the species, indicating that the tRNA gene clusters probably evolve relatively slowly. This is in contrast to arbitrarily primed (AP)-PCR (Welsh et al., Nucleic Acids Res. 18: 7213-7218 (1990)) or total genome restriction digestion (Cinco et al., FEMS Microbiol. Immunol. 47: 511-514 (1989)), which give very different patterns when different species are compared.
  • Figure 2 shows that within a species there was generally no variation in the CP-PCR pattern. There were only two exceptions.
  • a Streptococcus pyogenes strain K58Hg that was designated serotype A (lane 8) gave a pattern identical to serotype b (lanes 7, 13 and 14) .
  • Organelle genomes, despite their small size, can contribute up to half of the DNA in the cell, due to their high copy number.
  • the best matches with the primer are probably more important than copy number, and these best matches will generally be to nuclear genes because the number of different nuclear tRNA gene sequences greatly exceeds the number of different organelle tRNA gene sequences.
  • CP-PCR is a useful method of species classification.
  • Such AP-PCR fingerprints are not conserved between species.
  • CP-PCR using consensus tRNA primers is a simple and fast method that complements those that already exist and has the virtue that the polymorphisms measured are not themselves likely to be selected. They are, potentially, more likely to be near neutral than other characters such as nutrition and perhaps less likely to result from convergent evolution, which is a drawback of classification by morphological criteria. It is also an advantage that the data are collected for regions scattered throughout the genome rather than from sequence differences in a single location that may not reflect the whole genome. A single CP-PCR fingerprint will generally have less information than comparing the DNA sequence of a specific region in each organism.
  • CP-PCR is less technically demanding and less time consuming than DNA sequencing.
  • CP-PCR could be a method of choice when a large number of individuals are to be screened or as a first step when identifying species based on genomic sequence. Since data acquisition is
  • the method presented represents the simplest available universal way to reliably compare genomes of
  • the method has applications in ecology and epidemiology.
  • In Example 2, consensus tRNA gene primers were used to amplify tRNA intergenic regions (tDNA-PCR). Under low stringency PCR conditions these primers generated products some of which were conserved within genera and some of which varied in length between
  • Genomic DNAs were prepared from late log phase cultures. If necessary, cell walls were treated with lysostaphin, streptolysin or lysozyme, followed by incubation at 65°C in 1 mg/ml proteinase K, 100 mM EDTA, 1% SDS for 2 hours. DNA was purified by phenol extraction followed by chloroform extraction and isopropanol precipitation.
  • CP-PCR with consensus primers for amplifying transfer RNA genes was performed as follows: Fifty µl reactions contained 1.25 units of Taq polymerase, 1X Taq polymerase buffer (50 mM KCl, 10 mM Tris-HCl pH 8.3, 1.5 mM MgCl2), 0.2 mM of each dNTP, 5 µCi alpha-[32P] dCTP, 0.5 µM of each primer and 10 ng of template DNA.
  • the reaction was cycled 40 times through the following temperature profile: 94°C for 30 sec to denature, 50°C for 30 seconds for annealing of primer and 72°C for 2 minutes for extension using a Perkin Elmer Cetus 9600 thermocycler.
  • the resulting products were resolved by electrophoresis using 5% acrylamide-50% urea in 1X TBE and visualized by autoradiography using Kodak X-Omat™ AR film with an intensifying screen at -70°C for 6 hours.
  • the products of tDNA-PCR could also be visualized on NuSeive agarose or native acrylamide gels by ethidium bromide staining.
  • Radioactive ink spots were used to align the autoradiogram with the gel.
  • PCR products were cut out of the gel, placed in 50 µl of TE and the DNA was eluted for 1 hour at 65°C. 1 µl was re-amplified by PCR using the same primers.
  • High stringency PCR amplification used 1 X Taq pol buffer, 0.2 mM each dNTP, 0.5 µM each primer, 30 ng of template, 1.25 units of Taq polymerase in a total volume of 50 µl. Cycling parameters were 94°C, 1 minute; 50°C, 1 minute; 72°C, 2 minutes for 30 cycles. For tDNA-ILPs, the cycling times could be truncated if the expected product is short. In these experiments PCR amplification used 1 X Taq buffer, 1 unit of Perfect Match (Stratagene), 0.5 µM each primer, 1 ng of template, 1.25 units Taq polymerase in a total volume of 50 µl.
  • Colonies were picked, boiled in 100 µl of water, and 10 µl was amplified in a 50 µl PCR reaction using the Universal (Uni) and reverse (Rev) sequencing primers. The products were electrophoresed through 1.5% agarose (1X TBE). Inserts of the correct size were asymmetrically amplified and sequenced.
  • results The purpose of the studies in this Example was to determine the nature and extent of intragenic variation in tRNA gene organization within a eubacterial genus and, if possible, utilize this variation to develop a PCR method to classify strains within this genus into the correct species.
  • the staphylococci were chosen because a number of species are pathogenic in humans and they represent a diverse group for which the basic phylogenetic relationships are already known (Kloos and Wolfshohl, 1979; Kloos, 1980; Kloos and Wolfshohl, 1983; Kloos and Schleifer, 1986) .
  • Two primers (T5A and T5B) for PCR were derived from a consensus of Bacillus tRNA genes. These primers were located facing outward and about 15 bases from the end of the tRNA gene consensus sequence. PCR products using these primers were expected to occur between the pair of tRNA genes that best matched the primers and were in the correct orientation for PCR.
  • the consensus primer facing out from the 5' end of the consensus tRNA resembled most closely tRNA(iMet), tRNA(Lys) and tRNA(Ile), whereas the primer designed to face out from the 3' end of the consensus most closely resembled tRNA(Phe).
  • For tRNA genes existing in the correct orientation within a few hundred base pairs of each other, the region between them was expected to be represented in the CP-PCR fingerprint. If these tRNA genes did not exist in the right orientation in a particular species, then the intergenic region between the next best pair of matches would amplify.
  • CP-PCR amplification of Staphylococcus DNA using consensus tRNA gene primers gave fingerprints that displayed the distances between the best matches with the primers in the tRNA gene clusters (see Figure 4).
  • CP-PCR was performed on strains, listed in Table 3, including at least seven strains from each of four Staphylococcus species (data not shown). Strains were isolated from a wide variety of hosts ranging from lemurs to birds and originating on a number of continents (mainly isolated by Dr. W. Kloos, North Carolina State U.). These very divergent strains were compared to maximize the possibility of detecting intraspecific differences. Consistent with the results in Example 2, there were rarely any intraspecific differences.
  • PCR products of about 160 bp that were polymorphic in length between various Staphylococcus species (but which were generally not polymorphic within a species) were removed from the gel, reamplified with the same primers and cloned into pBSKII * .
  • This size class was selected because it presumably spanned two intergenic regions but was small enough to sequence completely in both strands without further subcloning.
  • One clone from S. cohnii contained different tRNAs. Sequences for portions of the homologous PCR products are presented in Figure 2. The sequence comparisons showed that the complete Staphylococcus tRNA(Asp), and partial tRNA(iMet) and tRNA(Phe), gene sequences were similar to those of Bacillus (Vold, 1985) and Mycoplasma (Muto et al., 1990), which are also members of the low G+C subdivision of the gram-positive phylum (Woese, 1987).
  • the gene order tRNA(iMet)-tRNA(Asp)-tRNA(Phe) found for the 160 bp tDNA-PCR products from three Staphylococcus species is also found in Bacillus (Vold, 1985) and Mycoplasma (Muto et al., 1990), but not in the more distantly related species E. coli (Jinks-Robertson and Nomura, 1987). With more limited data available, the closest equivalent gene order in Streptococcus is tRNA(iMet)-tRNA(Phe).
  • Figure 6A shows the results of using the tRNA(iMet)- and tRNA(Asp)-specific primers StaphiMet3 and StaphAspS on Staphylococcus species in the presence of a 1000-fold excess (by mass) of human genomic DNA.
  • A similar study using the StaphAsp3 and StaphPheS primers is shown in Figure 6B.
  • tDNA-ILP products of the expected size (which includes the primers and parts of the tRNAs) are seen in Figure 6A for S. hominis, S. warneri and S. aureus (68, 60, and 63 bp, respectively).
  • the species S. haemolyticus and S. cohnii, for which sequence data was not collected, generally gave products of about 58 bp and 63 bp, respectively.
  • the method may sometimes fail to classify a strain, but it is unlikely to classify a strain incorrectly. On rare occasions such data may even indicate previously unknown species. This situation is in contrast to some conventional PCR strategies where a strain is classified on the basis of the presence or absence of a PCR product that is not intrinsically polymorphic in length.
  • the tDNA-ILP method may be more likely than PCR of protein coding regions to detect Staphylococcus, because tDNA-ILP primers are directed to sequences that are much more conserved among staphylococci than protein coding regions are likely to be.
  • StaphAspS or StaphAspS plus StaphPheS were designed to produce a PCR product only in Staphylococcus and only in the diagnostic range around 50 bp to 100 bp. To determine if this was in fact the case, a variety of closely related and unrelated genera were investigated. Total genomic DNAs at concentrations ranging from 1 ng to 1 mg were tested for the species listed in Table 3 to determine if these would yield tDNA-ILP products with the primers presumed to be Staphylococcus-specific. Genomic DNAs from Streptococcus, Bacillus and Enterococcus, which are among the genera most closely related to Staphylococcus, gave no detectable PCR products in the 40 bp to 200 bp range under high stringency PCR. Nor did DNA from less related E. coli or human.
  • a two step procedure was developed that (1) identifies length polymorphisms in tRNA intergenic spacers using low stringency tRNA gene consensus primers and (2) uses the sequences of these polymorphisms to design high stringency tRNA gene primers. Unlike conventional pairs of PCR primers, these primers can be used in a number of related species to produce a PCR product the length of which defines most or all strains in each species.
  • Staphylococcus species were chosen in this study as a model for a general method. These represent a well studied group of mainly commensals and opportunistic pathogens on mammals and birds (Kloos et al., Systematic Bacteriology 2:1013-39, 1986).
  • the primers developed for this genus may be useful for epidemiology, ecology or diagnosis. It is possible that PCR of tRNA gene spacers will be more sensitive and reliable than other methods such as immunological methods, while also identifying the occasional occurrence of a species of Staphylococcus in an unusual context.
  • the use of the dUTP/ung system to minimize cross-contamination should make a routine test quite reliable.
  • the consensus tRNA gene primers used in Example 2 produce fingerprints in many species of bacteria. In addition to the results shown here, these consensus primers have been used to create fingerprints that show interspecific polymorphisms in Streptococcus (Welsh et al., J. Clin. Microbiol., 1992a). Polymorphisms in CP-PCR fingerprints identify spacers that are good candidates for the tDNA-ILP method. Primers that produce ILPs may be developed for any group of bacteria and probably even for lower eukaryotes of medical or agricultural importance, such as Pneumocystis carinii, and fungal infections such as Aspergillus and Candida.
  • the polymorphic sequences obtained from tDNA-PCR products could also be the basis for choosing primer pairs that generate a PCR product in only a single species.
  • conventional species-specific primers suffer disadvantages.
  • the main advantage of ILPs would be lost, namely the ability to detect most or all individuals in a number of species with a single pair of primers based on highly conserved genes flanking length polymorphisms.
  • at least some part of a primer sequence must be from a non-conserved region to prevent it from producing a product in a closely related species. As a consequence, some strains within the species may not be detected by such primer pairs because the non-conserved sequences are inevitably more likely to show intraspecific variation.
  • primers could be located in coding regions to amplify intergenic non-coding regions (protein-ILPs).
  • high stringency primers for protein coding regions are unlikely to produce a diagnostic product in all species in a genus because changes in the third position of codons occur much more rapidly than sequence changes in tRNA genes.
  • tDNA-ILP primers could be designed directly without initial consensus tDNA-PCR fingerprinting. tDNA-ILPs may have advantages over rDNA-ILPs for several reasons.
  • the rRNA gene spacers are generally a few hundred base pairs in length and a difference of a few base pairs in the intergenic spacer length between related species would be difficult to detect.
  • tRNA-PCR fingerprinting methods allow one to survey a large number of different intergenic regions for length polymorphisms in a single experiment.
  • Consensus primers can easily be devised that interact with different sets of tRNA genes.
  • primers can be selected to give genus-specific products based on the fact that the order of tRNA genes can vary between closely related genera. The gene order of ribosomal RNAs does not vary in eubacteria.
  • the present invention provides a method with several advantages for identification of bacteria and other biological materials.
  • the method is simple to perform and rapid; results can be obtained in as little as 36 hours when the template nucleic acids are isolated by boiling. Only small samples of material, e.g., nanogram amounts, are needed.
  • the method yields information that allows the differentiation of even closely related species and can be extended to differentiate between subspecies or strains of the same species.
  • the method requires no prior knowledge of any biochemical characteristics, including the nucleotide sequence of the target nucleic acids, of the organism to be identified. Initially, it requires the use of no species-specific reagents, because the primer used is based on consensus sequences as described herein. Additionally, the method possesses the important advantage of requiring only one primer sequence for amplification although two or more primers can be used in some embodiments.
  • the CP-PCR method of the invention can be used to provide identification of other types of organisms, including viruses, fungi, mammals and plants.
  • the method also provides an efficient way of generating polymorphisms for use in genetic mapping, especially of eukaryotes, including animals, particularly mice and humans. This method has many applications in mammalian population genetics, pathology, epidemiology and forensics.
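
The length-based species assignment reported above for the Staphylococcus tDNA-ILP products can be illustrated with a small lookup. The following is only a sketch: the product sizes are the approximate values quoted in the text (68, 60 and 63 bp for S. hominis, S. warneri and S. aureus; about 58 and 63 bp for S. haemolyticus and S. cohnii), and it is an assumption of this sketch that all five sizes can be placed in a single table. The names `EXPECTED_BP`, `candidate_species` and `tolerance_bp` are hypothetical.

```python
# Approximate tDNA-ILP product sizes (bp) as quoted in the text.
# Assumption: all five values are treated as one comparable table.
EXPECTED_BP = {
    "S. hominis": 68,
    "S. warneri": 60,
    "S. aureus": 63,
    "S. haemolyticus": 58,
    "S. cohnii": 63,
}


def candidate_species(observed_bp, tolerance_bp=0):
    """Return every species whose expected size is within tolerance_bp."""
    return sorted(
        name
        for name, size in EXPECTED_BP.items()
        if abs(size - observed_bp) <= tolerance_bp
    )


if __name__ == "__main__":
    for length in (68, 63, 75):
        print(length, candidate_species(length))
```

An empty result corresponds to a strain the method fails to classify. A length shared by two species returns both candidates, so a second primer pair or another test would be needed to separate them.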

Abstract

A rapid method for generating a set of discrete DNA amplification products characteristic of a genome as a 'fingerprint' for typing the genome comprises the steps of: forming a polymerase chain reaction (PCR) admixture by combining, in a PCR buffer, genomic DNA and at least one structural RNA consensus primer, and subjecting the PCR admixture to a plurality of PCR thermocycles to produce a plurality of DNA segments, thereby forming a set of discrete DNA amplification products. The method is known as the consensus sequence primed polymerase chain reaction (CP-PCR) method and is suitable for the identification of bacterial species and strains, including Staphylococcus and Streptococcus species, mammals and plants. The method of the present invention can identify species rapidly, using only a small amount of biological material, and does not require knowledge of the nucleotide sequence or other molecular biology of the nucleic acids of the organisms to be identified. Only one primer sequence is required for amplification and/or identification. The method can also be used to generate detectable polymorphisms for use in genetic mapping of animals and humans.

Description

CONSENSUS SEQUENCE PRIMED POLYMERASE CHAIN REACTION METHOD FOR FINGERPRINTING GENOMES
Field of the Invention
This invention is directed toward a method of identifying segments of nucleic acid characteristic of a particular genome by generating a set of discrete DNA amplification products characteristic of the genome. This set of discrete DNA products can generate a fingerprint that can be used to identify the genome.
Background of the Invention
For many purposes, it is important to be able to identify the species to which an organism belongs rapidly and accurately. Such rapid identification is necessary for pathogens such as viruses, bacteria, protozoa, and multicellular parasites, and assists in diagnosis and treatment of human and animal disease, as well as studies in epidemiology and ecology. In particular, because of the rapid growth of bacteria and the necessity for immediate and accurate treatment of diseases caused by them, it is especially important to have a fast method of identification. Traditionally, identification and classification of bacterial species has been performed by study of morphology, determination of nutritional requirements or fermentation patterns, determination of antibiotic resistance, comparison of isoenzyme patterns, or determination of sensitivity to bacteriophage strains. These methods are time-consuming, typically requiring at least 48 to 72 hours, often much more. Other more recent methods include the determination of RNA sequences (Woese, in "Evolution in Procaryotes" (Schleifer and Stackebrandt, Eds., Academic Press, London, 1986)), the use of strain-specific fluorescent oligonucleotides (DeLong et al., Science 243, 1360-1363 (1989); Amann et al., J. Bact. 172, 762-770 (1990)), and the polymerase chain reaction (PCR) technique (U.S. Patent Nos. 4,683,195 and 4,683,202 to Mullis et al.; Mullis & Faloona, Methods Enzymol. 154, 335-350 (1987)).
In addition, DNA markers genetically linked to a selected trait can be used for diagnostic procedures. The DNA markers commonly used are restriction fragment length polymorphisms (RFLPs) . Polymorphisms useful in genetic mapping are those polymorphisms that segregate in populations. Traditionally, RFLPs have been detected by hybridization methodology (e.g. Southern blot) , but such techniques are time-consuming and inefficient. Alternative methods include assays for polymorphisms using PCR.
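
To make the idea of an RFLP concrete, the sketch below digests two toy alleles in silico and compares the fragment lengths. It assumes the TaqI recognition site TCGA mentioned elsewhere in this document, with a cut one base into the site (T^CGA), and it only models the top strand of a linear molecule. The sequences and the function name `digest` are hypothetical illustrations, not data from the patent.

```python
from typing import List


def digest(seq: str, site: str = "TCGA", cut_offset: int = 1) -> List[int]:
    """Return top-strand fragment lengths after cutting at every site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)  # cut position inside the site
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    allele_with_site = "AAAATCGAGGGGGG"     # contains one TCGA site
    allele_without_site = "AAAATCAAGGGGGG"  # G-to-A change destroys the site
    print(digest(allele_with_site))     # two fragments
    print(digest(allele_without_site))  # one uncut fragment
```

The two alleles give different fragment patterns after digestion, which is the difference that an RFLP assay detects on a gel.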
The PCR method allows amplification of a selected region of DNA by providing two DNA primers, each of which is complementary to a portion of one strand within the selected region of DNA. These primers are used to hybridize to the separated strands within the region of DNA sought to be amplified, forming DNA molecules that are partially single-stranded and partially double-stranded. The double-stranded regions are then extended by the action of DNA polymerase, forming completely double-stranded molecules. These double-stranded molecules are then denatured and the denatured single strands are rehybridized to the primers. Repetition of this process through a number of cycles results in the generation of DNA strands that correspond in sequence to the region between the originally used primers. Specific PCR primer pairs can be used to identify genes characteristic of a particular species or even strain. PCR also obviates the need for cloning in order to compare the sequences of genes from related organisms, allowing the very rapid construction of phylogenies based on DNA sequence. For epidemiological purposes, specific primers to informative pathogenic features can be used in conjunction with PCR to identify pathogenic organisms.
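
The two-primer scheme described above can be sketched in silico. The example below is a simplified illustration under stated assumptions: primers must match the template exactly, the template is a single linear top strand, and each cycle is taken to double the number of copies, which real reactions only approximate. The function names `reverse_complement`, `amplicon` and `copies_after`, and the toy sequences, are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon(template, fwd, rev):
    """Return the region bounded by an exactly matching primer pair, or None."""
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the bottom strand, so its reverse
    # complement appears on the top strand downstream of the forward primer.
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def copies_after(cycles, initial=1):
    """Idealized copy number, assuming perfect doubling in every cycle."""
    return initial * 2 ** cycles


if __name__ == "__main__":
    template = "TTACGGATCCAAAACCCCGGGGTTTTGAATTCGCAA"
    product = amplicon(template, fwd="ACGGAT", rev="TGCGAA")
    print(product, len(product))
    print(copies_after(30))
```

If either primer has no matching site, `amplicon` returns `None`, which mirrors the point made in the text that conventional PCR needs sequence information for both ends of the target region.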
Although PCR is a very powerful method for amplifying DNA, conventional PCR procedures require the use of at least two separate primers complementary to specific regions of the genome to be amplified. This requirement means that primers cannot be prepared unless the target DNA sequence information is available, and the primers must be "custom built" for each location within the genome of each species or strain whose DNA is to be amplified.
Although the newer methods have advantages over previous methods for genome identification, there is still a need for a rapid, simple method that can be applied to any species for which DNA can be prepared and that does not require reagents that are specific for each species or knowledge of the DNA sequence of the isolate being identified. It is also desirable that such a method be capable of identifying a species from a relatively small quantity of biological material. Additionally, it is highly desirable that such a method also be capable of generating polymorphisms useful in genetic mapping, especially of eukaryotes.
In addition to identification of related plant, animal and bacterial species, DNA segments or "markers" may be used to construct human genetic maps for genome analysis. Goals for the present human genome project include the production of a genetic map and an ordered array of clones along the genome. Using a genetic map, inherited phenotypes, such as those that cause genetic diseases, can be localized on the map and ultimately cloned. The neurofibromatosis gene is a recent example of this strategy (Xu et al., Cell, 62:599-608 (1990)). The genetic map is a useful framework upon which to assemble partially completed arrays of clones. In the short term, it is likely that arrays of human genomic clones such as cosmids or yeast artificial chromosomes (YACs, Burke et al., Science, 236:806-812 (1987)) will form disconnected contigs that can be oriented relative to each other with probes that are on the genetic map or the in situ map (Lichter et al., Science, 24:64-69 (1990)), or both. The usefulness of the contig map will depend on its relation to interesting genes, the locations of which may only be known genetically. Similarly, the restriction maps of the human genome generated by pulsed field electrophoresis (PFE) of large DNA fragments are unlikely to be completed without the aid of closely spaced markers to orient partially completed maps.
Thus, a restriction map and an array of clones covering an entire mammalian genome, for example the mouse genome, is desirable.
Recently, RFLPs that have Variable Number Tandem Repeats (VNTRs) have become a method of choice for human mapping because such VNTRs tend to have multiple alleles and are genetically informative because polymorphisms are more likely to be segregating within a family. The production of fingerprints by Southern blotting with VNTRs (Jeffreys et al., Nature, 316:76-79 (1985)) has proven useful in forensics. There are two classes of VNTRs: one having repeat units of 9 to 40 base pairs, and the other consisting of minisatellite DNA with repeats of two or three base pairs. The longer VNTRs have tended to be in the proterminal regions of autosomes. VNTR consensus sequences may be used to display a fingerprint. VNTR fingerprints have been used to assign polymorphisms in the mouse (Julier et al., Proc. Natl. Acad. Sci. USA, 87:4585-4589 (1990)), but these polymorphisms must be cloned to be of use in application to restriction mapping or contig assembly. VNTR probes are useful in the mouse because a large number of crosses are likely to be informative at a particular position. The mouse offers the opportunity to map in interspecific crosses which have a high level of polymorphism relative to most other inbred lines. A dense genetic map of DNA markers would facilitate cloning genes that have been mapped genetically in the mouse. Cloning such genes would be aided by the identification of very closely linked DNA polymorphisms. About 3000 mapped DNA polymorphisms are needed to provide a good probability of one polymorphism being within 500 kb of the gene. To place so many DNA markers on the map it is desirable to have a fast and cost-effective genetic mapping strategy.
Summary of the Invention
Accordingly, the methods of the present invention, referred to herein as consensus sequence primed polymerase chain reaction or "CP-PCR" fingerprinting, provide a distinctive variation of the PCR technique by employing "consensus" sequence polynucleotide primers as defined herein. We have unexpectedly found that the use of at least one consensus primer, preferably a structural RNA consensus primer, in a standard PCR amplification procedure reproducibly generates specific discrete products that can be resolved into a manageable number of individual bands providing a species "fingerprint". The CP-PCR method is suitable for the rapid identification and classification of organisms throughout the plant, prokaryotic or eukaryotic kingdoms and for the generation of polymorphisms suitable for genetic mapping of eukaryotes. Only a small sample of biological material is needed, and knowledge of the target DNA sequence to be identified is not required. In addition, reagents specific for a given species are not required.
In general, CP-PCR is a method for generating a set of discrete DNA products ("amplification products") characteristic of a genome by priming target nucleic acid obtained from a genome with at least one single-stranded primer to form primed nucleic acid. The primed nucleic acid is then amplified by performing at least one cycle, and preferably at least 10 cycles, of polymerase chain reaction (PCR) amplification to generate a set of discrete DNA amplification products characteristic of the genome.
The genome to which the CP-PCR method is applied can be a viral genome; a bacterial genome, including Staphylococcus and Streptococcus; a plant genome, including rice, maize, or soybean; or an animal genome, including a human genome. It can also be a genome of a cultured cell line. The cultured cell line can be a chimeric cell line with at least one human chromosome in a non-human background, i.e., a hybrid cell line.
The CP-PCR method can be used to identify an organism as a species of a genus of bacteria, for example, Staphylococcus, from a number of different species. Similarly, the method can be used to determine the strain to which an isolate of the genus Streptococcus belongs, by comparing the DNA amplification products produced by CP-PCR for the isolate to the patterns produced from known strains with the same primer. The CP-PCR method can also be used to verify the assignment of a bacterial isolate to a species by comparing the CP-PCR fingerprint from the isolate with the CP-PCR fingerprints produced by known bacterial species with the same primer. For this application, the primer is chosen as described herein to maximize interspecific difference of the discrete DNA amplification products.
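The comparison of an isolate's fingerprint with fingerprints from known species can be sketched computationally. The following is only an illustration, not a scoring method taken from this disclosure: it assumes each fingerprint has been reduced to a list of band sizes in base pairs, treats two bands as shared when their sizes agree within a relative tolerance, and scores similarity with a Dice coefficient. All names, the tolerance and the band sizes are hypothetical.

```python
def shared_bands(bands_a, bands_b, tolerance=0.02):
    """Count bands in bands_a with a partner in bands_b of similar size.

    Two bands are paired when their sizes differ by no more than `tolerance`
    times the larger size. Each band in bands_b is paired at most once.
    """
    unused = sorted(bands_b)
    shared = 0
    for size in sorted(bands_a):
        for i, other in enumerate(unused):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del unused[i]
                break
    return shared


def dice_similarity(bands_a, bands_b, tolerance=0.02):
    """Return 2 * shared / (bands in a + bands in b), between 0.0 and 1.0."""
    total = len(bands_a) + len(bands_b)
    if total == 0:
        return 1.0
    return 2 * shared_bands(bands_a, bands_b, tolerance) / total


def best_match(isolate, references, tolerance=0.02):
    """Return (name, score) of the reference fingerprint closest to the isolate."""
    scores = {name: dice_similarity(isolate, bands, tolerance)
              for name, bands in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical band sizes (bp) obtained with the same primer for each sample.
references = {
    "species_A": [120, 250, 400, 610],
    "species_B": [95, 180, 400, 700],
}
isolate = [121, 251, 399, 800]
```

A band-sharing score of this kind ignores band intensity, so in practice it would supplement rather than replace visual comparison of the separated bands.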
The target nucleic acid of the genome can be DNA, RNA or polynucleotide molecules. If the CP-PCR method is used to characterize RNA, the method also preferably includes the step of extending the primed RNA with an enzyme having reverse transcriptase activity to produce a hybrid DNA-RNA molecule, and priming the DNA of the hybrid with an arbitrary single-stranded primer. In this application, the enzyme with reverse transcriptase activity can be avian myeloblastosis virus reverse transcriptase or Moloney leukemia virus reverse transcriptase.
The discrete DNA amplification products produced by the CP-PCR method can be manipulated in a number of ways. For example, they can be separated in a medium capable of separating DNA fragments by size, such as a polyacrylamide or agarose gel, in order to produce a fingerprint of the amplification products as separated bands. Additionally, at least one separated band can be isolated from the fingerprint and reamplified by conventional PCR. The isolated separated band can also be cleaved with a restriction endonuclease. The reamplified fragments can then be isolated and cloned in a bacterial host. The isolated band or reamplified fragments can be sequenced. These methods are particularly useful in the detection and isolation of DNA sequences that represent polymorphisms differing from individual to individual of a species. The ability of the CP-PCR method to generate polymorphisms makes it useful, as well, in the mapping and characterization of eukaryotic genomes, including plant genomes, animal genomes, and the human genome. These polymorphisms are particularly useful in the generation of linkage maps and can be correlated with RFLPs and other markers.
Consensus primers, particularly structural RNA consensus primers, are also contemplated, as are kits containing the primers in combination with control genomic DNA for typing isolated genomes.
Brief Description of the Drawings
In the drawings forming a portion of this disclosure:
Figure 1 shows the CP-PCR patterns produced by using isolates representing five different species of Staphylococcus, and illustrates the differences apparent between species, as described in Example 2. PCR was performed using the primers T5A in group a, T3A in group b, or T5A plus T3A in group c, at 50°C. Each numbered lane consists of three adjacent lanes having 80, 16 or 3.2 ng of template. Lane 1: S. haemolyticus CC12J2. Lane 2: S. hominis 27844. Lane 3: S. warneri CPB10E2. Lane 4: S. cohnii JL143. Lane 5: S. aureus ISP-8.
Figure 2 shows the CP-PCR patterns produced by using forty strains of bacteria from three different genera, and illustrates the differences detectable between the strains and the general similarity of the patterns from the same species, as described in Example 2. PCR was performed using the primers T5A plus T3A at 50°C with 100 ng of template. The templates in lanes 1 to 17 contain Streptococcus DNAs. Lanes 18 and 19 contain Enterococcus DNAs. Lanes 20 to 40 contain Staphylococcus DNAs. See Table 1 for the strains used in each lane.
Figure 3 shows the CP-PCR patterns produced by using genomes from species across the three kingdoms and illustrates the existence of polymorphisms, as described in Example 2. The reaction was performed using 50 ng of template under the standard PCR conditions. The low temperature annealing step was 50°C. Lanes 1 to 9 used the primers T5A and T3A. Lanes 10 to 19 used T5B and T3A. See Table 3 for the strains used in each lane.
Figure 4 shows fingerprints generated by tDNA-PCR in various Staphylococcus species. Strains from five species of Staphylococcus, listed in Table 3, were fingerprinted. Examples from each species are shown. PCR was performed using the primers T5A and T3B at 50°C with 1 ng of template. The products were resolved on a 5% acrylamide, 50% urea gel. The arrow on the right indicates the region of the gel from which products discussed in this paper were recovered for sequencing.
Lanes 1 and 2: Strains ATCC 27844 type strain and ATCC 27846 of Staphylococcus hominis, both isolated from humans.
Lanes 3 and 4: Strains CPB10E2 and GAD473 of S. warneri, isolated from a Cercopithecus monkey and bushbaby, respectively.
Lanes 5 and 6: Strains ATCC 8432 and ATCC 15564 of S. aureus, isolated from a bird and a phage typing strain isolated from humans, respectively.
Lanes 7 and 8: Strains CC12J2 and PAY9F2 of S. haemolyticus, isolated from a Mangabey and chimpanzee, respectively.
Lane 9: Strain JL143 of S. cohnii subspecies urealyticum, isolated from a human.
Figure 5 shows the sequence of the tRNAiMet-tRNAAsp-tRNAPhe region from three species of Staphylococcus. The sequences shown are from the polymorphisms detected in Figure 4. All genes were sequenced from a minimum of two separate clones and in both strands. The coding strand from one region of a larger clone is shown in each case. Intergenic spacers are in bold. The sequences are compared to the equivalent genes in Bacillus subtilis (Vold, Structure and Organization of Genes for Transfer Ribonucleic Acid in Bacillus subtilis, Microbiological Reviews, 49:71-80, 1985) and Mycoplasma capricolum (Muto et al., Nucleic Acids Res., 18:5037-5042, 1990). Underlined bases are differences between the primers and tRNA gene sequences. The complements of the primers StaphAspS and StaphPheS are shown.
Figure 6 shows the results of PCR amplifying the tRNAiMet-tRNAPhe intergenic spacer from five species of Staphylococcus. Strains from five species of Staphylococcus, listed in Table 3, were investigated. Examples from each species are shown. In panel A high stringency PCR was performed using the primers StaphiMet3 and StaphAspS with 1 ng of template DNA and 1,000 ng of human DNA. In panel B the primers were StaphAsp3 and StaphPheS. Panel A and Panel B:
Lanes 1 and 2: Strains ATCC 27844 type strain and ATCC 27846 of Staphylococcus hominis both isolated from humans.
Lanes 3 and 4: Strains CPB10E2 and GAD473 of S. warneri, isolated from a Cercopithecus monkey and bushbaby, respectively.
Lanes 5 and 6: Strains ATCC 6538 and Sau3A of S. aureus, both isolated from humans.
Lanes 7 and 8: Strains AW263 and ATCC 29970 type strain of S. haemolyticus, both isolated from humans.
Lane 9: Strain CM89 of S. cohnii subsp. urealyticum, isolated from a human.
Detailed Description of the Invention
In order that the invention herein described may be more fully understood, the following detailed description is set forth.
This invention relates to a method for generating a set of discrete DNA amplification products characteristic of a genome. This set of discrete DNA amplification products can be resolved by techniques such as gel electrophoresis, producing a distinctive pattern, known as a "fingerprint", that can be used to identify the genome. This method uses a distinctive and novel variation of the polymerase chain reaction (PCR) technique that employs one or more consensus primers based on a consensus sequence described herein and is therefore designated the "consensus sequence primed polymerase chain reaction" ("CP-PCR") method.
I. THE GENERAL METHOD
In general, the method of the invention involves the following steps:
(1) rendering target nucleic acids of the genome accessible to priming;
(2) priming the target nucleic acids of the genome with a preselected single-stranded consensus sequence primer to form primed nucleic acids;
(3) performing a number of cycles of PCR on the primed nucleic acids to generate a set of discrete amplification products; and
(4) if the discrete DNA amplification products are to be used for the identification of a genome, comparing the amplification products with those produced from nucleic acids obtained from genomes of known species.
Alternatively, the amplification products produced by the invention can be used to assemble genetic maps for genome analysis. Each of these steps is discussed in detail below.
A. Selection of Genome
The method of the present invention is particularly well suited to the generation of discrete DNA amplification products from nucleic acids obtained from genomes of all sizes from 5 x 10⁴ nucleotide bases (viruses) to 3 x 10⁹ bases and greater (animals and plants).
"Nucleic acids" as that term is used herein means that class of molecules including single-stranded and double-stranded deoxyribonucleic acid (DNA) , ribonucleic acid (RNA) and polynucleotides.
The CP-PCR method can be applied to such economically important plants as rice, maize, and soybean. It can also be applied to the human genome and to the genome of a cultured cell line. The cultured cell line can be chimeric with at least one human chromosome in an otherwise non-human background. The non-human background can be rodent, such as mouse or Chinese hamster. As described in Example 2, infra, the DNA amplification products can be used to determine that an unidentified sample of an organism, such as a bacterium, belongs to the genus Staphylococcus and can be used further to determine to which species and/or strain of that genus the organism belongs.
B. Rendering the Nucleic Acids of the Genome Accessible to Priming
"Genomic DNA" is used in an art recognized manner to refer to a population of DNA that comprises the complete genetic component of a species. Thus genomic DNA comprises the complete set of genes present in a preselected species. The complete set of genes in a species is also referred to as a genome. Depending on the species, genomic DNA can vary in complexity, and in number of nucleic acid molecules. In higher organisms, genomic DNA is organized into discrete nucleic acid molecules (chromosomes) .
For species low in the evolutionary scale, such as bacteria, viruses, yeast, fungi and the like, a genome is significantly less complex than for a species high in the evolutionary scale. For example, whereas E. coli is estimated to contain approximately 2.4 x 10⁹ grams per mole of haploid genome, man contains about 7.4 x 10¹² grams per mole of haploid genome.
Genomic DNA is typically prepared by bulk isolation of the total population of high molecular weight nucleic acid molecules present in a biological material derived from a single member of a species. Genomic DNA can be prepared from a tissue sample, from a whole organism or from a sample of cells derived from the organism.
Exemplary biological materials for preparing mammalian genomic DNA include a sample of blood, muscle or skin cells, tissue biopsy or cells cultured from tissue. Methods for isolating high molecular weight DNA are well known. See, for example, Maniatis et al., in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y. (1982); and U.S. Patent No. 4,800,159 to Mullis et al.
Rendering the nucleic acids of the genome accessible to priming requires that the nucleic acids be available for base-pairing by primers and that DNA polymerases and other enzymes that act on the primer-template complex can do so without interference. The nucleic acids must be substantially free of protein that would interfere with priming or the PCR process, especially active nuclease, as well as being substantially free of nonprotein inhibitors of polymerase action such as heavy metals.
A number of methods well-known in the art are suitable for the preparation of nucleic acids in a condition accessible to priming. Typically, such methods involve treatment of cells or other nucleic acid-containing structures, such as virus particles, with a protease such as proteinase K or pronase and a strong detergent such as sodium dodecyl sulfate ("SDS") or sodium lauryl sarcosinate ("Sarkosyl") to lyse the cells. This is followed by extraction with phenol and chloroform to yield an aqueous phase containing the nucleic acid. This nucleic acid is then precipitated with ethanol and redissolved as needed. (See Example 1, infra.)
Alternatively, as where the genome is in bacteria, a small portion (~0.5 mm²) of a single bacterial colony can be removed with a 200-μL automatic pipette tip and suspended in 5 μL of TE (0.01 M Tris-HCl, pH 8.0, 1 mM EDTA) in a plastic microfuge tube and boiled for 5 minutes. After the sample is boiled, the debris is pelleted by centrifugation. The CP-PCR method can then be performed directly on the nucleic acids present in the supernatant sample after appropriate dilution.
In some applications, it is possible to introduce samples such as blood or bacteria directly into the PCR protocol as described below without any preliminary step because the first cycle at 94°C bursts the cells and inactivates any enzymes present.
C. Priming the Target Nucleic Acids
1. The Consensus Primer Sequence
a. General Considerations
The sample of target nucleic acids is primed with a single-stranded primer. Individual single-stranded primers, pairs of single-stranded primers or a mixture of single-stranded primers can be used.
A primer for use in this invention is a consensus sequence polynucleotide primer, or consensus primer. A consensus primer is a polynucleotide having a nucleotide sequence that comprises a region at its 3' terminus that is homologous to a consensus sequence derived from a family of related genes within a genome, or derived from related genes found in the genomes of different species. The related genes from which a consensus primer is derived are a class of genes that occur as a cluster within the genome.
Clusters of related genes are known to occur for a variety of gene families, any of which are suitable as a source of related genes for deriving a consensus primer for use in the present invention. Gene clusters are regions of a genome in which related genes are organized within a single nucleic acid molecule of the genome, i.e., are genetically linked. Gene clusters comprise two domains: (1) the nucleotide sequences that define each of the related genes that are members of the cluster, and (2) the nucleotide sequences that define the spacer region between each member of the related genes of the cluster. Whereas the related genes (members) of the cluster are conserved when compared at the level of sub-species, species, family, order or other division of evolutionary relatedness, the spacer region of a gene cluster is more variable in nucleotide sequence than the nucleotide sequence defining a member of a cluster.
Variability in the spacer regions of a gene cluster provides the polymorphisms that produce a fingerprint by the present methods which is characteristic of the organism being analyzed.
Variability in spacer regions can be manifest by differences in actual sequence, by differences in spacer length between members of the cluster, and even by differences in overall organization of the members of a cluster.
Organization of related genes in a cluster can vary both in the linear order of the members of the cluster on the nucleic acid molecule defining the cluster and in the orientation of each member of the cluster relative to one another.
Typical and preferred gene clusters are the structural RNA families, namely the family of genes that encode transfer RNA (tRNA) molecules, and the family of genes that encode ribosomal RNA (rRNA) molecules known as 28S, 16S and 5S rRNAs. Other gene clusters are the linked genetic elements of an operon.
The tRNA gene cluster is particularly preferred and is exemplary of the general methods described herein. Although the sequence of the primer can vary widely, so long as it comprises a consensus sequence at its 3' terminus, some guidelines to primer selection are found in Innis and Gelfand, "Optimization of PCRs," in PCR Protocols: A Guide to Methods and Applications (M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White, eds., Academic Press, New York, 1990), pp. 3-12, incorporated herein by this reference. Briefly, the primer typically has 50 to 60% G + C composition and is free of runs of three or more consecutive C's or G's at the 3'-end or of palindromic sequences, although having a (G + C)-rich region near the 3'-end may be desirable. These guidelines, however, are general and intended to be nonlimiting. Additionally, in many applications it is desirable to avoid primers with a T at the 3' end because such primers can prime relatively efficiently at mismatches, creating a degree of mismatching greater than desired, and affect the background amplification.
The CP-PCR method is based on the rationale that for any preselected gene cluster, which comprises at least two related and genetically linked genes, there is a spacer region between the linked genes which is variable and contains nucleotide sequence differences when compared to the same region from a different sub-species, species, genus, family or other evolutionary division of organisms having members of the gene cluster. The consensus primer is selected to amplify one or more specific primer extension products that contain the nucleotide sequence of one or more of the spacer regions between two genes of a cluster.
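The nonlimiting primer guidelines summarized above (G + C composition, runs of C or G at the 3'-end, palindromic sequences, and a T at the 3' end) can be screened with a short sketch. Several details are assumptions made only for illustration: the "3'-end" is taken to be the last five bases, a run is taken to be three or more consecutive bases drawn from C and G, and a palindrome is taken to be any six-base window equal to its own reverse complement. The function names and example primers are hypothetical.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def has_palindrome(primer, size=6):
    """True if any window of `size` bases equals its own reverse complement."""
    primer = primer.upper()
    return any(primer[i:i + size] == reverse_complement(primer[i:i + size])
               for i in range(len(primer) - size + 1))


def guideline_warnings(primer, tail=5):
    """Return a list of guideline warnings; an empty list means none apply."""
    primer = primer.upper()
    warnings = []
    if not 0.50 <= gc_fraction(primer) <= 0.60:
        warnings.append("G + C composition outside 50 to 60%")
    # Assumed reading: a run of 3+ bases drawn from C and G in the last `tail` bases.
    if re.search(r"[CG]{3,}", primer[-tail:]):
        warnings.append("run of three or more C/G at the 3'-end")
    if has_palindrome(primer):
        warnings.append("palindromic sequence present")
    if primer.endswith("T"):
        warnings.append("T at the 3' end")
    return warnings


# Hypothetical primers, not sequences from this disclosure.
clean = guideline_warnings("AGTCCATGCAGTCAGGTACA")
flagged = guideline_warnings("ATATGAATTCATAGCGGGCT")
```

Because the guidelines are stated to be general, a warning from such a screen would be a prompt for closer inspection rather than grounds for rejecting a consensus primer.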
The consensus primer amplifies DNA segments containing spacer regions because the primer is selected to provide a 3' terminus for primer extension that "points" the direction of primer extension across the spacer region. The consensus sequence of the primer is selected to have such a degree of homology with a consensus region within the individual members of a gene cluster that the primers can be expected to anneal to many consensus sequences contained within a variety of the members of a gene cluster. Some of these will be within a few hundred basepairs of each other and on opposite strands, thereby satisfying the requirements for PCR amplification. Thus, the sequences between these consensus sequence positions will be PCR amplifiable. The extent to which sequences amplify will depend on the efficiency of priming at each pair of primer annealing sites.
Because the sequence of the primer is selected to contain some degree of homology with a consensus sequence with respect to the target nucleic acid sequence of the genome, a substantial degree of hybridization between the DNA strands of the primer and the target nucleic acids of the genome is expected to occur. "Substantial degree of hybridization" is defined herein to mean, in the context of a primer extension reaction thermocycle in which primer annealing occurs, that the hybridizing conditions favor annealing of homologous nucleotide sequences under "high stringency conditions". In some embodiments, where less evolutionary relatedness is desired, the hybridizing conditions can be carried out under low stringency or intermediate stringency conditions so that up to 10% of the nucleotide bases of a primer sequence are paired with inappropriate (non-complementary) bases in the target nucleic acid, e.g., a guanine base in the primer is paired with an adenine base in the target nucleic acid.
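The requirement that annealing sites lie close together and on opposite strands can be illustrated with a rough in silico sketch. It is a simplification under stated assumptions: annealing is modeled as an approximate string match that tolerates up to 10% mismatched bases while requiring the last five bases at the 3' end to match exactly, and priming efficiency is ignored. The function names, the size limit and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def annealing_sites(strand, primer, max_mismatch_fraction=0.10, perfect_3prime=5):
    """Return start indices in `strand` where the primer sequence roughly occurs.

    A site is accepted when the last `perfect_3prime` bases match exactly and
    the number of mismatches does not exceed the allowed fraction of the primer.
    """
    strand, primer = strand.upper(), primer.upper()
    n = len(primer)
    allowed = int(max_mismatch_fraction * n)
    sites = []
    for i in range(len(strand) - n + 1):
        window = strand[i:i + n]
        if window[n - perfect_3prime:] != primer[n - perfect_3prime:]:
            continue
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= allowed:
            sites.append(i)
    return sites


def predicted_product_sizes(template, primers, max_size=1000):
    """Return sorted sizes of segments flanked by inward-pointing primer sites."""
    top = template.upper()
    bottom = reverse_complement(top)
    length = len(top)
    # Left edge of a product on the top strand, with the primer length.
    left = [(i, len(p)) for p in primers for i in annealing_sites(top, p)]
    # Right edge of a product, converted from bottom-strand to top-strand coordinates.
    right = [(length - j, len(p)) for p in primers for j in annealing_sites(bottom, p)]
    sizes = set()
    for start, left_len in left:
        for end, right_len in right:
            size = end - start
            if left_len + right_len <= size <= max_size:
                sizes.add(size)
    return sorted(sizes)


# Hypothetical primers and template with one amplifiable 50-base segment.
p1, p2 = "ACGGTCAGTC", "TTGCAGGCAT"
template = "GG" + p1 + "A" * 30 + reverse_complement(p2) + "GG"
sizes = predicted_product_sizes(template, [p1, p2])
```

The set of predicted sizes plays the role of an idealized fingerprint; spacer length differences between genomes would change those sizes.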
As used herein, the phrase "internal mismatching" in its various grammatical forms refers to non-complementary nucleotide bases in the primer, relative to a template to which it is hybridized, that occur between the 5'-terminal most and 3'-terminal most bases of the primer that are complementary to the template. Thus, 5'-terminal and/or 3'-terminal non-complementary bases are not "internally mismatched" bases. A "substantial degree of internal mismatching" is such that at least 6.5% of the nucleotide bases of the primer sequence are paired with inappropriate bases in the target nucleic acid.
In the CP-PCR method of the invention the genome may be primed with a single consensus primer, a combination of two or more primers or a mixture of heterogeneous primers, each individual primer in the mixture having a different, but related sequence.
When a mixture of primers is used, some, but not all, of the primers can match more efficiently. An example of use of a mixture of primers is provided in Example 2, infra. Preferably, the consensus primer is about 10 to about 50 nucleotide bases long, and more preferably, about 17 to about 40 bases long. In principle, the shorter the oligonucleotide, the more perfect a match must be in order to permit priming. The primer can be of any sequence so long as it comprises a consensus sequence as defined herein. The primer can have sequence redundancies reducing the occurrence of mismatches.
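The definition of "internal mismatching" given above, together with the 6.5% threshold, can be expressed as a small calculation. The sketch below assumes that the annealing site is supplied as the template-derived sequence written in the same orientation as the primer, so that a complementary pairing appears as two identical letters; that representation and the function names are illustrative assumptions.

```python
def internal_mismatches(primer, site):
    """Count mismatches lying between the outermost matching positions.

    `site` is the target sequence written in the primer's own orientation, so
    position k pairs correctly when primer[k] == site[k]. Mismatched bases at
    the 5' or 3' terminus, outside the outermost matches, are not counted.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    matches = [p == s for p, s in zip(primer.upper(), site.upper())]
    if True not in matches:
        return 0
    first = matches.index(True)
    last = len(matches) - 1 - matches[::-1].index(True)
    return sum(1 for matched in matches[first:last + 1] if not matched)


def has_substantial_internal_mismatching(primer, site, threshold=0.065):
    """True when internal mismatches reach the threshold fraction of the primer."""
    return internal_mismatches(primer, site) / len(primer) >= threshold


# Hypothetical 20-base primer and two hypothetical annealing sites.
primer = "ACGTACGTACGTACGTACGT"
# Mismatches at position 0 (terminal) and positions 5 and 10 (internal).
site_two_internal = "T" + primer[1:5] + "G" + primer[6:10] + "C" + primer[11:]
# A single internal mismatch at position 5.
site_one_internal = primer[:5] + "G" + primer[6:]
```

For a 20-base primer, two internal mismatches give 10% and meet the threshold, whereas one gives 5% and does not.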
Preferably, both the template and the primer are DNA. The template can also be single-stranded RNA molecules, for example messenger RNA, in which case an enzyme with reverse transcriptase activity, such as avian myeloblastosis virus (AMV) reverse transcriptase or Moloney murine leukemia virus (Mo-MLV) reverse transcriptase, is used to generate a hybrid DNA-RNA molecule with an arbitrary primer or a poly T primer. The DNA strand of this hybrid DNA-RNA molecule is then used as the starting material for CP-PCR. Alternatively, the primer can also be a single-stranded ribonucleotide of the appropriate length, which is extended at its 3'-hydroxyl terminus by reverse transcriptase, forming a double-stranded molecule in which one strand is partially DNA and partially RNA.
b. Conserved Transfer RNA Primers
The gene clusters formed by tRNA genes provide a preferred family of related genes for practicing the methods of the present invention. tRNA genes occur in multiple copies dispersed throughout the genome in most species and tend to be clustered. McBride et al., Genomics, 5:561-573 (1989). In E. coli, there are estimated to be at least 100 tRNA genes, of which about 30 are mapped. Jinks-Robertson et al., in "E. coli and S. typhimurium", Neidhardt, ed., ASM Press, Washington, pp. 1358-1385 (1989). About half of the mapped genes are in seven clusters of two to seven genes per cluster, with the spacing between genes being variable, ranging from 10 to 200 basepairs. The genes are generally arranged in a head to tail fashion and are, at least in some cases, organized into operons. In Bacillus subtilis, Photobacterium phosphoreum and Spiroplasma the genes are more tightly clustered. See Vold, Microbiol. Rev., 49:71-90 (1985); Giroux et al., J. Bacteriol., 170:5601-5606 (1988); and Rogers et al., Israel J. Med. Sci., 20:768-772 (1984), respectively. For instance, in B. subtilis there are two main clusters consisting of 16 and 21 tRNA genes. One operon in P. phosphoreum has eight genes and five tRNAPro pseudogenes, all in less than 1,500 base pairs. In the human nuclear genome there are estimated to be 1300 tRNA genes and a large number of tRNA pseudogenes. At least seven clusters are on seven different chromosomes. McBride et al., Genomics, 5:561-573 (1989). However, in the few characterized cases in mammals, the tRNA genes are not in operons, being oriented in all possible directions within clusters.
Fungi, plants and animals have organelle genomes in addition to their nuclear genomes. Organelle genomes are much smaller than nuclear genomes but nevertheless encode tRNA genes for a more redundant genetic code. For example, the very small (circa 16,000 bp) animal mitochondrial DNA has 22 tRNA genes, some of which are closely spaced. Fox, Ann. Rev. Genet., 21:67-91 (1987).
Chloroplast and mitochondrial DNA from plants are generally more complex than those in animals and often have more tRNA genes. For instance, the 121,024 base pair chloroplast genome of liverwort has 36 tRNA genes, a few of which are clustered. The yeast mitochondrial genome has at least 25 tRNA genes in 78,000 base pairs. Ohyama et al., Nature, 322:572-574 (1986).
Consensus tRNA primers for use in the present methods were developed using the known tRNA sequences available in nucleotide sequence databases. Presently, over 500 tRNA sequences are known.
Given the variability in tRNA gene sequences between isoacceptors from different species, and the even greater difference between tRNAs for different amino acids, a substantial universal consensus does not exist. However, a reasonable match with a fraction of all tRNAs can readily be devised. With this objective, it is possible to identify and produce consensus primers that have (a) at least a five base perfect match between the 3' end of the primer and many tRNA genes and (b) extensive homology in the rest of the primer with a number of different tRNA genes from a wide variety of sources. A primer sequence is a consensus sequence if at least five nucleotides at the 3' end of the primer are a perfect match in homology to at least several members of a set of possible members of the family of genes in the cluster, together with extensive homology in the rest of the primer as described herein.
Preferably, the match is to at least 10 percent of the members where the set comprises at least one tRNA gene for each natural amino acid (i.e., a set size of 21 tRNA genes). More preferably, the match is to at least 30 percent, and still more preferably at least 50 percent, of the members of a set comprising 21 members.
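The percentage criteria above can be checked with a brief sketch. It assumes that each gene's conserved region has already been written in the same orientation as the primer and aligned so that its last bases face the primer's 3' end; the gene names and sequences below are hypothetical, and the tier labels merely paraphrase the wording of the preceding paragraph.

```python
def three_prime_matches(primer, regions, anchor=5):
    """Return names of regions that end with the primer's last `anchor` bases.

    `regions` maps a gene name to its conserved region, written in the primer's
    orientation so that a perfect 3' match is a simple suffix comparison.
    """
    tail = primer.upper()[-anchor:]
    return sorted(name for name, region in regions.items()
                  if region.upper().endswith(tail))


def preference_tier(matched, set_size=21):
    """Classify the matched fraction of the set against the stated thresholds."""
    fraction = matched / set_size
    if fraction >= 0.50:
        return "still more preferred"
    if fraction >= 0.30:
        return "more preferred"
    if fraction >= 0.10:
        return "preferred"
    return "below the preferred range"


# Hypothetical conserved regions; not sequences from this disclosure.
regions = {
    "tRNA-Ala": "GGGCTATAGCTCA",
    "tRNA-Gly": "GCGGGTGTAGCTCA",
    "tRNA-Leu": "GCCGAAGTGGCGGA",
}
matched_names = three_prime_matches("AGTCCGGTAGCTCA", regions)
tier_for_eight_of_21 = preference_tier(8)
```

For instance, a primer whose 3' end matches 8 of 21 genes covers about 38 percent of the set and falls in the "more preferred" range under this reading.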
Because more than one possible nucleotide can reside at a position of a consensus sequence and satisfy the requirements above, one embodiment contemplates that a consensus sequence contains the most common nucleotide of the set at any given position or the next most common nucleotide of the set that occurs at that same position in excess of 10 percent, and more preferably 30 percent, of the members where the set comprises 21 members.
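One way to picture the position-by-position rule above is a column tally over aligned sequences. The sketch below assumes the sequences are already aligned to equal length without gaps, and it admits the next most common nucleotide at a position only when its frequency exceeds 10 percent; the alignment shown is invented for illustration.

```python
from collections import Counter


def allowed_bases(aligned, min_alternative=0.10):
    """For each column, list the most common base and any qualifying runner-up.

    The runner-up is included only if it occurs in more than `min_alternative`
    of the sequences at that position.
    """
    if len(set(len(seq) for seq in aligned)) != 1:
        raise ValueError("aligned sequences must share one length")
    total = len(aligned)
    result = []
    for column in zip(*(seq.upper() for seq in aligned)):
        ranked = Counter(column).most_common()
        bases = [ranked[0][0]]
        if len(ranked) > 1 and ranked[1][1] / total > min_alternative:
            bases.append(ranked[1][0])
        result.append(bases)
    return result


def consensus(aligned):
    """Return the string of most common bases, one per column."""
    return "".join(bases[0] for bases in allowed_bases(aligned))


# Hypothetical alignment of ten short sequences.
aligned = ["ACGT", "ACGA", "ACGT", "TCGT", "ACGA",
           "ACCT", "ACGT", "ACGA", "ACGT", "ACGT"]
columns = allowed_bases(aligned)
```

In this toy alignment the last column admits a second nucleotide because it occurs in 3 of 10 sequences, while the single deviations in the first and third columns fall at, not above, 10 percent and are excluded.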
Alternatively, a tRNA consensus sequence can be defined in terms of a subset of tRNA genes, with the objective being to design a primer that produces a fingerprint that is more specific to a particular genome. In this case, the consensus can be limited to a family of tRNA genes for a particular amino acid, such as phenylalanine, i.e., tRNA-Phe, where the tRNA-Phe genes are from a single genome, or from a family of related species, or related genera.
In a preferred embodiment that is exemplary herein, a consensus sequence was developed by comparing the tRNA genes for a complete set of natural amino acids (21 members) from the genome of the bacteria Bacillus. Of importance is the fact that the consensus, although derived from Bacillus, produced consensus primers that would produce fingerprints in species of organisms across the kingdoms including plant, bacteria and animal. Thus, the selection of the consensus sequence is not critical to the general method of the present invention, so long as the sequence represents a consensus in the manner defined herein. However, the selection of a particular consensus sequence, and the extent of homology contained in the sequence will alter the particular fingerprint observed for a particular genome.
For example, using the known 21 Bacillus tRNA genes coding the 21 common amino acids, a consensus can be identified where 8 of the 21 tRNA genes exhibit a 5 base perfect match at the resulting consensus primer's 3' end. This primer, designated T5A, also has at least 10 of the remaining 19 nucleotide bases perfectly matched with those 8 tRNA genes.
Consensus primer T5A has the nucleotide base sequence shown in Table 1 and is derived from the complement of a consensus sequence at the 5' terminal region of the Bacillus tRNA genes. Therefore, the consensus primer faces out from a tRNA gene in the 5' to 3' direction across a spacer region located upstream from any tRNA gene having a sequence match to which T5A can hybridize.
By "faces out" is meant, with respect to members of a gene cluster and the respective spacer region, that the 3' end of the primer, when hybridized to a consensus sequence region of a member of the gene cluster, provides an initiator 3' end for primer extension to pass into the spacer region and away from the interior of the member gene, i.e., the primer extension product extends "out" into the spacer rather than "in" into the member gene. Consensus primers T5B, T3A and T3B were fashioned in a similar manner as above for T5A. Namely, tRNA genes were aligned for all 21 amino acid tRNA genes, and "best match" consensus sequences were developed. T5B, like T5A, faces out and upstream from the 5' end of the consensus sequence, whereas T3A and T3B represent consensus sequences at the 3' end of the tRNA gene facing out and downstream from the 3' end of the tRNA gene and will primer extend across the spacer region downstream from a tRNA gene.
The sequences of the tRNA consensus primers T5A, T5B, T3A and T3B are shown in Table 1 below and are written in the direction of 5' to 3'.
Table 1
T5A 5'-AGTCCGGTGCTCTAACCAACTGAG-3'
T5B 5'-AATGCTCTACCAACTGAACT-3'
T3A 5'-GGGGGTTCGAATTCCCGCCGGCCCCA-3'
T3B 5'-AGGTCGCGGGTTCGAATCC-3'
Given that there are one hundred or more divergent tRNA genes in a typical genome, there are likely to be many matches for the above primers and these will vary from almost perfect to rather poor matches depending on the consensus sequence target to which the primers are hybridized. Other consensus sequences can be similarly derived by comparing the tRNA gene families of other species, both within the family of genes for a given species or by comparison to tRNA genes in different species.
The consensus primers shown on Table 1 are particularly preferred and are used as exemplary of the methods of the present invention.
However, other consensus tRNA primers can be designed so long as the resulting sequence contains a minimum of a three base perfect match, and preferably at least a 5 base perfect match, at the 3' end of the consensus primer, together with a minimum of at least a 50 percent match with the next 15 bases of the primer adjacent to the perfect match region of the primer, when the consensus sequence is compared to at least ten of the 21 amino acid tRNA genes for a given genome.
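The design criteria above can be expressed as a simple check. The following is a minimal sketch under stated assumptions: the function names matches_site and is_consensus_primer are hypothetical, each candidate target site is assumed to be supplied already aligned to the primer, written in the same 5' to 3' sense and with the same length, and the default parameter values follow the preferred figures stated above.

```python
def matches_site(primer, site, perfect=5, window=15, min_identity=0.5):
    """Check one aligned target site against the primer.

    Requires (a) a perfect match over the last `perfect` bases at the 3' end
    and (b) at least `min_identity` identity over the next `window` bases of
    the primer adjacent to the perfect match region.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must be aligned to equal length")
    if primer[-perfect:] != site[-perfect:]:
        return False
    primer_adjacent = primer[:-perfect][-window:]
    site_adjacent = site[:-perfect][-window:]
    if not primer_adjacent:
        return True
    identical = sum(a == b for a, b in zip(primer_adjacent, site_adjacent))
    return identical / len(primer_adjacent) >= min_identity

def is_consensus_primer(primer, sites, min_genes=10, **criteria):
    """True when at least `min_genes` of the aligned sites satisfy the criteria."""
    return sum(matches_site(primer, s, **criteria) for s in sites) >= min_genes

T5A = "AGTCCGGTGCTCTAACCAACTGAG"
# Hypothetical site: perfect 3' end, 8 mismatches in the adjacent 15 bases.
poor_site = T5A[:4] + "ATTACAGA" + T5A[12:]
print(matches_site(T5A, T5A), matches_site(T5A, poor_site))
```

Passing perfect=3 relaxes the check to the stated minimum of a three base perfect match at the 3' end.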
In one embodiment, a consensus tRNA primer can have a nucleotide sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', which sequences are derived from the tRNA primers utilized in Example 2. These consensus primers have a length of at least fifteen nucleotides, and additionally have about 30 percent homology, and preferably at least fifty percent homology, to a consensus tRNA sequence in the fifteen nucleotides located at the 3' terminus of the primer. A consensus primer can also be a nucleotide sequence based on another consensus primer of this invention but being progressively truncated at the 5' end.
A single tRNA consensus primer can provide the requisite "pair" of primers for a PCR amplification product to form in those cases where separate primer molecules having the same consensus sequence independently hybridize to opposite polarity strands of the target genome. This independent hybridization is possible for several reasons. First, the target genome can contain tRNA genes in opposite relative orientations on the genome so that two tRNA genes each provide a consensus sequence such that the resulting hybridized primers "face out" from their respective tRNA genes and at the same time "face toward" each other. Second, both the 5' and 3' ends of a tRNA gene exhibit some degree of homology, which is well known and contributes to the classical "cloverleaf" structure of a tRNA molecule. Thus, the same homology that allows for the 5' and 3' ends of a tRNA molecule to self-hybridize is available to hybridize to a consensus primer selected to have homology to both a 3' and 5' end of the tRNA gene.
Exemplary of single consensus tRNA primers useful to produce a characteristic fingerprint are the primers T5A and T3A, each described in Example 1.
The tRNA gene cluster, as described above, is a preferred gene cluster for practicing the fingerprint characterization methods described herein. Due to the nature of evolutionary genetics and the size and organization of tRNA gene clusters, tRNA polymorphisms observed by the present fingerprinting methods are variances due to differences in spacer length and sequence content rather than differences in overall cluster organization.
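The relationship between the four 3'-terminal pentamers and the primers of Table 1, and the idea of progressive 5' truncation, can be illustrated as follows. This is a sketch only: the helper names are hypothetical, and stopping the truncation series at fifteen nucleotides is an assumption drawn from the minimum length stated above.

```python
# Primer sequences as listed in Table 1, written 5' to 3'.
PRIMERS = {
    "T5A": "AGTCCGGTGCTCTAACCAACTGAG",
    "T5B": "AATGCTCTACCAACTGAACT",
    "T3A": "GGGGGTTCGAATTCCCGCCGGCCCCA",
    "T3B": "AGGTCGCGGGTTCGAATCC",
}

def three_prime_terminus(primer, n=5):
    """Return the last n bases at the 3' end of a primer."""
    return primer[-n:]

def five_prime_truncations(primer, min_length=15):
    """Return the primer followed by versions progressively shortened at the 5' end.

    The 3' end is left untouched; the series stops at `min_length` bases
    (an assumed lower bound, see the lead-in).
    """
    return [primer[i:] for i in range(len(primer) - min_length + 1)]

for name, seq in PRIMERS.items():
    print(name, len(seq), three_prime_terminus(seq))

for variant in five_prime_truncations(PRIMERS["T3B"]):
    print(variant)
```

Every truncated variant retains the same 3'-terminal pentamer, which is the portion of the primer required to match perfectly.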
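The way a single primer can supply both members of a primer "pair" can be sketched in silico. The example below is a deliberately simplified model, not the method of the invention: it anchors priming only on an exact match of the 3'-terminal five bases, ignores the homology requirement for the rest of the primer, and uses a hypothetical toy genome. The function names and the max_size cutoff are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_all(text, pattern):
    """Start positions of every (possibly overlapping) occurrence of pattern."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits

def single_primer_products(genome, primer, anchor=5, max_size=2000):
    """Predicted product sizes when one primer primes on both strands.

    A rightward-extending site is where the primer's 3' anchor occurs in the
    top strand; a leftward-extending site is where its reverse complement
    occurs. A product is predicted when the two sites face toward each other.
    """
    length = len(primer)
    tail = primer[-anchor:]
    rightward = find_all(genome, tail)
    leftward = find_all(genome, revcomp(tail))
    sizes = []
    for i in rightward:
        for j in leftward:
            if j >= i + anchor:  # 3' ends face each other without overlap
                size = (j + length) - (i + anchor - length)
                if size <= max_size:
                    sizes.append(size)
    return sorted(sizes)

T5A = "AGTCCGGTGCTCTAACCAACTGAG"
spacer = "AT" * 25
# Hypothetical genome: two sites in opposite orientation around a 50-base spacer.
toy_genome = "T" * 30 + T5A + spacer + revcomp(T5A) + "T" * 30
print(single_primer_products(toy_genome, T5A))
```

Two sites in the same orientation yield no predicted product in this model, which mirrors the requirement that the hybridized primers face toward each other.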
The results using consensus tRNA primers indicate that the fingerprint method of CP-PCR is most useful to make identification of a particular genus, although in some cases consensus tRNA primers can be used to distinguish species. See, for example, Figure 1.
c. Conserved Ribosomal RNA Primers
The gene cluster formed by the ribosomal RNA (rRNA) gene family is less complex than the tRNA gene cluster (having fewer members) and is considerably better conserved than the tRNA gene cluster both at the level of organization of the cluster, and at the level of the sequences within the rRNA genes. The rRNA cluster is comprised of three major species, the 28s, the 16s and the 5s rRNA genes. The "consensus" in the case of the rRNA cluster is not a similar nucleotide sequence that is common to the members of the cluster, as was the case for the tRNA gene family members. Rather, the "consensus" is found between species. For example, the 5' end of the 28s rRNA gene is evolutionarily conserved between all genera across the bacterial kingdom to the extent that consensus primers can be defined from the 28s rRNA gene that produce family specific, order specific and even kingdom specific fingerprints when using consensus rRNA primers in the present invention.
Therefore, polymorphisms observed using consensus rRNA primers are at the level of family, order and higher evolutionary categories than the polymorphisms observed for consensus tRNA primers. This is primarily due to the fact that rRNA gene clusters are smaller and more evolutionarily conserved than tRNA clusters.
d. Use of Mixtures of Primers
As discussed above, mixtures of heterogeneous primers can also be used, with each primer in the mixture having a different sequence. The individual primers in the mixture can all be the same length. Preferably, primers are constructed to avoid self-priming internally and the creation of artifacts.
A heterogeneous mixture of primers may contain some primers that match with the consensus sequences on target nucleic acids in a manner that provides a more distinct fingerprint. The use of such primers may allow the initial priming steps to be performed at a higher temperature (higher stringency) or might allow a consistency of pattern over a wider range of template concentrations. When combinations of two or more individual primers are used, the primers are used simultaneously in the same CP-PCR reaction. These combinations provide a very different pattern from that produced by each primer alone. See, for example, the difference observed in Figure 1 between group a, group b and group c utilizing one, one, and two primers, respectively. Therefore, a combination of primers provides a different fingerprint than is generated by using each individual primer alone. When primers are used in such combinations, only primer pairs that do not produce a primer artifact can be used.
In one embodiment, a mixture of primers comprises two or more primers where the nucleotide sequences of each primer are substantially identical, except that a few nucleotide bases differ at a single position or at two positions. These mixtures contain consensus primers that each individually exhibit consensus matches with different subsets of members of a gene cluster. Thus, the use of two primers produces a distinct pattern that is typically more complex and therefore more characteristic of any given genome than the use of either primer alone.
2. Concentration of Primer and Template
The quantity of the nucleic acid genome used in the CP-PCR amplification depends on the complexity of the particular genome used. Simple genomes, such as bacterial genomes, have a genome size of less than about 5 million base pairs (5 megabases). Complex genomes, such as sativa species (rice), have a genome size of about 700-1000 megabases. Other complex genomes such as maize or humans have a genome size of about 3000 megabases.
The amount of simple genome nucleic acid used as template is from about 10 pg to about 250 ng, preferably from about 30 pg to about 7.5 ng. Most preferred is an amount of simple genome nucleic acid template of about 1 ng.
The amount of nucleic acid of a complex genome used as a template is from about 250 ng to about 0.8 ng. More preferably, the amount of nucleic acid of a complex genome used as template is from about 51 ng to about 0.8 ng. Most preferred are amounts of complex genome nucleic acid template of about 50 ng to about 10 ng.
The priming step is carried out as part of the PCR amplification process, and the conditions under which it is performed are discussed below under "Performance of PCR."
D. Performance of PCR
In one embodiment, the present invention utilizes an amplification method where the single-stranded template is hybridized with a primer or primers to form a primer-template hybridization product or products. A hybridization reaction admixture is prepared by admixing effective amounts of a primer, a template nucleic acid and other components compatible with a hybridization reaction. Templates of the present methods can be present in any form, with respect to purity and concentration, compatible with the hybridization reaction.
The hybridization reaction mixture is maintained under hybridizing conditions for a time period sufficient for the primer(s) to hybridize to the templates to form a hybridization product, i.e., a complex containing primer and template nucleic acid strands.
The phrase "hybridizing conditions" and its grammatical equivalents, when used with a maintenance time period, indicates subjecting the hybridization reaction admixture, in the context of the concentrations of reactants and accompanying reagents in the admixture, to time, temperature and pH conditions sufficient to allow the primer(s) to anneal with the template, typically to form a nucleic acid duplex. Such time, temperature and pH conditions required to accomplish hybridization depend, as is well known in the art, on the length of the primer to be hybridized, the degree of complementarity between the primer and the template, the guanine and cytosine content of the polynucleotide, the stringency of hybridization desired, and the presence of salts or additional reagents in the hybridization reaction admixture as may affect the kinetics of hybridization. Methods for optimizing hybridization conditions for a given hybridization reaction admixture are well known in the art.
The term "primer" as used herein refers to a polynucleotide, whether purified from a nucleic acid restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a template is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase, reverse transcriptase and the like, and at a suitable temperature and pH.
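The dependence of duplex stability on primer length and on guanine and cytosine content can be illustrated with a widely used rule of thumb, often called the Wallace rule, which estimates the melting temperature of a short oligonucleotide as 2°C per A or T plus 4°C per G or C. This rule is not part of the present disclosure and is shown only as a rough sketch; it is generally regarded as approximate even for short oligonucleotides and as unreliable for longer primers, and it ignores salt concentration and mismatches.

```python
def wallace_tm(oligo):
    """Rough melting temperature estimate (degrees C) by the 2 + 4 rule.

    Assumption: a short, perfectly matched DNA oligonucleotide containing
    only A, C, G and T. Not a substitute for empirical optimization.
    """
    oligo = oligo.upper()
    if set(oligo) - set("ACGT"):
        raise ValueError("oligo must contain only A, C, G and T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc

# Shorter oligonucleotides give lower estimates than longer ones of
# similar composition; GC-rich ones give higher estimates than AT-rich ones.
for oligo in ("AATCC", "GGTTCGAATCC", "ATATATATAT", "GCGCGCGCGC"):
    print(oligo, wallace_tm(oligo))
```

The estimate gives only the direction of the effect; actual hybridizing conditions are optimized empirically as stated above.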
The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agents for polymerization. The exact lengths of the primers will depend on many factors, including temperature and the source of primer. For example, depending on the complexity of the template sequence, a polynucleotide primer typically contains from about 8 to about 30 or more nucleotides, although it can contain fewer nucleotides. As few as 8 nucleotides in a polynucleotide primer have been reported as effective for use. Studier et al., Proc. Natl. Acad Sci. USA. 86:6917-21 (1989). Short primer molecules generally require lower temperatures to form sufficiently stable hybridization complexes with template to initiate primer extension.
In some cases, the primers used herein are selected to be "substantially" complementary to the different strands of each specific sequence to be synthesized or amplified. This means that the primer must contain at its 3' terminus a nucleotide sequence sufficiently complementary to nonrandomly hybridize with its respective template. Therefore, the primer sequence may not reflect the exact sequence of the template. For example, a non-complementary polynucleotide can be attached to the 5' end of the primer, with the remainder of the primer sequence being substantially complementary to the strand. Such noncomplementary polynucleotides might code for an endonuclease restriction site or a site for protein binding. Alternatively, noncomplementary bases or longer sequences can be interspersed into the primer, provided the primer sequence has sufficient complementarity with the sequence of the strand to be synthesized or amplified to non-randomly hybridize therewith and thereby form an extension product under polynucleotide synthesizing conditions. Sommer et al., Nuc. Acid Res., 17:6749 (1989), reports that primers having as little as a 3 nucleotide exact match at the 3' end of the primer were capable of specifically initiating primer extension products, although less nonspecific hybridization occurs when the primer contains more nucleotides at the 3' end having exact complementarity with the template sequence. Therefore, a substantially complementary primer as used herein must contain at its 3' end at least 3 nucleotides having exact complementarity to the template sequence. A substantially complementary primer preferably contains at least 8 nucleotides, more preferably at least 18 nucleotides, and still more preferably at least 24 nucleotides, at its 3' end having the aforementioned complementarity. Still more preferred are primers whose entire nucleotide sequence has exact complementarity with the template sequence.
The choice of a primer's nucleotide sequence depends on factors such as the distance from the region coding for the desired specific nucleic acid sequence present in a nucleic acid of interest and its hybridization site on the nucleic acid relative to any second primer to be used.
The primer is preferably provided in single-stranded form for maximum efficiency, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. Primers can be prepared by a variety of methods including de novo chemical synthesis and derivation of nucleic acid fragments from native nucleic acid sequences existing as genes, or parts of genes, in a genome, plasmid, or other vector, such as by restriction endonuclease digest of larger double-stranded nucleic acids and strand separation or by enzymatic synthesis using a nucleic acid template. De novo chemical synthesis of a primer can be conducted using any suitable method, such as, for example, the phosphotriester or phosphodiester methods. See Narang et al., Meth. Enzymol., 68:90 (1979); U.S. Patent No. 4,356,270; Itakura et al., Ann. Rev. Biochem., 53:323-56 (1989); and Brown et al., Meth. Enzymol., 68:109 (1979).
Derivation of a primer from nucleic acids involves the cloning of a nucleic acid into an appropriate host by means of a cloning vector, replication of the vector and therefore multiplication of the amount of the cloned nucleic acid, and then the isolation of subfragments of the cloned nucleic acids. For a description of subcloning nucleic acid fragments, see Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, pp. 390-401 (1982); and see U.S. Patents No. 4,416,988 and No. 4,403,036. The primed template is used to produce a strand of nucleic acid having a nucleotide sequence complementary to the template, i.e., a template-complement.
The template is subjected to a first primer extension reaction by treating (contacting) the template with a (first) primer. The primer is capable of initiating a primer extension reaction by non-randomly hybridizing (annealing) to a template nucleotide sequence, preferably at least about 8 nucleotides in length and more preferably at least about 20 nucleotides in length. This is accomplished by mixing an effective amount of the primer with the template and an effective amount of nucleic acid synthesis inducing agent to form a primer extension reaction admixture. The admixture is maintained under polynucleotide synthesizing conditions for a time period, which is typically predetermined, sufficient for the formation of a primer extension reaction product. The primer extension reaction is performed using any suitable method. Generally polynucleotide synthesizing conditions are those wherein the reaction occurs in a buffered aqueous solution, preferably at a pH of 7-9, most preferably about 8. Preferably, a molar excess (for genomic nucleic acid, usually about 10^6:1 primer:template) of the primer is admixed to the buffer containing the template strand. A large molar excess is preferred to improve the efficiency of the process. For polynucleotide primers of about 20 to 25 nucleotides in length, a typical ratio is in the range of 50 ng to 1 μg, preferably 250 ng, of primer per 100 ng to 500 ng of mammalian genomic DNA.
The deoxyribonucleotide triphosphates (dNTPs) dATP, dCTP, dGTP, and dTTP are also admixed to the primer extension reaction admixture in amounts adequate to support the synthesis of primer extension products, which amounts depend on the size and number of products to be synthesized. The resulting solution is heated to about 90°C-100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period the solution is allowed to cool to room temperature, which is preferable for primer hybridization. To the cooled mixture is added an appropriate agent for inducing or catalyzing the primer extension reaction, and the reaction is allowed to occur under conditions known in the art. The synthesis reaction may occur at from room temperature up to a temperature above which the inducing agent no longer functions efficiently. For example, if DNA polymerase is used as inducing agent, the temperature is generally no greater than about 40°C unless the polymerase is heat-stable.
The inducing agent may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes. Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase, T7 DNA polymerase, recombinant modified T7 DNA polymerase described by Tabor et al., U.S. Patent Nos. 4,942,130 and 4,946,786, other available DNA polymerases, reverse transcriptase, and other enzymes, including heat-stable enzymes, which will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each nucleic acid strand.
Heat-stable DNA polymerases are particularly preferred as they are stable in a most preferred embodiment in which PCR is conducted in a single solution in which the temperature is cycled.
Representative heat-stable polymerases are the DNA polymerases isolated from Bacillus stearothermophilus (Bio-Rad), Thermus thermophilus (FINZYME, ATCC #27634), Thermus species (ATCC #31674), Thermus aquaticus strain TV 11518 (ATCC 25105), Sulfolobus acidocaldarius, described by Bukhrashuili et al., Biochem. Biophys. Acta. 1008:102-7 (1989) and by Elie et al., Biochem. Biophys. Acta. 951:261-7 (1988), and Thermus filiformis (ATCC #43280). Particularly preferred is Taq DNA polymerase available from a variety of sources including Perkin-Elmer-Cetus (Norwalk, CT), Promega (Madison, WI) and Stratagene (La Jolla, CA), and AmpliTaq DNA polymerase, a recombinant Thermus aquaticus Taq DNA polymerase available from Perkin-Elmer-Cetus and described in U.S. Patent No. 4,889,818.
Generally, the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths. There may be inducing agents, however, which initiate synthesis at the 5' end and proceed in the above direction, using the same process as described above.
The primer extension reaction product is then subjected to a second primer extension reaction by treating it with a second polynucleotide primer having a preselected nucleotide sequence. The second primer is capable of initiating the second reaction by hybridizing to a nucleotide sequence, preferably at least about 8 nucleotides in length and more preferably at least about 20 nucleotides in length, found in the first product. This is accomplished by mixing the second primer, preferably a predetermined amount thereof, with the first reaction product, preferably a predetermined amount thereof, to form a second primer extension reaction admixture. The admixture is maintained under polynucleotide synthesizing conditions for a time period, which is typically predetermined, sufficient for the formation of a second primer extension reaction product.
In preferred strategies, the first and second primer extension reactions are the first and second primer extension reactions in a polymerase chain reaction (PCR) . PCR is carried out by simultaneously cycling, i.e., performing in one admixture, the above described first and second primer extension reactions, each cycle comprising polynucleotide synthesis followed by separation of the double-stranded polynucleotides formed.
PCR is preferably performed using a distinguishable variation of the standard protocol as described in U.S. Patent No. 4,683,192, No. 4,683,202, No. 4,800,159 and No. 4,965,188 to Mullis et al., and No. 4,889,818 to Gelfand et al., and in the Innis & Gelfand reference described above, employing only one primer. The principles of the PCR process have been described under "Background of the Invention," supra. Typically, the DNA polymerase used in CP-PCR is the thermostable DNA polymerase purified from Thermus aquaticus and known as Taq I. However, other heat-stable DNA polymerases can be used.
A PCR thermocycle is the changing of a PCR admixture from a first temperature to another temperature and then back to the first temperature. That is, it is cycling the temperature of the PCR admixture within (up and down through) a range of temperatures. Typically, the change in temperature is not linear with time, but contains periods of slow or no temperature change and periods of rapid temperature change, the former corresponding to, depending on the temperature, a hybridization (annealing), primer extension or denaturation phase, and the latter to temperature transition phases. Thus, PCR amplification is performed by repeatedly subjecting the PCR admixture to a PCR temperature gradient where the gradient includes temperatures where the hybridization, primer extension and denaturation reactions occur. Preferred PCR temperature gradients are from about 35°C to about 94°C, from about 40°C to about 94°C, and from about 50°C to about 94°C.
In preferred embodiments at least about 10, preferably about 10 to 40, cycles of PCR are performed under high stringency conditions using a consensus primer of this invention. With Taq I polymerase, these cycles are generally performed so as to have the following phases: 94°C for 30 seconds for denaturation, the high stringency annealing temperature for 30 seconds, and 72°C for 2 minutes for extension. (See Example 2, infra.) Alternatively, other thermostable DNA polymerases can be used, in which case the denaturation, high stringency annealing, and extension temperatures are adjusted according to the thermostability of the particular DNA polymerase. The high stringency annealing temperature is about the melting temperature of the double-stranded DNA formed by annealing, about 35°C to about 65°C, generally greater than about 55°C, and preferably about 60°C. In a related embodiment, at least one initial cycle of PCR is conducted under low stringency annealing conditions as a first step, followed by a second step which comprises the thermocycles described above under high stringency conditions. Preferably, the annealing temperature in the second step is greater than the annealing temperature in the first step. The annealing temperature in the second step is lower for shorter primers, because the melting temperature of short double-stranded helices is decreased. Conversely, it is higher for longer primers.
In this two step embodiment, at least one initial cycle of PCR is performed, starting with the at least one consensus primer and the genomic nucleic acids to be amplified. Using Taq I polymerase, the initial cycle(s) of PCR are performed under "low stringency annealing conditions". The term "stringency" refers to the degree of mismatch tolerated during hybridization of the primer and template; the higher the stringency, the less mismatch is tolerated.
Preferably, one to five cycles of amplification are performed under these conditions. These cycles are generally performed so as to have the following phases: 94°C for 5 minutes to denature, 5 minutes at the low stringency annealing temperature, and 72°C for 5 minutes for extension. More preferably, one to four or one to three low stringency amplification cycles are performed. Most preferred are one to two low stringency amplification cycles. The low stringency annealing temperature can be from about 30°C to about 55°C, preferably from about 35°C to about 55°C, and more preferably from about 40°C to about 48°C. If mixtures of primers that have considerable sequence homology with the target genome are used, higher temperatures for annealing in the initial cycle(s) can be tolerated, presumably because some of the sequences in the mixtures inevitably anneal quite well to any complex genome and efficiently generate amplification products. The reaction is performed in a buffer optimized for activity of the particular thermostable DNA polymerase employed. A number of thermostable DNA polymerases have been isolated. See U.S. Patent No. 4,889,818 describing the thermostable DNA polymerase of Thermus aquaticus. In addition, the thermostable DNA polymerases present in any of the thermophilic bacteria are well known and are described in U.S. Patent No. 4,889,818.
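The two step thermocycling described above can be written out as an explicit schedule. The following sketch is illustrative only: the Phase record, the function name two_step_program and the default cycle numbers and annealing temperatures are assumptions chosen from within the stated ranges, and transition times between temperatures are ignored. The sketch enforces the stated preference that the second step annealing temperature exceed that of the first step.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    temp_c: float
    seconds: int

def two_step_program(low_cycles=2, high_cycles=30,
                     low_anneal_c=40.0, high_anneal_c=60.0):
    """Build a flat list of phases: low stringency cycles, then high stringency.

    Phase durations and the 94 C / 72 C temperatures follow the Taq figures
    given in the text; the defaults for the other arguments are assumptions.
    """
    if not 1 <= low_cycles <= 5:
        raise ValueError("one to five low stringency cycles are preferred")
    if high_cycles < 10:
        raise ValueError("at least about 10 high stringency cycles are preferred")
    if high_anneal_c <= low_anneal_c:
        raise ValueError("second step annealing should exceed the first step")
    low = [Phase("denature", 94.0, 300),
           Phase("anneal", low_anneal_c, 300),
           Phase("extend", 72.0, 300)]
    high = [Phase("denature", 94.0, 30),
            Phase("anneal", high_anneal_c, 30),
            Phase("extend", 72.0, 120)]
    return low * low_cycles + high * high_cycles

program = two_step_program()
total_minutes = sum(phase.seconds for phase in program) / 60
print(len(program), "phases,", total_minutes, "minutes excluding ramp times")
```

A schedule for a different thermostable polymerase would use denaturation and extension temperatures adjusted to that enzyme, as noted above.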
The particular CP-PCR conditions employed will depend upon the particular thermostable DNA polymerase used for the amplification reaction and are typically optimized for that particular thermostable DNA polymerase. Effective amounts of the primer(s) and target nucleic acid are admixed in an aqueous PCR buffer that includes an effective amount of an inducing agent and an effective amount of each dNTP. For Taq DNA polymerase, the buffer typically contains an effective amount of Taq, 50 mM KCl, 10 mM Tris-HCl, pH 8.4, 4 mM MgCl2, and 100 μg/ml gelatin. Each deoxyribonucleoside triphosphate (i.e., A, T, G and C) is typically present at about 0.2 mM concentration when Taq DNA polymerase is used.
The extent to which any particular sequence can be amplified by CP-PCR depends on three general factors: (1) the frequency of priming at flanking sites; (2) the ability of the DNA polymerase used, typically Taq polymerase from Thermus aquaticus, to extend the template completely; and (3) the total number of productive cycles.
E. Comparison of the DNA Amplification Products With Those Produced From Known Genomes
If the object of the performance of the CP-PCR method is to identify the genome from which the discrete products were produced, the DNA amplification products (fingerprints) obtained from a sample are compared with the amplification products resulting from the performance of CP-PCR on nucleic acids isolated from known genera, species, subspecies and/or strains using the same primer or mixture of primers, in separate reactions.
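The final concentrations stated above translate into pipetting volumes by simple dilution arithmetic (stock volume = reaction volume x final concentration / stock concentration). The sketch below is illustrative: the stock concentrations and the 50 microliter reaction volume are hypothetical values chosen for the example and are not taken from the disclosure; only the final concentrations are those given above for Taq DNA polymerase.

```python
# Final concentrations from the text (Taq DNA polymerase buffer).
FINAL = {
    "KCl_mM": 50.0,
    "Tris_HCl_pH8.4_mM": 10.0,
    "MgCl2_mM": 4.0,
    "gelatin_ug_per_ml": 100.0,
    "each_dNTP_mM": 0.2,
}

# Hypothetical stock concentrations in the same units (assumptions).
STOCK = {
    "KCl_mM": 1000.0,
    "Tris_HCl_pH8.4_mM": 1000.0,
    "MgCl2_mM": 1000.0,
    "gelatin_ug_per_ml": 10000.0,
    "each_dNTP_mM": 10.0,
}

def stock_volumes(reaction_ul, final=FINAL, stock=STOCK):
    """Volume of each stock (microliters) needed for one reaction."""
    volumes = {}
    for name, target in final.items():
        if target > stock[name]:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = reaction_ul * target / stock[name]
    return volumes

for name, volume in stock_volumes(50.0).items():
    print(f"{name}: {volume:.2f} ul")
```

The dNTP entry is per nucleotide, so four such volumes are needed when the four dNTPs are kept as separate stocks.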
The samples selected for comparison depend on the expected identification of the test (isolate) organism of unknown genome. In many clinical situations, identification of an organism of an unknown bacterial genome can be narrowed down by means of the site of infection or other clinical factors. For example, the presence of a wound infection may suggest that the test organism is a member of the genus Staphylococcus. If the unknown organism might be Staphylococcus. various species of Staphylococcus could be screened simultaneously as in a panel of preselected DNA samples, such as S. haemolvticus. S. hominis . S. aureus, S. warneri and S. cohnii ; or multiple strains for each species could be used. Similarly, if the unknown organism might be a strain of Streptococcus . the samples selected for comparison are various predetermined and identified strains of Streptococcus. If the unknown organism is a bacterium of enteric origin, various strains of Escherichia. Klebsiella. Enterobacter. Serratia. Salmonella. Shigella. Proteus and Providencia are used. Additional bacterial genera of clinical relevance could also be included in the panel such as a Clostridium or Pseudomonas.
Because the most substantial differences in the CP-PCR amplification products from different bacterial isolates represent differences between species, CP-PCR can be used effectively to reveal a prior misassignment of a strain. Strains that have been assigned to the wrong species are very rapidly uncovered by the CP-PCR method. Typically, when
CP-PCR is used to verify the assignment of a bacterial isolate to a species, the primer is chosen to maximize interspecific difference of the discrete DNA amplification products generated by CP-PCR. Primers for this application typically exclude regions substantially complementary to regions of DNA highly conserved between the species being studied.
The comparison between the CP-PCR products of the organism of unknown genome and those produced from known genomes is typically performed by separating the discrete DNA amplification products in an apparatus containing a medium capable of separating DNA fragments by size in order to produce a "fingerprint" of the amplification products as separated bands, and then comparing the fingerprint patterns. The fingerprint patterns are diagnostic of the genus, species, and/or strain to which the test organism of unknown genome belongs. Generally, such separation is carried out by electrophoresis, for example, using gel electrophoresis on agarose or polyacrylamide gels to display the resulting DNA products for visual examination. Many protocols for electrophoresis are known in the art; see U.S. Patent No. 4,729,947 and B. Perbal, "A Practical Guide to Molecular Cloning," Ch. 9, "Separation of DNA Fragments by Electrophoresis," pp. 340-362, (2d ed., John Wiley & Sons, New York (1988)), incorporated herein by this reference. Thus, the production of a fingerprint for comparison typically comprises the steps of (1) applying the set of discrete DNA segments produced in a PCR reaction into a channel of the medium in the separating apparatus and (2) separating the discrete DNA segments according to size (size-separating) into bands within the channel to form a fingerprint of the DNA segments characteristic of the genome. One such representative technique is electrophoresis through 5% polyacrylamide containing 50% urea. The concentration of acrylamide is varied according to the size of the products to be resolved. Commercially available size markers, typically derived from the digestion of a plasmid or phage of known sequence with a restriction enzyme, are added to the gel.
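The comparison of an unknown fingerprint against a panel of reference fingerprints can be illustrated with a minimal sketch. It assumes that each fingerprint has already been reduced to a list of estimated band sizes in base pairs, that two bands are treated as the same if their sizes differ by no more than a chosen tolerance, and that similarity is scored as the fraction of shared bands. The function names, the tolerance, the greedy matching and the band sizes are all hypothetical and are not taken from the specification.

```python
def shared_band_fraction(unknown, reference, tolerance_bp=5):
    """Fraction of bands shared by two fingerprints (shared / union).

    Bands are paired greedily: each unknown band takes the closest
    still-unpaired reference band within tolerance_bp, if one exists.
    """
    unpaired = sorted(reference)
    shared = 0
    for size in sorted(unknown):
        candidates = [r for r in unpaired if abs(r - size) <= tolerance_bp]
        if candidates:
            closest = min(candidates, key=lambda r: abs(r - size))
            unpaired.remove(closest)
            shared += 1
    union = len(unknown) + len(reference) - shared
    return shared / union if union else 0.0


def best_match(unknown, panel, tolerance_bp=5):
    """Return the best-scoring panel entry and all scores."""
    scores = {name: shared_band_fraction(unknown, bands, tolerance_bp)
              for name, bands in panel.items()}
    return max(scores, key=scores.get), scores


# Hypothetical band sizes (bp) for two reference genomes and one isolate.
panel = {
    "reference_A": [120, 250, 400, 610],
    "reference_B": [130, 300, 455],
}
isolate = [122, 251, 398, 612]
name, scores = best_match(isolate, panel)
print(name, scores)
```

In practice the tolerance would depend on the resolution of the gel and on how band sizes are estimated from the size markers.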
The individual bands present in the fingerprint are detected by various techniques, such as ethidium bromide staining. At least one of the deoxyribonucleotide triphosphate monomers used in the second stage of the reaction can be radioactive, allowing detection of the bands of the fingerprint by autoradiography, or the primer itself can be radioactively labeled by treatment with an appropriate kinase. Alternatively, fluorescent nucleotides can be incorporated and detection carried out by means of fluorescence.
F. Further Manipulation of Fragments Produced
Isolated separated fragments can be cleaved with a restriction endonuclease capable of generating polymorphisms, such as TaqI or MspI. Separated fragments produced by CP-PCR and resolved on gels can also be isolated from the gel and reamplified in a conventional PCR procedure to increase the quantity of the isolated band. Isolated fragments can, if desired, be cloned in a bacterial host, typically a strain of Escherichia coli, capable of preserving the integrity of any genetically unstable DNA structures such as long, direct and inverted repeats. Such cloned bands then can be sequenced by well-known, conventional techniques, such as the Sanger dideoxynucleotide sequencing technique or the Maxam-Gilbert chemical cleavage sequencing technique. For many procedures, such as the preparation of DNA probes, it is not necessary either to clone or recut the DNA fragments amplified by CP-PCR and isolated from the gel. Such fragments can be used as probes after further amplification by conventional PCR during which radioactive nucleotides are incorporated in the amplified fragments.
II. APPLICATION TO IDENTIFICATION OF STAPHYLOCOCCUS SPECIES
One significant application of the general method of the present invention is the identification of the species to which an isolate of Staphylococcus belongs. Staphylococcus is a human pathogen and frequently responsible for serious infections occurring in surgical patients. Accordingly, rapid identification of Staphylococcus species is particularly important in a clinical setting. In the identification of Staphylococcus species by CP-PCR, the discrete DNA amplification products produced from the sample of DNA from the test organism are compared with the DNA amplification products produced from known Staphylococcus species when the same primer is used. We have found between three and twenty products predominate in the CP-PCR products obtained from Staphylococcus genomes. These products are species-specific and can be used to distinguish between S. haemolyticus, S. hominis, S. aureus, S. warneri and S. cohnii. In some cases, subspecies and/or strains of these species are also distinguished. (See Example 2, infra.)
III. APPLICATION TO IDENTIFICATION OF STREPTOCOCCUS STRAINS
In a similar manner, CP-PCR can be used to identify particular strains of Streptococcus. In the identification of Streptococcus strains by CP-PCR, the discrete DNA amplification products produced from the test organism of unknown strain are compared with the DNA amplification products produced from DNA of known Streptococcus strains when the same primer is used. Streptococcus is also an important human pathogen, causing potentially severe infections of the skin and mucous membranes, and its rapid identification is clinically important.
As shown below in Example 2, CP-PCR performed on a number of strains of Streptococcus reveals a fingerprint of amplified bands with some species-specific features, as well as some isolate-specific differences. One can clearly group almost all members of a species based on common bands and group subsets of strains within species based on shared bands that are not present in other strains.
IV. APPLICATION OF CP-PCR TO GENETICS OF EUKARYOTES
The DNA sequences that represent polymorphisms differing from individual to individual of a species obtained from application of the CP-PCR method of the invention are useful in genetic mapping of eukaryotes, including plants such as maize and soybeans, animals, and humans. In particular, CP-PCR can be used to reveal polymorphisms based on the CP-PCR fingerprint. Such polymorphisms are particularly useful for genetic mapping. The polymorphisms generated can be correlated with other markers such as restriction fragment length polymorphisms (RFLPs), which in turn have been linked to genetic markers of known function. An RFLP is a detectable difference in the cleavage pattern of DNA from different individuals of a particular species when that DNA is cleaved with a particular restriction endonuclease. Such differences arise when a mutation affects the sequence cut by the enzyme, removing a site previously present or adding a new site.
CP-PCR can be used to track genetic differences in rice, with a 600-megabase haploid genome (Example 2) and in maize, with a 3000-megabase haploid genome (Example 2) . Maize has a genomic complexity comparable to that of the human genome. Similar results are expected with soybeans.
The heterozygosity of the maize genome has been estimated to be about 0.05. Each primer used in the CP-PCR method can probably detect more than one polymorphism between strains at that level of heterozygosity.
Such approaches should allow determination of the linkage distance between polymorphisms and various phenotypes. Phenotypes can be scored in a number of ways, including morphological features and molecular features, such as electrophoretic mobility of proteins and variations in intensity of proteins on two-dimensional gels (Higginbotham et al., "The Genetic Characterization of Inbred Lines of Maize (Zea mays L.) Using Two Dimensional Protein Profiles," Symposium, 1990). It is interesting to note that when protein abundance or state of modification is followed as a phenotype, linkage is to the genetic element that causes that variation and often not to the protein being observed. Such a genetic element can be a regulator or other control element, or a gene for a modifying enzyme. It is possible, however, to link many protein electrophoretic mobility variants to the CP-PCR map. A polymorphism can be correlated with a phenotypic character through repeated backcrossing. This introgression method simplifies the background. Comparing the backcrosses with the parents detects polymorphisms linked to the gene of interest. Another application of the CP-PCR method is in creating a physical CP-PCR map by correlating the recombination frequencies between CP-PCR fragments. By choosing the crosses used in the development of the physical map judiciously, the CP-PCR map will automatically orient itself with respect to the genetic map. Such physical linkage can be studied by pulsed field electrophoresis (PFE). By applying restriction endonucleases making rare cuts, PFE, and Southern blotting to maize or soybean DNA and probing with genetically linked CP-PCR probes, the size of the physical region for large fragments of chromosomes isolated by PFE can be compared with the rate of recombination. Analogous techniques can be employed for mapping the mouse or human genome. This is of interest because recombination is not equal throughout the genome. The CP-PCR method is particularly suitable for this purpose because a great many markers can, in principle, be identified for an area of interest. The number of individual progeny from crosses that can be inspected and the amount of polymorphism in each marker determine the accuracy with which markers can be mapped.
The segregation of polymorphisms revealed by the CP-PCR method in the context of the RFLPs that are already mapped improves the ability to measure genetic distance between them. Computer programs are available for genetic linkage analysis, including LIPED (Ott, Amer. J. Human Genet., 28:528-529 (1976)) for two point linkage analysis, and ILINK and CILINK from the LINKAGE package (Lathrop et al., Proc. Natl. Acad. Sci. USA, 81:3443-3446 (1984); Lathrop et al., Amer. J. Human Genet., 37:482-498 (1985)), GMS (Lathrop et al., Genomics, 2:157-164 (1988)), and MAPMAKER (Lander et al., Genomics, 1:174-181 (1987)) for multipoint analysis. Additionally, quantitation of the bands allows distinction between homozygotes and heterozygotes for a particular band in the CP-PCR fingerprint.
The use of such linkage analysis techniques allows determination of linkage distance between the polymorphisms and various phenotypes. RFLPs that have been linked to interesting genetic markers can be correlated with the CP-PCR map. For example, tightly linked flanking RFLP markers have been found for the Mdm1 gene on chromosome 6S in maize. This gene is involved in resistance to Maize Dwarf Mosaic Virus (MDMV) (McMullen & Louie, Mol. Plant-Microbe Interactions 2: 309 (1989)). Similarly, an RFLP marker less than 1 centiMorgan (cM) from the Htll gene, which confers resistance to the fungal pathogen Helminthosporium turcicum, has been found (Bentolila et al., Symposium, 1990).
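The two point linkage calculation referred to above can be sketched as a LOD score computation. This is a minimal illustration that assumes a phase-known backcross in which each progeny can be scored as recombinant or non-recombinant between a CP-PCR band and a second marker; it is not the algorithm of LIPED, LINKAGE or MAPMAKER, and the function names and progeny counts are hypothetical.

```python
import math


def lod_score(recombinants, total, theta):
    """log10 likelihood ratio of linkage at theta versus free
    recombination (theta = 0.5), for phase-known progeny counts."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = total - recombinants
    if theta == 0.0:
        if recombinants > 0:
            return float("-inf")  # any recombinant rules out theta = 0
        log_linked = 0.0
    else:
        log_linked = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = total * math.log10(0.5)
    return log_linked - log_unlinked


def estimate_theta(recombinants, total):
    """Maximum-likelihood recombination fraction, capped at 0.5."""
    return min(recombinants / total, 0.5)


# Hypothetical counts: 2 recombinants among 20 scored progeny.
theta_hat = estimate_theta(2, 20)
print(theta_hat, round(lod_score(2, 20, theta_hat), 2))
```

A LOD score of about 3 or more is conventionally taken as evidence of linkage; real pedigree data with unknown phase or missing genotypes require the dedicated programs cited above.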
Another approach to mapping makes use of the fact that RFLPs themselves can be generated from CP-PCR fingerprints. For instance, TaqI restriction endonuclease, which recognizes the site TCGA, will cleave CP-PCR products in which there is at least one TaqI site. If a TaqI site is present in one of the CP-PCR fingerprint products in some individuals but not in others, there will be a difference in the fingerprint of TaqI digested DNA from these individuals. This allows the detection of TaqI RFLPs from CP-PCR patterns. Such TaqI RFLPs are among the most common RFLPs known in the genome because the TaqI recognition site contains the hypermutable dinucleotide CpG. Similarly, MspI digests, cut at the recognition site of CCGG, can be used to detect the relatively abundant MspI polymorphisms. Such RFLPs can be either mapped directly in families by genetic mapping or cut out of gels and amplified with radioactively labeled deoxyribonucleoside triphosphates, such as α-labeled triphosphates, in conventional PCR to use them to probe Southern blots of the appropriately cleaved human DNAs. To ensure purity, the extracted fragments can be recut with the same enzyme following extraction. Alternatively, the bands isolated from CP-PCR fingerprints can be cloned and sequenced. Preferably, such bands should be cloned in Sure E. coli (Stratagene Cloning Systems, San Diego, California) to preserve the integrity of terminal repeats.
These techniques can also be employed to analyze animal genomes, including the genomes of mice, as well as the human genome. They are particularly useful for filling in the genetic map by linking known markers more precisely.
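The way a TaqI or MspI site polymorphism changes a digested fingerprint can be illustrated with a short sketch. It assumes the sequence of a CP-PCR product is known, uses the recognition sites TCGA (TaqI, cutting T^CGA) and CCGG (MspI, cutting C^CGG), and reports fragment lengths on the top strand only. The function name and the two short allele sequences are hypothetical.

```python
# Recognition site and cut position within the site (top strand).
ENZYMES = {
    "TaqI": ("TCGA", 1),  # T^CGA
    "MspI": ("CCGG", 1),  # C^CGG
}


def digest(sequence, enzyme):
    """Return the fragment lengths produced by complete digestion."""
    site, cut_offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Hypothetical alleles of one band: a G-to-A change destroys the TaqI site.
allele_with_site = "AAAATCGATTTTTT"
allele_without_site = "AAAATCAATTTTTT"
print(digest(allele_with_site, "TaqI"))     # two fragments
print(digest(allele_without_site, "TaqI"))  # band left uncut
```

An individual carrying the site yields two shorter bands after digestion, while an individual lacking it retains the full-length band, which is the difference scored as an RFLP.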
The CP-PCR method of the invention permits genetic mapping of DNA polymorphisms in mammals without having to first identify RFLP probes. Each polymorphic band in the fingerprint produced by the method represents a heritable characteristic. No clones must be made or plasmids purified. Polymorphisms can be generated by almost any primer selected. The technique requires less than 1/100 of the amount of genomic DNA per lane compared to that needed to prepare a Southern blot for conventional RFLP analysis. The method can use ethidium detection, fluorescent detection or only small amounts of labeled bases relative to Southern hybridization. Moreover, CP-PCR generated DNA polymorphisms can be isolated directly from gels and reamplified to use as probes in "genome walking" or restriction mapping strategies without cloning. Sequencing of some of these polymorphisms will also not require cloning. One approach for using the CP-PCR method in human genetics can produce products assignable to the human fragment in a somatic cell hybrid. As long as the recipient is the same for a set of hybrids, the products that will be different from a non-hybrid control CP-PCR will be the human fragments. Such bands would assign the human fragment on the genetic map if the band was already genetically assigned. Also, such bands can be isolated from the gel and used to make a DNA probe.
V. COMPOSITIONS AND KITS
Many of the reagents described herein (e.g., nucleic acids such as primers, vectors, and the like) have a number of forms, particularly variably protonated forms, in equilibrium with each other. As the skilled practitioner will understand, representation herein of one form of a compound or reagent is intended to include all forms thereof that are in equilibrium with each other. The reagents described herein can be packaged in kit form. As used herein, the term "package" refers to a solid matrix or material customarily utilized in such a kit system in the form of an enclosure that is capable of holding within fixed limits one or more of the reagent components for use in a method of the present invention. Such materials include glass and plastic (e.g., polyethylene, polypropylene and polycarbonate) bottles, vials, paper, plastic and plastic-foil laminated envelopes and the like. Thus, for example, a package can be a glass vial used to contain the appropriate quantities of polynucleotide primer(s), genomic DNA, vectors, restriction enzyme(s), DNA polymerase, DNA ligase, or a combination thereof. An aliquot of each component sufficient to perform at least one PCR thermocycle will be provided in each container.
Kits useful for producing a primer extension product for amplification of a specific nucleic acid sequence using a primer extension reaction methodology also typically include, in separate containers within the kit, dNTPs where N is adenine, thymine, guanine and cytosine, and other like agents for performing primer extension reactions.
The reagent species of any system described herein can be provided in solution, as a liquid dispersion or as a substantially dry powder, e.g., the primers may be provided in lyophilized form.
In one embodiment, the present invention contemplates a kit for typing an isolate of organism comprising an enclosure containing, in separate containers, at least one consensus primer, preferably a structural RNA consensus primer, and at least one genomic DNA sample for use as a control in a typing method of this invention. In preferred embodiments, a panel of genomic DNA samples derived from predetermined species are included, as described herein. The consensus primers and the genomic DNA for use in a kit for typing an isolate of organism are comprised as previously described. The kit can further contain, in one or more separate enclosures, one or more panels of genomic DNA representative of groups of species, combined in a manner to allow comparison of subspecies within a species, of species within a genus, of genera within families, and the like, for determining the location of an isolate organism on the evolutionary scale.
In order that the invention described herein may be more fully understood, the following examples are set forth. It should be understood that the following examples are for illustrative purposes only and are not to be construed as limiting the invention.
EXAMPLES
1. Isolation of DNA for CP-PCR
Strains of Staphylococcus listed in Table 2 were grown overnight at 37°C in 2-5 ml of brain heart infusion media. The cells were pelleted, resuspended in 0.2 ml of TE (0.01 M Tris-HCl, pH 8.0, 1 mM EDTA) with 0.2 mg/ml lysostaphin and incubated at 37°C for one hour. Following this incubation, 0.2 ml proteinase K solution (containing 0.5 mg/ml proteinase K, 1% Sarkosyl, 200 mM EDTA, and 1 mM calcium chloride) was added to each sample. The samples were then digested at 50°C for one hour. The clear lysates were extracted with phenol and then chloroform; the DNA was then precipitated with ethanol. The precipitated DNA was dissolved in TE, and its final concentration was estimated by agarose gel electrophoresis and ethidium bromide staining.
TABLE 2
Strains and Species Analyzed by CP-PCR
Species Primer Figure Lane
Staphylococcus
S. haemolyticus 29970 T5A+T3A
S. haemolyticus CC 12J2 T5A+T3A
" T5A
" T3A
" T5A+T3A
S. haemolyticus PAY 9F2 T5A+T3A
S. haemolyticus AW 263 T5A+T3A
S. haemolyticus MID 563 T5A+T3A
S. hominis 27844 T5A
" T3A
" T5A+T3A
" T5A+T3A
S. hominis 27846 T5A+T3A
S. warneri CPB10E2 T5A+T3A
" T5A
" T3A
" T5A+T3A
S. warneri GAD473 T5A+T3A
S. warneri MCY3E6 T5A+T3A
S. warneri PBNZP4D3 T5A+T3A
S. aureus ISP8 T5A+T3A
" T5A
" T3A
" T5A+T3A
S. pyogenes (A) D471Rot T5A+T3A
S. pyogenes (G) 1/E9 T5A+T3A
S. pyogenes (G) 040/011 T5A+T3A
S. mutans T8 T5A+T3A
S. pyogenes (B) 50316 T5A+T3A
S. pyogenes (A) UAB 092 T5A+T3A
Enterococcus
E. faecalis, OG1X
E. faecalis, JH2-2
Maize (Zea mays) B73
"
Maize Mo17
Human (Homo sapiens) 584
"
Human 694
Rice (Oryza sativa) G1
Rice G2
All Staphylococcus strains shown in Table 2 were kindly provided by W.E. Kloos of North Carolina State University except ISP-8 from Peter Pattee (Iowa State University, Ames, IA) and those from the American Type Culture Collection designated by the four or five digit numerals. Other abbreviations are arbitrary designations for laboratory strains.
DNAs from the human pathogenic strains of Streptococcus pyogenes, S. mutans and Enterococcus faecalis were all kindly supplied by Susan Hollingshead (Univ. of Alabama, Birmingham, AL).
Total genomic DNA from maize and rice strains was kindly provided by Rhonda Honeycutt (Iowa State U., Ames, IA). Human DNAs from normal intestines were kindly provided by Manuel Perucho (CIBR, CA).
Genomic DNA was isolated from the other species shown in Table 2 by the same detergent lysis and phenol extraction protocol described above.
2. Performance of CP-PCR Amplification
Primers described in Table 1 were chemically synthesized and obtained from Genosys (Houston, TX).
PCR reaction admixtures were prepared in a volume of 50 μL containing 1X Taq polymerase buffer (Stratagene Cloning Systems, San Diego) adjusted to 4 mM with MgCl2, 0.2 mM of each deoxyribonucleotide triphosphate, 1.25 units Taq polymerase, 1 μM consensus primer (or primers), 50 μCi alpha-[32P] dCTP, and template DNA at various quantities from 100 ng to 3.2 ng as indicated. The reactions were overlaid with oil and cycled forty times through the following temperature profile: 94°C for 30 seconds for denaturation, 50°C for 30 seconds for annealing of primer, and 72°C for two minutes for extension. The result of this set of PCR cycles was the formation of a discrete set of amplified DNA segments (primer extension products). The resulting products were resolved by electrophoresis in 1X TBE through 5% acrylamide-50% urea and visualized by autoradiography using Kodak X-Omat™ AR film with an intensifying screen at -70°C for 6 hours.
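The cycling profile and reagent concentrations stated above lend themselves to a small worked calculation. The following sketch encodes the 94°C/50°C/72°C profile as data and derives the total programmed hold time and the absolute amounts of primer and dNTP in a 50 μL reaction. The class and function names are hypothetical, and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Profile as stated in the text: 40 cycles of 94/50/72 degrees C.
PROGRAM = [
    Step("denature", 94, 30),
    Step("anneal", 50, 30),
    Step("extend", 72, 120),
]
CYCLES = 40


def hold_time_minutes(steps, cycles):
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(step.seconds for step in steps) / 60


def amount_pmol(concentration_um, volume_ul):
    """Micromolar times microlitres gives picomoles."""
    return concentration_um * volume_ul


print(hold_time_minutes(PROGRAM, CYCLES))  # minutes of hold time
print(amount_pmol(1.0, 50))                # pmol primer at 1 uM in 50 uL
print(amount_pmol(200.0, 50) / 1000)       # nmol of each dNTP at 0.2 mM
```

Under these assumptions the forty cycles amount to two hours of programmed hold time, with 50 pmol of primer and 10 nmol of each dNTP per reaction.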
The results, shown in Figure 1, indicate that reproducible fingerprints can be obtained over a 25-fold range of template concentration at 50°C. Other experiments, not shown, indicated that the fingerprint did not vary when the low stringency annealing step was varied between 45°C and 50°C, which is suitable for partly mismatched primers. Suitable temperatures probably range from 40°C to 55°C. There were a number of products generated for each genome in Figure 1 whether T5A and T3A were used alone or together, indicating PCR initiated at a variety of places in the genome, as expected. With the exception of S. cohnii, which was already known to be the most divergent species within the genus, the CP-PCR patterns were very similar between the species, indicating that the tRNA gene clusters probably evolve relatively slowly. This is in contrast to arbitrarily primed (AP)-PCR, Welsh et al., Nucleic Acids Res., 18: 7213-7218 (1990), or total genome restriction digestion, Cinco et al., FEMS Microbiol. Immunol., 47: 511-514 (1989), that give very different patterns when different species are compared.
A survey was performed on forty strains of bacteria, representing many strains from five species of Staphylococcus, four species of Streptococcus and a species of Enterococcus. The organization of the tRNA genes in these species has not been described, but they are presumably similar to those of other related bacteria, such as Bacillus. Figure 2 shows that within a species there was generally no variation in the CP-PCR pattern. There were only two exceptions. A Streptococcus pyogenes strain K58Hg that was designated serotype A (lane 8) gave a pattern identical to serotype B (lanes 7, 13 and 14). Interestingly, AP-PCR experiments group this strain with serotype B and not serotype A. It is likely that K58Hg is in fact a serotype B. The other exception was a strain of S. haemolyticus (lane 20) that was obtained from the ATCC. Although not explained, preliminary data indicates that S. haemolyticus consists of at least two groups of strains that are rather divergent and may in fact be different species.
CP-PCR should work for a wide variety of species because tRNA genes are highly conserved, are abundant, and are generally arranged in clusters. Figure 3 shows CP-PCR reactions on the genomes of species from three kingdoms. In addition to the bacterial genomes, there were fingerprints generated for the maize, rice and human genomes with at least one of the two pairs of primers tested. For example, the rice fingerprints are identical between strains, but the T5A/T3A pair of primers (lanes 4 and 5) gives a completely different pattern than the T5B/T3A pair (lanes 6 and 7). In the case of eukaryotes, a typical consensus tRNA primer will prime both nuclear and organelle (mitochondrial and chloroplast) tRNA genes. Plant mitochondrial and chloroplast genomes do not seem to have a high rate of point mutation; however, the evolution of animal mitochondrial genomes is fast. In this latter case, it can be expected that the resulting CP-PCR products will vary over a shorter evolutionary time than nuclear products.
Organelle genomes, despite their small size, can contribute up to half of the DNA in the cell, due to their high copy number. Nevertheless, for CP-PCR the best matches with the primer are probably more important than copy number and these best matches will generally be to nuclear genes because the number of different nuclear tRNA gene sequences greatly exceeds the number of different organelle tRNA gene sequences.
The data does not clearly prove that the CP-PCR products shown are, in fact, from tRNA genes nor, in the case of eukaryotes, if they are from the nuclear or organelle genomes. However, the patterns were identical between divergent individuals within each species. Regardless of the origin of the pattern, CP-PCR is a useful method of species classification. The consistency of the patterns between bacterial species is strong evidence that the fingerprints are not generated by arbitrarily primed-PCR. Such AP-PCR fingerprints are not conserved between species.
The results in Figures 1-3 indicate that consensus tRNA gene primers that amplify the region between tRNA genes can be used to generate PCR fingerprints that are generally invariant between strains of the same species and are often substantially conserved between related species. This property makes the method applicable to the identification of organisms by a genome based method that is independent of other criteria such as morphology. The ease with which the method can be performed, independent of the genome size, sequence, concentration of genomic DNA, or the cycling parameters, indicates that it is the method of choice when examining a large number of different strains for which a rapid and convenient method for categorization is desired.
While there are many ways to determine species and genus, CP-PCR using consensus tRNA primers is a simple and fast method that complements those that already exist and has the virtue that the polymorphisms measured are not themselves likely to be selected. They are, potentially, more likely to be near neutral than other characters such as nutrition and perhaps less likely to result from convergent evolution, which is a drawback of classification by morphological criteria. It is also an advantage that the data is collected for regions scattered throughout the genome rather than from sequence differences in a single location that may not reflect the whole genome. A single CP-PCR fingerprint will generally have less information than comparing the DNA sequence of a specific region in each organism. However, CP-PCR is less technically demanding and less time consuming than DNA sequencing. CP-PCR could be a method of choice when a large number of individuals are to be screened or as a first step when identifying species based on genomic sequence. Since data acquisition is trivial in CP-PCR, the number of consensus primers and thus the number of patterns that can be generated is large. Any required number of different fingerprints could be generated to provide the necessary markers for species classification. Furthermore, as demonstrated, primers that produce fingerprints from the genomes of a wide variety of organisms can be devised by the present methods. Thus, organisms one essentially knows nothing about can immediately be examined. In addition, it is possible to develop tRNA consensus primers that are targeted preferably to a particular kingdom or to either the nuclear genome or organelle genomes of eukaryotes.
The method presented represents the simplest available universal way to reliably compare genomes of organisms at the species/genus level. The method has applications in ecology and epidemiology.
3. Detection of Bacterial Species Using Intergenic Length Polymorphisms (ILPs)
In Example 2, consensus tRNA gene primers were used to amplify tRNA intergenic regions (tDNA-PCR). Under low stringency PCR conditions these primers generated products some of which were conserved within genera and some of which varied in length between related species. Using DNA sequencing, this study shows that the variation in tDNA-PCR fingerprints observed between related Staphylococcus species is due to length polymorphisms in the intergenic spacers (intergenic length polymorphisms or ILPs). Primers for high stringency (conventional) PCR that are derived from adjacent tRNA genes are used to amplify the homologous polymorphic intergenic spacer(s) from several species. The products are referred to as tRNA intergenic length polymorphisms (tDNA-ILPs). In almost all cases, the size of the PCR product is constant in the strains investigated within species and varies between species.
This strategy can be applied to any set of closely related species. Thus, many important groups, such as human pathogens, may be detected and categorized by using only a small number of primers.
A. Experimental Procedures
Genomic DNAs. Strains used in this study are listed in Table 3. Genomic DNAs were prepared from late log phase cultures. If necessary, cell walls were treated with lysostaphin, streptolysin or lysozyme followed by incubation at 65°C in 1 mg/ml proteinase K, 100 mM EDTA, 1% SDS for 2 hours. DNA was purified by phenol extraction followed by chloroform extraction and isopropanol precipitation.
cancer patient
WK : W. Kloos, North Carolina State U.
SH : S. Hollingshead, U. of Alabama, Birmingham.
FF : F. Fang, U. California San Diego, Medical Center.
MP : M. Perucho at CIBR, La Jolla.
GW : G. Wilson, New England Biolabs.
Primers. The following primers were purchased from Genosys (Houston, TX):
T5A 5' AGTCCGGTGCTCTAACCAACTGAG 3'
T3B 5' AGGTCGCGGGTTCGAATCC 3'
Seg 5' TTGTAAAACGACGGCCAG 3'
Rev 5' GGAAACAGCTATGACCATGA 3'
StaphiMet3 5' CGCGGGTTCGAATCCGCCTC 3'
StaphAsp5 5' CGTGTTAACCGCTACACTAC 3'
StaphAsp3 5' GGGTTCGAATCCCGTCGAG 3'
StaphPhe5 5' AACCAACTGAGCTACTGAAC 3'
The following primers were purchased from Stratagene Inc. (La Jolla, CA):
T7 5' AATACGACTCACTATAG
T3 5' ATTAACCCTCACTAAAG
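Because the procedures that follow distinguish low stringency annealing from high stringency annealing, it can be helpful to have a rough estimate of primer melting temperature. The sketch below computes length, G+C count, reverse complement and a Wallace-rule estimate (2°C per A or T, 4°C per G or C) for a primer sequence. The Wallace rule is a rule of thumb intended for short oligonucleotides and is used here as an assumption; it is not part of the specification, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(primer):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(primer.upper()))


def gc_count(primer):
    """Number of G and C residues in the primer."""
    primer = primer.upper()
    return primer.count("G") + primer.count("C")


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C:
    2 per A/T plus 4 per G/C (approximate, short oligos only)."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


t5a = "AGTCCGGTGCTCTAACCAACTGAG"  # T5A as listed above
print(len(t5a), gc_count(t5a), wallace_tm(t5a))
print(reverse_complement(t5a))
```

Such an estimate only indicates relative stringency; a consensus primer annealing with mismatches to a given tRNA gene would be expected to melt at a lower temperature than the figure computed for a perfect duplex.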
Polymerase Chain Reaction. CP-PCR with consensus primers for amplifying transfer RNA genes (tDNA-PCR) was performed as follows: Fifty μl reactions contained 1.25 units of Taq polymerase, 1X Taq polymerase buffer (50 mM KCl, 10 mM Tris HCl pH 8.3, 1.5 mM MgCl2), 0.2 mM of each dNTP, 5 μCi alpha-[32P] dCTP, 0.5 μM of each primer and 10 ng of template DNA. The reaction was cycled 40 times through the following temperature profile: 94°C for 30 seconds to denature, 50°C for 30 seconds for annealing of primer and 72°C for 2 minutes for extension using a Perkin Elmer Cetus 9600 thermocycler. The resulting products were resolved by electrophoresis using 5% acrylamide-50% urea in 1X TLE and visualized by autoradiography using Kodak X-Omat™ AR film with an intensifying screen at -70°C for 6 hours. The products of tDNA-PCR could also be visualized on NuSieve agarose or native acrylamide gels by ethidium bromide staining.
Isolation and Amplification of Products. Radioactive ink spots were used to align the autoradiogram with the gel. PCR products were cut out of the gel, placed in 50 μl of TE and the DNA was eluted for 1 hour at 65°C. 1 μl was re-amplified by PCR using the same primers.
High stringency PCR amplification used 1X Taq pol buffer, 0.2 mM each dNTP, 0.5 μM each primer, 30 ng of template, 1.25 units of Taq polymerase in a total volume of 50 μl. Cycling parameters were 94°C, 1 minute; 50°C, 1 minute; 72°C, 2 minutes for 30 cycles. For tDNA-ILPs, the cycling times could be truncated if the expected product is short. In these experiments PCR amplification used 1X Taq buffer, 1 unit of Perfect Match (Stratagene), 0.5 μM each primer, 1 ng of template, 1.25 units Taq polymerase in a total volume of 50 μl. Cycling parameters: 94°C, 30 seconds; ~60°C, 30 seconds; 72°C, 30 seconds for 40 cycles.
Cloning. Gel purified and reamplified PCR products were purified using Gene Clean (Bio101, San Diego, CA). The DNA was cloned using 100 ng of pBSKII+ digested with SmaI and T4 DNA ligase with 0.5 mM ATP and the recommended buffer. Vector and enzymes were from Stratagene. Clones were detected as white colonies on 2X YT plates sprayed with IPTG and X-gal. Colonies were picked, boiled in 100 μl of water, and 10 μl was amplified in a 50 μl PCR reaction using the Universal (Uni) and reverse (Rev) sequencing primers. The products were electrophoresed through 1.5% agarose (1X TBE). Inserts of the correct size were asymmetrically amplified and sequenced.
Sequencing. Five μl of PCR product from the Uni-Rev amplification was asymmetrically reamplified in each direction using each of the two sequencing primers separately. The largely single stranded DNA that resulted from asymmetric amplification was then sequenced with the USB Sequenase kit (Cleveland, OH) using the pBSKII+ T7 primer in the case of asymmetrically Uni amplified material and the T3 primer in the case of asymmetrically Rev amplified material. The sequences were resolved on 5% acrylamide, 50% urea gels followed by fixing, drying and autoradiography.
B. Results
The purpose of the studies in this Example was to determine the nature and extent of intragenic variation in tRNA gene organization within a eubacterial genus and, if possible, utilize this variation to develop a PCR method to classify strains within this genus into the correct species. The staphylococci were chosen because a number of species are pathogenic in humans and they represent a diverse group for which the basic phylogenetic relationships are already known (Kloos and Wolfshohl, 1979; Kloos, 1980; Kloos and Wolfshohl, 1983; Kloos and Schleifer, 1986).
Amplification with Consensus Primers.
Two primers (T5A and T5B) for PCR were derived from a consensus of Bacillus tRNA genes. These primers were located facing outward and about 15 bases from the end of the tRNA gene consensus sequence. PCR products using these primers were expected to occur between the pair of tRNA genes that best matched the primers and were in the correct orientation for PCR.
For instance, the consensus primer facing out from the 5' end of the consensus tRNA resembled most closely tRNAiMet, tRNALys and tRNAIle, whereas the primer designed to face out from the 3' end of the consensus most closely resembled tRNAPhe. Thus, for tRNA genes existing in the correct orientation within a few hundred base pairs of each other, the region between them was expected to be represented in the CP-PCR fingerprint. If these tRNA genes did not exist in the right orientation in a particular species, then the intergenic region between the next best pair of matches would amplify.
As in Example 2, CP-PCR amplification of Staphylococcus DNA using consensus tRNA gene primers gave fingerprints that displayed the distances between the best matches with the primers in the tRNA gene clusters (see Figure 4). CP-PCR was performed on the strains listed in Table 3, including at least seven strains from each of four Staphylococcus species (data not shown). Strains were isolated from a wide variety of hosts ranging from lemurs to birds and originating on a number of continents (mainly isolated by Dr. W. Kloos, North Carolina State U.). These very divergent strains were compared to maximize the possibility of detecting intraspecific differences. Consistent with the results in Example 2, there were rarely any intraspecific differences. However, when closely related species were compared, there were usually several interspecific differences in the CP-PCR pattern that could be used to distinguish between species.
The products in CP-PCR were generated by a low stringency PCR reaction, so it was possible that products of similar size generated from different species were not homologs. However, because the consensus primers presumably picked the best pairs of matches in the genome and because the tRNA gene sequences and their organization were unlikely to differ very much between closely related species, products in a particular portion of the gel were likely to be from homologous regions when closely related species were compared. PCR products of about 160 bp that were polymorphic in length between various Staphylococcus species (but which were generally not polymorphic within a species) were removed from the gel, reamplified with the same primers and cloned into pBSKII+. This size class was selected because it presumably spanned two intergenic regions but was small enough to sequence completely in both strands without further subcloning.
Nine of ten clones sequenced spanned apparently homologous loci, including part of tRNAiMet, all of tRNAAsp, and part of another tRNA, probably tRNAPhe. One clone from S. cohnii contained different tRNAs. Sequences for portions of the homologous PCR products are presented in Figure 2. The sequence comparisons showed that the complete Staphylococcus tRNAAsp and partial tRNAiMet and tRNAPhe gene sequences were similar to those of Bacillus (Vold, 1985) and Mycoplasma (Muto et al., 1990), which are also members of the low G+C subdivision of the gram-positive phylum (Woese, 1987).
Amplification with Specific Primers. High stringency (specific) primers for sequences within the tRNAiMet and tRNAPhe genes were produced for amplifying the polymorphic intergenic region. The first consideration in choosing these primers was that they exactly match the tRNA genes sequenced in the three species of Staphylococcus (Figure 5). The next consideration was to choose primers based on the range of species or genera in which a specific PCR product was desired. The gene order tRNAiMet-tRNAAsp-tRNAPhe found for the 160 bp tDNA-PCR products from three Staphylococcus species is also found in Bacillus (Vold, 1985) and Mycoplasma (Muto et al., 1990), but not in the more distantly related species E. coli (Jinks-Robertson and Nomura, 1987). With more limited data available, the closest equivalent gene order in Streptococcus is tRNAiMet-tRNAPhe. Thus, it would have been possible to select primers that amplify similar portions of Staphylococcus, Bacillus and Mycoplasma DNA but would fail to work or would yield products of a smaller size from Streptococcus DNA. However, primers of about 20 bp were chosen (Figure 5) that had poor homology to the Bacillus or Mycoplasma tRNAiMet, tRNAAsp and tRNAPhe genes so that high stringency PCR was unlikely to yield products from these genera. Various PCR conditions were examined and it was found that quite high stringency conditions (60°C and in the presence of E. coli single stranded binding protein) were most reliable for obtaining the appropriate PCR products in Staphylococcus with the primers we chose. It is likely that PCR conditions would need to be optimized for each new pair of primers, a situation that pertains for high stringency PCR in general. The specific primers were used to amplify products from 36 strains from five Staphylococcus species listed in Table 3. Figures 6A and 6B show a few examples of the data obtained for Staphylococcus.
Figure 6A shows the results of using the tRNAiMet and tRNAAsp specific primers StaphiMet3 and StaphAsp5 on Staphylococcus species in the presence of a 1000-fold excess (by mass) of human genomic DNA. A similar study using the StaphAsp3 and StaphPhe5 primers is shown in Figure 6B. tDNA-ILP products of the expected size (which includes the primers and parts of the tRNAs) are seen in Figure 6A for S. hominis, S. warneri and S. aureus (68, 60, and 63 bp, respectively). In addition, the species S. haemolyticus and S. cohnii, for which sequence data was not collected, generally gave products of about 58 bp and 63 bp, respectively.
In Figure 6A only a single molecular weight product was produced in each species using the StaphiMet3 and StaphAsp5 primers. However, for some pairs of PCR primers one can expect that there will be two products because tRNA genes are often duplicated in the genome and the organization of the tRNA genes in the two clusters can be similar. This phenomenon, when it occurs, is helpful in distinguishing species because the length of both products may vary between species and may do so independently. In Figure 6B there were two sizes of PCR product in some species, perhaps indicating that there are two occurrences of adjacent tRNAAsp and tRNAPhe genes in the genome. One product should correspond to the gene that was sequenced and indeed the upper band in the experiments with S. hominis, S. warneri and S. aureus gave the expected size of products of about 66, 67, and 73 bp, respectively. An additional product of lower molecular weight was seen for S. hominis and S. aureus. Another species, S. haemolyticus, gave products of about 62 and 58 bp.
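The approximate product sizes reported above for the tRNAiMet-tRNAAsp and tRNAAsp-tRNAPhe spacers can be arranged as a lookup table against which an unknown strain is compared. The sketch below is illustrative only and is not the procedure used in these experiments. The function name `assign_species`, the 1 bp tolerance and the requirement that at least two loci agree are assumptions introduced for the example; for the tRNAAsp-tRNAPhe locus only the upper band is recorded, and no size is entered for S. cohnii at that locus because none is reported.

```python
from typing import Dict, Optional

# Approximate product sizes in bp as quoted in the text (upper band only
# for the Asp-Phe locus). Locus labels are shorthand chosen for this sketch.
REFERENCE: Dict[str, Dict[str, int]] = {
    "S. hominis": {"iMet-Asp": 68, "Asp-Phe": 66},
    "S. warneri": {"iMet-Asp": 60, "Asp-Phe": 67},
    "S. aureus": {"iMet-Asp": 63, "Asp-Phe": 73},
    "S. haemolyticus": {"iMet-Asp": 58, "Asp-Phe": 62},
    "S. cohnii": {"iMet-Asp": 63},  # Asp-Phe size not reported
}


def assign_species(
    observed: Dict[str, int],
    tolerance_bp: int = 1,   # assumed gel-sizing tolerance
    min_loci: int = 2,       # assumed: never call a species on one locus
) -> Optional[str]:
    """Return a species only if exactly one reference matches at every
    shared locus and at least `min_loci` loci were compared; else None."""
    matches = []
    for species, reference in REFERENCE.items():
        shared = [locus for locus in observed if locus in reference]
        if len(shared) < min_loci:
            continue
        if all(abs(observed[l] - reference[l]) <= tolerance_bp for l in shared):
            matches.append(species)
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(assign_species({"iMet-Asp": 63, "Asp-Phe": 73}))
    print(assign_species({"iMet-Asp": 63}))
    print(assign_species({"iMet-Asp": 65, "Asp-Phe": 70}))
```

Note that S. aureus and S. cohnii share a size of about 63 bp at the first locus, so a single product length cannot separate them in this table. A strain whose products match no reference entry is left unassigned rather than forced into the nearest species.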
Of the 36 strains from five species of Staphylococcus that were tested (Table 3), only four strains gave PCR products that appeared to be polymorphic within a species. In one case, as determined by independent criteria, an S. aureus strain had been mis-classified as S. cohnii. tDNA-ILPs should be effective for detecting mis-classified strains. Three other strains could not be identified because both of the tDNA-ILP products generated differed in length from those of other members of their presumptive species. While this may reflect genuine intraspecific variation, the fact that both tDNA-ILPs differ from other strains in the presumptive species suggests that these three strains are members of perhaps two entirely different Staphylococcus species, not yet included in our analysis. It is important that species identification not be based on a single tDNA-ILP. However, strains that can be matched with a species on the basis of two or more tDNA-ILPs can probably be assigned with confidence. Because the assignment of a species designation to a strain using the tDNA-ILP method involves comparison to reference strains, some of which may not be included in a study, the method may sometimes fail to classify a strain, but it is unlikely to classify a strain incorrectly. On rare occasions such data may even indicate previously unknown species. This situation is in contrast to some conventional PCR strategies where a strain is classified on the basis of the presence or absence of a PCR product that is not intrinsically polymorphic in length. The tDNA-ILP method may be more likely than PCR of protein coding regions to detect Staphylococcus, whether classifiable or not, because tDNA-ILP primers are directed to sequences that are much more conserved among staphylococci than protein coding regions are likely to be.
The high stringency primers StaphiMet3 plus StaphAsp5 or StaphAsp3 plus StaphPhe5 were designed to produce a PCR product only in Staphylococcus and only in the diagnostic range around 50 bp to 100 bp. To determine if this was in fact the case, a variety of closely related and unrelated genera were investigated. Total genomic DNAs at concentrations ranging from 1 ng to 1 mg were tested for the species listed in Table 3 to determine if these would yield tDNA-ILP products with the primers presumed to be Staphylococcus-specific. Genomic DNAs from Streptococcus, Bacillus and Enterococcus, which are among the genera most closely related to Staphylococcus, gave no detectable PCR products in the 40 bp to 200 bp range under high stringency PCR. Nor did DNA from the less related E. coli or human. Occasionally, a primer dimer was detected at 40 bp or a faint product over 200 bp. In order to determine the full spectrum of species for which these primers give a prominent product in the diagnostic size range, a systematic analysis of species in Staphylococcus and related genera is continuing.
C. Discussion
A two step procedure was developed that (1) identifies length polymorphisms in tRNA intergenic spacers using low stringency tRNA gene consensus primers and (2) uses the sequences of these polymorphisms to design high stringency tRNA gene primers. Unlike conventional pairs of PCR primers, these primers can be used in a number of related species to produce a PCR product the length of which defines most or all strains in each species.
Staphylococcus species were chosen in this study as a model for a general method. These represent a well studied group of mainly commensals and opportunistic pathogens on mammals and birds (Kloos et al., Systematic Bacteriology, 2:1013-39, 1986). The primers developed for this genus may be useful for epidemiology, ecology or diagnosis. It is possible that PCR of tRNA gene spacers will be more sensitive and reliable than other methods such as immunological methods, while also identifying the occasional occurrence of a species of Staphylococcus in an unusual context. The use of the dUTP/ung system to minimize cross-contamination (Perkin Elmer Cetus, Norwalk, CT) should make a routine test quite reliable.
The consensus tRNA gene primers used in Example 2 produce fingerprints in many species of bacteria. In addition to the results shown here, these consensus primers have been used to create fingerprints that show interspecific polymorphisms in Streptococcus (Welsh et al., J. Clin. Microbiol., 1992a). Polymorphisms in CP-PCR fingerprints identify spacers that are good candidates for the tDNA-ILP method. Primers that produce ILPs may be developed for any group of bacteria and probably even for lower eukaryotes of medical or agricultural importance, such as Pneumocystis carinii, and agents of fungal infections such as Aspergillus and Candida. The polymorphic sequences obtained from tDNA-PCR products could also be the basis for choosing primer pairs that generate a PCR product in only a single species. However, such conventional species-specific primers suffer disadvantages. First, the main advantage of ILPs would be lost, namely the ability to detect most or all individuals in a number of species with a single pair of primers based on highly conserved genes flanking length polymorphisms. Second, at least some part of a primer sequence must be from a non-conserved region to prevent it from producing a product in a closely related species. As a consequence, some strains within the species may not be detected by such primer pairs because the non-conserved sequences are inevitably more likely to show intraspecific variation.
In principle, one could distinguish between species using interspecific length polymorphisms in any region of the genome. For example, primers could be located in coding regions to amplify intergenic non-coding regions (protein-ILPs). However, high stringency primers for protein coding regions are unlikely to produce a diagnostic product in all species in a genus because changes in the third position of codons occur much more rapidly than sequence changes in tRNA genes.
Another method to produce species-specific length polymorphisms using primers that work on a group of species could be based on rRNA genes. The rRNA gene sequences and their organization are well conserved within a genus. We have demonstrated the principle of using primers to amplify the 16S-23S and 23S-5S intergenic regions as a diagnostic tool (Barry et al., PCR Methods and Applications, 1:51-56, 1991). The length of these spacers is often variable between species and this could be used to distinguish them. Because sequence data for rRNA genes is accumulating rapidly, the production of genus-specific primers may often not require a rDNA consensus fingerprint as a first step. Likewise, as large amounts of information about tRNA gene sequences and organization become available, such as that for Mycoplasma capricolum (Muto et al., Nucleic Acids Res., 18:5037-5042, 1990), tDNA-ILP primers could be designed directly without initial consensus tDNA-PCR fingerprinting.
tDNA-ILPs may have advantages over rDNA-ILPs for several reasons. The rRNA gene spacers are generally a few hundred base pairs in length and a difference of a few base pairs in the intergenic spacer length between related species would be difficult to detect. A virtue of using tDNA-ILPs is that the primers can be designed so that the PCR products are short. Short products can be amplified rapidly and run rapidly on gels, and a difference in size of a few base pairs results in a large fractional change in mobility. In addition, products from contaminants will not compete well against a short substrate. Primers designed to amplify rDNA-ILPs cannot be adjusted to yield short products and still reside within the highly conserved rDNA gene sequence. Also, there are only a small number of rRNA genes in each eubacterium.
In contrast, in each species of every genus there are as many as a hundred possible combinations of highly conserved pairs of tRNA gene sequences, separated by variable short intergenic spacers, that can be compared to produce candidate primers for a genus-specific high stringency tDNA-ILP method. tDNA-PCR fingerprinting methods allow one to survey a large number of different intergenic regions for length polymorphisms in a single experiment. Consensus primers can easily be devised that interact with different sets of tRNA genes. Finally, primers can be selected to give genus-specific products based on the fact that the order of tRNA genes can vary between closely related genera. The gene order of ribosomal RNAs does not vary in eubacteria.
The above-discussed results demonstrate that tDNA-ILPs and rDNA-ILPs can be developed for any group of closely related species. These represent a potentially valuable source of tools for rapid species identification in uncultured samples.
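The resolution argument made above for short products can be illustrated with simple arithmetic. The sketch below compares the same absolute length difference on a short product and on a product of a few hundred base pairs. The sizes of 60 bp and 300 bp and the 5 bp difference are illustrative values chosen for this example, not measurements, and relative size difference is used only as a rough proxy for the fractional change in gel mobility.

```python
def relative_size_difference(delta_bp: int, product_bp: int) -> float:
    """Length difference expressed as a fraction of the product length."""
    if product_bp <= 0:
        raise ValueError("product length must be positive")
    return delta_bp / product_bp


if __name__ == "__main__":
    delta = 5  # illustrative length difference between two species, in bp
    for label, size in (("short tDNA-ILP product", 60),
                        ("rRNA spacer product", 300)):
        pct = 100 * relative_size_difference(delta, size)
        print(f"{label} ({size} bp): {delta} bp is {pct:.1f}% of the length")
```

With these assumed values the same 5 bp difference is roughly 8% of the short product but under 2% of the longer one, which is the sense in which short products make small length polymorphisms easier to resolve.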
ADVANTAGES OF THE INVENTION
The present invention provides a method with several advantages for identification of bacteria and other biological materials. The method is simple to perform and rapid; results can be obtained in as little as 36 hours when the template nucleic acids are isolated by boiling. Only small samples of material, e.g., nanogram amounts, are needed. The method yields information that allows the differentiation of even closely related species and can be extended to differentiate between subspecies or strains of the same species. The method requires no prior knowledge of any biochemical characteristics, including the nucleotide sequence of the target nucleic acids, of the organism to be identified. Initially, it requires the use of no species-specific reagents, because the primer used is based on consensus sequences as described herein. Additionally, the method possesses the important advantage of requiring only one primer sequence for amplification, although two or more primers can be used in some embodiments.
The CP-PCR method of the invention can be used to provide identification of other types of organisms, including viruses, fungi, mammals and plants. The method also provides an efficient way of generating polymorphisms for use in genetic mapping, especially of eukaryotes, including animals, particularly mice and humans. This method has many applications in mammalian population genetics, pathology, epidemiology and forensics.
Although the present invention has been described in considerable detail with regard to certain preferred versions thereof, other versions are possible. Therefore, the spirit and scope of the appended claims should not be limited to the descriptions of the preferred versions contained herein.

Claims

What Is Claimed Is:
1. A method of generating a set of discrete DNA segments characteristic of a genome comprising:
(a) forming a polymerase chain reaction (PCR) admixture by combining, in a PCR buffer, genomic DNA and at least one structural RNA consensus primer from about 10 to about 50 nucleotide bases in length;
(b) subjecting said PCR admixture of step (a) to a plurality of PCR thermocycles to produce a plurality of DNA segments, thereby forming a set of discrete DNA segments.
2. The method of claim 1 further comprising the steps of:
(c) applying the set of discrete DNA segments produced in step (b) to a channel of a separating apparatus;
(d) size-separating the applied segments into bands within the channel to form a fingerprint of segments characteristic of said genome.
3. The method of claim 1 wherein said at least one consensus primer comprises at least two different polynucleotides.
4. The method of claim 1 wherein said structural RNA consensus primer is a transfer RNA or ribosomal RNA consensus primer.
5. The method of claim 4 wherein said consensus primer is a tRNA consensus primer selected from the group consisting of:
(T5A) 5'-AGTCCGGTGCTCTAACCAACTGAG-3', (T5B) 5'-AATGCTCTACCAACTGAACT-3',
(T3A) 5'-GGGGGTTCGAATTCCCGCCGGCCCCA-3', and (T3B) 5'-AGGTCGCGGGTTCGAATCC-3'.
6. The method of claim 4 wherein the primer is a tRNA consensus primer having a sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', and wherein the primer has a length of at least fifteen nucleotides, said fifteen nucleotides at the 3'-terminus having at least fifty percent homology to a consensus tRNA sequence.
8. The method of claim 1 wherein the consensus primer is at least about 15 bases in length and less than about 40 bases in length.
9. The method of claim 1 wherein said plurality of PCR thermocycles is about 10 to 40 PCR thermocycles.
10. A method for typing an organism isolate having genomic DNA, which method comprises:
(a) forming a polymerase chain reaction (PCR) admixture by combining, in a PCR buffer, said genomic DNA and at least one structural RNA consensus primer from about 10 to about 50 nucleotide bases in length;
(b) subjecting said PCR admixture of step (a) to a plurality of PCR thermocycles to produce a plurality of DNA segments, thereby forming a set of amplified discrete DNA segments;
(c) applying the amplified set of discrete DNA segments produced in step (b) to a channel of a separating apparatus;
(d) size-separating the applied segments into bands within the channel to form a fingerprint of segments characteristic of said genome; and
(e) comparing the fingerprint of step (d) with the fingerprints for control samples of genomic DNA from a panel of predetermined species of organisms prepared in accordance with steps (a)-(d), and recording the results of the comparison.
11. The method of claim 10 wherein the organism isolate is a bacterial isolate.
12. The method of claim 11 wherein said bacterial isolate belongs to the genus Staphylococcus and the panel of predetermined species comprises S. haemolyticus, S. hominis, S. aureus, S. warneri or S. cohnii.
13. The method of claim 11 wherein said bacterial isolate belongs to the genus Streptococcus and the panel of predetermined species comprises S. pyogenes or S. mutans.
14. The method of claim 11 wherein said bacterial isolate belongs to the genus Enterococcus and the panel of predetermined species comprises E. faecalis.
15. The method of claim 10 wherein said panel of predetermined species comprises species representative of prokaryotes, eukaryotes or plants.
16. The method of claim 10 wherein said at least one structural RNA consensus primer comprises at least two different polynucleotides.
17. The method of claim 10 wherein said structural RNA consensus primer is a transfer RNA or ribosomal RNA consensus primer.
18. The method of claim 17 wherein said consensus primer is a tRNA consensus primer selected from the group consisting of:
(T5A) 5'-AGTCCGGTGCTCTAACCAACTGAG-3', (T5B) 5'-AATGCTCTACCAACTGAACT-3', (T3A) 5'-GGGGGTTCGAATTCCCGCCGGCCCCA-3', and (T3B) 5'-AGGTCGCGGGTTCGAATCC-3'.
19. The method of claim 17 wherein the primer is a tRNA consensus primer having a sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', and wherein the primer has a length of at least fifteen nucleotides, said fifteen nucleotides at the 3'-terminus having at least fifty percent homology to a consensus tRNA sequence.
20. The method of claim 17 wherein the consensus primer is at least about 15 bases in length and less than about 40 bases in length.
21. The method of claim 17 wherein said plurality of PCR thermocycles is about 10 to 40 PCR thermocycles.
22. A consensus tRNA primer from about 10 to about 50 nucleotide bases in length having a sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', and wherein the primer has a length of at least fifteen nucleotides, said fifteen nucleotides at the 3'-terminus having at least fifty percent homology to a consensus tRNA sequence.
23. The consensus tRNA primer of claim 22 wherein said primer has a nucleotide sequence selected from the group consisting of:
(T5A) 5'-AGTCCGGTGCTCTAACCAACTGAG-3', (T5B) 5'-AATGCTCTACCAACTGAACT-3', (T3A) 5'-GGGGGTTCGAATTCCCGCCGGCCCCA-3', and (T3B) 5'-AGGTCGCGGGTTCGAATCC-3'.
24. A kit for typing an isolate of organism comprising an enclosure containing, in separate containers, at least one structural RNA consensus primer and at least one sample of isolated genomic DNA from a panel of predetermined species of organisms.
25. The kit of claim 10 wherein said panel of predetermined species comprises species representative of prokaryotes, eukaryotes or plants.
26. The kit of claim 25 wherein said panel of predetermined species belongs to the genus Staphylococcus and the panel comprises S. haemolyticus, S. hominis, S. aureus, S. warneri or S. cohnii.
27. The kit of claim 25 wherein said panel of predetermined species belongs to the genus Streptococcus and the panel comprises S. pyogenes or S. mutans.
28. The kit of claim 25 wherein said panel of predetermined species belongs to the genus Enterococcus and the panel comprises E. faecalis.
29. The kit of claim 24 wherein said at least one structural RNA consensus primer comprises at least two different polynucleotides.
30. The kit of claim 24 wherein said structural RNA consensus primer is a transfer RNA or ribosomal RNA consensus primer.
31. The kit of claim 30 wherein said structural RNA consensus primer is a tRNA consensus primer selected from the group consisting of:
(T5A) 5'-AGTCCGGTGCTCTAACCAACTGAG-3', (T5B) 5'-AATGCTCTACCAACTGAACT-3',
(T3A) 5'-GGGGGTTCGAATTCCCGCCGGCCCCA-3', and (T3B) 5'-AGGTCGCGGGTTCGAATCC-3'.
32. The kit of claim 30 wherein said structural RNA consensus primer is a tRNA consensus primer having a sequence at its 3'-terminus corresponding to the sequence 5'-CTGAG-3', 5'-GAACT-3', 5'-CCCCA-3', or 5'-AATCC-3', and wherein the primer has a length of at least fifteen nucleotides, said fifteen nucleotides at the 3'-terminus having at least fifty percent homology to a consensus tRNA sequence.
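The length and 3'-terminal sequence limitations recited in the claims for tRNA consensus primers lend themselves to a mechanical check. The following sketch is offered for illustration only and forms no part of the claims. The helper name `meets_terminal_and_length_limits` is hypothetical, and the check covers only the 3'-terminal pentamer and the minimum length of fifteen nucleotides; the fifty percent homology limitation is not evaluated because it requires a consensus tRNA sequence that is not reproduced here.

```python
# Primer sequences as recited in the claims, written 5' to 3'.
PRIMERS = {
    "T5A": "AGTCCGGTGCTCTAACCAACTGAG",
    "T5B": "AATGCTCTACCAACTGAACT",
    "T3A": "GGGGGTTCGAATTCCCGCCGGCCCCA",
    "T3B": "AGGTCGCGGGTTCGAATCC",
}

# 3'-terminal pentamers recited in the claims.
ALLOWED_3PRIME = ("CTGAG", "GAACT", "CCCCA", "AATCC")


def meets_terminal_and_length_limits(seq: str, min_len: int = 15) -> bool:
    """True if the primer is at least `min_len` nucleotides long and ends
    (at its 3' terminus) in one of the recited pentamers."""
    seq = seq.upper()
    return len(seq) >= min_len and seq.endswith(ALLOWED_3PRIME)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        ok = meets_terminal_and_length_limits(seq)
        print(f"{name}: length {len(seq)}, 3' end {seq[-5:]}, passes: {ok}")
```

Each of the four recited primers ends in a different one of the four pentamers, and all four fall within the recited length ranges.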
PCT/US1992/001491 1991-02-25 1992-02-25 Consensus sequence primed polymerase chain reaction method for fingerprinting genomes WO1992014844A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US661,591 1991-02-25
US07/661,591 US5437975A (en) 1991-02-25 1991-02-25 Consensus sequence primed polymerase chain reaction method for fingerprinting genomes

Publications (1)

Publication Number Publication Date
WO1992014844A1 true WO1992014844A1 (en) 1992-09-03

Family

ID=24654250

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/001491 WO1992014844A1 (en) 1991-02-25 1992-02-25 Consensus sequence primed polymerase chain reaction method for fingerprinting genomes

Country Status (3)

Country Link
US (1) US5437975A (en)
AU (1) AU1662292A (en)
WO (1) WO1992014844A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994012669A2 (en) * 1992-11-30 1994-06-09 Genetics Institute, Inc. Detection of dna contaminants by pcr
WO1995001453A1 (en) * 1993-07-01 1995-01-12 The Board Of Trustees Of The Leland Stanford Junior University A heteroduplex mobility assay for the analysis of nucleic acid sequence diversity
US5491062A (en) * 1993-11-23 1996-02-13 Stratagene Polynucleotide amplification mycoplasma assay, primers, and kits therefore
US5656740A (en) * 1994-06-06 1997-08-12 E. I. Du Pont De Nemours And Company Nucleic acid fragments useful in the detection of Salmonella
US5747257A (en) * 1996-02-29 1998-05-05 E. I. Du Pont De Nemours And Company Genetic markers and methods for the detection of escherichia coli serotype-0157:H7
US5753467A (en) * 1991-12-04 1998-05-19 E. I. Du Pont De Nemours And Company Method for the identification of microorganisms by the utilization of directed and arbitrary DNA amplification
WO1999011823A2 (en) * 1997-09-05 1999-03-11 Sidney Kimmel Cancer Center Selection of pcr primer pairs to amplify a group of nucleotide sequences
US5922538A (en) * 1996-11-08 1999-07-13 E.I. Du Pont De Nemours And Company Genetic markers and methods for the detection of Listeria monocytogenes and Listeria spp
CN102154500A (en) * 2011-03-25 2011-08-17 中国水产科学研究院黄海水产研究所 Sextuple PCR (polymerase chain reaction) detection method of portunus trituberculatus miers microsatellite marker
CN103160593A (en) * 2013-04-03 2013-06-19 东北农业大学 Specific detection method of phytophthora sojae avirulence gene (Avrlc )
CN106148324A (en) * 2015-05-12 2016-11-23 中国科学院上海生命科学研究院 The Analysis and Identification method of RNA-RNA interaction and application thereof

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6074818A (en) * 1990-08-24 2000-06-13 The University Of Tennessee Research Corporation Fingerprinting of nucleic acids, products and methods
WO1995004161A1 (en) * 1993-07-29 1995-02-09 Sergio Danilo Junho Pena Method for recognition of the nucleotide sequence of a purified dna segment
JPH10510981A (en) * 1994-05-16 1998-10-27 ブリガム アンド ウィメンズ ホスピタル Methods, devices and compositions for characterizing nucleotide sequences
JP3579497B2 (en) * 1995-05-12 2004-10-20 株式会社飼料作物改良増殖技術研究所 Individual identification method by PCR reaction using MI primer
US6277571B1 (en) 1997-10-03 2001-08-21 Virginia Commonwealth University Intellectual Property Foundation Sequential consensus region-directed amplification of known and novel members of gene families
US20100130368A1 (en) * 1998-07-30 2010-05-27 Shankar Balasubramanian Method and system for sequencing polynucleotides
US20040106110A1 (en) * 1998-07-30 2004-06-03 Solexa, Ltd. Preparation of polynucleotide arrays
US20030022207A1 (en) * 1998-10-16 2003-01-30 Solexa, Ltd. Arrayed polynucleotides and their use in genome analysis
EP1155148B1 (en) * 1999-03-01 2006-02-08 Lapuco Investment N.V. Detection and quantification of micro-organisms using amplification and restriction enzyme analysis
US6846626B1 (en) * 1999-09-01 2005-01-25 Genome Technologies, Llc Method for amplifying sequences from unknown DNA
AU6052001A (en) 2000-03-29 2001-10-08 Ct For The Applic Of Molecular Methods for genotyping by hybridization analysis
JP2004510433A (en) 2000-10-06 2004-04-08 ザ・トラスティーズ・オブ・コランビア・ユニバーシティー・イン・ザ・シティー・オブ・ニューヨーク Massively parallel methods for decoding DNA and RNA
US9708358B2 (en) 2000-10-06 2017-07-18 The Trustees Of Columbia University In The City Of New York Massive parallel method for decoding DNA and RNA
US6949340B2 (en) 2001-03-28 2005-09-27 Creative Mines Llc Optical phase modulator
US20040023254A1 (en) * 2002-01-08 2004-02-05 Fuhrmann Jeffry J. Method to assess quorum sensing potential of microbial communities
US20070065834A1 (en) * 2005-09-19 2007-03-22 Hillis William D Method and sequences for determinate nucleic acid hybridization
US7455844B2 (en) 2006-03-29 2008-11-25 Merial Limited Vaccine against streptococci
GB2457402B (en) 2006-12-01 2011-10-19 Univ Columbia Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators
AU2008242250B2 (en) * 2007-04-19 2014-03-27 Molecular Detection Inc. Methods, compositions and kits for detection and analysis of antibiotic-resistant bacteria
US20110014611A1 (en) 2007-10-19 2011-01-20 Jingyue Ju Design and synthesis of cleavable fluorescent nucleotides as reversible terminators for dna sequences by synthesis
EP2209911B1 (en) 2007-10-19 2013-10-16 The Trustees of Columbia University in the City of New York Dna sequencing with non-fluorescent nucleotide reversible terminators and cleavable label modified nucleotide terminators and a deoxyinosine analogue with a reversible terminator group
WO2009089598A2 (en) * 2008-01-18 2009-07-23 Katholieke Universiteit Leuven Msmb-gene methylation based diagnosis, staging and prognosis of prostate cancer
US20110312503A1 (en) 2010-01-23 2011-12-22 Artemis Health, Inc. Methods of fetal abnormality detection
US20140342940A1 (en) 2011-01-25 2014-11-20 Ariosa Diagnostics, Inc. Detection of Target Nucleic Acids using Hybridization
US10533223B2 (en) 2010-08-06 2020-01-14 Ariosa Diagnostics, Inc. Detection of target nucleic acids using hybridization
US8700338B2 (en) 2011-01-25 2014-04-15 Ariosa Diagnostics, Inc. Risk calculation for evaluation of fetal aneuploidy
US20130040375A1 (en) 2011-08-08 2013-02-14 Tandem Diagnostics, Inc. Assay systems for genetic analysis
US20120034603A1 (en) 2010-08-06 2012-02-09 Tandem Diagnostics, Inc. Ligation-based detection of genetic variants
US10167508B2 (en) 2010-08-06 2019-01-01 Ariosa Diagnostics, Inc. Detection of genetic abnormalities
US11031095B2 (en) 2010-08-06 2021-06-08 Ariosa Diagnostics, Inc. Assay systems for determination of fetal copy number variation
US11203786B2 (en) 2010-08-06 2021-12-21 Ariosa Diagnostics, Inc. Detection of target nucleic acids using hybridization
US20130261003A1 (en) 2010-08-06 2013-10-03 Ariosa Diagnostics, Inc. Ligation-based detection of genetic variants
US8756020B2 (en) 2011-01-25 2014-06-17 Ariosa Diagnostics, Inc. Enhanced risk probabilities using biomolecule estimations
US9994897B2 (en) 2013-03-08 2018-06-12 Ariosa Diagnostics, Inc. Non-invasive fetal sex determination
US10131947B2 (en) 2011-01-25 2018-11-20 Ariosa Diagnostics, Inc. Noninvasive detection of fetal aneuploidy in egg donor pregnancies
US11270781B2 (en) 2011-01-25 2022-03-08 Ariosa Diagnostics, Inc. Statistical analysis for non-invasive sex chromosome aneuploidy determination
US8712697B2 (en) 2011-09-07 2014-04-29 Ariosa Diagnostics, Inc. Determination of copy number variations using binomial probability calculations
US10289800B2 (en) 2012-05-21 2019-05-14 Ariosa Diagnostics, Inc. Processes for calculating phased fetal genomic sequences
US9206417B2 (en) 2012-07-19 2015-12-08 Ariosa Diagnostics, Inc. Multiplexed sequential ligation-based detection of genetic variants
WO2014144883A1 (en) 2013-03-15 2014-09-18 The Trustees Of Columbia University In The City Of New York Raman cluster tagged molecules for biological imaging
US10100349B2 (en) * 2013-09-30 2018-10-16 President And Fellows Of Harvard College Methods of determining polymorphisms
GB201322034D0 (en) 2013-12-12 2014-01-29 Almac Diagnostics Ltd Prostate cancer classification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, Vol. 86, issued August 1989, OHARA et al., "One-sided Polymerase Chain Reaction: The Amplification of cDNA", pages 5673-77. *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5753467A (en) * 1991-12-04 1998-05-19 E. I. Du Pont De Nemours And Company Method for the identification of microorganisms by the utilization of directed and arbitrary DNA amplification
WO1994012669A3 (en) * 1992-11-30 1994-07-21 Genetics Inst Detection of dna contaminants by pcr
US5393657A (en) * 1992-11-30 1995-02-28 Genetics Institute, Inc. Detection of residual host cell DNA by PCR
WO1994012669A2 (en) * 1992-11-30 1994-06-09 Genetics Institute, Inc. Detection of dna contaminants by pcr
WO1995001453A1 (en) * 1993-07-01 1995-01-12 The Board Of Trustees Of The Leland Stanford Junior University A heteroduplex mobility assay for the analysis of nucleic acid sequence diversity
US5491062A (en) * 1993-11-23 1996-02-13 Stratagene Polynucleotide amplification mycoplasma assay, primers, and kits therefore
US5660981A (en) * 1994-06-06 1997-08-26 E. I. Du Pont De Nemours And Company Selection of diagnostic genetic markers in microorganisms and use of a specific marker for detection of salmonella
US5656740A (en) * 1994-06-06 1997-08-12 E. I. Du Pont De Nemours And Company Nucleic acid fragments useful in the detection of Salmonella
US5747257A (en) * 1996-02-29 1998-05-05 E. I. Du Pont De Nemours And Company Genetic markers and methods for the detection of escherichia coli serotype-0157:H7
US5922538A (en) * 1996-11-08 1999-07-13 E.I. Du Pont De Nemours And Company Genetic markers and methods for the detection of Listeria monocytogenes and Listeria spp
WO1999011823A2 (en) * 1997-09-05 1999-03-11 Sidney Kimmel Cancer Center Selection of pcr primer pairs to amplify a group of nucleotide sequences
WO1999011823A3 (en) * 1997-09-05 1999-06-10 Sidney Kimmel Cancer Ct Selection of pcr primer pairs to amplify a group of nucleotide sequences
CN102154500A (en) * 2011-03-25 2011-08-17 Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences Sextuple PCR (polymerase chain reaction) detection method of portunus trituberculatus miers microsatellite marker
CN102154500B (en) * 2011-03-25 2012-10-10 Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences Sextuple PCR (polymerase chain reaction) detection method of portunus trituberculatus miers microsatellite marker
CN103160593A (en) * 2013-04-03 2013-06-19 Northeast Agricultural University Specific detection method of phytophthora sojae avirulence gene (Avrlc )
CN103160593B (en) * 2013-04-03 2015-01-07 Northeast Agricultural University Specific detection method of phytophthora sojae avirulence gene (Avrlc )
CN106148324A (en) * 2015-05-12 2016-11-23 Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences The Analysis and Identification method of RNA-RNA interaction and application thereof

Also Published As

Publication number Publication date
US5437975A (en) 1995-08-01
AU1662292A (en) 1992-09-15

Similar Documents

Publication Publication Date Title
US5437975A (en) Consensus sequence primed polymerase chain reaction method for fingerprinting genomes
US5861245A (en) Arbitrarily primed polymerase chain reaction method for fingerprinting genomes
EP0620862B1 (en) Method for the identification of microorganisms by the utilization of directed and arbitrary dna amplification
EP0610396B1 (en) Fingerprinting bacterial strains using repetitive dna sequence amplification
EP0395292B1 (en) Generation of specific probes for target nucleotide sequences
CA2078132C (en) A process for distinguishing nucleic acids on the basis of nucleotide differences
Jensen et al. Rapid identification of bacteria on the basis of polymerase chain reaction-amplified ribosomal DNA spacer polymorphisms
Walker et al. Multiplex strand displacement amplification (SDA) and detection of DNA sequences from Mycobacterium tuberculosis and other mycobacteria
US20070031869A1 (en) Template specific inhibition of PCR
EP0885310B1 (en) Genetic markers and methods for the detection of escherichia coli serotype-0157:h7
EP0948643B1 (en) Genetic markers and methods for the detection of listeria monocytogenes and listeria spp
US5660981A (en) Selection of diagnostic genetic markers in microorganisms and use of a specific marker for detection of salmonella
Boye et al. Identification of bacteria using two degenerate 16S rDNA sequencing primers
US20030113757A1 (en) Rapid and specific detection of campylobacter
Grattard et al. Analysis of the genetic diversity of Legionella by sequencing the 23S-5S ribosomal intergenic spacer region: from phylogeny to direct identification of isolates at the species level from clinical specimens
CA2150986C (en) Oligonucleotide primers and probes for detection of bacteria
US7883870B2 (en) Molecular identification of Staphylococcus-genus bacteria
JP4238319B2 (en) Mycoplasma detection method
Bricker Differentiation of hard-to-type bacterial strains by RNA mismatch cleavage
WO1994017203A1 (en) Amplified dna fingerprinting method for detecting genomic variation
Basha et al. Nucleic acid based methods in food borne pathogens
Delvecchio et al. Development of PCR-based assays for the detection and molecular genotyping of microorganisms of importance to biological warfare
Mukherjee et al. Advances in PCR based molecular markers and its application in biodiversity conservation
JPH11285383A (en) New insertion sequence

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase