WO2001036679A2

WO2001036679A2 - METHODS FOR GENERATING SINGLE STRANDED cDNA FRAGMENTS

Info

Publication number: WO2001036679A2
Application number: PCT/US2000/031096
Authority: WO
Inventors: John G. Hartwell; Glenn D. Hoke
Original assignee: Hartwell John G; Hoke Glenn D
Priority date: 1999-11-15
Filing date: 2000-11-14
Publication date: 2001-05-25
Also published as: WO2001036679A3; AU1601701A

Abstract

New and improved methods are provided for generating amplified DNA fragments from limited quantities of longer DNAs. The methods are robust and reliable, and can be used to provide gene fragments for use in methods of analyzing gene expression patterns.

Description

Methods for Generating Single Stranded cDNA Fragments

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Ser. No. 60/164,932, filed November 15, 1999, which is hereby incorporated by reference.

FIELD OF THE INVENTION

The present invention provides new and improved methods for generating amplified DNA fragments from limited quantities of longer DNAs. The methods are robust and reliable, and can be used to provide gene fragments for use in methods of analyzing gene expression patterns.

BACKGROUND OF THE INVENTION

In recent years, methods have been developed for the analysis of gene expression in individual cells and tissues. These methods are providing powerful insights into the cellular processes that occur, for example, in disease states. For example, the gene expression profile for normal and diseased cells can be compared to provide information regarding the identity of genes whose expression levels are modified in the disease state. This information can provide insights that are useful in developing treatments for the disease, or in understanding the pathology of the disease.

Microfabricated arrays of large numbers of oligonucleotide probes, called "DNA chips," offer great promise for a wide variety of applications. In particular, DNA chips are useful for generating gene expression profiles of the type discussed above. Typically, DNA chip technology involves a microarray containing many thousands of unique DNA probes fixed to a solid support. Mixtures containing fragments of target nucleic acids are applied to the chip, and fragments that hybridize with the probes are retained on the chip while fragments that do not hybridize simply are washed away. The success of DNA chip technology, however, depends on the ability to obtain sufficient amounts of single stranded nucleic acid molecules that are suitably fragmented and labeled and that can be hybridized to the chips. Moreover, the amounts of the single stranded nucleic acid molecules should reflect the amount of the corresponding mRNA in the cell or tissue of interest if the gene expression analysis is to provide any useful quantitative information.

It is often desirable to fragment the target nucleic acid molecule prior to hybridization with a probe array, in order to provide segments which are more readily accessible to the probes, which hybridize more rapidly, and which avoid looping and/or hybridization to multiple probes. On the other hand, target molecules that are too short are more likely to hybridize in a non-specific manner, providing an inaccurate assessment of gene expression patterns. Current methods for fragmenting nucleic acids generally rely on some combination of one or more physical, chemical and/or enzymatic methods. These methods, however, frequently are not satisfactory because the fragmentation is completely random and non-reproducible and often fails to provide fragments of the desired length.

Obtaining sufficient mRNA, or corresponding cDNAs, for the study of gene expression often is problematic. Typically, amplification of the mRNA or cDNA is required to provide sufficient material for detection. The polymerase chain reaction (PCR) is an extremely powerful technique for amplifying specific nucleic acid sequences, including mRNA in certain circumstances. PCR, however, has several well-known limitations. For example, PCR typically requires that the 5' terminus and 3' terminus sequence information be known for the synthesis of the primers. Another problem with PCR-based methods is that, because the amplification is exponential, rare sequences of the starting mRNA population often are extremely under-represented in the final amplification product.

By contrast, linear amplification methods result in proportional amplification, which provides a more accurate representation of the relative abundance of expressed genes in a given cell or tissue, preserving rare sequences and providing more accurate quantitation. U.S. Patent No. 5,545,522 (Gelder et al.) describes a method in which mRNA molecules are reverse-transcribed using a complementary primer linked to an RNA polymerase promoter region. Anti-sense RNA (aRNA) then is transcribed from the cDNA by introducing an RNA polymerase capable of binding to the promoter region. The resulting aRNA can be fragmented by heating. This method provides linear amplification of the starting mRNA molecules. However, because the resulting products are RNAs, this method suffers from the failings inherent in handling RNAs, that is, the RNAs are prone to degradation and formation of secondary structures. Moreover, the random nature of the fragmentation does not reproducibly provide uniform fragments of the desired size in the final amplification product.

It is apparent, therefore, that a need exists for improved methods that provide linear amplification of initial target nucleic acids and are able to reproducibly generate labeled DNA fragments of a desired size. Preferably, the overall methodologies will be capable of amplifying a broad range of target molecules without prior cloning and, in some instances, without knowledge of sequence. The present invention fulfills these and other needs.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide improved methods for generating nucleic acid fragments that can be used in gene expression analysis and other applications.

In accomplishing these objects, there has been provided, in accordance with one aspect of the present invention, a method for generating a modified target nucleic acid molecule, comprising (a) carrying out a primer extension step on a target molecule in the presence of a modifiable nucleotide to generate a primer extension product which is complementary to the target nucleic acid and which incorporates the modifiable nucleotide in a stochastic distribution amongst the possible sites of incorporation of said modifiable nucleotide; (b) contacting the primer extension product with a reagent that converts at least one modifiable nucleotide into a modified nucleotide, whereby the primer extension product is susceptible to cleavage at the site of the modified nucleotide; and (c) cleaving the primer extension product at the site of at least one modified nucleotide.

In one embodiment, the target nucleic acid molecule is a cDNA. In another embodiment, the cDNA comprises a 3'-end polyadenylate tail and the primer hybridizes to the 3'-end polyadenylate tail.

In still another embodiment, the modifiable nucleotide is dUTP, and the modifiable nucleotide may be converted into the modified nucleotide by treatment with uracil N-glycosylase. In yet other embodiments, the modifiable nucleotide is 5-hydroxy-2'-deoxycytidine triphosphate or 5-hydroxy-2'-deoxyuridine triphosphate, and the modifiable nucleotide is converted into the modified nucleotide by treatment with E. coli endonuclease III or with formamidopyrimidine DNA N-glycosylase.

In yet another embodiment, the primer extension step is repeated at least 5 times, or at least 10 times. In still further embodiments, the concentration of dUTP is about 25% of that of dTTP. In other embodiments, the cleavage step (c) is via treatment at a sufficiently elevated temperature for a time sufficient to cleave at least about 95% of the nucleic acid molecules. The elevated temperature can be about 95 °C. In other embodiments, the cleavage step (c) is via treatment at a sufficiently elevated pH for a time sufficient to cleave at least about 95% of the nucleic acid molecules.

In yet another embodiment, the primer is labeled with a detectable label. The nucleotide building blocks used in the primer extension reaction also may be labeled with a detectable label. The detectable label may be a radioisotope, a chromophore, a fluorophore, an enzyme, or a reactive group. Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 describes how repeated cycles of melting, annealing and primer extension can be used to generate multiple copies of a single-stranded DNA.

Figure 2 describes how the cleavage of the uracil base followed by heat or basic pH treatment fragments the DNA.

Figure 3 describes how the fragmented DNA can be captured by a probe attached to a microarray surface.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In accordance with the present invention, novel methods are provided for the generation of gene fragments and other DNA fragments. In particular, the methods provide gene fragments in a quantity and form suitable for gene expression analysis. In general, the methods involve an amplification process, wherein at least one modifiable nucleotide is randomly incorporated into the amplified nucleic acid product at some sites in place of one of the constituent nucleotides of the nucleic acid. This is achieved by carrying out the amplification using a combination of the four constituent nucleotides, plus an amount of a modifiable nucleotide that can be incorporated during amplification in place of one of the constituent nucleotides ("the corresponding constituent nucleotide"). The concentration of the modifiable nucleotide typically is lower than that of the corresponding constituent nucleotide and, accordingly, the modifiable nucleotide is present in only a fraction of the possible sites in the amplified product. The modifiable nucleotide is distributed in stochastic fashion between all the possible sites in the amplified product. The proportion of the possible sites that the modifiable nucleotide occupies in the amplified product can be varied by adjusting the proportion of the amount of the modifiable nucleotide in the nucleotide mixture used for the amplification reaction. The amplified product is then treated under conditions that modify the modifiable nucleotide(s) in such a way that the nucleic acid strand can be cleaved at sites containing the modified nucleotide. This method provides fragmentation patterns that are reproducible because fragmentation occurs only at sites containing the modified nucleotide. The stochastic nature of the distribution of the modifiable nucleotide in the original population of the amplified product means that the fragmentation patterns in a given population also should be reproducible.
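The stochastic incorporation and site-specific cleavage described above can be illustrated with a small simulation. The following Python sketch is an illustration only, under simplifying assumptions that are not stated in the source: each T position is replaced by U independently with a fixed probability, every incorporated U is converted and cleaved, and the abasic residue is dropped from the resulting fragments. The function names, the probability value and the example sequence are hypothetical.

```python
import random


def incorporate_du(sequence, p_du, rng):
    """Return a copy of `sequence` in which each T is replaced by U with probability p_du."""
    return "".join(
        "U" if base == "T" and rng.random() < p_du else base
        for base in sequence
    )


def cleave_at_du(strand):
    """Split the strand at every U (assumed fully converted and cleaved); drop empty pieces."""
    return [piece for piece in strand.split("U") if piece]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    template = "ATGCTTAGCTTACGATTGCATTAGC" * 4  # hypothetical 100-nt extension product
    for copy_number in range(3):
        strand = incorporate_du(template, p_du=0.2, rng=rng)
        print(copy_number, [len(fragment) for fragment in cleave_at_du(strand)])
```

In this model every cut falls at a T position, so the set of possible fragment ends is fixed by the sequence, while the particular sites that are cut vary from one copy to the next.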

The present methods can be used in any method of primer-extension reaction or primer-driven amplification that currently is known or that will be developed in the future. For example, the methods can be used in a polymerase chain reaction (PCR), or in an isothermal amplification method such as NASBA. Preferably, however, for methods of gene expression analysis the amplification is a linear amplification method, such as single primer PCR (referred to hereinafter as one-sided amplification ("OSA")). In such a method a single primer is used, which anneals to a target nucleic acid. Chain extension then occurs using a thermostable polymerase, followed by another cycle of melting, annealing and chain extension. This process is illustrated in Figure 1. The product of the amplification increases linearly with the number of amplification cycles, rather than exponentially as occurs in a conventional two-primer PCR. By using a linear amplification method with cDNAs derived from cellular mRNA, nucleic acids corresponding to transcripts that are rare in the cellular mRNA are preserved. By contrast, use of exponential PCR amplification often results in loss of rare transcripts, as they are amplified very poorly compared to more abundant transcripts.
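The difference between linear and exponential accumulation can be sketched numerically. The Python example below is a simplified model and not part of the described method: it assumes a constant per-cycle efficiency for each transcript, ignores plateau effects, and uses hypothetical copy numbers and efficiencies. Under these assumptions it shows that a small efficiency difference distorts the abundant-to-rare ratio by a constant factor in one-sided amplification, but by a factor that grows with every cycle in two-primer PCR.

```python
def linear_product(template_copies, cycles, efficiency=1.0):
    """Product copies from one-sided amplification: only the original template is copied each cycle."""
    return template_copies * efficiency * cycles


def exponential_product(template_copies, cycles, efficiency=1.0):
    """Total copies after two-primer PCR: products also serve as templates in later cycles."""
    return template_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    abundant, rare = 1000, 10          # hypothetical starting copy numbers (true ratio 100:1)
    eff_abundant, eff_rare = 0.9, 0.8  # hypothetical per-cycle efficiencies
    for cycles in (10, 20, 30):
        lin = linear_product(abundant, cycles, eff_abundant) / linear_product(rare, cycles, eff_rare)
        exp = exponential_product(abundant, cycles, eff_abundant) / exponential_product(rare, cycles, eff_rare)
        print(f"{cycles} cycles: linear ratio {lin:.1f}, exponential ratio {exp:.1f}")
```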

In a particular embodiment, the amplification products generated according to the present invention incorporate a modifiable nucleotide which is a substrate for a base-removing enzyme. The modifiable nucleotide is incorporated into the amplification product randomly during the linear amplification process. The amplification products containing the modifiable nucleotide are treated with an enzyme that removes the base moiety from the incorporated modifiable nucleotide to generate abasic sites. The nucleic acid then can be cleaved at the abasic site by treatment with high temperature or basic pH. In a particular embodiment, deoxyuridine triphosphate (dUTP) can be added to the conventional nucleotide mixture of dTTP, dATP, dCTP, and dGTP. During an amplification reaction dU sometimes will be inserted into the amplified product in place of dT. The amount of dU that is inserted relative to the amount of dT is determined by the relative concentrations of dUTP and dTTP, and also by the relative efficiency with which the polymerase enzyme used for the amplification reaction incorporates the two nucleotides. The amplified product is treated with uracil N-glycosylase (UNG, sometimes referred to as UDG) to create abasic sites wherever dU is incorporated. The nucleic acid then is cleaved by subjecting it to high temperature for a brief period, for example, at 95 °C for about 10 minutes. The resulting fragments may be used, for example, for gene expression analysis as described below. The amount of fragmentation can be varied not only by modifying the proportion of dUTP used in the amplification reaction, but also by varying the amount or activity of the UNG enzyme. The skilled artisan readily is aware of methods to optimize the amount of fragmentation by varying the dUTP concentration in the amplification reaction and/or by varying the UNG activity and/or concentration.

Methods of carrying out a conventional polymerase chain reaction where dTTP is completely omitted in favor of dUTP, and of cleaving the resulting DNA with UNG, are described in US Patent No. 5,945,313 ("the '313 patent"), which is hereby incorporated by reference in its entirety. The '313 patent does not teach or suggest methods where only a portion of the dT residues are replaced by dU, as in one embodiment of the present invention.

In a preferred embodiment, the nucleic acid to be amplified and fragmented is a cDNA or mixture of cDNAs prepared by reverse transcription from cellular mRNA. Methods for preparing cDNA from mRNA are well known in the art. The cDNA can be prepared using random primers for priming the reverse transcription, but for gene expression analysis, priming from the 3' end of the mRNA is preferred. This method provides a more reproducible cDNA sample from cellular mRNA. In a particular embodiment, a double stranded cDNA is prepared using standard methods, and the sense strand (second strand) of the double stranded cDNAs is used as the template, with a poly(dT)-containing primer that binds to the poly-dA tail of the sense strand. This means that, for gene expression analysis, the amplified and fragmented nucleic acids provided by the methods according to the invention correspond to the antisense strand of the gene fragment. In another embodiment, the poly(dT)-containing primer contains a 5' extension that provides a site for binding of a specific primer. That primer binding site then can be used as the primer recognition sequence for the amplification reaction.

In another embodiment, the amplification products according to the present invention incorporate a detectable label that may be used to detect the product during subsequent use, for example, in gene expression analysis. The label can be any label that is known in the art, for example, a radioisotope, fluorescent, or chemiluminescent label. The label can be contained within the nucleotide building blocks used for the amplification reaction; for example, a carbon-14, sulfur-35, phosphorus-32, phosphorus-33 or tritium-labeled nucleotide can be used. Alternatively, the label can be contained within the primer used for the amplification reaction. Inclusion of the label within the primer, and especially placement of the label in a 5' extension of the primer that does not hybridize to the target nucleic acid, allows use of labels that otherwise would interfere with the amplification reaction. In another embodiment, the label can be a moiety such as biotin that allows subsequent isolation of the amplified nucleic acid or fragmented nucleic acid, or that permits detection by using a detectable reagent that binds to the label. For example, labeled streptavidin can be used to isolate or detect a biotinylated nucleic acid.

Isolation of mRNAs and Synthesis of Double-stranded cDNAs

The target mRNA population for the practice of this invention may be isolated from a cellular source using many available methods well-known in the art. The Chomczynski method, e.g., isolation of total cellular RNA by the guanidine isothiocyanate method (described in U.S. Pat. No. 4,843,155), used in conjunction with, for example, oligo-dT streptavidin beads, is an exemplary mRNA isolation protocol. The mRNAs are converted to cDNA by reverse transcriptase, e.g., by poly(dT)-primed first strand cDNA synthesis by reverse transcriptase, followed by second strand synthesis using a DNA polymerase such as DNA Polymerase I, and treatment with S1 nuclease, using any of many protocols, and any variations thereof, well-known to the skilled artisan. For a general description of these methods, see Sambrook et al., 1989, Molecular Cloning - A Laboratory Manual, 2nd ed., Vol. 1-3; and Ausubel et al., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.Y. When desired, the skilled artisan will recognize that primers specific for gene families can be used to provide cDNA mixtures containing a desired gene family. For example, it is known that G-protein coupled receptors contain regions of conserved sequence that can be used to design primers or primer mixtures that allow selective isolation of cDNAs encoding the receptors.

In preparing the first strand cDNA, the primer is contacted with the mRNA with a reverse transcriptase and other reagents necessary for primer extension under conditions sufficient for first strand cDNA synthesis, where additional reagents include: dNTPs; buffering agents, e.g. Tris-Cl; cationic sources, both monovalent and divalent, e.g. KCl, MgCl₂; RNase inhibitor and sulfhydryl reagents, e.g. dithiothreitol; and the like. A variety of enzymes, usually DNA polymerases, possessing reverse transcriptase activity can be used for the first strand cDNA synthesis step. Examples of suitable DNA polymerases include the DNA polymerases derived from organisms selected from the group consisting of thermophilic bacteria and archaebacteria, retroviruses, yeasts, insects, primates and rodents. Preferably, the DNA polymerase will be selected from the group consisting of Moloney murine leukemia virus (M-MLV) and M-MLV reverse transcriptase lacking RNaseH activity, human T-cell leukemia virus type I (HTLV-I), bovine leukemia virus (BLV), Rous sarcoma virus (RSV), human immunodeficiency virus (HIV) and Thermus aquaticus (Taq) or Thermus thermophilus (Tth), avian reverse transcriptase, and the like.

Suitable DNA polymerases possessing reverse transcriptase activity may be isolated from an organism, obtained commercially or obtained from cells which express high levels of cloned genes encoding the polymerases by methods known to those of skill in the art, where the particular manner of obtaining the polymerase will be chosen based primarily on factors such as convenience, cost, availability and the like. Of particular interest because of their commercial availability and well characterized properties are avian reverse transcriptase and M-MLV.

The order in which the reagents are combined may be modified as desired. One protocol that may be used involves the combination of all reagents except for the reverse transcriptase on ice, then adding the reverse transcriptase and mixing at around 4°C. Following mixing, the temperature of the reaction mixture is raised to 37°C, followed by incubation for a period of time sufficient for first strand cDNA primer extension product to form, usually about 1 hour.

First strand synthesis produces an mRNA/cDNA hybrid, which is then converted to double-stranded (ds) cDNA. Conversion of the mRNA/cDNA hybrid to ds DNA can be accomplished using a number of different techniques. One such technique is self-priming, described by Efstratiadis et al., Cell (1976) 7:279; Higuchi et al., Proc. Natl. Acad. Sci. (1976) 73:3146; Maniatis et al., Cell (1976) 8:163 and Rougeon and Mach, Proc. Natl. Acad. Sci. (1976) 73:3418, in which the hybrid is denatured, e.g. by boiling or hydrolysis of the mRNA, and the first strand cDNA is allowed to form a hairpin loop and self prime the second strand cDNA.

Alternatively, the method introduced by Okayama and Berg, Mol. Cell Biol. (1982) 2:161 and modified by Gubler and Hoffman, Gene (1983) 25:263 may be employed, in which the hybrid is used as a template for nick translation. Alternatively, one may use terminal transferase to introduce a second primer hybridization site at the 3' termini of the first strand, as described by Rougeon et al., Nucleic Acids Res. (1975) 2:2365 and Land et al., Nucleic Acids Res. (1981) 9:2251.

The second strand cDNA of the resultant ds cDNA is also called the sense strand, i.e. it comprises a sequence of nucleotide residues substantially, if not completely, identical to the mRNA, with the exception of Ts substituted for Us. In addition, the second strand cDNA may also contain additional sequences of nucleotides at its 3' end which are present as a result of the particular primer used to synthesize first strand cDNA. These additional sequences are advantageous for subsequent processes or manipulations.

Linear Amplification of the cDNA

Following production of ds cDNA from the initial mRNA, the ds cDNA is then linearly amplified, or asymmetrically amplified, with a primer under conditions that produce a sufficient amount of amplified product.

In a preferred embodiment, the ds cDNA is first denatured and the second strand, i.e. sense strand, cDNA is used as a template for synthesis of a DNA primer extension product via a one-sided amplification reaction. A preferred primer suitable for this embodiment is one that binds to the 3'-end polyadenylate tail of the second strand cDNA. The skilled artisan will recognize that any primer that binds to the 3'-end of the sense strand cDNA may be used. In addition, random priming techniques well-known in the art may also be used. In another embodiment, primers that are gene specific, or that are specific for gene families can be used. For example, primers can be used that recognize conserved regions within a cDNA encoding a desired family of proteins. Thus, genes encoding antibodies contain conserved regions that can be recognized by primers or degenerate primer mixtures. Amplification using such primers will then specifically provide amplified products only for those desired genes. Alternatively, a mixture of gene specific primers can be used to amplify only selected genes, even where those genes are not related. For example, a set or sets of distinct primers could be used that amplify only those genes known to be involved in the cellular reaction to a known toxin. The skilled artisan will recognize that the ability to amplify a desired gene sequence is limited only by the ability to design a primer that specifically will amplify that sequence.

Linear amplification, or asymmetric amplification, according to the present invention is a variation of the polymerase chain reaction in which only a single primer complementary to at least a portion of the 3' terminus of a template DNA is employed. The polymerase chain reaction (PCR), as well as devices and reagents for use in performing PCR, are well-known to the skilled artisan. PCR is described in U.S. Pat. Nos. 4,683,202; 4,683,195; 4,800,159; 4,965,188 and 5,512,462, the disclosures of which are herein incorporated by reference.

According to the present invention, the sense strand cDNA is used as the template for amplification with a primer that binds to the 3'-end of the sense strand cDNA. A reaction mixture is provided in which the primer binds to the template and at least one round of primer extension is carried out, resulting in the desired primer extension product.

Typically the sense strand DNA will be used in a plurality of rounds or cycles of primer extension product synthesis, where by plurality is meant at least 2, and usually at least 5, more usually at least 10 and typically at least 20 cycles. The rounds of amplification can continue for 50 or more cycles if desired.
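Because the product of one-sided amplification grows in proportion to the number of cycles, the cycle count needed for a given yield can be estimated with simple arithmetic. The Python sketch below assumes an idealized model that the source does not specify: every cycle yields the same amount of product from the original template, scaled by a hypothetical constant `efficiency`. The function name and the example amounts are illustrative only.

```python
import math


def cycles_needed(template_amount, target_amount, efficiency=1.0):
    """Smallest whole number of primer extension cycles giving at least target_amount of product.

    Assumes each cycle yields template_amount * efficiency of new product
    (linear accumulation, same units for both amounts).
    """
    if template_amount <= 0 or target_amount <= 0 or not 0 < efficiency <= 1:
        raise ValueError("amounts must be positive and efficiency must be in (0, 1]")
    return math.ceil(target_amount / (template_amount * efficiency))


if __name__ == "__main__":
    # Hypothetical example: 20-fold excess of product over template at full efficiency.
    print(cycles_needed(template_amount=1.0, target_amount=20.0))
```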

The enzymatic extension is carried out in the presence of a DNA polymerase, dNTPs, and suitable buffering and other reagents necessary or desirable for optimal synthesis of primer extension product, as are known in the art. A variety of different polymerases are known and may be used in the synthesis of this primer extension product. Suitable polymerases include: E. coli DNA polymerase I (holoenzyme), Klenow fragment, T4 and T7 encoded polymerases, modified bacteriophage T7 DNA polymerase (SEQUENASE™), as well as thermostable DNA polymerases, such as Taq DNA polymerase and AMPLITAQ™. Since thermal cycling is typically used in this portion of the method, a thermostable DNA polymerase is preferably employed for the synthesis of this second captureable primer extension product, where Taq DNA polymerase is representative of suitable thermostable polymerases. Buffers and other requisite reagents for performing PCR as described above are well known to those of skill in the art. Following synthesis, the primer extension product, or amplification product, may or may not be purified, or separated from the remaining components of the reaction mixture, e.g. dNTPs, polymerase, and the like. If a need exists, purification may be by any of many well-known methods for nucleic acid purification.

Incorporation of Modified Nucleotides into the Amplification Product

In accordance with the present invention, a modifiable nucleotide is incorporated into the primer extension product at some sites in place of one of the constituent nucleotide building blocks of the nucleic acid. The modifiable nucleotide, when incorporated into primer extension products, can be reacted to provide a modified nucleotide, such that the nucleic acid may be cleaved at the site of the modified nucleotide. The modifiable nucleotide can be any nucleotide compound that can be reacted to provide a modified nucleotide that facilitates cleavage of the nucleic acid. For example, the nucleotide may be photolabile, such that photolysis provides the modified nucleotide. Alternatively, the modifiable nucleotide may be thermolabile or pH sensitive. In a preferred embodiment, the modifiable nucleotide is one that is a substrate for a base-removing enzyme, such that when the primer extension products containing the modifiable nucleotide are treated with the enzyme, the base from the incorporated modifiable nucleotide is removed and an abasic site is generated. These abasic sites are subject to breaking when treated with high temperature or basic pH. It is well-known to the skilled artisan that there are many suitable modifiable nucleotides for the practice of the present invention. A preferred embodiment utilizes dUTP as the modifiable nucleotide which, when incorporated into the primer extension product, is a substrate for the base-removing enzyme uracil N-glycosylase.

According to another embodiment, 5-hydroxy-2'-deoxycytidine triphosphate or 5-hydroxy-2'-deoxyuridine triphosphate also may be used as the modifiable nucleotide, and the base-removing enzyme may be E. coli endonuclease III or formamidopyrimidine DNA N-glycosylase. The skilled artisan will recognize that other repair enzymes of this type are known in the art and may be used in the present invention.

The incorporation of the modifiable nucleotide into the primer extension products may be achieved by including a suitable modifiable precursor nucleotide mixed in the amplification reaction with the normal DNA precursor nucleotides dATP, dGTP, dCTP and dTTP ("the dNTPs"). In general, the modifiable nucleotide is an analog of one of the dNTPs ("the native analog") and is incorporated into the primer extension product by the DNA polymerase. For example, dUTP and 5-hydroxy-2'-deoxyuridine triphosphate both are analogs of dTTP, and 5-hydroxy-2'-deoxycytidine triphosphate is an analog of dCTP, and their use will replace dTTP and dCTP, respectively, in the primer extension products.

The amount of modifiable nucleotide in the primer extension product can be controlled by the ratio of the modifiable nucleotide to its native analog. For example, when dUTP is used as the modifiable nucleotide, a reaction mixture may contain suitable amounts of dATP, dGTP and dCTP in equal molar concentrations, together with dUTP and dTTP. The skilled artisan, however, would recognize that, even though the sum of the molar concentrations of dUTP and dTTP may be constant, the relative ratio of dUTP concentration to dTTP concentration can be varied.
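Holding the combined dUTP and dTTP concentration constant while varying their ratio is simple arithmetic, sketched below in Python. This is an illustrative helper only: the function name, the parts-style ratio input (for example 25/75) and the 200 µM total are assumptions chosen for the example and are not values taken from the source.

```python
def split_dutp_dttp(total_um, dutp_parts, dttp_parts):
    """Split a fixed dUTP + dTTP pool (in µM) according to a parts ratio such as 25/75.

    Returns (dUTP concentration, dTTP concentration); their sum equals total_um.
    """
    if total_um <= 0 or dutp_parts < 0 or dttp_parts < 0 or dutp_parts + dttp_parts == 0:
        raise ValueError("total must be positive; parts must be non-negative and not both zero")
    dutp = total_um * dutp_parts / (dutp_parts + dttp_parts)
    return dutp, total_um - dutp


if __name__ == "__main__":
    # Hypothetical 200 µM pool, matching a 200 µM concentration of each other dNTP.
    for parts in ((1, 99), (25, 75), (50, 50)):
        dutp, dttp = split_dutp_dttp(200.0, *parts)
        print(f"{parts[0]}/{parts[1]}: dUTP {dutp:.1f} µM, dTTP {dttp:.1f} µM")
```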

According to an embodiment of the present invention, the ratio between the molar concentration of the modifiable nucleotide and the molar concentration of its native analog ranges from 1/99 to 99/1. This ratio is usually between 10/90 and 90/10, often between 25/75 and 75/25, and preferably between 30/70 and 70/30. The skilled artisan will recognize that this ratio is chosen so that in the amplification process the modifiable nucleotide is occasionally incorporated in place of its native analog. Frequently, the native analog is incorporated by the DNA polymerase at a much higher efficiency than the modifiable nucleotide. If, for example, the ratio between dUTP and dTTP in the reaction mixture is about 25%, the ratio of dU to dT in the product DNA typically is only about 5%. This results in a population of products bearing a low level of modifiable nucleotide residues randomly distributed throughout the amplified molecules. Because the amount of modifiable nucleotide incorporated in the primer extension product (together with the activity and concentration of the base-cleaving enzyme) determines the length of the final fragments generated after modification and cleavage, a person skilled in the art will adjust the ratio of the molar concentration of the modifiable nucleotide to the molar concentration of its native analog. Such ratio adjustment and/or optimization may be made by the skilled artisan according to procedures well-known in the art.

Treatment of the primer extension products with base-cleaving enzymes and cleavage of abasic sites

Treatment of the primer extension products with a suitable base-cleaving enzyme and cleavage of the abasic sites results in cleavage of the molecules at the position of incorporation of the modifiable nucleotide residues. For example, treatment of DNA containing uracil bases with uracil DNA glycosylase results in cleavage of the glycosidic bond between the deoxyribose of the DNA sugar-phosphate backbone and the uracil base. The loss of the uracil creates an apyrimidinic site in the DNA (Schaaper, R., et al., Proc. Natl. Acad. Sci. USA 80:487 (1983)). The DNA sugar-phosphate backbone that remains after UNG cleavage of the glycosidic bond can then be cleaved by endonuclease IV, alkaline hydrolysis, tripeptides containing aromatic residues between basic ones such as Lys-Trp-Lys and Lys-Tyr-Lys (Pierre et al., J. Biol. Chem. 256:10217-10226 (1981)) and the like. Because the modifiable nucleotide is incorporated randomly in place of its native analog, different primer extension products having different nucleotide sequences will be cleaved at different points depending on where the modifiable nucleotide residues are incorporated. The skilled artisan will recognize that the length of the resulting fragments also will be determined by the extent of the modifying reaction that occurs at the modifiable nucleotides. For example, when the modifiable nucleotide is dU, the extent of the fragmentation can be controlled by varying the concentration of UNG, so that only a portion of the dU residues are modified.
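The two controls on fragment length described in this document, the dUTP:dTTP ratio in the reaction mixture and the extent of the UNG reaction, can be combined in a rough estimate. The Python sketch below is a simplified model and not the method of the invention: it assumes that the dU:dT ratio in the product scales with the ratio in the mixture by a constant `relative_efficiency` (0.2 here, back-calculated from the figures of about 25% in the mixture and about 5% in the product), that T sites make up a hypothetical quarter of the sequence, that sites are independent, and that every abasic site is cleaved. All function and parameter names are illustrative.

```python
def du_fraction_in_product(mix_ratio, relative_efficiency):
    """Fraction of dT-or-dU sites that carry dU.

    mix_ratio: dUTP:dTTP molar ratio in the reaction mixture.
    relative_efficiency: assumed polymerase preference for dUTP relative to dTTP.
    """
    product_ratio = mix_ratio * relative_efficiency  # dU:dT ratio in the product
    return product_ratio / (1.0 + product_ratio)


def mean_fragment_length(du_fraction, t_frequency, ung_conversion=1.0):
    """Mean spacing (nt) between cuts, assuming independent sites and complete backbone cleavage."""
    cut_probability = du_fraction * t_frequency * ung_conversion
    if cut_probability <= 0:
        raise ValueError("no cleavable sites under these settings")
    return 1.0 / cut_probability


def required_ung_conversion(target_length, du_fraction, t_frequency):
    """Fraction of dU residues UNG must convert to reach a target mean fragment length."""
    fraction = 1.0 / (target_length * du_fraction * t_frequency)
    if fraction > 1.0:
        raise ValueError("target is shorter than complete conversion can give")
    return fraction


if __name__ == "__main__":
    du = du_fraction_in_product(mix_ratio=0.25, relative_efficiency=0.2)
    print(f"dU fraction of T sites: {du:.3f}")
    print(f"mean fragment, full conversion: {mean_fragment_length(du, 0.25):.0f} nt")
    print(f"UNG conversion for 200-nt mean: {required_ung_conversion(200, du, 0.25):.2f}")
```

Under these assumptions, lowering either the dU content or the fraction of dU residues converted by UNG lengthens the average fragment, which matches the qualitative behavior described above.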

Incorporation of Labels into the Amplification Product

According to a preferred embodiment of the invention, the primer extension products generated are labeled, by any of many methods well-known in the art, with a marker for easy detection. The labeled fragments are particularly desired for many purposes in biotechnology, such as for the analysis of gene expression patterns and determination of DNA polymorphism.

As used herein, the terms "label" and "labeled" refer to incorporation of a detectable marker, e.g., by incorporation of a radioactively or nonradioactively labeled nucleotide. Various methods of labeling polynucleotides are known in the art and may be used. Labeling of the primer extension product according to the present invention may be achieved by incorporating a marker-labeled nucleotide into the primer extension product, in a manner similar to the incorporation of modifiable nucleotides into the primer extension product described hereinbefore. A large portion of the labeling methods currently in use are radioactive, and the labels can be obtained from a wide variety of commercial sources. Examples of radiolabels include, but are not restricted to, ³²P, ³H, ¹⁴C, or ³⁵S.

A large number of convenient and sensitive non-isotopic markers are also available. In general, all of the non-isotopic methods of detecting hybridization probes that are currently available depend on some type of derivatization of the nucleotides to allow for detection, whether through antibody binding, enzymatic processing, or the fluorescence or chemiluminescence of an attached "reporter" molecule. Primer extension products labeled with non-radioactive reporters incorporate single or multiple molecules of the labeled nucleotide that contains the reporter molecule, generally at specific cyclic or exocyclic positions.

Techniques for attaching reporter groups have largely relied upon (a) functionalization of the 5' or 3' termini of the monomeric nucleosides by numerous chemical reactions (see Cardullo et al. (1988) Proc. Nat'l. Acad. Sci. 85:8790-8794); and (b) synthesis of modified nucleosides containing (i) protected reactive groups, such as NH₂, SH, CHO, or COOH, (ii) activatable monofunctional linkers, such as NHS esters, aldehydes, or hydrazides, or (iii) affinity binding groups, such as biotin, attached to either the heterocyclic base or the furanose moiety.

According to one aspect of the invention, the labeled nucleotides are labeled with fluorogens. Examples of fluorogens include fluorescein and derivatives, isothiocyanate, dansyl chloride, phycoerythrin, allo-phycocyanin, phycocyanin, rhodamine, Texas Red™, SYBR-Green™ or other proprietary fluorogens. The fluorogens are generally attached by chemical modification. The fluorogens can be detected by a fluorescence detector.

In a preferred embodiment, the labeled nucleotide can alternatively be labeled with a chromogen to provide an enzyme or affinity label. For example, the nucleotide may have biotinyl moieties that can be detected by marked avidin (e.g., streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical or colorimetric methods). The probe can be labeled with peroxidase, alkaline phosphatase or other enzymes giving a chromogenic or fluorogenic reaction upon addition of substrate. For example, additives such as 5-amino-2,3-dihydro-1,4-phthalazinedione (also known as LUMINOL™) (Sigma Chemical Company, St. Louis, Mo.) and rate enhancers such as p-hydroxybiphenyl (also known as p-phenylphenol) (Sigma Chemical Company, St. Louis, Mo.) can be used to amplify the signal from enzymes such as horseradish peroxidase through a luminescent reaction; and luminogenic or fluorogenic dioxetane derivatives of enzyme substrates can also be used.

Usually, the labeled binding component comprises a direct label, such as a fluorescent label, radioactive label, or enzyme-conjugated label that catalyzes the conversion of a chromogenic substrate to a chromophore. However, it is possible, and often desirable for signal amplification, for the labeled binding component to be detected by at least one additional binding component that incorporates a label. Signal amplification can be accomplished by layering of reactants where the reactants are polyvalent.

The following examples are given to illustrate the present invention. It should be understood that the invention is not to be limited to the specific conditions or details described in these examples. Throughout the specification, any and all references to publicly available documents are specifically incorporated by reference.

Example 1: One-sided amplification of cellular cDNA from Jurkat cells

Total cellular mRNA was isolated from Jurkat cells, and double-stranded cDNA was prepared by standard methods using a primer containing 24 dT residues, with a T7 promoter at the 5' end that subsequently was used for priming the amplification reaction. The resulting cDNA was used in a PCR reaction mixture as shown below to test the effects of introducing varying percentages of dUTP in place of dTTP. The primer was directed to the T7 promoter region contained within the original primer used for reverse transcription. Reaction mixtures were set up as follows:

The amplification conditions were as follows: 50 cycles of 95°C for 30 seconds, followed by 60°C for two minutes. After completion of the reaction, 5 μl of each reaction mixture was removed for later analysis. To the remainder of the reaction mixture, 0.5 μl of UNG (Perkin Elmer) was added, followed by incubation at 37°C for 30 minutes. Each sample was then heated to 95°C for 10 minutes, and 10 μl samples were removed for analysis. The lanes of a 1.2% agarose gel were loaded with 5 μl (non-UNG treated) or 10 μl (UNG-treated) reaction mixtures, followed by electrophoresis under standard conditions. The non-UNG treated samples showed, as expected, an average molecular weight that was significantly higher than that of UNG-treated samples. Moreover, also as expected, the average molecular weight of the samples in the UNG-treated tubes decreased with increasing percentage of dUTP.
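The arithmetic behind the cycling program of Example 1 can be sketched as follows. The Python code is illustrative only and rests on assumptions that are labeled in the comments: it ignores instrument ramp times, and it treats one-sided (single-primer) extension as producing at most one new strand per template per cycle, so that product accumulates linearly with cycle number. The function names and the `efficiency` parameter are hypothetical.

```python
def total_hold_time_minutes(cycles, step_seconds):
    """Total programmed hold time, ignoring ramp times between temperatures."""
    return cycles * sum(step_seconds) / 60.0


def linear_copies(cycles, efficiency=1.0):
    """Copies per template for single-primer extension (assumed linear accumulation).

    efficiency is an assumed per-cycle extension yield between 0 and 1.
    """
    return cycles * efficiency


def exponential_copies(cycles, efficiency=1.0):
    """Copies per template for conventional two-primer PCR, shown for comparison."""
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    cycles = 50
    steps = [30, 120]  # 95°C for 30 seconds, then 60°C for two minutes
    print(f"hold time: {total_hold_time_minutes(cycles, steps):.0f} min")
    print(f"linear yield per template: {linear_copies(cycles):.0f}")
    print(f"two-primer yield after 10 cycles, for comparison: {exponential_copies(10):.0f}")
```

With the stated program, the hold time alone is 125 minutes, and an ideal one-sided reaction yields on the order of 50 single-stranded copies per template, in proportion to the amount of starting template.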

The invention has been disclosed broadly and illustrated in reference to representative embodiments described above. Those skilled in the art will recognize that various modifications can be made to the present invention without departing from the spirit and scope thereof.

Claims

What is claimed is:

1. A method for generating a modified target nucleic acid molecule, comprising

(a) carrying out a primer extension step on a target molecule in the presence of a modifiable nucleotide to generate a primer extension product which is complementary to the target nucleic acid and which incorporates the modifiable nucleotide in a stochastic distribution amongst the possible sites of incorporation of said modifiable nucleotide;

(b) contacting said primer extension product with a reagent that converts at least one modifiable nucleotide into a modified nucleotide, whereby said primer extension product is susceptible to cleavage at the site of the modified nucleotide, and

(c) cleaving the primer extension product at the site of at least one modified nucleotide.

2. The method according to claim 1, wherein the target nucleic acid molecule is a cDNA.

3. The method according to claim 2, wherein said cDNA comprises a 3'-end polyadenylate tail and the primer hybridizes to said 3'-end polyadenylate tail.

4. The method according to claim 1, wherein said modifiable nucleotide is dUTP.

5. The method according to claim 4, wherein said modifiable nucleotide is converted into said modified nucleotide by treatment with uracil N-glycosylase.

6. The method according to claim 1, wherein said modifiable nucleotide is 5-hydroxy-2'-deoxycytidine triphosphate and wherein said modifiable nucleotide is converted into said modified nucleotide by treatment with E. coli endonuclease III.

7. The method according to claim 1, wherein said modifiable nucleotide is 5-hydroxy-2'-deoxyuridine triphosphate, and wherein said modifiable nucleotide is converted into said modified nucleotide by treatment with formamidopyrimidine DNA N-glycosylase.

8. The method according to claim 1, wherein said primer extension step is repeated at least 5 times.

9. The method according to claim 1, wherein said primer extension step is repeated at least 10 times.

10. The method according to claim 4, wherein the concentration of dUTP is about 25% of that of dTTP.

11. The method according to claim 1, wherein said cleavage step (c) is via treatment at a sufficiently elevated temperature for a time sufficient to cleave at least about 95% of said nucleic acid molecules.

12. The method according to claim 11, wherein said elevated temperature is about 95°C.

13. The method according to claim 1, wherein said cleavage step (c) is via treatment at a sufficiently elevated pH for a time sufficient to cleave at least about 95% of said nucleic acid molecules.

14. The method according to claim 1, wherein said primer is labeled with a detectable label.

15. The method according to claim 1, wherein the constituent nucleotides used in said primer extension reaction are labeled with a detectable label.

16. The method according to claim 14, wherein said detectable label is a radioisotope, a chromophore, a fluorophore, an enzyme, or a reactive group.

17. The method according to claim 15, wherein said detectable label is a radioisotope, a chromophore, a fluorophore, an enzyme, or a reactive group.