WO1999028494A1 - Methods of using probes for analyzing polynucleotide sequence

Info

Publication number: WO1999028494A1
Application number: PCT/US1998/025664
Authority: WO
Inventors: Seth Taylor; Staf Van Cauter
Original assignee: Packard Bioscience Company
Priority date: 1997-12-04
Filing date: 1998-12-03
Publication date: 1999-06-10
Also published as: AU1623899A

Abstract

A method of detecting the presence of, or identifying, a genetic event comprising: providing a target nucleic acid having a genetic event; providing a plurality of nucleotide probes wherein each probe: (a) is complementary to a region immediately 5' to the site of the genetic event and complementary to a region immediately 3' to the site of the genetic event, (b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which identification region is not repeated in another probe in the plurality, i.e., it is unique in the plurality, and is complementary to the target sequence, and (c) wherein the nucleotide at the position corresponding to the site of the genetic event can be chosen from A, T, G or C, wherein the plurality of nucleotide probes includes a first probe having a first base at the position corresponding to the site of the genetic event and a second probe having a second (i.e., different from the first base) base at the position corresponding to the site of the genetic event; contacting the target with the plurality of probes under conditions wherein only probes having complementarity with the target nucleic acid will hybridize; thereby analysing the target nucleic acid.

Description

METHODS OF USING PROBES FOR ANALYZING POLYNUCLEOTIDE SEQUENCE

BACKGROUND OF THE INVENTION

The invention relates to a method of analyzing a polynucleotide sequence.

SUMMARY OF THE INVENTION

In general, the invention features a method of analyzing a polynucleotide sequence in a sample. The method includes: providing a sample which includes a polynucleotide sequence to be analyzed and at least one extraneous nucleic acid molecule; providing a plurality of selector probes wherein each selector probe is complementary to a region of the polynucleotide sequence to be analyzed and is not complementary to the extraneous nucleic acid (selector probes can be complementary to different regions of the polynucleotide sequence); contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe but which do not allow hybridization of the extraneous nucleic acid to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe; selecting a product of that hybridization, e.g., the selector probe, the nucleic acid to be analyzed, or both; providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the plurality of selector probes; hybridizing the selected nucleic acid molecule (which does not include the extraneous nucleic acid molecule) with the array of capture probes, thereby detecting or identifying a selected nucleic acid molecule which bound to the polynucleotide sequence and thereby analyzing the polynucleotide sequence.

In preferred embodiments the polynucleotide sequence is: a DNA molecule: all or part of a known gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

In preferred embodiments the polynucleotide sequence is: an RNA molecule: nucleic acids derived from RNA transcripts; wild type RNA; mutant RNA, particularly a human RNA.

In preferred embodiments the polynucleotide sequence is: a human sequence; a non-human sequence, e.g., a mouse, rat, pig, primate.

In preferred embodiments, the target is anchored to a substrate and the addition of selector probes to the anchored target is repeated at least one time to increase the number of selector probes.

In preferred embodiments, the selector probes are anchored to a substrate and the addition of target to the anchored probes is repeated at least one time to increase the number of target molecules.

In preferred embodiments the polynucleotide sequence is coupled to a support prior to hybridizing with the selector probes.

In preferred embodiments the selector probes are coupled to a support prior to hybridizing with the target.

In preferred embodiments the method is performed: on a sample from a human subject; on a sample from a prenatal subject; as part of genetic counseling; to determine if the individual from which the target nucleic acid is taken should receive a drug or other treatment; to diagnose an individual for a disorder or for predisposition to a disorder; to stage a disease or disorder.

In preferred embodiments the selector probes are selected to maximize uniqueness, i.e., to minimize the number of non-unique selector probes. This can be done with computer-aided methods. This can be done prior to dividing the probes into pools, e.g., a computer-aided method can be used to select or locate a group of selector probes in which the frequency of repeated complementarity regions is minimized or reduced to zero. If there are no repeats there need be only one pool. If there is non-uniqueness, i.e., if a complementarity region is repeated, uniqueness can be obtained by placing the selector probes having the same complementarity region in separate pools.
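The pooling rule described above can be illustrated with a minimal sketch. It assumes each selector probe is represented by a name and the sequence of its complementarity region; the function name `assign_pools`, the probe names, and the sequences are hypothetical, and the greedy strategy is only one possible computer-aided approach, not a method stated in the text.

```python
def assign_pools(probes):
    """Greedily place probes into pools so that no complementarity
    region is repeated within any single pool.

    `probes` maps a (hypothetical) probe name to the sequence of its
    complementarity region. Returns a list of pools, each a dict
    mapping probe name to region.
    """
    pools = []
    for name, region in probes.items():
        for pool in pools:
            # Reuse the first existing pool that lacks this region.
            if region not in pool.values():
                pool[name] = region
                break
        else:
            # Every existing pool already holds this region: open a new pool.
            pools.append({name: region})
    return pools


# Hypothetical example data: p3 repeats the region of p1.
probes = {
    "p1": "ACGTAC",
    "p2": "TTGACC",
    "p3": "ACGTAC",
    "p4": "GGCATA",
}
pools = assign_pools(probes)
for index, pool in enumerate(pools):
    print(index, sorted(pool))
```

If no region is repeated, the sketch returns a single pool, matching the statement that only one pool is needed when there are no repeats.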

In preferred embodiments the selector probes are labeled so that hybridization to the array of capture probes can be followed. The label can be a fluorescent label. The selector probes of a first pool can be labeled with a first label and the selector probes of a second pool can be labeled with a second label. If only one label is used then pools must be hybridized separately to the array of capture probes.
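As a rough illustration of the labeling rule, the sketch below groups pools into hybridization rounds so that pools sharing a round always carry different labels. The generalization to any number of labels, the function name `hybridization_rounds`, and the pool and label names are assumptions made for illustration.

```python
def hybridization_rounds(pool_names, labels):
    """Group pools into rounds of array hybridization.

    Within a round every pool carries a distinct label, so pools in the
    same round can be told apart on the array. With a single label each
    pool ends up in its own round, i.e., pools are hybridized separately.
    """
    if not labels:
        raise ValueError("at least one label is required")
    rounds = []
    for start in range(0, len(pool_names), len(labels)):
        batch = pool_names[start:start + len(labels)]
        rounds.append(dict(zip(batch, labels)))
    return rounds


# Hypothetical pool and label names.
one_label = hybridization_rounds(["pool_1", "pool_2"], ["label_1"])
two_labels = hybridization_rounds(["pool_1", "pool_2"], ["label_1", "label_2"])
print(len(one_label), len(two_labels))
```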

In preferred embodiments the capture probes are single stranded probes in an array.

In preferred embodiments the capture probes have a structure comprising a double stranded portion and a single stranded portion in an array.

In preferred embodiments hybridization is detected by mass spectrometry, e.g., by MALDI-TOF mass spectrometry.

In preferred embodiments probes are selected for minimal cross-hybridization with other probes.

In preferred embodiments the target or the selector probe has attached thereto a first member of a proximity detector pair and hybridization to the array allows the first member to be brought into proximity with a second member to provide a signal.

In another aspect, the invention features a method of detecting a target sequence in a sample. The method includes: providing a sample containing a target sequence; providing a plurality of selector probes wherein each selector probe is complementary to the target sequence; contacting the target sequence with the plurality of selector probes under conditions wherein only selector probes having complete complementarity with the target sequence will hybridize; selecting selector probes that hybridize to the target sequence; providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the plurality of selector probes; hybridizing the selector probes which hybridize to the target sequence to the array of capture probes, thereby detecting or identifying a selector probe which bound to the target sequence and thereby detecting said target sequence in a sample.

In preferred embodiments the target sequence is: a DNA molecule: all or part of a known gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

In preferred embodiments the target sequence is: an RNA molecule: nucleic acids derived from RNA transcripts; wild type RNA; mutant RNA, particularly a human RNA.

In preferred embodiments the target sequence is: a human sequence; a non-human sequence, e.g., a mouse, rat, pig, primate.

In preferred embodiments the selector probes are labeled, e.g., so that hybridization to the array of capture probes can be followed. The label can be a fluorescent label. The selector probes of a first pool can be labeled with a first label and the selector probes of a second pool can be labeled with a second label. If only one label is used then pools must be hybridized separately to the array of capture probes.

In preferred embodiments probes are selected for minimal cross-hybridization with other probes.

In preferred embodiments the target or the selector probe has attached thereto a first member of a proximity detector pair and hybridization to the array allows the first member to be brought into proximity with a second member to provide a signal.

In another aspect, the invention features a method of analysing a target nucleic acid, e.g., of detecting the presence of, or identifying, a genetic event, e.g., a polymorphism, in a target nucleic acid, e.g., a DNA. The method includes: providing a target nucleic acid having a genetic event; providing a plurality of nucleotide probes wherein each probe:

(a) is complementary to a region immediately 5' to the site of the genetic event and complementary to a region immediately 3' to the site of the genetic event,

(b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which identification region is not repeated in another probe in the plurality, i.e., it is unique in the plurality, and is complementary to the target sequence, and

(c) wherein the nucleotide at the position corresponding to the site of the genetic event can be chosen from A, T, G, or C,

(optionally) wherein the plurality of nucleotide probes includes a first probe having a first base at the position corresponding to the site of the genetic event and a second probe having a second (i.e., different from the first base) base at the position corresponding to the site of the genetic event; contacting the target with the plurality of probes under conditions wherein only probes having complementarity with the target nucleic acid will hybridize; thereby analysing the target nucleic acid. The identification region can allow identification of the identity of a base at the genetic event.
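The probe set described above, in which the probes differ only at the position corresponding to the genetic event, can be sketched as follows. The sketch assumes DNA written 5' to 3', models stringent hybridization as an exact reverse-complement match, and keys each probe by the target base it would pair with; all function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def make_probe_set(flank_5, flank_3):
    """Build one probe per possible target base at the site of the event.

    Each probe is fully complementary to flank_5 + base + flank_3, so the
    four probes differ only at the position corresponding to the site.
    The result maps the target base to the probe sequence.
    """
    return {base: reverse_complement(flank_5 + base + flank_3) for base in "ATGC"}


def perfectly_matching_bases(target, probes):
    """Return the target bases whose probe pairs exactly with the target.

    Exact matching is a simplification of hybridization conditions under
    which only fully complementary probes hybridize.
    """
    return [base for base, probe in probes.items()
            if reverse_complement(probe) in target]


# Hypothetical flanking sequences and a target carrying G at the site.
flank_5, flank_3 = "GATTACA", "CCGGTTA"
probes = make_probe_set(flank_5, flank_3)
target = "TT" + flank_5 + "G" + flank_3 + "AA"
print(perfectly_matching_bases(target, probes))
```

Keying the dictionary by target base rather than probe base is a design choice for readability; the base actually present in the probe at that position is the complement of the key.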

In a particularly preferred embodiment the method includes: providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the identification region of a probe of the plurality of nucleotide probes; hybridizing the nucleotide probes which hybridize to the target to the array of capture probes, thereby detecting or identifying a nucleotide probe which bound to the target and thereby identifying the genetic event.
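A minimal sketch of reading out such a capture array is given here. It assumes the identification region is the 5' terminal stretch of each nucleotide probe, that array positions are simple integers, and that hybridization to a capture probe is an exact complementary match; the names `build_array` and `read_array` and all sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_array(probes, id_length):
    """Assign one array position per nucleotide probe.

    The capture region at each position is the reverse complement of the
    probe's 5' terminal identification region of length `id_length`.
    Returns {position: (probe_name, capture_region)}.
    """
    array = {}
    for position, (name, seq) in enumerate(probes.items()):
        array[position] = (name, reverse_complement(seq[:id_length]))
    return array


def read_array(array, selected_probe_seqs, id_length):
    """Return {position: probe_name} for positions that capture a selected probe."""
    selected_ids = {seq[:id_length] for seq in selected_probe_seqs}
    return {position: name for position, (name, capture) in array.items()
            if reverse_complement(capture) in selected_ids}


# Hypothetical probes; the 6-base identification region overlaps the variable base.
probes = {
    "site1_A": "TTAGCAGGTC",
    "site1_G": "TTGGCAGGTC",
    "site2_C": "CACCTGATGA",
    "site2_T": "CATCTGATGA",
}
array = build_array(probes, id_length=6)
# Suppose only these two probes hybridized to the target and were selected.
selected = ["TTGGCAGGTC", "CACCTGATGA"]
print(read_array(array, selected, id_length=6))
```

Because each capture region is unique, the position that gives a signal identifies the probe that bound the target, and the probe name then indicates the base at the site.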

The identification region is a naturally occurring sequence in the target molecule. A probe sequence is chosen such that it corresponds to a part of the target sequence which includes an identification region. The identifier region can be at least 2, 4, 6, 8, 10, or 12 or more base pairs in length. In preferred embodiments it is less than 8, 10, 12, 14, 20, or 30 base pairs in length. In preferred embodiments it is between 2 and 40, 4 and 40, 6 and 30, 6 and 20 or 6 and 10 base pairs long. It can overlap with or be independent of other regions of the probe discussed herein, e.g., the target position, or positions on the 3' or 5' thereof. The length will depend, in part, on the sequence of the region and on how many distinct identifier regions are required.
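The dependence of identifier length on the number of distinct regions required can be illustrated with a short sketch. It assumes the identifier region is taken from the 5' terminus of each probe and searches the 2 to 40 base range mentioned above for the shortest length at which all identifiers differ; the function name and the sequences are hypothetical.

```python
def shortest_unique_id_length(probe_seqs, min_len=2, max_len=40):
    """Return the smallest 5' terminal length at which every probe has a
    distinct identifier region, or None if no such length exists within
    the allowed range or within the probe lengths."""
    for length in range(min_len, max_len + 1):
        ids = [seq[:length] for seq in probe_seqs]
        if any(len(identifier) < length for identifier in ids):
            # At least one probe is shorter than the requested identifier.
            return None
        if len(set(ids)) == len(ids):
            return length
    return None


# Hypothetical probes that first become distinguishable at the fifth base.
example = ["ACGTAA", "ACGTCC", "ACTTAA"]
print(shortest_unique_id_length(example))
```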

In preferred embodiments the target nucleic acid is: a DNA molecule: all or part of a known gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

In preferred embodiments the genetic event is: a polymorphism, a single base pair change; a naturally occurring variant; an allele associated with a disease state; a mutation; an environmentally induced lesion.

In preferred embodiments the target nucleic acid is coupled to a support prior to hybridizing with the nucleotide probes.

In preferred embodiments the target nucleic acid is: a human sequence; a non-human sequence, e.g., a mouse, rat, pig, primate.

In preferred embodiments the nucleotide probes are selected to maximize uniqueness of the identification regions, i.e., to minimize the number of non-unique nucleotide probes. This can be done with computer-aided methods. This can be done prior to dividing the probes into pools, e.g., a computer-aided method can be used to select or locate a group of nucleotide probes in which the frequency of repeated identification regions is minimized or reduced to zero. If there are no repeats there need be only one pool. If there is non-uniqueness, i.e., if an identification region is repeated, uniqueness can be obtained by placing the nucleotide probes having the same identification region in separate pools.

In preferred embodiments the nucleotide probes are labeled, e.g., so that hybridization to the array of capture probes can be followed. The label can be a fluorescent label. The nucleotide probes of a first pool can be labeled with a first label and the nucleotide probes of a second pool can be labeled with a second label. If only one label is used then pools must be hybridized separately to the array of capture probes.

In preferred embodiments the capture probes are single stranded probes in an ordered array.

In preferred embodiments the capture probes have a structure comprising a double stranded region and a single stranded region in an ordered array.

In preferred embodiments the target or a nucleotide probe has attached thereto a first member of a proximity detector pair and hybridization to the array allows the first member to be brought into proximity with a second member to provide a signal.

In preferred embodiments, the target is anchored to a substrate.

In another aspect, the invention features a method of detecting the presence of, or identifying, one or more genetic events, e.g., one or more polymorphisms, in a target nucleic acid, e.g., a DNA. The method includes:

(1) providing a target nucleic acid having a first genetic event, e.g., a polymorphism, and a second genetic event, e.g., a polymorphism;

(2) providing a first plurality of nucleotide probes wherein each probe:

(a) is complementary to a region immediately 5' to the site of the first genetic event, e.g., a polymorphism, and complementary to a region immediately 3' to the site of the first genetic event, e.g., a polymorphism,

(b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which is complementary to the target sequence, and

(c) wherein the nucleotide at the position corresponding to the site of the first genetic event, e.g., a polymorphism, can be chosen from A, T, G, or C,

(3) providing a second plurality of nucleotide probes wherein each probe:

(a) is complementary to a region immediately 5' to the site of the second genetic event, e.g., a polymorphism, and complementary to a region immediately 3' to the site of the second genetic event, e.g., a polymorphism,

(b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which is complementary to the target sequence, and

(c) wherein the nucleotide at the position corresponding to the site of the second genetic event, e.g., a polymorphism, can be chosen from A, T, G, or C,

(4) optionally, organizing the first and second plurality of probes into one or more pools, such that within a pool, an identification region is not repeated in the pool, i.e., it is unique in the pool, and

(5) contacting the target with the first and second plurality of nucleotide probes under conditions wherein only probes having complementarity with the target will hybridize; thereby analysing the target nucleic acid. The identification region can allow identification of the identity of a base at the genetic event.

In a particularly preferred embodiment the method further includes:

(6) providing an array of a plurality of capture probes, wherein each of the plurality of capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the identification region of a probe of the first and second plurality of nucleotide probes;

(7) hybridizing nucleotide probes which hybridize to the target to the array, thereby detecting or identifying a nucleotide probe which bound to the target and thereby identifying the polymorphism, provided that in a hybridization all of the nucleotide probes have a unique identification region, e.g., they are all from a single pool.

The identification region is a naturally occurring sequence in the target molecule. A probe sequence is chosen such that it corresponds to a part of the target sequence which includes an identification region. The identifier region can be at least 2, 4, 6, 8, 10, or 12 or more base pairs in length. In preferred embodiments it is less than 8, 10, 12, 14, 20, or 30 base pairs in length. In preferred embodiments it is between 2 and 40, 4 and 40, 6 and 30, 6 and 20 or 6 and 10 base pairs long. It can overlap with or be independent of other regions of the probe discussed herein, e.g., the target position, or positions on the 3' or 5' thereof. The length will depend, in part, on the sequence of the region and on how many distinct identifier regions are required.

In preferred embodiments there is more than one pool and step 5, step 7, or both, are performed separately for each of a plurality of pools.

The numbering in the method is provided for ease of understanding. As will be appreciated by one skilled in the art, any order which will result in the objective of the method can be used, e.g., the order of steps 4 and 5 can be reversed. Similarly, the numbering does not mean that steps must be performed discretely, e.g., steps 2 and 3 can be combined.

In preferred embodiments the first plurality of nucleotide probes includes a first probe having a first base at the position corresponding to the site of the first polymorphism and a second probe having a second (i.e., different from the first base) base at the position corresponding to the site of the first polymorphism. In preferred embodiments all of A, T, G, and C are represented.

In preferred embodiments the second plurality of nucleotide probes includes a first probe having a first base at the position corresponding to the site of the second polymorphism and a second probe having a second (i.e., different from the first base) base at the position corresponding to the site of the second polymorphism. In preferred embodiments all of A, T, G, and C are represented.

In preferred embodiments the probes are selected to be optimized for hybridization to the target, e.g., to the region having a polymorphism.

In preferred embodiments the nucleotide probes are selected to maximize uniqueness of the identification regions, i.e., to minimize the number of non-unique nucleotide probes. This can be done with computer-aided methods. This can be done prior to dividing the probes into pools, e.g., a computer-aided method can be used to select or locate a group of nucleotide probes in which the frequency of repeated identification regions is minimized or reduced to zero. If there are no repeats there need be only one pool. If there is non-uniqueness, i.e., if an identification region is repeated, uniqueness can be obtained by placing the nucleotide probes having the same identification region in separate pools.
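One way to quantify the frequency of repeated identification regions before pooling is sketched below. It assumes each nucleotide probe is represented by a name and the sequence of its identification region; the function name `repeat_report` and the data are hypothetical. The number of pools reported is the largest number of probes sharing one region, since those probes must all sit in different pools.

```python
from collections import Counter


def repeat_report(id_regions):
    """Summarize repeats among identification regions.

    `id_regions` maps a (hypothetical) probe name to its identification
    region. Returns (repeated, pools_needed), where `repeated` maps each
    repeated region to the number of probes carrying it and
    `pools_needed` is the minimum number of pools that keeps every
    region unique within a pool.
    """
    counts = Counter(id_regions.values())
    repeated = {region: n for region, n in counts.items() if n > 1}
    pools_needed = max(counts.values(), default=0)
    return repeated, pools_needed


# Hypothetical identification regions drawn from two pluralities of probes.
id_regions = {
    "first_A": "ACGTAC",
    "first_G": "TTGACC",
    "second_C": "ACGTAC",
    "second_T": "TTGACC",
    "second_G": "ACGTAC",
}
repeated, pools_needed = repeat_report(id_regions)
print(repeated, pools_needed)
```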

In preferred embodiments the target nucleic acid is: a DNA molecule: all or part of a known gene; all or part of a known polymorphic gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

In preferred embodiments the polymorphism is: a single base pair change; a naturally occurring variant; an allele associated with a disease state; a mutation; an environmentally induced lesion.

In preferred embodiments the target nucleic acid is: a human sequence; a non-human sequence, e.g., a mouse, rat, pig, primate.

In preferred embodiments the genetic event is: a polymorphism, a single base pair change; a naturally occurring variant; an allele associated with a disease state; a mutation; an environmentally induced lesion.

The method is described for use with a first and a second polymorphism. It can also be modified for use in detecting a single polymorphism by deleting the second plurality of probes. Likewise, it can be modified to detect more than two polymorphisms.

In preferred embodiments hybridization is detected by mass spectrometry, e.g., by MALDI-TOF mass spectrometry.

In preferred embodiments the target or a nucleotide probe has attached thereto a first member of a proximity detector pair and hybridization to the array allows the first member to be brought into proximity with a second member to provide a signal.

In another aspect, the invention features a method of analyzing a target nucleotide sequence in a sample. The method includes: providing a target nucleic acid; hybridizing to the target nucleic acid a first probe having attached thereto a first member of a proximity detection pair; hybridizing to the target nucleic acid a second probe having attached thereto a second member of a proximity detection pair; and determining if a signal is generated by the proximity detection pair.

The signal from the probes hybridized to the target nucleic acid (e.g., a polymorphic variant) can be compared to the signal generated by such probes when hybridized to a reference molecule (e.g., a wild type sequence or a sequence not associated with a disease state). Genetic events, e.g., rearrangements, e.g., inversions, deletions, insertions, or translocations, can alter the distance between the bound probes (or inhibit binding of a probe) and thereby generate a characteristic signal. For example, a deletion could bring the two probes into proximity and thereby cause generation of a signal. An insertion could increase the distance and inhibit production of a signal.
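The effect of a deletion or insertion on the proximity signal can be sketched with a toy distance model. The sketch assumes the separation between the two bound probes is counted in bases, that the events lie between the two probe binding sites, and that a signal arises only below a cutoff distance; the cutoff value, function names, and numbers are hypothetical and are not taken from the text.

```python
def separation_after_events(reference_separation, events):
    """Return the probe-to-probe separation (in bases) after applying
    deletions and insertions located between the two binding sites.

    `events` is a list of (kind, length) tuples, with kind being
    "deletion" or "insertion".
    """
    separation = reference_separation
    for kind, length in events:
        if kind == "deletion":
            separation -= length
        elif kind == "insertion":
            separation += length
        else:
            raise ValueError(f"unsupported event: {kind}")
    return max(separation, 0)


def signal_expected(separation, max_signal_distance):
    """Toy rule: the pair signals only when the probes are close enough."""
    return separation <= max_signal_distance


CUTOFF = 50  # hypothetical maximum separation (bases) that still gives a signal

# Reference molecule: probes 500 bases apart, so no signal is expected.
reference = signal_expected(500, CUTOFF)
# A 470-base deletion brings the probes to 30 bases apart: signal expected.
with_deletion = signal_expected(
    separation_after_events(500, [("deletion", 470)]), CUTOFF)
# Probes 40 bases apart signal, but a 200-base insertion abolishes the signal.
with_insertion = signal_expected(
    separation_after_events(40, [("insertion", 200)]), CUTOFF)
print(reference, with_deletion, with_insertion)
```

Comparing the result for the target with the result for the reference molecule mirrors the comparison of signals described above.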

In preferred embodiments the method further includes providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region and hybridizing the target to the array. Preferably, the region of the target which hybridizes to the array does not overlap with the region which binds the first probe or the region which binds the second probe. Position on the array can provide sequence information.

The target nucleic acid can be DNA, e.g., genomic DNA or cDNA. Target nucleic acid can be fragmented, e.g., by shearing or by enzyme digestion.

In preferred embodiments the target nucleic acid is: an RNA molecule: nucleic acids derived from RNA transcripts; wild type RNA; mutant RNA, particularly a human RNA.

In preferred embodiments probes are selected for minimal cross-hybridization with other probes.

In one embodiment, the first member of the proximity detection pair absorbs energy at a first wavelength and transmits the absorbed energy to the second member of the proximity detection pair. The second member of the proximity detection pair absorbs the transmitted energy from the first member of the proximity detection pair and emits the absorbed energy at a second wavelength.

In another embodiment, the first member of the proximity detection pair absorbs energy at a first wavelength and transmits the absorbed energy to the second member of the proximity detection pair in a homogeneous sample. The second member of the proximity detection pair absorbs the transmitted energy from the first member of the proximity detection pair and emits the absorbed energy at a second wavelength in a homogeneous sample.

In another embodiment, the first member of the proximity detection pair is a sensitizer particle that produces singlet oxygen, e.g., phthalocyanine, and the second member of the proximity detection pair is a chemiluminescer particle which reacts with the singlet oxygen to produce chemiluminescence, e.g., an olefin.

In one embodiment, the second member of the proximity detection pair is a component of a single stranded capture probe in an array.

In one embodiment, the second member of the proximity detection pair is a component of the capture probes comprising a double stranded portion and a single stranded portion in a capture array.

In one embodiment, the second member of the proximity detection pair is on the surface of each of a plurality of locations of the capture array.

In one embodiment, the second member of the proximity detection pair is in an aqueous solution which is applied to the capture array.

In another aspect, the invention features a method of analyzing a polynucleotide sequence in a sample. The method includes: providing a target nucleic acid; hybridizing to the target nucleic acid a first probe having attached thereto a first member of a proximity detection pair; hybridizing to the target nucleic acid a second probe having attached thereto a second member of a proximity detection pair, wherein said second probe hybridizes to a region which contains a genetic event, e.g., a polymorphism, e.g., a single nucleotide polymorphism (an SNP), and wherein hybridization of the second probe is determined by the presence or absence of the genetic event; determining if a signal is generated by the proximity detection pair.

In preferred embodiments the method further includes providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region and hybridizing the target to the array. Preferably, the region of the target which hybridizes to the array does not overlap with the region which binds the first probe or the region which binds the second probe. Position on the array can provide sequence information.

The target nucleic acid can be DNA, e.g., genomic DNA or cDNA. Target nucleic acid can be fragmented, e.g., by shearing or by enzyme digestion.

In preferred embodiments the method is performed: on a sample from a human subject; on a sample from a prenatal subject; as part of genetic counseling; to determine if the individual from which the target nucleic acid is taken should receive a drug or other treatment; to diagnose an individual for a disorder or for predisposition to a disorder; to stage a disease or disorder.

In preferred embodiments the capture probes are single stranded probes in an array.

In one embodiment, the second member of the proximity detection pair is a component of a single stranded capture probe in an array.

In one embodiment, the second member of the proximity detection pair is a component of the capture probes comprising a double stranded portion and a single stranded portion in a capture array.

In another aspect, the invention features a method of analyzing a polynucleotide sequence in a sample. The method includes: providing a sample which includes a polynucleotide sequence to be analyzed having a test position or nucleotide; providing a plurality of selector probes which are complementary to a region which contains the test nucleotide, including a first probe which binds if the test nucleotide is a first nucleotide, e.g., an A, and a second probe which binds the region if the test nucleotide is a second nucleotide, e.g., a C (optionally the plurality includes a first probe which binds if the test nucleotide is a first nucleotide, a second probe which binds if the test nucleotide is a second nucleotide, a third probe which binds if the test nucleotide is a third nucleotide, and a fourth probe which binds if the test nucleotide is a fourth nucleotide); contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe and selecting probes which hybridize; providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the plurality of selector probes; hybridizing the selected selector probes with the array of capture probes, thereby detecting or identifying a selected nucleic acid molecule which bound to the polynucleotide sequence and thereby analyzing the polynucleotide sequence.

In another aspect, the invention features a method of analyzing a polynucleotide sequence in a sample. The method includes: providing a sample which includes a polynucleotide sequence to be analyzed and at least one extraneous nucleic acid molecule (e.g., a cDNA other than the cDNA of interest, in a cDNA library); providing a plurality of selector probes wherein each selector probe is complementary to a region of the polynucleotide sequence to be analyzed and is not complementary to the extraneous nucleic acid; contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe but which do not allow hybridization of the extraneous nucleic acid to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe; selecting the selector probes that hybridize to the polynucleotide sequence; detecting or identifying the selector probes which bound to the polynucleotide sequence, e.g., by mass spectrometry, and thereby analyzing the polynucleotide sequence.

In preferred embodiments the polynucleotide sequence is: a DNA molecule: all or part of a known gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

In preferred embodiments the selector probes are labeled so that hybridization to the array of capture probes can be followed. The selector probes of a first pool can be labeled with a first label and the selector probes of a second pool can be labeled with a second label. If only one label is used then pools must be hybridized separately to the array of capture probes.

In preferred embodiments, each capture probe is in a discrete location on a support.

In preferred embodiments, each capture probe is in a discrete location in a microtiter plate, e.g., in a well of a microtiter plate.

In preferred embodiments, each capture probe is in a gel pad on a support comprising more than one gel pad.

In another aspect, the invention features a method of analyzing a polynucleotide sequence in a sample using a proximity detection pair. The method includes: providing a sample which includes a polynucleotide sequence to be analyzed and at least one extraneous nucleic acid molecule; providing a plurality of selector probes wherein each selector probe is complementary to a region of the polynucleotide sequence to be analyzed, is not complementary to the extraneous nucleic acid, and has a first member of the proximity detection pair; contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe but which do not allow hybridization of the extraneous nucleic acid to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe having the first member of the proximity detection pair; selecting a product of that hybridization, e.g., the selector probe, the nucleic acid to be analyzed, or both; providing an array of a plurality of capture probes having the second member of the proximity detection pair, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the plurality of selector probes; hybridizing the selected nucleic acid molecule having the first member of the proximity detection pair with the array of capture probes having the second member of the proximity detection pair, thereby allowing the first and second members of the proximity detection pair to come into close proximity; illuminating the array of capture probes, wherein the illumination produces a reaction in the first member of the proximity detection pair and the product of the reaction from the first member generates a reaction in the second member of the proximity detection pair; detecting the reaction produced from the proximity detection pair, thereby analyzing the polynucleotide sequence.

In one embodiment, the second member of the proximity detection pair is a component of the capture probes, which comprise a double stranded portion and a single stranded portion in a capture array.

In another aspect, the invention features a composition of matter which includes one or more collections of the probes described herein.

In another aspect, the invention features a method of analysing a nucleic acid, e.g., detecting the presence of, or identifying, a genetic event, e.g., a SNP, in a target nucleic acid, e.g., a DNA. The method includes: providing a target nucleic acid having a genetic event;

providing a probe or primer, wherein the primer: a) is complementary to a region adjacent to the site of the genetic event, e.g., its 3' end is within 1, 2, 5, 10, 20, 30, or 50 nucleotides of the genetic event, and b) has a 3' end capable of serving as a priming site for extension; c) is, optionally, labeled; d) optionally has an identification region, e.g., a 5' terminal region or a 3' terminal region, e.g., an identification region not repeated in another probe in a plurality of probes contacted with the target, i.e., it is unique in the plurality, and is complementary to the target sequence, and

contacting the target with the primer under conditions wherein only a primer having complementarity with the target nucleic acid will hybridize;

extending the primer across the genetic event using a primer terminating unit to generate a primer-extension product which terminates with a base which is complementary to a base in the genetic event, or generates a primer-extension product which extends beyond the base in the genetic event if the base in the genetic event is not complementary with the primer terminating unit; thereby analysing the target.

In a preferred embodiment the method includes: extending along the single strand which contains the genetic event with one and preferably with 2, 3, or all 4 labeled chain terminating nucleotides, wherein if more than one labeled chain terminating nucleotide is used each of the chain terminators, e.g., A or C, is distinguishable, such that a chain terminator is incorporated at the position corresponding to the genetic event and can indicate the presence of the genetic event.
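The extension step can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names, the example sequences, and the idealized assumption of perfect Watson-Crick pairing with a primer whose 3' end lies immediately adjacent to the genetic event are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: idealized single-base primer extension.
# Assumes perfect Watson-Crick pairing and a primer whose 3' end sits
# immediately next to the site of the genetic event (both assumptions).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def incorporated_terminator(template, primer):
    """Return the chain terminator base added to the primer's 3' end.

    template: strand containing the genetic event, written 5'->3'.
    primer:   written 5'->3'; it anneals to the template region
              immediately 3' of the genetic event.
    Returns None if the primer site is absent or no template base
    remains to be copied.
    """
    site = reverse_complement(primer)   # primer binding site on the template
    start = template.find(site)
    if start <= 0:                      # -1: no site; 0: nothing upstream
        return None
    # The polymerase copies the template base just 5' of the binding site.
    return COMPLEMENT[template[start - 1]]


# Hypothetical example: the same primer on two alleles of a template.
primer = "TGCATCGGAC"
print(incorporated_terminator("TTGACAGTCCGATGCA", primer))  # allele with A
print(incorporated_terminator("TTGACGGTCCGATGCA", primer))  # allele with G
```

In this sketch the identity of the returned terminator plays the role of the distinguishable label: a different base at the genetic event yields a different incorporated terminator.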

In a preferred embodiment the identification region allows identification of the identity of the base at the genetic event.

In a preferred embodiment the method further includes: providing an array having a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from the other capture probes of the plurality and has a unique variable region (not repeated in another capture probe of the plurality); and

hybridizing the base terminated primer-extension product to a capture probe of the array (preferably the region of the base terminated primer-extension product which corresponds to the genetic event hybridizes with the variable region of a capture probe),

thereby detecting or identifying a genetic event in a target nucleic acid.

The identification region can allow identification of the identity of a base at the genetic event.

In preferred embodiments the genetic event is: a polymorphism; a single base pair change; a naturally occurring variant; an allele associated with a disease state; a mutation; an environmentally induced lesion.

In preferred embodiments the target nucleic acid is coupled to a support prior to hybridizing with the nucleotide probes.

In preferred embodiments the capture probes have a structure comprising a double stranded region and a single stranded region in an ordered array.

In preferred embodiments hybridization is detected by mass spectrometry, e.g., by MALDI-TOF mass spectrometry.

In preferred embodiments, the target is anchored to a substrate.

In methods described herein target or sample molecules can be amplified, e.g., by PCR, NASBA, RCA or other methods, prior to use.

Methods of U.S. 5,503,980 and/or U.S. 5,631,134, both of which are hereby incorporated by reference, can be used in methods of the invention; in particular, the array and array-related steps recited herein can use methods taught in these patents.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description

The drawings will first be briefly described.

Drawings

Fig. 1 is a diagram of steps 1 and 2 of an embodiment of the invention. Fig. 2 is a diagram of steps 3 and 4 of an embodiment of the invention. Fig. 3 is a diagram of steps 5 and 6 of an embodiment of the invention. Fig. 4 is a diagram of a DNA sequencing method.

Detection of Polymorphisms

The figures describe an embodiment of the invention. In steps 1 and 2 the target DNA is analyzed and nucleotide probes which will hybridize to polymorphism-containing regions are selected. Methods for selecting nucleotide probes in terms of length and sequence, e.g., to optimize hybridization characteristics, e.g., to reduce or eliminate secondary structure, are known in the art. The probes should be selected to minimize repetition of the identification regions. The identification regions are used to identify a nucleotide probe and thus a polymorphism in the target. If necessary the probes can be divided into pools such that a pool has nucleotide probes all having unique identification regions. Thus, when a pool is hybridized to the array, the location of hybridization will identify a polymorphism.
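The division of probes into pools can be sketched in Python as a simple greedy assignment. This is only one possible way to satisfy the stated requirement that no identification region is repeated within a pool; the function name, the data layout, and the example probes are hypothetical and are not part of the specification.

```python
# Illustrative sketch only: greedy pooling so that each identification
# region appears at most once per pool. Names and data are hypothetical.

def assign_pools(probes):
    """Divide probes into pools with unique identification regions.

    probes: list of (probe_name, identification_region) tuples.
    Returns a list of pools; each pool maps an identification region
    to the single probe in that pool which carries it.
    """
    pools = []
    for name, ident in probes:
        for pool in pools:
            if ident not in pool:       # region still unused in this pool
                pool[ident] = name
                break
        else:                           # every existing pool already has it
            pools.append({ident: name})
    return pools


probes = [
    ("probe1", "ACGT"),
    ("probe2", "ACGT"),   # same identification region as probe1
    ("probe3", "TTAG"),
    ("probe4", "ACGT"),
]
for number, pool in enumerate(assign_pools(probes), start=1):
    print(number, pool)
```

Because each identification region occurs once per pool, an array position that lights up for a given pool points to exactly one probe, and so to one polymorphism.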

Steps 3 and 4 show the coupling of the target to a solid support, hybridization of a pool of the probes to the target, and washing of the hybridization complex.

Step 5 shows recovery of the hybridized probes.

Step 6 shows hybridization to the array of capture probes and detection of hybridization.

Methods of U.S. 5,503,980 and/or U.S. 5,631,134, both of which are hereby incorporated by reference, can be used in methods of the invention; in particular, the array and array-related steps can use methods taught in these patents.

Differential Display Strategy-SNP (Single Nucleotide Polymorphism) and Mutation Detection

This method can integrate the identifier probe methods, e.g., those which use pools of probes, with a differential display approach.

1. Obtain target DNA from control and test samples (i.e. normal and mutant or disease).

2. Fragment target DNA, e.g., digest target DNAs with a restriction enzyme (e.g., a Type IIs enzyme) to generate target DNA fragments with staggered ends and denature target DNA fragments.

3. (optionally) De-phosphorylate the target DNA fragments.

4. Capture single strands of the fragmented DNA such that the single strands will be the Watson (or the Crick) of both test and control, i.e., both test and control are either both Watson or both Crick. Test and control strands are isolated in different reaction vessels. This step can also be used to capture only a subset of fragments from the digest, e.g., by recognizing unique ends on the restriction fragments. This produces two separate samples: the Watson (or Crick) strand from the test DNA and the same strand from the control DNA. Each of these is coupled to a substrate.

5. Add probe pools, as described herein, to the anchored test DNA strand.

6. Isolate the probes that hybridized to the test DNA strand and add these probes to the anchored control DNA strand under the same conditions as were used in the previous step. This sequesters, by hybridization to control DNA, probes which recognize both test and control sequences. The control DNA should be present in excess, e.g., 10X excess.

7. Collect the bound and unbound probes in separate fractions (A & B) and apply each fraction to a separate ordered array, e.g., a positional sequencing by hybridization array (PSBH array) (1 & 2). A change in a signal at an array position indicates which probe is present.

8. The unbound probes (fraction B) that are hybridized to array 1 represent those SNPs or mutations that are present in the test sample but not in the control sample.

9. The bound probes (fraction A) that are ligated to the PSBH array 2 represent those SNPs or mutations that are common to both the control and the test samples.
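The logic of steps 5 through 9 can be sketched in Python with set operations. The sketch is an idealization and not part of the specification: it treats hybridization as exact membership of a probe's recognized sequence in a sample, ignores partial binding and probe excess, and uses hypothetical names and data throughout.

```python
# Illustrative sketch only: idealized differential display with sets.
# Hybridization is modeled as exact membership (an assumption).

def differential_display(probe_targets, test_sequences, control_sequences):
    """Split probes into fraction A (bound to control) and B (unbound).

    probe_targets:     dict mapping probe name -> sequence it recognizes.
    test_sequences:    set of sequences present in the test sample.
    control_sequences: set of sequences present in the control sample.
    Returns (fraction_a, fraction_b) as sets of probe names.
    """
    # Steps 5-6: keep only probes that hybridized to the anchored test DNA.
    test_bound = {name for name, seq in probe_targets.items()
                  if seq in test_sequences}
    # Steps 6-7: of those, probes that also bind control DNA (fraction A).
    fraction_a = {name for name in test_bound
                  if probe_targets[name] in control_sequences}
    # Probes left unbound by the control (fraction B) are test-specific.
    fraction_b = test_bound - fraction_a
    return fraction_a, fraction_b


probe_targets = {"snp1_A": "site1:A", "snp1_G": "site1:G", "snp2_C": "site2:C"}
test = {"site1:G", "site2:C"}
control = {"site1:A", "site2:C"}
fraction_a, fraction_b = differential_display(probe_targets, test, control)
print("common to test and control:", sorted(fraction_a))
print("present in test only:", sorted(fraction_b))
```

Here the hypothetical probe snp1_G ends up in fraction B because its sequence occurs only in the test sample, while snp2_C is sequestered by the control and ends up in fraction A.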

Differential Display Strategy-RNA, EST and cDNA Detection

This method can integrate the identifier probe methods, e.g., those which use pools of probes, to analyze diverse samples.

1. Provide control and test samples (e.g., normal and mutant or disease). Normalize the concentration of the two samples.

2. In the case of an mRNA sample, capture the mRNA on a solid support using anchored polyA oligos or an ordered array, e.g., a PSBH array. If ESTs or cDNAs are used, capture the ESTs and cDNAs using an ordered array, e.g., a PSBH array. Prepare at least two identical arrays for each sample.

3. Hybridize probe pools described herein to the anchored test mRNA (or cDNA or EST) and collect, in separate fractions, both the bound and unbound probes. Repeat this step if necessary.

4. Hybridize the bound probes to the array of control using conditions identical to those in the previous step.

5. Collect the bound and unbound probes in separate fractions (A & B) and apply each fraction to separate arrays (1 & 2). The presence or absence of a signal at each array position indicates which probes are present.

6. The unbound probes (fraction B) that are ligated to array 1 represent those mRNA or cDNA species that are present in the test sample but not in the control sample.

7. The bound probes (fraction A) that are ligated to array 2 represent those mRNA or cDNA species that are common to both the control and the test samples.

Sequencing Method

1. Provide a probe having a first region complementary to a first region on a target nucleic acid, a second region complementary to a second region on the target nucleic acid, and optionally a third region disposed between. The first region on the target nucleic acid and the second region on the target nucleic acid are separated by a region to be sequenced.

2. Hybridize the probe from step 1 to the target nucleic acid, thereby forming a single strand loop including the region of the target nucleic acid to be sequenced.

3. Optionally, label the target DNA or the probe.

4. Create breaks, e.g., random breaks, in the single strand loop, for example by cleaving with an enzyme which cuts single strand nucleic acid.

5. Isolate cleavage fragments from step 4 and sequence them, e.g., by applying them to an ordered array, e.g., a PSBH array. A label present either on a fragment of the target nucleic acid or on the probe fragment which is bound to the target nucleic acid fragment (if probe labelling is used, the probe is cleaved in the third region) allows for identification.

DNA Structural Change Analysis

Insertions and deletions (or other events which alter chromosomal sequence) can be detected using proximity-label probes, e.g., FRET, HTRF, or LOCI probes. Target nucleic acid, e.g., fragments of genomic DNA generated by restriction digestion, are attached to a substrate in an array format (e.g., a Cantor array format). The target is denatured to form ssDNA (or ssDNA is generated using PCR off genomic DNA with one probe containing a linker molecule). A pool of probes, with first and second members of a proximity pair, e.g., donor and acceptor fluorophors, on separate probes, can be added to the genomic DNA. The probes recognize various, non-overlapping, regions of the genomic DNA such that some are in close enough proximity to generate a signal, e.g., to allow FRET, HTRF, or LOCI to occur between the donor and acceptor fluorophors. Two probes that are adjacent or nearby in space will create a signal using FRET, HTRF, or LOCI that can be detected by the appropriate instrumentation. A signature pattern of, e.g., fluorescence or chemiluminescence, will be detected from the array using this approach. Insertions and deletions within the population of genomic DNA may disrupt the pattern of probe fluorescence or chemiluminescence on the array. In a variation of this approach, competition between probes (i.e. one of which contains a donor fluorophor or each containing a unique fluorophor) with overlapping binding sites can be used to detect different sequences.

SNP Ratio Analysis

This is a method for determining the presence of a single nucleotide polymorphism (SNP) marker that is based on comparing the ratio of nucleotides in a position in the genome between a population of control or "normal" genomic DNA and experimental or "disease" DNA. The concentration of genomic DNA extracted from each individual in a sample population is determined in order to normalize the concentration of the genomic DNA samples in the pool. A probe pool described herein is added to the immobilized genomic DNA, and the ratio of the four possible nucleotides for the SNP is determined for the population using multiplex labeled probes or multiple probe pools with a single label. This ratio is examined for a wide variety of SNPs. The SNP ratios observed for one population can be compared to another population. This can be combined with the differential display methods described herein.
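The ratio comparison can be sketched in Python as follows. The sketch assumes, purely for illustration, that each of the four nucleotides at a SNP position yields one summed signal value per pooled population; the function names and the numbers are hypothetical and do not come from the specification.

```python
# Illustrative sketch only: nucleotide ratios at one SNP position for two
# pooled populations. Signal values are hypothetical (assumed units).

BASES = "ATGC"


def snp_ratio(signals):
    """Convert per-base signals into fractions that sum to 1.

    signals: dict mapping a base to its measured signal; missing bases
             are treated as zero signal.
    """
    total = sum(signals.get(base, 0.0) for base in BASES)
    if total <= 0:
        raise ValueError("no signal measured at this position")
    return {base: signals.get(base, 0.0) / total for base in BASES}


def ratio_difference(control, disease):
    """Per-base change in ratio from the control to the disease pool."""
    return {base: disease[base] - control[base] for base in BASES}


control = snp_ratio({"A": 30, "G": 10})
disease = snp_ratio({"A": 10, "G": 30})
print(control)
print(disease)
print(ratio_difference(control, disease))
```

A large shift in the ratio at a position between the two pools would flag that SNP as a candidate marker; what counts as a meaningful shift is left open here.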

Positional Arrays

Positional arrays suitable for the present invention include high and low density arrays on a two dimensional or three dimensional surface. Positional arrays include nucleic acid molecules, peptide nucleic acids or high affinity binding molecules of known sequence attached to predefined locations on a surface. Arrays of this nature are described in numerous patents which are incorporated herein by reference: Cantor US 5,503,980 (referred to herein as a Cantor array); Southern EP 0373 203 B1; Southern 5,700,637 and Deugau 5,508,169. The density of the array can range from a low density format, e.g., a microtiter plate, e.g., a 96- or 384-well microtiter plate, to a high density format, e.g., 1000 molecules/cm², as described in Fodor US 5,445,934.

The surface on which the arrays are formed can be two dimensional, e.g., glass, plastic, polystyrene, or three dimensional, e.g., polymer gel pads, e.g., polyacrylamide gel pads of a selected depth, width and height. See, e.g., gel pads described in U.S. 5,552,270, hereby incorporated by reference.

The target or probes should bind to (and can be eluted from) the array at a single temperature. This can be effected by manipulating the length or concentration of the array or nucleic acid which hybridizes to it, by manipulating ionic strength or by providing modified bases.
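Adjusting probe length so that different probes melt at a similar temperature can be sketched in Python. The sketch uses the Wallace rule of thumb for short oligonucleotides, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is an assumption introduced here for illustration and is not a method stated in the specification; the function names and the example sequence are likewise hypothetical.

```python
# Illustrative sketch only: pick a probe length whose estimated melting
# temperature is closest to a shared target. Uses the Wallace rule of
# thumb (an assumption; it is only a rough guide for short oligos).

def wallace_tm(seq):
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def trim_to_tm(seq, target_tm, min_len=8):
    """Return the 5' prefix of seq whose estimated Tm is nearest target_tm.

    Prefixes shorter than min_len are not considered; when two prefixes
    are equally close, the longer one is kept.
    """
    best = seq
    for end in range(len(seq), min_len - 1, -1):
        candidate = seq[:end]
        if abs(wallace_tm(candidate) - target_tm) < abs(wallace_tm(best) - target_tm):
            best = candidate
    return best


probe = "GCGCGCGCATATATAT"
print(wallace_tm(probe))
trimmed = trim_to_tm(probe, target_tm=40)
print(trimmed, wallace_tm(trimmed))
```

Ionic strength and modified bases, the other options named above, are not modeled in this sketch.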

Polymerase based extension of the capture probe and target or selector probe hybrid can be used to detect hybridization and may increase accuracy. The capture probes can be peptide nucleic acids in an array. Probes in an array can be: nucleic acids or peptide nucleic acids; hairpin-loop structures. Each capture probe can be in a discrete location on a support, or in a discrete location in a microtiter plate. A capture probe can be disposed in a gel pad on a support, e.g., a support comprising more than one gel pad.

Proximity Methods

Proximity methods include those methods whereby a signal is generated when a first member and second member of a proximity detection pair are brought into close proximity. A "proximity detection pair" will have two members, the first member, e.g., an energy absorbing donor or a photosensitive molecule and the second member, e.g., an energy absorbing acceptor or a chemiluminescer particle. When the first and second members of the proximity detection pair are brought into close proximity, a signal is generated.

Examples of proximity methods include the following:

Fluorescence resonance energy transfer (FRET)

Fluorescence resonance energy transfer (FRET) is based on a donor fluorophore that absorbs a photon of energy and enters an excited state. The donor fluorophore transfers its energy to an acceptor fluorophore when the two fluorophores are in close proximity by a process of non-radiative energy transfer. The acceptor fluorophore enters an excited state and eliminates the energy via radiative or non-radiative processes. Transfer of energy from the donor fluorophore to acceptor fluorophore only occurs if the two fluorophores are in close proximity.
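The dependence of energy transfer on proximity can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. This relation is general background and is not recited in the specification; the function name and the distances in the short Python sketch below are hypothetical.

```python
# Illustrative sketch only: Forster relation for FRET efficiency.
# The 5 nm Forster radius used below is a hypothetical example value.

def fret_efficiency(distance_nm, forster_radius_nm):
    """Return transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


for r in (2.5, 5.0, 10.0):
    print(r, fret_efficiency(r, forster_radius_nm=5.0))
```

The sixth-power term is why the signal falls off so sharply: doubling the separation beyond R0 leaves only a small fraction of the transfer, which is what makes the pair usable as a proximity detector.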

Homogeneous time resolved fluorescence (HTRF)

Homogeneous time resolved fluorescence (HTRF) uses FRET between two fluorophores and measures the fluorescent signals from a homogeneous assay in which all components of the assay are present during measurement. The fluorescent signal from HTRF is measured after a time delay, thereby eliminating interfering signals. One example of a donor and acceptor fluorophore pair in HTRF is europium cryptate [(Eu)K] and XL665, respectively.

Luminescent oxygen channelling assay (LOCI)

In the luminescent oxygen channelling assay (LOCI), the proximity detection pair includes a first member which is a sensitizer particle that contains phthalocyanine. The phthalocyanine absorbs energy at 680nm and produces singlet oxygen. The second member is a chemiluminescer particle that contains olefin which reacts with the singlet oxygen to produce chemiluminescence which decays in one second and is measured at 570nm. The reaction with the singlet oxygen and the subsequent emission depends on the proximity of the first and second members of the proximity detection pair.

Other embodiments are within the following claims. What is claimed is:

Claims

1. A method of analyzing a polynucleotide sequence in a sample comprising: providing a sample which includes a polynucleotide sequence to be analyzed and at least one extraneous nucleic acid molecule; providing a plurality of selector probes wherein each selector probe is complementary to a region of the polynucleotide sequence to be analyzed and is not complementary to the extraneous nucleic acid; contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe but which do not allow hybridization of the extraneous nucleic acid to hybridize to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe; selecting a product of that hybridization, e.g., the selector probe, the nucleic acid to be analyzed, or both; providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the plurality of selector probes; hybridizing the selected nucleic acid molecule (which does not include the extraneous nucleic acid molecule) with the array of capture probes, thereby detecting or identifying a selected nucleic acid molecule which bound to the polynucleotide sequence and thereby analyzing the polynucleotide sequence.

2. The method of claim 1 wherein the polynucleotide sequence is: a DNA molecule: all or part of a known gene; wild type DNA; mutant DNA; a genomic fragment, particularly a human genomic fragment; a cDNA, particularly a human cDNA.

3. A method of detecting the presence of, or identifying, a genetic event comprising: providing a target nucleic acid having a genetic event; providing a plurality of nucleotide probes wherein each probe:

(a) is complementary to a region immediately 5' to the site of the genetic event and complementary to a region immediately 3' to the site of the genetic event, (b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which identification region is not repeated in another probe in the plurality, i.e., it is unique in the plurality, and is complementary to the target sequence, and (c) wherein the nucleotide at the position corresponding to the site of the first genetic event can be chosen from A, T, G, or C, wherein the plurality of nucleotide probes includes a first probe having a first base at the position corresponding to the site of the genetic event and a second probe having a second (i.e., different from the first base) base at the position corresponding to the site of the genetic event; contacting the target with the plurality of probes under conditions wherein only probes having complementarity with the target nucleic acid will hybridize; thereby analysing the target nucleic acid.

4. The method of claim 3 wherein the identification region can allow identification of the identity of a base at the genetic event.

5. The method of claim 3, further comprising: providing an array of a plurality of capture probes, wherein each of the capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the identification region of a probe of the plurality of nucleotide probes; hybridizing the nucleotide probes which hybridize to the target to the array of capture probes, thereby detecting or identifying a nucleotide probe which bound to the target and thereby identifying the genetic event.

6. The method of claim 3 wherein the identification region is a naturally occurring sequence in the target molecule.

7. A method of detecting the presence of, or identifying, one or more genetic events, comprising: (1) providing a target nucleic acid having a first genetic event, e.g., a polymorphism, and a second genetic event;

(2) providing a first plurality of nucleotide probes wherein each probe: (a) is complementary to a region immediately 5' to the site of the first polymorphism and complementary to a region immediately 3' to the site of the genetic event,

(b) has an identification region, e.g., a 5' terminal region or a 3' terminal region, which is complementary to the target sequence, and (c) wherein the nucleotide at the position corresponding to the site of the first polymorphism can be chosen from A, T, G, or C,

(3) providing a second plurality of nucleotide probes wherein each probe:

(a) is complementary to a region immediately 5' to the site of the second polymorphism and complementary to a region immediately 3' to the site of the second genetic event,

(c) wherein the nucleotide at the position corresponding to the site of the second polymorphism can be chosen from A, T, G, or C,

(4) optionally, organizing the first and second plurality of probes into one or more pools, such that within a pool, an identification region is not repeated in the pool, i.e., it is unique in the pool, and

8. The method of claim 7, further comprising: (6) providing an array of a plurality of capture probes, wherein each of the plurality of capture probes is positionally distinguishable from other capture probes of the plurality on the array, and wherein each positionally distinguishable capture probe includes a unique (i.e., not repeated in another capture probe) region complementary to the identification region of a probe of the first and second plurality of nucleotide probes;

(7) hybridizing nucleotide probes which hybridize to the target to the array, thereby detecting or identifying a nucleotide probe which bound to the target and thereby identifying the polymorphism, provided that in a hybridization all of the nucleotide probes have a unique identification region, e.g., they are all from a single pool.

9. A method of analyzing a target nucleotide sequence in a sample comprising: providing a target nucleic acid; hybridizing to the target nucleic acid a first probe having attached thereto a first member of a proximity detection pair; hybridizing to the target nucleic acid a second probe having attached thereto a second member of a proximity detection pair; and determining if a signal is generated by the proximity detection pair.

10. A method of analyzing a polynucleotide sequence in a sample comprising: providing a target nucleic acid; hybridizing to the target nucleic acid a first probe having attached thereto a first member of a proximity detection pair; hybridizing to the target nucleic acid a second probe having attached thereto a second member of a proximity detection pair, wherein said second probe hybridizes to a region which contains a genetic event, e.g., a polymorphism, e.g., a single nucleotide polymorphism, and wherein hybridization of the second probe is determined by the presence or absence of the genetic event; determining if a signal is generated by the proximity detection pair.

11. A method of analyzing a polynucleotide sequence in a sample comprising: providing a sample which includes a polynucleotide sequence to be analyzed and at least one extraneous nucleic acid molecule; providing a plurality of selector probes wherein each selector probe is complementary to a region of the polynucleotide sequence to be analyzed and is not complementary to the extraneous nucleic acid, contacting the sample with the plurality of selector probes under conditions which allow hybridization of the polynucleotide sequence to be analyzed to a selector probe but which do not allow hybridization of the extraneous nucleic acid to hybridize to a selector probe; allowing a nucleic acid molecule to be analyzed to hybridize with a selector probe; selecting the selector probes that hybridise to the polynucleotide sequence; detecting or identifying the selector probes which bound to the polynucleotide sequence and thereby analyzing the polynucleotide sequence.