WO2000020639A1 - Enzymatic synthesis of oligonucleotide tags - Google Patents


Info

Publication number
WO2000020639A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
oligonucleotide
words
repertoire
wherein
Prior art date
Application number
PCT/US1999/022585
Other languages
French (fr)
Inventor
Sydney Brenner
Steven R. Williams
Original Assignee
Lynx Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lynx Therapeutics, Inc.
Priority to AU65025/99A (patent AU6502599A)
Publication of WO2000020639A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00: Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6816: Hybridisation assays characterised by the detection means
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures

Definitions

  • the invention relates generally to methods for synthesizing collections of minimally cross-hybridizing oligonucleotide tags for identifying, sorting, and/or tracking molecules, especially polynucleotides
  • BACKGROUND Specific hybridization of oligonucleotides and their analogs is a fundamental process that is employed in a wide variety of research, medical, and industrial applications, including the identification of disease-related polynucleotides in diagnostic assays, screening for clones of novel target polynucleotides, identification of specific polynucleotides in blots of mixtures of polynucleotides, amplification of specific target polynucleotides, therapeutic blocking of inappropriately expressed genes, DNA sequencing, and the like, e.g. Sambrook et al.
  • Unfortunately, incorrect hybridizations brought about by the creation of stable duplexes containing mismatches are not uncommon because base pairing and base stacking free energies vary widely among nucleotides in a duplex or triplex structure
  • a duplex consisting of a repeated sequence of deoxyadenosine (A) and thymidine (T) bound to its complement may have less stability than an equal-length duplex consisting of a repeated sequence of deoxyguanosine (G) and deoxycytidine (C) bound to a partially complementary target containing a mismatch
  • G deoxyguanosine
  • C deoxycytidine
  • words, which are oligonucleotides usually 3 to 6 nucleotides in length, differ from every other member of the same set by at least two nucleotides. Thus, a given word cannot form a duplex with the complement of any other word of the set with fewer than two mismatches
  • minimally cross-hybridizing sets are preferably formed from words differing from one another by even more than two nucleotides
  • oligonucleotide tags constructed from concatenations of such words will differ from one another by at least two nucleotides, or by at least the number of nucleotides that their component words differ by. Therefore, by judiciously selecting word length, differences between words in a set, and the number of words per tag, one can obtain a large set, or repertoire, of oligonucleotide tags that each differ from one another by a significant percentage of their nucleotides
  • Such repertoires permit tagging and sorting of molecules with a much higher degree of specificity than ordinary oligonucleotides
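The distance property described above can be illustrated with a short sketch. It uses the eight four-nucleotide words quoted in Example 3 of this document; the function names (`hamming`, `min_pairwise_distance`, `build_tags`) are illustrative choices and not part of the patent. The sketch checks that tags built by concatenating words differ from one another by at least as many nucleotides as the words themselves do.

```python
from itertools import combinations, product

# Eight four-nucleotide words quoted in Example 3 of this document.
WORDS = ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(seqs):
    """Smallest Hamming distance over all unordered pairs of sequences."""
    return min(hamming(a, b) for a, b in combinations(seqs, 2))


def build_tags(words, words_per_tag):
    """Every concatenation of `words_per_tag` words (no spacers)."""
    return ["".join(combo) for combo in product(words, repeat=words_per_tag)]


word_min = min_pairwise_distance(WORDS)
two_word_tags = build_tags(WORDS, 2)
tag_min = min_pairwise_distance(two_word_tags)

print("two-word tags:", len(two_word_tags))
print("minimum word distance:", word_min)
print("minimum tag distance:", tag_min)
```

Two distinct tags must differ in at least one word position, so the minimum tag distance can never fall below the minimum word distance; longer tags simply add more positions at which differences can accumulate.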
  • objectives of my invention include, but are not limited to, providing a method of synthesizing oligonucleotide tags which minimizes the production of failure sequences; providing an enzymatic method of synthesizing oligonucleotide tags by the combinatorial addition of words; providing a method of convergent synthesis of oligonucleotide tags from error-free components; providing a method of constructing tag-DNA conjugates whose tags are free of failure sequences; providing compositions comprising novel oligonucleotide tags.
  • My invention achieves these and other objectives by providing a method of synthesizing oligonucleotide tags that comprises successive cycles of cleavage of an oligonucleotide tag precursor to permit the ligation of one or more words from a minimally cross-hybridizing set, ligation of the one or more words, and amplification of the ligated structure.
  • repertoires of oligonucleotide tags of a predetermined length are assembled from words, or sub-assemblies of words, that are free of failure sequences.
  • error-free words or sub-assemblies of words are obtained either by separately synthesizing and sequencing individual words or sub-assemblies of words prior to assembly, or by successive ligations of adaptors having protruding strands consisting of word sequences that select complementary word sequences on the protruding strand of a growing tag.
  • words or sub-assemblies of words are inserted into and maintained in conventional cloning vectors, after which they are sequenced to confirm that no errors are present.
  • the words or sub-assemblies of words are excised from the vectors, mixed, and ligated to an oligonucleotide tag precursor.
  • error-containing words are excluded from the assembly process by requiring that the single stranded form of each added word anneal to a perfectly matched complement of an oligonucleotide tag precursor in a ligation step. If a mismatch exists because a failure sequence is present in one of the strands, no ligation will take place, either precluding further growth of the tag if the failure is carried by its protruding strand, or promoting the annealing of a different word if the failure is carried by the word being added.
  • the invention further includes repertoires of oligonucleotide tags consisting of a plurality of words wherein at least two words of the plurality are separated by one or two nucleotides.
  • the present invention overcomes difficulties in sorting polynucleotides with oligonucleotide tags synthesized by currently available methods. By providing oligonucleotide tags free of failure sequences, sampled and amplified tag-polynucleotide conjugates are assured of finding a tag complement with which to form a perfectly matched duplex
  • Figure 1a illustrates a preferred embodiment of the invention in which oligonucleotide tags are assembled by successive additions of one or more words to an oligonucleotide tag precursor
  • Figure 1b illustrates a preferred embodiment of the invention in which oligonucleotide tags are assembled by convergent additions of increasingly larger sub-assemblies of words
  • Figure 2 illustrates a preferred embodiment of the invention wherein oligonucleotide tags are assembled by successive additions and self-selection of words to an oligonucleotide tag precursor
  • word means an oligonucleotide selected from a minimally cross-hybridizing set of oligonucleotides, as disclosed in U.S. patent 5,604,097 and International patent application PCT/US96/09513.
  • An oligonucleotide tag of the invention consists of a plurality of words, or oligonucleotide subunits, that are selected from the same minimally cross-hybridizing set. In such a set, a duplex or triplex consisting of a word of the set and the complement of any other word of the same set contains at least two mismatches
  • a duplex or triplex consisting of a word of the set and the complement of any other word of the same set contains an even larger minimum number of mismatches, e.g. 3, 4, 5, or 6, depending on the length of the words
  • the minimum number of mismatches is either 1, 2, or 3 less than the length of the word
  • the minimum number of mismatches is 1 or 2 less than the length of the word
  • "Complement" or "tag complement" as used herein in reference to oligonucleotide tags refers to an oligonucleotide to which an oligonucleotide tag specifically hybridizes to form a perfectly matched duplex or triplex
  • in embodiments where specific hybridization results in a triplex, the oligonucleotide tag may be selected to be either double stranded or single stranded
  • the term "complement" is meant to encompass either a double stranded complement of a single stranded oligonucleotide tag or a single stranded complement of a double stranded oligonucleotide tag
  • populations of identical tag complements are attached to a spatially defined region of a solid phase support
  • oligonucleotide as used herein includes linear oligomers of natural or modified monomers or linkages, including deoxyribonucleosides, ribonucleosides, anomeric forms thereof, peptide nucleic acids (PNAs), and the like, capable of specifically binding to a target polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking.
  • oligonucleotides ranging in size from a few monomeric units, e.g. 3-4, to several tens of monomeric units. Whenever an oligonucleotide is represented by a sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that upper or lower case "A" denotes deoxyadenosine, upper or lower case "C" denotes deoxycytidine.
  • "Perfectly matched" in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick basepairing with a nucleotide in the other strand
  • the term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed
  • the term means that the triplex consists of a perfectly matched duplex and a third strand in which every nucleotide undergoes Hoogsteen or reverse Hoogsteen association with a basepair of the perfectly matched duplex
  • a "mismatch" in a duplex between a tag and an oligonucleotide means that a pair or triplet of nucleotides in the du
  • complexity means the number of different species of molecule present in the population
  • failure sequence refers to a synthetic oligonucleotide or polynucleotide that does not have the correct, or intended, length and/or sequence because of a failure in a step of the synthetic process, e.g. spurious chain initiation, failure of a coupling step, failure of a capping step, chain scission, or the like
  • amplicon means the product of an amplification reaction
  • that is, amplicons are produced either in a polymerase chain reaction (PCR) or by replication in a cloning vector
  • PCR polymerase chain reaction
  • DETAILED DESCRIPTION OF THE INVENTION The invention provides an enzymatic method for synthesizing a repertoire of oligonucleotide tags whose members are substantially free of failure sequences. Oligonucleotide tags are combinatorially synthesized by the assembly of error-free words or sub-assemblies of words in a series of enzymatic steps.
  • the method of the invention comprises the following steps: (a) providing a repertoire of oligonucleotide tag precursors in an amplicon, the oligonucleotide tag precursors each comprising one or more words, and each of the one or more words being selected from the same minimally cross-hybridizing set; (b) cleaving the amplicon at a word in each of the oligonucleotide tag precursors to form one or more ligatable ends on each oligonucleotide tag precursor; (c) ligating one or more words to the one or more ligatable ends to elongate each of the oligonucleotide tag precursors; (d) amplifying the elongated oligonucleotide tag precursors in the amplicon; and (e) repeating steps (b) through (d) until a repertoire of oligonucleotide tags having the predetermined length is formed.
  • each of the oligonucleotide tag precursors has the same length, which is determined by word length, the number of words making up the initial oligonucleotide tag precursor, and the stage of the assembly process, i.e. how many words or sub-assemblies of words have been added by operation of the method of the invention.
  • the amplicon of the method is a population of cloning vectors wherein different oligonucleotide tags or oligonucleotide tag precursors are represented in equal proportions as inserts of such vectors.
  • when the oligonucleotide tag precursors are cleaved for the ligation of an additional word or sub-assembly of words, the cleavage takes place at the same word for all the oligonucleotide tag precursors of the repertoire.
  • the step of cleaving is carried out with a type IIs restriction endonuclease which cleaves at the same word for all the oligonucleotide tag precursors of the repertoire and produces ligatable ends having protruding strands.
  • ligatable ends means ends of a double stranded DNA that can be ligated to another double stranded DNA, including blunt-end ligation and "sticky" end ligation.
  • ligatable ends are sticky ends.
  • the invention further includes repertoires of oligonucleotide tags defined by the following formula: $w_1(N)_{x_1}w_2(N)_{x_2}\cdots w_{n-1}(N)_{x_{n-1}}w_n$
  • $w_1, \ldots, w_n$ are words selected from the same minimally cross-hybridizing set, the words having a length of from three to fourteen nucleotides or basepairs; n is an integer in the range of from 4 to 10.
  • N is a nucleotide or basepair.
  • $x_1, \ldots, x_{n-1}$ are each an integer indicating how many nucleotides or basepairs, N, are present at the given location in the sequence of words, $x_1, x_2, \ldots, x_{n-1}$ each being selected from the group consisting of 0, 1, 2, 3, and 4, provided that at least one of $x_1, x_2, \ldots, x_{n-1}$ is 1, 2,
  • $x_1, x_2, \ldots, x_{n-1}$ are each selected from the group consisting of 0, 1, and 2, provided that at least one of $x_1, x_2, \ldots, x_{n-1}$ is 1 or 2
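The arrangement of words and spacer nucleotides described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `assemble_tag`, the use of the placeholder letter "n" for the spacer nucleotide N, and the reading of the truncated proviso as "at least one spacer count is non-zero" are all assumptions made for the example, not statements from the patent. The example words are taken from the set quoted in Example 3.

```python
def assemble_tag(words, spacer_counts, spacer_base="n", max_spacer=4):
    """Interleave words with runs of a spacer nucleotide.

    words          -- the n words (n from 4 to 10, each 3 to 14 long)
    spacer_counts  -- the n-1 integers giving the run length after each
                      word except the last
    spacer_base    -- placeholder for N (assumed here; any nucleotide)
    """
    if not 4 <= len(words) <= 10:
        raise ValueError("number of words must be between 4 and 10")
    if any(not 3 <= len(w) <= 14 for w in words):
        raise ValueError("word length must be between 3 and 14")
    if len(spacer_counts) != len(words) - 1:
        raise ValueError("need exactly n-1 spacer counts")
    if any(x < 0 or x > max_spacer for x in spacer_counts):
        raise ValueError("spacer counts must lie between 0 and max_spacer")
    if not any(spacer_counts):
        # Assumed reading of the proviso: at least one spacer is present.
        raise ValueError("at least one spacer count must be non-zero")

    parts = []
    for i, word in enumerate(words):
        parts.append(word)
        if i < len(spacer_counts):
            parts.append(spacer_base * spacer_counts[i])
    return "".join(parts)


tag = assemble_tag(["gatt", "tgat", "taga", "tttg"], [1, 0, 2])
print(tag, len(tag))
```

Passing `max_spacer=2` gives the narrower variant in which each spacer count is 0, 1, or 2.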
  • oligonucleotide tags of the above formula are synthesized by the method of the invention
  • words are from three to fourteen nucleotides or basepairs in length, and more preferably, words are from four to six nucleotides or basepairs in length. Most preferably, words are four nucleotides or basepairs in length
  • words consist of a linear sequence of nucleotides selected from the group consisting of A, C, G, and T. For words constructed from 3 of the 4 natural nucleotides, the following word sizes, differences between words of the same set, and set sizes are preferred
  • subsets of the computed sets may be employed so that only words having specified GC content, melting temperature, reduced likelihood of self annealing, hairpin formation, or the like, are used to form tags
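As a rough illustration of how such a subset might be selected, the sketch below filters a candidate list by GC content and drops words that are their own reverse complement (a simple stand-in for self-annealing). The candidate list, the thresholds, and the function names are hypothetical; melting temperature and hairpin screening are not modelled here.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of positions that are g or c."""
    return sum(base in "gc" for base in seq) / len(seq)


def is_self_complementary(seq):
    """True if the sequence could anneal perfectly to a copy of itself."""
    return seq == reverse_complement(seq)


def filter_words(words, gc_min, gc_max):
    """Keep words inside the GC window that are not self-complementary."""
    return [
        w for w in words
        if gc_min <= gc_fraction(w) <= gc_max and not is_self_complementary(w)
    ]


# Hypothetical candidates: two words from Example 3 plus three that fail.
candidates = ["gatt", "tgat", "acgt", "gcgc", "aatt"]
selected = filter_words(candidates, gc_min=0.25, gc_max=0.5)
print(selected)
```

Here "acgt" is rejected as self-complementary, while "gcgc" and "aatt" fall outside the chosen GC window.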
  • the above set sizes were computed using the algorithms listed in Brenner et al., PCT/US96/09513 and allowed U.S. patent application Ser. No. 08/659,453. Exemplary minimally cross-hybridizing sets of words for use with the invention are listed in the following table
  • oligonucleotide tags of the invention are in the range of from 18 to 60 nucleotides in length. More preferably, oligonucleotide tags are in the range of from 18 to 40 nucleotides in length
  • minimally cross-hybridizing sets comprise words that make approximately equivalent contributions to duplex stability as every other word in the set
  • the stability of perfectly matched duplexes between every word and its complement is approximately equal
  • Guidance for selecting such sets is provided by published techniques for selecting optimal PCR primers and calculating duplex stabilities, e.g. Rychlik et al., Nucleic Acids Research 17: 8543-8551 (1989) and 18: 6409-6412 (1990); Breslauer et al., Proc. Natl. Acad. Sci. 83: 3746-3750 (1986); Wetmur.
  • words or sub-assemblies of words are initially synthesized as single stranded oligonucleotides using conventional solid phase synthetic methods, e.g. using a commercial DNA synthesizer, such as PE Applied Biosystems (Foster City, CA) model 392 DNA synthesizer, or like instrument
  • the words or sub-assemblies of words are synthesized within a longer oligonucleotide having appropriate restriction endonuclease recognition sites and primer binding sites to facilitate later manipulation
  • such chemically synthesized oligonucleotides are rendered double stranded by providing a primer which binds to one end of the oligonucleotides and which is extended the length of the oligonucleotides with a DNA polymerase in the presence of the four dNTPs
  • the following oligonucleotide shown in the 5'→3' orientation
  • forward and reverse primers shown below may be used to render the oligonucleotide double stranded so that the indicated restriction endonuclease recognition sites are formed
  • the reverse primer is shown with a fluorescent label attached to its 5' end to facilitate purification. "FAM" is a fluorescein dye available commercially, e.g. from PE Applied Biosystems (Foster City, CA)
  • the 64 double stranded oligonucleotides containing the two-word combinations may be constructed by separately synthesizing both strands and then annealing them together for cloning into a conventional cloning vector
  • the oligonucleotide of Formula I may be synthesized combinatorially, as disclosed in Brenner et al., International patent application PCT/US96/09513, so that a mixture of oligonucleotides is produced, the components of the mixture being oligonucleotides having different words. For example, if the four-base words of Table I are employed, then the mixture corresponding to Formula I would consist of 64 different sequences, i.e. every possible two-word sequence. In embodiments where synthesis errors are eliminated by confirmatory sequencing,
  • the oligonucleotides of Formula I are synthesized separately followed by separate insertion into cloning vectors and sequencing to confirm that each word sequence is correct. As above, if the four-base words of Table I are employed, then 64 separate clonings and sequence determinations would be required. After such confirmatory sequencing, the 64 clones are combined for use in the method of the invention
  • Oligonucleotide tags produced by way of the invention may be assembled from words or sub-assemblies of words either by stepwise additions in a plurality of cycles of cleavage and ligation of preferably identically sized adaptors, or in stages of convergent assembly of fragments, each of such fragments comprising increasingly larger oligonucleotide precursors. Examples of both approaches are illustrated in Figures 1a (stepwise additions) and 1b (convergent assembly)
  • in Figure 1a, vector (100) is prepared for each sequence of words "-w1-w2-"
  • the presence of two words in this example is only for purposes of illustration. In this embodiment, any number of words can be used
  • Adjacent to words (108) are cleavage sites (107) and (109) of type IIs restriction endonucleases, r2 and r3, recognizing sites (106) and (110), respectively. Adjacent to, and upstream of, restriction site (106) is restriction site (104) recognized by restriction endonuclease r1
  • Flanking the entire assembly of restriction sites and words are optional primer binding sites (102) and (112), which may be used to copy the oligonucleotide tag for insertion into a vector as taught by Brenner et al., International application PCT/US96/09513
  • in the preferred embodiment of Figure 1a, vector (100) serves (114) as a starting material for the tag assembly process, i.e. at the start of the process,
  • l in the subscript of insert (120). Note that the process entails the successive insertion of the following element, or cassette: -w-w-(N)x-
  • r3 is virtually any type IIs restriction endonuclease which allows a predictable sequence (109) to be engineered into vector (100)
  • Exemplary r3's include Alw I, Bbs I, Bbv I.
  • r3 leaves a 1 or 2 nucleotide protruding strand after cleavage
  • r2 is virtually any type IIs restriction endonuclease which allows a predictable sequence (107) to be engineered into vector (100)
  • r2 may be selected from the same group of type IIs restriction endonucleases as r3, but preferably for a given vector r2 and r3 are different. Cycles of word addition in the preferred embodiment, illustrated in Figure 1a, begin with the step of cleaving (122) vector (121) with r1 and r2.
  • restriction endonucleases r1 and r3, recognizing restriction sites (104) and (110), respectively, are used to cleave (116) vector (100) to produce fragment (118), which is inserted (126) into opened vector (124) to form vector (128), thereby elongating the oligonucleotide tag precursors by two words
  • the cycles are repeated (130) until an oligonucleotide tag repertoire of the desired length is obtained. At such point,
  • the oligonucleotide tags may be excised from vector (128) by digesting with r2 and r3
  • repertoires may be synthesized in accordance with the invention with a convergent strategy as illustrated in Figure 1b. Vector (150), which may be identical to vector (100), contains the following elements: restriction site (152) for restriction endonuclease r1,
  • vector (150) may also contain flanking primer binding sites as with vector (100) (not shown) for producing copies of the oligonucleotide tags or their precursors
  • Two aliquots (160) and (162) are taken of vector (150). In aliquot (160),
  • vector (150) is digested with r1 and r2 so that fragment (161) is excised and opened vector (166) is formed. Separately, in aliquot (162),
  • vector (150) is digested with r1 and r3 so that 2-word fragment (164) is excised. After purification,
  • 2-word fragment (164) is inserted and ligated (168) into opened vector (166) to form vector (170), which contains oligonucleotide tag precursors consisting of four words each
  • steps are repeated using vector (170) as the starting material. That is, two aliquots (174) and (176) are taken of vector (170). In aliquot (174),
  • vector (170) is digested with r1 and r2 so that fragment (175) is excised and opened vector (180) is formed. Separately, in aliquot (176),
  • vector (170) is digested with r1 and r3 so that 4-word fragment (178) is excised. After purification,
  • 4-word fragment (178) is ligated (182) into opened vector (180) to form vector (184), which contains oligonucleotide tag precursors consisting of eight words each. Additional cycles may be carried out, or if the desired length of the tags is 8 words, then the oligonucleotide tags may be excised (186) by digesting with r2 and r3
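The difference between the two assembly strategies can be made concrete by counting cycles. In the sketch below, the stepwise route adds a fixed two-word fragment per cycle (as in Figure 1a) and the convergent route doubles the precursor length per cycle (as in Figure 1b). The function names and the simple length bookkeeping are illustrative assumptions; no enzymology is modelled.

```python
def stepwise_cycles(start_words, target_words, words_per_cycle=2):
    """Cycles needed when each cycle adds a fixed number of words."""
    if start_words < 1 or words_per_cycle < 1:
        raise ValueError("word counts must be positive")
    cycles, length = 0, start_words
    while length < target_words:
        length += words_per_cycle
        cycles += 1
    return cycles


def convergent_cycles(start_words, target_words):
    """Cycles needed when each cycle joins two precursors of equal length."""
    if start_words < 1:
        raise ValueError("word counts must be positive")
    cycles, length = 0, start_words
    while length < target_words:
        length *= 2
        cycles += 1
    return cycles


# Starting from two-word precursors and aiming for eight-word tags:
print("stepwise:", stepwise_cycles(2, 8))      # 2 -> 4 -> 6 -> 8
print("convergent:", convergent_cycles(2, 8))  # 2 -> 4 -> 8
```

The gap widens for longer tags, since the stepwise count grows linearly with tag length and the convergent count grows logarithmically.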
  • Repertoires of oligonucleotide tags may also be produced in accordance with the invention by repeated additions of words with self-selection during the ligation step
  • the length of the protruding strand produced by cleavage with a type IIs restriction endonuclease is the same as the length of a word
  • A preferred implementation of this embodiment is illustrated in Figure 2. Vector (200), produced from conventional starting materials, includes the following elements: restriction site for r4 (204), restriction site for r5 (206), restriction site for r6 (208), cleavage site (209), a plurality of words (210), restriction site for r7 (212), and a restriction site for r8 (214). As with vector (100),
  • the above series of elements may be flanked by optional primer binding sites (202) and (216) so that the oligonucleotide tag precursors may be conveniently replicated, e.g. by PCR amplification
  • Vector (221), which may be a sample of starting vector (200) or a previously processed vector, is cleaved (224) with r4 and r6 to produce fragment (225) and opened vector (228).
  • r6 is a type IIs restriction endonuclease which cleaves across the upstream-most word of the oligonucleotide tag precursor of vector (228)
  • Vector (228) is actually a mixture by virtue of the different oligonucleotide tag precursors
  • the protruding strand of end (226) is present in N different sequences, where N is the number of words in the minimally cross-hybridizing set being used
  • a sample of vector (200) is cleaved (222) with r4 and r8 to produce fragment (218).
  • Fragment (218) is a mixture containing N² components in this example, where again N is the number of words in the minimally cross-hybridizing set being used. N is to the second power because the fragment contains all possible combinations of two consecutive words. Element (220) of fragment (218) is the single-stranded form of the second, or downstream-most, word of vector (200). Fragment (218) is combined with opened vector (228) under conditions that permit the single stranded forms of the words (220) and (226) to form perfectly matched duplexes. Because of the minimally cross-hybridizing property of the protruding strands, these conditions are readily met. Strands that are not complementary or that contain failure sequences will not form perfectly matched duplexes and will not be ligated. In this sense, the words in the protruding strands are "self-selecting." After insertion and ligation (230), vector (232) is formed.
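The self-selection step can be sketched as a simple filter: a fragment is ligated only if its protruding word is the exact reverse complement of the protruding strand on the opened vector. This is a toy model under stated assumptions. It treats ligation as all-or-nothing on perfect complementarity, represents each fragment as a hypothetical `(first_word, protruding_word)` pair, and uses the eight words quoted in Example 3; real hybridization and ligase fidelity are not modelled.

```python
from itertools import product

COMPLEMENT = str.maketrans("acgt", "tgca")

# Eight four-nucleotide words quoted in Example 3 of this document.
WORDS = ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"]


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def can_ligate(fragment_overhang, vector_overhang):
    """Toy rule: ligation requires a perfectly matched duplex."""
    return fragment_overhang == reverse_complement(vector_overhang)


def self_select(fragments, vector_overhang):
    """Fragments whose protruding word pairs perfectly with the vector end."""
    return [f for f in fragments if can_ligate(f[1], vector_overhang)]


# All N^2 two-word fragments, plus two hypothetical failure sequences:
# a truncated word and a word carrying a single substitution.
fragments = list(product(WORDS, repeat=2))
fragments += [("gatt", "tag"), ("gatt", "tagt")]

# Opened vector whose protruding strand is complementary to the word "taga".
vector_overhang = reverse_complement("taga")
ligated = self_select(fragments, vector_overhang)

print(len(fragments), "fragments offered,", len(ligated), "ligated")
```

Of the 66 fragments offered, only the 8 whose protruding word is exactly "taga" pass the filter, and both failure sequences are excluded.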
  • an oligonucleotide tag repertoire is produced such that each oligonucleotide tag consists of eight words of four nucleotides
  • a vector, corresponding to vector (200), is constructed by first inserting the following oligonucleotide (SEQ ID NO 4), which carries Pac I, Bse RI, Bsp 120, Bbs I, Eco RI, and Bam HI sites in that order, into a Bam HI and Eco RI digested pUC19:
        aattgttaattaaggatgagctcactcctcgggcccgcataagtcttcgaattcg
            caattaattcctactcgagtgaggagcccgggcgtattcagaagcttaagcctag
  • the oligonucleotide of Formula I and forward and reverse primers are synthesized using a conventional DNA synthesizer, e.g. PE Applied Biosystems (Foster City, CA) model 392
  • the oligonucleotide of Formula I is a mixture containing a repertoire of 64 two-word oligonucleotide tag precursors
  • the four-nucleotide words of Table 1 are employed. After amplification by PCR, the amplification product is digested with Bbs I to give the following two products
  • the products are re-ligated, amplified by PCR, and digested with Bbv I to give the following two products
  • any words consisting of failure sequences are selected against by the ligation event, i.e. words with failure sequences will not religate in the mixture, and thus will not be amplified
  • the final product is digested with Pst I and Hind III and inserted into a Pst I/Hind III-digested pUC19 to give the following construct (SEQ ID NO 5)
  • Pst I, Bse RI, Bbs I, Bsp 120, and Bbv I correspond to r4, r5, r6, r7, and r8 of Figure 2, respectively
  • the plasmid is isolated and cleaved with Pst I and Bbs I to give an opened vector with the following upstream and downstream (SEQ ID NO 6) ends ..cgacctgca wordword-gggcccaatgctgcaagcttggcg... ..gctgg word-cccgggttacgacgttcgaaccgc ...
  • gaggagatgaagacga-word acgtctcctctacttctgct-wordword
  • This fragment is inserted into the above vector opened by digestion with Bbs I and Pst I to give the following construct (SEQ ID NO 8)
  • the isolated fragment is then inserted into the Bse RI/Bsp 120 vector of Formula II, which vector is used to transform a suitable host
  • the construct is ready for inserting polynucleotides, such as cDNAs, into the Eco RI restriction site to form tag-polynucleotide conjugates in accordance with the method of Brenner et al., International patent application PCT/US96/09513
  • an oligonucleotide tag repertoire is produced following the procedure outlined in Figure 1b
  • Each oligonucleotide tag consists of eight words of six nucleotides each (selected from those listed in Table I) to give the repertoire having an expected complexity of 9^8, or about 4.3 x 10^7
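The complexity figures quoted in the examples follow from raising the number of words in the set to the number of words per tag. The short sketch below reproduces that arithmetic; the function names are illustrative only.

```python
def repertoire_complexity(set_size, words_per_tag):
    """Number of distinct tags when every word position is filled freely."""
    return set_size ** words_per_tag


def tag_length(word_length, words_per_tag):
    """Tag length in nucleotides, ignoring any spacer nucleotides."""
    return word_length * words_per_tag


# Nine six-nucleotide words, eight words per tag.
print(repertoire_complexity(9, 8), tag_length(6, 8))
# Two-word sub-assemblies from a nine-word and an eight-word set.
print(repertoire_complexity(9, 2), repertoire_complexity(8, 2))
# Eight four-nucleotide words, eight words per tag.
print(repertoire_complexity(8, 8), tag_length(4, 8))
```

A nine-word set with eight words per tag gives 43,046,721 tags, which is the "about 4.3 x 10^7" figure; the same function gives the 81 and 64 two-word combinations mentioned for the nine-word and eight-word sets.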
  • an oligonucleotide (SEQ ID NO 9) of the following form is synthesized, with restriction sites Pst I, Bse RI, Bsp 120
  • the oligonucleotides of Formula III are rendered double stranded and amplified by providing forward and reverse primers and conducting a PCR, as described above for the oligonucleotide of Formula I. After amplification, the oligonucleotides are separately cleaved with Pst I and Hind III and cloned into a similarly cleaved M13mp18 and suitable hosts are transformed. Clones are selected and the oligonucleotide inserts are sequenced using conventional techniques. Such selection and sequencing continue until a vector is obtained for each of the 81 two-word combinations whose sequence is confirmed to be correct.
  • the population of vectors is divided into two parts, after which the vectors in one part are cleaved with Pst I and Bsg I to give the following fragment mixture (SEQ ID NO: 11):
  • Example 3 Construction of an Eight-Word Tag Library an eight-word tag library with four-nucleotide words was constructed from two two-word hbranes in vectors pLCV-2 and pUCSE-2 P ⁇ or to construction of the eight- word tag library, 64 two-word double stranded ohgonucleotides were separately mserted mto pUC 19 vectors and propagated These 64 ohgonucleotides consisted of every possible two-word pair made up of four-nucleotide word selected from an eight-word minimally cross- hyb ⁇ dizmg set descnbed m Brenner, U S patent 5.604,097 After the identities of the mserts were confirmed by sequencmg, the mserts were then amplified by PCR and equal amounts of each amplicon were combmed to form the inserts of the two-word hbranes in vectors, pLCV-2 and pUCSE-2
  • A bacterial host was transformed with the ligation product using electroporation, after which the transformed bacteria were plated, a clone was selected, and the insert of its plasmid was sequenced for confirmation. pUCSE isolated from the clone was then digested with Eco RI and Hind III using the manufacturer's protocol and the large fragment was isolated. The following adaptor (SEQ ID NO 14) was ligated to the large fragment to give plasmid pUCSE-D1, which contained the first di-word (underlined).
  • pUCSE-D2 through pUCSE-D64 were separately constructed from pUCSE-D1 by digesting it with Pst I and Bsp 120 I and separately ligating the following adaptors (SEQ ID NO 15) to the large fragment.
  • the words of the top strand were selected from the following minimally cross-hybridizing set: gatt, tgat, taga, tttg, gtaa, agta, atgt, and aaag. After cloning and isolation, the inserts of the vectors were sequenced to confirm the identities of the di-words.
  • Plasmid cloning vector pLCV-D1 was created from plasmid vector pBC.SK- (Stratagene) as follows, using the following oligonucleotides:
  • Oligonucleotides S-723 and S-724 were kinased, annealed together, and ligated to pBC.SK-, which had been digested with KpnI and XbaI and treated with calf intestinal alkaline phosphatase, to create plasmid pSW143.1.
  • Oligonucleotides S-785 and S-786 were kinased, annealed together, and ligated to plasmid pSW143.1, which had been digested with XhoI and BamHI and treated with calf intestinal alkaline phosphatase, to create plasmid pSW164.02.
  • Oligonucleotides S-960, S-961, S-962, and S-963 were kinased and annealed together to form a duplex consisting of the four oligonucleotides.
  • Plasmid pSW164.02 was digested with XhoI and SapI. The digested DNA was electrophoresed in an agarose gel.
  • Plasmid pUC4K (from Pharmacia) was digested with PstI and electrophoresed in an agarose gel. The approx. 1240 bp product was purified from the appropriate gel slice. The two plasmid products (from pSW164.02 and pUC4K) were ligated together with the S-960/961/962/963 duplex to create plasmid pLCVa.
  • DNA from Adenovirus 5 was digested with PacI and Bsp120I, treated with calf intestinal alkaline phosphatase, and electrophoresed in an agarose gel.
  • The approx. 2853 bp product was purified from the appropriate gel slice. This fragment was ligated to plasmid pLCVa, which had been digested with PacI and Bsp120I, to create plasmid pSW208.14.
  • Plasmid pSW208.14 was digested with XhoI, treated with calf intestinal alkaline phosphatase, and electrophoresed in an agarose gel. The approx. 5374 bp product was purified from the appropriate gel slice. This fragment was ligated to oligonucleotides S-1105 and S-1106 (which had been kinased and annealed together) to produce plasmid pLCVb, which was then digested with Eco RI and Hind III. The large fragment was isolated and ligated to the Formula I adaptor (SEQ ID NO 14) to give pLCV-D1. As above for pUCSE, further plasmids, pLCV-D2 through pLCV-D64,
  • di-words were separately constructed from pLCV-D1 by digesting it with Pst I and Bsp 120 I, isolating the large fragment, and ligating an adaptor of Formula II. After cloning and isolation, the inserts of the vectors were sequenced to confirm the identities of the di-words.
  • Each of the vectors pLCV-Dl through -D64 and pUCSE-Dl through -D64 was separately amplified by PCR.
  • the components of the reaction mixture were as follows.
  • The temperature of the reactions was controlled as follows: 94°C for 3 min; 25 cycles of 94°C for 30 sec, 60°C for 30 sec, and 72°C for 10 sec; followed by 72°C for 3 min, then 4°C.
  • the DF and DR primer binding sites were upstream and downstream portions of the vectors selected to give amplicons of 104 basepairs in length.
  • 5 µl of each PCR product were separated by polyacrylamide gel electrophoresis (20% with 1x TBE) to confirm by visual inspection that the reaction yields were approximately the same for each PCR.
  • The excess biotinylated primers were removed by adding 50 µl of 50% Ultralink (streptavidin-Sepharose, Pierce Chemical Co., Rockford, IL) and vortexing the mixture at room temperature for 30 min.
  • the Ultralink material was separated from the reaction mixture by centrifugation, after which approximately half of the mixture was separated by polyacrylamide gel electrophoresis (20% gel).
  • the 29-basepair band was cut out of the gel and the 29-basepair fragment was eluted using the "crush and soak" method, e.g. Sambrook et al. Molecular Cloning.
  • pNCV3 was constructed by first assembling the following fragment (SEQ ID NO 26) from synthetic oligonucleotides.
  • The di-words of pLCV-2 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 1.
  • Two-word library pUCSE-2 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 1. Vector 1 and insert 1 were combined in a conventional ligation reaction to give three-word library, pUCSE-3.
  • pUCSE-3 was digested with Eco RI, Bbs I, and Pst I.
  • The 4-mer words of pUCSE-4 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 2.
  • pLCV-2 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 3.
  • Vector 3 and insert 2 were then combined in a conventional ligation reaction to give five-word library, pLCV-5.
  • The 5-mer words of pLCV-5 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 3.
  • pUCSE-4 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 4.
  • Vector 4 and insert 3 were then combined in a conventional ligation reaction to give eight-word library, pUCSE-8.
  • The 8-mer words of pUCSE-8 were amplified either by PCR or plasmid expansion, the product was digested with Bse RI and Bsp 120 I, after which the Bse RI-Bsp 120 I fragment was isolated as insert 4.
  • pNCV3 was digested with Bse RI, Bsp 120 I, and Sac I, after which the large fragment was isolated and treated with calf intestine alkaline phosphatase to give vector 5.
  • Vector 5 was then combined with insert 4 in a conventional ligation reaction to give the eight-word library pNCV3-8.
  • <223> Preferably, contains fluorescent label.
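The library names in the convergent assembly of Example 3 (pUCSE-3, pLCV-5, pUCSE-8) are consistent with each ligation joining two libraries that share one word at the junction. The following minimal sketch tracks word counts only. The one-word overlap is an inference from the library names, not a statement in the text, and the step that produces pUCSE-4 is not shown in the excerpt above, so its inputs here are assumed. The helper name `ligate` is hypothetical.

```python
def ligate(vector_words, insert_words, overlap=1):
    """Word count of a library formed by joining a vector and an insert.

    Assumption: the two parts share `overlap` word(s) at the junction,
    because cleavage takes place at a word of the tag precursor.
    """
    return vector_words + insert_words - overlap

# Starting two-word libraries named in Example 3.
libraries = {"pLCV-2": 2, "pUCSE-2": 2}

# vector 1 (pUCSE-2) + insert 1 (pLCV-2 di-words)
libraries["pUCSE-3"] = ligate(libraries["pUCSE-2"], libraries["pLCV-2"])
# Assumed step, elided in the excerpt: pUCSE-3 vector + a di-word insert
libraries["pUCSE-4"] = ligate(libraries["pUCSE-3"], libraries["pLCV-2"])
# vector 3 (pLCV-2) + insert 2 (from pUCSE-4)
libraries["pLCV-5"] = ligate(libraries["pLCV-2"], libraries["pUCSE-4"])
# vector 4 (pUCSE-4) + insert 3 (from pLCV-5)
libraries["pUCSE-8"] = ligate(libraries["pUCSE-4"], libraries["pLCV-5"])

for name, words in libraries.items():
    print(f"{name}: {words} words")
```

Under these assumptions the word counts agree with the numeric suffix of every library name.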

Abstract

The invention provides oligonucleotide tag compositions and methods for synthesizing repertoires of error-free oligonucleotide tags that may be used for labeling and sorting polynucleotides, such as cDNAs, restriction fragments, and the like. In accordance with the method of the invention, oligonucleotide tag precursors are provided in an amplicon, wherein the tag precursors each consists of one or more oligonucleotide 'words' selected from the same minimally cross-hybridizing set of words. The oligonucleotide tag precursors are elongated by repeated cycles of cleavage, ligation of one or more words, and amplification. Cycles continue until the oligonucleotide tags of the repertoire have a desired length or complexity.

Description

ENZYMATIC SYNTHESIS OF OLIGONUCLEOTIDE TAGS
Field of the Invention
The invention relates generally to methods for synthesizing collections of minimally cross-hybridizing oligonucleotide tags for identifying, sorting, and/or tracking molecules, especially polynucleotides.
BACKGROUND
Specific hybridization of oligonucleotides and their analogs is a fundamental process that is employed in a wide variety of research, medical, and industrial applications, including the identification of disease-related polynucleotides in diagnostic assays, screening for clones of novel target polynucleotides, identification of specific polynucleotides in blots of mixtures of polynucleotides, amplification of specific target polynucleotides, therapeutic blocking of inappropriately expressed genes, DNA sequencing, and the like, e.g. Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition (Cold Spring Harbor Laboratory, New York, 1989); Keller and Manak, DNA Probes, 2nd Edition (Stockton Press, New York, 1993); Milligan et al, J. Med. Chem., 36: 1923-1937 (1993); Drmanac et al, Science, 260: 1649-1652 (1993); Bains, J. DNA Sequencing and Mapping, 4: 143-150 (1993). Specific hybridization has also been proposed as a method of tracking, retrieving, and identifying compounds labeled with oligonucleotide tags, e.g. Brenner, International application PCT/US95/12791; Church et al, Science, 240: 185-188 (1988); Brenner and Lerner, Proc. Natl. Acad. Sci., 89: 5381-5383 (1992); Alper, Science, 264: 1399-1401 (1994); Chetverin et al, Biotechnology, 12: 1093-1099 (1994); and Needels et al, Proc. Natl. Acad. Sci., 90: 10700-10704 (1993).
The successful implementation of such tagging and sorting schemes depends in large part on the success in achieving specific hybridization between a tag and its complement. That is, for an oligonucleotide tag to successfully identify a substance, the number of false positive and false negative signals brought about by incorrect hybridizations must be minimized. And for oligonucleotide tags to effectively sort molecules, the number of tags hybridized to complements at incorrect sites must be minimized.
Unfortunately, incorrect hybridizations brought about by the creation of stable duplexes containing mismatches are not uncommon because base pairing and base stacking free energies vary widely among nucleotides in a duplex or triplex structure. For example, a duplex consisting of a repeated sequence of deoxyadenosine (A) and thymidine (T) bound to its complement may have less stability than an equal-length duplex consisting of a repeated sequence of deoxyguanosine (G) and deoxycytidine (C) bound to a partially complementary target containing a mismatch. Thus, if a desired compound from a large combinatorial chemical library were tagged with the former oligonucleotide, a significant possibility would exist that, under hybridization conditions designed to detect perfectly matched AT-rich duplexes, undesired compounds labeled with the GC-rich oligonucleotide, even in a mismatched duplex, would be detected or sorted along with the perfectly matched duplexes consisting of the AT-rich tag. Even though reagents, such as tetramethylammonium chloride, are available to negate base-specific stability differences of oligonucleotide duplexes, the effect of such reagents is often limited and their presence can be incompatible with, or render more difficult, further manipulations of the selected compounds, e.g. amplification by polymerase chain reaction (PCR), or the like.
Such problems have been addressed in the "solid phase" cloning technique, described in Brenner, International application PCT/US95/12791, by the development of oligonucleotide tags synthesized combinatorially from a set of so-called minimally cross-hybridizing oligonucleotides, or "words." The words, which are oligonucleotides usually 3 to 6 nucleotides in length, differ from every other member of the same set by at least two nucleotides. Thus, a given word cannot form a duplex with the complement of any other word of the set with fewer than two mismatches. Of course, minimally cross-hybridizing sets are preferably formed from words differing from one another by even more than two nucleotides.
In such a scheme, different oligonucleotide tags constructed from concatenations of such words will differ from one another by at least two nucleotides, or by at least the number of nucleotides that their component words differ by. Therefore, by judiciously selecting word length, differences between words in a set, and the number of words per tag, one can obtain a large set, or repertoire, of oligonucleotide tags that each differ from one another by a significant percentage of their nucleotides. Such repertoires permit tagging and sorting of molecules with a much higher degree of specificity than ordinary oligonucleotides.
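The claim that concatenated tags inherit the minimum difference of their words can be checked numerically. The sketch below uses the eight four-nucleotide words quoted in this document and compares sequences position by position only. It is a simplified illustration that ignores shifted alignments and duplex thermodynamics; the function names are hypothetical.

```python
from itertools import combinations, product

# Four-nucleotide word set quoted in this document (Example 3 and Table I).
WORDS = ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(seqs):
    """Smallest Hamming distance between any two distinct sequences."""
    return min(hamming(a, b) for a, b in combinations(seqs, 2))

# Every two-word tag formed by concatenating words of the set.
two_word_tags = ["".join(pair) for pair in product(WORDS, repeat=2)]

word_distance = min_pairwise_distance(WORDS)
tag_distance = min_pairwise_distance(two_word_tags)

print(f"{len(WORDS)} words, minimum difference {word_distance}")
print(f"{len(two_word_tags)} two-word tags, minimum difference {tag_distance}")
```

Two tags that differ in a single word differ by exactly the distance between those two words, so the tag-level minimum cannot fall below the word-level minimum.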
Unfortunately, current methods of solid phase synthesis, although highly efficient, still lead to a significant fraction of failure sequences when oligonucleotide tags start to exceed 30 to 40 nucleotides in length. The presence of such failure sequences can have a significant impact on solid phase cloning and sorting schemes, such as the one described in Brenner (cited above). When tag complements are synthesized separately from their corresponding oligonucleotide tags, the presence of different sets of failure sequences among the two reaction products means that not every oligonucleotide from one reaction will necessarily have a complementary oligonucleotide among products of the other reaction. In particular, failure sequences produced in one reaction will generally not have complementary failure sequences produced in the other reaction. While this is not a problem for tag complements combinatorially synthesized on solid phase supports, because the number and kind of failures are randomly distributed among a population of predominantly correct-sequence oligonucleotides, for tags attached to DNAs which are sampled and amplified, a significant probability exists that if one or more of the sampled tags contain failure sequences, no solid phase support will exist for them that has a population of perfect complements. Consequently, DNAs with such tags cannot be effectively sorted.
In view of the above, it would be useful if there were available a method of producing oligonucleotide tags which would avoid or minimize the chance of there being sampled and amplified tags that contain failure sequences.
Summary of the Invention
Accordingly, objectives of my invention include, but are not limited to, providing a method of synthesizing oligonucleotide tags which minimizes the production of failure sequences; providing an enzymatic method of synthesizing oligonucleotide tags by the combinatorial addition of words; providing a method of convergent synthesis of oligonucleotide tags from error-free components; providing a method of constructing tag-DNA conjugates whose tags are free of failure sequences; and providing compositions comprising novel oligonucleotide tags.
My invention achieves these and other objectives by providing a method of synthesizing oligonucleotide tags that comprises successive cycles of cleavage of an oligonucleotide tag precursor to permit the ligation of one or more words from a minimally cross-hybridizing set, ligation of the one or more words, and amplification of the ligated structure. Preferably, repertoires of oligonucleotide tags of a predetermined length are assembled from words, or sub-assemblies of words, that are free of failure sequences. Preferably, such error-free words or sub-assemblies of words are obtained either by separately synthesizing and sequencing individual words or sub-assemblies of words prior to assembly, or by successive ligations of adaptors having protruding strands consisting of word sequences that select complementary word sequences on the protruding strand of a growing tag. Preferably, in the former embodiment, words or sub-assemblies of words are inserted into and maintained in conventional cloning vectors, after which they are sequenced to confirm that no errors are present. For use in the method of the invention, the words or sub-assemblies of words are excised from the vectors, mixed, and ligated to an oligonucleotide tag precursor.
Preferably, in the latter embodiment, error-containing words are excluded from the assembly process by requiring that the single stranded form of each added word anneal to a perfectly matched complement of an oligonucleotide tag precursor in a ligation step. If a mismatch exists because a failure sequence is present in one of the strands, no ligation will take place, either precluding further growth of the tag if the failure is carried by its protruding strand, or promoting the annealing of a different word if the failure is carried by the word being added. The invention further includes repertoires of oligonucleotide tags consisting of a plurality of words wherein at least two words of the plurality are separated by one or two nucleotides. The present invention overcomes difficulties in sorting polynucleotides with oligonucleotide tags synthesized by currently available methods. By providing oligonucleotide tags free of failure sequences, sampled and amplified tag-polynucleotide conjugates are assured of finding a tag complement with which to form a perfectly matched duplex.
Brief Description of the Drawings
Figure 1a illustrates a preferred embodiment of the invention in which oligonucleotide tags are assembled by successive additions of one or more words to an oligonucleotide tag precursor.
Figure 1b illustrates a preferred embodiment of the invention in which oligonucleotide tags are assembled by convergent additions of increasingly larger sub-assemblies of words.
Figure 2 illustrates a preferred embodiment of the invention wherein oligonucleotide tags are assembled by successive additions and self-selection of words to an oligonucleotide tag precursor.
Definitions
As used herein, the term "word" means an oligonucleotide selected from a minimally cross-hybridizing set of oligonucleotides, as disclosed in U.S. patent 5,604,097, International patent application PCT/US96/09513, and allowed U.S. patent application Ser. No. 08/659,453, which references are incorporated by reference. An oligonucleotide tag of the invention consists of a plurality of words, or oligonucleotide subunits, that are selected from the same minimally cross-hybridizing set. In such a set, a duplex or triplex consisting of a word of the set and the complement of any other word of the same set contains at least two mismatches. Preferably, a duplex or triplex consisting of a word of the set and the complement of any other word of the same set contains an even larger minimum number of mismatches, e.g. 3, 4, 5, or 6, depending on the length of the words. Still more preferably, the minimum number of mismatches is either 1, 2, or 3 less than the length of the word. Most preferably, the minimum number of mismatches is 1 or 2 less than the length of the word.
"Complement" or "tag complement" as used herein in reference to oligonucleotide tags refers to an oligonucleotide to which an oligonucleotide tag specifically hybridizes to form a perfectly matched duplex or triplex. In embodiments where specific hybridization results in a triplex, the oligonucleotide tag may be selected to be either double stranded or single stranded. Thus, where triplexes are formed, the term "complement" is meant to encompass either a double stranded complement of a single stranded oligonucleotide tag or a single stranded complement of a double stranded oligonucleotide tag. Usually, populations of identical tag complements are attached to a spatially defined region of a solid phase support. Preferably, such solid phase supports are microparticles and the defined region is the entire surface of the microparticle.
The term "oligonucleotide" as used herein includes linear oligomers of natural or modified monomers or linkages, including deoxyribonucleosides, ribonucleosides, anomeric forms thereof, peptide nucleic acids (PNAs), and the like, capable of specifically binding to a target polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, or the like. Usually monomers are linked by phosphodiester bonds or analogs thereof to form oligonucleotides ranging in size from a few monomeric units, e.g. 3-4, to several tens of monomeric units. Whenever an oligonucleotide is represented by a sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that upper or lower case "A" denotes deoxyadenosine, upper or lower case "C" denotes deoxycytidine, upper or lower case "G" denotes deoxyguanosine, and upper or lower case "T" denotes thymidine, unless otherwise noted. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoranilidate, phosphoramidate, and the like. Usually oligonucleotides of the invention comprise the four natural nucleotides; however, they may also comprise non-natural nucleotide analogs. It is clear to those skilled in the art when oligonucleotides having natural or non-natural nucleotides may be employed, e.g. where processing by enzymes is called for, usually oligonucleotides consisting of natural nucleotides are required.
"Perfectly matched" in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick basepairing with a nucleotide in the other strand. The term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed. In reference to a triplex, the term means that the triplex consists of a perfectly matched duplex and a third strand in which every nucleotide undergoes Hoogsteen or reverse Hoogsteen association with a basepair of the perfectly matched duplex. Conversely, a "mismatch" in a duplex between a tag and an oligonucleotide means that a pair or triplet of nucleotides in the duplex or triplex fails to undergo Watson-Crick and/or Hoogsteen and/or reverse Hoogsteen bonding.
As used herein, the term "complexity" in reference to a population of polynucleotides means the number of different species of molecule present in the population.
As used herein, the term "failure sequence" refers to a synthetic oligonucleotide or polynucleotide that does not have the correct, or intended, length and/or sequence because of a failure in a step of the synthetic process, e.g. spurious chain initiation, failure of a coupling step, failure of a capping step, chain scission, or the like.
As used herein, "amplicon" means the product of an amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from a few starting sequences. Preferably, amplicons are produced either in a polymerase chain reaction (PCR) or by replication in a cloning vector.
Detailed Description of the Invention
The invention provides an enzymatic method for synthesizing a repertoire of oligonucleotide tags whose members are substantially free of failure sequences. Oligonucleotide tags are combinatorially synthesized by the assembly of error-free words or sub-assemblies of words in a series of enzymatic steps. Generally, the method of the invention comprises the following steps: (a) providing a repertoire of oligonucleotide tag precursors in an amplicon, the oligonucleotide tag precursors each comprising one or more words, and each of the one or more words being selected from the same minimally cross-hybridizing set; (b) cleaving the amplicon at a word in each of the oligonucleotide tag precursors to form one or more ligatable ends on each oligonucleotide tag precursor; (c) ligating one or more words to the one or more ligatable ends to elongate each of the oligonucleotide tag precursors; (d) amplifying the elongated oligonucleotide tag precursors in the amplicon; and (e) repeating steps (b) through (d) until a repertoire of oligonucleotide tags having the predetermined length is formed.
The repertoire of oligonucleotide tags of the desired length contained in the final amplicon may then be inserted into a convenient cloning vector, as taught by Brenner et al, International patent application PCT/US96/09513. Preferably, each of the oligonucleotide tag precursors has the same length, which is determined by word length, the number of words making up the initial oligonucleotide tag precursor, and the stage of the assembly process, i.e. how many words or sub-assemblies of words have been added by operation of the method of the invention. Preferably, the amplicon of the method is a population of cloning vectors wherein different oligonucleotide tags or oligonucleotide tag precursors are represented in equal proportions as inserts of such vectors. Preferably, whenever the oligonucleotide tag precursors are cleaved for the ligation of an additional word or sub-assembly of words, the cleavage takes place at the same word for all the oligonucleotide tag precursors of the repertoire. Preferably, the step of cleaving is carried out with a type IIs restriction endonuclease which cleaves at the same word for all the oligonucleotide tag precursors of the repertoire and produces ligatable ends having protruding strands. As used herein, the term "ligatable ends" means ends of a double stranded DNA that can be ligated to another double stranded DNA, including blunt-end ligation and "sticky" end ligation. Preferably, ligatable ends are sticky ends.
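Steps (a) through (e) can be pictured as a loop over an in-silico repertoire. The following is a toy sketch, not the enzymatic procedure itself: tags are plain strings, cleavage and ligation are idealized as prepending a sub-assembly to the upstream end, amplification is idealized as keeping one copy of each product, and restriction sites and intervening nucleotides are not modeled. A three-word subset is used so the numbers stay small; the function names are hypothetical.

```python
from itertools import product

def elongate(precursors, additions):
    """One idealized cycle (steps b to d): join every addition upstream of every precursor."""
    return sorted({add + pre for add, pre in product(additions, precursors)})

def build_repertoire(words, words_per_addition, target_words):
    """Grow tags by repeated cycles until they hold at least `target_words` words."""
    word_length = len(words[0])
    additions = ["".join(p) for p in product(words, repeat=words_per_addition)]
    repertoire = list(additions)  # step (a): initial tag precursors
    while len(repertoire[0]) // word_length < target_words:  # step (e)
        repertoire = elongate(repertoire, additions)
    return repertoire

# Illustrative three-word subset of the four-nucleotide set in this document.
subset = ["gatt", "tgat", "taga"]
tags = build_repertoire(subset, words_per_addition=2, target_words=4)

print(len(tags), "tags of", len(tags[0]), "nucleotides")
```

With three words and two-word additions, one cycle takes the repertoire from 9 precursors to 81 four-word tags, which shows how complexity multiplies with each cycle.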
The invention further includes repertoires of oligonucleotide tags defined by the following formula:

w_1(N)_{x_1} w_2(N)_{x_2} ... (N)_{x_{n-1}} w_n

wherein w_1, w_2, ..., w_n are words selected from the same minimally cross-hybridizing set, the words having a length of from three to fourteen nucleotides or basepairs; n is an integer in the range of from 4 to 10; N is a nucleotide or basepair; and x_1, x_2, ..., x_{n-1} are each an integer indicating how many nucleotides or basepairs, N, are present at the given location in the sequence of words, x_1, x_2, ..., x_{n-1} each being selected from the group consisting of 0, 1, 2, 3, and 4, provided that at least one of x_1, x_2, ..., x_{n-1} is 1, 2, 3, or 4. Preferably, x_1, x_2, ..., x_{n-1} are each selected from the group consisting of 0, 1, and 2, provided that at least one of x_1, x_2, ..., x_{n-1} is 1 or 2. Preferably, oligonucleotide tags of the above formula are synthesized by the method of the invention.
Preferably, words are from three to fourteen nucleotides or basepairs in length, and more preferably, words are from four to six nucleotides or basepairs in length. Most preferably, words are four nucleotides or basepairs in length. Usually, words consist of a linear sequence of nucleotides selected from the group consisting of A, C, G, and T. For words constructed from 3 of the 4 natural nucleotides, the following word sizes, differences between words of the same set, and set sizes are preferred:
Word Length    Difference Between Words    Set Size
4              3                           8
5              4                           6
6              4                           9
7              5                           8
8              5                           16
8              6                           9
In some embodiments employing words of the above characteristics, subsets of the computed sets may be employed so that only words having specified GC content, melting temperature, reduced likelihood of self annealing, hairpin formation, or the like, are used to form tags. The above set sizes were computed using the algorithms listed in Brenner et al, PCT/US96/09513 and allowed U.S. patent application Ser. No. 08/659,453. Exemplary minimally cross-hybridizing sets of words for use with the invention are listed in the following table.
Table I Exemplary Sets of Minimally Cross-Hybridizing Words
Number of Nucleotides per Word (Minimal No. of Mismatches)
4(3)    5(4)     6(4)      7(5)       8(5)
gatt    tagta    gattag    gtaaaat    atgagtat
tgat    aaaag    agagtt    aaaagga    aggaagtg
taga    agggt    agttga    aaggaag    agggtaga
tttg    ggtaa    gagatt    aattttt    agttgaag
gtaa    gtatt    gttgg-    ggaggtg    gagatggt
agta    tttgg    tggttg    gggtaga    gaggatag
atgt             ttagag    tgtataa    gagtgata
aaag             ttgaga    ttattgg    ggaagtga
                 atgtat               ggatagat
                                      gtaatatg
                                      gttgggaa
                                      tatagttg
                                      tattagga
                                      tgtgttat
                                      ttatgagt
                                      ttgttgag
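The column headings of Table I state a word length and a minimal number of mismatches for each set. A short check of the first two columns is sketched below, comparing words position by position and also listing which nucleotides each set uses. Only the 4(3) and 5(4) columns are included because they are fully legible in the table; the function names are hypothetical.

```python
from itertools import combinations

# First two columns of Table I, keyed by (word length, stated minimal mismatches).
TABLE_I = {
    (4, 3): ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"],
    (5, 4): ["tagta", "aaaag", "agggt", "ggtaa", "gtatt", "tttgg"],
}

def mismatches(a, b):
    """Position-by-position differences between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))

def check_set(words, length, stated_minimum):
    """Summarize one column: nucleotides used and the observed minimum mismatch count."""
    if any(len(w) != length for w in words):
        raise ValueError("word length does not match the column heading")
    observed = min(mismatches(a, b) for a, b in combinations(words, 2))
    return {
        "alphabet": sorted(set("".join(words))),
        "observed_minimum": observed,
        "meets_stated_minimum": observed >= stated_minimum,
    }

report = {key: check_set(words, *key) for key, words in TABLE_I.items()}

for (length, stated), result in report.items():
    print(length, stated, result)
```

Both columns use only a, g, and t, in line with the statement that these words are constructed from 3 of the 4 natural nucleotides.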
The length of oligonucleotide tags in a repertoire may vary widely depending on several factors, including the size or complexity of the repertoire desired, the difficulty in synthesizing corresponding tag complements on solid phase supports, the particular application, and the like. Generally, longer oligonucleotide tags permit the generation of larger repertoires; however, reliable synthesis of tag complements that exceed 40-50 nucleotides becomes increasingly difficult, and monitoring and/or exercising quality control of mixtures of oligonucleotides becomes increasingly difficult as complexity increases. Thus, selection of particular tag lengths and complexities requires design tradeoffs by a practitioner of ordinary skill. Preferably, oligonucleotide tags of the invention are in the range of from 18 to 60 nucleotides in length. More preferably, oligonucleotide tags are in the range of from 18 to 40 nucleotides in length.
Preferably, minimally cross-hybridizing sets comprise words that make approximately equivalent contributions to duplex stability as every other word in the set. In this way, the stability of perfectly matched duplexes between every word and its complement is approximately equal. Guidance for selecting such sets is provided by published techniques for selecting optimal PCR primers and calculating duplex stabilities, e.g. Rychlik et al, Nucleic Acids Research, 17: 8543-8551 (1989) and 18: 6409-6412 (1990); Breslauer et al, Proc. Natl. Acad. Sci., 83: 3746-3750 (1986); Wetmur, Crit. Rev. Biochem. Mol. Biol., 26: 227-259 (1991); and the like. For shorter tags, e.g. about 30 nucleotides or less, the algorithm described by Rychlik and Wetmur is preferred, and for longer tags, e.g. about 30-35 nucleotides or greater, an algorithm disclosed by Suggs et al, pages 683-693 in Brown, editor, ICN-UCLA Symp. Dev. Biol., Vol. 23 (Academic Press, New York, 1981) may be conveniently employed. Clearly, there are many approaches available to one skilled in the art for designing sets of minimally cross-hybridizing words within the scope of the invention. For example, to minimize the effects of different base-stacking energies of terminal nucleotides when words are assembled, words may be provided that have the same terminal nucleotides. In this way, when subunits are linked, the sum of the base-stacking energies of all the adjoining terminal nucleotides will be the same, thereby reducing or eliminating variability in tag melting temperatures.
For use with the invention, words or sub-assemblies of words are initially synthesized as single stranded oligonucleotides using conventional solid phase synthetic methods, e.g. using a commercial DNA synthesizer, such as PE Applied Biosystems (Foster City, CA) model 392 DNA synthesizer, or like instrument. Preferably, the words or sub-assemblies of words are synthesized within a longer oligonucleotide having appropriate restriction endonuclease recognition sites and primer binding sites to facilitate later manipulation. Preferably, such chemically synthesized oligonucleotides are rendered double stranded by providing a primer which binds to one end of the oligonucleotides and which is extended the length of the oligonucleotides with a DNA polymerase in the presence of the four dNTPs. For example, in a preferred embodiment the following oligonucleotide (shown in the 5'→3' orientation) containing two words may be synthesized chemically (SEQ ID NO 1):
Pst I Bse RI Bbs I Bsp 120 Bbv I Hind III
cgacacctgcagaggagatgaagacga [word] [word] gggcccatgctgcaagcttaccg
Formula I
In this example, forward and reverse primers shown below may be used to render the oligonucleotide double stranded so that the indicated restriction endonuclease recognition sites are formed:
5'-cgacacctgcagaggag          5'-FAM-cggtaagcttgcagcat
Forward primer                Reverse primer
(SEQ ID NO 2)                 (SEQ ID NO 3)

Here the reverse primer is shown with a fluorescent label attached to its 5' end to facilitate purification. "FAM" is a fluorescein dye available commercially, e.g. from PE Applied Biosystems (Foster City, CA). Alternatively, the 64 double stranded oligonucleotides containing the two-word combinations may be constructed by separately synthesizing both strands and then annealing them together for cloning into a conventional cloning vector.
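The relationship between Formula I and the two primers can be confirmed with plain string operations. The sketch below copies the flanking sequences and primers from the text, fills the two word positions with an arbitrary pair from the four-nucleotide set, and checks that the forward primer matches the 5' end of the oligonucleotide and that the reverse primer is the reverse complement of its 3' end. The FAM label is left out, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(seq))

# Flanking sequences of Formula I and the primers of SEQ ID NO 2 and 3.
LEFT_FLANK = "cgacacctgcagaggagatgaagacga"
RIGHT_FLANK = "gggcccatgctgcaagcttaccg"
FORWARD_PRIMER = "cgacacctgcagaggag"
REVERSE_PRIMER = "cggtaagcttgcagcat"  # 5' FAM label omitted

def formula_i(word1, word2):
    """Assemble the Formula I top strand around two words."""
    return LEFT_FLANK + word1 + word2 + RIGHT_FLANK

# Arbitrary illustrative word pair; any pair from the set could be used.
oligo = formula_i("gatt", "tgat")

forward_matches = oligo.startswith(FORWARD_PRIMER)
reverse_matches = oligo.endswith(reverse_complement(REVERSE_PRIMER))

print(len(oligo), forward_matches, reverse_matches)
```

With two four-nucleotide words the assembled oligonucleotide is 58 nucleotides long, and both primers anneal at its ends as the text describes.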
In embodiments where synthesis errors are eliminated by "self-selection" (described more fully below), the oligonucleotide of Formula I may be synthesized combinatorially, as disclosed in Brenner et al, International patent application PCT/US96/09513, so that a mixture of oligonucleotides is produced, the components of the mixture being oligonucleotides having different words. For example, if the four-base words of Table I are employed, then the mixture corresponding to Formula I would consist of 64 different sequences, i.e. every possible two-word sequence. In embodiments where synthesis errors are eliminated by confirmatory sequencing, the oligonucleotides of Formula I are synthesized separately, followed by separate insertion into cloning vectors and sequencing to confirm that each word sequence is correct. As above, if the four-base words of Table I are employed, then 64 separate clonings and sequence determinations would be required. After such confirmatory sequencing, the 64 clones are combined for use in the method of the invention.
Oligonucleotide tags produced by way of the invention may be assembled from words or sub-assemblies of words either by stepwise additions in a plurality of cycles of cleavage and ligation of preferably identically sized adaptors, or in stages of convergent assembly of fragments, each of such fragments comprising increasingly larger oligonucleotide precursors. Examples of both approaches are illustrated in Figures 1a (stepwise additions) and 1b (convergent assembly). In Figure 1a, vector (100) is prepared for each sequence of words "-w1-w2-". The presence of two words in this example is only for purposes of illustration. In this embodiment, any number of words can be used. The practical constraint is the requirement that vector (100) be prepared for every sequence of words. Thus, if three four-base words of Table I are employed, then 512 (=8x64) vectors must be prepared and their sequences confirmed.
Adjacent to words (108) are cleavage sites (107) and (109) of type IIs restriction endonucleases, r2 and r3, recognizing sites (106) and (110), respectively. Adjacent to, and upstream of, restriction site (106) is restriction site (104) recognized by restriction endonuclease r1. Flanking the entire assembly of restriction sites and words are optional primer binding sites (102) and (112), which may be used to copy the oligonucleotide tag for insertion into a vector as taught by Brenner et al, International application PCT/US96/09513. In the preferred embodiment of Figure 1a, vector (100) serves (114) as a starting material for the tag assembly process, i.e. at the start of the process, i=1 in the subscript of insert (120). Note that the process entails the successive insertion of the following element, or cassette: -w-w-(N)k-
where "w" is a word, "N" is a nucleotide, and k is an integer equal to 1, 2, 3, or 4. The term "(N)k" is equivalent to element (109) of Figure 1a. As described above, preferably k is equal to 1 or 2, which is the length of the protruding strand resulting from cleavage with the preferred type IIs restriction endonucleases of the invention. r3 is virtually any type IIs restriction endonuclease which allows a predictable sequence (109) to be engineered into vector (100). Exemplary r3's include Alw I, Bbs I, Bbv I, Bci VI, Bpm I, Bsa MI, Bse GI, Bsr DI, Ear I, Fau I, Mbo II, and the like. Preferably, r3 leaves a 1 or 2 nucleotide protruding strand after cleavage. Likewise, r2 is virtually any type IIs restriction endonuclease which allows a predictable sequence (107) to be engineered into vector (100). r2 may be selected from the same group of type IIs restriction endonucleases as r3, but preferably for a given vector r3 and r2 are different. Cycles of word addition in the preferred embodiment, illustrated in Figure 1a, begin with the step of cleaving (122) vector (121) with r1 and r2, to remove segment (123), thereby leaving opened vector (124), which is then isolated using conventional protocols. In this embodiment, r2 cleaves the oligonucleotide tag precursor at the upstream-most word of the tag. Separately, restriction endonucleases r1 and r3 recognizing restriction sites (104) and (110), respectively, are used to cleave (116) vector (100) to produce fragment (118), which is inserted (126) into opened vector (124) to form vector (128), thereby elongating the oligonucleotide tag precursors by two words. The cycles are repeated (130) until an oligonucleotide tag repertoire of the desired length is obtained. At such point,
the oligonucleotide tags may be excised from vector (128) by digesting with r2 and r3. Alternatively, repertoires may be synthesized in accordance with the invention with a convergent strategy as illustrated in Figure 1b. Vector (150), which may be identical to vector (100), contains the following elements: restriction site (152) for restriction endonuclease r1, restriction site (154) for restriction endonuclease r2, which has cleavage site (155), one or more words (156), and restriction site (158), which has cleavage site (157). Optionally, vector (150) may also contain flanking primer binding sites as with vector (100) (not shown) for producing copies of the oligonucleotide tags or their precursors. Two aliquots (160) and (162) are taken of vector (150). In aliquot (160), vector (150) is digested with r1 and r2 so that fragment (161) is excised and opened vector (166) is formed. Separately, in aliquot (162), vector (150) is digested with r1 and r3 so that 2-word fragment (164) is excised. After purification, 2-word fragment (164) is inserted and ligated (168) into opened vector (166) to form vector (170), which contains oligonucleotide tag precursors consisting of four words each. These steps are repeated using vector (170) as the starting material. That is, two aliquots (174) and (176) are taken of vector (170). In aliquot (174), vector (170) is digested with r1 and r2 so that fragment (175) is excised and opened vector (180) is formed. Separately, in aliquot (176), vector (170) is digested with r1 and r3 so that 4-word fragment (178) is excised. After purification, 4-word fragment (178) is ligated (182) into opened vector (180) to form vector (184), which contains oligonucleotide tag precursors consisting of eight words each. Additional cycles may be carried out, or if the desired length of the tags is 8 words, then the oligonucleotide tags may be excised (186) by digesting with r2 and r3.
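The practical difference between the stepwise scheme of Figure 1a and the convergent scheme of Figure 1b is the number of cleavage-and-ligation cycles needed to reach a given tag length. The following sketch counts cycles under the simplifying assumptions that both schemes start from a two-word vector, that each stepwise cycle adds two words, and that each convergent cycle doubles the word count; the function names are hypothetical.

```python
# Sketch: cycles needed to reach a target tag length under each strategy.
# Assumptions: start from a two-word vector; stepwise adds 2 words per cycle;
# convergent assembly doubles the number of words per cycle.


def stepwise_cycles(target_words, start_words=2, added_per_cycle=2):
    """Cycles of word addition (Figure 1a style) to reach target_words."""
    cycles, n = 0, start_words
    while n < target_words:
        n += added_per_cycle
        cycles += 1
    return cycles


def convergent_cycles(target_words, start_words=2):
    """Cycles of convergent assembly (Figure 1b style) to reach target_words."""
    cycles, n = 0, start_words
    while n < target_words:
        n *= 2
        cycles += 1
    return cycles


for target in (4, 8, 16, 32):
    print(target, stepwise_cycles(target), convergent_cycles(target))
```

For the eight-word tags described above, this gives three stepwise cycles (2, 4, 6, 8 words) against two convergent cycles (2, 4, 8 words).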
Repertoires of oligonucleotide tags may also be produced in accordance with the invention by repeated additions of words with self-selection during the ligation step. In this embodiment, the length of the protruding strand produced by cleavage with a type IIs restriction endonuclease is the same as the length of a word. When an oligonucleotide tag precursor is cleaved at a word, cleavage occurs precisely at the upstream and downstream boundaries of a word, i.e. across a word, as shown below:

    cleavage site
          ↓
5'-...nnnn-xxxx-xxxx-xxxx-nnnnn...
3'-...nnnn-xxxx-xxxx-xxxx-nnnnn...
               ↑
         cleavage site

               ↓

5'-...nnnn      xxxx-xxxx-xxxx-nnnnn
3'-...nnnn-xxxx      xxxx-xxxx-nnnnn
where the segments "-xxxx-" represent words consisting of four nucleotides each. Preferably, in this embodiment, word lengths of either 3, 4, or 5 nucleotides are employed. A preferred implementation of this embodiment is illustrated in Figure 2. Vector (200), produced from conventional starting materials, includes the following elements: restriction site for r4 (204), restriction site for r5 (206), restriction site for r6 (208), cleavage site (209), a plurality of words (210), restriction site for r7 (212), and a restriction site for r8 (214). As with vector (100), the above series of elements may be flanked by optional primer binding sites (202) and (216) so that the oligonucleotide tag precursors may be conveniently replicated, e.g. by PCR amplification. Vector (221), which may be a sample of starting vector (200) or a previously processed vector, is cleaved (224) with r4 and r6 to produce fragment (225) and opened vector (228), which is isolated using conventional protocols. r6 is a type IIs restriction endonuclease which cleaves across the upstream-most word of the oligonucleotide tag precursor of vector (228). Vector (228) is actually a mixture by virtue of the different oligonucleotide tag precursors. In particular, the protruding strand of end (226) is present in N different sequences, where N is the number of words in the minimally cross-hybridizing set being used. Separately, a sample of vector (200) is cleaved (222) with r4 and r8 to produce fragment (218),
which is isolated. Fragment (218) is a mixture containing N^2 components in this example, where again N is the number of words in the minimally cross-hybridizing set being used. N is to the second power because the fragment contains all possible combinations of two consecutive words. Element (220) of fragment (218) is the single-stranded form of the second, or downstream-most, word of vector (200). Fragment (218) is combined with opened vector (228) under conditions that permit the single stranded forms of the words (220) and (226) to form perfectly matched duplexes. Because of the minimally cross-hybridizing property of the protruding strands, these conditions are readily met. Strands that are not complementary or that contain failure sequences will not form perfectly matched duplexes and will not be ligated. In this sense, the words in the protruding strands are "self-selecting." After insertion and ligation (230), vector (232) is formed which contains an elongated oligonucleotide tag precursor. The cleavage and insertion steps are repeated (234) until an oligonucleotide tag of the desired length is obtained, after which the oligonucleotide tag repertoire may be excised by cleaving with r7 and r5.
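The self-selection step can be pictured as a filter on protruding strands. The sketch below is a deliberately idealized model, not a description of hybridization thermodynamics: it assumes ligation occurs only when the fragment's single-stranded word is the exact reverse complement of the vector's protruding word, and it uses the eight-word set listed in Example 3 together with two made-up failure sequences. All names are hypothetical.

```python
# Sketch: idealized "self-selection" of protruding strands during ligation.
# Assumption: only a perfectly matched, full-length overhang is ligated.

WORDS = ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"]
COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq):
    """Reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def can_ligate(vector_overhang, fragment_overhang):
    """True only for a perfectly matched duplex of the two protruding strands."""
    return fragment_overhang == revcomp(vector_overhang)


def self_select(vector_overhang, fragment_overhangs):
    """Return the fragment overhangs that would be ligated to this vector end."""
    return [f for f in fragment_overhangs if can_ligate(vector_overhang, f)]


# Candidate fragment ends: the complement of every word, plus two failure
# sequences (a one-base deletion and a one-base substitution; both invented).
candidates = [revcomp(w) for w in WORDS] + ["aat", "aatg"]
print(self_select("gatt", candidates))
```

In this model, a vector end exposing the word gatt accepts only the fragment end aatc; the other seven complements and both failure sequences are excluded.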
The following examples serve to illustrate the present invention and are not meant to be limiting. Selection of many of the reagents, e.g. enzymes, vectors, and other materials, selection of reaction conditions and protocols, and material specifications, e.g. word length and composition, tag length, repertoire complexity, and the like, are matters of design choice which may be made by one of ordinary skill in the art. Extensive guidance is available in the literature for applying particular protocols for a wide variety of design choices made in accordance with the invention, e.g. Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989); Ausubel et al, editors, Current Protocols in Molecular Biology (John Wiley & Sons, New York, 1997); and the like.
Example 1
Repertoire Synthesis by Repeated Cycles of Cleavage, Self-Selection, Ligation, and Amplification
In this example, an oligonucleotide tag repertoire is produced such that each oligonucleotide tag consists of eight words of four nucleotides. The procedure outlined in Figure 2 is followed. A vector, corresponding to vector (200), is constructed by first inserting the following oligonucleotide (SEQ ID NO 4) into a Bam HI and Eco RI digested pUC19:

     Pac I             Bse RI Bsp 120   Bbs I Eco RI  Bam HI
aattgttaattaaggatgagctcactcctcgggcccgcataagtcttcgaattcg
    caattaattcctactcgagtgaggagcccgggcgtattcagaagcttaagcctag
Formula II
Separately, the oligonucleotide of Formula I and forward and reverse primers (SEQ ID NO 2 and SEQ ID NO 3) are synthesized using a conventional DNA synthesizer, e.g. PE Applied Biosystems (Foster City, CA) model 392. The oligonucleotide of Formula I is a mixture containing a repertoire of 64 two-word oligonucleotide tag precursors. The four-nucleotide words of Table I are employed. After amplification by PCR, the amplification product is digested with Bbs I to give the following two products:

...gaagacga          word-word-gg...
...cttctgct-word          word-cc...

The products are re-ligated, amplified by PCR, and digested with Bbv I to give the following two products:

...gaagacga-word          word-gg...
...cttctgct-word-word          cc...

The products are again re-ligated and amplified by PCR. By this sequence of cleavages and religations, any words consisting of failure sequences are selected against by the ligation event, i.e. words with failure sequences will not religate in the mixture, and thus will not be amplified. The final product is digested with Pst I and Hind III and inserted into a Pst I/Hind III-digested pUC19 to give the following construct (SEQ ID NO 5):
     Pst I  Bse RI  Bbs I               Bsp 120       Hind III
...cgacctgcagaggagatgaagacga-wordword-gggcccaatgctgcaagcttggcg...
...gctggacgtctcctctacttctgct-wordword-cccgggttacgacgttcgaaccgc...
                                              ↑
                                            Bbv I

where Pst I, Bse RI, Bbs I, Bsp 120, and Bbv I correspond to r4, r5, r6, r7, and r8 of Figure 2, respectively. After amplification in a suitable host, the plasmid is isolated and cleaved with Pst I and Bbs I to give an opened vector with the following upstream and downstream (SEQ ID NO 6) ends:

...cgacctgca          wordword-gggcccaatgctgcaagcttggcg...
...gctgg                  word-cccgggttacgacgttcgaaccgc...
Separately, a portion of the amplified oligonucleotide of Formula I is digested with Pst I and Bbv I to give the following fragment (SEQ ID NO 7):

    gaggagatgaagacga-word
acgtctcctctacttctgct-wordword

This fragment is inserted into the above vector opened by digestion with Bbs I and Pst I to give the following construct (SEQ ID NO 8):

...gcagaggagatgaagacga-wordwordword-gggcccaatgctgcaagcttggcg...
...cgtctcctctacttctgct-wordwordword-cccgggttacgacgttcgaaccgc...
which contains an oligonucleotide tag precursor of three words. The steps of cleaving, inserting, and amplification are repeated until a construct containing eight words is obtained. Preferably, at each step, reactants, e.g. vectors and/or inserts, are provided in amounts that are at least ten times the complexity of the reactant. When synthesis is complete, the eight-word construct is cleaved with Bse RI and Bsp 120 and the following fragment containing the oligonucleotide tag repertoire is isolated:

   (word)8 g
ct (word)8 cccgg

The isolated fragment is then inserted into the Bse RI/Bsp 120-digested vector of Formula II, which vector is used to transform a suitable host. The construct is ready for inserting polynucleotides, such as cDNAs, into the Eco RI restriction site to form tag-polynucleotide conjugates in accordance with the method of Brenner et al, International patent application PCT/US96/09513.
Example 2
Repertoire Synthesis by Convergent Assembly of Error-free Oligonucleotide Tag Precursors
In this example, an oligonucleotide tag repertoire is produced following the procedure outlined in Figure 1b. Each oligonucleotide tag consists of eight words of six nucleotides each (selected from those listed in Table I) to give the repertoire having an expected complexity of 9^8, or about 4.3 x 10^7. For each of the 9x9=81 two-word combinations, an oligonucleotide (SEQ ID NO 9) of the following form is synthesized:

      Pst I      Bse RI                        Bsp 120
cgacacctgcagttatcggaggagatgaagacgg[word][word]gggcccatat-

-atccgtctgcacaagcttaccg
       ↑        ↑
     Bsg I   Hind III

Formula III
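The repertoire sizes quoted in the examples are simple powers of the word-set size, and Example 1's guideline of supplying reactants at ten times their complexity follows directly from them. The sketch below recomputes the figures; the function names and the default tenfold factor as a parameter are illustrative choices, not part of the described method.

```python
# Sketch: repertoire complexity and the "at least ten times the complexity"
# guideline, expressed as numbers of molecules.


def repertoire_complexity(words_in_set, words_per_tag):
    """Number of distinct tags: one choice from the set at each word position."""
    return words_in_set ** words_per_tag


def minimum_molecules(complexity, fold_excess=10):
    """Smallest reactant count that meets the fold-excess guideline."""
    return fold_excess * complexity


two_word = repertoire_complexity(9, 2)      # two-word combinations in Example 2
eight_word = repertoire_complexity(9, 8)    # full eight-word repertoire
print(two_word, eight_word, f"{eight_word:.1e}", minimum_molecules(eight_word))
```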
The oligonucleotides of Formula III are rendered double stranded and amplified by providing forward and reverse primers and conducting a PCR, as described above for the oligonucleotide of Formula I. After amplification, the oligonucleotides are separately cleaved with Pst I and Hind III and cloned into a similarly cleaved M13mp18 and suitable hosts are transformed. Clones are selected and the oligonucleotide inserts are sequenced using conventional techniques. Such selection and sequencing continue until a vector is obtained for each of the 81 two-word combinations whose sequence is confirmed to be correct. Aliquots of the vectors are then combined in equal proportions to form an 81-component mixture, after which the vectors are cleaved with Pst I and Hind III and the word-containing fragment is isolated and cloned into a similarly cleaved pUC19 to give a construct of the following form (SEQ ID NO: 10):

...ctgcagttatcggaggagatgaagacgg[word][word]gggcccatat-
...gacgtcaatagcctcctctacttctgcc[word][word]cccgggtata-

-atccgtctgcacaagcttggcg...
-taggcagacgtgttcgaaccgc...
After cloning, the population of vectors is divided into two parts, after which the vectors in one part are cleaved with Pst I and Bsg I to give the following fragment mixture (SEQ ID NO: 11):
    gttatcggaggagatgaagacgg[word][word]gg
acgtcaatagcctcctctacttctgcc[word][word]

which is isolated. The vectors in the other part are cleaved with Pst I and Bse RI and the linearized word-containing vectors are isolated. The word-containing fragments are ligated into the linearized vectors to form the following construct (SEQ ID NO 12):

...ctgcagttatcggaggagatgaagacgg[word][word]gg[word][word]-
...gacgtcaatagcctcctctacttctgcc[word][word]cc[word][word]-

-gggcccatatatccgtctgcacaagcttggcg...
-cccgggtatataggcagacgtgttcgaaccgc...
After cloning, the construct is again divided into two parts and the steps are repeated to give the final 8-word repertoire having the form:

...gaagacgg([word][word]gg)4gccc...
...cttctgcc([word][word]cc)4cggg...

This may then be cleaved with Bse RI and Bsg I and re-cloned into a vector similar to that of Formula II for attachment to polynucleotides.
Example 3
Construction of an Eight-Word Tag Library
In this example, an eight-word tag library with four-nucleotide words was constructed from two two-word libraries in vectors pLCV-2 and pUCSE-2. Prior to construction of the eight-word tag library, 64 two-word double stranded oligonucleotides were separately inserted into pUC19 vectors and propagated. These 64 oligonucleotides consisted of every possible two-word pair made up of four-nucleotide words selected from an eight-word minimally cross-hybridizing set described in Brenner, U.S. patent 5,604,097. After the identities of the inserts were confirmed by sequencing, the inserts were then amplified by PCR and equal amounts of each amplicon were combined to form the inserts of the two-word libraries in vectors pLCV-2 and pUCSE-2. These were then used as described below to form an eight-word tag library in pUCSE, after which the eight-word insert was transferred to vector pNCV3, which contains additional primer binding sites and restriction sites to facilitate tagging and sorting polynucleotide fragments.
A. Construction of two-word sequences in pUCSE.
pUC19 was digested to completion with Sap I and Eco RI using the manufacturer's protocol and the large fragment was isolated. All restriction endonucleases unless otherwise noted were purchased from New England Biolabs (Beverly, MA). The small Sap I-Eco RI fragment was removed to eliminate the β-gal promoter sequence, which was found to skew the representation of some combinations of words in the final library. The following adaptor (SEQ ID NO 13) was ligated to the isolated large fragment in a conventional ligation reaction to give plasmid pUCSE as a ligation product:

Eco RI    Pst I   Eco RV  Hind III
aattctagactgcagttgatatcttaagctt
    gatctgacgtcaactatagaattcgaacga
A bacterial host was transformed by the ligation product using electroporation, after which the transformed bacteria were plated, a clone was selected, and the insert of its plasmid was sequenced for confirmation. pUCSE isolated from the clone was then digested with Eco RI and Hind III using the manufacturer's protocol and the large fragment was isolated. The following adaptor (SEQ ID NO 14) was ligated to the large fragment to give plasmid pUCSE-D1, which contained the first di-word (underlined in the original):

Eco RI Pst I Bse RI Bbs I        Bsp 120 I   Hind III
aattctgcagaggagatgaagacgaaaagaaaggggcccatgctgca
    gacgtctcctctacttctgcttttctttccccgggtacgacgttcga
                                       ↑
                                     Bbv I
Formula I
Further plasmids, pUCSE-D2 through pUCSE-D64, containing di-words were separately constructed from pUCSE-D1 by digesting it with Pst I and Bsp 120 I and separately ligating the following adaptors (SEQ ID NO 15) to the large fragment:

    gaggagatgaagacga[word][word]g
acgtctcctctacttctgct[word][word]cccgg
Formula II
The words of the top strand were selected from the following minimally cross-hybridizing set: gatt, tgat, taga, tttg, gtaa, agta, atgt, and aaag. After cloning and isolation, the inserts of the vectors were sequenced to confirm the identities of the di-words.
B. Construction of pLCV.
Plasmid cloning vector pLCV-D1 was created from plasmid vector pBC SK- (Stratagene) as follows, using the following oligonucleotides:
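One way to see why this set discriminates well is to count mismatches between every pair of words. The sketch below uses the Hamming distance as a simple proxy for cross-hybridization; this proxy is an assumption of the sketch rather than the formal definition used in the cited Brenner patent, and the function names are hypothetical.

```python
# Sketch: pairwise mismatch counts within the eight-word set.
# Assumption: Hamming distance is used as a stand-in for the number of
# mismatches a word would form with the complement of another word.
from itertools import combinations

WORDS = ["gatt", "tgat", "taga", "tttg", "gtaa", "agta", "atgt", "aaag"]


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(words):
    """Smallest Hamming distance over all unordered pairs of words."""
    return min(hamming(a, b) for a, b in combinations(words, 2))


pairs = list(combinations(WORDS, 2))
print(len(pairs), "pairs, minimum distance", min_pairwise_distance(WORDS))
```

For this set every pair differs in at least three of the four positions, so any mismatched word pairing would leave at most one correct base pair.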
S-723 (SEQ ID NO:16)
5'-CGA GAA AGA GGG ATA AGG CTC GAG CTT AAT TAA GAG TCG ACG AAT TCG GGC CCG GAT CCT GAC TCT TTC TCC CT-3'
S-724 (SEQ ID NO:17)
5'-CTA GAG GGA GAA AGA GTC AGG ATC CGG GCC CGA ATT CGT CGA CTC TTA ATT AAG CTC GAG CCT TAT CCC TCT TTC TCG GTA C-3'
S-785 (SEQ ID NO:18)
5'-TCG AGG CAT AAG TCT TCG AAT TCC ATC ACA CTG GGA AGA CAA CGT AG-3'
S-786 (SEQ ID NO:19)
5'-GAT CCT ACG TTG TCT TCC CAG TGT GAT GGA ATT CGA AGA CTT ATG CC-3'
S-960 (SEQ ID NO:20)
5'-TCG ATT AAT TAA CAA GCT TTG GGC CCT CGA GCA TAA GTC TTC TGC AGA ATT CGG ATC CAT CGA TGG TCA TAG C-3'
S-961 (SEQ ID NO:21)
5'-TGT TTC CTG CCA CAC AAC ATA CGA GCC GGA AGC GGC CGC TCT AGA-3'
S-962 (SEQ ID NO:22)
5'-AGC GTC TAG AGC GGC CGC TTC CGG CTC GTA TGT TGT GTG GCA GGA AAC AGC TAT GAC CAT C-3'
S-963 (SEQ ID NO:23)
5'-GAT GGA TCC GAA TTC TGC AGA AGA CTT ATG CTC GAG GGC CCA AAG CTT GTT AAT TAA-3'
S-1105 (SEQ ID NO:24)
5'-TCGA GGG CCC GCA TAA GTC TTC-3'
S-1106 (SEQ ID NO:25)
5'-TCGA GAA GAC TTA TGC GGG CCC-3'
Oligonucleotides S-723 and S-724 were kinased, annealed together, and ligated to pBC SK-, which had been digested with Kpn I and Xba I and treated with calf intestinal alkaline phosphatase, to create plasmid pSW143.1.
Oligonucleotides S-785 and S-786 were kinased, annealed together, and ligated to plasmid pSW143.1, which had been digested with Xho I and Bam HI and treated with calf intestinal alkaline phosphatase, to create plasmid pSW164.02.
Oligonucleotides S-960, S-961, S-962, and S-963 were kinased and annealed together to form a duplex consisting of the four oligonucleotides. Plasmid pSW164.02 was digested with Xho I and Sap I. The digested DNA was electrophoresed in an agarose gel, and the approximately 3045 bp product was purified from the appropriate gel slice. Plasmid pUC4K (from Pharmacia) was digested with Pst I and electrophoresed in an agarose gel. The approx. 1240 bp product was purified from the appropriate gel slice. The two plasmid products (from pSW164.02 and pUC4K) were ligated together with the S-960/961/962/963 duplex to create plasmid pLCVa.
DNA from Adenovirus 5 (New England Biolabs) was digested with Pac I and Bsp 120 I, treated with calf intestinal alkaline phosphatase, and electrophoresed in an agarose gel. The approx. 2853 bp product was purified from the appropriate gel slice. This fragment was ligated to plasmid pLCVa, which had been digested with Pac I and Bsp 120 I, to create plasmid pSW208.14.
Plasmid pSW208.14 was digested with Xho I, treated with calf intestinal alkaline phosphatase, and electrophoresed in an agarose gel. The approx. 5374 bp product was purified from the appropriate gel slice. This fragment was ligated to oligonucleotides S-1105 and S-1106 (which had been kinased and annealed together) to produce plasmid pLCVb, which was then digested with Eco RI and Hind III. The large fragment was isolated and ligated to the Formula I adaptor (SEQ ID NO 14) to give pLCV-D1. As above for pUCSE, further plasmids, pLCV-D2 through pLCV-D64, containing di-words were separately constructed from pLCV-D1 by digesting it with Pst I and Bsp 120 I, isolating the large fragment, and ligating an adaptor of Formula II. After cloning and isolation, the inserts of the vectors were sequenced to confirm the identities of the di-words.
C. Construction of two-word libraries, pUCSE-2 and pLCV-2.
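The pairing of S-1105 with S-1106 can be checked directly from their sequences. The following sketch, with hypothetical helper names, tests whether the two oligonucleotides anneal over everything but their first four bases and reports the resulting 5' overhangs; that a 5'-TCGA overhang is compatible with an Xho I-cut end is standard knowledge rather than something stated in the text.

```python
# Sketch: check that S-1105 and S-1106 anneal into a duplex with 5' overhangs.
# Sequences are taken from SEQ ID NO 24 and 25 with the spaces removed.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_overhangs(oligo_a, oligo_b, overhang=4):
    """Return the two 5' overhangs if the remaining bases pair perfectly,
    otherwise None."""
    core_a, core_b = oligo_a[overhang:], oligo_b[overhang:]
    if core_a != revcomp(core_b):
        return None
    return oligo_a[:overhang], oligo_b[:overhang]


S_1105 = "TCGAGGGCCCGCATAAGTCTTC"
S_1106 = "TCGAGAAGACTTATGCGGGCCC"
print(duplex_overhangs(S_1105, S_1106))
```

Both overhangs come out as TCGA, which is what ligation into the Xho I-digested plasmid would require.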
Each of the vectors pLCV-Dl through -D64 and pUCSE-Dl through -D64 was separately amplified by PCR. The components of the reaction mixture were as follows.
10 μl template (about 1-5 ng)
10 μl 10x Klentaq™ buffer (Clontech Laboratories, Palo Alto, CA)
2.5 μl biotinylated DF primer at 100 pmoles/μl
2.5 μl biotinylated DR primer at 100 pmoles/μl
2.5 μl 10 mM deoxynucleoside triphosphates
5 μl DMSO
66.5 μl H2O
1 μl Advantage Klentaq™ (Clontech Laboratories, Palo Alto, CA)
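The listed volumes add up to a 100 μl reaction, from which the final primer and dNTP concentrations follow by simple dilution. The sketch below does the arithmetic; the dictionary keys are shorthand labels chosen here, and the conversion of 100 pmoles/μl to 100 μM is the only unit identity used.

```python
# Sketch: total reaction volume and final concentrations for the PCR mixture.
# Volumes are in microlitres, as listed in the text.

COMPONENTS_UL = {
    "template": 10.0,
    "10x buffer": 10.0,
    "DF primer (100 pmol/ul)": 2.5,
    "DR primer (100 pmol/ul)": 2.5,
    "dNTPs (10 mM)": 2.5,
    "DMSO": 5.0,
    "H2O": 66.5,
    "polymerase": 1.0,
}


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after dilution, in the same units as the stock."""
    return stock * volume_ul / total_ul


total_ul = sum(COMPONENTS_UL.values())
primer_um = final_concentration(100.0, 2.5, total_ul)   # 100 pmol/ul = 100 uM
dntp_mm = final_concentration(10.0, 2.5, total_ul)      # stock given in mM
buffer_x = final_concentration(10.0, 10.0, total_ul)    # 10x stock
dmso_pct = 100.0 * COMPONENTS_UL["DMSO"] / total_ul

print(total_ul, primer_um, dntp_mm, buffer_x, dmso_pct)
```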
The temperature of the reactions was controlled as follows: 94°C for 3 min; 25 cycles of 94°C for 30 sec, 60°C for 30 sec, and 72°C for 10 sec; followed by 72°C for 3 min, then 4°C. The DF and DR primer binding sites were upstream and downstream portions of the vectors selected to give amplicons of 104 basepairs in length. After the reactions were completed, 5 μl of each PCR product were separated by polyacrylamide gel electrophoresis (20% with 1x TBE) to confirm by visual inspection that the reaction yields were approximately the same for each PCR. After such confirmation, using conventional protocols, 10 μl of each PCR was extracted twice with phenol and once with chloroform, after which the DNA in the aqueous phase was precipitated with ethanol. After resuspension in 200 μl of 1x NEB buffer #2 (New England Biolabs, Beverly, MA), the DNA was cleaved with Bbv I and Eco RI by adding the enzymes in 50 μl of the manufacturer's recommended buffer. The digestion resulted in the production of three fragments: a biotinylated fragment of 38 basepairs, a di-word-containing fragment of 29 basepairs, and a biotinylated fragment of 37 basepairs. After completion of the reaction, the excess biotinylated primers were removed by adding 50 μl 50% Ultralink (streptavidin-Sepharose, Pierce Chemical Co., Rockford, IL) and vortexing the mixture at room temperature for 30 min. The Ultralink material was separated from the reaction mixture by centrifugation, after which approximately half of the mixture was separated by polyacrylamide gel electrophoresis (20% gel). The 29-basepair band was cut out of the gel and the 29-basepair fragment was eluted using the "crush and soak" method, e.g. Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989). This material was then ligated into either pLCV-D1 or pUCSE-D1 after the latter were digested with Bbs I and Eco RI and treated with calf intestine alkaline phosphatase, using manufacturer's recommended protocols.
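The three digestion products account for the whole 104-basepair amplicon, and only the middle one lacks a biotinylated primer end. The sketch below records that bookkeeping under the simplification that the reported lengths are additive; the labels "end A" and "end B" are placeholders, since the text does not say which biotinylated fragment is upstream.

```python
# Sketch: fragment bookkeeping for the Bbv I / Eco RI digest of the amplicon.
# Assumption: the reported fragment lengths are treated as simply additive.

AMPLICON_BP = 104

# (label, length in basepairs, carries a biotinylated primer end?)
FRAGMENTS = [
    ("biotinylated end A", 38, True),
    ("di-word fragment", 29, False),
    ("biotinylated end B", 37, True),
]


def lengths_consistent(fragments, amplicon_bp):
    """True if the fragment lengths add up to the amplicon length."""
    return sum(length for _, length, _ in fragments) == amplicon_bp


def remaining_after_streptavidin(fragments):
    """Fragments left in solution once biotinylated pieces are captured."""
    return [(label, length) for label, length, biotin in fragments if not biotin]


print(lengths_consistent(FRAGMENTS, AMPLICON_BP))
print(remaining_after_streptavidin(FRAGMENTS))
```

In practice the gel purification step described above still matters, because streptavidin capture is unlikely to be complete; the sketch only shows which band is expected to carry the di-word.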
D. Construction of pNCV3.
pNCV3 was constructed by first assembling the following fragment (SEQ ID NO 26) from synthetic oligonucleotides:

Eco RI
aattctgtaaaacgacggccagtcgccagggtttccccagtcacgacgtgaataaatag-
    gacattttgctgccggtcagcggtcccaaaagggtcagtgctgcacttatttatc-

Pac I                                 Bsp 120 I
ttaattaaggaataggcctctcctcgagctcggtaccgggcccgcataagtcttc-
aattaattccttatccggagaggagctcgagccatggcccgggcgtattcagaag-

    Cla I           Eco RV Sap I   Bam HI
atctatcgatgattgaagagcgatatcgctcttcaatcggatccatcc-
tagatagctactaacttctcgctatagcgagaagttagcctaggtagg-
              Sap I

                                                       Hind III
tcaactaattaccacacaacatacgagccggaagcgggtcatagctgtttcctga
agttgattaatggtgtgttgtatgctcggccttcgcccagtatcgacaaaggacttcga
After isolation, the fragment was cloned into Eco RI and Hind III-digested pLCV-D1 using conventional protocols.
E. Assembly of eight-word library.
The di-words of pLCV-2 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 1. Two-word library pUCSE-2 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 1. Vector 1 and insert 1 were combined in a conventional ligation reaction to give three-word library, pUCSE-3. pUCSE-3 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 2. Vector 2 and insert 1 were then combined in a conventional ligation reaction to give four-word library, pUCSE-4. The 4-mer words of pUCSE-4 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 2. pLCV-2 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 3. Vector 3 and insert 2 were then combined in a conventional ligation reaction to give five-word library, pLCV-5. The 5-mer words of pLCV-5 were amplified either by PCR or plasmid expansion, the product was digested with Eco RI and Bbv I, after which the Eco RI-Bbv I fragment was isolated as insert 3. pUCSE-4 was digested with Eco RI, Bbs I, and Pst I, after which the large fragment was treated with calf intestine alkaline phosphatase to give vector 4. Vector 4 and insert 3 were then combined in a conventional ligation reaction to give eight-word library, pUCSE-8. The 8-mer words of pUCSE-8 were amplified either by PCR or plasmid expansion, the product was digested with Bse RI and Bsp 120 I, after which the Bse RI-Bsp 120 I fragment was isolated as insert 4. pNCV3 was digested with Bse RI, Bsp 120 I, and Sac I, after which the large fragment was isolated and treated with calf intestine alkaline phosphatase to give vector 5.
Vector 5 was then combined with insert 4 in a conventional ligation reaction to give the eight-word library pNCV3-8.
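The word counts in this assembly (2, 3, 4, 5, 8) are consistent with each ligation joining a vector and an insert through one shared word, as in the self-selection scheme where the protruding strand is a full word. That one-word overlap is an inference made for this sketch, not something the text states outright, and the function and variable names are hypothetical.

```python
# Sketch: word-count bookkeeping for the eight-word library assembly.
# Assumption: vector and insert overlap by exactly one word at each ligation.


def join(vector_words, insert_words, overlap=1):
    """Words in the product when the two ends share `overlap` words."""
    return vector_words + insert_words - overlap


# (product, words in the vector library, words in the insert library)
STEPS = [
    ("pUCSE-3", 2, 2),   # vector 1 (pUCSE-2) + insert 1 (from pLCV-2)
    ("pUCSE-4", 3, 2),   # vector 2 (pUCSE-3) + insert 1 (from pLCV-2)
    ("pLCV-5", 2, 4),    # vector 3 (pLCV-2)  + insert 2 (from pUCSE-4)
    ("pUCSE-8", 4, 5),   # vector 4 (pUCSE-4) + insert 3 (from pLCV-5)
]

products = {name: join(v, i) for name, v, i in STEPS}
print(products)
```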
F. Confirmation Sequencing of a Random Selection of Eight-Word Tags.
The results of the word assembly were tested by sequencing the 8-word inserts of 176 vectors from the pNCV3-8 library. The results of the sequence determinations are summarized in the following table:
Number of Tags   Result                            Percentage
147              Perfect 8 words                   83.5%
11               Perfect 7 words                   6.2%
8                No insert                         4.5%
4                8 words with 1 base deletion      2.2%
3                8 words with an incorrect word    1.7%
1                12 words                          0.5%
1                10 words                          0.5%
1                9 words                           0.5%
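The tabulated counts can be cross-checked against the stated total of 176 sequenced vectors. The sketch below recomputes each percentage; the published values appear to be truncated to one decimal place rather than rounded, so the helper truncates as well, which is an observation about the numbers and not a documented convention.

```python
# Sketch: recompute the percentages in the sequencing summary table.
# Assumption: percentages are truncated (not rounded) to one decimal place.

RESULTS = [
    ("Perfect 8 words", 147),
    ("Perfect 7 words", 11),
    ("No insert", 8),
    ("8 words with 1 base deletion", 4),
    ("8 words with an incorrect word", 3),
    ("12 words", 1),
    ("10 words", 1),
    ("9 words", 1),
]


def truncated_percent(count, total):
    """Percentage of total, truncated to one decimal place."""
    return (count * 1000 // total) / 10


total = sum(count for _, count in RESULTS)
percents = [truncated_percent(count, total) for _, count in RESULTS]
for (label, count), pct in zip(RESULTS, percents):
    print(f"{count:4d}  {label:32s} {pct}%")
```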
Sequence Listing
<110> Brenner, Sydney
      Williams, Steven R.
<120> Enzymatic synthesis of oligonucleotide tags
<130> 810-01
<140>
<141>
<150> US 60/103,030
<151> 1998-10-05
<160> 26
<170> Microsoft Word 5.1
<210> 1
<211> 58
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221>
<222>
<223>
<400> 1 cgacacctgc agaggagatg aagacgaddd dddddgggcc catgctgcaa 50 gcttaccg 58
<210> 2
<211> 17
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Primer.
<222> n.a.
<223>
<400> 2 cgacacctgc agaggag 17
<210> 3
<211> 17
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Primer.
<222> n.a.
<223>
<400> 3 cggtaagctt gcagcat 17
<210> 4
<211> 55
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> n.a.
<223>
<400> 4 aattgttaat taaggatgag ctcactcctc gggcccgcat aagtcttcga 50 attcg 55
<210> 5
<211> 57
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Cloning vector.
<222> n.a.
<223>
<400> 5 cgacctgcag aggagatgaa gacgaddddd dddgggccca atgctgcaag 50 cttggcg 57
<210> 6
<211> 32
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222>
<223>
<400> 6 ddddddddgg gcccaatgct gcaagcttgg cg 32
<210> 7
<211> 20
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> n.a.
<223> Preferably, contains fluorescent label.
<400> 7 gaggagatga agacgadddd 20
<210> 8
<211> 55
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222> n.a.
<223>
<400> 8 gcagaggaga tgaagacgad dddddddddd dgggcccaat gctgcaagct 50 tggcg 55
<210> 9
<211> 78
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Tag repertoire.
<222> n.a.
<223> n.a.
<400> 9 cgacacctgc agttatcgga ggagatgaag acggdddddd ddddddgggc 50 ccatatatcc gtctgcacaa gcttaccg 78
<210> 10
<211> 72
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222> N.a.
<223> N.a.
<400> 10 ctgcagttat cggaggagat gaagacggdd dddddddddd gggcccatat 50 atccgtctgc acaagcttac cg 72
<210> 11
<211> 36
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> N.a.
<223> N.a.
<400> 11 gttatcggag gagatgaagac ggdddddddd ddddgg 36
<210> 12
<211> 86
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222> N.a.
<223> N.a.
<400> 12 ctgcagttat cggaggagat gaagacggdd dddddddddd ggdddddddd 50 ddddgggccc atatatccgt ctgcacaagc ttaccg 86
<210> 13
<211> 31
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> N.a.
<223> N.a.
<400> 13 aattctagac tgcagttgat atcttaagct t 31
<210> 14
<211> 47
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> N.a.
<223> N.a.
<400> 14 aattctgcag aggagatgaa gacgaaaaga aaggggccca tgctgca 47
<210> 15
<211> 25
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> N.a.
<223> N.a.
<400> 15 gaggagatga agacgadddd ddddg 25
<210> 16
<211> 74
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 16 cgagaaagag ggataaggct cgagcttaat taagagtcga cgaattcggg 50 cccggatcct gactctttct ccct 74
<210> 17
<211> 82
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 17 ctagagggag aaagagtcag gatccgggcc cgaattcgtc gactcttaat 50 taagctcgag ccttatccct ctttctcggt ac 82
<210> 18
<211> 47
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 18 tcgaggcata agtcttcgaa ttccatcaca ctgggaagac aacgtag 47
<210> 19
<211> 47
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222> N.a.
<223> N.a.
<400> 19 gatcctacgt tgtcttccca gtgtgatgga attcgaagac ttatgcc 47
<210> 20
<211> 72
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 20 tcgattaatt aacaagcttt gggccctcga gcataagtct tctgcagaat 50 tcggatccat cgatggtcat ag 72
<210> 21
<211> 45
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 21 tgtttcctgc cacacaacat acgagccgga agcggccgct ctaga 45
<210> 22
<211> 62
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 22 agcgtctaga gcggccgctt ccggctcgta tgttgtgtgg caggaaacaa 50 gctatgacca tc 62
<210> 23
<211> 57
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 23 gatggatccg aattctgcag aagacttatg ctcgagggcc caaagcttgt 50 taattaa 57
<210> 24
<211> 22
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Oligonucleotide.
<222> N.a.
<223> N.a.
<400> 24 tcgagggccc gcataagtct tc 22
<210> 25
<211> 22
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Vector.
<222> N.a.
<223> N.a.
<400> 25 tcgagaagac ttatgcgggc cc 22
<210> 26
<211> 217
<212> DNA
<213> Artificial Sequence
<220> No special biological significance.
<221> Adaptor.
<222> N.a.
<223> N.a.
<400> 26 aattctgtaa aacgacggcc agtcgccagg gttttcccag tcacgacgtg 50 aataaatagt taattaagga ataggcctct cctcgagctc ggtaccgggc 100 ccgcataagt cttcatctat cgatgattga agagcgatat cgctcttcaa 150 tcggatccat cctcaactaa ttaccacaca acatacgagc cggaagcggg 200 tcatagctgt ttcctga 217

Claims

We claim:
1. A method of synthesizing a repertoire of oligonucleotide tags of a predetermined length, the method comprising the steps of:
(a) providing a repertoire of oligonucleotide tag precursors in an amplicon, the oligonucleotide tag precursors each comprising one or more words, and each of the one or more words being selected from the same minimally cross-hybridizing set;
(b) cleaving the amplicon at a word in each of the oligonucleotide tag precursors to form one or more ligatable ends on each oligonucleotide tag precursor;
(c) ligating one or more words to the one or more ligatable ends to elongate each of the oligonucleotide tag precursors;
(d) amplifying the elongated oligonucleotide tag precursors in the amplicon; and
(e) repeating steps (b) through (d) until a repertoire of oligonucleotide tags having the predetermined length is formed.
2. The method of claim 1 wherein said amplicon is a cloning vector.
3. The method of claim 2 wherein said step of cleaving includes cleaving said amplicon in a region adjacent to said word by a type IIs restriction endonuclease.
4. The method of claim 3 wherein said word has a length in the range of from three to fourteen nucleotides.
5. The method of claim 4 wherein said oligonucleotide tag has a length in the range of from 18 to 60 nucleotides.
6. The method of claim 2 wherein said step of cleaving includes cleaving said amplicon across said word by a type IIs restriction endonuclease.
7. The method of claim 2 wherein said word has a length of four and wherein said oligonucleotide tag has a length in the range of from 18 to 40.
8. A repertoire of oligonucleotide tags, wherein the oligonucleotide tags of the repertoire are of the form:
w_1(N)_{x_1}w_2(N)_{x_2} ... (N)_{x_{n-1}}w_n
wherein each of w_1 through w_n is a word consisting of an oligonucleotide having a length from three to fourteen nucleotides or basepairs and being selected from the same minimally cross-hybridizing set wherein a word of the set and a complement of any other word of the set has at least two mismatches; N is a nucleotide or basepair; each of x_1 through x_{n-1} is an integer selected from the group consisting of 0, 1, 2, 3, and 4, provided that at least one of x_1 through x_{n-1} is 1, 2, 3, or 4; and n is an integer in the range of from 4 to 10.
9. The repertoire of claim 8 wherein each of said x_1 through x_{n-1} is selected from the group consisting of 0, 1, and 2, and wherein said length of said word is from four to ten nucleotides or basepairs.
10. The repertoire of claim 9 wherein said oligonucleotide tags are single stranded and wherein n is in the range of from 6 to 10.
11. The repertoire of claim 10 wherein a duplex between each of said words of said minimally cross-hybridizing set and said complement of any other word of said set would have at least three mismatches.
12. The repertoire of claim 11 wherein a duplex between each of said words of said minimally cross-hybridizing set and said complement of any other word of said set would have at least five mismatches whenever said word has a length of greater than or equal to six nucleotides.
13. The repertoire of claim 10 having a number of said oligonucleotide tags that is in the range of from 100 to 1 x 10^9.
14. The repertoire of claim 13 having a number of said oligonucleotide tags that is in the range of from 1000 to 1 x 10^8.
15. A repertoire of cloning vectors for attaching oligonucleotide tags to polynucleotides, wherein each of the vectors comprises a double stranded element corresponding to an oligonucleotide tag of the form:
w_1(N)_{x_1}w_2(N)_{x_2} ... (N)_{x_{n-1}}w_n
wherein each of w_1 through w_n is a word consisting of an oligonucleotide having a length from three to fourteen nucleotides and being selected from the same minimally cross-hybridizing set wherein a word of the set and a complement of any other word of the set has at least two mismatches; N is a nucleotide; each of x_1 through x_{n-1} is an integer selected from the group consisting of 0, 1, 2, 3, and 4, provided that at least one of x_1 through x_{n-1} is 1, 2, 3, or 4; and n is an integer in the range of from 4 to 10.
16. The repertoire of claim 1 wherein each of said x_1 through x_{n-1} is selected from the group consisting of 0, 1, and 2, and wherein said length of said word is from four to ten nucleotides or basepairs.
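
The word-set condition and the tag form recited in the claims above can be illustrated with a short sketch. It rests on several assumptions. The eight four-base words in `WORDS` are a hypothetical example set and are not taken from the claims. A duplex between one word and the complement of another is treated as mismatched at every position where the two words differ, which considers only the fully aligned register. The helper names `mismatches`, `min_pairwise_mismatches`, and `build_tag` are introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical example word set (an assumption, not recited in the claims).
WORDS = ["ttac", "aatc", "tact", "atca", "acat", "tcta", "cttt", "caaa"]


def mismatches(u, v):
    """Count positions at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("words must have equal length")
    return sum(a != b for a, b in zip(u, v))


def min_pairwise_mismatches(words):
    """Smallest mismatch count over all pairs of distinct words."""
    return min(mismatches(u, v) for u, v in combinations(words, 2))


def build_tag(words, spacers):
    """Assemble w_1(N)_{x_1}w_2 ... (N)_{x_{n-1}}w_n from words and spacers.

    `spacers` holds the n-1 spacer strings; an empty string means x_i = 0.
    The checks mirror the numeric limits stated in the repertoire claims.
    """
    n = len(words)
    if not 4 <= n <= 10:
        raise ValueError("n must be in the range 4 to 10")
    if len(spacers) != n - 1:
        raise ValueError("need exactly n-1 spacers")
    if any(len(s) > 4 for s in spacers):
        raise ValueError("each x_i must be at most 4")
    if not any(spacers):
        raise ValueError("at least one x_i must be 1 or more")
    parts = []
    for word, spacer in zip(words, list(spacers) + [""]):
        parts.append(word)
        parts.append(spacer)
    return "".join(parts)


print(min_pairwise_mismatches(WORDS))        # 3
print(len(WORDS) ** 8)                       # 16777216 tags for n = 8
print(build_tag(WORDS[:4], ["g", "", ""]))   # ttacgaatctactatca
```

With eight words and n = 8 word positions, the number of distinct word combinations is 8^8 = 16,777,216, which lies inside the 1000 to 1 x 10^8 range recited for the repertoire size. A fuller check would also score shifted alignments between words, which this sketch deliberately ignores.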
PCT/US1999/022585 1998-10-05 1999-09-28 Enzymatic synthesis of oligonucleotide tags WO2000020639A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU65025/99A AU6502599A (en) 1998-10-05 1999-09-28 Enzymatic synthesis of oligonucleotide tags

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10303098P 1998-10-05 1998-10-05
US60/103,030 1998-10-05

Publications (1)

Publication Number Publication Date
WO2000020639A1 (en) 2000-04-13

Family

ID=22292986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/022585 WO2000020639A1 (en) 1998-10-05 1999-09-28 Enzymatic synthesis of oligonucleotide tags

Country Status (2)

Country Link
AU (1) AU6502599A (en)
WO (1) WO2000020639A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0292128A1 (en) * 1987-04-28 1988-11-23 Tamir Biotechnology Ltd Improved DNA probes
WO1993006121A1 (en) * 1991-09-18 1993-04-01 Affymax Technologies N.V. Method of synthesizing diverse collections of oligomers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BRENNER ET AL.: "Encoded Combinatorial Chemistry", PROC. NATL. ACAD. SCI. USA, vol. 89, June 1992 (1992-06-01), pages 5381 - 5383, XP002926353 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6458530B1 (en) 1996-04-04 2002-10-01 Affymetrix Inc. Selecting tag nucleic acids
WO2001081564A3 (en) * 2000-04-26 2002-04-25 Florian Schauwecker Method for producing DNA encoding polypeptides that are composed of several sections, and for producing polypeptides by expressing the DNA thus obtained
WO2001081564A2 (en) * 2000-04-26 2001-11-01 Actinodrug Pharmaceuticals Gmbh Method for producing dna encoding polypeptides that are composed of several sections, and for producing polypeptides by expressing the dna thus obtained
US8932992B2 (en) 2001-06-20 2015-01-13 Nuevolution A/S Templated molecules and methods for using such molecules
US10669538B2 (en) 2001-06-20 2020-06-02 Nuevolution A/S Templated molecules and methods for using such molecules
US7026123B1 (en) 2001-08-29 2006-04-11 Pioneer Hi-Bred International, Inc. UTR tag assay for gene function discovery
US7405282B2 (en) 2001-08-29 2008-07-29 Pioneer Hi-Bred International, Inc. UTR tag assay for gene function discovery
US10731151B2 (en) 2002-03-15 2020-08-04 Nuevolution A/S Method for synthesising templated molecules
US10730906B2 (en) 2002-08-01 2020-08-04 Nuevolutions A/S Multi-step synthesis of templated molecules
US9109248B2 (en) 2002-10-30 2015-08-18 Nuevolution A/S Method for the synthesis of a bifunctional complex
US9284600B2 (en) 2002-10-30 2016-03-15 Neuvolution A/S Method for the synthesis of a bifunctional complex
US9487775B2 (en) 2002-10-30 2016-11-08 Nuevolution A/S Method for the synthesis of a bifunctional complex
US9885035B2 (en) 2002-10-30 2018-02-06 Nuevolution A/S Method for the synthesis of a bifunctional complex
US10077440B2 (en) 2002-10-30 2018-09-18 Nuevolution A/S Method for the synthesis of a bifunctional complex
US9121110B2 (en) 2002-12-19 2015-09-01 Nuevolution A/S Quasirandom structure and function guided synthesis methods
US9096951B2 (en) 2003-02-21 2015-08-04 Nuevolution A/S Method for producing second-generation library
US11118215B2 (en) 2003-09-18 2021-09-14 Nuevolution A/S Method for obtaining structural information concerning an encoded molecule and method for selecting compounds
US11965209B2 (en) 2003-09-18 2024-04-23 Nuevolution A/S Method for obtaining structural information concerning an encoded molecule and method for selecting compounds
US9574189B2 (en) 2005-12-01 2017-02-21 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US11702652B2 (en) 2005-12-01 2023-07-18 Nuevolution A/S Enzymatic encoding methods for efficient synthesis of large libraries
US9359601B2 (en) 2009-02-13 2016-06-07 X-Chem, Inc. Methods of creating and screening DNA-encoded libraries
US11168321B2 (en) 2009-02-13 2021-11-09 X-Chem, Inc. Methods of creating and screening DNA-encoded libraries
US11225655B2 (en) 2010-04-16 2022-01-18 Nuevolution A/S Bi-functional complexes and methods for making and using such complexes
US10865409B2 (en) 2011-09-07 2020-12-15 X-Chem, Inc. Methods for tagging DNA-encoded libraries
US11674135B2 (en) 2012-07-13 2023-06-13 X-Chem, Inc. DNA-encoded libraries having encoding oligonucleotide linkages not readable by polymerases
EP3342868A1 (en) 2016-12-30 2018-07-04 Systasy Bioscience GmbH Constructs and screening methods

Also Published As

Publication number Publication date
AU6502599A (en) 2000-04-26

Similar Documents

Publication Publication Date Title
US20030049616A1 (en) Enzymatic synthesis of oligonucleotide tags
AU2018202940B2 (en) Compositions and Methods for High Fidelity Assembly of Nucleic Acids
US5629179A (en) Method and kit for making CDNA library
WO2000020639A1 (en) Enzymatic synthesis of oligonucleotide tags
US5503995A (en) Exchangeable template reaction
AU700952B2 (en) PCR-based cDNA subtractive cloning method
JP6219944B2 (en) Amplification dependent on 5' protection
El-Sagheer et al. Single tube gene synthesis by phosphoramidate chemical ligation
WO1991018114A1 (en) Polynucleotide amplification
CA2073184A1 (en) Compositions and methods for analyzing genomic variation
US5952201A (en) Method of preparing oligonucleotide probes or primers, vector therefor and use thereof
JPH07177885A (en) Method for specific cloning of nucleic acid
US5827704A (en) Vectors for cloning and modification of DNA fragments
WO1992013104A1 (en) 5' and 3' polymerase chain reaction walking from known dna sequences
KR100762261B1 (en) Process for preparation of full-length cDNA and anchor and primer used for the same
KR20050009118A (en) Plasmid having a function of T-vector and expression vector, and expression of the target gene using the same
JP4681129B2 (en) Asymmetric modification of nucleic acid terminal region
JP3213719B2 (en) Gene library and method for producing the same
JPS60234584A (en) Recombinant plasmid
Glick et al. Molecular Genetics: Gene Isolation, Characterization and Manipulation
JP2002325573A (en) Vector

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref country code: AU

Ref document number: 1999 65025

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase