WO1999066028A2 - Genes for the biosynthesis of epothilones - Google Patents

Genes for the biosynthesis of epothilones

Info

Publication number
WO1999066028A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
nucleotides
acids
amino acids
amino
Prior art date
Application number
PCT/EP1999/004171
Other languages
French (fr)
Other versions
WO1999066028A3 (en)
Inventor
Thomas Schupp
James Madison Ligon
Istvan Molnar
Ross Zirkle
Jörn Görlach
Devon Cyr
Original Assignee
Novartis Ag
Novartis-Erfindungen Verwaltungsgesellschaft Mbh
Priority to NZ508326A (patent NZ508326A)
Priority to HU0102186A (patent HUP0102186A3)
Priority to SK1924-2000A (patent SK19242000A3)
Priority to AU46116/99A (patent AU753567B2)
Priority to JP2000554837A (patent JP2002518004A)
Priority to CA002329774A (patent CA2329774A1)
Priority to PL345579A (patent PL200157B1)
Priority to EP99929243A (patent EP1088078A2)
Application filed by Novartis Ag, Novartis-Erfindungen Verwaltungsgesellschaft Mbh
Priority to BR9911349-0A (patent BR9911349A)
Priority to IL13973599A (patent IL139735A0)
Publication of WO1999066028A2
Publication of WO1999066028A3
Priority to IL139735A (patent IL139735A)
Priority to NO20006195A (patent NO20006195L)
Priority to IL190391A (patent IL190391A0)
Priority to NO20091055A (patent NO20091055L)

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/11 - DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 - Genes encoding for enzymes or proenzymes
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P - FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00 - Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/18 - Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms containing at least two hetero rings condensed among themselves or condensed with a common carbocyclic ring system, e.g. rifamycin
    • C12P17/181 - Heterocyclic compounds containing oxygen atoms as the only ring heteroatoms in the condensed system, e.g. Salinomycin, Septamycin
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P - SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 - Antineoplastic agents
    • A61P35/04 - Antineoplastic agents specific for metastasis
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K14/00 - Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 - Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria

Definitions

  • the present invention relates generally to polyketides and genes for their synthesis.
  • the present invention relates to the isolation and characterization of novel polyketide synthase and nonribosomal peptide synthetase genes from Sorangium cellulosum that are necessary for the biosynthesis of epothilones A and B.
  • Polyketides are compounds synthesized from two-carbon building blocks, the β-carbon of which always carries a keto group, thus the name polyketide. These compounds include many important antibiotics, immunosuppressants, cancer chemotherapeutic agents, and other compounds possessing a broad range of biological properties. The tremendous structural diversity derives from the different lengths of the polyketide chain, the different side-chains introduced (either as part of the two-carbon building blocks or after the polyketide backbone is formed), and the stereochemistry of such groups. The keto groups may also be reduced to hydroxyls, enoyls, or removed altogether. Each round of two-carbon addition is carried out by a complex of enzymes called the polyketide synthase (PKS) in a manner similar to fatty acid biosynthesis.
  • PKS polyketide synthase
  • Type I proteins are polyfunctional, with several catalytic domains carrying out different enzymatic steps covalently linked together (e.g. PKS for erythromycin, soraphen, rifamycin, and avermectin (MacNeil et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 245-256 (1993)); whereas type II proteins are monofunctional (Hutchinson et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 203-216 (1993)).
  • NRPSs non-ribosomal polypeptide synthetases
  • NRPSs are multienzymes that are organized in modules. Each module is responsible for the addition (and the additional processing, if required) of one amino acid building block.
  • NRPSs activate amino acids by forming aminoacyl-adenylates, and capture the activated amino acids on thiol groups of phosphopantetheinyl prosthetic groups on peptidyl carrier protein domains.
  • NRPSs modify the amino acids by epimerization, N-methylation, or cyclization if necessary, and catalyse the formation of peptide bonds between the enzyme-bound amino acids.
  • NRPSs are responsible for the biosynthesis of peptide secondary metabolites like cyclosporin, could provide polyketide chain terminator units as in rapamycin, or form mixed systems with PKSs as in yersiniabactin biosynthesis.
  • Epothilones A and B are 16-membered macrocyclic polyketides with an acylcysteine-derived starter unit that are produced by the bacterium Sorangium cellulosum strain So ce90 (Gerth et al., J. Antibiotics 49: 560-563 (1996), incorporated herein by reference).
  • the structure of epothilone A and B wherein R signifies hydrogen (epothilone A) or methyl (epothilone B) is:
  • epothilones have a narrow antifungal spectrum and especially show a high cytotoxicity in animal cell cultures (see, Höfle et al., Patent DE 4138042 (1993), incorporated herein by reference). Of significant importance, epothilones mimic the biological effects of taxol, both in vivo and in cultured cells (Bollag et al., Cancer Research 55: 2325-2333 (1995), incorporated herein by reference). Taxol and taxotere, which stabilize cellular microtubules, are cancer chemotherapeutic agents with significant activity against various human solid tumors (Rowinsky et al., J. Natl. Cancer Inst.
  • epothilone analogs have been synthesized that have a superior cytotoxic activity as compared to epothilone A or epothilone B as demonstrated by their enhanced ability to induce the polymerization and stabilization of microtubules (WO 98/25929, incorporated herein by reference).
  • one object of the present invention is to isolate the genes that are involved in the synthesis of epothilones, particularly the genes that are involved in the synthesis of epothilones A and B in myxobacteria of the Sorangium/Polyangium group, i.e., Sorangium cellulosum strain So ce90.
  • a further object of the invention is to provide a method for the recombinant production of epothilones for application in anticancer formulations.
  • the present invention unexpectedly overcomes the difficulties set forth above to provide for the first time a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone.
  • the nucleotide sequence is isolated from a species belonging to Myxobacteria, most preferably Sorangium cellulosum.
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 122
  • the present invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of
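The "consecutive 20 ... base pair portion identical in sequence" criterion above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `shares_identical_window` is hypothetical, the sequences in the usage note are toy strings rather than any part of SEQ ID NO:1, and the check compares a single strand only (it does not consider the opposite strand).

```python
def shares_identical_window(query: str, reference: str, k: int = 20) -> bool:
    """Return True if query and reference share an identical run of k consecutive bases.

    Hypothetical helper: compares the sequences as given (one strand only).
    """
    query, reference = query.upper(), reference.upper()
    if k <= 0 or len(query) < k or len(reference) < k:
        return False
    # Collect every window of length k that occurs in the reference.
    ref_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    # Slide a window of length k along the query and look for a match.
    return any(query[i:i + k] in ref_windows for i in range(len(query) - k + 1))


# Toy data, not taken from the sequence listing.
toy_reference = "ACGT" * 10
toy_query = "TTTT" + toy_reference[5:25] + "GGGG"
print(shares_identical_window(toy_query, toy_reference, k=20))
```

The window length `k` corresponds to the 20, 25, 30, 35, 40, 45, or 50 base pair portions named in the text; the default of 20 mirrors the stated preference.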
  • the present invention also provides a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention. Further, the present invention provides a recombinant vector comprising such a chimeric gene, wherein the vector is capable of being stably transformed into a host cell. Still further, the present invention provides a recombinant host cell comprising such a chimeric gene, wherein the host cell is capable of expressing the nucleotide sequence that encodes at least one polypeptide necessary for the biosynthesis of an epothilone.
  • the recombinant host cell is a bacterium belonging to the order Actinomycetales, and in a more preferred embodiment the recombinant host cell is a strain of Streptomyces. In other embodiments, the recombinant host cell is any other bacterium amenable to fermentation, such as a pseudomonad or E. coli. Even further, the present invention provides a Bac clone comprising a nucleic acid molecule of the invention, preferably Bac clone pEP015. In another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes an epothilone synthase domain.
  • the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • KS β-ketoacyl-synthase
  • said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
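Ranges such as "nucleotides 7643-8920 of SEQ ID NO:1" can be turned into subsequences with a small helper. The sketch below assumes the coordinates are 1-based and inclusive, and it reads "the complement of" a range as the reverse complement of that stretch; both are assumptions made for illustration, and the names `region` and `region_length` are hypothetical. The demo strings are toy data, not the sequence listing.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def region_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def region(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract the 1-based, inclusive range start..end from seq.

    With complement=True the reverse complement of the range is returned
    (an assumed reading of "the complement of nucleotides ...").
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    sub = seq[start - 1:end]  # convert to Python's 0-based, half-open slicing
    return sub.translate(_COMPLEMENT)[::-1] if complement else sub


print(region("AACCGGTT", 3, 6))                # toy sequence
print(region("ATGC", 1, 4, complement=True))   # toy sequence
print(region_length(7643, 8920))               # length of the first KS range listed above
```

Keeping the 1-based to 0-based conversion in one place avoids off-by-one slips when many ranges from the same listing are processed.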
  • the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • AT acyltransferase
  • said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
  • the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • ER enoyl reductase
  • said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1,
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
  • the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • ACP acyl carrier protein
  • said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides
  • nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-6
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
  • the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • DH dehydratase
  • said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • KR β-ketoreductase
  • said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
  • said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.
  • said nucleotide sequence preferably is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 51534-52657 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is nucleotides 51534-52657 of SEQ ID NO:1.
  • the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
  • said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 61427-62254 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is nucleotides 61427-62254 of SEQ ID NO:1.
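The domain coordinates listed above can be checked for order and overlap along SEQ ID NO:1. The following is a minimal sketch, assuming 1-based inclusive coordinates; it uses only the first KS, AT, ER, and ACP nucleotide ranges given in the text and the enclosing range 7610-11875 from the general nucleic acid list. The names `Domain` and `ordered_domains` are hypothetical, and treating 7610-11875 as the span that contains these four domains is an assumption made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# First nucleotide range quoted in the text for each of these domain types.
DOMAINS = [
    Domain("ACP", 11549, 11764),
    Domain("KS", 7643, 8920),
    Domain("ER", 10529, 11428),
    Domain("AT", 9236, 10201),
]
SPAN = (7610, 11875)  # assumed enclosing range, also quoted in the text


def ordered_domains(domains: List[Domain], span: Tuple[int, int]) -> List[str]:
    """Return domain names sorted by start position.

    Raises ValueError if a domain lies outside span or two domains overlap.
    """
    lo, hi = span
    ordered = sorted(domains, key=lambda d: d.start)
    for d in ordered:
        if not (lo <= d.start <= d.end <= hi):
            raise ValueError(f"{d.name} lies outside {lo}-{hi}")
    for left, right in zip(ordered, ordered[1:]):
        if left.end >= right.start:
            raise ValueError(f"{left.name} overlaps {right.name}")
    return [d.name for d in ordered]


print(ordered_domains(DOMAINS, SPAN))
```

Sorting by start position recovers the linear arrangement of the domains on the nucleotide sequence, which is a convenient sanity check when transcribing long coordinate lists by hand.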
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-ribosomal peptide synthetase, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids
  • said non-ribosomal peptide synthetase preferably comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:
  • nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 157
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides
  • nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleo
  • the present invention further provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:2-23.
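The "consecutive 20 base pair portion identical in sequence" criterion used in the nucleotide claims above can be illustrated with a short Python sketch. The function name and the toy sequences are hypothetical, and the sketch assumes plain A/C/G/T strings compared on one strand only; it is not a procedure given in the patent.

```python
def shares_identical_window(query: str, reference: str, k: int = 20) -> bool:
    """Return True if query and reference share an identical run of k consecutive bases.

    Assumption: both inputs are plain nucleotide strings on the same strand;
    case is ignored. k defaults to 20, the window the text calls preferred.
    """
    query, reference = query.upper(), reference.upper()
    if k <= 0 or len(query) < k or len(reference) < k:
        return False
    # Collect every k-base window of the reference once, then scan the query.
    reference_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in reference_windows
               for i in range(len(query) - k + 1))


# Toy data only; these are not sequences from SEQ ID NO:1.
reference = "ACGT" * 10
query_with_match = "TTTT" + reference[5:25] + "GGGG"
query_without_match = "A" * 30
```

Passing `k=25`, `30`, `35`, `40`, `45`, or `50` covers the other window sizes the text lists.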
  • the present invention also provides methods for the recombinant production of polyketides such as epothilones in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer
  • a specific advantage of these production methods is the chirality of the molecules produced; production in transgenic organisms avoids the generation of populations of racemic mixtures, within which some enantiomers may have reduced activity.
  • the present invention provides a method for heterologous expression of epothilone in a recombinant host, comprising: (a) introducing into a host a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention that comprises a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone; and (b) growing the host in conditions that allow biosynthesis of epothilone in the host.
  • the present invention also provides a method for producing epothilone, comprising: (a) expressing epothilone in a recombinant host by the aforementioned method; and (b) extracting epothilone from the recombinant host.
  • the present invention provides an isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
  • the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
  • said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.
  • the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
  • said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
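The domain boundaries listed above are amino acid ranges within a protein sequence. The Python sketch below shows how such a range maps to a slice and a length. It assumes the coordinates are 1-based and inclusive, which the text does not state explicitly. The `Domain` type, the helper names, and the placeholder protein are hypothetical; only the MT and TE coordinates are taken from the text.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    kind: str    # domain abbreviation, e.g. "MT" or "TE"
    seq_id: int  # SEQ ID NO of the protein that carries the domain
    start: int   # first amino acid (assumed 1-based, inclusive)
    end: int     # last amino acid (assumed inclusive)


# Two of the ranges quoted in the text.
MT = Domain("MT", 6, 2671, 3045)
TE = Domain("TE", 7, 2165, 2439)


def domain_length(domain: Domain) -> int:
    """Number of residues covered by an inclusive range."""
    return domain.end - domain.start + 1


def extract_domain(protein: str, domain: Domain) -> str:
    """Slice a domain out of a protein given 1-based inclusive coordinates."""
    if not (1 <= domain.start <= domain.end <= len(protein)):
        raise ValueError(f"{domain.kind} range lies outside the protein")
    return protein[domain.start - 1:domain.end]


# Placeholder sequence standing in for a real protein (not SEQ ID NO:7 itself).
placeholder_protein = "A" * 2439
te_residues = extract_domain(placeholder_protein, TE)
```

Under this reading the MT range spans 375 residues and the TE range spans 275.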
  • Associated With / Operatively Linked: refers to two DNA sequences that are related physically or functionally.
  • a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operatively linked, or situated such that the regulator DNA sequence will affect the expression level of the coding or structural DNA sequence.
  • Chimeric Gene: A recombinant DNA sequence in which a promoter or regulatory DNA sequence is operatively linked to, or associated with, a DNA sequence that codes for an mRNA or which is expressed as a protein, such that the regulator DNA sequence is able to regulate transcription or expression of the associated DNA sequence.
  • the regulator DNA sequence of the chimeric gene is not normally operatively linked to the associated DNA sequence as found in nature.
  • Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a protein.
  • ACP acyl carrier protein
  • KS β-ketosynthase
  • AT acyltransferase
  • KR β-ketoreductase
  • DH dehydratase
  • ER enoylreductase
  • TE thioesterase
  • Epothilones: 16-membered macrocyclic polyketides naturally produced by the bacterium Sorangium cellulosum strain So ce90, which mimic the biological effects of taxol.
  • epothilone refers to the class of polyketides that includes epothilone A and epothilone B, as well as analogs thereof such as those described in WO 98/25929.
  • Epothilone Synthase: A polyketide synthase responsible for the biosynthesis of epothilone.
  • Gene: A defined region that is located within a genome and that, besides the aforementioned coding DNA sequence, comprises other, primarily regulatory, DNA sequences responsible for the control of the expression, that is to say the transcription and translation, of the coding portion.
  • Heterologous DNA Sequence: A DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence.
  • Homologous DNA Sequence: A DNA sequence naturally associated with a host cell into which it is introduced.
  • an isolated nucleic acid molecule or an isolated enzyme is a nucleic acid molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature.
  • An isolated nucleic acid molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, a recombinant host cell.
  • Module: A genetic element encoding all of the distinct activities required in a single round of polyketide biosynthesis, i.e., one condensation step and all the β-carbonyl processing steps associated therewith.
  • Each module encodes an ACP, a KS, and an AT activity to accomplish the condensation portion of the biosynthesis, and selected post-condensation activities to effect the β-carbonyl processing.
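The module definition above (an ACP, a KS, and an AT for condensation, plus selected processing activities) can be expressed as a small membership check. This is only an illustrative Python sketch: the set names and helper functions are hypothetical, and treating KR, DH, and ER as the processing activities is an assumption drawn from the abbreviation list rather than a statement in the definition itself.

```python
# Activities the definition says every module encodes for the condensation step.
CORE_DOMAINS = frozenset({"KS", "AT", "ACP"})

# Assumption: the "selected post-condensation activities" are drawn from these.
PROCESSING_DOMAINS = frozenset({"KR", "DH", "ER"})


def missing_core_domains(domains):
    """Return the condensation activities that a candidate module lacks."""
    return CORE_DOMAINS - set(domains)


def is_complete_module(domains):
    """True if the candidate encodes ACP, KS, and AT activities."""
    return not missing_core_domains(domains)


def processing_steps(domains):
    """List the processing activities present, in the order given."""
    return [d for d in domains if d in PROCESSING_DOMAINS]


# Hypothetical domain orders, for illustration only.
example_module = ["KS", "AT", "DH", "KR", "ACP"]
incomplete_module = ["KS", "AT"]
```

A module with no processing domains at all still counts as complete here, since the definition only requires the three condensation activities.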
  • NRPS: A non-ribosomal polypeptide synthetase, which is a complex of enzymatic activities responsible for the incorporation of amino acids into secondary metabolites including, for example, amino acid adenylation, epimerization, N-methylation, cyclization, peptidyl carrier protein, and condensation domains.
  • a functional NRPS is one that catalyzes the incorporation of an amino acid into a secondary metabolite.
  • NRPS Gene: One or more genes encoding NRPSs for producing functional secondary metabolites, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
  • Nucleic Acid Molecule: A linear segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.
  • PKS: A polyketide synthase, which is a complex of enzymatic activities (domains) responsible for the biosynthesis of polyketides including, for example, ketoreductase, dehydratase, acyl carrier protein, enoylreductase, ketoacyl ACP synthase, and acyltransferase.
  • a functional PKS is one that catalyzes the synthesis of a polyketide.
  • PKS Genes: One or more genes encoding various polypeptides required for producing functional polyketides, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
  • Substantially Similar: With respect to nucleic acids, a nucleic acid molecule that has at least 60 percent sequence identity with a reference nucleic acid molecule.
  • in a preferred embodiment, a substantially similar DNA sequence is at least 80% identical to a reference DNA sequence; in a more preferred embodiment, a substantially similar DNA sequence is at least 90% identical to a reference DNA sequence; and in a most preferred embodiment, a substantially similar DNA sequence is at least 95% identical to a reference DNA sequence.
  • a substantially similar DNA sequence preferably encodes a protein or peptide having substantially the same activity as the protein or peptide encoded by the reference DNA sequence.
  • a substantially similar nucleotide sequence typically hybridizes to a reference nucleic acid molecule, or fragments thereof, under the following conditions: hybridization at 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 pH 7.0, 1 mM EDTA at 50°C; wash with 2X SSC, 1% SDS, at 50°C.
  • a substantially similar amino acid sequence is an amino acid sequence that is at least 90% identical to the amino acid sequence of a reference protein or peptide and has substantially the same activity as the reference protein or peptide.
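The identity thresholds in the definitions above (60%, 80%, 90%, and 95% for DNA; 90% for amino acid sequences) can be illustrated with a minimal Python sketch. The text does not say how identity is to be computed, so the sketch assumes two sequences that are already aligned to equal length with no gaps. The function names and tier labels are hypothetical, and labeling the 80% tier "preferred" is an inference from the "more preferred" and "most preferred" tiers that follow it.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences.

    Assumption: the caller has already aligned the sequences; gaps are not handled.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def dna_similarity_tier(pct: float) -> str:
    """Map a DNA percent identity onto the thresholds quoted in the text."""
    if pct >= 95:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 80:
        return "preferred"
    if pct >= 60:
        return "substantially similar"
    return "not substantially similar"


def protein_is_substantially_similar(pct: float) -> bool:
    """Identity criterion only; the text additionally requires comparable activity."""
    return pct >= 90


# Toy sequences differing at one position out of ten.
example_pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

Identity alone does not settle the question for proteins, because the definition also requires substantially the same activity as the reference.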
  • Transformation: A process for introducing heterologous nucleic acid into a host cell or organism.
  • Transformed / Transgenic / Recombinant: refers to a host organism such as a bacterium into which a heterologous nucleic acid molecule has been introduced.
  • the nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating.
  • Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.
  • a "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, i.e., a bacterium, which does not contain the heterologous nucleic acid molecule.
  • Nucleotides are indicated by their bases by the following standard abbreviations: adenine (A), cytosine (C), thymine (T), and guanine (G).
  • Amino acids are likewise indicated by the following standard abbreviations: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyros
  • SEQ ID NO:1 is the nucleotide sequence of a 68750 bp contig containing 22 open reading frames (ORFs), which comprises the epothilone biosynthesis genes.
  • SEQ ID NO:2 is the protein sequence of a type I polyketide synthase (EPOS A) encoded by epoA (nucleotides 7610-11875 of SEQ ID NO:1).
  • SEQ ID NO:3 is the protein sequence of a non-ribosomal peptide synthetase (EPOS P) encoded by epoP (nucleotides 11872-16104 of SEQ ID NO:1).
  • SEQ ID NO:4 is the protein sequence of a type I polyketide synthase (EPOS B) encoded by epoB (nucleotides 16251-21749 of SEQ ID NO:1).
  • SEQ ID NO:5 is the protein sequence of a type I polyketide synthase (EPOS C) encoded by epoC (nucleotides 21746-43519 of SEQ ID NO:1).
  • SEQ ID NO:6 is the protein sequence of a type I polyketide synthase (EPOS D) encoded by epoD (nucleotides 43524-54920 of SEQ ID NO:1).
  • SEQ ID NO:7 is the protein sequence of a type I polyketide synthase (EPOS E) encoded by epoE (nucleotides 54935-62254 of SEQ ID NO:1).
  • SEQ ID NO:8 is the protein sequence of a cytochrome P450 oxygenase homologue (EPOS F) encoded by epoF (nucleotides 62369-63628 of SEQ ID NO:1).
  • SEQ ID NO:9 is a partial protein sequence (partial Orf 1) encoded by orf1 (nucleotides 1-1826 of SEQ ID NO:1).
  • SEQ ID NO:10 is a protein sequence (Orf 2) encoded by orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:11 is a protein sequence (Orf 3) encoded by orf3 (nucleotides 3415-5556 of SEQ ID NO:1).
  • SEQ ID NO:12 is a protein sequence (Orf 4) encoded by orf4 (nucleotides 5992-5612 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:13 is a protein sequence (Orf 5) encoded by orf5 (nucleotides 6226-6675 of SEQ ID NO:1).
  • SEQ ID NO:14 is a protein sequence (Orf 6) encoded by orf6 (nucleotides 63779-64333 of SEQ ID NO:1).
  • SEQ ID NO:15 is a protein sequence (Orf 7) encoded by orf7 (nucleotides 64290-63853 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:16 is a protein sequence (Orf 8) encoded by orf8 (nucleotides 64363-64920 of SEQ ID NO:1).
  • SEQ ID NO:17 is a protein sequence (Orf 9) encoded by orf9 (nucleotides 64727-64287 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:18 is a protein sequence (Orf 10) encoded by orf10 (nucleotides 65063-65767 of SEQ ID NO:1).
  • SEQ ID NO:19 is a protein sequence (Orf 11) encoded by orf11 (nucleotides 65874-65008 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:20 is a protein sequence (Orf 12) encoded by orf12 (nucleotides 66338-65871 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:21 is a protein sequence (Orf 13) encoded by orf13 (nucleotides 66667-67137 of SEQ ID NO:1).
  • SEQ ID NO:22 is a protein sequence (Orf 14) encoded by orf14 (nucleotides 67334-68251 of SEQ ID NO:1).
  • SEQ ID NO:23 is a partial protein sequence (partial Orf 15) encoded by orf15 (nucleotides 68346-68750 of SEQ ID NO:1).
  • SEQ ID NO:24 is the universal reverse PCR primer sequence.
  • SEQ ID NO:25 is the universal forward PCR primer sequence.
  • SEQ ID NO:26 is the NH24 end "B" PCR primer sequence.
  • SEQ ID NO:27 is the NH2 end "A" PCR primer sequence.
  • SEQ ID NO:28 is the NH2 end "B" PCR primer sequence.
  • SEQ ID NO:29 is the pEPO15-NH6 end "B" PCR primer sequence.
  • SEQ ID NO:30 is the pEPO15-H2.7 end "A" PCR primer sequence.
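The sequence listing above gives each open reading frame as a nucleotide range in SEQ ID NO:1, with reverse-strand ORFs written high-to-low (for example 3171-1900). The Python sketch below shows one way to turn such ranges into sequence and codon counts. It assumes 1-based inclusive coordinates and that a range with the larger number first means "take the reverse complement"; both are readings of the listing, not statements from it. All function names and the toy contig are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orf_length(first: int, last: int) -> int:
    """Length in nucleotides of an inclusive range, whichever way it is written."""
    return abs(last - first) + 1


def extract_orf(contig: str, first: int, last: int) -> str:
    """Return the ORF sequence; first > last is read as the reverse complement strand."""
    lo, hi = min(first, last), max(first, last)
    if lo < 1 or hi > len(contig):
        raise ValueError("coordinates fall outside the contig")
    segment = contig[lo - 1:hi]
    return reverse_complement(segment) if first > last else segment


# Three coordinate pairs quoted in the listing.
ORFS = {"epoA": (7610, 11875), "epoE": (54935, 62254), "orf2": (3171, 1900)}

# Codon counts; a stop codon, if inside the range, is counted as one codon.
CODONS = {name: orf_length(a, b) // 3 for name, (a, b) in ORFS.items()}

# Toy contig for demonstrating the slicing (not SEQ ID NO:1).
toy_contig = "AAACCCGGGTTT"
```

Each of the three quoted ranges has a length divisible by three, which is what a complete coding region would be expected to show.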
  • the genes involved in the biosynthesis of epothilones can be isolated using the techniques according to the present invention.
  • the preferable procedure for the isolation of epothilone biosynthesis genes requires the isolation of genomic DNA from an organism identified as producing epothilones A and B, and the transfer of the isolated DNA on a suitable plasmid or vector to a host organism that does not normally produce the polyketide, followed by the identification of transformed host colonies to which the epothilone-producing ability has been conferred.
  • the exact region of the transforming epothilone-conferring DNA can be more precisely defined.
  • the transforming epothilone-conferring DNA can be cleaved into smaller fragments and the smallest that maintains the epothilone-conferring ability further characterized.
  • a variation of this technique involves the transformation of host DNA into the same host that has had its epothilone-producing ability disrupted by mutagenesis.
  • an epothilone-producing organism is mutated and non-epothilone-producing mutants are isolated. These are then complemented by genomic DNA isolated from the epothilone-producing parent strain.
  • a further example of a technique that can be used to isolate genes required for epothilone biosynthesis is the use of transposon mutagenesis to generate mutants of an epothilone-producing organism that, after mutagenesis, fail to produce the polyketide.
  • the region of the host genome responsible for epothilone production is tagged by the transposon and can be recovered and used as a probe to isolate the native genes from the parent strain.
  • PKS genes that are required for the synthesis of polyketides and that are similar to known PKS genes may be isolated by virtue of their sequence homology to the biosynthetic genes for which the sequence is known, such as those for the biosynthesis of rifamycin or soraphen. Techniques suitable for isolation by homology include standard library screening by DNA hybridization.
  • Preferred for use as a probe molecule is a DNA fragment that is obtainable from a gene or another DNA sequence that plays a part in the synthesis of a known polyketide.
  • a preferred probe molecule comprises a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen PKS (U.S. Patent No. 5,716,849), and a more preferred probe molecule comprises the β-ketoacyl synthase domains from the first and second modules of the rifamycin PKS (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). These can be used to probe a gene library of an epothilone-producing microorganism to isolate the PKS genes responsible for epothilone biosynthesis.
  • biosynthetic genes for epothilones A and B can surprisingly be cloned from a microorganism that produces that polyketide.
  • the cloned PKS genes can be modified and expressed in transgenic host organisms.
  • the isolated epothilone biosynthetic genes can be expressed in heterologous hosts to enable the production of the polyketide with greater efficiency than might be possible from native hosts. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, heterologous genes can be expressed in Streptomyces and other actinomycetes using techniques such as those described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994), both of which are incorporated herein by reference.
  • genes responsible for polyketide biosynthesis can also be expressed in other host organisms such as pseudomonads and E. coli. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art.
  • PKS genes have been successfully expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See, Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985), incorporated herein by reference.
  • the expression vectors pKK223-3 and pKK223-2 can be used to express heterologous genes in E.
  • for operons encoding multiple ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes to be used.
  • Techniques for overexpression in gram-positive species such as Bacillus are also known in the art and can be used in the context of this invention (Quax et al., in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al., American Society for Microbiology, Washington (1993)).
  • other suitable expression systems include yeast and baculovirus expression systems. See, for example, "The Expression of Recombinant Proteins in Yeasts," Sudbery, P. E., Curr. Opin. Biotechnol. 7(5): 517-524 (1996); "Methods for Expressing Recombinant Proteins in Yeast," Mackay, et al., Editor(s): Carey, Paul R., Protein Eng. Des. 105-153, Publisher: Academic, San Diego, Calif. (1996); "Expression of heterologous gene products in yeast," Pichuantes, et al., Editor(s): Cleland, J.
  • another consideration for expression of PKS genes in heterologous hosts is the requirement of enzymes for posttranslational modification of PKS enzymes by phosphopantetheinylation before they can synthesize polyketides.
  • the enzymes responsible for this modification of type I PKS enzymes, phosphopantetheinyl (P-pant) transferases, are not normally present in many hosts such as E. coli.
  • This problem can be solved by coexpression of a P-pant transferase with the PKS genes in the heterologous host, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998), incorporated herein by reference.
  • the significant criteria in the choice of host organism are its ease of manipulation, rapidity of growth (i.e. fermentation), possession of the proper molecular machinery for processes such as posttranslational modification, and its lack of susceptibility to the polyketide being overproduced.
  • Most preferred host organisms are actinomycetes such as strains of Streptomyces.
  • Other preferred host organisms are pseudomonads and E. coli.
  • the above-described methods of polyketide production have significant advantages over the technology currently used in the preparation of the compounds. These advantages include the cheaper cost of production, the ability to produce greater quantities of the compounds, and the ability to produce compounds of a preferred biological enantiomer, as opposed to racemic mixtures inevitably generated by organic synthesis.
  • Compounds produced by heterologous hosts can be used in medical (e.g. cancer treatment in the case of epothilones) as well as agricultural applications.
  • Sorangium cellulosum strain 90 (DSM 6773, Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig) is streaked out and grown (30°C) on an agar plate of SolE medium (0.35% glucose, 0.05% tryptone, 0.15% MgSO4 x 7H2O, 0.05% ammonium sulfate, 0.1% CaCl2, 0.006% K2HPO4, 0.01% sodium dithionite, 0.0008% Fe-EDTA, 1.2% HEPES, 3.5% [vol/vol] supernatant of sterilized stationary S. cellulosum culture) pH ad.
  • pBelobacll contains a gene encoding chloramphenicol resistance, a multiple cloning site in the lacZ gene providing for blue/white selection on appropriate medium, as well as the genes required for the replication and maintenance of the plasmid at one or two copies per cell.
  • the ligation mixture is used to transform Escherichia coli DH10B electrocompetent cells using standard electroporation techniques. Chloramphenicol-resistant recombinant (white, lac mutant) colonies are transferred to a positively charged nylon membrane filter in 384 3X3 grid format. The clones are lysed and the DNA is cross-linked to the filters. The same clones are also preserved as liquid cultures at -80°C.
  • the Bac library filters are probed by standard Southern hybridization procedures
  • the DNA probes used encode β-ketoacyl synthase domains from the first and second modules of the rifamycin polyketide synthase (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)).
  • the probe DNAs are generated by PCR with primers flanking each ketosynthase domain using the plasmid pNE95 as the template (pNE95 equals cosmid 2 described in Schupp et al. (1998)).
  • 25 ng of PCR-amplified DNA is isolated from a 0.5% agarose gel and labeled with 32P-dCTP using a random primer labeling kit (Gibco-BRL, Bethesda MD, USA) according to the manufacturer's instructions.
  • Hybridization is at 65°C for 36 hours and membranes are washed at high stringency (3 times with 0.1 x SSC and 0.5% SDS for 20 min at 65°C).
  • the labeled blot is exposed on a phosphorescent screen and the signals are detected on a PhosphorImager 445SI (screen and 445SI from Molecular Dynamics). This results in strong hybridization of certain Bac clones to the probes.
  • Bac DNA from the Bac clones of interest is isolated by a typical miniprep procedure.
  • the cells are resuspended in 200 µl lysozyme solution (50 mM glucose, 10 mM EDTA, 25 mM Tris-HCl, 5 mg/ml lysozyme), lysed in 400 µl lysis solution (0.2 N NaOH and 2% SDS), the proteins are precipitated (3.0 M potassium acetate, adjusted to pH 5.2 with acetic acid), and the Bac DNA is precipitated with isopropanol.
  • the DNA is resuspended in 20 µl of nuclease-free distilled water, restricted with BamHI (New England Biolabs, Inc.) and separated on a 0.7% agarose gel.
  • the gel is blotted by Southern hybridization as described above and probed under conditions described above, with a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen polyketide synthase as the probe (see, U.S. Patent No. 5,716,849).
  • Five different hybridization patterns are observed.
  • One clone representing each of the five patterns is selected and named pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33, respectively.
  • the DNA of the five selected Bac clones is digested with BamHI and random fragments are subcloned into pBluescript II SK+ (Stratagene) at the BamHI site. Subclones carrying inserts between 2 and 10 kb in size are selected for sequencing of the flanking ends of the inserts and also probed with the 1.2 kb SmaI probe as described above. Subclones that show a high degree of sequence homology to known polyketide synthases and/or strong hybridization to the soraphen ketosynthase domain are used for gene disruption experiments.
  • the BamHI inserts of the subclones generated from the five selected Bac clones as described above are isolated and ligated into the unique BamHI site of plasmid pCIB132 (see, U.S. Patent No. 5,716,849).
  • the pCIB132 derivatives carrying the inserts are transformed into Escherichia coli ED8767 containing the helper plasmid pUZ8 (Hedges and Matthew, Plasmid 2: 269-278 (1979)).
  • the transformants are used as donors in conjugation experiments with Sorangium cellulosum BCE28/2 as recipient.
  • the mixed cells are then centrifuged at 4000 rpm for 10 minutes and resuspended in 0.5 ml G51b medium.
  • This cell suspension is then plated as a drop in the center of a plate with SolE agar containing 50 mg/l kanamycin.
  • the cells obtained after incubation for 24 hours at 30°C are harvested and resuspended in 0.8 ml of G51b medium, and 0.1 to 0.3 ml of this suspension is plated out on a selective SolE solid medium containing phleomycin (30 mg/l), streptomycin (300 mg/l), and kanamycin (50 mg/l).
  • the counterselection of the donor Escherichia coli strain takes place with the aid of streptomycin.
  • the colonies that grow on this selective medium after an incubation time of 8-12 days at a temperature of 30°C are isolated with a plastic loop and streaked out and cultivated on the same agar medium for a second round of selection and purification.
  • the colony-derived cultures that grow on this selective agar medium after 7 days at a temperature of 30°C are transconjugants of Sorangium cellulosum BCE28/2 that have acquired phleomycin resistance by conjugative transfer of the pCIB132 derivatives carrying the subcloned BamHI fragments.
  • Transconjugant cells grown on about 1 square cm surface of the selective SolE plates of the second round of selection are transferred by a sterile plastic loop into 10 ml of medium G52-H in a 50 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 3 days, the culture is transferred into 50 ml of medium G52-H in a 200 ml Erlenmeyer flask.
  • 10 ml of this culture is transferred into 50 ml of medium 23B3 (0.2 % glucose, 2 % potato starch, 1.6 % soya meal defatted, 0.0008 % Fe-EDTA Sodium salt, 0.5 % HEPES (4-(2-hydroxyethyl)-piperazine-1-ethane-sulfonic-acid), 2 % vol/vol polystyrene resin XAD16 (Rohm & Haas), pH adjusted to 7.8 with NaOH) in a 200 ml Erlenmeyer flask.
  • Quantitative determination of the epothilone produced takes place after incubation of the cultures at 30°C and 180 rpm for 7 days.
  • the complete culture broth is filtered by suction through a 150 µm nylon filter.
  • the resin remaining on the filter is then resuspended in 10 ml isopropanol and extracted by shaking the suspension at 180 rpm for 1 hour.
  • the content of epothilones A and B therein is determined by means of HPLC with detection at 250 nm using a UV-DAD detector (HPLC with a Waters Symmetry C18 column and a gradient of 0.02 % phosphoric acid 60%-0% and acetonitrile 40%-100%).
  • Transconjugants with three different integrated BamHI fragments subcloned from pEPO15, namely transconjugants with the BamHI fragment of plasmid pEPO15-21, transconjugants with the BamHI fragment of plasmid pEPO15-4-5, and transconjugants with the BamHI fragment of plasmid pEPO15-4-1, are tested in the manner described above. HPLC analysis reveals that all transconjugants no longer produce epothilone A or B.
  • epothilone A and B are detectable in a concentration of 2-4 mg/l in transconjugants with BamHI fragments integrated that are derived from pEPO20, pEPO30, pEPO31, pEPO33, and in the parental strain BCE28/2.
  • Example 8 Nucleotide Sequence Determination of the Cloned Fragments and
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-21], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-21 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers.
  • The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)).
  • Oligonucleotides designed for the 3' ends of the previously determined sequences are used to extend and join contigs. Both strands are entirely sequenced, and every nucleotide is sequenced at least two times.
  • The nucleotide sequence is compiled using the program Sequencher vers. 3.0 (Gene Codes Corporation), and analyzed using the University of Wisconsin Genetics Computer Group programs.
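The universal primers above are written in triplet-grouped notation. As a minimal sketch (the helper names `clean`, `gc_fraction` and `reverse_complement` are illustrative assumptions, not part of the described protocol), the following Python snippet strips the spacing and reports length, GC fraction and reverse complement for each primer:

```python
# Illustrative helpers only; the names are assumptions, not taken from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Remove the triplet spacing and normalise to upper case."""
    return "".join(primer.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequences as listed in the text (SEQ ID NO:24 and SEQ ID NO:25).
REVERSE_PRIMER = clean("GGA AAC AGC TAT GAC CAT G")
FORWARD_PRIMER = clean("GTA AAA CGA CGG CCA GT")

for name, seq in (("reverse", REVERSE_PRIMER), ("forward", FORWARD_PRIMER)):
    print(f"{name}: {seq} len={len(seq)} GC={gc_fraction(seq):.3f} "
          f"revcomp={reverse_complement(seq)}")
```

The same helpers could be applied to the custom walking primers, whose sequences are not given in this section.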
  • The nucleotide sequence of the 2213-bp insert corresponds to nucleotides 20779-22991 of SEQ ID NO:1.
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-1], and the nucleotide sequence of the 3.9-kb BamHI insert in pEPO15-4-1 is determined as described in (A) above.
  • The nucleotide sequence of the 3909-bp insert corresponds to nucleotides 16876-20784 of SEQ ID NO:1.
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-5], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-4-5 is determined as described in (A) above.
  • The nucleotide sequence of the 2233-bp insert corresponds to nucleotides 42528-44760 of SEQ ID NO:1.
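The stated insert sizes can be cross-checked against the coordinate ranges in SEQ ID NO:1. The sketch below assumes 1-based, inclusive coordinates (the usual convention for sequence listings); the function names are illustrative.

```python
def insert_length(start: int, end: int) -> int:
    """Length in bp of a segment given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of positions shared by two inclusive coordinate ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


# Coordinate ranges quoted in the text for the three sequenced BamHI inserts.
INSERTS = {
    "pEPO15-21": (20779, 22991),
    "pEPO15-4-1": (16876, 20784),
    "pEPO15-4-5": (42528, 44760),
}

for name, (start, end) in INSERTS.items():
    print(name, insert_length(start, end), "bp")

print("shared positions, pEPO15-4-1 / pEPO15-21:",
      overlap(INSERTS["pEPO15-4-1"], INSERTS["pEPO15-21"]))
```

The ranges for pEPO15-4-1 and pEPO15-21 share six positions, which would be consistent with both inserts being reported together with a common 6-bp restriction site; that reading is an inference, not something the text states.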
  • Example 9 Subcloning and Ordering of DNA Fragments from pEPO15 Containing Epothilone Biosynthesis Genes
  • pEPO15 is digested to completion with the restriction enzyme HindIII and the resulting fragments are subcloned into pBluescript II SK- or pNEB193 (New England Biolabs) that has been cut with HindIII and dephosphorylated with calf intestinal alkaline phosphatase.
  • pEPO15-NH1, pEPO15-NH2
  • pEPO15-NH6, pEPO15-NH24
  • pEPO15-H2.7 and pEPO15-H3.0, both based on pBluescript II SK-
  • The BamHI insert of pEPO15-21 is isolated and DIG-labeled (Non-radioactive DNA labeling and detection system, Boehringer Mannheim), and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH24, indicating that pEPO15-21 is contained within pEPO15-NH24.
  • The BamHI insert of pEPO15-4-1 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH24 and pEPO15-H2.7. Nucleotide sequence data generated from one end each of pEPO15-NH24 and pEPO15-H2.7 are also in complete agreement with the previously determined sequence of the BamHI insert of pEPO15-4-1.
  • The BamHI insert of pEPO15-4-5 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH2, indicating that pEPO15-4-5 is contained within pEPO15-NH2.
  • Nucleotide sequence data are generated from both ends of pEPO15-NH2 and from the end of pEPO15-NH24 that does not overlap with pEPO15-4-1.
  • PCR primers NH24 end "B": GTGACTGGCGCCTGGAATCTGCATGAGC (SEQ ID NO:26), NH2 end "A": AGCGGGAGCTTGCTAGACATTCTGTTTC (SEQ ID NO:27), and NH2 end "B": GACGCGCCTCGGGCAGCGCCCCAA (SEQ ID NO:28), pointing towards the HindIII sites, are designed based on these sequences and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates.
  • The HindIII insert of pEPO15-H2.7 is isolated and DIG-labeled as above, and used as a probe in a DNA hybridization experiment at high stringency against pEPO15 digested by NotI.
  • A NotI fragment of about 9 kb in size shows strong hybridization, and is further subcloned into pBluescript II SK- that has been digested with NotI and dephosphorylated with calf intestinal alkaline phosphatase, to yield pEPO15-N9-16.
  • The NotI insert of pEPO15-N9-16 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH6, and also for the expected clones pEPO15-H2.7 and pEPO15-NH24. Nucleotide sequence data are generated from both ends of pEPO15-NH6 and from the end of pEPO15-H2.7 that does not overlap with pEPO15-4-1.
  • PCR primers are designed pointing towards the HindIII sites and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair pEPO15-NH6 end "B": CACCGAAGCGTCGATCTGGTCCATC (SEQ ID NO:29) and pEPO15-H2.7 end "A": CGGTCAGATCGACGACGGGCTTTCC (SEQ ID NO:30) with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced.
  • The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH6 and pEPO15-H2.7, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH6 and pEPO15-H2.7 are, in this order, contiguous.
  • A cosmid DNA library of Sorangium cellulosum So ce90 is generated, using established procedures, in pScosTriplex-II (Ji, et al., Genomics 31: 185-192 (1996)). Briefly, high-molecular weight genomic DNA of Sorangium cellulosum So ce90 is partially digested with the restriction enzyme Sau3AI to provide fragments with average sizes of about 40 kb, and ligated to BamHI and XbaI digested pScosTriplex-II. The ligation mix is packaged with Gigapack III XL (Stratagene) and used to transfect E. coli XL1 Blue MR cells.
  • The cosmid library is screened with the approximately 2.2 kb BamHI - HindIII fragment, derived from the downstream end of the insert of pEPO15-NH2, used as a probe in colony hybridization.
  • A strongly hybridizing clone, named pEPO4E7, is selected.
  • pEPO4E7 DNA is isolated, digested with several restriction endonucleases, and probed in Southern hybridization experiments with the 2.2 kb BamHI - HindIII fragment.
  • A strongly hybridizing NotI fragment of approximately 9 kb in size is selected and subcloned into pBluescript II SK- to yield pEPO4E7-N9-8.
  • End sequencing reveals, however, that the downstream end of the insert of pEPO4E7-N9-8 contains the BamHI - NotI polylinker of pScosTriplex-II, thereby indicating that the genomic DNA insert of pEPO4E7 ends at a Sau3AI site within the extending HindIII - NotI fragment and that the NotI site is derived from pScosTriplex-II.
  • A HindIII - EcoRV fragment of about 13 kb in size is found to strongly hybridize to the probe, and is subcloned into pBluescript II SK- digested with HindIII and HincII to yield pEPO32-HEV15.
  • Oligonucleotide primers are designed based on the downstream end sequence of pEPO15-NH2 and on the upstream (HindIII) end sequence derived from pEPO32-HEV15, and used in sequencing reactions with pEPO4E7-N9-8 as the template.
  • The sequences reveal the existence of a small HindIII fragment (EPO4E7-H0.02) of 24 bp, undetectable in standard restriction analysis, separating the HindIII site at the downstream end of pEPO15-NH2 from the HindIII site at the upstream end of pEPO32-HEV15.
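How two closely spaced HindIII sites give rise to a very small fragment can be illustrated with an in-silico digest. The sketch below uses a made-up toy sequence (not taken from SEQ ID NO:1) and assumes a linear molecule, with fragment sizes measured between cut positions; HindIII recognises AAGCTT and cuts after the first A on the top strand.

```python
import re


def digest_lengths(seq: str, site: str = "AAGCTT", cut_offset: int = 1) -> list[int]:
    """Top-strand fragment lengths after complete digestion of a linear molecule.

    cut_offset is the position of the cut within the recognition site
    (1 for HindIII, which cuts A^AGCTT).
    """
    # A lookahead pattern is used so that overlapping sites would also be found.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0, *cuts, len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy sequence: two HindIII sites separated by 18 bp of filler.
toy = "G" * 50 + "AAGCTT" + "GC" * 9 + "AAGCTT" + "C" * 40

print(digest_lengths(toy))
```

In this toy case the middle fragment is 24 nt long, small enough that it would be easy to miss on a standard agarose gel, which is why it only becomes apparent through sequencing across the junction.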
  • The subclone contig described in Example 9 is extended to include the HindIII fragment EPO4E7-H0.02 and the insert of pEPO32-HEV15, and constitutes the inserts of: pEPO15-NH6, pEPO15-H2.7, pEPO15-NH24, pEPO15-NH2, EPO4E7-H0.02 and pEPO32-HEV15, in this order.
  • The nucleotide sequence of the subclone contig described in Example 10 is determined as follows. pEPO15-H2.7. Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-H2.7], and the nucleotide sequence of the 2.7-kb HindIII insert in pEPO15-H2.7 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers.
  • The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)).
  • Custom-synthesized oligonucleotides, designed for the 3' ends of the previously determined sequences, are used to extend and join contigs.
  • The HindIII inserts of these plasmids are isolated, and subjected to random fragmentation using a Hydroshear apparatus (Genomic Instrumentation Services, Inc.) to yield an average fragment size of 1-2 kb.
  • The fragments are end-repaired using T4 DNA Polymerase and Klenow DNA Polymerase enzymes in the presence of deoxynucleotide triphosphates, and phosphorylated with T4 DNA Kinase in the presence of ribo-ATP. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
  • pEPO32-HEV15. pEPO32-HEV15 is digested with HindIII and SspI, the approximately 13.3 kb fragment containing the ~13 kb HindIII - EcoRV insert from So. cellulosum So ce90 and a 0.3 kb HincII - SspI fragment from pBluescript II SK- is isolated, and partially digested with HaeIII to yield fragments with an average size of 1-2 kb. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
  • The chromatograms are analyzed and assembled into contigs with the Phred, Phrap and Consed programs (Ewing, et al., Genome Res. 8(3): 175-185 (1998); Ewing, et al., Genome Res. 8(3): 186-194 (1998); Gordon, et al., Genome Res. 8(3): 195-202 (1998)). Contig gaps are filled, sequence discrepancies are resolved, and low-quality regions are resequenced using custom-designed oligonucleotide primers for sequencing on either the original subclones or selected clones from the random subclone libraries. Both strands are completely sequenced, and every basepair is covered with at least a minimum aggregated Phred score of 40 (confidence level of 99.99%).
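The link between a Phred score of 40 and a 99.99% confidence level follows from the standard definition Q = -10 log10(P), where P is the estimated probability that a base call is wrong. A minimal sketch (function names are illustrative):

```python
def phred_to_error_probability(q: float) -> float:
    """Estimated probability of an incorrect base call for Phred score q."""
    return 10 ** (-q / 10)


def phred_to_accuracy_percent(q: float) -> float:
    """Base-call confidence in percent for Phred score q."""
    return (1 - phred_to_error_probability(q)) * 100


for q in (20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_probability(q):.4g}, "
          f"confidence {phred_to_accuracy_percent(q):.2f}%")
```

A score of 40 therefore corresponds to roughly one expected error in 10,000 bases.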
  • The nucleotide sequence of the 68750 bp contig is shown as SEQ ID NO:1.
  • Example 12 Nucleotide Sequence Analysis of the Epothilone Biosynthesis Genes
  • SEQ ID NO:1 is found to contain 22 ORFs as detailed below in Table 1:
  • epoA codes for EPOS A (SEQ ID NO:2), a type I polyketide synthase consisting of a single module, and harboring the following domains: β-ketoacyl-synthase (KS) (nucleotides 7643-8920 of SEQ ID NO:1, amino acids 11-437 of SEQ ID NO:2); acyltransferase (AT) (nucleotides 9236-10201 of SEQ ID NO:1, amino acids 543-864 of SEQ ID NO:2); enoyl reductase (ER) (nucleotides 10529-11428 of SEQ ID NO:1, amino acids 974-1273 of SEQ ID NO:2); and acyl carrier protein homologous domain (ACP) (nucleotides 11549-11764 of SEQ ID NO:1, am
  • Sequence comparisons and motif analysis (Haydock, et al., FEBS Lett. 374: 246-248 (1995); Tang, et al., Gene 216: 255-265 (1998)) reveal that the AT of EPOS A is specific for malonyl-CoA.
  • EPOS A should be involved in the initiation of epothilone biosynthesis by loading the acetate unit to the multienzyme complex that will eventually form part of the 2-methylthiazole ring (C26 and C20).
  • epoP nucleotides 11872-16104 of SEQ ID NO:1
  • EPOS P SEQ ID NO:3
  • EPOS P harbors the following domains:
  • motif K (amino acids 72-81 [FPLTDIQESY] of SEQ ID NO:3, corresponding to nucleotide positions 12085-12114 of SEQ ID NO:1); motif L (amino acids 118-125 [VVARHDML] of SEQ ID NO:3, corresponding to nucleotide positions 12223-12246 of SEQ ID NO:1); motif M (amino acids 199-212 [SIDLINVDLGSLSI] of SEQ ID NO:3, corresponding to nucleotide positions 12466-12507 of SEQ ID NO:1); and motif O (amino acids 353-363 [GDFTSMVLLDI] of SEQ ID NO:3, corresponding to nucleotide positions 12928-12960 of SEQ ID NO:1);
  • motif A (amino acids 549-565 [LTYEELSRRSRRLGARL] of SEQ ID NO:3, corresponding to nucleotide positions 13516-13566 of SEQ ID NO:1); motif B (amino acids 588-603 [VAVLAVLESGAAYVPI] of SEQ ID NO:3, corresponding to nucleotide positions 13633-13680 of SEQ ID NO:1); motif C (amino acids 669-684 [AYVIYTSGSTGLPKGV] of SEQ ID NO:3, corresponding to nucleotide positions 13876-13923 of SEQ ID NO:1); motif D (amino acids 815-821 [SLGGATE] of SEQ ID NO:3, corresponding to nucleotide positions 14313-14334 of SEQ ID NO:1); motif E (amino acids 868-892 [GQLYIGGVGLALGYWRDEEKTRKSF] of
  • PCP peptidyl carrier protein homologous domain
  • EPOS P is involved in the activation of a cysteine by adenylation, binding the activated cysteine as an aminoacyl-S-PCP, forming a peptide bond between the enzyme-bound cysteine and the acetyl-S-ACP supplied by EPOS A, and the formation of the initial thiazoline ring by intramolecular heterocyclization.
  • The unknown domain of EPOS P displays very weak homologies to NAD(P)H oxidases and reductases from Bacillus species. Thus, this unknown domain and/or the ER domain of EPOS A may be involved in the oxidation of the initial 2-methylthiazoline ring to a 2-methylthiazole.
  • epoB (nucleotides 16251-21749 of SEQ ID NO:1) codes for EPOS B (SEQ ID NO:4), a type I polyketide synthase consisting of a single module, and harboring the following domains: KS (nucleotides 16269-17546 of SEQ ID NO:1, amino acids 7-432 of SEQ ID NO:4); AT (nucleotides 17865-18827 of SEQ ID NO:1, amino acids 539-859 of SEQ ID NO:4); dehydratase (DH) (nucleotides 18855-19361 of SEQ ID NO:1, amino acids 869-1037 of SEQ ID NO:4); β-ketoreductase (KR) (nucleotides 20565-21302 of SEQ ID NO:1, amino acids 1439-1684 of SEQ ID NO:4); and ACP (nucleotides 21414
  • EPOS B is specific for methylmalonyl-CoA.
  • EPOS B should be involved in the first polyketide chain extension by catalysing the Claisen-like condensation of the 2-methyl-4-thiazolecarboxyl-S-PCP starter group with the methylmalonyl-S-ACP, and the concomitant reduction of the β-keto group of C17 to an enoyl.
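The paired nucleotide and amino acid coordinates quoted for each domain are related by simple codon arithmetic. The sketch below assumes 1-based inclusive coordinates and a gene on the forward strand whose first nucleotide is the first base of codon 1; the function name is illustrative. It reproduces the EPOS B domain boundaries listed above from the epoB start position (nucleotide 16251).

```python
def nt_range_to_aa_range(orf_start: int, nt_start: int, nt_end: int) -> tuple[int, int]:
    """Map a nucleotide range onto codon (amino acid) numbers of an ORF.

    Coordinates are 1-based and inclusive; the ORF is assumed to lie on the
    forward strand and to begin with codon 1 at orf_start.
    """
    if nt_start < orf_start or nt_end < nt_start:
        raise ValueError("range must lie inside the ORF and be ordered")
    first = (nt_start - orf_start) // 3 + 1
    last = (nt_end - orf_start) // 3 + 1
    return first, last


EPOB_START = 16251  # first nucleotide of epoB in SEQ ID NO:1, per the text

# Domain nucleotide ranges of EPOS B as quoted in the text.
EPOB_DOMAINS = {
    "KS": (16269, 17546),
    "AT": (17865, 18827),
    "DH": (18855, 19361),
    "KR": (20565, 21302),
}

for domain, (start, end) in EPOB_DOMAINS.items():
    print(domain, nt_range_to_aa_range(EPOB_START, start, end))
```

Genes on the reverse complement strand, such as orf2, would need the mirrored calculation and are not handled by this sketch.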
  • epoC (nucleotides 21746-43519 of SEQ ID NO:1 ) codes for EPOS C (SEQ ID NO:5), a type I polyketide synthase consisting of 4 modules.
  • The first module harbors a KS (nucleotides 21860-23116 of SEQ ID NO:1, amino acids 39-457 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 23431-24397 of SEQ ID NO:1, amino acids 563-884 of SEQ ID NO:5); a KR (nucleotides 25184-25942 of SEQ ID NO:1, amino acids 1147-1399 of SEQ ID NO:5); and an ACP (nucleotides 26045-26263 of SEQ ID NO:1, amino acids 1434-1506 of SEQ ID NO:5).
  • This module incorporates an acetate extender unit (C14-C13) and reduces the β-keto group at C15 to the hydroxyl group that takes part in the final lactonization of the epothilone macrolactone ring.
  • The second module of EPOS C harbors a KS (nucleotides 26318-27595 of SEQ ID NO:1, amino acids 1524-1950 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 27911-28876 of SEQ ID NO:1, amino acids 2056-2377 of SEQ ID NO:5); a KR (nucleotides 29678-30429 of SEQ ID NO:1, amino acids 2645-2895 of SEQ ID NO:5); and an ACP (nucleotides 30539-30759 of SEQ ID NO:1, amino acids 2932-3005 of SEQ ID NO:5).
  • This module incorporates an acetate extender unit (C12-C11) and reduces the β-keto group at C13 to a hydroxyl group.
  • The nascent polyketide chain of epothilone corresponds to epothilone A.
  • The incorporation of the methyl side chain at C12 in epothilone B would require a post-PKS C-methyltransferase activity.
  • The formation of the epoxide ring at C13-C12 would also require a post-PKS oxidation step.
  • The third module of EPOS C harbors a KS (nucleotides 30815-32092 of SEQ ID NO:1, amino acids 3024-3449 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 32408-33373 of SEQ ID NO:1, amino acids 3555-3876 of SEQ ID NO:5); a DH (nucleotides 33401-33889 of SEQ ID NO:1, amino acids 3886-4048 of SEQ ID NO:5); an ER (nucleotides 35042-35902 of SEQ ID NO:1, amino acids 4433-4719 of SEQ ID NO:5); a KR (nucleotides 35930-36667 of SEQ ID NO:1, amino acids 4729-4974 of SEQ ID NO:5); and an ACP (nucleotides 36773-36991 of SEQ ID NO:1, am
  • This module incorporates a propionate extender unit (C24 and C8-C7) and fully reduces the β-keto group at C9.
  • epoD nucleotides 43524-54920 of SEQ ID NO:1
  • EPOS D SEQ ID NO:6
  • The first module harbors a KS (nucleotides 43626-44885 of SEQ ID NO:1, amino acids 35-454 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 45204-46166 of SEQ ID NO:1, amino acids 561-881 of SEQ ID NO:6); a KR (nucleotides 46950-47702 of SEQ ID NO:1, amino acids 1143-1393 of SEQ ID NO:6); and an ACP (nucleotides 47811-48032 of SEQ ID NO:1, amino acids 1430-1503 of SEQ ID NO:6).
  • This module incorporates a propionate extender unit (C23 and C6-C5) and reduces the β-keto group at C7 to a hydroxyl group.
  • The second module harbors a KS (nucleotides 48087-49361 of SEQ ID NO:1, amino acids 1522-1946 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 49680-50642 of SEQ ID NO:1, amino acids 2053-2373 of SEQ ID NO:6); a DH (nucleotides 50670-51176 of SEQ ID NO:1, amino acids 2383-2551 of SEQ ID NO:6); a methyltransferase (MT, nucleotides 51534-52657 of SEQ ID NO:1, amino acids 2671-3045 of SEQ ID NO:6); a KR (nucleotides 53697-54431 of SEQ
  • This module incorporates a propionate extender unit (C21 or C22 and C4-C3) and reduces the β-keto group at C5 to a hydroxyl group. This reduction is somewhat unexpected, since epothilones contain a keto group at C5. Discrepancies of this kind between the deduced reductive capabilities of PKS modules and the redox state of the corresponding positions in the final polyketide products have, however, been reported in the literature (see, for example, Schwecke, et al., Proc. Natl. Acad. Sci. USA 92: 7839-7843 (1995) and Schupp, et al., FEMS Microbiology Letters 159: 201-207 (1998)).
  • EPOS D is predicted to incorporate a propionate unit into the growing polyketide chain, providing one methyl side chain at C4.
  • This module also contains a methyltransferase domain integrated into the PKS between the DH and the KR domains, in an arrangement similar to the one seen in the HMWP1 yersiniabactin synthase (Gehring, A.M., DeMoll, E., Fetherston, J.D., Mori, I., Mayhew, G.F., Blattner, F.R., Walsh, C.T., and Perry, R.D.: Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis. Chem. Biol. 5, 573-586, 1998).
  • EPOS E (SEQ ID NO:7), a type I polyketide synthase consisting of one module, harboring a KS (nucleotides 55028-56284 of SEQ ID NO:1, amino acids 32-450 of SEQ ID NO:7); a malonyl CoA-specific AT (nucleotides 56600-57565 of SEQ ID NO:1, amino acids 556-877 of SEQ ID NO:7); a DH (nucleotides 57593-58087 of SEQ ID NO:1, amino acids 887-1051 of SEQ ID NO:7); a probably nonfunctional ER (nucleotides 59366-60304 of SEQ ID NO:1, am
  • The ER domain in this module harbors an active site motif with some highly unusual amino acid substitutions that probably render this domain inactive.
  • The module incorporates an acetate extender unit (C2-C1), and reduces the β-keto at C3 to an enoyl group.
  • Epothilones contain a hydroxyl group at C3, so this reduction also appears to be excessive, as discussed for the second module of EPOS D.
  • The TE domain of EPOS E takes part in the release and cyclization of the grown polyketide chain via lactonization between the carboxyl group of C1 and the hydroxyl group of C15.
  • The deduced protein product (Orf 2, SEQ ID NO:10) of orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1) shows strong similarities to hypothetical ORFs from Mycobacterium and Streptomyces coelicolor, and more distant similarities to carboxypeptidases and DD-peptidases of different bacteria.
  • The deduced protein product of orf3 shows homologies to Na/H antiporters of different bacteria. Orf 3 might take part in the export of epothilones from the producer strain. orf4 and orf5 have no homologues in the sequence databanks.
  • epoF codes for EPOS F (SEQ ID NO:8), a deduced protein with strong sequence similarities to cytochrome P450 oxygenases.
  • EPOS F may take part in the adjustment of the redox state of the carbons C12, C5, and/or C3.
  • The deduced protein product of orf14 shows strong similarities to GI:3293544, a hypothetical protein with no proposed function from Streptomyces coelicolor, and also to GI:2654559, the human embryonic lung protein. It is also more distantly related to cation efflux system proteins like GI:2623026 from Methanobacterium thermoautotrophicum, so it might also take part in the export of epothilones from the producing cells.
  • The remaining ORFs (orf6-orf13 and orf15) show no homologies to entries in the sequence databanks.
  • Epothilone synthase genes according to the present invention are expressed in heterologous organisms for the purposes of epothilone production at greater quantities than can be accomplished by fermentation of Sorangium cellulosum.
  • A preferable host for heterologous expression is Streptomyces, e.g. Streptomyces coelicolor, which natively produces the polyketide actinorhodin. Techniques for recombinant PKS gene expression in this host are described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994).
  • The heterologous host strain is engineered to contain a chromosomal deletion of the actinorhodin (act) gene cluster.
  • Expression plasmids containing the epothilone synthase genes of the invention are constructed by transferring DNA from a temperature-sensitive donor plasmid to a recipient shuttle vector in E. coli (McDaniel et al. (1993) and Kao et al. (1994)), such that the synthase genes are built up by homologous recombination within the vector.
  • The epothilone synthase gene cluster is introduced into the vector by restriction fragment ligation.
  • Following selection, e.g. as described in Kao et al. (1994), DNA from the vector is introduced into the act-minus Streptomyces coelicolor strain according to protocols set forth in Hopwood et al., Genetic Manipulation of Streptomyces. A Laboratory Manual (John Innes Foundation, Norwich, United Kingdom, 1985), incorporated herein by reference.
  • The recombinant Streptomyces strain is grown on R2YE medium (Hopwood et al. (1985)) and produces epothilones.
  • the epothilone synthase genes according to the present invention are expressed in other host organisms such as pseudomonads, Bacillus, yeast, insect cells and/or E. coli.
  • PKS and NRPS genes are preferably expressed in E.
  • The expression vectors pKK223-3 and pKK223-2 are used to express PKS and NRPS genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter.
  • G52 Medium yeast extract, low in salt (BioSpringer, Maison Alfort, France) 2 g/l
  • Cyclodextrins (Fluka, Buchs, Switzerland, or Wacker Chemie, Munich, Germany) in different concentrations are sterilised separately and added to the 1B12 medium prior to seeding.
  • the culture is overseeded every 3-4 days, by adding 50 ml of culture to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask). All experiments and fermentations are carried out by starting with this maintenance culture.
  • Fermentation: Fermentations are carried out on a scale of 10 litres, 100 litres and 500 litres. 20 litre and 100 litre fermentations serve as an intermediate culture step. Whereas the precultures and intermediate cultures are seeded, like the maintenance culture, at 10% (v/v), the main cultures are seeded with 20% (v/v) of the intermediate culture. Important: In contrast to the agitating cultures, the ingredients of the media for the fermentation are calculated on the final culture volume including the inoculum. If, for example, 18 litres of medium + 2 litres of inoculum are combined, then substances for 20 litres are weighed in, but are only mixed with 18 litres.
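The weighing rule in the note above (ingredients calculated on the final volume, but dissolved in the volume that remains after subtracting the inoculum) can be sketched as follows. The function name and the one-ingredient recipe are illustrative; the 2 g/l yeast extract figure is the one quoted for G52 medium, and the other G52 ingredients are omitted.

```python
def fermenter_charge(final_volume_l: float,
                     inoculum_fraction: float,
                     recipe_g_per_l: dict[str, float]):
    """Return (medium volume, inoculum volume, grams of each ingredient).

    Ingredient masses are calculated on the final culture volume, including
    the inoculum, as prescribed for the fermenter cultures.
    """
    inoculum_l = final_volume_l * inoculum_fraction
    medium_l = final_volume_l - inoculum_l
    masses_g = {name: conc * final_volume_l for name, conc in recipe_g_per_l.items()}
    return medium_l, inoculum_l, masses_g


# Worked example from the text: 18 litres of medium + 2 litres of inoculum.
medium_l, inoculum_l, masses_g = fermenter_charge(
    final_volume_l=20.0,
    inoculum_fraction=0.10,
    recipe_g_per_l={"yeast extract": 2.0},  # partial recipe, for illustration
)
print(medium_l, inoculum_l, masses_g)
```

For the 20 litre example this gives 40 g of yeast extract dissolved in 18 litres, so the concentration only reaches its nominal value once the inoculum has been added.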
  • 100 litres: 90 litres of G52 medium in a fermenter having a total volume of 150 litres are seeded with 10 litres of the 20 litre intermediate culture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 150 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control. Main culture: 10 litres, 100 litres or 500 litres:
  • 10 litres: The media substances for 10 litres of 1B12 medium are sterilised in 7 litres of water, then 1 litre of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution is added, and seeded with 2 litres of a 20 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5 (i.e. no control between pH 7.1 and 8.1).
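The pH regime described above amounts to a dead-band controller: nothing is dosed while the pH stays between 7.1 and 8.1. The sketch below is a hypothetical illustration of that logic only; it assumes that KOH is dosed when the pH falls below the band and H2SO4 when it rises above it, and it says nothing about dosing rates, which the text does not give.

```python
from typing import Optional


def ph_correction(ph: float, low: float = 7.1, high: float = 8.1) -> Optional[str]:
    """Return the reagent to dose, or None while pH is inside the dead band.

    Assumption: base (KOH) raises a pH that is too low, acid (H2SO4) lowers
    a pH that is too high. The limits correspond to pH 7.6 +/- 0.5.
    """
    if ph < low:
        return "KOH"
    if ph > high:
        return "H2SO4"
    return None


for reading in (6.9, 7.6, 8.4):
    print(reading, ph_correction(reading))
```

A wide dead band of this kind keeps the controller from alternately dosing acid and base around the setpoint.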
  • The media substances for 100 litres of 1B12 medium are sterilised in 70 litres of water, then 10 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 20 litres of a 20 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 200 rpm, 0.5 litres air per litre liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
  • the chain of seeding for a 100 litre fermentation is shown schematically as follows: maintenance culture (500ml)
  • 500 litres: The media substances for 500 litres of 1B12 medium are sterilised in 350 litres of water, then 50 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 100 litres of a 100 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 120 rpm, 0.5 litres air per litre liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
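The three main-culture recipes follow the same proportions. As a rough check, the sketch below adds up the stated volumes and derives the final cyclodextrin concentration and the inoculum fraction. It assumes that the "10%" stock solution is weight per volume (so 1 % corresponds to 10 g/l); the names are illustrative.

```python
# (water, 10% cyclodextrin solution, inoculum) in litres, as stated in the text.
MAIN_CULTURES = {
    10: (7.0, 1.0, 2.0),
    100: (70.0, 10.0, 20.0),
    500: (350.0, 50.0, 100.0),
}


def summarize(water_l: float, cd_solution_l: float, inoculum_l: float,
              cd_stock_percent_wv: float = 10.0):
    """Return (total volume in l, cyclodextrin in g/l, inoculum fraction)."""
    total_l = water_l + cd_solution_l + inoculum_l
    # 1 % (w/v) = 10 g/l, so grams added = percent * 10 * litres of stock.
    cd_g_per_l = cd_stock_percent_wv * 10.0 * cd_solution_l / total_l
    return total_l, cd_g_per_l, inoculum_l / total_l


for scale, volumes in MAIN_CULTURES.items():
    print(scale, summarize(*volumes))
```

Under that assumption every scale ends up at 10 g/l cyclodextrin with a 20% (v/v) inoculum, which matches the seeding ratio given for main cultures.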
  • Solvents A: 0.02 % phosphoric acid
  • Epo A 4.30 min
  • Epo B 5.38 min
  • Cyclodextrins are cyclic (α-1,4)-linked oligosaccharides of α-D-glucopyranose with a relatively hydrophobic central cavity and a hydrophilic external surface area.
  • α-cyclodextrin (6), β-cyclodextrin (7), γ-cyclodextrin (8), δ-cyclodextrin (9), ε-cyclodextrin (10), ζ-cyclodextrin (11), η-cyclodextrin (12), and θ-cyclodextrin (13).
  • δ-cyclodextrin and in particular α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, or mixtures thereof.
  • Cyclodextrin derivatives are primarily derivatives of the above-mentioned cyclodextrins, especially of α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, primarily those in which one or more up to all of the hydroxy groups (3 per glucose radical) are etherified or esterified.
  • Ethers are primarily alkyl ethers, especially lower alkyl, such as methyl or ethyl ether, also propyl or butyl ether; the aryl-hydroxyalkyl ethers, such as phenyl-hydroxy-lower-alkyl, especially phenyl-hydroxyethyl ether; the hydroxyalkyl ethers, in particular hydroxy-lower-alkyl ethers, especially 2-hydroxyethyl, hydroxypropyl such as 2-hydroxypropyl or hydroxybutyl such as 2-hydroxybutyl ether; the carboxyalkyl ethers, in particular carboxy-lower-alkyl ethers, especially carboxymethyl or carboxyethyl ether; derivatised carboxyalkyl ethers, in particular derivatised carboxy-lower-alkyl ether in which the derivatised carboxy is etherified or amidated carboxy (primarily aminocarbonyl, mono- or di-lower-alkyl-
  • alk is alkyl, especially lower alkyl, and n is a whole number from 2 to 12, especially 2 to 5, in particular 2 or 3; cyclodextrins in which one or more OH groups are etherified with a radical of formula
  • R' is hydrogen, hydroxy, -O-(alk-O)z-H, -O-(alk(-R)-O-)p-H or -O-(alk(-R)-O-)q-alk-CO-Y; alk in all cases is alkyl, especially lower alkyl; m, n, p, q and z are a whole number from 1 to 12, preferably 1 to 5, in particular 1 to 3; and Y is OR1 or NR2R3, wherein R1, R2 and R3, independently of one another, are hydrogen or lower alkyl, or R2 and R3 combined together with the linking nitrogen signify morpholino, piperidino, pyrrolidino or piperazino; or branched cyclodextrins, in which etherifications or acetals with other sugar molecules are present, especially glucosyl-, diglucosyl- (G2-β-cyclodextrin), maltosyl- or dim
  • Mixtures of two or more of the said cyclodextrins and/or cyclodextrin derivatives may also exist.
  • The cyclodextrins or cyclodextrin derivatives are added to the culture medium preferably in a concentration of 0.02 to 10, preferably 0.05 to 5, especially 0.1 to 4, for example 0.1 to 2 percent by weight (w/v).
  • Cyclodextrins or cyclodextrin derivatives are known or may be produced by known processes (see for example US 3,459,731; US 4,383,992; US 4,535,152; US 4,659,696; EP 0 094 157; EP 0 149 197; EP 0 197 571; EP 0 300 526; EP 0 320 032; EP 0 499 322; EP 0 503 710; EP 0 818 469; WO 90/12035; WO 91/11200; WO 93/19061; WO 95/08993; WO 96/14090; GB 2,189,245; DE 3,118,218; DE 3,317,064 and the references mentioned therein, which also refer to the synthesis of cyclodextrins or cyclodextrin derivatives, or also: T.
  • Fermentation is carried out in a 15 litre glass fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin from Wacker Chemie, Munich, Germany.
  • the progress of fermentation is illustrated in Table 3. Fermentation is ended after 6 days and working up takes place.
  • Fermentation is carried out in a 150 litre fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin.
  • The progress of fermentation is illustrated in Table 4.
  • the fermentation is harvested after 7 days and worked up.
  • Fermentation is carried out in a 750 litre fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin.
  • the progress of fermentation is illustrated in Table 5. The fermentation is harvested after 7 days and worked up.
  • Fermentation is carried out in a 15 litre glass fermenter.
  • the medium does not contain any cyclodextrin or other adsorber.
  • the progress of fermentation is illustrated in Table 6. The fermentation is not harvested and worked up.
  • The main part of the epothilones is found in the centrifugate.
  • The centrifuged cell pulp contains < 15% of the determined epothilone portion and is not further processed.
  • The resin is discharged from the centrifuge and washed with 10-15 litres of deionised water.
  • Desorption is effected by stirring the resin twice, each time in portions with 30 litres of isopropanol in 30 litre glass stirring vessels for 30 minutes. Separation of the isopropanol phase from the resin takes place using a suction filter.
  • The isopropanol is then removed from the combined isopropanol phases by adding 15-20 litres of water in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and the resulting water phase of ca. 10 litres is extracted 3x each time with 10 litres of ethyl acetate. Extraction is effected in 30 litre glass stirring vessels.
  • The ethyl acetate extract is concentrated to 3-5 litres in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and afterwards concentrated to dryness in a rotary evaporator (Büchi type) under vacuum.
  • The result is an ethyl acetate extract of 50.2 g.
  • The ethyl acetate extract is dissolved in 500 ml of methanol, the insoluble portions filtered off using a folded filter, and the solution added to a 10 kg Sephadex LH 20 column (Pharmacia, Uppsala, Sweden) (column diameter 20 cm, filling level ca. 1.2 m). Elution is effected with methanol as eluant. Epothilones A and B are present predominantly in fractions 21-23 (at a fraction size of 1 litre). These fractions are concentrated to dryness in a vacuum on a rotary evaporator (total weight 9.0 g).
  • Compositions comprising epothilones are used for example in the treatment of cancerous diseases, such as various human solid tumors.
  • Anticancer formulations comprise, for example, an active amount of an epothilone together with one or more organic or inorganic, liquid or solid, pharmaceutically suitable carrier materials.
  • Such formulations are delivered, for example, enterally, nasally, rectally, orally, or parenterally, particularly intramuscularly or intravenously.
  • The dosage of the active ingredient is dependent upon the weight, age, and physical and pharmacokinetical condition of the patient and is further dependent upon the method of delivery.
  • Epothilones mimic the biological effects of taxol.
  • Epothilones may be substituted for taxol in compositions and methods utilizing taxol in the treatment of cancer. See, for example, U.S. Patent Nos. 5,496,804, 5,565,478, and 5,641,803, all of which are incorporated herein by reference.
  • Epothilone B is supplied in individual 2 ml glass vials formulated as 1 mg/1 ml of clear, colorless intravenous concentrate.
  • The substance is formulated in polyethylene glycol 300 (PEG 300) and diluted with 50 or 100 ml 0.9% Sodium Chloride Injection, USP, to achieve the desired final concentration of the drug for infusion. It is administered as a single 30-minute intravenous infusion every 21 days (treatment three-weekly) for six cycles, or as a single 30-minute intravenous infusion every 7 days (weekly treatment).
  • The dose is between about 0.1 and about 6, preferably about 0.1 and about 5 mg/m², more preferably about 0.1 and about 3 mg/m², even more preferably 0.1 and 1.7 mg/m², most preferably about 0.3 and about 1 mg/m²; for three-weekly treatment (treatment every three weeks or every third week) the dose is between about 0.3 and about 18 mg/m², preferably about 0.3 and about 15 mg/m², more preferably about 0.3 and about 12 mg/m², even more preferably about 0.3 and about 7.5 mg/m², still more preferably about 0.3 and about 5 mg/m², most preferably about 1.0 and about 3.0 mg/m².
  • This dose is preferably administered to the human by intravenous (i.v.) administration during 2 to 180 min, preferably 2 to 120 min, more preferably during about 5 to about 30 min, most preferably during about 10 to about 30 min, e.g. during about 30 min.
  • The microorganism identified under I. above was accompanied by: a scientific description
  • The microorganism identified under I. above was received by this International Depositary Authority on (date of the original deposit) and a request to convert the original deposit to a deposit under the Budapest Treaty was received by it on (date of receipt of request for conversion).

Abstract

Nucleic acid molecules are isolated from Sorangium cellulosum that encode polypeptides necessary for the biosynthesis of epothilone. Disclosed are methods for the production of epothilones in recombinant hosts transformed with the genes of the invention. In this manner, epothilones can be produced in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer.

Description

GENES FOR THE BIOSYNTHESIS OF EPOTHILONES
FIELD OF THE INVENTION
The present invention relates generally to polyketides and genes for their synthesis. In particular, the present invention relates to the isolation and characterization of novel polyketide synthase and nonribosomal peptide synthetase genes from Sorangium cellulosum that are necessary for the biosynthesis of epothilones A and B.
BACKGROUND OF THE INVENTION
Polyketides are compounds synthesized from two-carbon building blocks, the β-carbon of which always carries a keto group, thus the name polyketide. These compounds include many important antibiotics, immunosuppressants, cancer chemotherapeutic agents, and other compounds possessing a broad range of biological properties. The tremendous structural diversity derives from the different lengths of the polyketide chain, the different side-chains introduced (either as part of the two-carbon building blocks or after the polyketide backbone is formed), and the stereochemistry of such groups. The keto groups may also be reduced to hydroxyls, enoyls, or removed altogether. Each round of two-carbon addition is carried out by a complex of enzymes called the polyketide synthase (PKS) in a manner similar to fatty acid biosynthesis.
The biosynthetic genes for an increasing number of polyketides have been isolated and sequenced. For example, see U.S. Patent Nos. 5,639,949, 5,693,774, and 5,716,849, all of which are incorporated herein by reference, which describe genes for the biosynthesis of soraphen. See also, Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998) and WO 98/07868, which describe genes for the biosynthesis of rifamycin, and U.S. Patent No. 5,876,991, which describes genes for the biosynthesis of tylactone, all of which are incorporated herein by reference. The encoded proteins generally fall into two types: type I and type II. Type I proteins are polyfunctional, with several catalytic domains carrying out different enzymatic steps covalently linked together (e.g. PKS for erythromycin, soraphen, rifamycin, and avermectin (MacNeil et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 245-256 (1993))); whereas type II proteins are monofunctional (Hutchinson et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 203-216 (1993)).
For the simpler polyketides such as actinorhodin (produced by Streptomyces coelicolor), the several rounds of two-carbon additions are carried out iteratively on PKS enzymes encoded by one set of PKS genes. In contrast, synthesis of the more complicated compounds such as erythromycin and soraphen involves PKS enzymes that are organized into modules, whereby each module carries out one round of two-carbon addition (for review, see Hopwood et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C., pp. 267-275 (1993)).
Complex polyketides and secondary metabolites in general may contain substructures that are derived from amino acids instead of simple carboxylic acids. Incorporations of these building blocks are accomplished by non-ribosomal polypeptide synthetases (NRPSs). NRPSs are multienzymes that are organized in modules. Each module is responsible for the addition (and the additional processing, if required) of one amino acid building block. NRPSs activate amino acids by forming aminoacyl-adenylates, and capture the activated amino acids on thiol groups of phosphopantetheinyl prosthetic groups on peptidyl carrier protein domains. Further, NRPSs modify the amino acids by epimerization, N-methylation, or cyclization if necessary, and catalyse the formation of peptide bonds between the enzyme-bound amino acids. NRPSs are responsible for the biosynthesis of peptide secondary metabolites like cyclosporin, could provide polyketide chain terminator units as in rapamycin, or form mixed systems with PKSs as in yersiniabactin biosynthesis.
Epothilones A and B are 16-membered macrocyclic polyketides with an acylcysteine-derived starter unit that are produced by the bacterium Sorangium cellulosum strain So ce90 (Gerth et al., J. Antibiotics 49: 560-563 (1996), incorporated herein by reference). The structure of epothilone A and B wherein R signifies hydrogen (epothilone A) or methyl (epothilone B) is:
[Figure imgf000005_0001: structural formula of epothilones A and B, with substituent R]
The epothilones have a narrow antifungal spectrum and especially show a high cytotoxicity in animal cell cultures (see, Höfle et al., Patent DE 4138042 (1993), incorporated herein by reference). Of significant importance, epothilones mimic the biological effects of taxol, both in vivo and in cultured cells (Bollag et al., Cancer Research 55: 2325-2333 (1995), incorporated herein by reference). Taxol and taxotere, which stabilize cellular microtubules, are cancer chemotherapeutic agents with significant activity against various human solid tumors (Rowinsky et al., J. Natl. Cancer Inst. 83: 1778-1781 (1991)). Competition studies have revealed that epothilones act as competitive inhibitors of taxol binding to microtubules, consistent with the interpretation that they share the same microtubule-binding site and possess a similar microtubule affinity as taxol. However, epothilones enjoy a significant advantage over taxol in that epothilones exhibit a much lower drop in potency compared to taxol against a multiple drug-resistant cell line (Bollag et al. (1995)). Furthermore, epothilones are considerably less efficiently exported from the cells by P-glycoprotein than is taxol (Gerth et al. (1996)). In addition, several epothilone analogs have been synthesized that have a superior cytotoxic activity as compared to epothilone A or epothilone B as demonstrated by their enhanced ability to induce the polymerization and stabilization of microtubules (WO 98/25929, incorporated herein by reference).
Despite the promise shown by the epothilones as anticancer agents, problems pertaining to the production of these compounds presently limit their commercial potential. The compounds are too complex for industrial-scale chemical synthesis and so must be produced by fermentation. Techniques for the genetic manipulation of myxobacteria such as Sorangium cellulosum are described in U.S. Patent No. 5,686,295, incorporated herein by reference. However, Sorangium cellulosum is notoriously difficult to ferment and production levels of epothilones are therefore low. Recombinant production of epothilones in heterologous hosts that are more amenable to fermentation could solve current production problems. However, the genes that encode the polypeptides responsible for epothilone biosynthesis have heretofore not been isolated. Furthermore, the strain that produces epothilones, i.e. So ce90, also produces at least one additional polyketide, spirangien, which would be expected to greatly complicate the isolation of the genes particularly responsible for epothilone biosynthesis.
Therefore, in view of the foregoing, one object of the present invention is to isolate the genes that are involved in the synthesis of epothilones, particularly the genes that are involved in the synthesis of epothilones A and B in myxobacteria of the Sorangium/Polyangium group, i.e., Sorangium cellulosum strain So ce90. A further object of the invention is to provide a method for the recombinant production of epothilones for application in anticancer formulations.
SUMMARY OF THE INVENTION
In furtherance of the aforementioned and other objects, the present invention unexpectedly overcomes the difficulties set forth above to provide for the first time a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone. In a preferred embodiment, the nucleotide sequence is isolated from a species belonging to Myxobacteria, most preferably Sorangium cellulosum.
In another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
In a more preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
In yet another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
In an especially preferred embodiment, the present invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
In yet another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
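The identity criterion in the preceding paragraph (a consecutive portion of 20 to 50 base pairs identical to a portion of a listed sequence) can be illustrated with a minimal Python sketch. The function name `shared_window` and its parameters are hypothetical and chosen only for illustration. The sketch compares the two strings exactly as given and does not consider the complementary strand.

```python
from typing import Optional


def shared_window(candidate: str, reference: str, k: int = 20) -> Optional[str]:
    """Return the first k-base window of `candidate` found in `reference`.

    Hypothetical helper for illustration only. Both sequences are read
    5' to 3' on the given strand; the reverse complement is not checked.
    Returns None when no identical window of length k exists.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if k <= 0 or len(candidate) < k or len(reference) < k:
        return None
    # Collect every k-base window of the reference for fast lookup.
    reference_windows = {
        reference[i:i + k] for i in range(len(reference) - k + 1)
    }
    # Slide a k-base window along the candidate and test membership.
    for i in range(len(candidate) - k + 1):
        window = candidate[i:i + k]
        if window in reference_windows:
            return window
    return None
```

In practice the reference would be one of the listed nucleotide ranges of SEQ ID NO:1, and `k` would be set to 20, 25, 30, 35, 40, 45, or 50.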
The present invention also provides a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention. Further, the present invention provides a recombinant vector comprising such a chimeric gene, wherein the vector is capable of being stably transformed into a host cell. Still further, the present invention provides a recombinant host cell comprising such a chimeric gene, wherein the host cell is capable of expressing the nucleotide sequence that encodes at least one polypeptide necessary for the biosynthesis of an epothilone. In a preferred embodiment, the recombinant host cell is a bacterium belonging to the order Actinomycetales, and in a more preferred embodiment the recombinant host cell is a strain of Streptomyces. In other embodiments, the recombinant host cell is any other bacterium amenable to fermentation, such as a pseudomonad or E. coli. Even further, the present invention provides a Bac clone comprising a nucleic acid molecule of the invention, preferably Bac clone pEP015. In another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes an epothilone synthase domain.
According to one embodiment, the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. According to this embodiment, said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is an acyltrans- ferase (AT) domain comprising an ammo acid sequence substantially similar to an ammo acid sequence selected from the group consisting of: am o acids 543-864 of SEQ ID NO:2, ammo acids 539-859 of SEQ ID NO:4, ammo acids 563-884 of SEQ ID NO:5, am o acids 2056-2377 of SEQ ID NO:5, ammo acids 3555-3876 of SEQ ID NO:5, ammo acids 5631- 5951 of SEQ ID NO:5, ammo acids 561-881 of SEQ ID NO:6, ammo acids 2053-2373 of SEQ ID NO:6, and ammo acids 556-877 of SEQ ID NO:7. According to this embodiment, said AT domain preferably comprises an ammo acid sequence selected from the group consisting of: ammo acids 543-864 of SEQ ID NO:2, ammo acids 539-859 of SEQ ID NO:4, ammo acids 563-884 of SEQ ID NO:5, am o acids 2056-2377 of SEQ ID NO:5, ammo acids 3555-3876 of SEQ ID NO:5, ammo acids 5631-5951 of SEQ ID NO:5, ammo acids 561 -881 of SEQ ID NO:6, ammo acids 2053-2373 of SEQ ID NO:6, and ammo acids 556- 877 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1 , nucleotides 17865-18827 of SEQ ID NO:1 , nucleotides 23431-24397 of SEQ ID NO:1 , nucleotides 2791 1 -28876 of SEQ ID NO:1 , nucleotides 32408-33373 of SEQ ID NO:1 , nucleotides 38636-39598 of SEQ ID NO:1 , nucleotides 45204-46166 of SEQ ID NO:1 , nucleotides 49680-50642 of SEQ ID NO:1 , and nucleotides 56600-57565 of SEQ ID NO:1. 
According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
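The amino acid ranges and nucleotide ranges recited for a domain are two views of the same region, related through the start coordinate of the encoding gene. The following Python sketch shows that arithmetic under stated assumptions: coordinates are 1-based and inclusive, the gene lies on the forward strand of SEQ ID NO:1, and the function name `aa_range_to_nt_range` is hypothetical. The epoA start coordinate (7610) is the one given in the description of SEQ ID NO:2.

```python
def aa_range_to_nt_range(orf_start: int, aa_first: int, aa_last: int) -> tuple[int, int]:
    """Map an inclusive amino acid range to the inclusive nucleotide range encoding it.

    Assumes 1-based coordinates and a forward-strand ORF whose first codon
    begins at `orf_start` (hypothetical helper, not part of the specification).
    """
    if aa_first < 1 or aa_last < aa_first:
        raise ValueError("invalid amino acid range")
    nt_first = orf_start + 3 * (aa_first - 1)  # first base of the first codon
    nt_last = orf_start + 3 * aa_last - 1      # last base of the last codon
    return nt_first, nt_last


EPOA_START = 7610  # first nucleotide of epoA in SEQ ID NO:1

# AT domain of SEQ ID NO:2: amino acids 543-864
print(aa_range_to_nt_range(EPOA_START, 543, 864))  # (9236, 10201)
```

The result agrees with the recited range, nucleotides 9236-10201 of SEQ ID NO:1. Not every recited range follows this arithmetic exactly (some appear to include a stop codon or differ by a codon at one end), so the sketch is a consistency aid rather than a replacement for the recited coordinates.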
According to still another embodiment, the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. According to this embodiment, said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. According to this embodiment, said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. According to this embodiment, said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
According to yet another embodiment, the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. According to this embodiment, said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
According to an additional embodiment, the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6. According to this embodiment, said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 51534-52657 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is nucleotides 51534-52657 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7. According to this embodiment, said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 61427-62254 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is nucleotides 61427-62254 of SEQ ID NO:1.
In still another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-ribosomal peptide synthetase, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3. According to this embodiment, said non-ribosomal peptide synthetase preferably comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
The present invention further provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:2-23.
In accordance with another aspect, the present invention also provides methods for the recombinant production of polyketides such as epothilones in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer. A specific advantage of these production methods is the chirality of the molecules produced; production in transgenic organisms avoids the generation of populations of racemic mixtures, within which some enantiomers may have reduced activity. In particular, the present invention provides a method for heterologous expression of epothilone in a recombinant host, comprising: (a) introducing into a host a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention that comprises a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone; and (b) growing the host in conditions that allow biosynthesis of epothilone in the host. The present invention also provides a method for producing epothilone, comprising: (a) expressing epothilone in a recombinant host by the aforementioned method; and (b) extracting epothilone from the recombinant host.
According to still another aspect, the present invention provides an isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
According to one embodiment, the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. According to this embodiment, said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7. According to this embodiment, said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
According to still another embodiment, the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. According to this embodiment, said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. According to this embodiment, said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. According to this embodiment, said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
According to yet another embodiment, the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. According to this embodiment, said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
According to an additional embodiment, the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6. According to this embodiment, said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.
According to another embodiment, the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7. According to this embodiment, said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
Other aspects and advantages of the present invention will become apparent to those skilled in the art from a study of the following description of the invention and non-limiting examples.
DEFINITIONS
In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.
Associated With / Operatively Linked: Refers to two DNA sequences that are related physically or functionally. For example, a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operatively linked, or situated such that the regulator DNA sequence will affect the expression level of the coding or structural DNA sequence.
Chimeric Gene: A recombinant DNA sequence in which a promoter or regulatory DNA sequence is operatively linked to, or associated with, a DNA sequence that codes for an mRNA or which is expressed as a protein, such that the regulator DNA sequence is able to regulate transcription or expression of the associated DNA sequence. The regulator DNA sequence of the chimeric gene is not normally operatively linked to the associated DNA sequence as found in nature.
Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a protein.
Domain: That part of a polyketide synthase necessary for a given distinct activity. Examples include acyl carrier protein (ACP), β-ketosynthase (KS), acyltransferase (AT), β-ketoreductase (KR), dehydratase (DH), enoylreductase (ER), and thioesterase (TE) domains.
Epothilones: 16-membered macrocyclic polyketides naturally produced by the bacterium Sorangium cellulosum strain So ce90, which mimic the biological effects of taxol. In this application, "epothilone" refers to the class of polyketides that includes epothilone A and epothilone B, as well as analogs thereof such as those described in WO 98/25929.
Epothilone Synthase: A polyketide synthase responsible for the biosynthesis of epothilone.
Gene: A defined region that is located within a genome and that, besides the aforementioned coding DNA sequence, comprises other, primarily regulatory, DNA sequences responsible for the control of the expression, that is to say the transcription and translation, of the coding portion.
Heterologous DNA Sequence: A DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence.
Homologous DNA Sequence: A DNA sequence naturally associated with a host cell into which it is introduced.
Homologous Recombination: Reciprocal exchange of DNA fragments between homologous DNA molecules.
Isolated: In the context of the present invention, an isolated nucleic acid molecule or an isolated enzyme is a nucleic acid molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, a recombinant host cell.
Module: A genetic element encoding all of the distinct activities required in a single round of polyketide biosynthesis, i.e., one condensation step and all the β-carbonyl processing steps associated therewith. Each module encodes an ACP, a KS, and an AT activity to accomplish the condensation portion of the biosynthesis, and selected post-condensation activities to effect the β-carbonyl processing.
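The module definition can be read as a simple composition rule: the condensation domains ACP, KS, and AT are always present, and a selection of processing domains (KR, DH, ER) may accompany them. The Python sketch below encodes that rule. All class and function names are hypothetical, and treating the five domains recited for SEQ ID NO:4 as a single module is an assumption made for illustration only.

```python
from dataclasses import dataclass

CONDENSATION_DOMAINS = frozenset({"ACP", "KS", "AT"})  # required in every module
PROCESSING_DOMAINS = frozenset({"KR", "DH", "ER"})     # optional beta-carbonyl processing


@dataclass(frozen=True)
class Domain:
    name: str      # domain abbreviation, e.g. "KS"
    aa_first: int  # first amino acid of the domain (1-based, inclusive)
    aa_last: int   # last amino acid of the domain (inclusive)


def missing_condensation_domains(domains: list[Domain]) -> list[str]:
    """List the required condensation domains that the module lacks."""
    present = {d.name for d in domains}
    return sorted(CONDENSATION_DOMAINS - present)


def processing_domains(domains: list[Domain]) -> list[str]:
    """List the post-condensation processing domains the module carries."""
    return sorted({d.name for d in domains} & PROCESSING_DOMAINS)


# Domain coordinates as recited for SEQ ID NO:4 (assumed here to form one module).
seq_id_no_4 = [
    Domain("KS", 7, 432),
    Domain("AT", 539, 859),
    Domain("DH", 869, 1037),
    Domain("KR", 1439, 1684),
    Domain("ACP", 1722, 1792),
]

print(missing_condensation_domains(seq_id_no_4))
print(processing_domains(seq_id_no_4))
```

An empty first list indicates that the condensation requirement of the definition is met; the second list shows which processing steps the module would be expected to perform.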
NRPS: A non-ribosomal polypeptide synthetase, which is a complex of enzymatic activities responsible for the incorporation of amino acids into secondary metabolites including, for example, amino acid adenylation, epimerization, N-methylation, cyclization, peptidyl carrier protein, and condensation domains. A functional NRPS is one that catalyzes the incorporation of an amino acid into a secondary metabolite.
NRPS gene: One or more genes encoding NRPSs for producing functional secondary metabolites, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
Nucleic Acid Molecule: A linear segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.
ORF: Open Reading Frame.
PKS: A polyketide synthase, which is a complex of enzymatic activities (domains) responsible for the biosynthesis of polyketides including, for example, ketoreductase, dehydratase, acyl carrier protein, enoylreductase, ketoacyl ACP synthase, and acyltransferase. A functional PKS is one that catalyzes the synthesis of a polyketide.
PKS Genes: One or more genes encoding various polypeptides required for producing functional polyketides, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
Substantially Similar: With respect to nucleic acids, a nucleic acid molecule that has at least 60 percent sequence identity with a reference nucleic acid molecule. In a preferred embodiment, a substantially similar DNA sequence is at least 80% identical to a reference DNA sequence; in a more preferred embodiment, a substantially similar DNA sequence is at least 90% identical to a reference DNA sequence; and in a most preferred embodiment, a substantially similar DNA sequence is at least 95% identical to a reference DNA sequence. A substantially similar DNA sequence preferably encodes a protein or peptide having substantially the same activity as the protein or peptide encoded by the reference DNA sequence. A substantially similar nucleotide sequence typically hybridizes to a reference nucleic acid molecule, or fragments thereof, under the following conditions: hybridization at 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 pH 7.0, 1 mM EDTA at 50°C; wash with 2X SSC, 1% SDS, at 50°C. With respect to proteins or peptides, a substantially similar amino acid sequence is an amino acid sequence that is at least 90% identical to the amino acid sequence of a reference protein or peptide and has substantially the same activity as the reference protein or peptide.
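The identity thresholds in this definition can be illustrated with a short Python sketch. It rests on several assumptions that the definition itself does not fix: the two sequences are supplied already aligned and of equal length, gaps are written as "-", and percent identity is computed as identical non-gap columns divided by the alignment length (other conventions exist). The function names are hypothetical, and the activity and hybridization criteria cannot be checked by code of this kind.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Assumes pre-aligned, equal-length inputs with '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def nucleic_acid_tier(identity: float) -> str:
    """Map a percent identity onto the tiers recited for nucleic acids."""
    if identity >= 95:
        return "most preferred"
    if identity >= 90:
        return "more preferred"
    if identity >= 80:
        return "preferred"
    if identity >= 60:
        return "substantially similar"
    return "not substantially similar"


def protein_meets_identity_threshold(identity: float) -> bool:
    """Identity part of the protein criterion (at least 90%); activity is not assessed."""
    return identity >= 90


# Toy alignment: 9 of 10 columns identical.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, nucleic_acid_tier(identity))
```

In practice the alignment would come from a dedicated alignment tool; the sketch only shows how the recited percentages partition the outcomes.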
Transformation: A process for introducing heterologous nucleic acid into a host cell or organism.
Transformed / Transgenic / Recombinant: Refers to a host organism such as a bacterium into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof. A "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, i.e., a bacterium, which does not contain the heterologous nucleic acid molecule.
Nucleotides are indicated by their bases by the following standard abbreviations: adenine (A), cytosine (C), thymine (T), and guanine (G). Amino acids are likewise indicated by the following standard abbreviations: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V). Furthermore, (Xaa; X) represents any amino acid.
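The three-letter and one-letter abbreviations listed above map onto each other one to one, which can be captured in a small lookup table. The Python sketch below is illustrative only; the names `THREE_TO_ONE` and `to_one_letter` are hypothetical, and the hyphen-separated input format is an assumption.

```python
# Three-letter to one-letter amino acid codes, as listed in the text (Xaa = any amino acid).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_sequence: str, separator: str = "-") -> str:
    """Convert e.g. 'Met-Ala-Xaa-Gly' into its one-letter form."""
    residues = three_letter_sequence.split(separator)
    try:
        # capitalize() normalizes 'ALA' or 'ala' to 'Ala' before the lookup.
        return "".join(THREE_TO_ONE[r.strip().capitalize()] for r in residues)
    except KeyError as exc:
        raise ValueError(f"unknown residue code: {exc.args[0]!r}") from None


print(to_one_letter("Met-Ala-Xaa-Gly"))  # MAXG
```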
DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING
SEQ ID NO:1 is the nucleotide sequence of a 68750 bp contig containing 22 open reading frames (ORFs), which comprises the epothilone biosynthesis genes.
SEQ ID NO:2 is the protein sequence of a type I polyketide synthase (EPOS A) encoded by epoA (nucleotides 7610-11875 of SEQ ID NO:1).
SEQ ID NO:3 is the protein sequence of a non-ribosomal peptide synthetase (EPOS P) encoded by epoP (nucleotides 11872-16104 of SEQ ID NO:1).
SEQ ID NO:4 is the protein sequence of a type I polyketide synthase (EPOS B) encoded by epoB (nucleotides 16251-21749 of SEQ ID NO:1).
SEQ ID NO:5 is the protein sequence of a type I polyketide synthase (EPOS C) encoded by epoC (nucleotides 21746-43519 of SEQ ID NO:1).
SEQ ID NO:6 is the protein sequence of a type I polyketide synthase (EPOS D) encoded by epoD (nucleotides 43524-54920 of SEQ ID NO:1).
SEQ ID NO:7 is the protein sequence of a type I polyketide synthase (EPOS E) encoded by epoE (nucleotides 54935-62254 of SEQ ID NO:1).
SEQ ID NO:8 is the protein sequence of a cytochrome P450 oxygenase homologue (EPOS F) encoded by epoF (nucleotides 62369-63628 of SEQ ID NO:1 ).
SEQ ID NO:9 is a partial protein sequence (partial Orf 1 ) encoded by ort\ (nucleotides 1 -1826 of SEQ ID NO:1 ). SEQ ID NO:10 is a protein sequence (Orf 2) encoded by orfλ (nucleotides 3171 -1900 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:11 is a protein sequence (Orf 3) encoded by orf3 (nucleotides 3415-5556 of SEQ ID NO:1 ).
SEQ ID NO:12 is a protein sequence (Orf 4) encoded by orfA (nucleotides 5992-5612 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:13 is a protein sequence (Orf 5) encoded by orf5 (nucleotides 6226-6675 of SEQ ID NO:1 ).
SEQ ID NO:14 is a protein sequence (Orf 6) encoded by orfδ (nucleotides 63779- 64333 of SEQ ID NO:1 ).
SEQ ID NO:15 is a protein sequence (Orf 7) encoded by orf7 (nucleotides 64290- 63853 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:16 is a protein sequence (Orf 8) encoded by orfQ (nucleotides 64363- 64920 of SEQ ID NO:1 ).
SEQ ID NO:17 is a protein sequence (Orf 9) encoded by or/9 (nucleotides 64727- 64287 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:18 is a protein sequence (Orf 10) encoded by orfl O (nucleotides 65063- 65767 of SEQ ID NO:1 ).
SEQ ID NO:19 is a protein sequence (Orf 11 ) encoded by σr l 1 (nucleotides 65874- 65008 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:20 is a protein sequence (Orf 12) encoded by oιf\2 (nucleotides 66338- 65871 on the reverse complement strand of SEQ ID NO:1 ).
SEQ ID NO:21 is a protein sequence (Orf 13) encoded by or/13 (nucleotides 66667- 67137 of SEQ ID NO:1 ).
SEQ ID NO:22 is a protein sequence (Orf 14) encoded by or/14 (nucleotides 67334- 68251 of SEQ ID NO:1 ).
SEQ ID NO:23 is a partial protein sequence (partial Orf 15) encoded by orflδ (nucleotides 68346-68750 of SEQ ID NO:1).
SEQ ID NO:24 is the universal reverse PCR primer sequence.
SEQ ID NO:25 is the universal forward PCR primer sequence.
SEQ ID NO:26 is the NH24 end "B" PCR primer sequence.
SEQ ID NO:27 is the NH2 end "A" PCR primer sequence.
SEQ ID NO:28 is the NH2 end "B" PCR primer sequence.
SEQ ID NO:29 is the pEPO15-NH6 end "B" PCR primer sequence.
SEQ ID NO:30 is the pEPO15-H2.7 end "A" PCR primer sequence.
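The coordinate ranges in the listing above lend themselves to simple consistency checks. The following sketch, with an illustrative `Orf` type that is not part of the specification, derives the strand from the order of the two coordinates (a descending range is read as the reverse complement strand, as in the listing), computes each length, and measures the spacing between neighboring genes. The listing does not say whether a range includes the stop codon, so the codon count is left uninterpreted.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    first: int   # first coordinate as listed (1-based, inclusive)
    second: int  # second coordinate as listed (1-based, inclusive)

    @property
    def on_reverse_strand(self) -> bool:
        # Ranges listed high-to-low are on the reverse complement strand.
        return self.first > self.second

    @property
    def length_bp(self) -> int:
        return abs(self.second - self.first) + 1


# A subset of the listing: the seven epo genes plus one reverse-strand ORF.
EPO_GENES = [
    Orf("epoA", 7610, 11875),
    Orf("epoP", 11872, 16104),
    Orf("epoB", 16251, 21749),
    Orf("epoC", 21746, 43519),
    Orf("epoD", 43524, 54920),
    Orf("epoE", 54935, 62254),
    Orf("epoF", 62369, 63628),
]
ORF2 = Orf("orf2", 3171, 1900)


def codon_count(orf: Orf) -> int:
    """Number of whole codons; raises if the range is not a multiple of 3."""
    codons, remainder = divmod(orf.length_bp, 3)
    if remainder:
        raise ValueError(f"{orf.name}: {orf.length_bp} bp is not a multiple of 3")
    return codons


def gap_bp(upstream: Orf, downstream: Orf) -> int:
    """Bases between two forward-strand ORFs; a negative value is an overlap."""
    return downstream.first - upstream.second - 1


if __name__ == "__main__":
    for orf in EPO_GENES:
        print(orf.name, orf.length_bp, "bp,", codon_count(orf), "codons")
    for up, down in zip(EPO_GENES, EPO_GENES[1:]):
        print(f"{up.name} -> {down.name}: gap {gap_bp(up, down)} bp")
```

On these coordinates, epoA/epoP and epoB/epoC each come out as 4-bp overlaps (a gap of -4), while epoC and epoD are separated by 4 bp.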
DEPOSIT INFORMATION
The following material has been deposited with the Agricultural Research Service, Patent Culture Collection (NRRL), 1815 North University Street, Peoria, Illinois 61604, under the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. All restrictions on the availability of the deposited material will be irrevocably removed upon the granting of a patent.
Deposited Material    Accession Number    Deposit Date
pEPO15                NRRL B-30033        June 11, 1998
pEPO32                NRRL B-30119        April 16, 1999
DETAILED DESCRIPTION OF THE INVENTION
The genes involved in the biosynthesis of epothilones can be isolated using the techniques according to the present invention. The preferable procedure for the isolation of epothilone biosynthesis genes requires the isolation of genomic DNA from an organism identified as producing epothilones A and B, and the transfer of the isolated DNA on a suitable plasmid or vector to a host organism that does not normally produce the polyketide, followed by the identification of transformed host colonies to which the epothilone-producing ability has been conferred. Using a technique such as λ::Tn5 transposon mutagenesis (de Bruijn & Lupski, Gene 27: 131-149 (1984)), the exact region of the transforming epothilone-conferring DNA can be more precisely defined. Alternatively or additionally, the transforming epothilone-conferring DNA can be cleaved into smaller fragments and the smallest that maintains the epothilone-conferring ability further characterized. Whereas the host organism lacking the ability to produce epothilone may be a different species from the organism from which the polyketide derives, a variation of this technique involves the transformation of host DNA into the same host that has had its epothilone-producing ability disrupted by mutagenesis. In this method, an epothilone-producing organism is mutated and non-epothilone-producing mutants are isolated. These are then complemented by genomic DNA isolated from the epothilone-producing parent strain. A further example of a technique that can be used to isolate genes required for epothilone biosynthesis is the use of transposon mutagenesis to generate mutants of an epothilone-producing organism that, after mutagenesis, fail to produce the polyketide. Thus, the region of the host genome responsible for epothilone production is tagged by the transposon and can be recovered and used as a probe to isolate the native genes from the parent strain.
PKS genes that are required for the synthesis of polyketides and that are similar to known PKS genes may be isolated by virtue of their sequence homology to the biosynthetic genes for which the sequence is known, such as those for the biosynthesis of rifamycin or soraphen. Techniques suitable for isolation by homology include standard library screening by DNA hybridization.
Preferred for use as a probe molecule is a DNA fragment that is obtainable from a gene or another DNA sequence that plays a part in the synthesis of a known polyketide. A preferred probe molecule comprises a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen PKS (U.S. Patent No. 5,716,849), and a more preferred probe molecule comprises the β-ketoacyl synthase domains from the first and second modules of the rifamycin PKS (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). These can be used to probe a gene library of an epothilone-producing microorganism to isolate the PKS genes responsible for epothilone biosynthesis.
Despite the well-known difficulties with PKS gene isolation in general and despite the difficulties expected to be encountered with the isolation of epothilone biosynthesis genes in particular, by using the methods described in the instant specification, biosynthetic genes for epothilones A and B can surprisingly be cloned from a microorganism that produces that polyketide. Using the methods of gene manipulation and recombinant production described in this specification, the cloned PKS genes can be modified and expressed in transgenic host organisms.
The isolated epothilone biosynthetic genes can be expressed in heterologous hosts to enable the production of the polyketide with greater efficiency than might be possible from native hosts. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, heterologous genes can be expressed in Streptomyces and other actinomycetes using techniques such as those described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994), both of which are incorporated herein by reference. See also, Rowe et al., Gene 216: 215-223 (1998); Holmes et al., EMBO Journal 12(8): 3183-3191 (1993) and Bibb et al., Gene 38: 215-226 (1985), all of which are incorporated herein by reference.
Alternatively, genes responsible for polyketide biosynthesis, i.e., epothilone biosynthetic genes, can also be expressed in other host organisms such as pseudomonads and E. coli. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, PKS genes have been successfully expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See, Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985), incorporated herein by reference. In addition, the expression vectors pKK223-3 and pKK223-2 can be used to express heterologous genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter. For the expression of operons encoding multiple ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes to be used. Techniques for overexpression in gram-positive species such as Bacillus are also known in the art and can be used in the context of this invention (Quax et al., in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al., American Society for Microbiology, Washington (1993)).
Other expression systems that may be used with the epothilone biosynthetic genes of the invention include yeast and baculovirus expression systems. See, for example, "The Expression of Recombinant Proteins in Yeasts," Sudbery, P. E., Curr. Opin. Biotechnol. 7(5): 517-524 (1996); "Methods for Expressing Recombinant Proteins in Yeast," Mackay, et al., Editor(s): Carey, Paul R., Protein Eng. Des. 105-153, Publisher: Academic, San Diego, Calif. (1996); "Expression of heterologous gene products in yeast," Pichuantes, et al., Editor(s): Cleland, J. L., Craik, C. S., Protein Eng. 129-161, Publisher: Wiley-Liss, New York, N. Y. (1996); WO 98/27203; Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998); "Insect Cell Culture: Recent Advances, Bioengineering Challenges And Implications In Protein Production," Palomares, et al., Editor(s): Galindo, Enrique; Ramirez, Octavio T., Adv. Bioprocess Eng. Vol. II, Invited Pap. Int. Symp., 2nd (1998) 25-52, Publisher: Kluwer, Dordrecht, Neth.; "Baculovirus Expression Vectors," Jarvis, Donald L., Editor(s): Miller, Lois K., Baculoviruses 389-431, Publisher: Plenum, New York, N. Y. (1997); "Production Of Heterologous Proteins Using The Baculovirus/Insect Expression System," Griffiths, et al., Methods Mol. Biol. (Totowa, N. J.) 75 (Basic Cell Culture Protocols (2nd Edition)) 427-440 (1997); and "Insect Cell Expression Technology," Luckow, Verne A., Protein Eng. 183-218, Publisher: Wiley-Liss, New York, N. Y. (1996); all of which are incorporated herein by reference.
Another consideration for expression of PKS genes in heterologous hosts is the requirement of enzymes for posttranslational modification of PKS enzymes by phosphopantetheinylation before they can synthesize polyketides. However, the enzymes responsible for this modification of type I PKS enzymes, phosphopantetheinyl (P-pant) transferases, are not normally present in many hosts such as E. coli. This problem can be solved by coexpression of a P-pant transferase with the PKS genes in the heterologous host, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998), incorporated herein by reference.
Therefore, for the purposes of polyketide production, the significant criteria in the choice of host organism are its ease of manipulation, rapidity of growth (i.e. fermentation), possession of the proper molecular machinery for processes such as posttranslational modification, and its lack of susceptibility to the polyketide being overproduced. Most preferred host organisms are actinomycetes such as strains of Streptomyces. Other preferred host organisms are pseudomonads and E. coli. The above-described methods of polyketide production have significant advantages over the technology currently used in the preparation of the compounds. These advantages include a lower cost of production, the ability to produce greater quantities of the compounds, and the ability to produce compounds of a preferred biological enantiomer, as opposed to racemic mixtures inevitably generated by organic synthesis. Compounds produced by heterologous hosts can be used in medical (e.g. cancer treatment in the case of epothilones) as well as agricultural applications.
EXPERIMENTAL
The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley and Sons, Inc. (1994); by T. Maniatis, E. F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989); and by T.J. Silhavy, M.L. Berman, and L.W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984).
Example 1: Cultivation of an Epothilone-Producing Strain of Sorangium cellulosum
Sorangium cellulosum strain 90 (DSM 6773, Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig) is streaked out and grown (30°C) on an agar plate of SolE medium (0.35% glucose, 0.05% tryptone, 0.15% MgSO4 x 7H2O, 0.05% ammonium sulfate, 0.1% CaCl2, 0.006% K2HPO4, 0.01% sodium dithionite, 0.0008% Fe-EDTA, 1.2% HEPES, 3.5% [vol/vol] supernatant of sterilized stationary S. cellulosum culture), pH adjusted to 7.4. Cells from about 1 square cm are picked and inoculated into 5 ml of G51t liquid medium (0.2% glucose, 0.5% starch, 0.2% tryptone, 0.1% probion S, 0.05% CaCl2 x 2H2O, 0.05% MgSO4 x 7H2O, 1.2% HEPES, pH adjusted to 7.4) and incubated at 30°C with shaking at 225 rpm. After 4 days, the culture is transferred into 50 ml of G51t and incubated as above for 5 days. This culture is used to inoculate 500 ml of G51t, which is incubated as above for 6 days. The culture is centrifuged for 10 minutes at 4000 rpm and the cell pellet is resuspended in 50 ml of G51t.
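As a convenience for preparing the three culture stages, the G51t recipe above can be turned into weigh-out amounts. The sketch assumes the percentages are weight/volume (grams per 100 ml), which the text does not state explicitly; the names are illustrative.

```python
# G51t liquid medium, percent values as given in Example 1.
# Assumption: percentages are weight/volume, i.e. grams per 100 ml.
G51T_PERCENT_WV = {
    "glucose": 0.2,
    "starch": 0.5,
    "tryptone": 0.2,
    "probion S": 0.1,
    "CaCl2 x 2H2O": 0.05,
    "MgSO4 x 7H2O": 0.05,
    "HEPES": 1.2,
}


def grams_for_volume(percent_wv: dict, volume_ml: float) -> dict:
    """Grams of each component needed for the given volume of medium."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


if __name__ == "__main__":
    # The three culture stages of Example 1: 5 ml, 50 ml and 500 ml.
    for volume in (5, 50, 500):
        amounts = grams_for_volume(G51T_PERCENT_WV, volume)
        print(volume, "ml:", {k: round(v, 4) for k, v in amounts.items()})
```

The pH adjustment to 7.4 is not modeled.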
Example 2: Generation of a Bacterial Artificial Chromosome (Bac) Library
To generate a Bac library, S. cellulosum cells cultivated as described in Example 1 above are embedded into agarose blocks, lysed, and the liberated genomic DNA is partially digested by the restriction enzyme HindIII. The digested DNA is separated on an agarose gel by pulsed-field electrophoresis. Large (approximately 90-150 kb) DNA fragments are isolated from the agarose gel and ligated into the vector pBelobacII. pBelobacII contains a gene encoding chloramphenicol resistance, a multiple cloning site in the lacZ gene providing for blue/white selection on appropriate medium, as well as the genes required for the replication and maintenance of the plasmid at one or two copies per cell. The ligation mixture is used to transform Escherichia coli DH10B electrocompetent cells using standard electroporation techniques. Chloramphenicol-resistant recombinant (white, lac mutant) colonies are transferred to a positively charged nylon membrane filter in 384 3X3 grid format. The clones are lysed and the DNA is cross-linked to the filters. The same clones are also preserved as liquid cultures at -80°C.
Example 3: Screening the Bac Library of Sorangium cellulosum 90 for the Presence of Type I Polyketide Synthase-Related Sequences
The Bac library filters are probed by standard Southern hybridization procedures. The DNA probes used encode β-ketoacyl synthase domains from the first and second modules of the rifamycin polyketide synthase (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). The probe DNAs are generated by PCR with primers flanking each ketosynthase domain using the plasmid pNE95 as the template (pNE95 equals cosmid 2 described in Schupp et al. (1998)). 25 ng of PCR-amplified DNA is isolated from a 0.5% agarose gel and labeled with 32P-dCTP using a random primer labeling kit (Gibco-BRL, Bethesda MD, USA) according to the manufacturer's instructions. Hybridization is at 65°C for 36 hours and membranes are washed at high stringency (3 times with 0.1 x SSC and 0.5% SDS for 20 min at 65°C). The labeled blot is exposed on a phosphorescent screen and the signals are detected on a PhosphorImager 445SI (screen and 445SI from Molecular Dynamics). This results in strong hybridization of certain Bac clones to the probes. These clones are selected and cultured overnight in 5 ml of Luria broth (LB) at 37°C. Bac DNA from the Bac clones of interest is isolated by a typical miniprep procedure. The cells are resuspended in 200 μl lysozyme solution (50 mM glucose, 10 mM EDTA, 25 mM Tris-HCl, 5 mg/ml lysozyme), lysed in 400 μl lysis solution (0.2 N NaOH and 2% SDS), the proteins are precipitated (3.0 M potassium acetate, adjusted to pH 5.2 with acetic acid), and the Bac DNA is precipitated with isopropanol. The DNA is resuspended in 20 μl of nuclease-free distilled water, restricted with BamHI (New England Biolabs, Inc.) and separated on a 0.7% agarose gel. The gel is blotted by Southern hybridization as described above and probed under conditions described above, with a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen polyketide synthase as the probe (see, U.S. Patent No. 5,716,849). Five different hybridization patterns are observed.
One clone representing each of the five patterns is selected and named pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33, respectively.
Example 4: Subcloning of BamHI Fragments from pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33
The DNA of the five selected Bac clones is digested with BamHI and random fragments are subcloned into pBluescript II SK+ (Stratagene) at the BamHI site. Subclones carrying inserts between 2 and 10 kb in size are selected for sequencing of the flanking ends of the inserts and also probed with the 1.2 kb SmaI probe as described above. Subclones that show a high degree of sequence homology to known polyketide synthases and/or strong hybridization to the soraphen ketosynthase domain are used for gene disruption experiments.
Example 5: Preparation of Streptomycin-Resistant Spontaneous Mutants of Sorangium cellulosum strain So ce90
0.1 ml of a three-day-old culture of Sorangium cellulosum strain So ce90, which is raised in liquid medium G52-H (0.2% yeast extract, 0.2% defatted soya meal, 0.8% potato starch, 0.2% glucose, 0.1% MgSO4 x 7H2O, 0.1% CaCl2 x 2H2O, 0.008% Fe-EDTA, pH adjusted to 7.4 with KOH), is plated out on agar plates with SolE medium supplemented with 100 μg/ml streptomycin. The plates are incubated at 30°C for 2 weeks. The colonies growing on this medium are streptomycin-resistant mutants, which are streaked out and cultivated once more on the same agar medium with streptomycin for purification. One of these streptomycin-resistant mutants is selected and is called BCE28/2.
Example 6: Gene Disruptions in Sorangium cellulosum BCE28/2 Using the Subcloned BamHI Fragments
The BamHI inserts of the subclones generated from the five selected Bac clones as described above are isolated and ligated into the unique BamHI site of plasmid pCIB132 (see, U.S. Patent No. 5,716,849). The pCIB132 derivatives carrying the inserts are transformed into Escherichia coli ED8767 containing the helper plasmid pUZ8 (Hedges and Matthew, Plasmid 2: 269-278 (1979)). The transformants are used as donors in conjugation experiments with Sorangium cellulosum BCE28/2 as recipient. For the conjugation, 5-10 x 10^9 cells of Sorangium cellulosum BCE28/2 from an early stationary phase culture (reaching about 5 x 10^8 cells/ml) grown at 30°C in liquid medium G51b (G51b equals medium G51t with tryptone replaced by peptone) are mixed in a 1:1 cellular ratio with a late-log phase culture (in LB liquid medium) of E. coli ED8767 containing pCIB132 derivatives carrying the subcloned BamHI fragments and the helper plasmid pUZ8. The mixed cells are then centrifuged at 4000 rpm for 10 minutes and resuspended in 0.5 ml G51b medium. This cell suspension is then plated as a drop in the center of a plate with SolE agar containing 50 mg/l kanamycin. The cells obtained after incubation for 24 hours at 30°C are harvested and resuspended in 0.8 ml of G51b medium, and 0.1 to 0.3 ml of this suspension is plated out on a selective SolE solid medium containing phleomycin (30 mg/l), streptomycin (300 mg/l), and kanamycin (50 mg/l). The counterselection of the donor Escherichia coli strain takes place with the aid of streptomycin.
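The cell numbers in the conjugation step imply a culture volume, which is simple to work out. The sketch below reads the garbled quantities in the text as 5-10 x 10^9 recipient cells and a culture density of about 5 x 10^8 cells/ml; both readings, and the function names, are assumptions.

```python
def culture_volume_ml(cells_needed: float, density_cells_per_ml: float) -> float:
    """Volume of culture that contains the requested number of cells."""
    if density_cells_per_ml <= 0:
        raise ValueError("density must be positive")
    return cells_needed / density_cells_per_ml


def donor_cells_for_ratio(recipient_cells: float, donor_per_recipient: float = 1.0) -> float:
    """Donor cells required for a given donor:recipient cellular ratio (1:1 in the text)."""
    return recipient_cells * donor_per_recipient


if __name__ == "__main__":
    density = 5e8  # cells/ml, early stationary phase (assumed reading)
    for recipient_cells in (5e9, 10e9):
        print(recipient_cells, "cells ->", culture_volume_ml(recipient_cells, density), "ml")
```

Under these readings, the recipient cells correspond to roughly 10 to 20 ml of the early stationary phase culture.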
The colonies that grow on this selective medium after an incubation time of 8-12 days at a temperature of 30°C are isolated with a plastic loop and streaked out and cultivated on the same agar medium for a second round of selection and purification. The colony-derived cultures that grow on this selective agar medium after 7 days at a temperature of 30°C are transconjugants of Sorangium cellulosum BCE28/2 that have acquired phleomycin resistance by conjugative transfer of the pCIB132 derivatives carrying the subcloned BamHI fragments.
Integration of the pCIB132-derived plasmids into the chromosome of Sorangium cellulosum BCE28/2 by homologous recombination is verified by Southern hybridization. For this experiment, complete DNA from 5-10 transconjugants per transferred BamHI fragment is isolated (from 10 ml cultures grown in medium G52-H for three days) applying the method described by Pospiech and Neumann, Trends Genet. 11: 217 (1995). For the Southern blot, the DNA isolated as described above is cleaved with one of the restriction enzymes BglII, ClaI, or NotI, and the respective BamHI inserts or pCIB132 are used as 32P-labelled probes.
Example 7: Analysis of the Effect of the Integrated BamHI Fragments on Epothilone Production by Sorangium cellulosum After Gene Disruption
Transconjugant cells grown on about 1 square cm surface of the selective SolE plates of the second round of selection (see Example 6) are transferred by a sterile plastic loop into 10 ml of medium G52-H in a 50 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 3 days, the culture is transferred into 50 ml of medium G52-H in a 200 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 4-5 days, 10 ml of this culture is transferred into 50 ml of medium 23B3 (0.2% glucose, 2% potato starch, 1.6% defatted soya meal, 0.0008% Fe-EDTA sodium salt, 0.5% HEPES (4-(2-hydroxyethyl)-piperazine-1-ethane-sulfonic acid), 2% vol/vol polystyrene resin XAD16 (Rohm & Haas), pH adjusted to 7.8 with NaOH) in a 200 ml Erlenmeyer flask.
Quantitative determination of the epothilone produced takes place after incubation of the cultures at 30°C and 180 rpm for 7 days. The complete culture broth is filtered by suction through a 150 μm nylon filter. The resin remaining on the filter is then resuspended in 10 ml isopropanol and extracted by shaking the suspension at 180 rpm for 1 hour. 1 ml is removed from this suspension and centrifuged at 12,000 rpm in an Eppendorf Microfuge. The amount of epothilones A and B therein is determined by means of HPLC and detection at 250 nm with a UV-DAD detector (HPLC with Waters Symmetry C18 column and a gradient of 0.02% phosphoric acid 60%-0% and acetonitrile 40%-100%).
Transconjugants with three different integrated BamHI fragments subcloned from pEPO15, namely transconjugants with the BamHI fragment of plasmid pEPO15-21, transconjugants with the BamHI fragment of plasmid pEPO15-4-5, and transconjugants with the BamHI fragment of plasmid pEPO15-4-1, are tested in the manner described above. HPLC analysis reveals that all transconjugants no longer produce epothilone A or B. By contrast, epothilone A and B are detectable in a concentration of 2-4 mg/l in transconjugants with BamHI fragments integrated that are derived from pEPO20, pEPO30, pEPO31, pEPO33, and in the parental strain BCE28/2.
Example 8: Nucleotide Sequence Determination of the Cloned Fragments and Construction of Contigs
A. BamHI Insert of Plasmid pEPO15-21
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-21], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-21 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers. The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)). In subsequent rounds of sequencing reactions, custom-synthesized oligonucleotides, designed for the 3' ends of the previously determined sequences, are used to extend and join contigs. Both strands are entirely sequenced, and every nucleotide is sequenced at least two times. The nucleotide sequence is compiled using the program Sequencher vers. 3.0 (Gene Codes Corporation), and analyzed using the University of Wisconsin Genetics Computer Group programs. The nucleotide sequence of the 2213-bp insert corresponds to nucleotides 20779-22991 of SEQ ID NO:1.
B. BamHI Insert of Plasmid pEPO15-4-1
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-1], and the nucleotide sequence of the 3.9-kb BamHI insert in pEPO15-4-1 is determined as described in (A) above. The nucleotide sequence of the 3909-bp insert corresponds to nucleotides 16876-20784 of SEQ ID NO:1.
C. BamHI Insert of Plasmid pEPO15-4-5
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-5], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-4-5 is determined as described in (A) above. The nucleotide sequence of the 2233-bp insert corresponds to nucleotides 42528-44760 of SEQ ID NO:1.
Example 9: Subcloning and Ordering of DNA Fragments from pEPO15 Containing Epothilone Biosynthesis Genes
pEPO15 is digested to completion with the restriction enzyme HindIII and the resulting fragments are subcloned into pBluescript II SK- or pNEB193 (New England Biolabs) that has been cut with HindIII and dephosphorylated with calf intestinal alkaline phosphatase. Six different clones are generated and named pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24 (all based on pNEB193), and pEPO15-H2.7 and pEPO15-H3.0 (both based on pBluescript II SK-).
The BamHI insert of pEPO15-21 is isolated and DIG-labeled (Non-radioactive DNA labeling and detection system, Boehringer Mannheim), and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH24, indicating that pEPO15-21 is contained within pEPO15-NH24.
The BamHI insert of pEPO15-4-1 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH24 and pEPO15-H2.7. Nucleotide sequence data generated from one end each of pEPO15-NH24 and pEPO15-H2.7 are also in complete agreement with the previously determined sequence of the BamHI insert of pEPO15-4-1. These experiments demonstrate that pEPO15-4-1 (which contains one internal HindIII site) overlaps pEPO15-H2.7 and pEPO15-NH24, and that pEPO15-H2.7 and pEPO15-NH24, in this order, are contiguous.
The BamHI insert of pEPO15-4-5 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH2, indicating that pEPO15-4-5 is contained within pEPO15-NH2.
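The three BamHI inserts used as probes above have known positions on SEQ ID NO:1 (Example 8), so their reported sizes and the genes they interrupt can be cross-checked against the gene coordinates in the sequence listing. The sketch below is illustrative; the data structures and function names are not part of the specification.

```python
# Gene coordinates on SEQ ID NO:1, from the description of the sequences.
GENES = {
    "epoA": (7610, 11875),
    "epoP": (11872, 16104),
    "epoB": (16251, 21749),
    "epoC": (21746, 43519),
    "epoD": (43524, 54920),
    "epoE": (54935, 62254),
    "epoF": (62369, 63628),
}

# BamHI inserts from Example 8: (start, end, reported size in bp).
INSERTS = {
    "pEPO15-4-1": (16876, 20784, 3909),
    "pEPO15-21": (20779, 22991, 2213),
    "pEPO15-4-5": (42528, 44760, 2233),
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive coordinate range."""
    return end - start + 1


def overlapping_genes(start: int, end: int) -> list:
    """Names of genes whose coordinate range intersects [start, end]."""
    return [
        name
        for name, (g_start, g_end) in GENES.items()
        if start <= g_end and g_start <= end
    ]


if __name__ == "__main__":
    for name, (start, end, reported) in INSERTS.items():
        ok = span_length(start, end) == reported
        print(name, "size consistent:", ok, "genes:", overlapping_genes(start, end))
```

On these coordinates, pEPO15-4-1 lies within epoB, pEPO15-21 spans the epoB/epoC boundary, and pEPO15-4-5 spans the epoC/epoD boundary. The inserts of pEPO15-4-1 and pEPO15-21 share nucleotides 20779-20784, which would be consistent with both being reported inclusive of a common 6-bp BamHI site, although the text does not say so.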
Nucleotide sequence data is generated from both ends of pEPO15-NH2 and from the end of pEPO15-NH24 that does not overlap with pEPO15-4-1. PCR primers NH24 end "B": GTGACTGGCGCCTGGAATCTGCATGAGC (SEQ ID NO:26), NH2 end "A": AGCGGGAGCTTGCTAGACATTCTGTTTC (SEQ ID NO:27), and NH2 end "B": GACGCGCCTCGGGCAGCGCCCCAA (SEQ ID NO:28), pointing towards the HindIII sites, are designed based on these sequences and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair NH24 end "B" and NH2 end "A" with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced. The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH24 and pEPO15-NH2, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH24 and pEPO15-NH2 are, in this order, contiguous.
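When working with the junction primers above, two routine operations are computing the reverse complement (to see how a primer reads on the opposite strand) and the GC content. The sketch below uses the three primer sequences as given in the text; the helper names are illustrative, and no melting-temperature model is implied.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    'NH24 end "B" (SEQ ID NO:26)': "GTGACTGGCGCCTGGAATCTGCATGAGC",
    'NH2 end "A" (SEQ ID NO:27)': "AGCGGGAGCTTGCTAGACATTCTGTTTC",
    'NH2 end "B" (SEQ ID NO:28)': "GACGCGCCTCGGGCAGCGCCCCAA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
              f"reverse complement {reverse_complement(seq)}")
```

A primer pair that amplifies across a junction is expected to have one primer matching the top strand and the other matching the reverse complement of the downstream sequence, which is what "pointing towards the HindIII sites" describes.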
The HindIII insert of pEPO15-H2.7 is isolated and DIG-labeled as above, and used as a probe in a DNA hybridization experiment at high stringency against pEPO15 digested by NotI. A NotI fragment of about 9 kb in size shows strong hybridization, and is further subcloned into pBluescript II SK- that has been digested with NotI and dephosphorylated with calf intestinal alkaline phosphatase, to yield pEPO15-N9-16. The NotI insert of pEPO15-N9-16 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH6, and also for the expected clones pEPO15-H2.7 and pEPO15-NH24. Nucleotide sequence data is generated from both ends of pEPO15-NH6 and from the end of pEPO15-H2.7 that does not overlap with pEPO15-4-1. PCR primers are designed pointing towards the HindIII sites and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair pEPO15-NH6 end "B": CACCGAAGCGTCGATCTGGTCCATC (SEQ ID NO:29) and pEPO15-H2.7 end "A": CGGTCAGATCGACGACGGGCTTTCC (SEQ ID NO:30) with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced. The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH6 and pEPO15-H2.7, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH6 and pEPO15-H2.7 are, in this order, contiguous.
All of these experiments, taken together, establish a contig of HindIII fragments covering a region of about 55 kb and consisting of the HindIII inserts of pEPO15-NH6, pEPO15-H2.7, pEPO15-NH24, and pEPO15-NH2, in this order. The inserts of the remaining two HindIII subclones, namely pEPO15-NH1 and pEPO15-H3.0, are not found to be parts of this contig.
Example 10: Further Extension of the Subclone Contig Covering the Epothilone Biosynthesis Genes
An approximately 2.2 kb BamHI - HindIII fragment derived from the downstream end of the insert of pEPO15-NH2 and thus representing the downstream end of the subclone contig described in Example 9 is isolated, DIG-labeled, and used in Southern hybridization experiments against pEPO15 and pEPO15-NH2 DNAs digested with several enzymes. The strongly hybridizing bands are always found to be the same in size between the two target DNAs, indicating that the Sorangium cellulosum So ce90 genomic DNA fragment cloned into pEPO15 ends with the HindIII site at the downstream end of pEPO15-NH2.
A cosmid DNA library of Sorangium cellulosum So ce90 is generated, using established procedures, in pScosTriplex-II (Ji, et al., Genomics 31: 185-192 (1996)). Briefly, high-molecular weight genomic DNA of Sorangium cellulosum So ce90 is partially digested with the restriction enzyme Sau3AI to provide fragments with average sizes of about 40 kb, and ligated to BamHI and XbaI digested pScosTriplex-II. The ligation mix is packaged with Gigapack III XL (Stratagene) and used to transfect E. coli XL1 Blue MR cells.
The cosmid library is screened with the approximately 2.2 kb BamHI - HindIII fragment, derived from the downstream end of the insert of pEPO15-NH2, used as a probe in colony hybridization. A strongly hybridizing clone, named pEPO4E7, is selected. pEPO4E7 DNA is isolated, digested with several restriction endonucleases, and probed in Southern hybridization experiments with the 2.2 kb BamHI - HindIII fragment. A strongly hybridizing NotI fragment of approximately 9 kb in size is selected and subcloned into pBluescript II SK- to yield pEPO4E7-N9-8. Further Southern hybridization experiments reveal that the approximately 9 kb NotI insert of pEPO4E7-N9-8 overlaps pEPO15-NH2 over 6 kb in a NotI - HindIII fragment, while the remaining approximately 3 kb HindIII - NotI fragment would extend the subclone contig described in Example 9. End sequencing reveals, however, that the downstream end of the insert of pEPO4E7-N9-8 contains the BamHI - NotI polylinker of pScosTriplex-II, thereby indicating that the genomic DNA insert of pEPO4E7 ends at a Sau3AI site within the extending HindIII - NotI fragment and that the NotI site is derived from pScosTriplex-II.
An approximately 1.6 kb PstI - SalI fragment derived from the approximately 3 kb extending HindIII - NotI subfragment of pEP04E7-N9-8, containing only Sorangium cellulosum So ce90-derived sequences free of vector, is used as a probe against the bacterial artificial chromosome library described in Example 2. Besides the previously isolated EP015, a BAC clone, named EP032, is found to strongly hybridize to the probe. pEP032 is isolated, digested with several restriction endonucleases, and hybridized with the approximately 1.6 kb PstI - SalI probe. A HindIII - EcoRV fragment of about 13 kb in size is found to strongly hybridize to the probe, and is subcloned into pBluescript II SK- digested with HindIII and HincII to yield pEP032-HEV15.
Oligonucleotide primers are designed based on the downstream end sequence of pEP015-NH2 and on the upstream (HindIII) end sequence derived from pEP032-HEV15, and used in sequencing reactions with pEP04E7-N9-8 as the template. The sequences reveal the existence of a small HindIII fragment (EPO4E7-H0.02) of 24 bp, undetectable in standard restriction analysis, separating the HindIII site at the downstream end of pEP015-NH2 from the HindIII site at the upstream end of pEP032-HEV15.
Thus, the subclone contig described in Example 9 is extended to include the HindIII fragment EPO4E7-H0.02 and the insert of pEP032-HEV15, and constitutes the inserts of: pEP015-NH6, pEP015-H2.7, pEP015-NH24, pEP015-NH2, EPO4E7-H0.02 and pEP032-HEV15, in this order.
Example 11: Nucleotide Sequence Determination of the Subclone Contig Covering the Epothilone Biosynthesis Genes
The nucleotide sequence of the subclone contig described in Example 10 is determined as follows.
pEP015-H2.7. Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEP015-H2.7], and the nucleotide sequence of the 2.7-kb BamHI insert in pEP015-H2.7 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers. The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)). In subsequent rounds of sequencing reactions, custom-synthesized oligonucleotides, designed for the 3' ends of the previously determined sequences, are used to extend and join contigs.
pEP015-NH6, pEP015-NH24 and pEP015-NH2. The HindIII inserts of these plasmids are isolated, and subjected to random fragmentation using a Hydroshear apparatus (Genomic Instrumentation Services, Inc.) to yield an average fragment size of 1-2 kb. The fragments are end-repaired using T4 DNA Polymerase and Klenow DNA Polymerase enzymes in the presence of deoxynucleotide triphosphates, and phosphorylated with T4 DNA Kinase in the presence of ribo-ATP. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
pEP032-HEV15. pEP032-HEV15 is digested with HindIII and SspI, the approximately 13.3 kb fragment containing the ~13 kb HindIII - EcoRV insert from So. cellulosum So ce90 and a 0.3 kb HincII - SspI fragment from pBluescript II SK- is isolated, and partially digested with HaeIII to yield fragments with an average size of 1-2 kb.
Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
The chromatograms are analyzed and assembled into contigs with the Phred, Phrap and Consed programs (Ewing, et al., Genome Res. 8(3): 175-185 (1998); Ewing, et al., Genome Res. 8(3): 186-194 (1998); Gordon, et al., Genome Res. 8(3): 195-202 (1998)). Contig gaps are filled, sequence discrepancies are resolved, and low-quality regions are resequenced using custom-designed oligonucleotide primers for sequencing on either the original subclones or selected clones from the random subclone libraries. Both strands are completely sequenced, and every basepair is covered with at least a minimum aggregated Phred score of 40 (confidence level of 99.99%).
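The relationship between the quoted Phred score and the quoted confidence level can be checked with the standard definition Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short sketch below applies that definition; the function names are chosen only for this illustration.

```python
def phred_to_error(q: float) -> float:
    """Error probability implied by a Phred quality score Q = -10*log10(P)."""
    return 10 ** (-q / 10)


def phred_to_confidence(q: float) -> float:
    """Probability that the base call is correct."""
    return 1.0 - phred_to_error(q)


# A Phred score of 40 corresponds to one expected error in 10,000 bases.
print(f"{phred_to_confidence(40):.2%}")
```

Each additional 10 Phred units lowers the error probability tenfold, so a score of 30 would correspond to 99.9% and a score of 20 to 99%.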
The nucleotide sequence of the 68750 bp contig is shown as SEQ ID NO:1.
Example 12: Nucleotide Sequence Analysis of the Epothilone Biosynthesis Genes
SEQ ID NO:1 is found to contain 22 ORFs, as detailed below in Table 1:
Table 1
[Table 1 is reproduced as an image in the original publication (imgf000043_0001).]
* On the reverse complement strand. Numbering according to SEQ ID NO:1.
epoA (nucleotides 7610-11875 of SEQ ID NO:1) codes for EPOS A (SEQ ID NO:2), a type I polyketide synthase consisting of a single module, and harboring the following domains: β-ketoacyl-synthase (KS) (nucleotides 7643-8920 of SEQ ID NO:1, amino acids 11-437 of SEQ ID NO:2); acyltransferase (AT) (nucleotides 9236-10201 of SEQ ID NO:1, amino acids 543-864 of SEQ ID NO:2); enoyl reductase (ER) (nucleotides 10529-11428 of SEQ ID NO:1, amino acids 974-1273 of SEQ ID NO:2); and acyl carrier protein homologous domain (ACP) (nucleotides 11549-11764 of SEQ ID NO:1, amino acids 1314-1385 of SEQ ID NO:2). Sequence comparisons and motif analysis (Haydock, et al., FEBS Lett. 374: 246-248 (1995); Tang, et al., Gene 216: 255-265 (1998)) reveal that the AT encoded by EPOS A is specific for malonyl-CoA. EPOS A should be involved in the initiation of epothilone biosynthesis by loading the acetate unit to the multienzyme complex that will eventually form part of the 2-methylthiazole ring (C26 and C20).
epoP (nucleotides 11872-16104 of SEQ ID NO:1) codes for EPOS P (SEQ ID NO:3), a non-ribosomal peptide synthetase containing one module. EPOS P harbors the following domains:
• peptide bond formation domain, as delineated by motif K (amino acids 72-81 [FPLTDIQESY] of SEQ ID NO:3, corresponding to nucleotide positions 12085-12114 of SEQ ID NO:1); motif L (amino acids 118-125 [VVARHDML] of SEQ ID NO:3, corresponding to nucleotide positions 12223-12246 of SEQ ID NO:1); motif M (amino acids 199-212 [SIDLINVDLGSLSI] of SEQ ID NO:3, corresponding to nucleotide positions 12466-12507 of SEQ ID NO:1); and motif O (amino acids 353-363 [GDFTSMVLLDI] of SEQ ID NO:3, corresponding to nucleotide positions 12928-12960 of SEQ ID NO:1);
• aminoacyl adenylate formation domain, as delineated by motif A (amino acids 549-565 [LTYEELSRRSRRLGARL] of SEQ ID NO:3, corresponding to nucleotide positions 13516-13566 of SEQ ID NO:1); motif B (amino acids 588-603 [VAVLAVLESGAAYVPI] of SEQ ID NO:3, corresponding to nucleotide positions 13633-13680 of SEQ ID NO:1); motif C (amino acids 669-684 [AYVIYTSGSTGLPKGV] of SEQ ID NO:3, corresponding to nucleotide positions 13876-13923 of SEQ ID NO:1); motif D (amino acids 815-821 [SLGGATE] of SEQ ID NO:3, corresponding to nucleotide positions 14313-14334 of SEQ ID NO:1); motif E (amino acids 868-892 [GQLYIGGVGLALGYWRDEEKTRKSF] of SEQ ID NO:3, corresponding to nucleotide positions 14473-14547 of SEQ ID NO:1); motif F (amino acids 903-912 [YKTGDLGRYL] of SEQ ID NO:3, corresponding to nucleotide positions 14578-14607 of SEQ ID NO:1); motif G (amino acids 918-940 [EFMGREDNQIKLRGYRVELGEIE] of SEQ ID NO:3, corresponding to nucleotide positions 14623-14692 of SEQ ID NO:1); motif H (amino acids 1268-1274 [LPEYMVP] of SEQ ID NO:3, corresponding to nucleotide positions 15673-15693 of SEQ ID NO:1); and motif I (amino acids 1285-1297 [LTSNGKVDRKALR] of SEQ ID NO:3, corresponding to nucleotide positions 15724-15762 of SEQ ID NO:1);
• an unknown domain, inserted between motifs G and H of the aminoacyl adenylate formation domain (amino acids 973-1256 of SEQ ID NO:3, corresponding to nucleotide positions 14788-15639 of SEQ ID NO:1); and
• a peptidyl carrier protein homologous domain (PCP), delineated by motif J (amino acids 1344-1351 [GATSIHIV] of SEQ ID NO:3, corresponding to nucleotide positions 15901-15924 of SEQ ID NO:1).
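The paired amino acid and nucleotide coordinates quoted above follow from simple codon arithmetic for a gene on the forward strand: residue n of a protein whose gene starts at nucleotide s occupies nucleotides s + 3(n - 1) through s + 3n - 1. The sketch below is a minimal illustration of that conversion; the function name is an assumption for this example, and genes on the reverse complement strand would need a mirrored calculation that is not shown here.

```python
def aa_range_to_nt(gene_start: int, aa_first: int, aa_last: int) -> tuple[int, int]:
    """Map an inclusive amino acid range to inclusive nucleotide positions.

    Assumes a forward-strand gene whose first codon begins at gene_start
    (1-based numbering, as used for SEQ ID NO:1).
    """
    nt_first = gene_start + 3 * (aa_first - 1)
    nt_last = gene_start + 3 * aa_last - 1
    return nt_first, nt_last


EPOA_START = 7610   # first nucleotide of epoA, from the text
EPOP_START = 11872  # first nucleotide of epoP, from the text

# Motif K of EPOS P: amino acids 72-81.
print(aa_range_to_nt(EPOP_START, 72, 81))
# AT domain of EPOS A: amino acids 543-864.
print(aa_range_to_nt(EPOA_START, 543, 864))
```

A check of this kind is a convenient way to spot transcription slips when domain boundaries are copied between a protein sequence and its genomic coordinates.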
It is proposed that EPOS P is involved in the activation of a cysteine by adenylation, binding the activated cysteine as an aminoacyl-S-PCP, forming a peptide bond between the enzyme-bound cysteine and the acetyl-S-ACP supplied by EPOS A, and the formation of the initial thiazoline ring by intramolecular heterocyclization. The unknown domain of EPOS P displays very weak homologies to NAD(P)H oxidases and reductases from Bacillus species. Thus, this unknown domain and/or the ER domain of EPOS A may be involved in the oxidation of the initial 2-methylthiazoline ring to a 2-methylthiazole.
epoB (nucleotides 16251-21749 of SEQ ID NO:1) codes for EPOS B (SEQ ID NO:4), a type I polyketide synthase consisting of a single module, and harboring the following domains: KS (nucleotides 16269-17546 of SEQ ID NO:1, amino acids 7-432 of SEQ ID NO:4); AT (nucleotides 17865-18827 of SEQ ID NO:1, amino acids 539-859 of SEQ ID NO:4); dehydratase (DH) (nucleotides 18855-19361 of SEQ ID NO:1, amino acids 869-1037 of SEQ ID NO:4); β-ketoreductase (KR) (nucleotides 20565-21302 of SEQ ID NO:1, amino acids 1439-1684 of SEQ ID NO:4); and ACP (nucleotides 21414-21626 of SEQ ID NO:1, amino acids 1722-1792 of SEQ ID NO:4). Sequence comparisons and motif analysis reveal that the AT encoded by EPOS B is specific for methylmalonyl-CoA. EPOS B should be involved in the first polyketide chain extension by catalysing the Claisen-like condensation of the 2-methyl-4-thiazolecarboxyl-S-PCP starter group with the methylmalonyl-S-ACP, and the concomitant reduction of the β-keto group of C17 to an enoyl.
epoC (nucleotides 21746-43519 of SEQ ID NO:1) codes for EPOS C (SEQ ID NO:5), a type I polyketide synthase consisting of 4 modules.
The first module harbors a KS (nucleotides 21860-23116 of SEQ ID NO:1, amino acids 39-457 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 23431-24397 of SEQ ID NO:1, amino acids 563-884 of SEQ ID NO:5); a KR (nucleotides 25184-25942 of SEQ ID NO:1, amino acids 1147-1399 of SEQ ID NO:5); and an ACP (nucleotides 26045-26263 of SEQ ID NO:1, amino acids 1434-1506 of SEQ ID NO:5). This module incorporates an acetate extender unit (C14-C13) and reduces the β-keto group at C15 to the hydroxyl group that takes part in the final lactonization of the epothilone macrolactone ring.
The second module of EPOS C harbors a KS (nucleotides 26318-27595 of SEQ ID NO:1, amino acids 1524-1950 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 27911-28876 of SEQ ID NO:1, amino acids 2056-2377 of SEQ ID NO:5); a KR (nucleotides 29678-30429 of SEQ ID NO:1, amino acids 2645-2895 of SEQ ID NO:5); and an ACP (nucleotides 30539-30759 of SEQ ID NO:1, amino acids 2932-3005 of SEQ ID NO:5). This module incorporates an acetate extender unit (C12-C11) and reduces the β-keto group at C13 to a hydroxyl group. Thus, the nascent polyketide chain of epothilone corresponds to epothilone A, and the incorporation of the methyl side chain at C12 in epothilone B would require a post-PKS C-methyltransferase activity. The formation of the epoxide ring at C13-C12 would also require a post-PKS oxidation step.
The third module of EPOS C harbors a KS (nucleotides 30815-32092 of SEQ ID NO:1, amino acids 3024-3449 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 32408-33373 of SEQ ID NO:1, amino acids 3555-3876 of SEQ ID NO:5); a DH (nucleotides 33401-33889 of SEQ ID NO:1, amino acids 3886-4048 of SEQ ID NO:5); an ER (nucleotides 35042-35902 of SEQ ID NO:1, amino acids 4433-4719 of SEQ ID NO:5); a KR (nucleotides 35930-36667 of SEQ ID NO:1, amino acids 4729-4974 of SEQ ID NO:5); and an ACP (nucleotides 36773-36991 of SEQ ID NO:1, amino acids 5010-5082 of SEQ ID NO:5). This module incorporates an acetate extender unit (C10-C9) and fully reduces the β-keto group at C11.
The fourth module of EPOS C harbors a KS (nucleotides 37052-38320 of SEQ ID NO:1, amino acids 5103-5525 of SEQ ID NO:5); a methylmalonyl CoA-specific AT (nucleotides 38636-39598 of SEQ ID NO:1, amino acids 5631-5951 of SEQ ID NO:5); a DH (nucleotides 39635-40141 of SEQ ID NO:1, amino acids 5964-6132 of SEQ ID NO:5); an ER (nucleotides 41369-42256 of SEQ ID NO:1, amino acids 6542-6837 of SEQ ID NO:5); a KR (nucleotides 42314-43048 of SEQ ID NO:1, amino acids 6857-7101 of SEQ ID NO:5); and an ACP (nucleotides 43163-43378 of SEQ ID NO:1, amino acids 7140-7211 of SEQ ID NO:5). This module incorporates a propionate extender unit (C24 and C8-C7) and fully reduces the β-keto group at C9.
epoD (nucleotides 43524-54920 of SEQ ID NO:1) codes for EPOS D (SEQ ID NO:6), a type I polyketide synthase consisting of 2 modules. The first module harbors a KS (nucleotides 43626-44885 of SEQ ID NO:1, amino acids 35-454 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 45204-46166 of SEQ ID NO:1, amino acids 561-881 of SEQ ID NO:6); a KR (nucleotides 46950-47702 of SEQ ID NO:1, amino acids 1143-1393 of SEQ ID NO:6); and an ACP (nucleotides 47811-48032 of SEQ ID NO:1, amino acids 1430-1503 of SEQ ID NO:6).
This module incorporates a propionate extender unit (C23 and C6-C5) and reduces the β-keto group at C7 to a hydroxyl group.
The second module harbors a KS (nucleotides 48087-49361 of SEQ ID NO:1, amino acids 1522-1946 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 49680-50642 of SEQ ID NO:1, amino acids 2053-2373 of SEQ ID NO:6); a DH (nucleotides 50670-51176 of SEQ ID NO:1, amino acids 2383-2551 of SEQ ID NO:6); a methyltransferase (MT, nucleotides 51534-52657 of SEQ ID NO:1, amino acids 2671-3045 of SEQ ID NO:6); a KR (nucleotides 53697-54431 of SEQ ID NO:1, amino acids 3392-3636 of SEQ ID NO:6); and an ACP (nucleotides 54540-54758 of SEQ ID NO:1, amino acids 3673-3745 of SEQ ID NO:6). This module incorporates a propionate extender unit (C21 or C22 and C4-C3) and reduces the β-keto group at C5 to a hydroxyl group. This reduction is somewhat unexpected, since epothilones contain a keto group at C5. Discrepancies of this kind between the deduced reductive capabilities of PKS modules and the redox state of the corresponding positions in the final polyketide products have, however, been reported in the literature (see, for example, Schwecke, et al., Proc. Natl. Acad. Sci. USA 92: 7839-7843 (1995) and Schupp, et al., FEMS Microbiology Letters 159: 201-207 (1998)). An important feature of epothilones is the presence of gem-methyl side groups at C4 (C21 and C22). The second module of EPOS D is predicted to incorporate a propionate unit into the growing polyketide chain, providing one methyl side chain at C4. This module also contains a methyltransferase domain integrated into the PKS between the DH and the KR domains, in an arrangement similar to the one seen in the HMWP1 yersiniabactin synthase (Gehring, A.M., DeMoll, E., Fetherston, J.D., Mori, I., Mayhew, G.F., Blattner, F.R., Walsh, C.T., and Perry, R.D.: Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis. Chem. Biol. 5, 573-586, 1998).
This MT domain in EPOS D is proposed to be responsible for the incorporation of the second methyl side group (C21 or C22) at C4.
epoE (nucleotides 54935-62254 of SEQ ID NO:1) codes for EPOS E (SEQ ID NO:7), a type I polyketide synthase consisting of one module, harboring a KS (nucleotides 55028-56284 of SEQ ID NO:1, amino acids 32-450 of SEQ ID NO:7); a malonyl CoA-specific AT (nucleotides 56600-57565 of SEQ ID NO:1, amino acids 556-877 of SEQ ID NO:7); a DH (nucleotides 57593-58087 of SEQ ID NO:1, amino acids 887-1051 of SEQ ID NO:7); a probably nonfunctional ER (nucleotides 59366-60304 of SEQ ID NO:1, amino acids 1478-1790 of SEQ ID NO:7); a KR (nucleotides 60362-61099 of SEQ ID NO:1, amino acids 1810-2055 of SEQ ID NO:7); an ACP (nucleotides 61211-61426 of SEQ ID NO:1, amino acids 2093-2164 of SEQ ID NO:7); and a thioesterase (TE) (nucleotides 61427-62254 of SEQ ID NO:1, amino acids 2165-2439 of SEQ ID NO:7). The ER domain in this module harbors an active site motif with some highly unusual amino acid substitutions that probably render this domain inactive. The module incorporates an acetate extender unit (C2-C1), and reduces the β-keto at C3 to an enoyl group. Epothilones contain a hydroxyl group at C3, so this reduction also appears to be excessive, as discussed for the second module of EPOS D. The TE domain of EPOS E takes part in the release and cyclization of the grown polyketide chain via lactonization between the carboxyl group of C1 and the hydroxyl group of C15.
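The module-by-module description above can be summarized as a small data structure. The sketch below lists the eight chain-extension modules with the AT specificity stated in the text, maps each specificity to its extender unit (malonyl-CoA to acetate, methylmalonyl-CoA to propionate), and assigns two backbone carbons per module, counting down towards C1. The class and function names are illustrative assumptions, and the starting value of C16 for the EPOS B module is inferred from the assignments given for the later modules rather than stated explicitly.

```python
from dataclasses import dataclass

# AT substrate specificity -> extender unit, as stated in the text.
EXTENDER_BY_AT = {
    "malonyl-CoA": "acetate",
    "methylmalonyl-CoA": "propionate",
}


@dataclass(frozen=True)
class Module:
    protein: str
    number: int
    at_substrate: str


# Chain-extension modules in the order in which they act.
EXTENSION_MODULES = [
    Module("EPOS B", 1, "methylmalonyl-CoA"),
    Module("EPOS C", 1, "malonyl-CoA"),
    Module("EPOS C", 2, "malonyl-CoA"),
    Module("EPOS C", 3, "malonyl-CoA"),
    Module("EPOS C", 4, "methylmalonyl-CoA"),
    Module("EPOS D", 1, "methylmalonyl-CoA"),
    Module("EPOS D", 2, "methylmalonyl-CoA"),
    Module("EPOS E", 1, "malonyl-CoA"),
]


def assign_backbone(modules, top_carbon=16):
    """Give each module two backbone carbons, counting down to C1.

    top_carbon=16 is an inference for this sketch (EPOS B adding C16-C15).
    """
    assignments = {}
    carbon = top_carbon
    for m in modules:
        pair = f"C{carbon}-C{carbon - 1}"
        assignments[(m.protein, m.number)] = (pair, EXTENDER_BY_AT[m.at_substrate])
        carbon -= 2
    return assignments


backbone = assign_backbone(EXTENSION_MODULES)
for key, value in backbone.items():
    print(key, value)
```

The assignments produced for EPOS C, EPOS D and EPOS E reproduce the carbon pairs and extender units given in the text, which is the consistency check this sketch is meant to provide.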
Five ORFs are detected upstream of epoA in the sequenced region. The partially sequenced orf1 has no homologues in the sequence databanks. The deduced protein product (Orf 2, SEQ ID NO:10) of orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1) shows strong similarities to hypothetical ORFs from Mycobacterium and Streptomyces coelicolor, and more distant similarities to carboxypeptidases and DD-peptidases of different bacteria. The deduced protein product of orf3 (nucleotides 3415-5556 of SEQ ID NO:1), Orf 3 (SEQ ID NO:11), shows homologies to Na/H antiporters of different bacteria. Orf 3 might take part in the export of epothilones from the producer strain. orf4 and orf5 have no homologues in the sequence databanks.
Eleven ORFs are found downstream of epoE in the sequenced region. epoF (nucleotides 62369-63628 of SEQ ID NO:1) codes for EPOS F (SEQ ID NO:8), a deduced protein with strong sequence similarities to cytochrome P450 oxygenases. EPOS F may take part in the adjustment of the redox state of the carbons C12, C5, and/or C3. The deduced protein product of orf14 (nucleotides 67334-68251 of SEQ ID NO:1), Orf 14 (SEQ ID NO:22), shows strong similarities to GI:3293544, a hypothetical protein with no proposed function from Streptomyces coelicolor, and also to GI:2654559, the human embryonic lung protein. It is also more distantly related to cation efflux system proteins like GI:2623026 from Methanobacterium thermoautotrophicum, so it might also take part in the export of epothilones from the producing cells. The remaining ORFs (orf6-orf13 and orf15) show no homologies to entries in the sequence databanks.
Example 13: Recombinant Expression of Epothilone Biosynthesis Genes
Epothilone synthase genes according to the present invention are expressed in heterologous organisms for the purposes of epothilone production at greater quantities than can be accomplished by fermentation of Sorangium cellulosum. A preferable host for heterologous expression is Streptomyces, e.g. Streptomyces coelicolor, which natively produces the polyketide actinorhodin. Techniques for recombinant PKS gene expression in this host are described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994). See also Holmes et al., EMBO Journal 12(8): 3183-3191 (1993) and Bibb et al., Gene 38: 215-226 (1985), as well as U.S. Patent Nos. 5,521,077, 5,672,491, and 5,712,146, which are incorporated herein by reference.
According to one method, the heterologous host strain is engineered to contain a chromosomal deletion of the actinorhodin (act) gene cluster. Expression plasmids containing the epothilone synthase genes of the invention are constructed by transferring DNA from a temperature-sensitive donor plasmid to a recipient shuttle vector in E. coli (McDaniel et al. (1993) and Kao et al. (1994)), such that the synthase genes are built up by homologous recombination within the vector. Alternatively, the epothilone synthase gene cluster is introduced into the vector by restriction fragment ligation. Following selection, e.g. as described in Kao et al. (1994), DNA from the vector is introduced into the act-minus Streptomyces coelicolor strain according to protocols set forth in Hopwood et al., Genetic Manipulation of Streptomyces. A Laboratory Manual (John Innes Foundation, Norwich, United Kingdom, 1985), incorporated herein by reference. The recombinant Streptomyces strain is grown on R2YE medium (Hopwood et al. (1985)) and produces epothilones.
Alternatively, the epothilone synthase genes according to the present invention are expressed in other host organisms such as pseudomonads, Bacillus, yeast, insect cells and/or E. coli. PKS and NRPS genes are preferably expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985). In another embodiment, the expression vectors pKK223-3 and pKK223-2 are used to express PKS and NRPS genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter. Expression of PKS and NRPS genes in heterologous hosts, which do not naturally have the phosphopantetheinyl (P-pant) transferases needed for posttranslational modification of PKS enzymes, requires the coexpression in the host of a P-pant transferase, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998).
Example 14: Isolation of Epothilones from Producing Strains
Examples of cultivation, fermentation, and extraction procedures for polyketide isolation, which are useful for extracting epothilones from both native and recombinant hosts according to the present invention, are given in WO 93/10121, incorporated herein by reference, in Example 57 of U.S. Patent No. 5,639,949, in Gerth et al., J. Antibiotics 49: 560-563 (1996), and in Swiss patent application no. 396/98, filed February 19, 1998, and U.S. patent application no. 09/248,910 (that discloses also preferred mutant strains of Sorangium cellulosum), both of which are incorporated herein by reference. The following are procedures that are useful for isolating epothilones from cultured Sorangium cellulosum strains such as So ce90, and may also be used for the isolation of epothilone from recombinant hosts.
A: Cultivation of epothilone-producing strains:
Strain: Sorangium cellulosum Soce-90 or a recombinant host strain according to the present invention.
Preservation of the strain: In liquid N2.
Media: Precultures and intermediate cultures: G52
Main culture: 1B12
G52 Medium:
yeast extract, low in salt (BioSpringer, Maison Alfort, France) 2 g/l
MgSO4 (7 H2O) 1 g/l
CaCl2 (2 H2O) 1 g/l
soya meal defatted Soyamine 50T (Lucas Meyer, Hamburg, Germany) 2 g/l
potato starch Noredux A-150 (Blattmann, Waedenswil, Switzerland) 8 g/l
glucose anhydrous 2 g/l
EDTA-Fe(III)-Na salt (8 g/l) 1 ml/l
pH 7.4, corrected with KOH
Sterilisation: 20 mins. 120 °C
1B12 Medium:
potato starch Noredux A-150 (Blattmann, Waedenswil, Switzerland) 20 g/l
soya meal defatted Soyamine 50T (Lucas Meyer, Hamburg, Germany) 11 g/l
EDTA-Fe(III)-Na salt 8 mg/l
pH 7.8, corrected with KOH
Sterilisation: 20 mins. 120 °C
Addition of cyclodextrins and cyclodextrin derivatives
Cyclodextrins (Fluka, Buchs, Switzerland, or Wacker Chemie, Munich, Germany) in different concentrations are sterilised separately and added to the 1B12 medium prior to seeding.
Cultivation: 1 ml of the suspension of Sorangium cellulosum Soce-90 from a liquid N2 ampoule is transferred to 10 ml of G52 medium (in a 50 ml Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 25 mm displacement. 5 ml of this culture is added to 45 ml of G52 medium (in a 200 ml Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 25 mm displacement. 50 ml of this culture is then added to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Maintenance culture: The culture is overseeded every 3-4 days, by adding 50 ml of culture to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask). All experiments and fermentations are carried out by starting with this maintenance culture.
Tests in a flask:
(I) Preculture in an agitating flask: Starting with the 500 ml of maintenance culture, 1 x 450 ml of G52 medium are seeded with 50 ml of the maintenance culture and incubated for 4 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
(II) Main culture in the agitating flask: 40 ml of 1B12 medium plus 5 g/l 4-morpholine-propane-sulfonic acid (= MOPS) powder (in a 200 ml Erlenmeyer flask) are mixed with 5 ml of a 10x concentrated cyclodextrin solution, seeded with 10 ml of preculture and incubated for 5 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Fermentation: Fermentations are carried out on a scale of 10 litres, 100 litres and 500 litres. 20 litre and 100 litre fermentations serve as an intermediate culture step. Whereas the precultures and intermediate cultures are seeded, like the maintenance culture, at 10% (v/v), the main cultures are seeded with 20% (v/v) of the intermediate culture. Important: In contrast to the agitating cultures, the ingredients of the media for the fermentation are calculated on the final culture volume including the inoculum. If, for example, 18 litres of medium + 2 litres of inoculum are combined, then substances for 20 litres are weighed in, but are only mixed with 18 litres.
Preculture in an agitating flask:
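The weigh-in rule for fermenter media (ingredients calculated on the final culture volume, inoculum included) can be sketched as a short calculation. The example below uses the solid G52 components and the 18 litre + 2 litre case mentioned in the text; the dictionary keys and function name are assumptions made for this illustration only.

```python
# Solid G52 components in g per litre, taken from the recipe in the text.
G52_G_PER_L = {
    "yeast extract": 2.0,
    "MgSO4 (7 H2O)": 1.0,
    "CaCl2 (2 H2O)": 1.0,
    "soya meal": 2.0,
    "potato starch": 8.0,
    "glucose": 2.0,
}


def weigh_in(recipe_g_per_l: dict, medium_l: float, inoculum_l: float) -> dict:
    """Grams of each ingredient, calculated on medium plus inoculum volume."""
    final_volume_l = medium_l + inoculum_l
    return {name: conc * final_volume_l for name, conc in recipe_g_per_l.items()}


# 18 litres of medium seeded with 2 litres of inoculum: weigh in for 20 litres.
amounts = weigh_in(G52_G_PER_L, medium_l=18.0, inoculum_l=2.0)
for name, grams in amounts.items():
    print(f"{name}: {grams:.0f} g")
```

In this example the potato starch comes to 160 g dissolved in 18 litres, so the medium is slightly more concentrated than the nominal recipe until the inoculum is added.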
Starting with the 500 ml maintenance culture, 4 x 450 ml of G52 medium (in a 2 litre Erlenmeyer flask) are each seeded with 50 ml thereof, and incubated for 4 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Intermediate culture, 20 litres or 100 litres:
20 litres: 18 litres of G52 medium in a fermenter having a total volume of 30 litres are seeded with 2 litres of the preculture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control.
100 litres: 90 litres of G52 medium in a fermenter having a total volume of 150 litres are seeded with 10 litres of the 20 litre intermediate culture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 150 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control.
Main culture, 10 litres, 100 litres or 500 litres:
10 litres: The media substances for 10 litres of 1B12 medium are sterilised in 7 litres of water, then 1 litre of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution is added, and seeded with 2 litres of a 20 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5 (i.e. no control between pH 7.1 and 8.1).
100 litres: The media substances for 100 litres of 1B12 medium are sterilised in 70 litres of water, then 10 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 20 litres of a 20 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 200 rpm, 0.5 litres air per litre liquid per min., 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5. The chain of seeding for a 100 litre fermentation is shown schematically as follows:
[Seeding scheme, starting from the maintenance culture (500 ml), shown as an image in the original publication (imgf000053_0001).]
500 litres: The media substances for 500 litres of 1B12 medium are sterilised in 350 litres of water, then 50 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 100 litres of a 100 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 120 rpm, 0.5 litres air per litre liquid per min., 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
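The three main-culture recipes share the same proportions: 70% water, 10% of a 10% (w/v) cyclodextrin solution, and 20% inoculum, each relative to the final volume. The sketch below reproduces the stated volumes for the 10, 100 and 500 litre scales and derives the resulting cyclodextrin concentration. The function and key names are assumptions for this illustration; integer percentages are used so that the arithmetic stays exact.

```python
def main_culture_plan(final_l: float,
                      water_pct: int = 70,
                      cd_solution_pct: int = 10,
                      inoculum_pct: int = 20,
                      cd_stock_percent_wv: float = 10.0) -> dict:
    """Volumes (litres) and final cyclodextrin concentration of a main culture.

    The percentages are the proportions used in the 10, 100 and 500 litre
    recipes in the text; 1% (w/v) corresponds to 10 g/l.
    """
    water_l = final_l * water_pct / 100
    cd_solution_l = final_l * cd_solution_pct / 100
    inoculum_l = final_l * inoculum_pct / 100
    cd_grams = cd_solution_l * cd_stock_percent_wv * 10.0
    return {
        "water_l": water_l,
        "cd_solution_l": cd_solution_l,
        "inoculum_l": inoculum_l,
        "cd_final_g_per_l": cd_grams / final_l,
    }


for scale in (10, 100, 500):
    print(scale, main_culture_plan(scale))
```

At every scale the final cyclodextrin concentration works out to 10 g/l, i.e. 1% (w/v).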
Product analysis:
Preparation of the sample: 50 ml samples are mixed with 2 ml of polystyrene resin Amberlite XAD16 (Rohm + Haas, Frankfurt, Germany) and shaken at 180 rpm for one hour at 30°C. The resin is subsequently filtered using a 150 μm nylon sieve, washed with a little water and then added together with the filter to a 15 ml Nunc tube.
Elution of the product from the resin: 10 ml of isopropanol (>99%) are added to the tube with the filter and the resin. Afterwards, the sealed tube is shaken for 30 minutes at room temperature on a Rota-Mixer (Labinco BV, Netherlands). Then, 2 ml of the liquid are centrifuged off and the supernatant is added using a pipette to HPLC tubes.
HPLC analysis:
Column: Waters-Symmetry C18, 100 x 4 mm, 3.5 μm, WAT066220 + preliminary column 3.9 x 20 mm, WAT054225
Solvents: A: 0.02 % phosphoric acid
B: Acetonitrile (HPLC-Quality)
Gradient: 41% B from 0 to 7 min.
100% B from 7.2 to 7.8 min.
41% B from 8 to 12 min.
Oven temp.: 30°C
Detection: 250 nm, UV-DAD detection
Injection vol.: 10 μl
Retention time: Epo A: 4.30 min
Epo B: 5.38 min
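The gradient table can be written as a list of breakpoints and evaluated at any time point. The sketch below does this under the assumption, not stated in the text, that the solvent composition changes linearly between the listed segments (7.0 to 7.2 min and 7.8 to 8.0 min); the names are illustrative only.

```python
# (time in min, % B) breakpoints taken from the gradient table.
GRADIENT = [
    (0.0, 41.0),
    (7.0, 41.0),
    (7.2, 100.0),
    (7.8, 100.0),
    (8.0, 41.0),
    (12.0, 41.0),
]


def percent_b(t_min: float) -> float:
    """% solvent B at time t, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-12 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


# Both epothilones elute during the initial isocratic segment.
print(percent_b(4.30), percent_b(5.38))
```

With this reading of the table, the 100% B step after 7 min acts as a column wash, followed by re-equilibration at 41% B.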
B: Effect of the addition of cyclodextrin and cyclodextrin derivatives on the epothilone concentrations attained.
Cyclodextrins are cyclic (α-1,4)-linked oligosaccharides of α-D-glucopyranose with a relatively hydrophobic central cavity and a hydrophilic external surface area.
The following are distinguished in particular (the figures in parenthesis give the number of glucose units per molecule): α-cyclodextrin (6), β-cyclodextrin (7), γ-cyclodextrin (8), δ-cyclodextrin (9), ε-cyclodextrin (10), ζ-cyclodextrin (11), η-cyclodextrin (12), and θ-cyclodextrin (13). Especially preferred are δ-cyclodextrin and in particular α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, or mixtures thereof.
Cyclodextrin derivatives are primarily derivatives of the above-mentioned cyclodextrins, especially of α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, primarily those in which one or more up to all of the hydroxy groups (3 per glucose radical) are etherified or esterified. Ethers are primarily alkyl ethers, especially lower alkyl, such as methyl or ethyl ether, also propyl or butyl ether; the aryl-hydroxyalkyl ethers, such as phenyl-hydroxy-lower-alkyl, especially phenyl-hydroxyethyl ether; the hydroxyalkyl ethers, in particular hydroxy-lower-alkyl ethers, especially 2-hydroxyethyl, hydroxypropyl such as 2-hydroxypropyl or hydroxybutyl such as 2-hydroxybutyl ether; the carboxyalkyl ethers, in particular carboxy-lower-alkyl ethers, especially carboxymethyl or carboxyethyl ether; derivatised carboxyalkyl ethers, in particular derivatised carboxy-lower-alkyl ether in which the derivatised carboxy is etherified or amidated carboxy (primarily aminocarbonyl, mono- or di-lower-alkyl-aminocarbonyl, morpholino-, piperidino-, pyrrolidino- or piperazino-carbonyl, or alkyloxycarbonyl), in particular lower alkoxycarbonyl-lower-alkyl ether, for example methyloxycarbonylpropyl ether or ethyloxycarbonylpropyl ether; the sulfoalkyl ethers, in particular sulfo-lower-alkyl ethers, especially sulfobutyl ether; cyclodextrins in which one or more OH groups are etherified with a radical of formula
-O-[alk-O-]n-H
wherein alk is alkyl, especially lower alkyl, and n is a whole number from 2 to 12, especially 2 to 5, in particular 2 or 3; cyclodextrins in which one or more OH groups are etherified with a radical of formula
[Formula shown as an image in the original publication (imgf000055_0001).]
wherein R' is hydrogen, hydroxy, -O-(alk-O)z-H, -O-(alk(-R)-O-)p-H or -O-(alk(-R)-O-)q-alk-CO-Y; alk in all cases is alkyl, especially lower alkyl; m, n, p, q and z are a whole number from 1 to 12, preferably 1 to 5, in particular 1 to 3; and Y is OR1 or NR2R3, wherein R1, R2 and R3, independently of one another, are hydrogen or lower alkyl, or R2 and R3 combined together with the linking nitrogen signify morpholino, piperidino, pyrrolidino or piperazino; or branched cyclodextrins, in which etherifications or acetals with other sugar molecules are present, especially glucosyl-, diglucosyl- (G2-β-cyclodextrin), maltosyl- or dimaltosyl-cyclodextrin, or N-acetylglucosaminyl-, glucosaminyl-, N-acetylgalactosaminyl- or galactosaminyl-cyclodextrin. Esters are primarily alkanoyl esters, in particular lower alkanoyl esters, such as acetyl esters of cyclodextrins.
It is also possible to have cyclodextrins in which two or more different said ether and ester groups are present at the same time.
Mixtures of two or more of the said cyclodextrins and/or cyclodextrin derivatives may also exist.
Preference is given in particular to α-, β- or γ-cyclodextrins or the lower alkyl ethers thereof, such as methyl-β-cyclodextrin or in particular 2,6-di-O-methyl-β-cyclodextrin, or in particular the hydroxy lower alkyl ethers thereof, such as 2-hydroxypropyl-α-, 2-hydroxypropyl-β- or 2-hydroxypropyl-γ-cyclodextrin.
The cyclodextnns or cyclodextrin derivatives are added to the culture medium preferably in a concentration of 0.02 to 10, preferably 0.05 to 5, especially 0.1 to 4, for example 0.1 to 2 percent by weight (w/v).
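The concentration ranges above are stated in percent weight per volume. As a minimal sketch of the arithmetic, the following converts % (w/v) into grams per litre and into the mass needed for a given culture volume. The function names are illustrative assumptions and are not part of the specification; the only fact used is that 1% (w/v) corresponds to 1 g per 100 ml.

```python
def percent_wv_to_g_per_l(percent_wv: float) -> float:
    """Convert percent (w/v), i.e. grams per 100 ml, into grams per litre."""
    return percent_wv * 10.0


def grams_needed(percent_wv: float, culture_volume_l: float) -> float:
    """Mass of cyclodextrin (g) for a culture volume at a given % (w/v)."""
    return percent_wv_to_g_per_l(percent_wv) * culture_volume_l


if __name__ == "__main__":
    # Broadest and narrowest ranges named in the text, in % (w/v).
    for low, high in [(0.02, 10.0), (0.1, 2.0)]:
        print(f"{low}-{high} % (w/v) = "
              f"{percent_wv_to_g_per_l(low):g}-{percent_wv_to_g_per_l(high):g} g/l")
    # A 50 ml flask culture at 1 % (w/v) (hypothetical combination).
    print(f"{grams_needed(1.0, 0.05):g} g for 50 ml at 1 % (w/v)")
```

On this conversion, 1% (w/v) equals 10 g/l, which agrees with the 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin used in the fermentations described as containing 1% of that additive.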
Cyclodextrins or cyclodextrin derivatives are known or may be produced by known processes (see for example US 3,459,731; US 4,383,992; US 4,535,152; US 4,659,696; EP 0 094 157; EP 0 149 197; EP 0 197 571; EP 0 300 526; EP 0 320 032; EP 0 499 322; EP 0 503 710; EP 0 818 469; WO 90/12035; WO 91/11200; WO 93/19061; WO 95/08993; WO 96/14090; GB 2,189,245; DE 3,118,218; DE 3,317,064 and the references mentioned therein, which also refer to the synthesis of cyclodextrins or cyclodextrin derivatives, or also: T. Loftsson and M.E. Brewster (1996): Pharmaceutical Applications of Cyclodextrins: Drug Solubilization and Stabilisation: Journal of Pharmaceutical Science 85 (10): 1017-1025; R.A. Rajewski and V.J. Stella (1996): Pharmaceutical Applications of Cyclodextrins: In Vivo Drug Delivery: Journal of Pharmaceutical Science 85 (11): 1142-1169).
All the cyclodextrin derivatives tested here are obtainable from the company Fluka, Buchs, CH. The tests are carried out in 200 ml agitating flasks with 50 ml culture volume. As controls, flasks with adsorber resin Amberlite XAD-16 (Rohm & Haas, Frankfurt, Germany) and without any adsorber addition are used. After incubation for 5 days, the following epothilone titres can be determined by HPLC:
Table 2:
Figure imgf000056_0001
Figure imgf000057_0001
1) Apart from Amberlite (%v/v), all percentages are by weight (%w/v).
Few of the cyclodextrins tested (2,6-di-O-methyl-β-cyclodextrin, methyl-β-cyclodextrin) display no effect or a negative effect on epothilone production at the concentrations used. 1-2% 2-hydroxypropyl-β-cyclodextrin and β-cyclodextrin increase epothilone production in the examples by 6 to 8 times compared with production using no cyclodextrins.
C: 10 litre fermentation with 1% 2-(hydroxypropyl)-β-cyclodextrin:
Fermentation is carried out in a 15 litre glass fermenter. The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin from Wacker Chemie, Munich, Germany. The progress of fermentation is illustrated in Table 3. Fermentation is ended after 6 days and working up takes place.
Table 3: Progress of a 10 litre fermentation
Figure imgf000058_0001
D: 100 litre fermentation with 1% 2-(hydroxypropyl)-β-cyclodextrin:
Fermentation is carried out in a 150 litre fermenter. The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin. The progress of fermentation is illustrated in Table 4. The fermentation is harvested after 7 days and worked up.
Table 4: Progress of a 100 litre fermentation
Figure imgf000058_0002
Figure imgf000059_0001
E: 500 litre fermentation with 1% 2-(hydroxypropyl)-β-cyclodextrin:
Fermentation is carried out in a 750 litre fermenter. The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin. The progress of fermentation is illustrated in Table 5. The fermentation is harvested after 7 days and worked up.
Table 5: Progress of a 500 litre fermentation
Figure imgf000059_0002
F: Comparison example 10 litre fermentation without adding an adsorber:
Fermentation is carried out in a 15 litre glass fermenter. The medium does not contain any cyclodextrin or other adsorber. The progress of fermentation is illustrated in Table 6. The fermentation is not harvested and worked up.
Table 6: Progress of a 10 litre fermentation without adsorber.
G: Working up of the epothilones: Isolation from a 500 litre main culture:
The volume of harvest from the 500 litre main culture of example 2D is 450 litres and is separated using a Westfalia clarifying separator Type SA-20-06 (rpm = 6500) into the liquid phase (centrifugate + rinsing water = 650 litres) and solid phase (cells = ca. 15 kg). The main part of the epothilones is found in the centrifugate. The centrifuged cell pulp contains < 15% of the determined epothilone portion and is not further processed. The 650 litre centrifugate is then placed in a 4000 litre stirring vessel, mixed with 10 litres of Amberlite XAD-16 (centrifugate:resin volume = 65:1) and stirred. After a period of contact of ca. 2 hours, the resin is centrifuged away in a Heine overflow centrifuge (basket content 40 litres; rpm = 2800). The resin is discharged from the centrifuge and washed with 10-15 litres of deionised water. Desorption is effected by stirring the resin twice, each time in portions with 30 litres of isopropanol in 30 litre glass stirring vessels for 30 minutes. Separation of the isopropanol phase from the resin takes place using a suction filter. The isopropanol is then removed from the combined isopropanol phases by adding 15-20 litres of water in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and the resulting water phase of ca. 10 litres is extracted three times, each time with 10 litres of ethyl acetate. Extraction is effected in 30 litre glass stirring vessels. The ethyl acetate extract is concentrated to 3-5 litres in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and afterwards concentrated to dryness in a rotary evaporator (Büchi type) under vacuum. The result is an ethyl acetate extract of 50.2 g. The ethyl acetate extract is dissolved in 500 ml of methanol, the insoluble portions filtered off using a folded filter, and the solution added to a 10 kg Sephadex LH 20 column (Pharmacia, Uppsala, Sweden) (column diameter 20 cm, filling level ca. 1.2 m). Elution is effected with methanol as eluant.
Epothilones A and B are present predominantly in fractions 21-23 (at a fraction size of 1 litre). These fractions are concentrated to dryness in a vacuum on a rotary evaporator (total weight 9.0 g). These Sephadex peak fractions (9.0 g) are thereafter dissolved in 92 ml of acetonitrile:water:methylene chloride = 50:40:2, the solution filtered through a folded filter and added to an RP column (equipment Prepbar 200, Merck; 2.0 kg LiChrospher RP-18 Merck, grain size 12 μm, column diameter 10 cm, filling level 42 cm; Merck, Darmstadt, Germany). Elution is effected with acetonitrile:water = 3:7 (flow rate = 500 ml/min; retention time of epothilone A = ca. 51-59 min; retention time of epothilone B = ca. 60-69 min). Fractionation is monitored with a UV detector at 250 nm. The fractions are concentrated to dryness under vacuum on a Büchi-Rotavapor rotary evaporator. The weight of the epothilone A peak fraction is 700 mg, and according to HPLC (external standard) it has a content of 75.1%. That of the epothilone B peak fraction is 1980 mg, and the content according to HPLC (external standard) is 86.6%. Finally, the epothilone A fraction (700 mg) is crystallised from 5 ml of ethyl acetate:toluene = 2:3, and yields 170 mg of epothilone A pure crystallisate [content according to HPLC (% of area) = 94.3%]. Crystallisation of the epothilone B fraction (1980 mg) is effected from 18 ml of methanol and yields 1440 mg of epothilone B pure crystallisate [content according to HPLC (% of area) = 99.2%]. M.p. (epothilone B): e.g. 124-125 °C. 1H-NMR data for epothilone B:
500 MHz-NMR, solvent: DMSO-d6. Chemical shift δ in ppm relative to TMS. s = singlet; d = doublet; m = multiplet
δ (Multiplicity) Integral (number of H)
7.34 (s) 1
6.50 (s) 1
5.28 (d) 1
5.08 (d) 1
4.46 (d) 1
4.08 (m) 1
3.47 (m) 1
3.11 (m) 1
2.83 (dd) 1
2.64 (s) 3
2.36 (m) 2
2.09 (s) 3
2.04 (m) 1
1.83 (m) 1
1.61 (m) 1
1.47 - 1.24 (m) 4
1.18 (s) 6
1.13 (m) 2
1.06 (d) 3
0.89 (d + s, overlapping) 6
∑ = 41
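The masses and HPLC contents reported for the peak fractions and crystallisates in the isolation above can be turned into an estimate of how much epothilone each crystallisation step retains. The sketch below is an illustration only: the class and function names are hypothetical, and it assumes that both the external-standard content and the % of area content can be treated as mass fractions, which the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Fraction:
    """A purification fraction with its weighed mass and HPLC content."""
    name: str
    mass_mg: float
    content_pct: float  # treated here as a mass fraction in percent (assumption)

    @property
    def epothilone_mg(self) -> float:
        """Estimated mass of the epothilone itself in the fraction."""
        return self.mass_mg * self.content_pct / 100.0


def step_recovery(before: Fraction, after: Fraction) -> float:
    """Fraction of epothilone carried from one step into the next (0..1)."""
    return after.epothilone_mg / before.epothilone_mg


# Values taken from the isolation described above.
epo_a_peak = Fraction("epothilone A peak fraction", 700.0, 75.1)
epo_a_cryst = Fraction("epothilone A crystallisate", 170.0, 94.3)
epo_b_peak = Fraction("epothilone B peak fraction", 1980.0, 86.6)
epo_b_cryst = Fraction("epothilone B crystallisate", 1440.0, 99.2)

if __name__ == "__main__":
    for peak, cryst in [(epo_a_peak, epo_a_cryst), (epo_b_peak, epo_b_cryst)]:
        print(f"{cryst.name}: {cryst.epothilone_mg:.1f} mg of "
              f"{peak.epothilone_mg:.1f} mg retained "
              f"({100 * step_recovery(peak, cryst):.1f} %)")
```

Under these assumptions the epothilone B crystallisation retains roughly five sixths of the material in the peak fraction, while the epothilone A crystallisation retains under a third.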
Example 15: Medical Uses of Recombinantly Produced Epothilones
Pharmaceutical preparations or compositions comprising epothilones are used for example in the treatment of cancerous diseases, such as various human solid tumors. Such anticancer formulations comprise, for example, an active amount of an epothilone together with one or more organic or inorganic, liquid or solid, pharmaceutically suitable carrier materials. Such formulations are delivered, for example, enterally, nasally, rectally, orally, or parenterally, particularly intramuscularly or intravenously. The dosage of the active ingredient is dependent upon the weight, age, and physical and pharmacokinetical condition of the patient and is further dependent upon the method of delivery. Because epothilones mimic the biological effects of taxol, epothilones may be substituted for taxol in compositions and methods utilizing taxol in the treatment of cancer. See, for example, U.S. Patent Nos. 5,496,804, 5,565,478, and 5,641,803, all of which are incorporated herein by reference.
For example, for treatments, epothilone B is supplied in individual 2 ml glass vials formulated as 1 mg/1 ml of clear, colorless intravenous concentrate. The substance is formulated in polyethylene glycol 300 (PEG 300) and diluted with 50 or 100 ml 0.9% Sodium Chloride Injection, USP, to achieve the desired final concentration of the drug for infusion. It is administered as a single 30-minute intravenous infusion every 21 days (treatment three-weekly) for six cycles, or as a single 30-minute intravenous infusion every 7 days (weekly treatment).
Preferably, for weekly treatment, the dose is between about 0.1 and about 6, preferably about 0.1 and about 5 mg/m2, more preferably about 0.1 and about 3 mg/m2, even more preferably 0.1 and 1.7 mg/m2, most preferably about 0.3 and about 1 mg/m2; for three-weekly treatment (treatment every three weeks or every third week) the dose is between about 0.3 and about 18 mg/m2, preferably about 0.3 and about 15 mg/m2, more preferably about 0.3 and about 12 mg/m2, even more preferably about 0.3 and about 7.5 mg/m2, still more preferably about 0.3 and about 5 mg/m2, most preferably about 1.0 and about 3.0 mg/m2. This dose is preferably administered to the human by intravenous (i.v.) administration during 2 to 180 min, preferably 2 to 120 min, more preferably during about 5 to about 30 min, most preferably during about 10 to about 30 min, e.g. during about 30 min.
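The relation between a dose expressed in mg/m2, the 1 mg/ml concentrate and the diluted infusion volume can be written out as simple arithmetic. The following is a sketch for illustration only, not a dosing tool: the body surface area, the function and the class are hypothetical assumptions, and only the concentrate strength (1 mg per 1 ml), the diluent volumes (50 or 100 ml) and the 30-minute infusion time come from the text above.

```python
from dataclasses import dataclass

CONCENTRATE_MG_PER_ML = 1.0  # from the text: 1 mg per 1 ml of concentrate


@dataclass
class InfusionPlan:
    dose_mg: float
    concentrate_ml: float
    total_volume_ml: float
    final_conc_mg_per_ml: float
    rate_ml_per_min: float


def plan_infusion(dose_mg_per_m2: float, body_surface_m2: float,
                  diluent_ml: float, duration_min: float = 30.0) -> InfusionPlan:
    """Derive volumes and rates from a dose per square metre of body surface."""
    dose_mg = dose_mg_per_m2 * body_surface_m2
    concentrate_ml = dose_mg / CONCENTRATE_MG_PER_ML
    total_volume_ml = diluent_ml + concentrate_ml
    return InfusionPlan(
        dose_mg=dose_mg,
        concentrate_ml=concentrate_ml,
        total_volume_ml=total_volume_ml,
        final_conc_mg_per_ml=dose_mg / total_volume_ml,
        rate_ml_per_min=total_volume_ml / duration_min,
    )


if __name__ == "__main__":
    # Hypothetical case: 1.0 mg/m2, body surface area 1.8 m2, 50 ml diluent.
    plan = plan_infusion(dose_mg_per_m2=1.0, body_surface_m2=1.8, diluent_ml=50.0)
    print(f"dose {plan.dose_mg:.2f} mg, concentrate {plan.concentrate_ml:.2f} ml, "
          f"final {plan.final_conc_mg_per_ml:.4f} mg/ml, "
          f"rate {plan.rate_ml_per_min:.2f} ml/min")
```

The sketch adds the concentrate volume to the diluent volume when computing the final concentration; whether the stated 50 or 100 ml is the diluent alone or the final volume is not specified in the text, so that choice is also an assumption.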
While the present invention has been described with reference to specific embodiments thereof, it will be appreciated that numerous variations, modifications, and embodiments are possible, and accordingly, all such variations, modifications and embodiments are to be regarded as being within the spirit and scope of the present invention.
BUDAPEST TREATY ON THE INTERNATIONAL RECOGNITION OF THE DEPOSIT OF MICROORGANISMS FOR THE PURPOSE OF PATENT PROCEDURES
INTERNATIONAL FORM
RECEIPT IN THE CASE OF AN ORIGINAL DEPOSIT issued pursuant to Rule 7.1 by the INTERNATIONAL DEPOSITARY AUTHORITY identified at the bottom of this page
TO (NAME AND ADDRESS OF DEPOSITOR):
Novartis AG
Novartis Corporation
Patent and Trademark Dept.
3054 Cornwallis Rd.
Research Triangle Park, NC 27709
I. IDENTIFICATION OF THE MICROORGANISM
Identification reference given by the DEPOSITOR: Escherichia coli DH10B [pEPO15]
Accession number given by the INTERNATIONAL DEPOSITARY AUTHORITY: NRRL B-30033
II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION
The microorganism identified under I. above was accompanied by:
[ ] a scientific description
[X] a proposed taxonomic designation
(Mark with a cross where applicable)
III. RECEIPT AND ACCEPTANCE
This International Depositary Authority accepts the microorganism identified under I. above, which was received by it on June 11, 1998 (date of the original deposit) 1
IV. RECEIPT OF REQUEST FOR CONVERSION
The microorganism identified under I. above was received by this International Depositary Authority on (date of the original deposit) and a request to convert the original deposit to a deposit under the Budapest Treaty was received by it on (date of receipt of request for conversion).
INTERNATIONAL DEPOSITARY AUTHORITY
Name: Agricultural Research Culture Collection (NRRL), International Depositary Authority
Address: 1815 N. University Street, Peoria, Illinois 61604 U.S.A.
Signature(s) of person(s) having the power to represent the International Depositary Authority or of authorized official(s):
Date:
Figure imgf000064_0001
1 Where Rule 6.4(d) applies, such date is the date on which the status of international depositary authority was acquired.
BUDAPEST TREATY ON THE INTERNATIONAL RECOGNITION OF THE DEPOSIT OF MICROORGANISMS FOR THE PURPOSE OF PATENT PROCEDURES
INTERNATIONAL FORM
RECEIPT IN THE CASE OF AN ORIGINAL DEPOSIT issued pursuant to Rule 7.1 by the INTERNATIONAL DEPOSITARY AUTHORITY identified at the bottom of this page
TO (NAME AND ADDRESS OF DEPOSITOR):
Novartis AG
c/o Novartis Agricultural Biotechnology Research, Inc.
Patent & Trademark Department
3054 Cornwallis Road
Research Triangle Park, NC 27709
I. IDENTIFICATION OF THE MICROORGANISM
Identification reference given by the DEPOSITOR: Escherichia coli DH10B [pEPO32]
Accession number given by the INTERNATIONAL DEPOSITARY AUTHORITY: NRRL B-30119
II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION
The microorganism identified under I. above was accompanied by:
[ ] a scientific description
[X] a proposed taxonomic designation
(Mark with a cross where applicable)
III. RECEIPT AND ACCEPTANCE
This International Depositary Authority accepts the microorganism identified under I. above, which was received by it on April 16, 1999 (date of the original deposit) 1
IV. RECEIPT OF REQUEST FOR CONVERSION
The microorganism identified under I. above was received by this International Depositary Authority on (date of the original deposit) and a request to convert the original deposit to a deposit under the Budapest Treaty was received by it on (date of receipt of request for conversion).
INTERNATIONAL DEPOSITARY AUTHORITY
Name: Agricultural Research Culture Collection (NRRL), International Depositary Authority
Address: 1815 N. University Street, Peoria, Illinois 61604 U.S.A.
Signature(s) of person(s) having the power to represent the International Depositary Authority or of authorized official(s):
Date:
1 Where Rule 6.4(d) applies, such date is the date on which the status of international depositary authority was acquired.

Claims

What is claimed is:
1. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone.
2. An isolated nucleic acid molecule according to claim 1, wherein said nucleotide sequence is isolated from a myxobacterium.
3. An isolated nucleic acid molecule according to claim 2, wherein said myxobacterium is Sorangium cellulosum.
4. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 1.
5. A recombinant vector comprising a chimeric gene according to claim 4.
6. A recombinant host cell comprising a chimeric gene according to claim 4.
7. The recombinant host cell of claim 6, which is a bacteria.
8. The recombinant host cell of claim 7, which is an Actinomycete.
9. The recombinant host cell of claim 8, which is Streptomyces.
10. A Bac clone comprising a nucleic acid molecule according to claim 1.
11. The Bac clone of claim 10, which is pEPO15.
12. An isolated nucleic acid molecule according to claim 1, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
13. An isolated nucleic acid molecule according to claim 12, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
14. An isolated nucleic acid molecule according to claim 12, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
15. A nucleic acid molecule according to claim 12, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
16. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 12.
17. A recombinant vector comprising a chimeric gene according to claim 16.
18. A recombinant host cell comprising a chimeric gene according to claim 16.
19. The recombinant host cell of claim 18, which is a bacteria.
20. The recombinant host cell of claim 19, which is an Actinomycete.
21. The recombinant host cell of claim 20, which is Streptomyces.
22. An isolated nucleic acid molecule according to claim 1, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
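The criterion in claim 22, a consecutive 20 base pair portion identical to a consecutive 20 base pair portion of a listed region, amounts to asking whether two sequences share an exact 20-mer. A minimal sketch of such a check is given below. The function names and the toy sequences are hypothetical and stand in for SEQ ID NO:1, which is not reproduced here; the coordinate helper assumes the 1-based, inclusive numbering used in the claims.

```python
def region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive), as in 'nucleotides 7643-8920'."""
    return sequence[start - 1:end]


def kmers(sequence: str, k: int = 20) -> set:
    """All distinct windows of k consecutive nucleotides in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shares_identical_portion(query: str, reference: str, k: int = 20) -> bool:
    """True if query contains k consecutive nucleotides identical to a portion of reference."""
    query, reference = query.upper(), reference.upper()
    if len(query) < k or len(reference) < k:
        return False
    reference_kmers = kmers(reference, k)
    return any(query[i:i + k] in reference_kmers
               for i in range(len(query) - k + 1))


if __name__ == "__main__":
    # Toy reference standing in for a claimed region (not real sequence data).
    toy_reference = "ACGT" * 50
    toy_region = region(toy_reference, 11, 60)
    # A query that embeds exactly 20 consecutive nucleotides of the toy region.
    matching_query = "TTTTT" + toy_region[:20] + "GGGGG"
    unrelated_query = "A" * 30
    print(shares_identical_portion(matching_query, toy_region))
    print(shares_identical_portion(unrelated_query, toy_region))
```

The sketch compares one strand only. Handling the region recited as "the complement of nucleotides 1900-3171" would additionally require complementing the reference before the comparison, which is left out here.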
23. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 22.
24. A recombinant vector comprising a chimeric gene according to claim 23.
25. A recombinant host cell comprising a chimeric gene according to claim 23.
26. The recombinant host cell of claim 25, which is a bacteria.
27. The recombinant host cell of claim 26, which is an Actinomycete.
28. The recombinant host cell of claim 27, which is Streptomyces.
29. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one epothilone synthase domain.
30. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a β-ketoacyl-synthase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
31. An isolated nucleic acid molecule according to claim 30, wherein said β-ketoacyl-synthase domain comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
32. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
33. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
34. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
35. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an acyltransferase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
36. An isolated nucleic acid molecule according to claim 35, wherein said acyltransferase domain comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
37. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
38. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
39. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
40. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an enoyl reductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
41. An isolated nucleic acid molecule according to claim 40, wherein said enoyl reductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
42. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
43. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
44. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
45. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an acyl carrier protein domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
46. An isolated nucleic acid molecule according to claim 45, wherein said acyl carrier protein domain comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
47. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
48. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
49. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
50. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a dehydratase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
51. An isolated nucleic acid molecule according to claim 50, wherein said dehydratase domain comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
52. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
53. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
54. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
55. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a β-ketoreductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
56. An isolated nucleic acid molecule according to claim 55, wherein said β-ketoreductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
57. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
58. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
59. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
60. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a methyltransferase domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
61. An isolated nucleic acid molecule according to claim 60, wherein said methyltransferase domain comprises amino acids 2671-3045 of SEQ ID NO:6.
62. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1.
63. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of nucleotides 51534-52657 of SEQ ID NO:1.
64. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence is nucleotides 51534-52657 of SEQ ID NO:1.
65. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a thioesterase domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
66. An isolated nucleic acid molecule according to claim 65, wherein said thioesterase domain comprises amino acids 2165-2439 of SEQ ID NO:7.
67. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1.
68. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of nucleotides 61427-62254 of SEQ ID NO:1.
69. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence is nucleotides 61427-62254 of SEQ ID NO:1.
70. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-ribosomal peptide synthetase, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3.
71. An isolated nucleic acid molecule according to claim 70, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3.
72. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
73. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
74. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
75. A method for heterologous expression of epothilone in a recombinant host, comprising:
(a) introducing a chimeric gene according to claim 4 into a host; and
(b) growing the host in conditions that allow biosynthesis of epothilone in the host.
76. A method for producing epothilone, comprising:
(a) expressing epothilone in a recombinant host by the method of claim 75; and
(b) extracting epothilone from the recombinant host.
77. An isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
78. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a β-ketoacyl-synthase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
79. An isolated polypeptide according to claim 78, wherein said β-ketoacyl-synthase domain comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
80. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an acyltransferase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
81. An isolated polypeptide according to claim 80, wherein said acyltransferase domain comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
82. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an enoyl reductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
83. An isolated polypeptide according to claim 82, wherein said enoyl reductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
84. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an acyl carrier protein domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
85. An isolated polypeptide according to claim 84, wherein said acyl carrier protein domain comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
86. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a dehydratase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
87. An isolated polypeptide according to claim 86, wherein said dehydratase domain comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
88. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a β-ketoreductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
89. An isolated polypeptide according to claim 88, wherein said β-ketoreductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
90. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a methyltransferase domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
91. An isolated polypeptide according to claim 90, wherein said methyltransferase domain comprises amino acids 2671-3045 of SEQ ID NO:6.
92. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a thioesterase domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
93. An isolated polypeptide according to claim 77, wherein said thioesterase domain comprises amino acids 2165-2439 of SEQ ID NO:7.
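Several of the nucleic acid claims above (for example claims 33, 38, 43, 48, 53, 58, 63, 68 and 73) recite a nucleotide sequence that "comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion" of a stated region of SEQ ID NO:1. As an illustration only, that criterion can be read as a shared 20-mer test. The sketch below is not part of the patent: the function names `region` and `shares_identical_portion` are hypothetical, coordinates are assumed to be 1-based and inclusive, only the given strand is compared (no reverse complement), and the sequences in any real use would have to come from the sequence listing, which is not reproduced here.

```python
def region(sequence: str, start: int, end: int) -> str:
    """Return the sub-sequence for a coordinate range such as
    'nucleotides 61427-62254', assuming 1-based inclusive coordinates."""
    return sequence[start - 1:end]


def shares_identical_portion(query: str, reference: str, k: int = 20) -> bool:
    """Return True if some run of k consecutive bases in `query` is
    identical to some run of k consecutive bases in `reference`.

    Illustrative reading of the '20 base pair portion' language; it only
    compares the strands as given and ignores the reverse complement.
    """
    query = query.upper()
    reference = reference.upper()
    if len(query) < k or len(reference) < k:
        return False
    # Index every k-mer of the reference once, then scan the query.
    reference_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in reference_kmers
               for i in range(len(query) - k + 1))
```

With a real sequence in hand, a call such as `shares_identical_portion(candidate, region(seq_id_no_1, 61427, 62254))` would apply the test to the thioesterase-encoding region named in claim 68; `candidate` and `seq_id_no_1` are placeholder variable names.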
PCT/EP1999/004171 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones WO1999066028A2 (en)

Priority Applications (14)

Application Number Priority Date Filing Date Title
NZ508326A NZ508326A (en) 1998-06-18 1998-06-12 A polyketide synthase and non ribosomal peptide synthase genes, isolated from a myxobacterium, necessary for synthesis of epothiones A and B
EP99929243A EP1088078A2 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
AU46116/99A AU753567B2 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
JP2000554837A JP2002518004A (en) 1998-06-18 1999-06-16 Epothilone biosynthesis gene
BR9911349-0A BR9911349A (en) 1998-06-18 1999-06-16 Genes for epothilone biosynthesis
PL345579A PL200157B1 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
SK1924-2000A SK19242000A3 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
HU0102186A HUP0102186A3 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
CA002329774A CA2329774A1 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
IL13973599A IL139735A0 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones
IL139735A IL139735A (en) 1998-06-18 2000-11-16 Genes for the biosynthesis of epothilones
NO20006195A NO20006195L (en) 1998-06-18 2000-12-06 Genes for the biosynthesis of epothilones
IL190391A IL190391A0 (en) 1998-06-18 2008-03-24 Genes for the biosynthesis of epothilones
NO20091055A NO20091055L (en) 1998-06-18 2009-03-09 Genes for the biosynthesis of epothilones

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US9950498A 1998-06-18 1998-06-18
US09/099,504 1998-06-18
US10163198P 1998-09-24 1998-09-24
US60/101,631 1998-09-24
US11890699P 1999-02-05 1999-02-05
US60/118,906 1999-02-05

Publications (2)

Publication Number Publication Date
WO1999066028A2 true WO1999066028A2 (en) 1999-12-23
WO1999066028A3 WO1999066028A3 (en) 2000-06-29

Family

ID=27378840

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/004171 WO1999066028A2 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones

Country Status (16)

Country Link
EP (1) EP1088078A2 (en)
JP (3) JP2002518004A (en)
KR (1) KR100511233B1 (en)
CN (1) CN100374565C (en)
AU (1) AU753567B2 (en)
BR (1) BR9911349A (en)
CA (1) CA2329774A1 (en)
HU (1) HUP0102186A3 (en)
ID (1) ID29128A (en)
IL (3) IL139735A0 (en)
NO (2) NO20006195L (en)
NZ (1) NZ508326A (en)
PL (1) PL200157B1 (en)
SK (1) SK19242000A3 (en)
TR (1) TR200003759T2 (en)
WO (1) WO1999066028A2 (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000022139A2 (en) * 1998-10-09 2000-04-20 Bristol-Myers Squibb Company Dna sequences for enzymatic synthesis of polyketide or heteropolyketide compounds
WO2000031247A3 (en) * 1998-11-20 2000-12-07 Kosan Biosciences Inc Recombinant methods and materials for producing epothilone and epothilone derivatives
WO2001053533A2 (en) * 2000-01-21 2001-07-26 Kosan Biosciences, Inc. Method for cloning polyketide synthase genes
WO2001083800A2 (en) * 2000-04-28 2001-11-08 Kosan Biosciences, Inc. Heterologous production of polyketides
US6410301B1 (en) 1998-11-20 2002-06-25 Kosan Biosciences, Inc. Myxococcus host cells for the production of epothilones
WO2003078411A1 (en) 2002-03-12 2003-09-25 Bristol-Myers Squibb Company C3-cyano epothilone derivatives
US6989450B2 (en) 2000-10-13 2006-01-24 The University Of Mississippi Synthesis of epothilones and related analogs
US6998256B2 (en) 2000-04-28 2006-02-14 Kosan Biosciences, Inc. Methods of obtaining epothilone D using crystallization and /or by the culture of cells in the presence of methyl oleate
US7091226B2 (en) 1998-02-25 2006-08-15 Novartis Ag Cancer treatment with epothilones
US7257562B2 (en) 2000-10-13 2007-08-14 Thallion Pharmaceuticals Inc. High throughput method for discovery of gene clusters
WO2012103516A1 (en) 2011-01-28 2012-08-02 Amyris, Inc. Gel-encapsulated microcolony screening
WO2012158466A1 (en) 2011-05-13 2012-11-22 Amyris, Inc. Methods and compositions for detecting microbial production of water-immiscible compounds
WO2014025941A1 (en) 2012-08-07 2014-02-13 Jiang Hanxiao Methods for stabilizing production of acetyl-coenzyme a derived compounds
WO2014144135A2 (en) 2013-03-15 2014-09-18 Amyris, Inc. Use of phosphoketolase and phosphotransacetylase for production of acetyl-coenzyme a derived compounds
WO2015020649A1 (en) 2013-08-07 2015-02-12 Amyris, Inc. Methods for stabilizing production of acetyl-coenzyme a derived compounds
WO2016210350A1 (en) 2015-06-25 2016-12-29 Amyris, Inc. Maltose dependent degrons, maltose-responsive promoters, stabilization constructs, and their use in production of non-catabolic compounds
CN106916834A (en) * 2015-12-24 2017-07-04 武汉臻智生物科技有限公司 The biological synthesis gene cluster of compound and its application
CN111138444A (en) * 2020-01-08 2020-05-12 山东大学 Epothilone B glucoside compounds and enzymatic preparation and application thereof

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0977563B1 (en) 1996-12-03 2005-10-12 Sloan-Kettering Institute For Cancer Research Synthesis of epothilones, intermediates thereto, analogues and uses thereof
EP1856262B1 (en) * 2005-01-31 2012-08-15 Merck Sharp & Dohme Corp. Upstream and a downstream purification process for large scale production of plasmid dna

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998022461A1 (en) * 1996-11-18 1998-05-28 GESELLSCHAFT FüR BIOTECHNOLOGISCHE FORSCHUNG MBH (GBF) Epothilone c, d, e and f, production process, and their use as cytostatic as well as phytosanitary agents

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MOLNAR I. ET AL.: "The biosynthetic gene cluster for the microtubule-stabilizing agents epothilones A and B from Sorangium cellulosum So ce90" CHEM. BIOL., vol. 7, 2000, pages 97-109, XP000904734 *
SCHUPP T. ET AL.: "A Sorangium cellulosum (myxobacterium) gene cluster for the biosynthesis of the macrolide antibiotic soraphen A: cloning, characterization and homology to polyketide synthase genes from actinomycetes" J. BACTERIOL., vol. 177, no. 13, 1995, pages 3673-3679, XP000893003 *
TANG LI, ET AL.: "Cloning and heterologous expression of the epothilone gene cluster." SCIENCE, vol. 287, 28 January 2000 (2000-01-28), pages 640-642, XP002135841 *

Cited By (40)

Publication number Priority date Publication date Assignee Title
US7091226B2 (en) 1998-02-25 2006-08-15 Novartis Ag Cancer treatment with epothilones
WO2000022139A2 (en) * 1998-10-09 2000-04-20 Bristol-Myers Squibb Company Dna sequences for enzymatic synthesis of polyketide or heteropolyketide compounds
WO2000022139A3 (en) * 1998-10-09 2001-01-18 Biotechnolog Forschung Gmbh Dna sequences for enzymatic synthesis of polyketide or heteropolyketide compounds
US7129071B1 (en) 1998-11-20 2006-10-31 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilone and epothilone derivatives
KR100716272B1 (en) * 1998-11-20 2007-05-09 코산 바이오사이언시즈, 인코포레이티드 Recombinant methods and materials for producing epothilone and epothilone derivatives
US7732186B2 (en) 1998-11-20 2010-06-08 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilone and epothilone derivatives
JP2007097595A (en) * 1998-11-20 2007-04-19 Kosan Biosciences Inc Recombinant method and material for producing epothilone and epothilone derivatives
KR100851418B1 (en) * 1998-11-20 2008-08-08 코산 바이오사이언시즈, 인코포레이티드 Recombinant methods and materials for producing epothilone and epothilone derivatives
JP2002530107A (en) * 1998-11-20 2002-09-17 コーサン バイオサイエンシーズ, インコーポレイテッド Recombinant methods and materials for producing epothilone and epothilone derivatives
US7402421B2 (en) 1998-11-20 2008-07-22 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilone and epothilone derivatives
US6583290B1 (en) 1998-11-20 2003-06-24 Kosan Biosciences, Inc. 14-methyl epothilone derivatives
US6303342B1 (en) 1998-11-20 2001-10-16 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilones C and D
US6858411B1 (en) 1998-11-20 2005-02-22 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilone and epothilone derivatives
US6921650B1 (en) 1998-11-20 2005-07-26 Kosan Biosciences, Inc. Recombinant methods and materials for producing epothilone and epothilone derivatives
US6410301B1 (en) 1998-11-20 2002-06-25 Kosan Biosciences, Inc. Myxococcus host cells for the production of epothilones
WO2000031247A3 (en) * 1998-11-20 2000-12-07 Kosan Biosciences Inc Recombinant methods and materials for producing epothilone and epothilone derivatives
US7067286B2 (en) 1998-11-20 2006-06-27 Kosan Biosciences, Inc. Cystobacterineae host cells containing heterologous PKS genes for the synthesis of polyketides
WO2001053533A2 (en) * 2000-01-21 2001-07-26 Kosan Biosciences, Inc. Method for cloning polyketide synthase genes
WO2001053533A3 (en) * 2000-01-21 2002-07-25 Kosan Biosciences Inc Method for cloning polyketide synthase genes
US7323573B2 (en) 2000-04-28 2008-01-29 Kosan Biosciences, Inc. Production of polyketides
US6998256B2 (en) 2000-04-28 2006-02-14 Kosan Biosciences, Inc. Methods of obtaining epothilone D using crystallization and/or by the culture of cells in the presence of methyl oleate
EP1652926A2 (en) * 2000-04-28 2006-05-03 Kosan Biosciences, Inc. Heterologous production of polyketides
KR100832145B1 (en) * 2000-04-28 2008-05-27 코산 바이오사이언시즈, 인코포레이티드 Production of polyketides
WO2001083800A3 (en) * 2000-04-28 2003-04-10 Kosan Biosciences Inc Heterologous production of polyketides
EP1652926A3 (en) * 2000-04-28 2006-08-09 Kosan Biosciences, Inc. Heterologous production of polyketides
WO2001083800A2 (en) * 2000-04-28 2001-11-08 Kosan Biosciences, Inc. Heterologous production of polyketides
US6989450B2 (en) 2000-10-13 2006-01-24 The University Of Mississippi Synthesis of epothilones and related analogs
US7257562B2 (en) 2000-10-13 2007-08-14 Thallion Pharmaceuticals Inc. High throughput method for discovery of gene clusters
WO2003078411A1 (en) 2002-03-12 2003-09-25 Bristol-Myers Squibb Company C3-cyano epothilone derivatives
WO2012103516A1 (en) 2011-01-28 2012-08-02 Amyris, Inc. Gel-encapsulated microcolony screening
WO2012158466A1 (en) 2011-05-13 2012-11-22 Amyris, Inc. Methods and compositions for detecting microbial production of water-immiscible compounds
WO2014025941A1 (en) 2012-08-07 2014-02-13 Jiang Hanxiao Methods for stabilizing production of acetyl-coenzyme a derived compounds
WO2014144135A2 (en) 2013-03-15 2014-09-18 Amyris, Inc. Use of phosphoketolase and phosphotransacetylase for production of acetyl-coenzyme a derived compounds
WO2015020649A1 (en) 2013-08-07 2015-02-12 Amyris, Inc. Methods for stabilizing production of acetyl-coenzyme a derived compounds
EP3663392A1 (en) 2013-08-07 2020-06-10 Amyris, Inc. Methods for stabilizing production of acetyl-coenzyme a derived compounds
WO2016210350A1 (en) 2015-06-25 2016-12-29 Amyris, Inc. Maltose dependent degrons, maltose-responsive promoters, stabilization constructs, and their use in production of non-catabolic compounds
WO2016210343A1 (en) 2015-06-25 2016-12-29 Amyris, Inc. Maltose dependent degrons, maltose-responsive promoters, stabilization constructs, and their use in production of non-catabolic compounds
CN106916834A (en) * 2015-12-24 2017-07-04 武汉臻智生物科技有限公司 The biological synthesis gene cluster of compound and its application
CN106916834B (en) * 2015-12-24 2022-08-05 武汉合生科技有限公司 Biosynthetic gene cluster of compounds and application thereof
CN111138444A (en) * 2020-01-08 2020-05-12 山东大学 Epothilone B glucoside compounds and enzymatic preparation and application thereof

Also Published As

Publication number Publication date
JP2002518004A (en) 2002-06-25
CN100374565C (en) 2008-03-12
JP2008092958A (en) 2008-04-24
IL190391A0 (en) 2008-11-03
NO20006195D0 (en) 2000-12-06
KR20010052962A (en) 2001-06-25
WO1999066028A3 (en) 2000-06-29
AU4611699A (en) 2000-01-05
JP2006061166A (en) 2006-03-09
KR100511233B1 (en) 2005-08-31
IL139735A (en) 2009-06-15
CN1305530A (en) 2001-07-25
EP1088078A2 (en) 2001-04-04
NZ508326A (en) 2003-10-31
PL345579A1 (en) 2001-12-17
HUP0102186A3 (en) 2005-10-28
IL139735A0 (en) 2002-02-10
SK19242000A3 (en) 2001-07-10
TR200003759T2 (en) 2001-06-21
HUP0102186A2 (en) 2001-10-28
ID29128A (en) 2001-08-02
AU753567B2 (en) 2002-10-24
NO20091055L (en) 2001-02-16
NO20006195L (en) 2001-02-16
CA2329774A1 (en) 1999-12-23
BR9911349A (en) 2001-03-13
PL200157B1 (en) 2008-12-31

Similar Documents

Publication Publication Date Title
US6858404B2 (en) Genes for the biosynthesis of epothilones
JP2006061166A (en) Gene for biosynthesis of epothilone
JP4662635B2 (en) Recombinant methods and materials for producing epothilone and epothilone derivatives
US6410301B1 (en) Myxococcus host cells for the production of epothilones
AU2001295195B2 (en) Myxococcus host cells for the production of epothilones
AU2001295195A1 (en) Myxococcus host cells for the production of epothilones
TWI770070B (en) Modified streptomyces fungicidicus isolates and their use
RU2265054C2 (en) Recombinant cell-host (variants) and bac clone
RU2234532C2 (en) Nucleic acid (variants), its use for expression of epothilones, polypeptide (variants), Escherichia coli microorganism clone
CN100359014C (en) Novel epothilones compound and its preparation method and application
CN100374566C (en) Genes for the biosynthesis of epothilones
KR20130097538A (en) Chejuenolide biosynthetic gene cluster from hahella chejuensis
MXPA00012342A (en) Genes for the biosynthesis of epothilones
CZ20004693A3 (en) Isolated nucleic acid encoding polypeptide participating in biosynthesis of epothilone, chimeric gene, vector and host cells containing such nucleic acid
AU2007200160A1 (en) Heterologous production of polyketides

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 1200100075

Country of ref document: VN

Ref document number: 99807421.7

Country of ref document: CN

AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 1999929243

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 139735

Country of ref document: IL

WWE Wipo information: entry into national phase

Ref document number: 508326

Country of ref document: NZ

WWE Wipo information: entry into national phase

Ref document number: 46116/99

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 2000/07145

Country of ref document: ZA

Ref document number: 200007145

Country of ref document: ZA

ENP Entry into the national phase

Ref document number: 2329774

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: PA/a/2000/012342

Country of ref document: MX

WWE Wipo information: entry into national phase

Ref document number: PV2000-4693

Country of ref document: CZ

WWE Wipo information: entry into national phase

Ref document number: 19242000

Country of ref document: SK

Ref document number: IN/PCT/2000/830/CHE

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 1020007014348

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2000/03759

Country of ref document: TR

WWP Wipo information: published in national office

Ref document number: PV2000-4693

Country of ref document: CZ

WWP Wipo information: published in national office

Ref document number: 1999929243

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1020007014348

Country of ref document: KR

NENP Non-entry into the national phase

Ref country code: CA

WWG Wipo information: grant in national office

Ref document number: 46116/99

Country of ref document: AU

WWR Wipo information: refused in national office

Ref document number: 1020007014348

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 190391

Country of ref document: IL