WO1996006188A1 - Peptide libraries as a source of syngenes
- Publication number
- WO1996006188A1 (application PCT/US1995/010523)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- binding
- protein
- nucleic acid
- dna
- ligand
Classifications
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/02—Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P43/00—Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4702—Regulators; Modulating activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1037—Screening libraries presented on the surface of microorganisms, e.g. phage display, E. coli display
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/02—Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/61—Fusion polypeptide containing an enzyme fusion for detection (lacZ, luciferase)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/70—Fusion polypeptide containing domain for protein-protein interaction
- C07K2319/73—Fusion polypeptide containing domain for protein-protein interaction containing coiled-coiled motif (leucine zippers)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/70—Fusion polypeptide containing domain for protein-protein interaction
- C07K2319/735—Fusion polypeptide containing domain for protein-protein interaction containing a domain for self-assembly, e.g. a viral coat protein (includes phage display)
Definitions
- the present invention relates generally to synthetic gene sequences ("syngenes"), particularly for use in gene therapy.
- Syngenes are nucleic acids that comprise synthetic gene sequences identified by screening synthetic random peptide libraries for peptides that bind a ligand of choice.
- the synthetic gene sequences together with, optionally, other DNA sequences that target the synthetic gene sequences or their encoded proteins to particular locations in vivo or intracellularly, or that contain processing
- the syngenes are used, for example, in gene therapy to supply, via expression of their encoded proteins, a therapeutic product.
- the invention relates to protein or peptide products of syngenes and their therapeutic and diagnostic uses.
- Gene therapy today is often directed to the replacement of a product of a defective gene.
- a human adenosine deaminase gene has been engineered into skin fibroblasts of mice through the use of a retroviral vector (Palmer et al., 1988,
- the human β-globin gene has been transferred into mouse bone marrow cells and mouse erythroleukemia cells as a model for the treatment of β-thalassemia and sickle cell anemia (Novak et al., 1990, Proc. Natl. Acad.
- examples of replacing a defective gene product include: the delivery of the human cystic fibrosis transmembrane conductance regulator via a retrovirus into airway epithelia of mice (Hyde et al., 1993, Nature 362:250-255); in a rabbit model, the transfer of human low density lipoprotein receptor cDNA into hepatocytes (Wilson et al., 1992, J. Biol. Chem. 267:963-967); the transfer of human Factor IX cDNA into skin fibroblasts,
- mice followed by engraftment of the fibroblasts into mice (Louis and Verma, 1988, Proc. Natl. Acad. Sci. USA 85:3150-3154).
- gene therapy can also be used to provide a cell or an organism with a new function.
- examples of such a use include: transferring the human growth hormone gene via a retrovirus into human keratinocytes and then grafting the keratinocytes into nude mice (Morgan et al., 1987, Science 237:1476-1479);
- Another aspect of gene therapy involves the inhibition or enhancement of the activity of a
- the preselected gene is one that is involved in some aspect of a diseased state.
- the ability to modulate the activity of such a gene would be useful in the treatment of the diseased state.
- gene therapy is limited by the
- a further problem is the possibility that no human gene may possess the desired function. Even if a human gene is found that has the desired function, that gene may not have the desired specificity; it may be too large to be easily used in current gene therapy protocols; it may be encoded by multiple exons; and fragments of the gene may not be suitable for use in gene therapy because the gene's binding regions may be too complex for fragments of the endogenous protein to mimic its function. In order to avoid these problems, it would be highly desirable to have a method of producing genes for use in gene therapy that does not rely on the human genome as a source of those genes.
- peptide libraries have generally been constructed by one of two approaches. According to one approach, peptides have been chemically
- peptides have been expressed in biological systems as either soluble fusion proteins or viral capsid fusion proteins.
- M13 is a filamentous bacteriophage that has been a workhorse in molecular biology laboratories for the past 20 years.
- M13 viral particles consist of six different capsid proteins and one copy of the viral genome, as a single-stranded circular DNA molecule. Once the M13 DNA has been introduced into a host cell such as E. coli, it is converted into double-stranded, circular DNA. The viral DNA carries a second origin of replication that is used to generate the single-stranded DNA found in the viral particles.
- the M13 virus is neither lysogenic nor lytic like other bacteriophage (e.g., λ); cells, once infected, chronically release virus. This feature leads to high titers of virus in infected cultures, i.e., 10^12 pfu/ml.
- the genome of the M13 phage is ~8000
- the viral capsid protein, protein III (pIII) is responsible for infection of bacteria.
- the pilin protein encoded by the F factor interacts with pIII protein and is responsible for phage uptake.
- all E. coli hosts for M13 virus are considered male because they carry the F factor.
- Several investigators have determined from mutational analysis that the 406 amino acid long pIII capsid protein has two domains. The C-terminus anchors the protein to the viral coat, while portions of the N-terminus of pIII are essential for interaction with the E. coli pilin protein (Crissman, J.W. and Smith, G.P., 1984, Virology 132: 445-455).
- Although the N-terminus of the pIII protein has been shown to be necessary for viral infection, the extreme N-terminus of the mature protein does tolerate alterations.
- George Smith published experiments reporting the use of the pIII protein of bacteriophage M13 as an experimental system for expressing a heterologous protein on the viral coat surface (Smith, G.P., 1985, Science 228: 1315-1317). It was later recognized, independently by two groups, that the M13 phage pIII gene display system could be a useful one for mapping antibody epitopes. De la Cruz, V., et al., (1988, J. Biol. Chem. 263: 4318-4322) cloned and expressed segments of the cDNA encoding the Plasmodium
- biopanning in which mixtures of recombinant phage were incubated with biotinylated monoclonal antibodies, and phage-antibody complexes could be specifically recovered with streptavidin-coated
- Biol. 227:711-718 described a phage display library, expressing decapeptides.
- the starting DNA was
- oligonucleotide comprising the degenerate codons [NN(G/T)] 10 (SEQ ID NO: 2) with a self-complementary 3' terminus.
- This sequence in forming a hairpin, creates a self-priming replication site which could be used by T4 DNA polymerase to generate the complementary strand.
- the double-stranded DNA was cleaved at the SfiI sites at the 5' terminus and hairpin for cloning into the fUSE5 vector described by Scott and Smith, supra.
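The degenerate codon NN(G/T) mentioned above (often written NNK) can be checked against the standard genetic code. The following is a minimal sketch, assuming the standard code and equal treatment of all bases; the function name `summarize_scheme` is illustrative and does not come from the patent.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def summarize_scheme(first, second, third):
    """Count codons, distinct amino acids, and stop codons for a degenerate codon.

    Each argument lists the bases allowed at that codon position.
    """
    codons = ["".join(c) for c in product(first, second, third)]
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops


# NN(G/T): any base at positions 1 and 2, G or T at position 3.
n_codons, n_amino_acids, stop_codons = summarize_scheme("ACGT", "ACGT", "GT")
print(n_codons, n_amino_acids, stop_codons)
```

Restricting the third position to G or T halves the number of codons while still covering every amino acid, and it leaves TAG as the only stop codon that can appear.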
- the protein pVIII is a major M13 viral capsid protein and interacts with the single stranded DNA of M13 viral particles at its C-terminus. It is 50 amino acids long and exists in approximately 2,700 copies per particle.
- N-terminus of the protein is exposed and will tolerate insertions, although large inserts have been reported to disrupt the assembly of pVIII fusion proteins into viral particles (Cesareni, 1992, FEBS Lett. 307:66-70).
- a phagemid system has been utilized.
- Bacterial cells carrying the phagemid are infected with helper phage and secrete viral particles that have a mixture of both wild-type and pVIII fusion capsid molecules.
- pVIII has also served as a site for expressing peptides on the surface of M13 viral particles.
- Segments of the Plasmodium falciparum major surface antigen have been cloned and expressed in the comparable gene of the filamentous bacteriophage fd (Greenwood et al., 1991, J. Mol.
- Ladner et al. disclose the use of oligonucleotide ligands to screen a phage display library in which known, naturally occurring DNA binding proteins have been cloned. Following mutagenesis of certain defined positions in the sequences of the naturally occurring DNA binding proteins, stronger binding versions of those proteins were isolated.
- Another example of the screening of a non-random phage display library with an oligonucleotide is given in Rebar and Pabo, 1994, Science 263:671-673. This work resulted in the isolation of variants of known zinc finger DNA binding proteins.
- Rebar and Pabo do not disclose the use of random peptide libraries that are totally synthetic.
- oligonucleotides have not been used to screen a totally synthetic, random, phage display peptide library.
- a common goal of screening peptide libraries has generally been to find a peptide with a desired biological effect and (1) use that peptide directly; (2) use the peptide as a basis for designing
- The use of syngenes is especially advantageous in cases where use of a long peptide (>20 amino acids) is necessary. In such cases, making an appropriate peptidomimetic is difficult or impossible.
- Ladner does not, however, disclose the use of totally synthetic random peptide libraries.
- Ladner stresses the use of peptide display libraries that are derived from naturally occurring sequences (in this instance, sequences encoding known binding proteins) by in vitro mutagenesis of these sequences.
- the present invention relates generally to synthetic gene sequences ("syngenes"), particularly for use in gene therapy.
- Syngenes are nucleic acids that comprise synthetic gene sequences identified by screening random peptide libraries for peptides that bind a ligand of choice.
- the synthetic gene sequences are cloned into suitable expression vectors together with, optionally, other DNA sequences that target the synthetic gene sequences or their encoded proteins to particular locations in vivo or intracellularly, or that contain processing signals, or that code for other peptides or amino acid sequences.
- the syngenes are used, for example, in gene therapy to supply a therapeutic product via expression of their encoded proteins.
- the invention relates to protein or peptide products of syngenes and their therapeutic and diagnostic uses.
- compositions comprising syngenes or their encoded peptides are also provided.
- the invention provides methods of identifying syngenes that bind to a
- the ligand of choice is a transcriptional regulatory site, a transcription factor that binds to a transcriptional regulatory site, or a binding partner/inhibitor of the transcription factor. In one aspect, such a
- the transcriptional regulatory site is an NF-κB, AP-1, or ATF binding site.
- the transcription factor is NF-κB, AP-1, or ATF.
- the present invention provides methods for identifying syngenes that inhibit or enhance the transcriptional activity of a wide variety of naturally occurring genes, preferably with a specificity not found in natural systems. Syngenes are also useful for modulating signal transduction pathways, metabolic pathways, RNA translation, and intracellular trafficking. In the cell membrane, syngenes may be used to modulate the activity of membrane receptors, ion channels, or exocytotic and endocytotic pathways. In tissue, syngenes, via expression of their encoded proteins, may be used to regulate cell/cell signalling and transcytosis.
- Syngenes may be used to regulate cell adhesion or cell/cell
- Figure 1 (A-F) schematically illustrates construction of TSAR libraries.
- Figure 1A schematically depicts the synthesis and assembly of synthetic oligonucleotides for the linear libraries and bimolecular libraries illustrated in Figures 1B and 1C.
- N = A, C, G or T;
- B = G, T or C; and
- V = G, A, or C;
- n and m are integers, such that 10 ≤ n ≤ 100 and 10 ≤ m ≤ 100.
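The symbols N, B and V defined above can be expanded mechanically. The sketch below assumes that an oligonucleotide built from VNN triplets is read on the complementary strand, which this excerpt does not state explicitly. Under that assumption, the reverse complement of every VNN triplet is an NNB codon. The helper names are illustrative only.

```python
from itertools import product

# Degenerate symbols as defined in the figure legend, plus the four bases.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # N = A, C, G or T
    "B": "CGT",   # B = G, T or C
    "V": "ACG",   # V = G, A or C
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expand(pattern):
    """Return every concrete sequence matched by a degenerate pattern."""
    return {"".join(p) for p in product(*(SYMBOLS[ch] for ch in pattern))}


def reverse_complement(seq):
    """Return the reverse complement of a concrete DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


nnb = expand("NNB")
vnn_on_other_strand = {reverse_complement(s) for s in expand("VNN")}

print(len(nnb))                    # number of distinct NNB codons
print(vnn_on_other_strand == nnb)  # do the two schemes encode the same codon set?
```

Because B excludes A at the third position, the stop codons TAA and TGA cannot occur in an NNB codon; TAG remains possible.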
- Figure 1D-F schematically depicts representative libraries which are designed to be semirigid libraries. The synthesis and assembly of the oligonucleotides for the semirigid libraries are as in Figure 1A with modifications to include
- Figure 2 schematically depicts an exemplary mRNA expressed by a syngene.
- Figure 3 is a schematic depiction of a shuttle vector which can be used in one embodiment of the invention.
- (1) and (2) allow replication in E. coli and mammalian cells, respectively.
- (3) allows selection in E. coli.
- (4) and (5) allow transcription and mRNA processing, respectively, in mammalian cells.
- (6) is a syngene that preferably encodes amino-terminal spacer sequences, binding domain, targeting/localization signal, and, optionally, a second functional domain.
- H2κB, IL6κB, IL8κB, and NFIL6 refer to oligonucleotide ligands as described in Section 6.1.1. Phage clones were tested for binding to plates coated with the oligonucleotide ligands. Phage clones were also tested for binding to plates coated with BSA. The height of the bars above the numbers along the abscissa represents the ratio of a phage clone's binding to an oligonucleotide ligand-coated plate compared to the phage clone's binding to the BSA-coated plate.
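The bar heights described in this legend are simple ratios of two plate readings. A minimal sketch follows, assuming each clone yields one signal on the ligand-coated plate and one on the BSA-coated plate. The clone names and readings are hypothetical placeholders, not data from the patent.

```python
def binding_ratio(ligand_signal, bsa_signal):
    """Ratio of a clone's signal on the ligand-coated plate to its BSA signal."""
    if bsa_signal <= 0:
        raise ValueError("BSA background signal must be positive")
    return ligand_signal / bsa_signal


# Hypothetical readings: (ligand-coated plate, BSA-coated plate).
readings = {
    "clone_1": (1.20, 0.10),
    "clone_2": (0.15, 0.12),
}

ratios = {name: binding_ratio(*pair) for name, pair in readings.items()}
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}")
```

A ratio near 1 means the clone binds the ligand-coated plate no better than the BSA control, while a larger ratio suggests ligand-specific binding.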
- Figure 5 schematically illustrates construction of the TSAR-9 library.
- N = A, C, G or T;
- B = G, T or C; and
- V = G, A or C.
- Figure 6 schematically illustrates the m663 expression vector.
- Figure 7 schematically illustrates construction of the TSAR-12 library.
- Figure 8 schematically illustrates construction of the TSAR-13 library.
- R26 expression library was constructed essentially as described for the TSAR-9 library that is described in Section 6.9.1 and its subsections, except for the modifications depicted in Figure 9.
- ctgtgcctcgagB (NNB) 12 Nccgcgg is SEQ ID NO: 17
- ctgtgctctaga (VNN) 12 VNccgcgg is SEQ ID NO: 18
- tcgagB (NNB) 12 Nccgcgg is SEQ ID NO: 19;
- VNN VNccgcgg is SEQ ID NO: 20;
- SHSS(S/R)X 12 ⁇ A ⁇ X 12 SRPSRT is SEQ ID NO: 21.
- Figure 10 represents circular restriction maps of phagemid vectors, derived from phagemid pBluescript II SK+, in which a truncated portion encoding amino acid residues 198-406 of the pIII gene of M13 is linked to a leader sequence of the E. coli Pel B gene and is expressed under control of a lac promoter.
- G and S represent the amino acids glycine and serine, respectively;
- c-myc represents the human c-myc oncogene epitope recognized by the 9E10 monoclonal antibody described in Evan et al., 1985,
- Figure 10A illustrates the restriction map of phagemid pDAF1
- Figure 10B illustrates the restriction map of phagemid pDAF2
- Figure 10C illustrates the restriction map of phagemid pDAF3
- Figure 10D schematically illustrates the construction of phagemids pDAF1, pDAF2 and pDAF3.
- Figure 11 schematically depicts the construction of the R8C library.
- the R8C expression library was constructed essentially as described for the TSAR-9 library that is described in Section 6.9.1 and its subsections, except for the modifications depicted in Figure 11.
- FIG. 12 schematically depicts the origin of one class of double-insert recombinants in the R8C library.
- TCGAGTTGT (NNK) 8 TGTGGATCTAGATCCACA (MNN) 8 AAAAC is SEQ ID NO: 27;
- TCGAGTTGT (NNK) 8 TGTGGATCTAGATCCACA (MNN) 8 ACAAC is SEQ ID NO: 28;
- SSCX 8 CGSRSTX 8 TTR is SEQ ID NO: 29.
- FIG. 13 schematically illustrates the construction of the DC43 TSAR library.
- the DC43 expression library was constructed essentially as described for the TSAR-9 library that is described in Section 6.9.1 and its subsections, except for the modifications depicted in Figure 13.
- Figure 14 shows the selectivity of binding of two H2κB binding phage (H2κB-1 and H2κB-2), the NFIL6 binding phage NFIL6-1, and the parental phage m663 for three different target sites: the H2κB oligonucleotide, the NFIL6 oligonucleotide, and the IL6κB oligonucleotide.
- Figure 15 illustrates the molecular evolution scheme for phage H2κB-2 described in Section 6.1.5 that produced the ME#1 library.
- %p indicates the approximate frequency at which the original amino acid residue is expected to occur in the phage of the
- Figures 16A and 16B present the results of phage ELISAs that show the relative binding avidity for the H2κB oligonucleotide of clones isolated from the ME#1 library as compared to the H2κB-2 clone.
- Figures 17A and 17B illustrate the molecular evolution schemes described in Section 6.1.5 that produced the ME#2a and ME#2b libraries, respectively.
- the present invention relates to nucleic acids ("syngenes") that comprise a random synthetic nucleic acid sequence that encodes a protein, peptide, or polypeptide (used interchangeably hereinafter and most commonly collectively referred to as "peptide" unless indicated otherwise explicitly or by context) that binds to a ligand of choice, and their uses in gene therapy for myriad diseases and disorders of interest.
- the invention provides methods for
- compositions comprising syngenes as well as uses of the syngenes, e.g., in diagnosis and therapy of various disorders.
- Syngenes encode a syngene product which is a peptide having at least one functional domain.
- the functional domain is a binding domain with affinity for a ligand of choice.
- syngenes of the present invention are synthetic genes, that is, genes that are not known to be present in a naturally occurring genome.
- a syngene is a nucleic acid encoding a peptide comprising a binding domain in which the binding domain sequence is identified from a peptide library comprising at least 5 unpredictable contiguous amino acids in the variable portion of the library.
- syngenes are made up of, at least in part, combinations of genes encoding functional domains, such combinations not occurring in nature.
- Syngenes may be composed of totally synthetic gene sequences, combinations of natural and totally synthetic gene sequences, or combinations of natural DNA sequences juxtaposed so as not to form a gene present in nature.
- the present invention vastly expands the number of genes that are available for use in gene therapy.
- the present invention provides methods for finding synthetic nucleic acids encoding peptides with diagnostic or therapeutic value. The present invention further provides methods for delivery of such peptides in vivo by expression in vivo from the administered syngene, that potentially avoid the common problems of instability, clearance, etc. faced when dosing with proteins directly. More importantly, however, there are many instances where a protein that possesses a desired biological effect is unknown. For example, a known protein may not possess a desired level of binding specificity for a particular ligand. Syngenes encoding peptides with such novel
- the present invention provides methods for identifying synthetic peptides, and the nucleic acids encoding them, to fulfill roles that naturally
- Such a gene referred to as a syngene
- the syngene may be introduced into an appropriate host cell and thereby used for the recombinant production of its encoded protein.
- the present invention provides methods for identifying syngenes that inhibit or enhance the transcriptional activity of a wide variety of naturally occurring genes, preferably with a specificity not found in natural systems.
- Syngenes are also useful for modulating processes that occur outside of the nucleus.
- the encoded protein of a syngene, produced by intracellular expression of the syngene, may be used to affect the activity of
- inhibitors of transcription factors such as IF-κB. Syngenes may also be used to modulate signal
- syngenes may be used to modulate the activity of membrane receptors, ion channels, or exocytotic and endocytotic pathways.
- syngenes, via expression of their encoded proteins, may be used to regulate cell/cell signalling and transcytosis.
- Cell/cell junctions and the extracellular matrix are appropriate targets for syngene-expressed peptides.
- Syngenes may be used to regulate cell adhesion or cell/cell recognition.
- syngenes may be used to regulate the activity of receptor ligands.
- the invention provides the peptides comprising binding domains that are encoded by syngenes, as well as their therapeutic and diagnostic uses, and compositions comprising such peptides.
- IDENTIFYING SYNTHETIC GENE SEQUENCES ENCODING A BINDING DOMAIN Binding domains encoded by the syngenes of the invention can be identified from a random peptide expression library or a chemically synthesized random peptide library.
- a nucleic acid which expresses a peptide which binds to a ligand of choice can be identified and recovered from a random peptide
- amino acid sequence of an appropriate binding domain can be determined by direct determination of the amino acid sequence of a peptide selected from a random peptide library containing chemically synthesized peptides whereby an appropriate syngene encoding the peptide can be designed and made.
- direct amino acid sequencing of a binding peptide selected from a random peptide expression library can also be performed, and an encoding nucleic acid designed.
- methods can be used to identify portions of the determined synthetic amino acid or nucleotide sequences which respectively mediate binding, or encode the sequences which mediate binding, as
- the term "random peptide libraries" is meant to include within its scope libraries of both
- peptide sequences with a stretch of at least 5 unpredictable amino acids as well as invariant amino acid sequences are included within the scope of the random peptides.
- the syngenes encoding such peptides will have both codons of unpredictable sequence and codons of invariant sequence or (due to the degeneracy of the genetic code) degenerate codons thereof.
- the binding domains of syngenes are not cDNA or generated so as to be genomic sequences.
- the syngenes of the invention are not sequences generated by a method comprising mutagenesis (even random mutagenesis) of a cDNA sequence or portion thereof or genomic sequence or portion thereof coding for a peptide having a
- the binding domains encoded by syngenes are advantageously identified from random peptide libraries.
- random peptide libraries will be encoded by synthetic oligonucleotides with at least 15 contiguous variant nucleotide positions having the potential to encode all 20 naturally occurring amino acids.
- the sequence of amino acids encoded by the variant nucleotides is unpredictable and substantially random.
- the terms "unpredicted", "unpredictable" and "substantially random" are used interchangeably with respect to the amino acids encoded and are intended to mean that the variant nucleotides at any given position encoding the binding domain of the syngene product are such that it cannot be predicted which of the 20 naturally
- biological random peptide libraries envisioned for use include those in which a bias has been introduced into the random sequence, e.g., to disfavor stop codon usage.
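One way to see why a bias against stop codons matters is to estimate how often a random insert is free of stops. The sketch below assumes codons are drawn independently with equimolar bases, and compares a fully random NNN codon (3 stops among 64 codons) with an NN(G/T) codon (1 stop among 32). The function name and the insert lengths are illustrative choices, not values taken from the patent.

```python
from fractions import Fraction


def stop_free_probability(n_codons, stop_codons, total_codons):
    """Probability that n_codons independently drawn codons contain no stop."""
    if total_codons <= 0 or not 0 <= stop_codons <= total_codons:
        raise ValueError("invalid codon counts")
    per_codon = Fraction(total_codons - stop_codons, total_codons)
    return per_codon ** n_codons


# Compare the two schemes for a few illustrative insert lengths (in codons).
for n in (10, 20, 36):
    p_nnn = stop_free_probability(n, stop_codons=3, total_codons=64)
    p_nnk = stop_free_probability(n, stop_codons=1, total_codons=32)
    print(f"{n} codons: NNN {float(p_nnn):.3f}, NN(G/T) {float(p_nnk):.3f}")
```

The gap between the two schemes widens as the insert grows, which is one reason longer random inserts benefit from a codon scheme that disfavors stops.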
- a syngene of the invention encodes a peptide comprising a binding domain (which binds to a ligand of choice), in which the nucleotide sequence encoding the binding domain is a sequence identified by a method comprising screening a library of recombinant vectors, said vectors
- unpredictable nucleotides arranged in one or more contiguous sequences, wherein the total number of unpredictable nucleotides is greater than or equal to 15 and less than or equal to about 300. In other specific embodiments, the total number of such
- the peptide libraries used in the present invention may be libraries that are chemically
- the total number of unpredictable amino acids in the peptides of the library used for screening is greater than or equal to 5 and less than or equal to 25; in other embodiments the total is in the range of 5-15 or 5-10 amino acids, preferably contiguous amino acids.
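The nucleotide and amino acid counts quoted in this section are linked by the triplet code: three unpredictable nucleotides encode one unpredictable residue. The sketch below makes that arithmetic explicit and shows how quickly the theoretical sequence space grows. It assumes all 20 amino acids are possible at every position, and the function names are illustrative.

```python
AMINO_ACID_COUNT = 20  # naturally occurring amino acids referred to in the text


def residues_encoded(n_nucleotides):
    """Number of complete codons in a run of unpredictable nucleotides."""
    return n_nucleotides // 3


def sequence_space(n_residues):
    """Number of distinct peptides with n_residues unpredictable positions."""
    return AMINO_ACID_COUNT ** n_residues


for nucleotides in (15, 75, 300):
    residues = residues_encoded(nucleotides)
    print(nucleotides, residues, f"{sequence_space(residues):.2e}")
```

Beyond a handful of random residues, the theoretical sequence space far exceeds what any physical library can contain, so a library of long inserts samples that space rather than covering it.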
- Although a binding domain can be identified from chemically synthesized peptide libraries and an appropriate syngene synthesized to encode such a binding domain, such domains would be small (i.e., fewer than 10 amino acids, and most probably 5-6 amino acids, in length). Therefore, this approach is less preferred than the biological peptide libraries containing unpredictable sequences of greater length, described below.
- biological random peptide libraries are used to identify a binding domain which binds to a ligand of choice.
- Many suitable biological random peptide libraries are known in the art and can be used to screen for a peptide that binds to a ligand of choice, according to
- peptides have been expressed in biological systems as either soluble fusion proteins or viral capsid fusion proteins.
- peptide libraries according to this approach have used the M13 phage. Although the N-terminus of the viral capsid protein, protein III (pIII), has been shown to be necessary for viral infection, the extreme N-terminus of the mature protein does tolerate alterations such as insertions. Accordingly, various random peptide libraries, in which the diverse peptides are expressed as pIII fusion proteins, are known in the art; these libraries can be used to identify syngene-encoded binding domains by screening against a ligand of choice.
- oligonucleotides of about 17 or 23 degenerate bases with an 8 nucleotide long palindromic sequence at their 3' ends.
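A palindromic 3' end, in the DNA sense, is a stretch that equals its own reverse complement, which is what lets two such oligonucleotides (or one folded into a hairpin) pair at that end. The sketch below checks this property. The example sequences are hypothetical, except TCTAGA, which appears in the R8C oligonucleotides listed earlier in this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_palindromic_3prime_end(oligo, length=8):
    """True if the last `length` bases equal their own reverse complement."""
    tail = oligo[-length:].upper()
    return len(tail) == length and tail == reverse_complement(tail)


# Hypothetical oligonucleotide whose final 8 bases are GGAATTCC.
print(has_palindromic_3prime_end("ACGTACGGAATTCC"))
# The 6-base TCTAGA site found in the R8C oligonucleotides.
print(has_palindromic_3prime_end("TCTAGA", length=6))
```

Degenerate positions are not handled here; the check only makes sense for the fixed, non-random 3' segment.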
- the R8C random peptide library (described in Section 6.10 and Figures 11 and 12) is used.
- the R26 peptide library (described in Section 6.9.4 and Figure 9) is used.
- the DC43 peptide library (described in Section 6.9.5 and Figure 13) is used.
- the protein pVIII is a major M13 viral capsid protein which can also serve as a site for expressing peptides on the surface of M13 viral particles in the construction of random peptide libraries.
- the average functional domain within a natural protein is considered to be about 40 amino acids.
- the random peptide libraries from which the binding domains encoded by the syngenes of the present invention are preferably identified encode peptides having in the range of 5 to 200 total variant amino acids.
- Although biologically expressed random peptide libraries displaying short random inserts could be used to identify syngenes of the invention, the most preferred binding domains will be identified from biologically expressed random peptide libraries in which the displayed peptide has 20 or more unpredictable amino acids, i.e., preferably in the range of 20 to 100, and most preferably 20 to 50 amino acids, as exemplified by the TSAR libraries described herein.
- One of the objects of the present invention is to provide syngenes encoding binding domains of greater binding specificity than found in nature. To accomplish this, the invention preferably uses
- libraries encoding more than about 20 amino acids be constructed, but such libraries can be advantageously screened to identify peptides having binding
- Libraries composed of longer length oligonucleotides afford the ability to identify peptides in which a short sequence of amino acids is common to or shared by a number of peptides binding a given ligand, i.e., library members having shared binding motifs.
- the use of longer length libraries also affords the ability to identify peptides which do not have any shared sequences with other peptides but which nevertheless have binding specificity for the same ligand.
- oligonucleotide sequences provide the opportunity to identify or map binding sites which encompass not only a few contiguous amino acid residues, i.e., simple binding sites, but also those which encompass discontinuous amino acids, i.e., complex binding sites.
- the large size of the inserted synthesized oligonucleotides of certain libraries provides the opportunity for the development of secondary and/or tertiary structure in the potential binding peptides and in sequences flanking the actual binding site in the binding domain. Secondary and tertiary structure often significantly affect the ability of a sequence to mediate binding, as well as the strength and specificity of any binding which occurs. Such complex structural developments are not feasible when only small length oligonucleotides are used.
- binding domains and the syngenes encoding them will be identified from biologically expressed random peptide libraries in which the displayed peptide is 20 or greater amino acids in length.
- Examples of random peptide libraries include the TSAR libraries described in detail in Section 5.1.3 and its subsections.
- the most preferred libraries for practicing the invention are those that are generated or
- the putative binding domains in these peptides are not known to be naturally occurring amino acid sequences or encoded by naturally occurring nucleotide sequences.
- the random peptide libraries from which the binding domains encoded by syngenes are identified are not cDNA or genomic libraries.
- the sequence of any given peptide from the preferred libraries cannot be predicted in advance.
- the peptides are expressed on the surface of the recombinant vectors of the library.
- the random library is a linear, non-constrained library. As would be
- these libraries express peptides that are substantially random but contain a small percentage of fixed residues within or flanking the random sequences that have the result of conferring structure or some degree of conformational rigidity to the peptide.
- the plurality of synthetic oligonucleotides express peptides that are each able to adopt only one or a small number of different conformations that are constrained by the positioning of codons encoding certain structure conferring amino acids in or
- the DNA encoding the random peptides of the libraries can be synthetic or natural in origin.
- DNA can be prepared by a method similar to methods used for constructing a cDNA library.
- DNA is then sheared and/or digested into DNA pieces of 30 base pairs or smaller.
- the DNA pieces are then ligated together, preferably to sizes of about 120 base pairs to create DNA sequences that do not occur in nature.
- Peptides encoded by such sequences can be expressed in a library and screened for binding domains.
- such DNA sequences will contain stop codons and therefore would be a less desirable source of coding sequences. It is noted that such a library, by virtue of the juxtaposition of cDNA sequences used in creating it, is not a cDNA library.
- a biological peptide library that is a random peptide "TSAR" library is screened to identify the synthetic gene sequences encoding a binding domain, for use in constructing a syngene.
- the syngenes of the present invention encode peptides called TSARs which bind to a ligand of choice.
- TSARs is an acronym for "Totally Synthetic Affinity
- TSAR libraries construction and use, and specific examples of TSAR libraries are described in these publications and in detail below.
- nucleic acids encoding TSARs or a TSAR portion which mediates binding to the ligand of choice can be used to
- a TSAR is intended to encompass a concatenated heterofunctional peptide that includes at least two distinct functional regions.
- One region of the heterofunctional TSAR molecule is a binding domain with affinity for a ligand, that is preferably characterized by 1) its strength of binding under specific conditions, 2) the stability of its binding under specific conditions, and 3) its selective specificity for the chosen ligand.
- a second region of the heterofunctional TSAR molecule is an effector domain which in specific embodiments of the TSAR libraries is the pIII protein or other structural protein providing for phage display of the heterofunctional peptide.
- such a sequence can also include a sequence that is biologically or chemically active to enhance expression and/or detection and/or
- Such a sequence can be chosen from a number of biologically or chemically active proteins including a structural protein or fragment that is accessibly expressed as a surface protein of a vector, an enzyme or fragment thereof, a toxin or fragment thereof, a therapeutic protein or peptide, or a protein or peptide whose function is to provide a site for attachment of a substance such as a metal ion, etc., that is useful for enhancing
- a TSAR can contain an optional additional linker domain or region between the binding domain and the effector domain.
- the linker region serves (1) as a structural spacer region between the binding and effector domains; (2) as an aid to uncouple or
- a TSAR may be a heterofunctional fusion protein, said fusion protein comprising (a) a binding domain encoded by an oligonucleotide comprising unpredictable nucleotides in which the unpredictable nucleotides are arranged in one or more contiguous sequences, wherein the total number of unpredictable nucleotides is greater than or equal to about 60 and less than or equal to about 600, and optionally, (b) an effector domain encoded by an oligonucleotide sequence which is a protein or peptide that enhances expression or detection of the binding domain.
- a TSAR may be a heterofunctional fusion protein comprising (i) a binding domain encoded by a double stranded oligonucleotide comprising unpredictable nucleotides in which the unpredictable nucleotides are arranged in one or more contiguous sequences, wherein the total number of unpredictable nucleotides is greater than or equal to about 60 and less than or equal to about 600 and the contiguous sequences are flanked by invariant residues designed to encode amino acids that confer a desired structure to the binding domain of the expressed
- heterofunctional fusion protein and, optionally, (ii) an effector domain encoded by an oligonucleotide sequence encoding a protein or peptide that enhances expression or detection of the binding domain.
- in order to identify and obtain the syngenes of the present invention, use is made of TSAR libraries.
- in order to prepare a library of vectors expressing a plurality of protein TSARs according to one embodiment of the present invention, single stranded sets of nucleotides are synthesized and assembled in vitro, by way of example, according to the following scheme.
- the synthesized nucleotide sequences are designed to have variant or unpredicted as well as invariant nucleotide positions. Pairs of variant nucleotides in which one individual member is
- N is A, C, G or T
- B is G, T or C
- V is G, A or C
- n is an integer, such that 10 ≤ n ≤ 100
- m is an integer, such that 10 ≤ m ≤ 100, are synthesized for assembly into synthetic oligonucleotides.
- variant nucleotide positions have the potential to encode all 20 naturally occurring amino acids and, when assembled as taught by the present method, encode only one stop codon, i.e., TAG.
- the sequence of amino acids encoded by the variant nucleotides is unpredictable and substantially random in sequence.
- the terms “unpredicted”, “unpredictable” and “substantially random” are used interchangeably in the present application with respect to the amino acids encoded and are intended to mean that at any given position within the binding domain of the TSARs encoded by the variant nucleotides which of the 20 naturally
- the variant nucleotides encode all twenty naturally occurring amino acids by use of 48 different codons.
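The codon count stated above can be checked by enumeration. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `nnb_codons` are illustrative and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
# ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def nnb_codons():
    """Return every codon of the form NNB (N = A, C, G or T; B = G, T or C)."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "GTC")]


codons = nnb_codons()
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]
encoded_amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}

# Number of NNB codons, the stop codons among them, and the number of
# distinct amino acids they encode.
print(len(codons), stop_codons, len(encoded_amino_acids))
```

Excluding A from the third codon position removes TAA and TGA while leaving at least one codon for every amino acid, which is why only TAG remains as a stop codon.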
- Invariant nucleotides are positioned at particular sites in the nucleotide sequences to aid in assembly and cloning of the synthesized
- the invariant nucleotides encode efficient restriction enzyme cleavage sites.
- the invariant nucleotides at the 5' termini are chosen to encode pairs of sites for cleavage by restriction enzymes (1) which can function in the same buffer conditions; (2) are commercially available at high specific activity; (3) are not complementary to each other to prevent self-ligation of the synthesized double stranded oligonucleotides; and (4) which require either 6 or 8 nucleotides for a cleavage recognition site in order to lower the frequency of cleaving within the inserted double stranded
- restriction site pairs are selected from Xho I and Xba I, and Sal I and Spe I.
- Other examples of useful restriction enzyme sites include, but are not limited to: Nco I, Nsi I, Pal I, Not I, Sfi I, Pme I, etc. Restriction sites at the 5' termini invariant positions function to promote proper orientation and efficient production of recombinant molecule formation during ligation when the
- oligonucleotides are inserted into an appropriate expression vector.
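The rationale for preferring 6 or 8 nucleotide recognition sites can be illustrated with a rough estimate. The sketch below assumes independent, uniformly distributed bases, which is only an approximation for NNB/NNV positions; the function name and the 300 nucleotide insert length are hypothetical.

```python
def expected_site_hits(insert_length, site_length):
    """Expected number of exact matches to one fixed recognition site
    in a uniformly random sequence of the given length."""
    positions = max(insert_length - site_length + 1, 0)
    return positions / 4 ** site_length


# Compare a 4, 6 and 8 nucleotide recognition site for a hypothetical
# 300 nucleotide variant insert.
for site_length in (4, 6, 8):
    print(site_length, expected_site_hits(300, site_length))
```

Under this approximation each two additional nucleotides in the recognition site lower the expected number of internal cleavage sites roughly sixteen-fold.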
- nucleotides are synthesized using one or more
- nucleotides encoding restriction sites for efficient cleavage, are synthesized using non-methylated dNTPs.
- This embodiment provides for efficient cleavage of long length synthesized oligonucleotides at the termini for insertion into an appropriate vector, while avoiding cleavage in the variant nucleotide sequences.
- the 3' termini invariant nucleotide positions are complementary pairs of 6, 9 or 12 nucleotides to aid in annealing two synthesized single stranded sets of nucleotides together and conversion to double-stranded DNA, designated herein synthesized double stranded oligonucleotides.
- the 3' termini invariant nucleotides are selected from
- the 3' termini invariant nucleotides of the coding strand are 5' GGGTGCGGC 3' which encode glycine, cysteine, glycine. In an oxidizing environment the cysteine forms a disulfide bond with another cysteine
- the complementary 3' termini also encode an amino acid sequence that provides a short charge cluster (for example, KKKK
- complementary 3' termini also encode a short amino acid sequence that provides a peptide known to have a desirable binding or other biological activity.
- examples of such peptides include RGD, HAV, and HPQ followed by a non-polar amino acid.
- Figure 1A generally illustrates an assembly process.
- the oligonucleotide sequences are thus assembled by a process comprising: synthesis of pairs of single stranded nucleotides represented by a first nucleotide sequence of the formula 5' X (NNB)n J Z 3' and a second nucleotide sequence of the formula 3' Z' O U (NNV)m Y 5', where:
- X and Y are restriction enzyme recognition sites, such that X ≠ Y;
- N is A, C, G or T;
- B is G, T or C;
- V is G, A or C;
- n is an integer, such that 10 ≤ n ≤ 100;
- m is an integer, such that 10 ≤ m ≤ 100;
- Z and Z' are each a sequence of 6, 9 or 12 nucleotides, such that Z and Z' are complementary to each other;
- J is A, C, G, T or nothing;
- O is A, C, G, T or nothing; and
- U is G, A, C or nothing; provided, however, if any one of J, O or U is nothing then J, O and U are all nothing.
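The pairing of the two single stranded formulas can be sketched in code. This is an illustrative model only, assuming J, O and U are all nothing, equimolar bases at the variant positions, Xho I (CTCGAG) and Xba I (TCTAGA) as X and Y, and GGGTGCGGC as Z; the function names are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Position-wise complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


def random_codons(count, third_position, rng):
    """Random triplets whose third position is drawn from third_position."""
    return "".join(
        rng.choice("ACGT") + rng.choice("ACGT") + rng.choice(third_position)
        for _ in range(count)
    )


def make_pair(x_site, y_site, z, n, m, rng):
    """Return (first, second): first is written 5'->3' as X (NNB)n Z,
    second is written 3'->5' as Z' (NNV)m Y."""
    if not (10 <= n <= 100 and 10 <= m <= 100):
        raise ValueError("n and m must each be between 10 and 100")
    if len(z) not in (6, 9, 12):
        raise ValueError("Z must be 6, 9 or 12 nucleotides long")
    first = x_site + random_codons(n, "GTC", rng) + z
    second = complement(z) + random_codons(m, "GAC", rng) + complement(y_site)
    return first, second


def fill_in(first, second, z_length):
    """Top strand (5'->3') after annealing at Z/Z' and polymerase extension."""
    if complement(first[-z_length:]) != second[:z_length]:
        raise ValueError("Z and Z' are not complementary")
    return first + complement(second[z_length:])


rng = random.Random(0)
first, second = make_pair("CTCGAG", "TCTAGA", "GGGTGCGGC", 10, 12, rng)
top_strand = fill_in(first, second, 9)
print(len(top_strand), top_strand)
```

Because the complement of V (G, A or C) is B (C, T or G), the NNV triplets of the second strand read as NNB codons on the filled-in coding strand, so both variant regions follow the same codon scheme.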
- any method for synthesis of the single stranded sets of nucleotides is suitable, including use of an automatic nucleotide synthesizer.
- the synthesizer can be programmed so that the nucleotides can be incorporated, either in equimolar or non- equimolar ratios at the variant positions, i.e., N, B, V, J, O or U.
- the nucleotide sequences of the desired length are purified, for example, by HPLC.
- Pairs of the purified, single stranded nucleotides of the desired length are reacted together in appropriate buffers through repetitive cycles of annealing and DNA synthesis using an appropriate DNA polymerase, such as Taq, Vent™ or Bst DNA polymerase, and appropriate temperature cycling.
- modified T7 DNA polymerase is preferred for use, and can be employed according to the instructions of the manufacturer (U.S. Biochemical, Cleveland, OH), without temperature cycling. Klenow fragment of E. coli DNA polymerase could be used but, as would be understood by those of skill in the art, such
- the double stranded DNA reaction products, now greater than m + n in length, are isolated, for example, by phenol/chloroform extraction and precipitation with ethanol.
- the double stranded synthetic oligonucleotides are cleaved with appropriate restriction enzymes to yield a plurality of synthesized oligonucleotides.
- the double-stranded synthesized oligonucleotides should be selected for those of the appropriate size by means of high
- oligonucleotides substantially eliminates abortive assembly products of inappropriate size and incomplete digestion products.
- the TSAR scheme for synthesis and assembly of the oligonucleotides provides sequences of oligonucleotides encoding unpredicted amino acid sequences which are larger in size than conventional libraries.
- synthesized double stranded oligonucleotides comprise at least about 77-631 nucleotides encoding the
- the synthesized double stranded oligonucleotides comprise at least 77-331 nucleotides and encode about 20-100 unpredicted amino acids in the TSAR binding domain.
- the synthesized oligonucleotides encode 20, 24 and 36 unpredicted amino acids and 27, 35 and 42 total amino acids, respectively for the TSAR-9, TSAR-12, and
- syngenes are isolated from a library which expresses a plurality of TSAR peptides having some degree of conformational rigidity in their structure
- TSAR-13 and TSAR-14 libraries described in Section 6.4 and its subsections. At least four different methods can be used to engineer TSAR libraries used to identify syngene sequences encoding a binding domain, so that the expressed peptides are semirigid or have some degree of conformational rigidity.
- the synthesized oligonucleotides are designed so that the expressed peptides have a pair of invariant cysteine residues positioned in, or flanking, the unpredicted or variant residues (see Figure 1D).
- the cysteine residues should be in the oxidized state, most likely cross-linked by disulfide bonds to form cystines.
- the peptides would form rigid or semirigid loops.
- the nucleotides encoding the cysteine residues should be placed from 6 to 27 amino acids apart flanking the variant nucleotide sequences.
- a peptide library having such structure is the R8C library described in Section 6.13 infra.
- the actual positions of the invariant residues can be modeled on the arrangement observed in isolates from a linear peptide library, for example, TSAR peptides in which two or four cysteines are encoded by the inserted synthesized oligonucleotides, isolated from the TSAR-9 or TSAR-12 libraries (see Section 6.9 and its subsections infra).
- the following general formulas illustrate the structure of these peptides:
- a double stranded oligonucleotide sequence providing a cloverleaf structure (see Figure 1E) can be represented, for example, by the formula:
- peptides expressed by this type of rigid library should form many different ligand binding pockets from which to select the best fit. It should be noted that when a semirigid library of the first or second type above is expressed in a viral vector in an oxidizing environment, there will likely be a selection against odd numbers of cysteines occurring within the
- examples are the TSAR-13 and TSAR-14 libraries described in Sections 6.9.4 and 6.9.4.1 infra.
- nucleotides are designed and assembled so that the plurality of proteins expressed have both invariant cysteine and histidine residues positioned within the variant nucleotide sequences (see Figure 1F).
- the positions of the invariant residues can be modeled after the arrangement of cysteine and histidine residues seen in zinc finger proteins (i.e., -CX2-4CX12HX3-4H-, where X is any amino acid), thereby creating a library of zinc finger-like proteins.
- zinc finger-like proteins is intended to mean any of the plurality of proteins expressed which contain invariant cysteine and
- histidine residues which confer a zinc finger or similar structure on the expressed protein.
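The zinc finger arrangement given above, a cysteine, 2 to 4 residues, a cysteine, 12 residues, a histidine, 3 to 4 residues and a histidine, maps directly onto a regular expression. The following is a minimal sketch; the function name and the test peptide are hypothetical and are not sequences from the specification.

```python
import re

# C, then 2-4 of any residue, C, 12 of any residue, H, 3-4 of any residue, H.
ZINC_FINGER_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,4}H")


def find_zinc_finger_like(peptide):
    """Return every non-overlapping zinc finger-like motif in a peptide
    written in one-letter amino acid code."""
    return [match.group(0) for match in ZINC_FINGER_LIKE.finditer(peptide)]


# Hypothetical peptide: invariant C and H residues separated by alanine fillers.
example = "GG" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
print(find_zinc_finger_like(example))
```

Such a pattern could be used to confirm that sequenced library isolates retain the invariant cysteine and histidine spacing.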
- the plurality of proteins are designed to have invariant histidine residues positioned within the variant nucleotide sequences.
- exemplary histidine containing TSARs can be represented by the following general formulas:
- CAC represents the codon for histidine.
- the TSAR proteins are expressed and harvested in the presence of 1-1000 μM zinc chloride.
- the expressed proteins could also be saturated with other divalent metal cations, such as Cu2+ and Ni2+.
- the members of this type of rigid library may have advantageous chemical reactivity, since metal ions are often within the catalytic sites of enzymes.
- the synthesized single stranded nucleotides are assembled by annealing a first nucleotide sequence of the formula: 5' X [α(NNB)a]c J Z 3' with a second nucleotide
- β is an invariant nucleotide sequence whose complementary nucleotide sequence confers some structure in the encoded peptide;
- X, Y, N, B, V, Z, Z', J, O, U are as defined above.
- α and β could include a codon for one or more cysteine residues, for example Gly-Cys-Gly, in which instance a and b are each preferably ≥ 6 and ≤ 27, to generate disulfide bonds between different cysteines in the expressed loop forming peptide structures.
- oligonucleotides could be assembled, by annealing for example, a first nucleotide sequence of the formula: 5' X α(NNB)a α(NNB)a Z 3' with a second nucleotide sequence of the formula
- the synthesized single stranded nucleotides are assembled by annealing a first nucleotide sequence: 5' X (GGG)(TGT)(GGG)(NNB)7(GGG)(TGT)(GGG)(NNB)7(GGG)(TGT)(GGG) 3' with a second nucleotide sequence:
- GGG represents the codon for glycine and TGT represents the codon for cysteine.
- oligonucleotide scheme encodes peptides, whose amino acid sequence would be GCGX-GCGX-GCGX-GCG.
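The effect of the invariant GGG-TGT-GGG blocks can be illustrated by translating one randomly generated member of such a scheme. The sketch below assumes the standard genetic code, equimolar bases at the variant positions, and omits the restriction site X; only the first strand given above is modeled, and all names are illustrative.

```python
import random
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

INVARIANT_BLOCK = "GGG" + "TGT" + "GGG"  # encodes Gly-Cys-Gly


def random_nnb(count, rng):
    """Return `count` random NNB codons (B = G, T or C)."""
    return "".join(
        rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("GTC")
        for _ in range(count)
    )


def constrained_coding_sequence(rng, loop_codons=7):
    """Invariant block, variant loop, invariant block, variant loop, invariant block."""
    return (
        INVARIANT_BLOCK + random_nnb(loop_codons, rng)
        + INVARIANT_BLOCK + random_nnb(loop_codons, rng)
        + INVARIANT_BLOCK
    )


def translate(dna):
    """Translate a coding sequence; '*' marks a stop (only TAG can occur here)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = translate(constrained_coding_sequence(random.Random(1)))
print(peptide)
```

Every member generated this way carries Gly-Cys-Gly at fixed positions, with seven unpredicted residues between successive cysteine-containing blocks.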
- α and β could encode one or more histidine residues, for example GHGHG (SEQ ID NO: 54).
- α and β could encode a Leu residue, in which instance a and b are each about 7. Such alternative embodiment would provide an alpha helical structure in the expressed peptides.
- an α group could be used for Z, and β for Z', to provide the complementary sequences to aid in annealing the nucleotides.
- nucleotide sequences encoding amino acids that will impose structural constraints on the expressed peptides are possible as would become apparent to one of skill in the art based on the above description and syngenes encoding such constrained peptides are encompassed within the scope of the present invention.
- reducing agents, i.e., DTT, β-mercaptoethanol
- divalent cation chelators, i.e., EDTA, EGTA
- Such reagents can be used, for example, to elute a TSAR library expressed on phage vectors from target ligands.
- EDTA or EGTA, at low concentrations, does not appear to disrupt phage integrity or infectivity.
- the reduced cysteine residues can be alkylated with iodoacetamide. This treatment prevents renewed disulfide bond formation and only diminishes phage infectivity 10-100 fold, which is tolerable since phage cultures usually attain titers of 10^12 plaque forming units per milliliter.
- the elution reagents can be removed by dialysis (i.e., dialysis bag, Centricon/Amicon microconcentrators).
- suitable host expresses the plurality of peptides as heterofunctional fusion proteins with an expressed component of the vector (effector domain) which are screened to identify TSARs having affinity for a ligand of choice.
- the plurality of peptides further comprise a linking domain between the binding and effector domains.
- the linker domain is expressed as a fusion protein with the effector domain of the vector into which the plurality of oligonucleotides are inserted.
- a promoter is a region of DNA at which RNA polymerase attaches and initiates transcription.
- the promoter selected may be any one that has been
- E. coli, a commonly used host system, has numerous promoters such as the lac or trp promoter or the promoters of its bacteriophages or its plasmids.
- synthetic or recombinantly produced promoters such as the p TAC promoter may be used to direct high level expression of the gene segments adjacent to them. Signals are also necessary in order to attain efficient translation of the inserted
- a ribosome binding site includes the translational start codon AUG or GUG in addition to other sequences complementary to the bases of the 3' end of 16S ribosomal RNA.
- S/D: Shine/Dalgarno
- S/D-ATG sequence which is compatible with the host cell system can be employed.
- S/D-ATG sequences include, but are not limited to, the S/D-ATG sequences of the cro gene or N gene of bacteriophage lambda, the tryptophan E, D, C, B or A genes, a synthetic S/D sequence or other S/D-ATG sequences known and used in the art.
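A candidate S/D-ATG arrangement can be located in a DNA sequence with a simple pattern search. The sketch below is illustrative only: it assumes AGGAGG as the Shine/Dalgarno consensus and a spacer of 5 to 9 nucleotides before the start codon, neither of which is specified in the text, and the example sequence is hypothetical.

```python
import re

# Assumptions (not from the specification): AGGAGG consensus, 5-9 nt spacer.
# ATG and GTG are the DNA forms of the start codons AUG and GUG.
SD_START = re.compile(r"AGGAGG[ACGT]{5,9}?(ATG|GTG)")


def find_sd_starts(dna):
    """Return (S/D position, start codon position) pairs, 0-based."""
    return [
        (match.start(), match.start(1))
        for match in SD_START.finditer(dna.upper())
    ]


# Hypothetical sequence: S/D consensus, a 7 nt spacer, then ATG.
example = "TTT" + "AGGAGG" + "AACAGCT" + "ATG" + "AAA"
print(find_sd_starts(example))
```

A real S/D-ATG sequence would be chosen for compatibility with the host cell system, as stated above, rather than by consensus matching alone.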
- regulatory elements control the expression of the polypeptide or proteins to allow directed synthesis of the reagents in cells and to prevent constitutive synthesis of products which might be toxic to host cells and thereby
- vectors can be used, including, but not limited to, bacteriophage vectors such as φX174, λ, M13 and its derivatives, f1, fd, Pf1, etc.; phagemid vectors; plasmid vectors; insect viruses, such as baculovirus vectors; mammalian cell vectors, including such virus vectors as parvovirus vectors, adenovirus vectors, vaccinia virus vectors, retrovirus vectors, etc.; and yeast vectors such as Ty1, killer particles, etc.
- An appropriate vector contains or is
- the effector domain gene contains or is engineered to contain multiple cloning sites. At least two different restriction enzyme sites within such gene, comprising a polylinker, are preferred.
- the vector DNA is cleaved within the polylinker using two different restriction enzymes to generate termini complementary to the termini of the double stranded synthesized oligonucleotides assembled as described above. Preferably the vector termini after cleavage have or are modified, using DNA polymerase, to have non-compatible sticky ends that do not self-ligate, thus favoring insertion of the double-stranded
- synthesized oligonucleotides and hence formation of recombinants expressing the TSAR fusion proteins, polypeptides and/or peptides.
- the double stranded synthesized oligonucleotides are ligated to the appropriately cleaved vector using DNA ligase.
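The requirement that the two cleaved termini not be compatible with each other can be checked from the overhangs the enzymes leave. The sketch below uses the well-known recognition sequences and cut positions of Xho I (C^TCGAG), Sal I (G^TCGAC), Xba I (T^CTAGA) and Spe I (A^CTAGT); the dictionary layout and function names are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Enzyme name -> (recognition sequence 5'->3', cut offset on the top strand).
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "XbaI": ("TCTAGA", 1),
    "SpeI": ("ACTAGT", 1),
}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(enzyme):
    """Single stranded 5' overhang left by a palindromic site that is cut
    at the same offset on both strands."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def ends_compatible(enzyme_a, enzyme_b):
    """True if an end cut by enzyme_a can anneal to an end cut by enzyme_b."""
    return five_prime_overhang(enzyme_a) == reverse_complement(
        five_prime_overhang(enzyme_b)
    )


for pair in (("XhoI", "XbaI"), ("SalI", "SpeI"), ("XhoI", "SalI"), ("XbaI", "SpeI")):
    print(pair, ends_compatible(*pair))
```

Within each preferred pair the two ends cannot anneal to each other, which forces a single insert orientation, whereas Xho I ends are compatible with Sal I ends and Xba I ends with Spe I ends.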
- a stuffer fragment within the polylinker region of the vector is useful when the vector (e.g., phage or plasmid) is intended to express the TSAR as a heterofunctional fusion protein that is expressed on the surface of the vector.
- restriction enzyme sites at the termini of the stuffer fragment are useful for insertion of the synthesized double stranded oligonucleotides, resulting in
- the stuffer fragment can comprise a known DNA sequence encoding a protein that is immunologically active (i.e., an immunological marker)
- the presence or absence of the stuffer fragment can be easily detected either at the nucleotide level, by DNA sequencing, PCR or hybridization, or at the amino acid level, e.g., using an immunological assay.
- Such determination allows rapid discrimination between recombinant (TSAR expressing) vectors generated by insertion of the synthesized double stranded
- the stuffer fragment comprises the DNA fragment encoding the epitope of the human c-myc protein recognized by the murine monoclonal antibody 9E10 having the amino acid sequence EQKLISEEDLN (SEQ ID NO: 55) (Evan et al., 1985, Mol. Cell. Biol. 5:3610-3616) with a short flanking sequence of amino acids at the 5' and 3' termini which serve as restriction enzyme sites so that the stuffer fragment can be removed and the synthesized double stranded oligonucleotides can be inserted using the restriction sites.
- the stuffer fragment provides an efficient means to remove any non-recombinant vectors to enhance or enrich the
- when the stuffer fragment is expressed, e.g., as an immunologically active surface protein on the surface of non-recombinant vectors, it provides an accessible target for binding, e.g., to an immobilized antibody.
- the non-recombinants thus could be easily removed from a library for example by serial passage over a column having the antibody immobilized thereon to enrich the population of recombinant TSAR-expressing vectors in the library.
- the vector providing for expression of the TSAR libraries is or is derived from a filamentous bacteriophage, including but not limited to M13, f1, fd, Pf1, etc. vector encoding a phage structural protein, preferably a phage coat protein, such as pIII, pVIII, etc.
- the filamentous phage is an M13-derived phage vector such as m655, m663, and m666 described in Fowlkes et al., 1992, BioTechniques, 13:422-427 which encodes the structural coat protein pIII.
- the phage vector is chosen to contain or is constructed to contain a cloning site located in the 5' region of a gene encoding a bacteriophage
- synthesized double stranded oligonucleotides inserted are expressed as fusion proteins on the surface of the bacteriophage. This advantageously provides not only a plurality of accessible expressed peptides but also provides a physical link between the peptides and the inserted oligonucleotides to provide for easy
- the vector is chosen to contain or is constructed to contain a cloning site near the 3' region of a gene encoding structural protein so that the plurality of expressed proteins constitute C-terminal fusion proteins.
- the structural bacteriophage protein is pIII.
- the library may be constructed by cloning the plurality of synthesized oligonucleotides into a cloning site near the N-terminus of the mature coat protein of the appropriate vector, preferably the pIII protein, so that the oligonucleotides are expressed as coat protein-fusion proteins.
- oligonucleotides is inserted into a phagemid vector.
- Phagemids are utilized in combination with a defective helper phage to supply missing viral proteins and replicative functions.
- Helper phage useful for propagation of M13 derived phagemids as viral
- the appropriate phagemid vector is constructed by engineering the Bluescript II SK+ vector (GenBank #52328) (Alting-Mees et al., 1989, Nucl. Acid Res.
- oligonucleotides could be cloned and expressed in the same reading frame as the m663 phage vector; and (3) the linker sequence encoding GGGGS (SEQ ID NO: 56) between the polylinker and the pIII gene.
- oligonucleotides are inserted into a plasmid vector.
- An illustrative suitable plasmid vector for expressing the TSAR libraries is a derivative of plasmid p340-1 (ATCC No. 40516).
- the Nco I - Bam HI fragment is removed from p340-1 plasmid and replaced by a double stranded sequence having Xho I and Xba I restriction sites in the correct reading frame.
- p340-1 is cleaved using
- when the oligonucleotides are inserted, using the Xho I and Xba I restriction sites, into the p340-1D vector, the coding frame is restored and the TSAR binding domain is expressed as a fusion protein with the β-galactosidase.
- the vectors expressing the TSAR library would produce identifiable blue colonies.
- a derivative of plasmid pTrc99A (Amann et al., 1988, Gene 69:301-315) (Pharmacia, Piscataway, NJ), designated plasmid pLamB, which is constructed to contain the LamB protein gene of E. coli (Clement and Hofnung, 1981, Cell
- oligonucleotides inserted are expressed as fusion proteins of the LamB protein.
- once the appropriate expression vectors are prepared, they are inserted into an appropriate host, such as E. coli, Bacillus subtilis, insect cells, mammalian cells, yeast cells, etc., for example by electroporation, and the plurality of oligonucleotides is expressed by culturing the transfected host cells under appropriate culture conditions for colony or phage production.
- the host cells are protease deficient, and may or may not carry
- a small aliquot of the electroporated cells is plated and the number of colonies or plaques is counted to determine the number of recombinants.
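The count obtained from a plated aliquot can be scaled up to an estimate for the whole electroporation. The following is a minimal sketch of that arithmetic; the function name, the dilution parameter and the example numbers are hypothetical and are not values from the specification.

```python
def total_recombinants(colonies_counted, aliquot_volume_ul, total_volume_ul,
                       dilution_factor=1):
    """Estimate the number of recombinants in the full electroporation from
    the colonies or plaques counted on a plated aliquot."""
    if aliquot_volume_ul <= 0 or total_volume_ul <= 0 or dilution_factor <= 0:
        raise ValueError("volumes and dilution factor must be positive")
    return colonies_counted * dilution_factor * total_volume_ul / aliquot_volume_ul


# Hypothetical example: 150 plaques from 10 microliters of a 1:100 dilution
# of a 1000 microliter electroporation.
print(total_recombinants(150, 10, 1000, dilution_factor=100))
```

The estimate indicates how many independent recombinants the library contains before the single amplification step.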
- the library of recombinant vectors in host cells is plated at high density for a single amplification of the recombinant vectors.
- recombinant M13 vector m666, m655 or m663, engineered to contain the synthesized double stranded oligonucleotides, is transfected into DH5αF' E. coli cells by electroporation.
- TSARs are expressed on the outer surface of the viral capsid extruded from the host E. coli cells and are
- the parent m666, m655 or m663 vectors contain the c-myc epitope (stuffer fragment).
- the stuffer fragment is removed. The cloning efficiency of the expressed library is easily determined by filter blotting with the 9E10 antibody that recognizes the c-myc epitope.
- the c-myc epitope is retained. Then the c-myc epitope is expressed in the pIII-fusion protein expressed by the vector.
- An advantage of the m663 vector is that it contains an intact LacZ+ gene, which can be easily seen as a blue dot when expressed in E. coli plated on X-gal and IPTG.
- TSARs can be expressed in a plasmid vector contained in bacterial host cells such as E. coli .
- the TSAR proteins accumulate inside the E. coli cells and a cell lysate is prepared for screening.
- Use of plasmid p340-1D is described as an illustrative example.
- Phagemid vectors containing the synthesized double stranded oligonucleotides, expressed on the outer surface of the extruded phage are propagated either as infected bacteria or as bacteriophage with helper phage.
- the expressed pDAF2-3 phagemids (See Section 6.10 and its subsections) have the added advantage that they include the c-myc gene which can serve as an "epitope tag" for the fusion pIII proteins.
- TSAR peptide downstream synthesized oligonucleotide, expressed TSAR peptide is appropriately expressed.
- it may be of value to electroporate several different strains of E. coli and establish different versions of the same library.
- the same E. coli strain would need to be used for the entire set of screening experiments. This strategy is based on the consideration that there is likely an in vivo biological selection, both positive and negative, on the viral assembly, secretion, and infectivity rate of individual M13 recombinants due to the sequence nature of the peptide-pIII fusion
- E. coli with different genotypes i.e., chaperonin overexpressing, or secretion
- the desired random peptide library is screened to identify and recover a syngene encoding a binding domain that binds to a ligand of choice.
- a ligand is a substance for which it is desired to isolate a specific binding partner from a synthetic random peptide library.
- the term "ligand" is thus intended to include but not be limited to a substance,
- a binding domain which binds to a ligand can function as a receptor, i.e., a lock into which the ligand fits and binds; or a binding domain can function as a key which fits into and binds a ligand when the ligand is a larger protein molecule.
- a ligand includes, but is not limited to, a non-ionic chemical group, an organic chemical group, an ion, a metal, a metal or non-metal inorganic ion, a glycoprotein, a protein, a polypeptide, a peptide, a nucleic acid, a carbohydrate or carbohydrate polymer, a lipid, a fatty acid, a viral particle, a membrane vesicle, a cell wall component, a synthetic organic compound, a bioorganic compound and an inorganic compound or any portion of any of the above.
- Ligands also include the variable region of an antibody, an enzyme/substrate binding site, an enzyme/co-factor binding site, a regulatory DNA binding protein, an RNA binding protein, a binding site of a metal binding protein, a nucleotide fold or GTP binding protein, a calcium binding protein, a membrane protein, a viral protein and an integrin.
- the ligand is a peptide that is an intracellular targeting or processing signal (e.g., nuclear localization signals), e.g., whereby the syngene-encoded binding peptide which recognizes it will interfere with proper targeting of in vivo proteins containing such a targeting signal.
- a preferred method for identifying syngenes that encode a binding domain that binds to a ligand of choice comprises screening a library of recombinant vectors that express a plurality of heterofunctional fusion proteins, said fusion proteins comprising (a) a binding domain encoded by an oligonucleotide
- Nucleotide sequence analysis can be carried out by any method known in the art, including but not limited to the method of Maxam and Gilbert (1980, Meth. Enzymol.
- Patent No. 4,795,699 SequenaseTM, U.S. Biochemical Corp.), or Taq polymerase, or use of an automated DNA sequenator (e.g., Applied Biosystems, Foster City, CA).
- syngenes encoding binding domains are identified by a method comprising
- (and/or peptide) which binds to a ligand of choice comprising: (a) generating a library of vectors expressing a plurality of heterofunctional fusion proteins comprising (i) a binding domain encoded by a double stranded oligonucleotide comprising
- unpredictable nucleotides in which the unpredictable nucleotides are arranged in one or more contiguous sequences, wherein the total number of unpredictable nucleotides is greater than or equal to about 60 and less than or equal to about 600 and the contiguous sequences are flanked by invariant residues designed to encode amino acids that confer a desired structure to the binding domain of the expressed
- heterofunctional fusion protein and, optionally, (ii) an effector domain encoded by an oligonucleotide sequence encoding a protein or peptide that enhances expression or detection of the binding domain; and (b) screening the library of vectors by contacting the plurality of heterofunctional fusion proteins with the ligand of choice under conditions conducive to ligand binding and isolating the heterofunctional fusion protein which binds to the ligand.
- the methods of the invention further comprise determining the nucleotide sequence encoding the binding domain of the heterofunctional fusion protein identified to deduce the amino acid sequence of the binding domain.
- the library is screened to identify peptides having binding affinity for a ligand of choice. Screening the libraries can be accomplished by any of a variety of methods known to those of skill in the art. See, e.g., the following references, which disclose
- the libraries are expressed as fusion proteins with a cell surface molecule, then screening is advantageously achieved by contacting the vectors with an immobilized target ligand and harvesting those vectors that bind to said ligand.
- the target ligand can be immobilized on plates, beads, such as magnetic beads, sepharose, etc., or on beads used in columns.
- the immobilized target ligand can be "tagged", e.g., with a label such as biotin or a fluorochrome, e.g., for FACS sorting.
- screening a library of phage expressing random peptides on phage and phagemid vectors can be achieved as follows using magnetic beads.
- Target ligands are conjugated to magnetic beads, according to the instructions of the
- the beads are used to block non-specific binding to the beads, and any unreacted groups.
- BSA bovine serum albumin
- an aliquot of a library is mixed with a sample of resuspended beads.
- the tube contents are tumbled at 4°C for 1-2 hrs.
- the magnetic beads are then recovered with a strong magnet and the liquid is removed by aspiration.
- the beads are then washed by adding PBS-0.05% Tween® 20, inverting the tube several times to resuspend the beads, and then drawing the beads to the tube wall with the magnet.
- the contents are then removed and washing is repeated 5-10 additional times. 50 mM glycine-HCl (pH 2.0), 100 µg/ml BSA solution is added to the washed beads to denature proteins and release bound phage.
- the beads are pulled to the side of the tubes with a strong magnet and the liquid contents are then transferred to clean tubes.
- 1 M Tris-HCl (pH 7.5) or 1 M NaH2PO4 (pH 7) is added to the tubes to neutralize the pH of the phage sample.
- the phage are then diluted, e.g., 10° to 10⁻⁶, and aliquots plated with E. coli DH5αF' cells to determine the number of plaque forming units of the sample. In certain cases, the platings are done in the presence of XGal and IPTG for color discrimination of plaques (i.e., lacZ+ plaques are blue, lacZ- plaques are white).
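The plaque counts obtained from such platings can be converted into a titer by simple arithmetic. The following is a minimal sketch in Python, assuming that the dilution is expressed as a fraction (e.g., 1e-3) and that a known volume of the diluted sample is plated. The function names and the example counts are hypothetical and are not taken from the specification.

```python
def titer_pfu_per_ml(plaque_count, dilution, plated_volume_ml):
    """Estimate plaque forming units (pfu) per ml of the undiluted sample.

    plaque_count     -- plaques counted on one plate
    dilution         -- dilution of the plated sample as a fraction, e.g. 1e-3
    plated_volume_ml -- volume of the diluted sample that was plated, in ml
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated_volume_ml must be positive")
    return plaque_count / (dilution * plated_volume_ml)


def percent_recovery(output_pfu, input_pfu):
    """Percentage of the input phage that was recovered after elution."""
    if input_pfu <= 0:
        raise ValueError("input_pfu must be positive")
    return 100.0 * output_pfu / input_pfu


if __name__ == "__main__":
    # Hypothetical plate counts, for illustration only.
    eluted = titer_pfu_per_ml(plaque_count=50, dilution=1e-3, plated_volume_ml=0.1)
    applied = titer_pfu_per_ml(plaque_count=200, dilution=1e-6, plated_volume_ml=0.1)
    print(f"eluted sample: {eluted:.3g} pfu/ml")
    print(f"input sample:  {applied:.3g} pfu/ml")
    print(f"recovery:      {percent_recovery(eluted, applied):.3g} %")
```

Comparing the eluted titer with the input titer in this way gives a rough measure of how strongly a round of screening has reduced the phage population.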
- the titer of the input samples is also determined for comparison (dilutions are
- screening a library of phage expressing random peptides can be achieved by panning using microtiter plates.
- Target ligand is diluted, e.g., in 100 mM NaHCO3, pH 8.5 and a small aliquot of ligand solution is adsorbed onto wells of microtiter plates (e.g., by incubation overnight at 4°C).
- An aliquot of BSA solution (1 mg/ml, in 100 mM NaHCO3, pH 8.5) is added and the plate incubated at room temperature for 1 hr. The contents of the microtiter plate are flicked out and the wells washed carefully with PBS-0.05% Tween® 20.
- the plates are washed free of unbound targets repeatedly. A small aliquot of phage solution is introduced into each well and the wells are incubated at room temperature for 1-2 hrs. The contents of microtiter plates are flicked out and washed repeatedly. The plates are incubated with wash solution in each well for 20 minutes at room
- a pH change is used.
- An aliquot of 50 mM glycine-HCl (pH 2.0), 100 µg/ml BSA solution is added to the washed wells to denature proteins and release bound phage. After 10 minutes at 65°C, the contents are then transferred into clean tubes, and a small aliquot of 1 M Tris-HCl (pH 7.5) or 1 M NaH2PO4 (pH 7) is added to neutralize the pH of the phage sample.
- the phage are then diluted, e.g., 10⁻³ to 10⁻⁶, and aliquots plated with E. coli DH5αF' cells to determine the number of plaque forming units of the sample. In certain cases, the platings are done in the presence of XGal and IPTG for color
- plaques i.e., lacZ+ plaques are blue, lacZ- plaques are white.
- the titer of the input samples is also determined for comparison
- a large volume, approximately 100 µl, of LB + ampicillin is added to each well and the plate is incubated at 37°C for 2 hr.
- the bound cells undergo cell division in the rich culture medium and the daughter cells detach from the immobilized targets.
- the contents of the wells are then
- bacterial cell can be screened by passing a solution of the library over a column, of a ligand immobilized to a solid matrix, such as sepharose, silica, etc., and recovering those phage that bind to the column after extensive washing and elution.
- weak binding library members can be isolated based on retarded chromatographic properties.
- fractions are collected as they come off the column, saving the trailing fractions (i.e., those members that are retarded in mobility relative to the peak fraction are saved). These members are then concentrated and passed over the column a second time, again saving the retarded fractions.
- oligonucleotides encoding the binding domain selected in this manner can be mutagenized, expressed and rechromatographed (or screened by another method) to discover improved binding activity.
- homobifunctional (e.g., DSP, DST, BSOCOES, EGS, DMS) or heterobifunctional (e.g., SPDP) cross-linking agents can be used in combination with any of the above methods to promote capture of weak binding members; these cross-linkers should be reversible by a treatment (e.g., exposure to thiols, base, periodate, or hydroxylamine) gentle enough not to disrupt the member's structure or infectivity, to allow recovery of the library member.
- the elution reagents can be removed by dialysis (i.e., dialysis bag,
- screening a library can be achieved using a method comprising a first "enrichment" step and a second filter lift step as follows.
- Random peptides from an expressed library capable of binding to a given ligand are initially enriched by one or two cycles of panning or affinity chromatography, as described above. The goal is to enrich the positives to a frequency of about > 1/10⁵.
- a filter lift assay is conducted. For example, approximately 1-2 × 10⁵ phage, enriched for binders, are added to 500 µl of log phase E. coli and plated on a large LB-agarose plate with 0.7% agarose in broth. The agarose is allowed to solidify, and a nitrocellulose filter (e.g., 0.45 µm) is placed on the agarose surface.
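The plating numbers given for the filter lift can be related to the enrichment goal with a short calculation. The following is a rough sketch in Python, assuming that each plated phage is independently a binder with probability equal to the enriched frequency. That independence is a simplifying assumption made here for illustration, not a statement from the specification, and the function names are hypothetical.

```python
import math


def expected_positives(n_plated, positive_frequency):
    """Expected number of positive plaques on one plate."""
    return n_plated * positive_frequency


def prob_at_least_one_positive(n_plated, positive_frequency):
    """Probability that a plate carries at least one positive plaque.

    Computes 1 - (1 - f)**n in a numerically stable way for small f.
    """
    if not 0 <= positive_frequency < 1:
        raise ValueError("positive_frequency must be in [0, 1)")
    if n_plated < 0:
        raise ValueError("n_plated must be non-negative")
    return -math.expm1(n_plated * math.log1p(-positive_frequency))


if __name__ == "__main__":
    # Enrichment goal of 1 positive per 1e5 phage, at both ends of the
    # plating range of 1e5 to 2e5 phage per plate.
    for n in (1e5, 2e5):
        print(n, expected_positives(n, 1e-5), prob_at_least_one_positive(n, 1e-5))
```

Under this assumption, a plate at the low end of the range is expected to carry about one positive plaque, which suggests why enrichment to roughly this frequency is sought before the filter lift.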
- registration marks is made with a sterile needle to allow re-alignment of the filter and plate following development as described below. Phage plaques are allowed to develop by overnight incubation at 37°C (the presence of the filter does not inhibit this process). The filter is then removed from the plate with phage from each individual plaque adhered in situ. The filter is then exposed to a solution of BSA or other blocking agent for 1-2 hours to prevent nonspecific binding of the ligand (or "probe").
- the probe itself is labeled, for example, either by biotinylation (using commercial NHS-biotin) or direct enzyme labeling, e.g., with horseradish peroxidase (HRP) or alkaline phosphatase. Probes labeled in this manner are indefinitely stable and can be re-used several times.
- the blocked filter is exposed to a solution of probe for several hours to allow the probe to bind in situ to any phage on the filter displaying a peptide with significant affinity to the probe.
- the filter is then washed to remove unbound probe, and then developed by exposure to enzyme substrate solution (in the case of directly labeled probe) or further exposed to a solution of enzyme-labeled avidin (in the case of biotinylated probe).
- an HRP-labeled probe is detected by ECL western blotting methods (Amersham, Arlington Heights, IL), which involves using luminol in the presence of phenol to yield enhanced
- chemiluminescence detectable by brief exposure of film by autoradiography in which the exposed areas of film correspond to positive plaques on the original plate.
- positive phage plaques are identified by localized deposition of colored enzymatic cleavage product on the filter which corresponds to plaques on the original plate.
- the developed filter or film is simply realigned with the plate using the registration marks, and the "positive" plaques are cored from the agarose to recover the phage. Because of the high density of plaques on the original plate, it is usually impossible to isolate a single plaque from the plate on the first pass. Accordingly, phage recovered from the initial core are re-plated at low density and the process is repeated to allow isolation of
- the recovered cells are then plated at a low density to yield isolated colonies for individual analysis.
- the individual colonies are selected and used to inoculate LB culture medium containing
- Binding to other supports, having attached thereto a non-relevant ligand, can be used as a negative
- interactions during recovery of the phage are specific for every given peptide sequence from a plurality of proteins expressed on phage. For example, certain interactions may be disrupted by acid pH's but not by basic pH's, and vice versa.
- it may be desirable to test a variety of elution conditions including but not limited to pH 2-3, pH 12-13, excess target in competition, detergents, mild protein denaturants, urea, varying temperature, light, presence or absence of metal ions, chelators, etc.
- Some of these elution conditions may be incompatible with phage infection because they are bactericidal and will need to be removed by dialysis (i.e., dialysis bag,
- flanking sequences may also be due to the denaturation of the specific peptide region involved in binding to the target but also may be due to conformational changes in the flanking regions. These flanking sequences may also be
- flanking regions may also change their secondary or tertiary structure in response to
- elution conditions i.e., pH 2-3, pH 12-13, excess target in competition, detergents, mild protein denaturants, urea, heat, cold, light, metal ions, chelators, etc.
- the conformational deformation of the peptide responsible for binding to the target i.e., pH 2-3, pH 12-13, excess target in competition, detergents, mild protein denaturants, urea, heat, cold, light, metal ions, chelators, etc.
- TSAR libraries can be prepared and screened by: (1) engineering a vector, preferably a phage vector, so that a DNA sequence encodes a segment of Factor Xa (or Factor Xa protease cleavable peptide) and is present adjacent to the gene encoding the effector domain, e.g., the pIII coat protein gene; (2) construct and assemble the double stranded synthetic
- oligonucleotides as described above and insert into the engineered vector; (3) express the plurality of vectors in a suitable host to form a library of vectors; (4) screen for binding to an immobilized ligand; (5) wash away excess phage; and (6) treat the entire library with Factor Xa protease.
- the particle will be uncoupled from the peptide-ligand complex and can then be used to infect bacteria to regenerate the particle with its full-length pIII molecule for additional rounds of screening.
- TSAR libraries are screened with
- oligonucleotides to identify syngenes encoding binding domains specific for particular DNA sequences.
- binding is done with high levels of salts, e.g., to maximize the hydrophobic interactions that are characteristic of specific protein/DNA
- Non-specific protein/DNA interactions are generally electrostatic and can be reduced by high salt concentrations that cause saturation of the charges on the protein and DNA.
- Syngene-encoded sequences that can be used to regulate transcription are identified by screening the desired random peptide library for binding to a ligand of choice, where the ligand of choice is (or comprises) a nucleic acid sequence (transcriptional regulatory site) that regulates transcription; a transcription factor that binds to a transcriptional regulatory site and thereby regulates transcription; or a protein binding partner of the transcription factor, which binds to the transcription factor and thereby inhibits the transcription factor's binding to the transcriptional regulatory site.
- transcription factor have a great deal of similarity among themselves.
- the nucleotide sequences found in common among all the sites for a given factor are called that factor's "consensus sequence."
- the consensus sequence for NF-IL6 also known as C/EBP
- the consensus sequence for NF-κB is 5'GGGRNNYYCC (SEQ ID NO: 60) (Okamoto et al., 1994, J. Biol. Chem. 269:8582-8589).
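A consensus sequence written with degenerate base codes can be matched against candidate oligonucleotides computationally. The following is a minimal sketch in Python, assuming the standard IUPAC nucleotide codes (R = A or G, Y = C or T, N = any base). The helper names and the test sequences are illustrative and do not appear in the specification.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def consensus_to_pattern(consensus):
    """Compile a degenerate consensus into a regex that also finds overlapping sites."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # The lookahead lets finditer report matches that overlap one another.
    return re.compile("(?=(" + body + "))")


def find_sites(consensus, sequence):
    """Return (0-based start, matched site) for every match on the given strand."""
    pattern = consensus_to_pattern(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence containing one site that fits GGGRNNYYCC.
    print(find_sites("GGGRNNYYCC", "TTGGGACTTTCCAA"))
```

This sketch scans only the strand it is given; a fuller search would also scan the reverse complement.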
- the consensus sequence for GATA-1, GATA-2, and GATA-3 is 5'(T/A)GATA(G/A) (Zon et al., 1991, Proc. Natl. Acad. Sci. USA
- transcription factors are not single proteins, but are multimers composed of the products of different genes.
- the AP-1 transcription site is activated by heterodimers of c-jun and c-fos.
- the transcriptional activator may be composed of either homo- or heterodimers of members of the NF-κB family, including p65 (also known as Rel A) and p50.
- Different NF-κB sites may be activated preferentially by different combinations of p65 and/or p50 (Muroi et al., 1993, J. Biol. Chem. 268:19534-19539; Zabel et al., 1991, J.
- the activity of some transcription factors is regulated by proteins that bind to and thus inhibit the factor's activity.
- the activity of NF-κB can be inhibited by binding to the protein IF-κB (Kerr et al., 1992, Current Opin. Cell. Biol. 4:496-501).
- the activity of the AP-1 transcription factor can be inhibited by binding of the factor IP-1 (Auwerx and Sassone-Corsi, 1991, Cell 64:983-993).
- the inhibitory protein is located in the cytoplasm, where its binding to the transcription factor sequesters that factor away from the nucleus, thus preventing the factor from
- transcription factors that can be found in the cytoplasm, presumably in inactive form, include: the glucocorticoid receptor, the yeast transcription factor SWI5, and NF-AT (a transcription factor for certain lymphokines and cytokines).
- Alternative splicing may give rise to several versions of a transcription factor. Some of these versions may inhibit rather than activate transcription, and the interaction of different versions of the same factor may produce a complex pattern of regulation (Foulkes and Sassone-Corsi, 1992, Cell 68:411-414).
- a given gene may be regulated by one or more different transcription factors in a tissue or cell type specific manner. For instance, it has been demonstrated that at least three cis-elements (AP-1, C/EBP (also known as NF-IL6) and NF-κB-like sites) are involved in IL-8 gene activation (Okamoto et al., 1994, J. Biol. Chem. 269:8582-8589; Mukaida et al.,
- Transcription factors have been shown to be modular proteins. That is, they are composed of more or less discrete domains that perform more or less discrete functions. For example, most transcription factors possess a DNA binding domain that mediates binding to a transcriptional regulatory site. Also common are transcriptional activator signals (TASs). TASs are domains that, when bound to a transcriptional regulatory site of a gene (by virtue of being linked to a DNA binding domain), mediate the transcriptional activation of the gene. Another common domain found in transcription factors is a domain that permits the dimerization of the factor. The dimerization is sometimes homodimerization, sometimes
- NF-κB is a transcription factor that is a heterodimer consisting of p50 and p65 subunits
- IF-κB (or IκB) is a cytoplasmic protein that binds to and specifically inhibits NF-κB (Baeuerle and
- Free NF-κB migrates to the nucleus where it binds to the κB site in the enhancer/promoter of various genes, such as the T cell receptor β chain, interleukin-2 receptor α chain, myosin heavy chain class I, and cytokine genes such as those for beta-interferon, GM-colony stimulating factor (CSF), G-CSF, interleukin-2, and tumor necrosis factor-α and -β, thereby regulating their expression (Lenardo and Baltimore, 1989, Cell 58:227-229; Baeuerle, 1991, Biochim. Biophys. Acta 1072:63-80).
- Binding sites for the transcription factors NF-κB and NF-IL6 have been shown to be important in the regulation of a wide variety of genes that are involved in health and disease. For example, it has been suggested that derangements in the control of expression of the IL-8 gene may be involved in the pathogenesis of several inflammatory diseases (Okamoto et al., 1994, J. Biol. Chem. 269:8582-8589). An NF-κB binding site in the promoter of the IL-8 gene has been shown to be involved in the transcriptional control of the IL-8 gene (id.). The same NF-κB site has been shown to be one of the targets through which the immunosuppressant FK506 acts (id.). The ability to regulate this site through the use of syngenes
- the GATA family of transcription factors are a group of zinc finger DNA binding proteins that recognize related binding sites that contain the core sequence GATA.
- the known members of this family are GATA-1, GATA-2, GATA-3, GATA-GT1, and GATA-GT2 (Zon et al., 1991, Proc. Natl. Acad. Sci. USA 88:10642-10646; Maeda, 1994, J. Biochem. 115:6-14).
- GATA-1, GATA-2, and GATA-3 are expressed primarily in hematopoietic tissues such as the erythroid, megakaryocyte, T cell, and mast cell lineages. Gene knockout studies have shown that GATA-1 is required for erythroid development in mice (Pevny et al., 1991, Nature
- GATA-2 is found in non-hematopoietic cells such as brain, liver, and endothelial cells (Dorfman et al., 1992, J. Biol. Chem. 267:1279-1285).
- GATA-GT1 and GATA-GT2 are found in gastric parietal cells. They bind to a sequence motif in the 5' upstream regions of both the H+/K+-ATPase α and β subunit genes. GATA-GT1 and GATA-GT2 may be involved in gastric specific transcriptional regulation of many proteins (Maeda, 1994, J. Biochem. 115:6-14).
- the Ets family of transcription factors includes at least 30 different DNA binding proteins in species as evolutionarily distant as Drosophila and humans. These proteins show pronounced amino acid sequence similarities in an approximately 84 amino acid region that corresponds to their DNA binding domains. They recognize distinct but related DNA binding sites that are about ten nucleotides long and that share a common, short motif - (C/A)GGA(A/T). The specificity of binding of the individual Ets proteins is determined mainly by the other nucleotides in the binding site.
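Because the shared Ets motif is short and may lie on either strand of a double stranded site, one way to locate candidate cores is to scan a sequence together with its reverse complement. A hedged sketch in Python follows. It assumes the core motif (C/A)GGA(A/T) exactly as written above, reports 0-based start positions in forward-strand coordinates, and uses hypothetical function names and made-up test sequences.

```python
import re

# Core motif (C/A)GGA(A/T); the lookahead allows overlapping hits.
ETS_CORE = re.compile(r"(?=([CA]GGA[AT]))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ets_cores(seq):
    """Return sorted (start, strand, site) tuples for core motifs on both strands.

    start is the 0-based position of the leftmost base of the hit on the
    forward strand; site is read 5' to 3' on the strand where it was found.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in ETS_CORE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in ETS_CORE.finditer(rc):
        site = m.group(1)
        # Convert the reverse-strand offset back to forward-strand coordinates.
        start = len(seq) - (m.start() + len(site))
        hits.append((start, "-", site))
    return sorted(hits)


if __name__ == "__main__":
    print(find_ets_cores("TTCAGGAAGTAA"))  # core on the forward strand
    print(find_ets_cores("AATTCCGTT"))     # core on the reverse strand
```

A core hit alone says little about which Ets protein binds, since, as noted above, specificity is determined mainly by the other nucleotides in the binding site.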
- TCRα T cell antigen receptor α chain gene
- the invention involves the identification of syngene-encoded peptide
- nucleic acid sequences that regulate (promote or alternatively inhibit)
- transcriptional regulatory sites are preferably those situated in vivo in or near the sequence of a preselected gene, the product of which is involved in health or disease.
- transcriptional regulatory site is a sequence recognized by a transcription factor.
- Oligonucleotides containing those transcriptional regulatory sites are synthesized and are used to screen a random peptide library. In this way, peptides are found that specifically bind the transcriptional regulatory sites.
- a synthetic binding domain can be identified that specifically regulates the transcription of one or more genes of choice, but not of all the genes whose transcription is regulated by a natural transcription factor which recognizes the same transcriptional regulatory site as does the syngene-encoded binding domain.
- a binding domain that specifically regulates one or more but not all of a family of transcriptional regulatory sites (a family referring to those transcriptional regulatory sites regulated by an identical natural transcription factor), is
- HSDB highly specific DNA binding domain
- HSDBs that bind specifically to a particular transcriptional regulatory site within or near a particular gene.
- transcriptional regulatory site within a gene whose activity it is desired to regulate for therapeutic reasons. Such binding interferes with the binding of a natural transcription factor to that site and thus blocks the transcription of the preselected gene.
- the syngene product in this case would not contain a transcription activating signal (TAS).
- the site at which the syngene product binds could be a site that is naturally involved in the activation of the transcription of the preselected gene.
- the site could be a site that is not naturally involved in the transcriptional regulation of the gene.
- the invention encompasses a method for functionally identifying protein structural domains that can be combined with other functional domains to effect the production of a unique, non-naturally occurring protein that can act at specific cis-acting negative or positive transcriptional regulatory sites within eukaryotic cells.
- the invention is further directed to the syngenes encoding such a protein as well as to the uses of such a syngene.
- This invention in a particularly preferred embodiment is directed to the repression and activation of genes regulated, at least in part, by the transcriptional factors NF-κB and NF-IL6 (also known as C/EBP).
- this embodiment is directed to the genes for IL-6 and IL-8, whose products are cytokines that respond to different inflammatory signals and are regulated by both NF-κB and NF-IL6.
- regulation of the gene for the HLA class I locus, which responds to NF-κB, but not NF-IL6, is a subject of this invention.
- the site to which the encoded syngene peptide binds is an NF-IL6 site.
- the encoded syngene product binds to a protein that is a transcriptional regulator (i.e. that binds to DNA and regulates transcription).
- the encoded syngene product can bind to the NF-κB protein and thereby prevent the NF-κB protein from activating transcription.
- the encoded syngene product need not be localized to the nucleus.
- the product of the syngene functions in the cytoplasm where it binds to an endogenous binding partner of a
- the syngene product activates transcription of a
- a transcription factor inhibitor that is involved in the inhibition of transcription of the preselected gene.
- a transcription factor inhibitor is IF-κB.
- the syngenes encode a binding domain that is highly specific for the DNA binding site of a transcription factor.
- Such syngenes encode proteins containing HSDBs that are identified by screening a random peptide library for binding to a target ligand, in which the ligand is a nucleic acid that contains a binding site recognized by the transcription factor.
- HSDBs will show greater specificity for a given transcription factor binding site than does the natural transcription factor that binds to that site.
- An HSDB selectively binds a single example or a limited number of examples of a given transcription factor's binding site, from a single gene or a limited number of genes, whereas the natural transcription factor will bind to sites in additional genes.
- Transcription factor binding sites are known for a very large number of genes and are suitable for use as target ligands.
- the following tables represent examples of known transcription factors and their respective binding sites that can be used in the methods of the present invention to isolate binding domains specific for those factors or sites. Sequences encoding such binding domains could be built into syngene constructs that would be useful for the modulation of the activity of genes whose transcription is dependent upon the binding of
- Table 1 shows that a DNA binding site for the Ets-like transcription factor Fli-1 is present in the promoter of the c-myc proto-oncogene.
- C-myc has been shown to be involved in the development of a wide variety of cancers (Marcu et al., 1992, Ann. Rev. Biochem. 61:809-860; Cole, 1986, Ann. Rev. Genet. 20:361-384).
- the ability to inhibit the activity of c-myc would be expected to have utility in the treatment of those cancers in which c-myc is upregulated.
- Table 2 gives the DNA sequences recognized by several ubiquitous transcription factors. Such ubiquitous transcription factor binding sites are ideal subjects for use as targets to select highly specific binding domains for incorporation into syngenes. This is because these factors regulate a wide variety of genes in many cell types. Using the methods of the present invention as disclosed in Section 6.1 and its subsections, one can obtain syngenes that encode products that are able to
- Table 3 shows a number of transcription factors, the DNA sequences they bind to, and the genes they regulate during T cell development and
- Table 4 shows a number of transcription factors that could be used as targets to obtain syngenes that encode products that would be useful in regulating the T cell receptor.
- Table 5 shows a large number of known Ets family transcription factor binding sites. Among the genes listed in Table 5, genes such as fos (involved in growth control) and interleukin-2 (involved in the control of the immune system) are attractive
- NF-κB sites which are suitable for use as a ligand of choice are shown in Table 6 (see Muroi et al., 1993, J. Biol. Chem.
- transcription factor binding sites gel shift assays to determine if known transcription factors bind to potential transcription factor binding sites in the gene, and mutagenesis of a suspected binding site followed by assays to determine if the expression of the mutagenized gene has been affected by the
- the molecule comprising the transcriptional regulatory site that constitutes the ligand of choice used to screen the random peptide library is a site that is not known to be bound by a protein transcription factor, but which is situated close to (e.g., within 50 nucleotides of) the initiation site of the gene, and which has the ability to regulate transcription of such gene upon binding by a syngene-encoded product.
- a syngene-encoded product would be expected to sterically hinder the ordered process of transcription factor and RNA polymerase binding to this region of the gene that is necessary for transcription to occur.
- the invention provides a method for identifying a syngene that encodes a protein that binds to a ligand of interest comprising screening a random synthetic peptide library to identify a peptide that binds to a ligand of interest, in which the ligand is (1) a DNA region that functions to regulate transcription; (2) a protein that is a transcriptional regulator that functions by binding to DNA; or (3) a protein modulator that binds to and inactivates the function of a protein transcriptional regulator.
- transcription factor affords the opportunity for regulating the transcription of genes that depend upon the transcription factor for their activity. This can be done by developing syngenes that encode binding domains that are specific for either the transcription factor or its inhibitor.
- binding domains (and their encoding nucleic acids) will be identified that are capable of specifically binding the
- binding domains mimic transcription factor inhibitors.
- binding domains can function as inhibitors of
- binding domains would be identified that could function, when part of a syngene-encoded product, much in the way IF-κB
- binding domains would be identified that could be used to create IP-1-like peptides encoded by syngenes.
- binding domains would be identified that, when expressed from a syngene, would bind to the inhibitor, thus preventing it from binding to the transcription factor. This would free the
- transcription factor allowing it to migrate into the nucleus and activate transcription.
- transcription factor inhibitors could be used as activators of transcription. 5.2.2. SYNGENES FOR REGULATION OF APOPTOSIS
- Syngenes may be used to control apoptosis.
- Apoptosis, the process of programmed cell death, is seen in a wide range of organisms where it is involved in the development of the nervous system, the
- apoptosis is a normal and important part of many different biological processes, it is important for the health of an organism that apoptosis be properly regulated. It appears that some instances of disease can arise as a result of improper
- Alzheimer's and Parkinson's diseases are associated with the premature death of certain neurons (Jenner, 1989, J. Neurol. Neurosurg. Psychiatry 22-28; Kosik, 1992, Science 256:780-783); inhibition of programmed cell death may be involved in autoimmune disease
- Bcl-2 was first identified as a gene on human chromosome 18 that was involved in the t(14;18) chromosomal translocations observed in follicular B-cell lymphomas (Tsujimoto et al., 1985, Science 229:1390-1393; Bakhshi et al., 1985, Cell 41:899-906; Cleary and Sklar, 1985, Proc. Natl. Acad. Sci. USA 82:7439-7449).
- bcl-2 and bax proteins control the process of apoptosis in response to various stimuli. It has been postulated that the ratio of bcl-2 to bax determines whether a cell will initiate the program of apoptosis in response to apoptotic stimuli or will ignore such stimuli and continue to grow, or at least survive (Oltvai et al., 1993, Cell 74:609-619). A high ratio of bcl-2 to bax leads to survival; a low ratio leads to death. The interaction of bcl-2 and bax is thought to depend on the formation of heterodimers of the proteins encoded by these two genes. A study in which mutations were engineered in two domains of bcl-2 found a correlation between the mutant bcl-2
- the two domains in the human bcl-2 protein that appear to interact with bax are amino acids 136-156 and amino acids 187-202 (ibid.).
- binding domains that are specific for the bcl-2 protein.
- binding domains could also be used to make syngenes that would be useful in the control of apoptosis.
- binding domains would also be useful in making syngenes for use in regulating apoptosis.
- Although a preferred embodiment of the invention is directed to syngenes whose products bind to transcriptional regulatory sites, other embodiments of the invention encompass a plethora of uses. In particular, it should be emphasized that syngene products are in no way restricted to functioning solely in the nucleus. As discussed below, syngenes are useful for regulating activities that take place in the cytoplasm, or on the cell's surface, as well.
- a syngene product may have one or more of a variety of functional properties. Its product may be a signal transduction inhibitor. Such a signal transduction inhibitor would block the activity of a signal pathway such as that which leads from a cell membrane receptor through various intermediate kinases, phosphatases or other molecules to the nucleus, where transcription of particular genes is affected.
- An example of such a signal transduction inhibiting syngene would be a syngene the product of which binds to and blocks the function of a tyrosine kinase membrane receptor.
- Other types of signalling molecules may also be targets of the syngenes of the invention. Examples of such signalling molecules are cytoplasmic kinases (for example, src or raf), adenylyl cyclases, guanylyl cyclases and the like.
- SH2 and/or SH3 domains are involved in protein/protein interactions that are important for the signalling process (Anderson et al., 1990, Science 250:979-982). It may be that syngenes that encode a protein having binding domains specific for SH2 and/or SH3 domains will be especially useful for regulating the various signalling pathways of cells.
- the syngenes, via expression of their encoded proteins, can be used to block or down-regulate the translation of a specific mRNA.
- the syngene can be used to stabilize or destabilize a specific RNA.
- the syngene encodes a protein that incorporates a binding domain that has been selected for binding to a portion of the RNA sequence that is to be down-regulated, stabilized or destabilized.
- a syngene is created whose product mimics the structure of a known hormone receptor but that has a different DNA binding site from that of the known hormone receptor.
- the DNA binding site of the known hormone receptor is replaced with a HSDB directed against a DNA binding site that represents a transcriptional regulatory site of a preselected gene.
- the effect of the use of such a syngene is that the syngene product (which mimics the hormone receptor) allows the preselected gene to be regulated by the hormone.
- the hormone binds to the syngene product and the syngene product-hormone complex is translocated to the nucleus of a cell where it binds to the transcriptional regulatory site of the preselected gene, thus activating or repressing the gene.
- Syngenes encoding binding domains directed to cyclins or cyclin dependent kinases can be used to inhibit the progression of cells through the cell cycle.
- Syngene products containing binding domains directed to DNA polymerases and associated factors involved in DNA replication can be used to stop or retard DNA synthesis, thus stopping or retarding cell division.
- products of the present invention is the cellular trafficking machinery that directs or sorts proteins, for example, into the Golgi and to their final
- tetrapeptide KDEL targets proteins to the lumen of the endoplasmic reticulum.
- tetrapeptide YQRL (SEQ ID NO: 117), found in the membrane protein TGN38 of the trans Golgi network, has been demonstrated to be necessary and sufficient to target membrane proteins to the trans Golgi (Luzio and Banting, 1993, Trends in Biochem. Sci. 18:395-398).
- Such peptide targeting sequences are appropriate ligands for use in isolating binding domains for incorporation into syngenes.
- Syngenes can also be useful in regulating the interactions of cells with the extracellular matrix. Such interactions depend on binding between the components of the matrix and receptors on the surface of cells. One of the most important
- By interacting with specific receptors on the surface of cells, laminin serves as a bridge between the cells and the matrix (Martin and Timpl,
- binding domains that are specific for laminin or its receptor, it is possible to isolate binding domains that can be incorporated into the peptides encoded by syngenes that are useful in modulating the process of axonal or dendrite migration as well as other
- GTPases are thought to be involved in such membrane transport processes as movement of membrane vesicles from the endoplasmic reticulum (ER) through the Golgi to the plasma
- The seven classes of GTPases are:
- Sar proteins are believed to be involved in membrane vesicle budding from the ER and transport of the budded vesicles to the Golgi.
- a postulated mechanism envisions Sar proteins in their GDP-bound form interacting with a specific receptor on the transitional elements of the ER. Transitional
- nascent vesicle buds off the ER in a process accompanied by the exchange of the bound GDP for GTP. This exchange is catalyzed by a
- GEF: Sar-specific guanine nucleotide exchange factor
- GAP: Sar-specific GTPase activating protein
- Arf proteins are believed to function in a manner similar to Sar proteins. But, whereas Sar proteins are involved in vesicle budding and transport between the ER and the Golgi, Arf proteins are thought to mediate transport between compartments of the
- Sar proteins, Arf proteins, and their associated GEFs and GAPs are suitable targets for the action of syngenes.
- a random peptide library could be screened for binding domains that are specific for these proteins. Such binding domains could be
- dynamin does not seem to be involved in transport of membrane vesicles from the ER to the Golgi. Instead, dynamin seems to participate in the receptor mediated endocytotic pathway involving clathrin-coated vesicles. Mutants of dynamin that are unable to bind or hydrolyze GTP block receptor mediated endocytosis in mammalian cells (van der Bliek et al., 1993, J. Cell Biol. 122:553-563; Herskovits et al., 1993, J. Cell Biol. 122:565-578).
- dynamin may be involved in the rapid endocytosis of synaptic vesicles after exocytosis in nerve terminals (Robinson, et al., 1993, Nature 365:163-166). Syngenes containing binding domains specific for dynamin could be used to regulate the process of endocytosis.
- Rab proteins are a large class of proteins (more than 30 members have been discovered in
- Rab 1 is involved in the early phases of the secretory pathway such as vesicle budding from the ER and vesicle movement from the ER to the cis Golgi (Peter et al., 1993, J. Cell Biol. 122:1155-1168; Plutner et al., 1991, J. Cell Biol. 115:31-43; Tisdale et al., 1992, J. Cell Biol. 119:749-761;
- Rab 6 is found in a complex with a 62 kd protein in the membranes of the trans Golgi network. Antibodies to Rab 6 (or to the 62 kd protein)
- subsections show, by using the methods of the present invention, one can isolate binding domains that can distinguish between closely related ligands. One can then incorporate those binding domains into syngenes. In this way, it should be possible to make syngenes containing binding domains that are specific for a single member of the Rab family. These syngenes can then be used to modulate those aspects of membrane trafficking in which that member participates without affecting other aspects.
- heterotrimeric G proteins in some aspects of membrane trafficking.
- agonists and antagonists of Gα have been shown to affect export from the ER
- Gαi inhibits membrane budding from the trans Golgi while Gαs promotes budding (Leyte et al., 1992, Trends Cell Biol. 2:91-94;
- once a peptide from a random peptide library has been identified as a binder for a particular target ligand of interest according to the methods of the invention, it may be useful to determine what region(s) of the expressed peptide sequence is (are) responsible for binding to the target ligand. Such analysis can be conducted at two different levels, i.e., the nucleotide sequence and amino acid sequence levels.
- the inserted oligonucleotides can be cleaved using appropriate restriction enzymes and religated into the original expression vector and the expression product of such vector screened for ligand binding to identify the oligonucleotides that encode the binding region of the peptide.
- the oligonucleotides can be transferred into another vector, e.g., from phage to phagemid or to p340-lD or to pLamB plasmid.
- the newly expressed fusion proteins should acquire the same binding activity if the domain is necessary and sufficient for binding to the ligand.
- This last approach also assesses whether or not flanking amino acid residues encoded by the original vector (i.e., fusion partner) influence peptide binding in any fashion.
- the oligonucleotides can be
- the inserted oligonucleotides are subdivided into two equal halves. If the peptide domain important for binding is small, then one recombinant clone would demonstrate binding and the other would not. If neither clone shows binding, then either both halves are important or the essential portion of the domain spans the middle (which can be tested by expressing just the central region).
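The subdivision described above can be sketched in a few lines. This is only an illustrative sketch, not a procedure taken from the specification: the function name `split_insert`, the choice to cut on codon boundaries so that each fragment stays in frame, and the example insert are all assumptions made for illustration.

```python
def split_insert(insert: str) -> dict:
    """Split a coding insert into two halves and a central region.

    Assumption (not from the source): cuts are placed on codon boundaries
    so that each fragment remains in the original reading frame.
    """
    if len(insert) % 3 != 0:
        raise ValueError("insert length must be a multiple of 3 to stay in frame")
    n_codons = len(insert) // 3
    mid = (n_codons // 2) * 3      # codon-aligned midpoint
    quarter = (n_codons // 4) * 3  # codon-aligned start of the central region
    return {
        "left_half": insert[:mid],
        "right_half": insert[mid:],
        # The central region straddles the midpoint, for the case in which
        # neither half alone retains binding.
        "central": insert[quarter:quarter + mid],
    }


# Hypothetical 12-codon insert, used only to show the mechanics.
example_insert = "ATGGGTGCTTCTGGTGCTTCTAAACGTCCGGCTGCT"
fragments = split_insert(example_insert)
```

With an odd number of codons the two halves differ by one codon, so "equal halves" holds exactly only for inserts with an even codon count.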
- the binding domains can be analyzed.
- the entire peptide should be synthesized and assessed for binding to the target ligand to verify that the peptide is necessary and sufficient for binding.
- overlapping 10-mers can be synthesized, based on the amino acid sequence of the random peptide binding domain, and tested to identify those binding the ligand.
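As a minimal sketch of how a set of overlapping 10-mers might be enumerated before synthesis, the following helper walks a window along a peptide sequence. The function name, the `step` parameter, and the decision to always include the C-terminal window are assumptions for illustration; the example sequence is the nucleoplasmin NLS quoted elsewhere in this document, used here only as a convenient string and not as a binding domain.

```python
def overlapping_peptides(sequence: str, length: int = 10, step: int = 1) -> list:
    """Return overlapping peptides of a fixed length covering the sequence."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    # Make sure the C-terminal residues are covered even when the step
    # does not divide the sequence evenly.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Illustrative 17-residue sequence (SEQ ID NO: 121 as quoted in this document).
peptides = overlapping_peptides("KRPAATKKAGQAKKKKR", length=10, step=5)
```

A step of 1 gives maximal overlap (eight 10-mers for a 17-residue sequence); larger steps reduce the number of peptides to synthesize at the cost of coarser mapping.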
- linear motifs may become apparent after comparing the primary structures of different binding peptides from the library having binding affinity for a target ligand. The contribution of these motifs to binding can be verified with synthesized peptides in competition experiments (i.e., determine the concentration of peptide capable of inhibiting 50% of the binding of the phage to its target; IC50). Conversely, the motif or any region suspected to be important for binding can be removed from or mutated within the DNA encoding the random peptide insert and the altered displayed peptide can be retested for binding.
- once the binding domain of a random peptide has been identified, differently displayed binding domains can be created by isolating and fusing the binding domain of one random peptide to a new effector domain.
- the biologically or chemically active effector domain of the peptide can thus be varied.
- the binding characteristics of an individual peptide can be modified by varying the binding domain sequence to produce a related family of peptides with differing properties for a specific ligand.
- the identified random peptides can be improved by additional rounds of random mutagenesis, selection, and amplification of the nucleotide sequences encoding the binding domains.
- Mutagenesis can be accomplished by creating and cloning a new set of oligonucleotides that differ slightly from the parent sequence, e.g., by 1-10%. Selection and amplification are achieved as described above. By way of example, to verify that the isolated peptides have improved binding
- syngenes of the present invention are synthetic genes, that is, genes that are not derived from a naturally occurring genome.
- syngenes are made up of, at least in part,
- Syngenes may be composed of totally synthetic gene sequences or combinations of natural and totally synthetic gene sequences.
- Syngenes encode a syngene product which is a protein having at least one functional domain.
- the functional domain is a binding domain with affinity for a ligand. In a preferred aspect, this binding domain is characterized by 1) its strength of binding under specific conditions, 2) the stability of its binding under specific conditions, and 3) its
- syngene may encode additional functional domains, for example an amino acid sequence that directs cellular trafficking of the syngene product to the desired compartment of the cell; e.g., the
- tetrapeptide KDEL (SEQ ID NO: 116), KKXX (SEQ ID NO: 118) or KXKXX (SEQ ID NO: 119) (where X is any amino acid), to target peptides to the lumen of the
- a third functional domain may be one that enhances activity of the syngene product, for example a
- the syngenes thus optionally encode
- additional sequences besides the encoded binding peptide.
- additional sequences can be markers, linkers, transcriptional activation signals,
- the syngene preferably includes regulatory sequences that control the expression of the coding region of the syngene, nucleic acid coding sequences encoding the binding peptide, and optional sequences.
- the syngenes of the invention are created by inserting nucleotides encoding synthetic binding domains discovered by screening libraries as described herein into the context of additional protein coding sequences. This can be done by replacing the DNA binding domain of a known protein or by assembling a totally synthetic gene.
- the syngene will generally contain other non-protein coding sequences. These non-protein coding sequences will generally be involved in the control of the expression of the syngene.
- the syngene preferably contains sequences for initiating
- transcriptional processing signals such as a poly A addition signal.
- the syngene preferably also contains a translation termination signal.
- splicing of the RNA transcript encoded by the syngene may also be desirable.
- the addition of nucleotide sequences in the syngene other than those encoding the binding domain can provide for in vivo or
- the syngene comprises a promoter operably linked to the coding sequences, followed by (3') transcriptional termination and poly A addition signals.
- the encoded syngene product can contain an additional region, i.e., an amino acid sequence that functions as a linker domain between one or more of the other domains of the syngene.
- the presence or absence of the linker domain is optional, as is the type of linker that may be used.
- the syngene is preferably incorporated into a vector, for replication, and/or expression of the syngene. Any vector known in the art can be used. In addition to comprising the syngene, the vector
- syngene products are hybrid proteins.
- the amino terminus will be amino terminus.
- binding domains generally consist of a methionine residue followed by a spacer sequence of 1-15 amino acids, followed by one or more binding domains (i.e., the sequences
- the middle domain of the hybrid protein will often consist of a marker that aids in pre-clinical characterization, e.g., monitoring expression of the syngene in vitro, as in, for
- tissue culture cells. This is likely to be a readily detectable epitope or other label, e.g., β-galactosidase, luciferase or chloramphenicol acetyl transferase genes.
- This marker generally will be deleted when the syngene is used clinically.
- a cellular trafficking domain, e.g., KDEL (SEQ ID NO: 116) or YQRL (SEQ ID NO: 117)
- KDEL SEQ ID NO: 116
- YQRL SEQ ID NO: 117
- NLS nuclear localization signal
- Additional functional domains, e.g., transcriptional activation sequences or spacers, can also be included.
- the order in which the binding domains, the marker, the trafficking domain, and additional functional domains appear in the syngene may not be critical. The order disclosed above is merely illustrative of one
- the trafficking signal such as KDEL (SEQ ID NO: 116) or YQRL (SEQ ID NO: 117), or is a marker (label) domain.
- the additional functional domains can be transcriptional activation sequences, NLSs, or
- syngene product is designed to activate the transcription of a gene, then the syngene will, in addition to the domains discussed above (or instead of the marker domain), contain a
- transcriptional activation domain examples include: a homopolymeric stretch of glutamine residues, an acidic activation domain (sometimes known as a "negative noodle"), a proline-rich domain, or a serine-threonine-rich domain. It has been
- a nuclear localization signal, such as that of the nucleoplasmin gene, may be added.
- nucleoplasmin nuclear localization signal has been shown to function in a variety of contexts, where it leads to the nuclear localization of proteins in which it appears.
- Nuclear localization signals that can be used in the present invention are not limited to those specifically disclosed herein; they include any that are known in the art.
- the amino acid coding sequence of a syngene product begins with an amino-terminal spacer sequence, for example, MGASGAS (SEQ ID NO: 120), followed by a nuclear localization sequence which may be KRPAATKKAGQAKKKKR (SEQ ID NO: 121) (the nucleoplasmin NLS, see Kang et al., 1994, Proc. Natl. Acad. Sci. USA 91:340-344) followed by a second spacer sequence, for example, GASGAS (SEQ ID NO: 122), followed by the binding domain.
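The domain order described above (amino-terminal spacer, NLS, second spacer, binding domain) can be illustrated by simple concatenation of the quoted sequences. This is a sketch only: the three constants are the sequences quoted in the text, while the function name and the placeholder binding domain are hypothetical and stand in for a domain isolated by library screening.

```python
# Sequences quoted in the text above.
N_TERMINAL_SPACER = "MGASGAS"            # SEQ ID NO: 120
NUCLEOPLASMIN_NLS = "KRPAATKKAGQAKKKKR"  # SEQ ID NO: 121
SECOND_SPACER = "GASGAS"                 # SEQ ID NO: 122


def assemble_syngene_product(binding_domain: str) -> str:
    """Concatenate spacer, NLS, second spacer, and binding domain in order."""
    return N_TERMINAL_SPACER + NUCLEOPLASMIN_NLS + SECOND_SPACER + binding_domain


# Hypothetical placeholder; a real binding domain would come from screening.
PLACEHOLDER_BINDING_DOMAIN = "ACDEFGHIKL"
product = assemble_syngene_product(PLACEHOLDER_BINDING_DOMAIN)
```

Keeping each domain as a separate named constant makes it straightforward to reorder or swap domains, which matters because the text indicates that the order of domains may not be critical.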
- the amino acid coding sequence of a syngene product begins with an amino-terminal spacer sequence, for example, MGASGAS (SEQ ID NO: 120), followed
- sequences of a syngene product begins with an amino-terminal spacer sequence, followed by the binding domain, followed by a second spacer sequence, followed by the carboxy-terminal nuclear localization signal MISESLRKAIGKR (SEQ ID NO: 123) (Shieh et al., 1993, Plant Physiol. 101:353-61).
- the NLS of the p50 subunit of NF-κB, VQRLRQLM (SEQ ID NO: 124) is used (Henkel et al., 1992, Cell 68:1121-1133).
- Other nuclear localization signals which can be used as part of a syngene product include but are not limited to those found in various steroid hormone receptors, the SV40 large T antigen, or the consensus sequence therefor, as set forth in Table 7 below.
- the syngene can encode multiple copies of one or more binding domains or additionally contain multiple non-binding domains.
- the syngene can be modified at the base moiety, sugar moiety, or phosphate backbone, to stabilize the syngene, promote in vivo transport or localization, etc.
- Such modifications to bases, sugar moieties, and phosphate backbones are made in such a way that the modifications do not interfere with the transcription of the syngene, and thus do not
- the syngene may include other appended groups such as peptides, or agents
- the syngene may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil,
- hypoxanthine, xanthine
- 4-acetylcytosine
- 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
- the syngene comprises at least one modified sugar moiety selected from the group including but not limited to arabinose,
- the syngene comprises at least one modified phosphate backbone selected from the group consisting of a
- phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.
- a portion of the syngene (preferably noncoding) consists of α-anomeric nucleotides.
- An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier et al., 1987, Nucl. Acids Res. 15:6625-6641).
- Syngenes of the invention may be synthesized by standard methods known in the art, e.g. by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.).
- phosphorothioate oligonucleotides may be synthesized by the method of Stein et al. (1988, Nucl. Acids Res. 16:3209), methylphosphonate
- oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451), etc.
- the syngenes are made by recombinant DNA techniques commonly known in the art, e.g. by replicating a vector comprising the syngene in a suitable host cell.
- the syngene comprises a 2'-O-methylribonucleotide (Inoue et al., 1987, Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue et al., 1987, FEBS Lett.
- the syngene is constructed as part of an expression vector that can be introduced in vivo such that it is taken up by a cell, within which cell the vector or a portion thereof is transcribed, leading to the production of the encoded syngene product.
- a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the syngene product.
- vectors can be constructed by recombinant DNA technology methods standard in the art.
- Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells. Expression of the sequence encoding the syngene product can be by any promoter known in the art to act in mammalian, preferably human, cells.
- Such promoters can be inducible or constitutive.
- Such promoters include but are not limited to: the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell
- promoters that function in a wide variety of cell types can be used to obtain expression in a wide variety of cell types where the syngene is introduced.
- a viral promoter such as a human cytomegalovirus early gene promoter or the adenovirus major late promoter to drive the expression of the syngene.
- Such viral promoters are active in a wide variety of mammalian cell types.
- constitutive promoters are actin promoters and the PGK (phosphoglycerate kinase) promoter. Since the latter promoter is functional for high level transcription in all living mammalian cells, it is an especially preferred choice.
- tissue specific control regions which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring Harbor Symp. Quant. Biol.
- pancreatic beta cells (Hanahan, 1985, Nature 315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987, Genes and Devel.
- alpha-fetoprotein gene control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58); alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94); myelin basic protein gene control region which is active in
- oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712); myosin light chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason et al., 1986, Science
- shuttle vectors may be used such that a plasmid is grown in E. coli and purified DNA transferred into mammalian cells by the use of electroporation, calcium phosphate precipitates, DEAE dextran or with the assistance of liposomes.
- Such an exemplary shuttle vector is shown in Figure 3.
- herpes simplex virus may be used to deliver genes to the brain (Leib and Olivo, 1993, Bioessays 15:547-54), retroviruses (Salmons and Gunzburg, 1993, Human Gene Therapy 4:129-142) and adeno-associated virus (Walsh et al., 1993, Proc. Soc. Exp. Biol. Med. 204:289-300), which have the advantage of becoming integrated into the cellular DNA, may be used to deliver syngenes via transduction into hematopoietic, pluripotent stem cells and other somatic cells.
- the degeneracy of the genetic code allows for a plurality of DNA sequences to be constructed that all code for the same peptide or protein in a syngene.
- codon frequency usage such as that found in Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology, John Wiley & Sons, Inc., N.Y., at Appendix 1 A.1.6.
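To make the point about degeneracy concrete, the following sketch counts how many distinct DNA coding sequences specify a given peptide under the standard genetic code. The function and variable names are illustrative assumptions; the codon table is the standard code written in the conventional T, C, A, G ordering, and the example peptide is the spacer MGASGAS quoted elsewhere in this document. No codon usage frequencies are modeled here.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid.
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Count the distinct DNA sequences that encode the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


n_sequences = count_encodings("MGASGAS")  # spacer of SEQ ID NO: 120
```

In practice the choice among these synonymous sequences would be narrowed using codon usage tables for the intended host, such as the one cited above.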
- Syngenes that encode binding domains that bind to a DNA transcriptional regulatory region, but that do not contain a transcriptional activation signal, can be tested for their ability to inhibit the transcription of a gene by the following assay, presented by way of example but not limitation.
- the assay is carried out by using three sets of plasmids in an established cell line.
- the cell line expresses a transcription factor that has a DNA binding site that is the same as the DNA binding site of the syngene product that is to be assayed.
- the three plasmids have the following characteristics:
- Plasmid 1: This plasmid contains the syngene to be tested cloned into a vector that will direct the expression of the syngene in the cell line;
- Plasmid 2: This plasmid contains a reporter gene.
- the promoter of the reporter gene contains the DNA sequence which forms the binding site for the transcription factor expressed by the cells and for the binding domain encoded by the syngene. Normally, this DNA sequence will be essentially the same
- the reporter gene codes for the expression of any detectable marker such as, for example,
- chloramphenicol acetyl transferase (CAT),
- β-galactosidase, horseradish peroxidase, etc.
- Plasmid 3: This is a control plasmid to monitor the efficiency of transfection. It directs the expression of a second reporter gene expressing a product that can be easily assayed. The activity of the promoter that drives the expression of this product is not dependent upon binding of a
- the promoter is constitutive in the cell being used.
- it is an inducible
- the reporter gene in plasmid 3 is different from that in plasmid 2.
- Two procedures are done, consisting of a test procedure and a control procedure. In the test procedure, the three plasmids are simultaneously co-introduced into each cell. Introduction can be by any of the well known methods in the art. Such methods include electroporation, calcium phosphate mediated transfection, and liposome mediated transfection.
- introduction is by electroporation.
- plasmids 2 and 3 are co-introduced, without plasmid 1 (which contains the syngene).
- the amounts of expressed reporter gene products from plasmids 2 and 3 are measured. If the syngene product is able to compete with the transcription factor for binding to the DNA binding sequence in the promoter of the first reporter gene on plasmid 2, then the level of expression of the first reporter gene product will be low. This is because binding of the syngene product to the DNA binding sequence in the promoter of the first reporter gene will prevent binding of the transcription factor to that sequence. Therefore, the transcription factor will not be able to activate the transcription of the first reporter gene. If the syngene product is not able to compete with the transcription factor, then the level of expression of the first reporter gene product will be high. This will be the case in the control procedure, where plasmid 1, which contains the syngene, is not used.
- the level of expression of the first reporter gene product (from plasmid 2) is compared to the level of expression of the second reporter gene product (from plasmid 3) to determine the ratio of expression of the two products.
- This ratio is determined for both the test procedure with, and the control procedure without, plasmid 1 containing the syngene. When the ratio from the test procedure is less than the ratio from the control procedure, this indicates that the syngene product is competing with the transcription factor for binding to the DNA binding sequence in the promoter of the first reporter gene. When the ratio from the test procedure is equal to the ratio from the control procedure, this
- a syngene product can inhibit the transcription of a gene the transcription of which depends upon binding of the transcription factor NF-κB to its DNA binding site
- a human cell line that expresses NF-κB (e.g., Jurkat, a T cell line; Okamoto et al., 1994, J. Biol. Chem. 269:8582-8587).
- the first plasmid contains the syngene which is driven by the adenovirus major late promoter or by some other promoter that is active in the cells being used.
- the second plasmid contains the first reporter gene which consists of the H-2Kb promoter (Baldwin and Sharp, 1988, Proc. Natl. Acad. Sci. USA 85:723-727) directing the expression of the CAT gene.
- the H-2Kb promoter is dependent upon NF-κB binding for its activity.
- the control plasmid consists of the luciferase gene (the second reporter gene) driven by a promoter that does not depend upon NF-κB for its expression.
- the three plasmids are introduced into the cells via electroporation.
- If the syngene that is being assayed expresses a binding domain that is specific for the NF-κB binding site of the H2κb gene but does not also contain a transcriptional activation site, then the ratio of CAT activity to luciferase activity will be low when compared to the same ratio obtained when no syngene, or a control syngene that does not encode a binding site that is specific for the NF-κB binding site, is used. This is because the syngene that encodes a binding domain that specifically binds the NF-κB binding site will bind the NF-κB site in the reporter plasmid containing the CAT gene. This will prevent binding of the cell's endogenous NF-κB transcription factor to that site, thus preventing NF-κB from activating the promoter of the reporter gene.
- To test for activation, a cell line must be used that does not express the transcription factor whose binding site is the same as that of the syngene product. Because the cell line does not express the transcription factor, the first reporter gene on the second plasmid will not be transcribed in the absence of a syngene product. If the syngene product is capable of binding to the DNA binding sequence in the promoter of the first reporter gene and is also capable of activating transcription, the reporter gene on the second plasmid will be expressed. In this assay, the syngene to be tested will preferably encode a transcriptional activation signal in addition to a binding domain.
- the ratio of first reporter gene product to second reporter gene product is compared for the test and control procedures (with and without the syngene).
- the presence in the test procedure of a syngene that encodes a product that is capable of binding to the promoter of, and activating the transcription of, the first reporter gene will result in a larger value for the ratio in the case of the test procedure as compared to the control procedure.
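
The comparison of reporter ratios described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `tolerance` band used to treat two ratios as "equal" are hypothetical and are not specified in the text, and the input numbers in the usage note are invented.

```python
def reporter_ratio(first_reporter: float, second_reporter: float) -> float:
    """Ratio of first reporter signal (e.g., CAT) to second reporter signal (e.g., luciferase)."""
    if second_reporter <= 0:
        raise ValueError("second reporter signal must be positive")
    return first_reporter / second_reporter


def classify_syngene(test_ratio: float, control_ratio: float, tolerance: float = 0.2) -> str:
    """Compare the ratio obtained with the syngene (test) to the ratio without it (control).

    `tolerance` is a hypothetical relative band within which the two ratios
    are treated as equal; the source text does not give a numeric criterion.
    """
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    relative_change = (test_ratio - control_ratio) / control_ratio
    if relative_change < -tolerance:
        return "inhibition"   # test ratio lower than control: competition for the site
    if relative_change > tolerance:
        return "activation"   # test ratio higher than control: binding plus activation
    return "no effect"
```

For instance, with invented readings, `classify_syngene(reporter_ratio(20, 100), reporter_ratio(80, 100))` returns `"inhibition"`, which corresponds to the case in which the syngene product competes with the transcription factor.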
- a cell line is used that substantially does not express NF-κB.
- HeLa cells (Scheinman et al., 1993, Mol. Cell. Biol. 13:6089-6101) express only low levels of endogenous NF-κB and thus can be used.
- F9 embryonal carcinoma cells can be used.
- the same three plasmids as above are co-transfected into this cell line. If the syngene product contains a binding domain specific for the NF-κB site as well as a transcriptional activation domain, the ratio of CAT activity to luciferase activity will be high when compared to the same ratio obtained when no syngene, or a control syngene product that lacks a binding domain specific for the NF-κB site, is used.
- When no syngene, or a control syngene, is used, CAT activity will be very low because the cell, lacking NF-κB activity, will transcribe the H2κb promoter (which drives expression of the CAT gene) at a very low rate. Thus, there will be very little CAT activity unless the syngene to be assayed binds to the NF-κB site on the H2κb promoter and activates expression of the CAT gene.
- a cell line that does not express a transcription factor of interest may be made by methods known in the art.
- the gene for the transcription factor of interest may be inactivated by promoting homologous recombination of that gene with a non-functional copy of itself (Koller and Smithies, 1989, Proc. Natl.
- This vector consists of an adenovirus major late promoter driving a gene cassette with the adenovirus tripartite leader sequence and a hemoglobin splice donor and acceptor followed by the desired coding sequences followed by the
- this vector has the pBR322 replicon and an antibiotic resistance marker for growth in E. coli.
- the DNA sequences that define the binding sites of many transcription factors differ somewhat from gene to gene.
- the NF-κB binding site in the HLA class I genes differs from the NF-κB site in the IL-6 gene or the IL-8 gene
- IL-6 κB and IL-8 κB are recognized by slightly different versions of the transcription factors
- the same transcription factor recognizes the somewhat different versions of its binding site.
- the use of the syngenes of the present invention affords the opportunity for achieving a level of regulation of the different sites that surpasses that which could be achieved by the use of the natural transcription factor. This is because it is possible to isolate highly specific DNA binding domains from random synthetic peptide libraries according to the present invention that bind specifically to one member of the related binding sites of a given transcription factor but that do not bind to the other sites.
- the invention also provides methods for use of the syngenes, e.g., in diagnosis and therapy of various disorders.
- the present invention vastly expands the number of genes that are available for use in gene therapy.
- the present invention provides methods for finding synthetic nucleic acids encoding proteins with diagnostic or therapeutic value, or value in in vitro assays.
- the present invention further provides methods for delivery of such peptides or proteins in vivo by expression in vivo from the administered syngene.
- the present invention provides methods for using synthetic peptides or proteins, and the genes encoding them, to fulfill roles that naturally
- syngene a nucleic acid that encodes the peptide.
- a syngene can be used, e.g., in gene therapy.
- the present invention provides methods for identifying syngenes that inhibit or enhance the transcriptional activity of a wide variety of naturally occurring genes, preferably with a specificity not found in natural systems.
- syngenes can be used to effect differential regulation of closely related transcriptional regulatory sites as described in Section 5.7.
- Syngene products may bind to actin and thus be of use in regulating mitotic spindle formation and cell division.
- Syngenes are also useful for modulating processes that occur outside of the nucleus.
- the encoded protein of a syngene, produced by intracellular expression of the syngene, may be used to affect the activity of
- inhibitors of transcription factors such as IF-κB. They may also be used to modulate signal transduction pathways, metabolic pathways, RNA translation, and intracellular trafficking. In the cell membrane, syngenes may be used to modulate the activity of membrane receptors, ion channels, or exocytotic and endocytotic pathways.
- syngenes, via expression of their encoded proteins, may be used to regulate cell/cell signalling and transcytosis.
- Cell/cell junctions and the extracellular matrix are appropriate targets for syngenes.
- Syngenes may be used to regulate cell adhesion or cell/cell recognition.
- syngenes may be used to regulate the activity of receptor ligands.
- Syngenes can be used in any appropriate method of gene therapy, as would be recognized by those in the art upon considering this disclosure.
- the resulting action of a syngene in the gene therapy patient can, for example, lead to the activation or inhibition of a preselected gene in the patient, thus leading to improvement of the diseased condition afflicting the patient.
- Methods of gene therapy are detailed in Section 5.8.1.
- the syngenes may be introduced into appropriate host cells and thereby used for the recombinant production of their encoded peptides.
- the peptides thus produced can be used, e.g., in in vitro assays to detect and/or quantitate the ligand of choice to which they bind, for
- the invention relates to the peptide products of syngenes and their therapeutic and diagnostic uses.
- the syngene-encoded peptides can be used therapeutically and diagnostically, to
- syngene products can be used in in vitro binding assays, to detect and/or measure amounts of their binding partners
- syngene products can be used in vivo, e.g., labeled with an appropriate marker, to image their binding partners.
- gene therapy refers to therapy performed by the administration of a nucleic acid to a subject.
- the nucleic acid, either directly or indirectly via its encoded protein, mediates a therapeutic effect in the subject.
- syngenes of the present invention can be used in any of the methods for gene therapy available in the art.
- the descriptions below are meant to be illustrative of such methods. It will be readily understood by those of skill in the art that the methods illustrated represent only a sample of all available methods of gene therapy.
- Delivery of the syngene into a patient may be either direct, in which case the patient is directly exposed to the syngene or syngene-carrying vector, or indirect, in which case cells are first transformed with the syngene in vitro, then
- the syngene is directly administered in vivo for therapeutic effect, whereby it is expressed to produce the syngene
- This can be accomplished by any of numerous methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid expression vector and administering it so that it becomes intracellular, e.g., by infection using a defective or attenuated retroviral or other viral vector (see U.S. Patent No. 4,980,286), or by direct injection of naked DNA, or by use of microparticle bombardment (e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface receptors or transfecting agents,
- a syngene-ligand complex can be formed in which the ligand comprises a fusogenic viral peptide to disrupt
- the syngene can be targeted in vivo for cell specific uptake and expression, by targeting a specific receptor (see, e.g., PCT Publications WO 92/06180 dated April 16, 1992 (Wu et al.); WO 92/22635 dated December 23, 1992 (Wilson et al.); WO 92/20316 dated November 26, 1992 (Findeis et al.); WO 93/14188 dated July 22, 1993
- a retroviral vector is a retrovirus that has been modified to incorporate a preselected gene in order to effect the expression of that gene. It has been found that many of the naturally occurring DNA sequences of retroviruses are dispensable in retroviral vectors. Only a small subset of the naturally occurring DNA sequences of retroviruses is necessary. In general, a retroviral vector must contain all of the cis-acting sequences necessary for the packaging and integration of the viral genome. These cis-acting sequences are: a) a long terminal repeat (LTR), or portions thereof, at each end of the vector;
- the gene to be used in gene therapy is cloned into the vector, which facilitates delivery of the gene into a patient.
- More detail about retroviral vectors can be found in Boesen et al., 1994, Biotherapy 6:291-302, which describes the use of a retroviral vector to deliver the mdr1 gene to hematopoietic stem cells in order to make the stem cells more resistant to
- Adenoviruses are also of use in gene
- Adenoviruses are especially attractive vehicles for delivering genes to respiratory
- Adenoviruses naturally infect respiratory epithelia where they cause a mild disease.
- Other targets for adenovirus-based delivery systems are liver, the central nervous system, endothelial cells, and muscle.
- Adenoviruses have the advantage of being capable of infecting non-dividing cells.
- AAV adeno-associated virus
- Another approach to gene therapy involves transferring a gene to cells in tissue culture by such methods as electroporation, lipofection, calcium phosphate mediated transfection, or viral infection.
- the method of transfer includes the transfer of a selectable marker to the cells. The cells are then placed under selection to isolate those cells that have taken up and are expressing the transferred gene. Those cells are then delivered to a patient.
- the syngene is introduced into a cell prior to administration in vivo of the resulting recombinant cell.
- introduction can be carried out by any method known in the art, including but not limited to transfection,
- the technique should provide for the stable transfer of the syngene to the cell, so that the syngene is expressible by the cell and preferably heritable and expressible by its cell progeny.
- the resulting recombinant cells can be delivered to a patient by various methods known in the art.
- epithelial cells are injected, e.g., subcutaneously.
- recombinant skin cells may be applied as a skin graft onto the patient.
- Recombinant blood cells e.g., hematopoietic stem or progenitor cells
- the amount of cells envisioned for use depends on the desired effect, patient state, etc., and can be determined by one skilled in the art.
- Cells into which a syngene can be introduced for purposes of gene therapy encompass any desired, available cell type, and include but are not limited to epithelial cells, endothelial cells, keratinocytes, fibroblasts, muscle cells, hepatocytes; blood cells such as T lymphocytes, B lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes; various stem or progenitor cells, in particular hematopoietic stem or progenitor cells, e.g., as obtained from bone marrow, umbilical cord blood, peripheral blood, fetal liver, etc.
- the cell used for gene therapy is autologous to the patient.
- a syngene is introduced into the cells such that it is expressible by the cells or their progeny, and the recombinant cells are then administered in vivo for therapeutic effect; stem and progenitor cells are preferred for use. Any stem and/or progenitor cells which can be isolated and maintained in vitro can potentially be used in
- Such stem cells include but are not limited to hematopoietic stem cells (HSC), stem cells of epithelial tissues such as the skin and the lining of the gut, embryonic heart muscle cells, liver stem cells (PCT Publication WO 94/08598, dated April 28, 1994), and neural stem cells (Stemple and Anderson, 1992, Cell 71:973-985).
- ESCs Epithelial stem cells
- keratinocytes can be obtained from tissues such as the skin and the lining of the gut by known procedures (Rheinwald, 1980, Meth. Cell Bio. 21A:229). In stratified epithelial tissue such as the skin, renewal occurs by mitosis of stem cells within the germinal layer, the layer closest to the basal lamina. Stem cells within the lining of the gut provide for a rapid renewal rate of this tissue.
- ESCs or keratinocytes obtained from the skin or lining of the gut of a patient or donor can be grown in tissue culture
- If the ESCs are provided by a donor, a method for suppression of host versus graft reactivity (e.g., irradiation, drug or antibody administration to promote moderate immunosuppression) can also be used.
- any technique which provides for the isolation, propagation, and maintenance in vitro of HSC can be used in this embodiment of the invention. Techniques by which this may be accomplished include (a) the isolation and establishment of HSC cultures from bone marrow cells isolated from the future host, or a donor, or (b) the use of previously established long-term HSC cultures, which may be allogeneic or
- Non-autologous HSC are used preferably in conjunction with a method of suppressing
- human bone marrow cells can be obtained from the posterior iliac crest by needle aspiration (see, e.g., Kodo et al., 1984, J. Clin. Invest. 73:1377-1384).
- the HSCs can be made highly enriched or in substantially pure form. This enrichment can be accomplished before, during, or after long-term culturing, and can be done by any techniques known in the art.
- Long-term cultures of bone marrow cells can be established and maintained by using, for example, modified Dexter cell culture techniques (Dexter et al., 1977, J. Cell Physiol.
- the syngene to be introduced for purposes of gene therapy comprises an inducible promoter operably linked to the coding region, such that expression of the syngene is controllable by controlling the presence or absence of the appropriate inducer of transcription.
- the invention provides methods of treatment by administration to a subject of an effective amount of a pharmaceutical (therapeutic) composition comprising a syngene.
- a syngene envisioned for therapeutic use is referred to hereinafter as a “Therapeutic” or “Therapeutic of the invention.”
- “Therapeutic” or “Therapeutic of the invention” shall also be used to refer to molecules comprising the encoded peptide products of syngenes, e.g., where a molecule comprising the peptide binding domain encoded by a syngene is used therapeutically.
- the Therapeutic is
- the subject is preferably an animal, including but not limited to animals such as cows, pigs, horses, chickens, cats, dogs, etc., and is preferably a mammal, and most preferably human.
- Formulations and methods of administration that can be employed when the Therapeutic comprises a syngene are described in Section 5.8.1; additional appropriate formulations and routes of administration can be selected from among those described
- a Therapeutic of the invention e.g., encapsulation in liposomes, microparticles, microcapsules, recombinant cells containing the
- introduction include but are not limited to intradermal, intramuscular, intraperitoneal,
- the compounds may be administered by any convenient route, for example by infusion or bolus injection, by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal and intestinal mucosa, etc.)
- Administration can be systemic or local.
- intraventricular injection may be facilitated by an intraventricular catheter, for example, attached to a reservoir, such as an Ommaya reservoir.
- it may be desirable to utilize liposomes targeted via antibodies to specific identifiable cell surface antigens.
- the Therapeutic comprises a syngene that is part of an expression vector that expresses the syngene in a suitable host.
- a syngene has a promoter operably linked to the syngene coding region, said promoter being inducible or constitutive.
- the syngene is a nucleic acid molecule in which the syngene coding sequences and any other desired sequences are flanked by regions that promote homologous recombination at a desired site in the genome, thus providing for intrachromosomal expression of the syngene (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935; Zijlstra et al., 1989, Nature 342:435-438).
- it may be desirable to administer the Therapeutics of the invention locally to the area in need of treatment; this may be achieved by, for example, and not by way of limitation, local infusion during surgery, topical application, e.g., in conjunction with a dressing after surgery, by injection, by means of a catheter, by means of a suppository, or by means of an implant, said implant being of a porous, non-porous, or gelatinous material, including membranes, such as sialastic membranes, or fibers.
- the present invention provides
- compositions comprise a therapeutically effective amount of a
- Such a carrier includes but is not limited to saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations thereof.
- the carrier and composition can be sterile.
- formulation should suit the mode of administration.
- the composition can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.
- the composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder.
- the composition can be formulated as a suppository, with traditional binders and carriers such as triglycerides.
- Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, etc.
- the composition is formulated in accordance with routine procedures as a pharmaceutical composition adapted for intravenous administration to human beings.
- compositions for intravenous administration are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the injection.
- the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent.
- Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water or saline.
- an ampoule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.
- the Therapeutics of the invention can be formulated as neutral or salt forms.
- Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.
- the invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the
- Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of
- TSAR library is utilized; however, to those skilled in the art, it will be apparent that other random peptide display libraries may be used.
- An example of a TSAR library is the TSAR-9 library disclosed in Kay et al., 1993, Gene 128:59-65.
- TSAR-9 constructs display a peptide of about 38 amino acids in length having 36 totally random positions.
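
As a rough sense of scale for 36 totally random positions, the short calculation below counts the distinct peptide sequences that are theoretically possible. It assumes (this is not stated in the text) that each random position can be any of the 20 standard amino acids; the number of clones actually present in a physical library is necessarily far smaller.

```python
# Assumption: each of the 36 random positions may hold any of the 20 standard amino acids.
AMINO_ACIDS = 20
RANDOM_POSITIONS = 36

theoretical_sequences = AMINO_ACIDS ** RANDOM_POSITIONS

# Prints the count in scientific notation (roughly 6.9e46).
print(f"{theoretical_sequences:.3e}")
```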
- H2κB-U 5' gggtGGGGATTCCCCatct
- H2κB-L 5' agatGGGGAATCCCCaccc (SEQ ID NO: 136)
- the oligonucleotide formed by hybridization of H2κB-U and H2κB-L is called the H2κB oligonucleotide.
- the upper case sequence is the known NF-κB binding site (see Baldwin and Sharp, 1988, Proc. Natl. Acad. Sci. USA 85:723-727). This particular sequence is homologous to the NF-κB regulatory domain of the murine homologue of the human HLA class I gene.
- the murine sequence is from the H-2k (d) gene and is in the databases as GenBank accession X01815.
- the upper case NF-κB site is completely conserved in human HLA class I genes such as HLA-B27 (GenBank M12967) and HLA-J (GenBank M80470) and HLA-A2 (GenBank K02883).
- IL-6 is an attractive target for regulation by syngenes since it has been shown to be involved in immune responses and the acute phase protein responses (Kishimoto, 1989, Blood 74:1-10; Clark, 1989, Ann.
- IL6κB-U 5' atgtGGGATTTTCCcatg
- IL6κB-L 5' catgGGAAAATCCCacat
- For the NF-κB site in the IL-8 gene, the following oligonucleotides are used:
- IL8κB-U 5' atcgTGGAATTTCCtctg
- IL8κB-L 5' cagaGGAAATTCCAcgat
- For the NF-IL6 site found in the IL6 gene, the following oligonucleotides are synthesized:
- NFIL6-U 5' acgtcATTGCACAAtctt
- NFIL6-L 5' aagaTTGTGCAATgacgt
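
Each upper (-U) and lower (-L) oligonucleotide listed above is meant to hybridize with its partner, which requires the lower strand to be the reverse complement of the upper strand. The sketch below checks this for the four pairs as printed; the dictionary keys (written with a plain "k") and the function names are illustrative labels, not identifiers from the source.

```python
# Complement table that preserves the upper/lower case used to mark the binding site.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# (upper strand, lower strand), both written 5' to 3' as in the listing above.
OLIGO_PAIRS = {
    "H2kB": ("gggtGGGGATTCCCCatct", "agatGGGGAATCCCCaccc"),
    "IL6kB": ("atgtGGGATTTTCCcatg", "catgGGAAAATCCCacat"),
    "IL8kB": ("atcgTGGAATTTCCtctg", "cagaGGAAATTCCAcgat"),
    "NFIL6": ("acgtcATTGCACAAtctt", "aagaTTGTGCAATgacgt"),
}


def pairs_anneal(pairs: dict) -> dict:
    """Map each pair name to True when the lower strand is the reverse complement of the upper."""
    return {name: reverse_complement(upper) == lower for name, (upper, lower) in pairs.items()}


print(pairs_anneal(OLIGO_PAIRS))
```

All four pairs come out fully complementary, so each duplex is blunt-ended with the upper case binding site in the middle.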
- the ligated fragments of DNA can be analyzed by gel electrophoresis with visualization of ethidium bromide stained DNA using short wave UV light to determine the average size of a ligated fragment and thus, the average number of cis-elements per fragment. For the experiments carried out below, it is optimal to have either one site per molecule (unligated fragments) or 10 sites per molecule. Other numbers of sites are less desirable. Fragments of the appropriate size can be recovered by gel electrophoresis or velocity sedimentation centrifugation techniques.
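
The step from average fragment size on the gel to average number of cis-elements per fragment is a simple division. The sketch below assumes that the duplex oligonucleotides are joined end to end with nothing in between, so that each repeat unit is one oligonucleotide length (18 or 19 bp for the sequences in this section) and carries one site; the function names are hypothetical.

```python
def sites_per_fragment(fragment_bp: float, unit_bp: int) -> float:
    """Average number of binding sites in a ligated fragment of the given size.

    Assumes one site per oligonucleotide unit and direct end-to-end ligation.
    """
    if unit_bp <= 0:
        raise ValueError("unit length must be positive")
    return fragment_bp / unit_bp


def expected_fragment_bp(n_sites: int, unit_bp: int) -> int:
    """Fragment size expected for a concatemer carrying n_sites sites."""
    return n_sites * unit_bp
```

Under these assumptions, the preferred 10-site fragments would run at about 190 bp for a 19-bp unit (`expected_fragment_bp(10, 19)`) and about 180 bp for an 18-bp unit.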
- Phage from a TSAR library that bind to specific DNA sequences are isolated by a process of screening called panning.
- Universal Covalent multiwell plates obtained from Costar (Cambridge, MA), or similar multiwell plates, are utilized.
- the specific DNA target is immobilized directly to the wells in a procedure supplied by the manufacturer.
- Costar's Universal Covalent surface is intended to covalently immobilize biomolecules via an abstractable hydrogen using UV illumination resulting in a carbon-carbon bond. Although the linkage is non-specific and does not allow for site-directed orientation of a biomolecule, this surface is very useful for the immobilization of double stranded DNA.
- a 100 µl DNA sample (10 µg/ml in 50 mM Na Acetate buffer) is added to each of the appropriate wells and incubated at room temperature for 1 hr. The solution is removed and the DNA covalently immobilized to the surface by exposing the plate to UV light for the time determined to be optimal in the UV Intensity
- One way in which panning can be carried out is as follows. Aliquots of phage are first placed into ELISA plate wells coated only with streptavidin. This removes those phage capable of binding to streptavidin. Then the aliquots are transferred to wells coated with target DNA
- a biotin linked to streptavidin bound to the surface of an ELISA plate well Preferably, Reacti-Bind brand streptavidin coated strip plates, cat. # 15124 are used. These are obtainable from Pierce, 3747 North Meridian Road, Rockford, IL.
- E.I.A./R.I.A. multi-well plates (cat. #3590). All wells are coated with streptavidin using 1 µg/ml in 100 mM Na Bicarbonate, pH 8.4 buffer. 50 µl are added to each well and incubated at 4°C overnight or 22°C for 4 hrs. The streptavidin solution is pipetted away, and the remaining binding surfaces are blocked with 300 µl/well of a solution containing 5 mg/ml BSA in 100 mM Na Bicarbonate, pH 8.4 buffer for 1 hr at 37°C or overnight at 4°C.
- the wells on a coated 96-well multi-well plate are labelled in pairs. Two wells are used for each specific DNA target, one well for each target is labelled "A" and the other is labeled "B".
- the wells are washed with a Wash Buffer composed of 1X phosphate buffered saline (PBS), 0.1% Tween® 20 and 0.1% bovine serum albumin (BSA). The washing is repeated three times.
- oligonucleotide fragments (specific target DNA) are added to wells labelled "B" using 50 µl of a DNA solution (20 µg/ml) in 1X PBS buffer.
- the streptavidin is allowed to capture the DNA for 2 hrs at room temperature.
- the DNA solutions are withdrawn with a pipette and discarded. All non-specifically bound DNA is washed away using wash buffer described above.
- Elution of DNA binding phage is done by the addition of 60 µl 50 mM glycine-HCl pH 2.0 buffer for 10 minutes at 65°C. The eluted phage are transferred to a 60 µl solution of 200 mM sodium phosphate, pH 7.4 to neutralize.
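
The volumes and concentrations quoted in the coating and elution steps translate into per-well amounts and final buffer concentrations by simple arithmetic. The sketch below works these out; the helper names are hypothetical, and the calculation assumes that volumes are additive and that the whole eluate is transferred.

```python
def mass_ug(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Mass (µg) contained in volume_ul microlitres of a solution at conc_ug_per_ml."""
    return conc_ug_per_ml * volume_ul / 1000.0


def conc_after_mixing(stock_mM: float, stock_ul: float, added_ul: float) -> float:
    """Concentration (mM) of a component after its solution is mixed with another volume."""
    return stock_mM * stock_ul / (stock_ul + added_ul)


streptavidin_per_well = mass_ug(1, 50)           # 1 µg/ml coating solution, 50 µl per well
target_dna_per_well = mass_ug(20, 50)            # 20 µg/ml DNA solution, 50 µl per well
glycine_final = conc_after_mixing(50, 60, 60)    # 60 µl eluate into 60 µl phosphate
phosphate_final = conc_after_mixing(200, 60, 60)

print(streptavidin_per_well, target_dna_per_well, glycine_final, phosphate_final)
```

On these figures each well is offered 0.05 µg of streptavidin and 1 µg of target DNA, and the neutralized eluate is 120 µl containing 25 mM glycine and 100 mM sodium phosphate.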
- One round of panning is now complete.
- This population of phage is not a pure population of phage that bind specific DNA target sequences, but rather a population that is substantially enriched for phage that bind to the target sequences.
- four sublibraries are being isolated, one that binds to a NFIL6 site and three that have specificity toward different NF-κB sites. See Section 6.1.1 for a description of the oligonucleotide target ligands for these sublibraries. These sub-libraries are composed of "DNA binding phage" or DB phage.
- Each sublibrary of DB phage is amplified, titered and stored, e.g., as described in Kay et al., 1993, Gene 128:59-65.
- To obtain phage that have the highest affinity for each of the DNA binding targets, three more rounds of panning are carried out in substantially the same fashion as described above. However, since the population of DNA binding phage has been significantly enriched, there are many copies of each unique phage in each 10^11 aliquot of the amplified sublibrary. Three rounds of additional panning toward each target are carried out without additional amplification of the phage in between adsorption and elution. After the final elution and neutralization, the phage are used to infect E. coli strain DH5αF' and plated out to obtain substantially pure plaques.
- the TSAR insert in each phage is sequenced by the usual methods, e.g., Sanger dideoxy sequencing. Generally one obtains several identical or very similar phage in each group of 24 from one panning experiment.
- the translation of the DNA sequence (syngene) encoding the TSAR domain within the phage is the peptide sequence responsible for the binding of a given phage to the DNA target sequence.
- One isolate for each unique peptide encoding sequence is utilized for further characterization of its DNA binding properties.
- the selectivity of a given phage for a target sequence can be determined by any of several means.
- One method involves the carrying out of a "phage" ELISA utilizing an ELISA kit from Pharmacia
- the phage to be tested is grown in appropriate cells for 6 hrs to overnight in 2XYT at 37°C; 2. The cells are spun down and the
- This supernatant is collected. This supernatant can be stored at 4°C for several days;
- the binding selectivity for a given DNA target sequence of any unique TSAR phage can be rapidly determined by bringing aliquots (10^8 phage particles) of a particular phage into contact with specific target DNAs immobilized in wells of a
- a phage that binds to only one target sequence is a highly specific DNA binding phage (HSDB phage).
- a phage that binds to more than one target sequence is a cross-specific DNA binding phage.
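
The two definitions above amount to counting how many target sequences a phage binds. A minimal sketch follows, assuming that "binds" is scored as an ELISA signal at least `min_fold` times the background signal; the function names, the threshold and the readings in the usage note are hypothetical choices, not values from the text.

```python
def bound_targets(signals: dict, background: float, min_fold: float = 2.0) -> list:
    """Targets whose ELISA signal is at least min_fold times the background signal."""
    if background <= 0:
        raise ValueError("background signal must be positive")
    return [target for target, signal in signals.items() if signal / background >= min_fold]


def classify_phage(signals: dict, background: float, min_fold: float = 2.0) -> str:
    """Label a phage by the number of target sequences it binds."""
    hits = bound_targets(signals, background, min_fold)
    if not hits:
        return "non-binder"
    if len(hits) == 1:
        return "HSDB"            # highly specific DNA binding phage
    return "cross-specific"      # binds more than one target sequence
```

With invented absorbance readings such as `{"H2kB": 1.2, "IL6kB": 0.11, "IL8kB": 0.10, "NFIL6": 0.12}` and a background of 0.1, `classify_phage` returns `"HSDB"`.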
- characterization of a given peptide displaying phage's ability to bind to a particular target sequence can be determined by carrying out so-called competition experiments routinely used by those skilled in immunoassays.
- an aliquot of phage is first brought into contact with diverse pairs of non-biotinylated oligonucleotides under conditions and for a time sufficient to allow any binding to occur. Subsequently each aliquot is added to a well of a multi-well plate coated with a specific DNA target sequence. If the unlabeled competitor DNA is unable to bind to the TSAR domain of the phage, then the phage may bind to the specific DNA target sequence immobilized in the well.
- the phage bound in the well can be detected and measured by any convenient technique, e.g., use of the horseradish peroxidase conjugated sheep anti-M13 IgG and ABTS ® of the phage ELISA described above. If the unlabeled DNA fragment is able to bind to the TSAR domain in the phage, then that phage will be unable to bind to the target DNA immobilized in the well. By varying the concentration of unlabeled DNA in solution one can estimate the relative affinity of a given phage TSAR domain for a specific DNA target as well as make a determination about the actual specificity of a TSAR domain for a specific DNA site.
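
One common way to turn such a competitor titration into a relative affinity estimate is to find the competitor concentration at which the phage signal falls to half of its uninhibited value. The sketch below does this by linear interpolation on a log concentration scale. It is an assumption about how the data might be analyzed, not a procedure given in the text, and the function name and the data in the usage note are hypothetical.

```python
import math


def half_max_competitor_conc(concs, signals, uninhibited):
    """Estimate the competitor concentration that halves the ELISA signal.

    concs: positive competitor concentrations (any consistent unit).
    signals: ELISA signal measured at each concentration.
    uninhibited: signal measured with no competitor present.
    Returns None if the signal never crosses half of the uninhibited value.
    """
    half = uninhibited / 2.0
    points = sorted(zip(concs, signals))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= half >= s_hi and s_lo != s_hi:
            # Interpolate linearly between the two flanking points in log10(concentration).
            frac = (s_lo - half) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None
```

For example, with invented data `concs = [1, 10, 100, 1000]`, `signals = [1.0, 0.8, 0.2, 0.05]` and `uninhibited = 1.0`, the estimate falls between 10 and 100. A competitor that reaches half-maximal inhibition at a lower concentration is, on this reading, bound more tightly by the TSAR domain.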
- H2κB-L H2κB oligonucleotide, see Section 6.1.1 above
- the R26 library was screened and panned as described above using "Universal Covalent" multiwell plates with the H2κB oligonucleotide.
- Four phage clones were isolated that exhibited specific binding to the H2κB oligonucleotide. To examine further the specificity of binding of these clones, they were tested in a phage ELISA (as described above) against the H2κB oligonucleotide and three other oligonucleotides. The other oligonucleotides that were tested against these four phage were: (1) the oligonucleotide formed by IL6κB-U and IL6κB-L (IL6κB oligonucleotide); (2) the oligonucleotide formed by IL8κB-U and IL8κB-L (IL8κB oligonucleotide); (3) the oligonucleotide formed by NFIL6-U and NFIL6-L (NFIL6 oligonucleotide, see Section 6.1.1).
- Figure 4 shows the results of these ELISAs.
- Clones 1, 2, and 6 all showed strong binding to the H2κB oligonucleotide as compared to their binding to background (the BSA coated plates).
- Clone 5 showed less strong, but still about 2.5-fold greater, binding to the H2κB oligonucleotide than to background.
- This binding on the part of clones 1, 2, 5, and 6 can be contrasted with the binding of m663 and a randomly selected phage from the R26 library, both of which bound the BSA coated plates about as well as they bound the plates coated with the H2κB oligonucleotide.
- This is seen by the much higher ratios of binding to the oligonucleotide coated plates versus binding to the BSA coated plates when the oligonucleotide was the H2κB oligonucleotide rather than any of the other three oligonucleotides.
- oligonucleotide sequences of the random inserts of clones 1, 2, 5, and 6 were determined.
- clone 1 is called H2κB-1 and clone 2 is called H2κB-2.
- NFIL6 oligonucleotide formed by the sequences NFIL6-U and NFIL6-L (NFIL6 oligonucleotide, see Section 6.1.1 above) were isolated from the DC43 peptide display library. (See Section 6.9.6 and Figure 13 regarding the DC43 library).
- the DC43 library was screened and panned as described above using "Universal Covalent" multiwell plates with the NFIL6 oligonucleotide.
- a phage clone (NFIL6-1) was isolated that exhibited specific binding to the NFIL6 oligonucleotide.
- The oligonucleotide sequence of the random insert of NFIL6-1 was determined. This allowed for the determination of the corresponding amino acid sequence for the peptide encoded by this insert. The following was the result obtained: NFIL6-1 - REWGVPGAHNRIRDHCNGPRCHAIRTNASHTQHI
- oligonucleotides H2κB, IL6κB, and NFIL6 was tested by phage ELISA. Phage ELISAs were performed as in Section 6.1.3. Each phage was assayed for its ability to bind to each oligonucleotide following immobilization of that oligonucleotide in the well of a microtiter plate. This binding was compared to the binding of the phage to wells that had been coated with BSA
- Figure 14 shows that libraries of random peptides can be successfully screened to identify and isolate binding domains that specifically bind to specific DNA sequences.
- Phage NFIL6-1 binds well to the NFIL6 oligonucleotide with some binding to the other two oligonucleotides.
- Phage H2κB-2 binds only to the target H2κB oligonucleotide.
- Phage H2κB-1 shows much greater binding to the H2κB oligonucleotide than to the other oligonucleotides.
- New DNA oligonucleotides were synthesized; the sequences of these oligonucleotides deviated from the sequence of the H2κB oligonucleotide by only one or two (and in one instance, three) bases.
- Oligonucleotides for the upper and lower strands were synthesized. For each pair of upper and lower strands, 20 μg of upper and lower strand oligonucleotides were combined in 100 μl of TE buffer supplemented with 200 mM NaCl. Annealing and filling-in was carried out at 65°C for 5 minutes, 42°C for 5 minutes, and 37°C for 15 minutes. The resulting double stranded DNA fragments were diluted in PBS, pH 5.0. 50 μl of the appropriate DNA fragments were added to Costar Universal Covalent microtiter plates as in Section 6.1.2 and incubated overnight at 4°C. Phage ELISA assays were carried out three times for each target DNA using the H2κB-2 phage. The results are shown in Table 8.
- the binding of H2κB-2 to that target was compared to the binding of H2κB-2 to the H2κB oligonucleotide (wild type (WT)) and expressed as percent binding compared to the wild type oligonucleotide (% WT).
- the relative specificity of an HSDB for a specific DNA site is more important than the actual affinity of the domain for the DNA site.
- the effect of low affinity can be compensated for by increased levels of gene expression within the target cell.
- amino acid sequence encoded by the TSAR domain of the phage having the best properties can be
- oligonucleotides are synthesized from "doped" nucleotide reservoirs.
- the doping is carried out such that the original peptide sequence is represented only once in 10^6 unique clones of the mutagenized oligonucleotide.
- the assembled oligonucleotides are cloned into the parental TSAR vector.
- the vector is m663 (Fowlkes et al., 1992, BioTechniques 13:422-427).
- m663 is able to make blue plaques when grown in E.
- a library of greater than 10^6 is preferred; however a library of 10^5 is
- Once a DNA binding TSAR mutant library has been constructed and amplified, it can be panned with immobilized target DNA sequences in a manner analogous to that described for the isolation of the initial DNA binding phage. It is preferable, but not necessary, to selectively pan out those phage capable of binding to sequences related to the target DNA sequence.
- the related sequence is that of another site subject to regulation by the same transcription factor as the site desired to be regulated by the syngene product. Therefore, it is preferable to use four wells for this set of panning and phage
- the phage obtained by this process are called "highly specific DNA binding phage" or HSDB phage.
- Another means for blocking the binding of phage capable of binding to sequences related, but not identical, to the target DNA sequence is to add soluble DNA oligonucleotide fragments (to a final concentration of 0.1 μg/ml) to aliquots of the mutant library before panning.
- the added DNA oligonucleotide fragments bear the sequences related (but not
- To characterize HSDB phage relative to parental and other DB phage isolated from a TSAR-9 library, one takes advantage of the fact that the TSAR-9 library phage have no lacZ activity, whereas the mutant libraries used to clone the HSDB phage are from a vector that expresses lacZ activity in the appropriate E. coli cell lines.
- a mutant binds to the target DNA with more specificity than its parent DB phage
- equal amounts of the two phage to be compared are mixed together and then applied to a well containing a target DNA sequence.
- equal amounts of the two phage are also applied to a well containing another DNA sequence.
- the phage recovered are plaqued onto the appropriate E. coli in the presence of X-gal. Ratios of blue to white phage are calculated. Since the mutant phage can convert X-gal to a blue product, one can readily determine which mutant phage have improved DNA binding
- Oligonucleotides o8909 and o8545 were synthesized and purified by reverse phase HPLC (see Figure 15). Double stranded DNA fragments were made by annealing 10 μg of each oligonucleotide in 200 μl of 1X Sequenase™ buffer (United States Biochemical Corp., Cleveland, OH) with 10 mM DTT and 150 μM of each dNTP. The fill-in reaction was initiated with 3 μl of Sequenase™, Version 2.0 (United States
- the fragments were cloned into m663 in the usual manner.
- a library of about 1.5 × 10^8 phage variants was obtained. This library is called the ME#1 library (See Figure 15).
- the ME#1 library was panned using the original H2κB
- IL8κB IL8κB
- NFIL6 NFIL6 oligonucleotides
- Twenty-two clones from the ME#1 library that bound the H2κB oligonucleotide (binders) were selected for further analysis. Also, twenty-six clones were identified that did not have significant ability to bind the H2κB oligonucleotide (non-binders). In addition, random plaques from the ME#1 library were tested by phage ELISA for the ability to bind the H2κB oligonucleotide; about one third demonstrated some binding ability.
- Table 9 shows that there was no apparent bias in the sequences for both the binders and non-binders in the first 16 residues. However, at
- There are significant differences between the two groups for residues 18 through 28. While the original residue is clearly favored in the binding class for residues 18-24 and 26, this is not the case for positions 25, 27, and 28. In position 25, the original methionine was observed at less than the expected frequency in both the binders and non-binders and isoleucine was observed at higher than expected frequency in both classes.
- Residues 27 and 28 are very informative. For the binders, histidine was observed at position 27 in 73% of the clones, but was never observed at that position in the non-binders. At position 28, lysine was observed in 73% of the binders but in only 12% of the non-binders. Furthermore, the original asparagine was expected at a frequency of 51%, but occurred in only 18% of the binders.
- although the H2κB-2 clone clearly binds well to the H2κB oligonucleotide target, it does not carry the optimal amino acid sequence to do so (see below and Figures 16A and 16B for a further discussion of this point).
- substituting a histidine for the lysine at position 26 and a lysine for the asparagine at position 28 leads to avid binding of the clones to the target DNA.
- Phage stocks were prepared for the binders from the ME#1 library as well as for the original H2κB-2 clone and the stocks were titered by serial dilution. Subsequently, the appropriate dilutions of each phage were analyzed by phage ELISA for binding to the H2κB oligonucleotide. The results are shown in Figures 16A and 16B. Binding is expressed as signal strength (O.D.). Phage having higher relative avidity for the target (as compared to H2κB-2) continue to produce a detectable signal at low concentrations of phage and thus the curves for those phage are shifted to the left as compared to the curve for H2κB-2.
- oligonucleotide (as compared to H2κB-2) maintained the same relative target specificity as H2κB-2. However, some binders with less avidity for the H2κB oligonucleotide had increased avidity for certain of the variant H2κB oligonucleotide targets.
- sequences e.g., nuclear localization sequences, transcriptional activation sequences, for constructing a functional syngene directed to the regulation of the activity of a gene with H2κB sequences.
- Molecular evolution library #2a (the ME#2a library) contained the critical residues as a fixed core flanked by 10 random residues on each side.
- Molecular evolution library #2b (the ME#2b library) was constructed in a similar manner, but with a smaller group of core residues. Specifically, the ME#2b library lacks the initial arginine, serine, glycine, and arginine found within the core of the ME#2a library. In addition, the core of the ME#2b library was flanked with 12 random residues on each side.
- Figures 17A and 17B summarize the construction of the ME#2a and ME#2b libraries.
- the ME#2a and ME#2b libraries were panned using the H2κB oligonucleotide as ligand. A number of clones were isolated as binders from the ME#2a and ME#2b libraries. Phage stocks were prepared for the binders from the ME#2a and ME#2b libraries as well as for the original H2κB-2 clone and the stocks were titered by serial dilution. Subsequently, the
- Binding is expressed as signal strength (O.D.). Phage having higher relative avidity for the target (as compared to H2κB-2) continue to produce a detectable signal at low concentrations of phage and thus the curves for those phage are shifted to the left as compared to the curve for H2κB-2.
- Table 12 shows the amino acid sequences of the inserts of the binders from the ME#2a and ME#2b libraries as well as the inserts of clone 959496-10 and H2κB-2.
- the major modification consists in the use of DNA target sequences from GATA transcription factor binding sites rather than DNA target sequences from
- the target DNA sequence is 5' ctggccTTATCTccggct (SEQ ID NO: 180) for the upper strand and 5' agccggAGATAAggccag (SEQ ID NO: 181) for the lower strand (Dorfman et al., 1992, J. Biol. Chem. 267:1279-1285).
- the human gastric (H+ + K+) ATPase gene is responsible for maintaining the large (about 5 units) pH difference between the cytoplasm and gastric lumen in the stomach. It has been proposed that regulation of the human gastric (H+ + K+) ATPase gene would be useful in the treatment of gastric ulcers (Maeda,
- HSDBs can be obtained by screening random peptide libraries with an oligonucleotide formed from an upper strand of 5' gacatGGGGGGATCTGGgca (SEQ ID NO: 182) and a lower strand of
- oligonucleotide represents a sequence similar to that of binding sites for the GATA-GT family of
- the target DNA sequence is 5' gacactTGATAAcagaaa (SEQ ID NO: 184) for the upper strand and 5' tttctgTTATCAagtgtc (SEQ ID NO: 185) for the lower strand (Ko et al., 1991, Mol. Cell. Biol. 11:2778-2784).
- the DNA target sequences for the isolation of GATA transcription factor binding site HSDBs can be monovalent or multivalent. Panning, amplification, and analysis of the isolated phage are done as for NF-κB or NF-IL6. Mutagenesis to produce HSDBs of greater selectivity can also be done as for NF-κB or NF-IL6.
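
The upper and lower strands quoted above for the GATA targets must be reverse complements of one another if they are to anneal into a fully paired duplex. The following is a minimal Python sketch of that check. The names `reverse_complement`, `strands_are_complementary`, and `GATA_TARGETS` are illustrative assumptions and do not come from the source; only the two strand pairs given in full above (SEQ ID NOs: 180/181 and 184/185) are included, since the lower strand for SEQ ID NO: 182 is not quoted.

```python
# Complement table covering upper- and lower-case DNA letters, so the
# capitalised core motif in each target stays visibly capitalised.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def strands_are_complementary(upper: str, lower: str) -> bool:
    """True if `lower` (written 5'->3') is the reverse complement of `upper`."""
    return reverse_complement(upper) == lower


# Strand pairs as quoted in the text: (upper strand, lower strand), both 5'->3'.
# The dictionary keys are labels chosen for this sketch.
GATA_TARGETS = {
    "SEQ ID NO: 180/181": ("ctggccTTATCTccggct", "agccggAGATAAggccag"),
    "SEQ ID NO: 184/185": ("gacactTGATAAcagaaa", "tttctgTTATCAagtgtc"),
}

if __name__ == "__main__":
    for label, (upper, lower) in GATA_TARGETS.items():
        print(label, strands_are_complementary(upper, lower))
```

Preserving case in the complement table is a convenience choice: it makes the capitalised GATA core in the upper strand line up with the capitalised complementary core in the lower strand when the two are compared by eye.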
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP95929609A EP0777748A1 (en) | 1994-08-18 | 1995-08-17 | Peptide librairies as a source of syngenes |
AU33308/95A AU3330895A (en) | 1994-08-18 | 1995-08-17 | Peptide librairies as a source of syngenes |
JP8508213A JPH10504718A (en) | 1994-08-18 | 1995-08-17 | Peptide libraries as syngene sources |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US29290294A | 1994-08-18 | 1994-08-18 | |
US292,902 | 1994-08-18 | ||
US51519095A | 1995-08-15 | 1995-08-15 | |
US515,190 | 1995-08-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1996006188A1 (en) | 1996-02-29 |
Family
ID=26967630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1995/010523 WO1996006188A1 (en) | 1994-08-18 | 1995-08-17 | Peptide librairies as a source of syngenes |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0777748A1 (en) |
JP (1) | JPH10504718A (en) |
AU (1) | AU3330895A (en) |
CA (1) | CA2197864A1 (en) |
WO (1) | WO1996006188A1 (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6171820B1 (en) | 1995-12-07 | 2001-01-09 | Diversa Corporation | Saturation mutagenesis in directed evolution |
WO2001018036A2 (en) * | 1999-09-03 | 2001-03-15 | Beth Israel Deaconess Medical Center | Methods and reagents for regulating gene expression |
US6238884B1 (en) | 1995-12-07 | 2001-05-29 | Diversa Corporation | End selection in directed evolution |
US6352842B1 (en) | 1995-12-07 | 2002-03-05 | Diversa Corporation | Exonucease-mediated gene assembly in directed evolution |
US6358709B1 (en) | 1995-12-07 | 2002-03-19 | Diversa Corporation | End selection in directed evolution |
US6361974B1 (en) | 1995-12-07 | 2002-03-26 | Diversa Corporation | Exonuclease-mediated nucleic acid reassembly in directed evolution |
US6537776B1 (en) | 1999-06-14 | 2003-03-25 | Diversa Corporation | Synthetic ligation reassembly in directed evolution |
US6562594B1 (en) | 1999-09-29 | 2003-05-13 | Diversa Corporation | Saturation mutagenesis in directed evolution |
US6713279B1 (en) | 1995-12-07 | 2004-03-30 | Diversa Corporation | Non-stochastic generation of genetic vaccines and enzymes |
US6713281B2 (en) | 1995-12-07 | 2004-03-30 | Diversa Corporation | Directed evolution of thermophilic enzymes |
US6740506B2 (en) | 1995-12-07 | 2004-05-25 | Diversa Corporation | End selection in directed evolution |
US6939689B2 (en) | 1995-12-07 | 2005-09-06 | Diversa Corporation | Exonuclease-mediated nucleic acid reassembly in directed evolution |
US7018793B1 (en) | 1995-12-07 | 2006-03-28 | Diversa Corporation | Combinatorial screening of mixed populations of organisms |
US8853367B1 (en) * | 2004-09-15 | 2014-10-07 | The Uab Research Foundation | Compositions and methods for modulating rank activities |
CN112194706A (en) * | 2020-09-30 | 2021-01-08 | 广州派真生物技术有限公司 | Adeno-associated virus mutant and application thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5338665A (en) * | 1991-10-16 | 1994-08-16 | Affymax Technologies N.V. | Peptide library and screening method |
-
1995
- 1995-08-17 CA CA002197864A patent/CA2197864A1/en not_active Abandoned
- 1995-08-17 JP JP8508213A patent/JPH10504718A/en active Pending
- 1995-08-17 AU AU33308/95A patent/AU3330895A/en not_active Abandoned
- 1995-08-17 EP EP95929609A patent/EP0777748A1/en not_active Withdrawn
- 1995-08-17 WO PCT/US1995/010523 patent/WO1996006188A1/en not_active Application Discontinuation
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5338665A (en) * | 1991-10-16 | 1994-08-16 | Affymax Technologies N.V. | Peptide library and screening method |
Non-Patent Citations (11)
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6696275B2 (en) | 1995-12-07 | 2004-02-24 | Diversa Corporation | End selection in directed evolution |
US6352842B1 (en) | 1995-12-07 | 2002-03-05 | Diversa Corporation | Exonucease-mediated gene assembly in directed evolution |
US6171820B1 (en) | 1995-12-07 | 2001-01-09 | Diversa Corporation | Saturation mutagenesis in directed evolution |
US6713282B2 (en) | 1995-12-07 | 2004-03-30 | Diversa Corporation | End selection in directed evolution |
US6709841B2 (en) | 1995-12-07 | 2004-03-23 | Diversa Corporation | Exonuclease-mediated gene assembly in directed evolution |
US6358709B1 (en) | 1995-12-07 | 2002-03-19 | Diversa Corporation | End selection in directed evolution |
US6361974B1 (en) | 1995-12-07 | 2002-03-26 | Diversa Corporation | Exonuclease-mediated nucleic acid reassembly in directed evolution |
US7018793B1 (en) | 1995-12-07 | 2006-03-28 | Diversa Corporation | Combinatorial screening of mixed populations of organisms |
US6939689B2 (en) | 1995-12-07 | 2005-09-06 | Diversa Corporation | Exonuclease-mediated nucleic acid reassembly in directed evolution |
US6635449B2 (en) | 1995-12-07 | 2003-10-21 | Diversa Corporation | Exonuclease-mediated nucleic acid reassembly in directed evolution |
US6238884B1 (en) | 1995-12-07 | 2001-05-29 | Diversa Corporation | End selection in directed evolution |
US6773900B2 (en) | 1995-12-07 | 2004-08-10 | Diversa Corporation | End selection in directed evolution |
US6740506B2 (en) | 1995-12-07 | 2004-05-25 | Diversa Corporation | End selection in directed evolution |
US6713279B1 (en) | 1995-12-07 | 2004-03-30 | Diversa Corporation | Non-stochastic generation of genetic vaccines and enzymes |
US6713281B2 (en) | 1995-12-07 | 2004-03-30 | Diversa Corporation | Directed evolution of thermophilic enzymes |
US6537776B1 (en) | 1999-06-14 | 2003-03-25 | Diversa Corporation | Synthetic ligation reassembly in directed evolution |
WO2001018036A3 (en) * | 1999-09-03 | 2001-06-14 | Beth Israel Hospital | Methods and reagents for regulating gene expression |
WO2001018036A2 (en) * | 1999-09-03 | 2001-03-15 | Beth Israel Deaconess Medical Center | Methods and reagents for regulating gene expression |
US6562594B1 (en) | 1999-09-29 | 2003-05-13 | Diversa Corporation | Saturation mutagenesis in directed evolution |
US8853367B1 (en) * | 2004-09-15 | 2014-10-07 | The Uab Research Foundation | Compositions and methods for modulating rank activities |
CN112194706A (en) * | 2020-09-30 | 2021-01-08 | 广州派真生物技术有限公司 | Adeno-associated virus mutant and application thereof |
Also Published As
Publication number | Publication date |
---|---|
JPH10504718A (en) | 1998-05-12 |
EP0777748A1 (en) | 1997-06-11 |
AU3330895A (en) | 1996-03-14 |
CA2197864A1 (en) | 1996-02-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7008780B2 (en) | Chimeric DNA-binding proteins | |
US6326166B1 (en) | Chimeric DNA-binding proteins | |
EP0781331B1 (en) | Improvements in or relating to binding proteins for recognition of dna | |
US6015561A (en) | Antigen binding peptides (abtides) from peptide libraries | |
US6627405B1 (en) | 53BP2 complexes | |
WO1996006188A1 (en) | Peptide librairies as a source of syngenes | |
US20090029468A1 (en) | Zinc finger binding domains for cnn | |
JP2002502238A (en) | Nucleic acid binding polypeptide library | |
JP2005500061A5 (en) | ||
JPH08506487A (en) | Total synthetic affinity reagent | |
EP0973391B1 (en) | Endoderm, cardiac and neural inducing factors | |
JP2001514007A (en) | Chimeric transcription activators and compositions and uses related thereto | |
JPH09508019A (en) | Zinc finger protein derivative and method therefor | |
Williams et al. | Group 13 HOX proteins interact with the MH2 domain of R-Smads and modulate Smad transcriptional activation functions independent of HOX DNA-binding capability | |
KR101161923B1 (en) | Atap peptides, nucleic acids encoding the same and associated methods of use | |
US5986055A (en) | CDK2 interactions | |
JPH11503011A (en) | Conditional expression system | |
JP2002510490A (en) | LYST protein complex and LYST interacting protein | |
JP2009504156A (en) | Zinc finger binding domain for CNN | |
JP2002514893A (en) | Rapamycin-based regulation of biological events | |
WO2001018036A2 (en) | Methods and reagents for regulating gene expression | |
US20020099171A1 (en) | Endoderm, cardiac and neural inducing factors - xenopus frazzled (frzb-1) protein | |
US20040106565A1 (en) | Gene expression, genome alteration and reporter expression in myofibroblasts and myofibroblast-like cells | |
KR20130097601A (en) | Novel expression cassettes for adrenomedullin gene harboring intron 1 and process for preparing the adrenomedullin using the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AM AU BB BG BR BY CA CN CZ EE FI GE HU IS JP KG KP KR KZ LK LR LT LV MD MG MK MN MX NO NZ PL RO RU SG SI SK TJ TM TT UA UZ VN |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase | Ref document number: 2197864 Country of ref document: CA |
WWE | Wipo information: entry into national phase | Ref document number: 1995929609 Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 1995929609 Country of ref document: EP |
WWW | Wipo information: withdrawn in national office | Ref document number: 1995929609 Country of ref document: EP |