WO2005051976A2 - Protein and peptide ligation processes and one-step purification processes - Google Patents

Protein and peptide ligation processes and one-step purification processes

Info

Publication number
WO2005051976A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
peptide
sequence
fusion protein
molecule
Prior art date
Application number
PCT/US2004/039045
Other languages
French (fr)
Other versions
WO2005051976A3 (en)
Inventor
Hongyuan Mao
Scott A. Hart
Brian A. Pollok
Original Assignee
Ansata Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ansata Therapeutics, Inc.
Publication of WO2005051976A2
Publication of WO2005051976A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K1/00 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • C07K1/107 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides
    • C07K1/1072 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides by covalent attachment of residues or functional groups
    • C07K1/1075 General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides by covalent attachment of residues or functional groups by covalent attachment of amino acids or peptide residues
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • the invention relates to efficient enzyme-catalyzed processes for linking a molecule of interest to a protein or peptide, and related kits and products.
  • the invention relates also to processes, products and kits useful for expressing and purifying recombinant proteins and peptides.
  • Protein engineering is becoming a widely used tool in many areas of protein biochemistry.
  • One engineering method is controlled protein ligation, and over the past ten years some progress has been made.
  • synthetic-based chemistry allows joining synthetic peptides together through native chemical ligation and a 166 amino-acid polymer-modified erythropoiesis protein has been synthesized using this method.
  • native chemical ligation relies on efficient preparation of synthetic peptide esters, which can be technically difficult to prepare for large polypeptides such as proteins.
  • the reaction sometimes is performed in an organic solvent to produce the requisite protein ester for ligation.
  • Other ligation methods not requiring the production of protein esters generate protein thioesters.
  • intein-based protein ligation system was used to generate a protein by thiolysis of a corresponding protein-intein thioester fusion.
  • a prerequisite for this intein-mediated ligation method is that the target protein is expressed as a correctly folded fusion with intein, and that sufficient spacing between the target and intein is needed to allow formation of the intein-thioester.
  • the intein-fusion proteins can only be obtained from inclusion bodies when expressed in Escherichia coli, which often cannot be refolded. This difficulty significantly limits the application of intein-based protein ligation.
  • Purification of a tag-free recombinant protein often is challenging and often requires multiple chromatography steps.
  • a tag can be linked to a recombinant protein, and after purification, the tag on the fusion may be cleaved from the target protein by treatment with an exogenously added site-specific protease. Additional chromatographic steps then are required to separate the target protein from the un-cleaved fusion, the affinity tag, and the peptidase.
  • an N-terminal His6 tag from a recombinant protein may be cleaved by an engineered His6-tagged aminopeptidase, and a subtractive immobilized metal-ion affinity chromatography (IMAC) step can be used to recover the untagged target.
  • Other methods may require two or more chromatography steps and even special treatment of the exogenous peptidase, such as biotinylation, to facilitate its removal.
  • an enzyme utilized for each process has transamidase catalytic activity.
  • an efficient enzyme-catalyzed protein ligation method has been developed for linking a protein or peptide to a molecule of interest, and it has been discovered that a transamidase enzyme can be utilized to catalyze the ligation process.
  • the transamidase is a sortase enzyme. Sortases are present in Gram-positive bacteria and catalyze proteolytic cleavage and linkage of peptides to cell wall components of the bacteria in vivo.
  • the linkage process is efficient, economical and widely applicable.
  • Four characteristics of the process underscoring these features are (1) the reaction can be performed simply by combining a transamidase with the protein or peptide and the molecule of interest (e.g., isolating an enzyme-protein intermediate is not required), (2) the reaction can be performed in an aqueous solution and does not require esterification of the protein or peptide (i.e., an organic solvent is not required), (3) the process can be performed in a cell-free environment (e.g., ligation does not require bacterial cell wall components or an intact bacterial cell wall); and (4) relatively small concentrations of reagents are required.
  • One embodiment is a method for linking a molecule A to a protein or peptide B, which comprises contacting in a cell free system the molecule A, the protein or peptide B and a transamidase enzyme, whereby the transamidase enzyme links the molecule A to the protein or peptide B.
  • Another embodiment is a method for linking a molecule A to a protein or peptide B, which comprises contacting in a system (e.g., a cell-free system or a system containing cells) the molecule A, the protein or peptide B and a transamidase enzyme, where the molecule A is not a component of a bacterial cell wall in the system, whereby the transamidase enzyme links the molecule A to the protein or peptide B.
  • the system is an aqueous system
  • the protein or peptide is not esterified significantly and/or is not purposefully esterified
  • the protein or peptide B is not part of an isolated enzyme conjugate
  • the ratio of the enzyme to the protein or peptide B often is greater than 1 : 1000.
  • the transamidase often is a sortase, and sometimes sortase A or sortase B from S. aureus.
  • the molecule A comprises an NH2-CH2- moiety, which often is present when the molecule is added to the system or sometimes is incorporated after the molecule is added to the system.
  • the protein or peptide often is synthesized or generated with an amino acid sequence that the transamidase recognizes and acts upon, which is referred to herein as a "recognition sequence" or "recognition motif." Examples of such recognition sequences are provided hereafter, and often are not native to the protein or peptide.
  • the protein or peptide sometimes includes a non-native amino acid sequence that allows purification or identification of the ligated product (e.g., a polyhistidine tag that binds to a nickel-conjugated solid support and/or an antibody epitope (e.g., FLAG)) and the non-native amino acid sequence sometimes is cleaved from the protein or peptide upon ligation to the molecule of interest.
  • kits for performing the ligation processes described herein include a container that includes a DNA plasmid into which the user can clone a DNA sequence that encodes the protein or peptide B.
  • the protein or peptide B produced using the plasmid typically includes an N-terminal or C-terminal transamidase recognition motif (described in more detail hereafter).
  • the protein or peptide B is produced from the plasmid with a non-native sequence that allows purification or identification of the ligation product.
  • the kit sometimes includes an isolated transamidase, often a sortase enzyme, useful for performing the ligation reaction, and may include a plasmid from which the user can prepare the enzyme.
  • the kit includes an appropriately derivatized solid support (e.g., a derivatized glass slide, silicon chip, bead or resin) or an appropriately derivatized detectable label (e.g., a fluorescent molecule or radioisotope), to which the user can ligate a protein or peptide using the transamidase.
  • the kit typically includes a set of instructions for performing one or more ligation processes described herein.
  • Expressing a recombinant target protein as a fusion can potentially offer several advantages. Fusing a highly stable carrier protein at the N-terminus can increase the expression and solubility of the target protein.
  • incorporating an affinity-fusion tag can ease the capture of the target protein on an affinity column.
  • Protein fusion systems have been developed that generate free recombinant protein or peptide in a single affinity chromatographic step, and are disclosed herein.
  • the fusion proteins include a peptidase capable of cleaving the fusion sequence at an internal location, and the peptidase often has moderate catalytic activity.
  • the fusion protein comprises a similar enzyme or the same enzyme as used for ligation processes described herein. Because the peptidase is co-expressed as a fusion with the target protein or peptide, the purification system does not require a step of adding an exogenous peptidase.
  • a fusion protein often comprises a solid phase association region, a peptidase, a target protein or target peptide, and a peptidase recognition sequence.
  • the solid phase association region, peptidase, target protein or target peptide, and peptidase recognition sequence elements are in a contiguous amino acid sequence and these elements are arranged in any convenient orientation.
  • the solid phase association region sometimes is at the N-terminus and it sometimes is at the C-terminus of a fusion protein.
  • the peptidase sequence sometimes is located N-terminal of the target protein or target peptide sequence and sometimes is located C-terminal of the target protein or target peptide sequence.
  • the peptidase recognition sequence often is located between the target protein or target peptide sequence and the peptidase sequence.
  • a fusion protein sometimes comprises one or more linker sequences that flank elements of the fusion protein (e.g., flanking the solid phase association region, target protein or target peptide sequence, peptidase sequence, and/or peptidase recognition sequence), which sometimes are located between the peptidase and the target protein or target peptide sequences.
  • a fusion protein sometimes comprises an export sequence that exports the fusion protein to an intracellular compartment near a host cell surface or secretes fusion protein outside of a host cell.
  • the peptidase in the fusion protein is sortase A from Staphylococcus aureus (SrtAc), a fragment of SrtAc, or a variant of the foregoing, which recognizes and cleaves a threonine-glycine bond in an LPXTG sequence.
  • a SrtAc fragment in a fusion is a catalytic core region that recognizes and cleaves a threonine-glycine bond in an LPXTG sequence with moderate activity, which is referred to herein as a "SrtAc catalytic region.”
  • the SrtAc catalytic region is amino acids 60 to 206 of native SrtAc or a variant thereof.
  • the solid phase association region comprises or consists of a polyhistidine sequence (e.g., six contiguous histidine amino acids), which is capable of binding to an immobilized metal-ion affinity chromatography (IMAC) reagent.
  • a fusion protein consists of an N-terminal His6 tag, SrtAc catalytic region, and an LPXTG linker followed by a target protein or target peptide at the C-terminus.
  • a fusion protein in combination with a solid support, where the solid support specifically binds to the solid phase association region in the fusion protein.
  • nucleic acids that encode the fusion proteins disclosed herein. Nucleic acids that encode the fusion protein sometimes do not include a target protein-encoding or target peptide-encoding nucleotide sequence, and the target protein-encoding or target peptide-encoding nucleotide sequence sometimes is inserted into the nucleic acid.
  • kits which comprises a container that includes a nucleic acid or fusion protein described herein, often with instructions for producing a fusion protein, and often purifying a target protein or target peptide.
  • the nucleotide sequence that encodes the fusion protein is in any nucleic acid convenient for expressing the fusion protein, or for preparing a nucleic acid that expresses the fusion protein.
  • the nucleic acid often is DNA, often is a plasmid, sometimes is linear, and sometimes is RNA.
  • the nucleic acid often does not include a target protein-encoding or target peptide-encoding sequence, and the kit sometimes includes instructions for inserting a target protein or target peptide sequence into the nucleic acid.
  • a kit includes a component useful for inserting a target protein or target peptide sequence into the nucleic acid, including but not limited to, one or more oligonucleotides, a polymerase for performing a polymerase chain reaction (PCR) procedure, a topoisomerase, one or more restriction enzymes and/or a ligase.
  • the kit sometimes comprises a solid support capable of binding the solid phase association region in the fusion protein expressed from the nucleic acid, which sometimes is an IMAC solid support.
  • a kit includes an organism useful for expressing the fusion protein from the nucleic acid, including but not limited to, a strain of bacteria, yeast, fungi, insect cells or mammalian cells, and optionally includes reagents and/or instructions for inserting the nucleic acid into the organism.
  • Fusion proteins can be expressed in any host organism that expresses the fusion protein in detectable amounts, including but not limited to, a strain of bacteria, yeast, fungi, insect cells or mammalian cells.
  • a nucleic acid encoding the fusion protein is inserted into a host cell.
  • Nucleotide sequences in the nucleic acid that flank the fusion protein-encoding nucleotide sequence often are selected according to the host organism chosen for expression.
  • the nucleotide sequence that encodes the fusion protein sometimes is codon-optimized depending upon the host organism selected for expression.
  • cleavage sites for the peptidase in the fusion not located between the peptidase and the target protein or peptide sequence often are removed from the nucleic acid before the fusion protein is expressed.
  • the fusion protein sometimes is secreted outside of the host organism after expression, in which cases host cells often are not lysed.
  • In some embodiments in which a fusion protein is exported to regions near the cell surface, the host cells often are exposed to conditions more gentle than cell lysis, such as osmotic stress. In embodiments where a fusion protein is not secreted outside of the host cells, the host cells often are lysed after fusion protein expression. After expression, the fusion protein is contacted with a solid support, where the solid support often specifically binds to a solid phase association region in a fusion protein. Immobilized fusion proteins are cleaved by the peptidase on the solid support, and components that accelerate or facilitate cleavage sometimes are added to the system.
  • the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing
  • calcium ions and/or triglycine sometimes are added to the system as accelerants.
  • the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing and the peptidase is closer to the N-terminus of the fusion protein than the target protein or target peptide
  • a target protein or target peptide with an N-terminal glycine often is isolated and the N-terminal portion of the fusion protein remains associated with the solid phase.
  • the released target protein or target peptide sometimes includes an N-terminal sequence with more amino acids added than just a glycine.
  • the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing and the target protein or target peptide is closer to the N-terminus of the fusion protein than the peptidase
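The construct and cleavage outcome described above (N-terminal His6 tag, SrtAc catalytic region, LPXTG linker, C-terminal target) can be sketched as simple string operations. This is only an illustrative model, not the patent's procedure: the sortase core is a placeholder string rather than the real residues 60 to 206, the target sequence is hypothetical, and the function names `build_fusion` and `cleave_at_lpxtg` are invented for this example. The sketch assumes the fusion contains a single LPXTG site.

```python
import re
from typing import Tuple

HIS6 = "HHHHHH"                       # solid phase association region
SRTA_CORE_PLACEHOLDER = "S" * 10      # stand-in for the SrtAc catalytic region (assumption)
LINKER = "LPETG"                      # one LPXTG instance (X = Glu), as in Figure 3


def build_fusion(target: str) -> str:
    """Assemble His6 - SrtAc core - LPXTG - target as one contiguous sequence."""
    return HIS6 + SRTA_CORE_PLACEHOLDER + LINKER + target


def cleave_at_lpxtg(fusion: str) -> Tuple[str, str]:
    """Split the fusion at the threonine-glycine bond of the first LPXTG motif.

    Returns (retained, released): the part that stays bound to the solid
    support and the part that is collected in the flow-through.
    """
    match = re.search(r"LP.TG", fusion)
    if match is None:
        raise ValueError("no LPXTG motif found")
    cut = match.end() - 1             # index of the glycine; the bond lies just before it
    return fusion[:cut], fusion[cut:]


retained, released = cleave_at_lpxtg(build_fusion("MVSKGEEL"))
print(retained)   # tag and sortase core, ending in ...LPET
print(released)   # target carrying the extra N-terminal glycine
```

In this model the released fragment begins with the glycine of the LPXTG site, which corresponds to the extra N-terminal glycine on the isolated target.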
  • the target protein or target peptide with a C-terminal LPXTGGG sequence sometimes is isolated when the system is contacted with triglycine.
  • the target protein or target peptide sometimes includes a C-terminal LPXTGZ moiety when the system is contacted with an NH2-CH2-Z substance.
  • the process yields an isolated or purified target protein or target peptide having 90% or more purity, 91% or more purity, 92% or more purity, 93% or more purity, 94% or more purity, 95% or more purity, 96% or more purity, 97% or more purity, 98% or more purity, or 99% or more purity.
  • the process sometimes generates a target protein or target peptide yield of 1 mg/L of cell culture or more, 2 mg/L of cell culture or more, 5 mg/L of cell culture or more, 10 mg/L of cell culture or more, 15 mg/L of cell culture or more, 20 mg/L of cell culture or more, 25 mg/L of cell culture or more, 30 mg/L of cell culture or more, 35 mg/L of cell culture or more, 40 mg/L of cell culture or more, 50 mg/L of cell culture or more, 75 mg/L of cell culture or more, 100 mg/L of cell culture or more, 250 mg/L of cell culture or more, 500 mg/L of cell culture or more, 750 mg/L of cell culture or more, 1000 mg/L of cell culture or more, 2000 mg/L of cell culture or more, or 5000 mg/L of cell culture or more.
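For the reverse orientation, in which the target sits N-terminal of the peptidase, the released product keeps the LPXT portion and gains the added nucleophile. The sketch below models that outcome with triglycine. It is an assumption-laden illustration: the sortase core is a placeholder string, the target sequence is hypothetical, and `transpeptidate` is a name invented here.

```python
import re

HIS6 = "HHHHHH"
SRTA_CORE_PLACEHOLDER = "S" * 10   # stand-in for the sortase sequence (assumption)


def transpeptidate(fusion: str, nucleophile: str = "GGG") -> str:
    """Model the product released when the LPXT-G bond is cleaved and the
    LPXT-terminated fragment is joined to an incoming glycine nucleophile."""
    match = re.search(r"LP.TG", fusion)
    if match is None:
        raise ValueError("no LPXTG motif found")
    acyl_fragment = fusion[: match.end() - 1]   # target ending in ...LPXT
    return acyl_fragment + nucleophile


# target - LPETG - sortase core - His6 (tag at the C-terminus)
fusion = "MVSKGEEL" + "LPETG" + SRTA_CORE_PLACEHOLDER + HIS6
print(transpeptidate(fusion))   # target now ends in LPETGGG
```

The sortase core and His6 tag are absent from the returned string, mirroring the text's statement that they remain on the solid support while the labeled target is released.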
  • Figure 1 shows a synthetic scheme for generating a molecule A useful for protein or peptide ligation, as exemplified by the synthesis of a triglycine folate molecule.
  • Figure 2 shows an example of a one-step protein purification scheme of a self-cleavable sortase fusion. The fusion from crude cell lysate is captured on an IMAC column. The immobilized fusion protein is then equilibrated with a calcium-containing solution (± Gly3) to induce SrtAc-mediated cleavage at the LPXTG site. The target protein having an extra N-terminal glycine is collected in the cleavage flow through.
  • Figure 3 depicts cloning schemes in a pET15b vector and the coding sequences of plasmid pGHSL-emGFP and the three variations. Regions of His6 tag (H), SrtAc (S) and LPETG cleavage site (L), emGFP as well as restriction sites used for cloning are depicted in pGHSL-emGFP.
  • the Trp194* designation refers to the Trp position in the full-length sortase A of S. aureus, and the actual amino acid position in the fusion is 156.
  • Plasmid pAHSL-emGFP encodes a Gly2 to Ala mutation
  • pGHS'L-emGFP encodes a Trp194* to Ala mutation
  • pAHS'L-emGFP encodes both Gly2 to Ala and Trp194* to Ala mutations.
  • S', G, and A are designations for the SrtAc mutant, the N-terminal glycine, and the N-terminal alanine, respectively.
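The relationship between full-length sortase A numbering and fusion numbering can be written as a constant offset. The two anchor values (194 in the full-length enzyme, 156 in the fusion) come from the text. The sketch assumes the same offset holds across the whole sortase region, i.e. no insertions or deletions within it, and the function names are hypothetical.

```python
FULL_LENGTH_TRP = 194   # Trp position in full-length S. aureus sortase A (from the text)
FUSION_TRP = 156        # same residue's position in the fusion (from the text)
OFFSET = FULL_LENGTH_TRP - FUSION_TRP   # 38

# Catalytic region boundaries in native numbering, as stated in the text.
CORE_START, CORE_END = 60, 206


def full_length_to_fusion(position: int) -> int:
    """Convert a native SrtA residue number to its position in the fusion."""
    if not CORE_START <= position <= CORE_END:
        raise ValueError("position lies outside the catalytic region")
    return position - OFFSET


def fusion_to_full_length(position: int) -> int:
    """Convert a fusion position within the sortase region back to native numbering."""
    native = position + OFFSET
    if not CORE_START <= native <= CORE_END:
        raise ValueError("position lies outside the sortase region of the fusion")
    return native


print(full_length_to_fusion(194))   # 156
print(full_length_to_fusion(60))    # 22
```

Under this assumption, the first residue of the catalytic region (native residue 60) would fall at fusion position 22, leaving 21 residues for the N-terminal residue, tag and any linker.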
  • Figure 4 shows a fusion protein and a purification scheme in which a fusion comprising a target protein sequence, a peptidase recognition sequence, a peptidase and a protein solid phase association region is labeled using an amino glycine-derivative and purified on a solid support.
  • Processes have been developed for efficiently linking a protein or peptide to a molecule of interest.
  • the development of these processes is significant as they are widely applicable to many proteins and peptides and a multitude of different modifying molecules.
  • the processes are useful for ligating proteins or peptides to one another, ligating synthetic peptides to proteins, linking a reporting molecule to a protein or peptide, joining a nucleic acid to a protein or peptide, conjugating a protein or peptide to a solid support, and linking a protein or peptide to a toxin, for example.
  • Provided also are fusion proteins capable of self-cleavage and one-step purification, and related products, processes, and kits.
  • Such products and processes save cost and time associated with target protein or target polypeptide production, and are useful for conveniently linking a molecule of interest to a target protein or target peptide.
  • the ligation products and purified products are useful in a variety of applications, such as diagnostic procedures (e.g., an antibody that specifically binds to a cancer cell epitope is joined to a radioisotope and the conjugate is administered to a patient to detect the presence or absence of cancer cells); therapeutic procedures (e.g., an antibody that specifically binds to a cancer cell epitope is joined to a toxin such as ricin A and the conjugate is administered to a patient to selectively treat the cancer), and research methods (e.g., an NH2-CH2-derivatized fluorophore and sortase are contacted with fixed cells that express a protein linked to a sortase recognition sequence and the location of the protein is detected by a fluorescence imaging technique), for example.
  • Transamidase Enzymes
  • Protein and peptide ligation processes described herein often are catalyzed by a transamidase.
  • a transamidase is an enzyme that can form a peptide linkage (i.e., amide linkage) between a protein or peptide and a molecule of interest containing an NH2-CH2- moiety.
  • Sortases are enzymes having transamidase activity and have been isolated from Gram-positive bacteria. Gram positive bacteria retain the crystal violet stain in the presence of alcohol or acetone. They have, as part of their cell wall structure, peptidoglycan as well as polysaccharides and/or teichoic acids.
  • Gram-positive bacteria include the following genera: Actinomyces, Bacillus, Bifidobacterium, Cellulomonas, Clostridium, Corynebacterium, Micrococcus, Mycobacterium, Nocardia, Staphylococcus, Streptococcus and Streptomyces.
  • Enzymes identified as "sortases" from Gram-positive bacteria cleave and translocate proteins to proteoglycan moieties in intact cell walls. Two sortases have been isolated from Staphylococcus aureus, which are sortase A (Srt A) and sortase B (Srt B).
  • The term "isolated" often refers to having a specific activity of at least tenfold greater than the sortase-transamidase activity present in a crude extract, lysate, or other state from which proteins have not been removed and also in substantial isolation from proteins found in association with sortase-transamidase in the cell.
  • An "isolated" or "purified" polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
  • the term "substantially free” refers to preparing a target polypeptide having less than about 30%, 20%, 10% and sometimes 5% (by dry weight), of non-target polypeptide (also referred to herein as a "contaminating protein"), or of chemical precursors or non-target chemicals.
  • When a target polypeptide or a biologically active portion thereof is recombinantly produced, it also often is substantially free of culture medium, where culture medium represents less than about 20%, sometimes less than about 10%, and often less than about 5% of the volume of the polypeptide preparation.
  • Isolated or purified target polypeptide preparations sometimes are 0.01 milligrams or more or 0.1 milligrams or more, and often 1.0 milligrams or more and 10 milligrams or more in dry weight.
  • Amino acid sequences of Srt A and Srt B and the nucleotide sequences that encode them are disclosed in US 2003/0153020 A1, published on August 14, 2003, which are incorporated herein by reference.
  • the amino acid sequences of SrtA and SrtB are homologous, sharing 22% sequence identity and 37% sequence similarity.
  • the amino acid sequence of a sortase-transamidase from Staphylococcus aureus also has substantial homology with sequences of enzymes from other Gram-positive bacteria, and such transamidases can be utilized in the ligation processes described herein.
  • For SrtA, there is about a 31% sequence identity (and about 44% sequence similarity) with best alignment over the entire sequenced region of the S. pyogenes open reading frame.
  • There is about a 28% sequence identity with best alignment over the entire sequenced region of the A. naeslundii open reading frame.
  • transamidase bearing 18% or more sequence identity, 20% or more sequence identity, or 30% or more sequence identity with the S. pyogenes, A. naeslundii, S. mutans, E. faecalis or B.
  • Transamidases from other organisms also can be utilized in the processes described herein. Such transamidases often are encoded by nucleotide sequences substantially identical or similar to the nucleotide sequences that encode Srt A and Srt B. A similar or substantially identical nucleotide sequence may include modifications to the native sequence, such as substitutions, deletions, or insertions of one or more nucleotides.
  • nucleotide sequences that sometimes are 55%, 60%, 65%, 70%, 75%, 80%, or 85% or more identical to a native quadruplex-forming nucleotide sequence, and often are 90% or 95% or more identical to the native quadruplex-forming nucleotide sequence (each identity percentage can include a 1%, 2%, 3% or 4% variance).
  • One test for determining whether two nucleic acids are substantially identical is to determine the percentage of identical nucleotide sequences shared between the nucleic acids.
  • Calculations of sequence identity can be performed as follows. Sequences are aligned for optimal comparison purposes and gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment.
  • non-homologous sequences can be disregarded for comparison purposes.
  • the length of a reference sequence aligned for comparison purposes sometimes is 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70%, 80%, 90%, 100% of the length of the reference sequence.
  • the nucleotides at corresponding nucleotide positions then are compared among the two sequences. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, the nucleotides are deemed to be identical at that position.
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences.
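The position-by-position comparison described above can be sketched for two sequences that have already been aligned. This is one common convention, chosen here as an assumption: identical positions are divided by the number of alignment columns, so every gap position lowers the result. The text only says gaps are taken into account and does not fix a formula, and it does not prescribe the function name `percent_identity`.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column with a gap
    in only one sequence counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable positions")
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# Five of six columns are identical; the single gap counts against identity.
print(round(percent_identity("ACGT-A", "ACGTTA"), 1))   # 83.3
```

Producing the alignment itself (the step that introduces the gaps) is a separate problem handled by tools such as ALIGN, GAP or BLAST, which the text mentions.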
  • Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. Percent identity between two nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4:11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Percent identity between two nucleotide sequences can be determined using the GAP program in the GCG software package (available at http address www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
  • a set of parameters often used is a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frame shift gap penalty of 5.
  • Another manner for determining if two nucleic acids are substantially identical is to assess whether a polynucleotide homologous to one nucleic acid will hybridize to the other nucleic acid under stringent conditions.
  • The term "stringent conditions" refers to conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and non-aqueous methods are described in that reference and either can be used.
  • One example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50°C.
  • Another example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55°C.
  • a further example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60°C.
  • Often, stringent conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65°C.
  • stringency conditions include hybridization in 0.5M sodium phosphate, 7% SDS at 65°C, followed by one or more washes at 0.2X SSC, 1% SDS at 65°C.
  • a variant sequence can depart from a native amino acid sequence in different manners. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, helix-forming properties and/or amphipathic properties and the resulting variants are screened for antimicrobial activity.
  • negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine.
  • Conservative substitutions may be made, for example, according to Table A.
  • Amino acids in the same block in the second column and in the same line in the third column may be substituted for one another in a conservative substitution. Certain conservative substitutions are substituting an amino acid in one row of the third column corresponding to a block in the second column with an amino acid from another row of the third column within the same block in the second column.
  • homologous substitution may occur, which is a substitution or replacement of like amino acids, such as basic for basic, acidic for acidic, polar for polar amino acids, and hydrophobic for hydrophobic, for example.
  • Non-homologous substitutions can be introduced to a native sequence, such as from one class of residue to another (e.g., a non-hydrophobic to a hydrophobic amino acid), or substituting a naturally occurring amino acid with an unnatural amino acid or non-classical amino acid replacement.
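The distinction between like-for-like and cross-class substitutions can be sketched as a lookup over residue groups. The three groups below are taken from the list given in the text (negatively charged, positively charged, and the residues it lists together as having similar hydrophilicity values). They are not Table A, which is not reproduced here, and the function names are hypothetical. Residues the text does not place in a group are treated as unclassified.

```python
from typing import Optional

# Groups as listed in the text, in one-letter codes.
RESIDUE_GROUPS = {
    "negatively charged": set("DE"),          # aspartic acid, glutamic acid
    "positively charged": set("KR"),          # lysine, arginine
    "similar hydrophilicity": set("LIVGANQSTFY"),
}


def group_of(residue: str) -> Optional[str]:
    """Return the name of the group containing the residue, or None."""
    for name, members in RESIDUE_GROUPS.items():
        if residue.upper() in members:
            return name
    return None


def is_same_group_substitution(original: str, replacement: str) -> bool:
    """True when both residues fall in the same listed group."""
    group = group_of(original)
    return group is not None and group == group_of(replacement)


print(is_same_group_substitution("D", "E"))   # True: acidic for acidic
print(is_same_group_substitution("K", "D"))   # False: basic for acidic
```

A check like this only flags candidate substitutions; whether a variant retains activity still has to be determined experimentally, as the text notes.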
  • Srt A and Srt B nucleotide sequences may be used as "query sequences" to perform a search against public databases to identify related sequences.
  • Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al., J. Mol. Biol. 215:403-410 (1990).
  • Gapped BLAST can be utilized as described in Altschul, et al., Nucleic Acids Res. 25(17):3389-3402 (1997).
  • Transamidase fragments having transamidation activity also can be utilized in the methods described herein. Such fragments can be identified by producing transamidase fragments by known recombinant techniques or proteolytic techniques, for example, and determining the rate of protein or peptide ligation. The fragment sometimes consists of about 80% of the full-length transamidase amino acid sequence, and sometimes about 70%, about 60%, about 50%, about 40% or about 30% of the full-length transamidase amino acid sequence.
  • proteins and peptides utilized in the ligation processes described herein sometimes include or are modified with an appropriate sortase recognition motif.
  • One or more appropriate sortase recognition sequences can be added to a protein or peptide not having one by known synthetic and recombinant techniques.
  • 2, 3, 4, 5, 6, 7, 8, 9, 10 or more sortase or transamidase recognition sequences are incorporated in the protein or peptide.
  • the protein or peptide often comprises the amino acid sequence X1PX2X3G, where X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine.
  • X1 is leucine and X3 is threonine.
  • X2 is aspartate, glutamate, alanine, glutamine, lysine or methionine.
  • the protein or peptide often comprises the amino acid sequence NPX1TX2, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
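The two recognition motifs described above can be expressed as simple patterns over one-letter amino acid codes. The following is a minimal sketch, assuming plain-text sequences; the names `MOTIF_X1PX2X3G`, `MOTIF_NPX1TX2` and `find_motifs`, and the example sequences, are hypothetical and are not part of the described processes.

```python
import re

ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"  # the 20 standard amino acids

# X1-P-X2-X3-G: X1 is L, I, V or M; X2 is any amino acid; X3 is T, S or A.
MOTIF_X1PX2X3G = re.compile("[LIVM]P" + ANY_RESIDUE + "[TSA]G")

# N-P-X1-T-X2: X1 is Q or K; X2 is N or G.
MOTIF_NPX1TX2 = re.compile("NP[QK]T[NG]")


def find_motifs(sequence, pattern):
    """Return (0-based start, matched text) for each motif occurrence.

    The search restarts one residue after each hit so that overlapping
    occurrences are also reported.
    """
    seq = sequence.upper()
    hits = []
    pos = 0
    while True:
        match = pattern.search(seq, pos)
        if match is None:
            return hits
        hits.append((match.start(), match.group()))
        pos = match.start() + 1


if __name__ == "__main__":
    # Hypothetical sequences used only for illustration.
    print(find_motifs("MKTAYLPETGHHHHHH", MOTIF_X1PX2X3G))
    print(find_motifs("AANPQTNAA", MOTIF_NPX1TX2))
```

A scan of this kind could also help flag native occurrences of a motif in a protein or peptide before a recognition sequence is added.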
  • Transamidases utilized in the ligation methods described herein sometimes are isolated and incorporated into a kit. Any convenient method known can be utilized to isolate the sortase or transamidase. In certain embodiments, the sortase or transamidase is produced with an N-terminal or C-terminal amino acid sequence that facilitates purification.
  • a kit includes a plasmid having a sortase or transamidase-encoding nucleotide sequence, which the user employs to produce the sortase or transamidase in a cell culture or cell-free translation system; often, the user then purifies the sortase or transamidase according to instructions provided with the kit.
  • the transamidase, protein or peptide and molecule of interest are contacted with one another in a system.
  • the term "contacting" refers to placing the components of the process in close proximity to one another and allowing the molecules to collide by diffusion. Contacting these components with one another can be accomplished by adding them to one body of fluid and/or in one reaction vessel, for example.
  • the components in the system may be mixed in a variety of manners, such as by oscillating a vessel, subjecting a vessel to a vortex generating apparatus, repeated mixing with a pipette or pipettes, or by passing fluid containing one assay component over a surface having another assay component immobilized thereon, for example.
  • system refers to an environment that receives the ligation components, which includes, for example, microtiter plates (e.g., 96-well or 384-well plates), silicon chips having molecules immobilized thereon and optionally oriented in an array (see, e.g., U.S. Patent No. 6,261,776 and Fodor, Nature 364: 555-556 (1993)), and microfluidic devices (see, e.g., U.S. Patent Nos. 6,440,722; 6,429,025; 6,379,974; and 6,316,781).
  • the system can include attendant equipment such as signal detectors, robotic platforms, and pipette dispensers.
  • the system often is cell free and often does not include bacterial cell wall components or intact bacterial cell walls.
  • the system includes one or more cells, often non-bacterial cells or non-Gram-positive bacterial cells.
  • one or more components often are expressed by one or more recombinant nucleotide sequences in a cell, which nucleotide sequences are integrated into the cell genome or non-integrated (e.g., in a plasmid). Cells in such systems often are maintained in vitro, sometimes ex vivo, and sometimes in vivo.
  • the system is maintained at any convenient temperature at which the ligation reaction can be performed.
  • the temperature often is room temperature (e.g., about 25°C) and the temperature can be optimized by repetitively performing the same ligation procedure at different temperatures and determining ligation rates.
  • Any convenient assay volume and component ratio is utilized. In certain embodiments, a component ratio of 1:1000 or greater transamidase enzyme to protein or peptide is utilized, or a ratio of 1:1000 or greater transamidase enzyme to molecule of interest is utilized.
  • the ratio of enzyme to protein or peptide, or of enzyme to molecule of interest, is about 1:1, including 1:2 or greater, 1:3 or greater, 1:4 or greater, 1:5 or greater, 1:6 or greater, 1:7 or greater, 1:8 or greater, and 1:9 or greater.
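As a worked illustration of the component ratios above, the sketch below converts an enzyme-to-substrate ratio into an enzyme concentration and a stock volume. It assumes the ratios are molar ratios, which the text does not state explicitly; the function names and the example concentrations are hypothetical.

```python
def enzyme_concentration_um(substrate_um, substrate_parts_per_enzyme):
    """Enzyme concentration (micromolar) for a 1:N enzyme-to-substrate ratio.

    Assumption: the ratio is molar. `substrate_parts_per_enzyme` is N in 1:N.
    """
    if substrate_parts_per_enzyme <= 0:
        raise ValueError("ratio denominator must be positive")
    return substrate_um / substrate_parts_per_enzyme


def stock_volume_ul(target_um, stock_um, reaction_volume_ul):
    """Volume of enzyme stock (microliters) needed to reach target_um (C1*V1 = C2*V2)."""
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    return target_um * reaction_volume_ul / stock_um


if __name__ == "__main__":
    # Hypothetical reaction: 100 uM protein, 1:1000 ratio, 50 uL volume, 10 uM enzyme stock.
    enzyme_um = enzyme_concentration_um(100.0, 1000)
    print(enzyme_um, stock_volume_ul(enzyme_um, 10.0, 50.0))
```

Under these assumptions, a 1:1 ratio simply sets the enzyme concentration equal to the substrate concentration.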
  • the ligation process often is performed in a system comprising an aqueous environment. Water with an appropriate buffer and/or salt content often is utilized. An alcohol or organic solvent may be included in certain embodiments.
  • the amount of an organic solvent often does not appreciably esterify the protein or peptide in the ligation process (e.g., esterified protein or peptide often increases only by 5% or less upon addition of an alcohol or organic solvent).
  • Alcohol and/or organic solvent contents sometimes are 20% or less, 15% or less, 10% or less or 5% or less, and in embodiments where a greater amount of an alcohol or organic solvent is utilized, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, or 80% or less alcohol or organic solvent is present.
  • the system includes only an alcohol or an organic solvent, with only limited amounts of water if it is present.
  • One or more components for ligation or a ligation product may be immobilized to a solid support.
  • the attachment between an assay component and the solid support may be covalent or non-covalent (e.g., U.S. Patent No. 6,022,688 for non-covalent attachments).
  • the solid support may be one or more surfaces of the system, such as one or more surfaces in each well of a microtiter plate, a surface of a silicon wafer, a surface of a bead (e.g., Lam, Nature 354: 82-84 (1991)) that is optionally linked to another solid support, or a channel in a microfluidic device, for example.
  • Proteins and Polypeptides. Any protein or peptide may be utilized as a target in the ligation process described herein.
  • the protein or peptide often is isolated when utilized in a cell-free system.
  • the protein or peptide sometimes is a subregion of a protein, such as in the N-terminus, C-terminus, extracellular region, intracellular region, transmembrane region, active site (e.g., nucleotide binding region or a substrate binding region), a domain (e.g., an SH2 or SH3 domain) or a post-translationally modified region (e.g., phosphorylated, glycosylated or ubiquitinated region), for example.
  • Peptides often are 50 amino acids or fewer in length (e.g., 45, 40, 35, 30, 25, 20, or 15 amino acids or fewer in length) and proteins sometimes are 100 or fewer amino acids in length, or 200, 300, 400, 500, 600, 700, or 900 or fewer amino acids in length.
  • the protein or peptide sometimes includes the modification moiety or a portion thereof (e.g., the glycosyl group or a portion thereof).
  • the protein is a signal transduction factor, cell proliferation factor, apoptosis factor, angiogenesis factor, or cell interaction factor.
  • cell interaction factors include but are not limited to cadherins (e.g., cadherins E, N, BR, P, R, and M; desmocollins; desmogleins; and protocadherins); connexins; integrins; proteoglycans; immunoglobulins (e.g., ALCAM, NCAM-1 (CD56), CD44, intercellular adhesion molecules (e.g., ICAM-1 and ICAM-2), LFA-1, LFA-2, LFA-3, LECAM-1, VLA-4, ELAM and N-CAM); selectins (e.g., L-selectin (CD62L), E-selectin (CD62e), and P-selectin (CD62P)); agrin; CD34; and a cell surface protein that is cyclically internalized or internalized in response to ligand binding.
  • signal transduction factors include but are not limited to protein kinases (e.g., mitogen activated protein (MAP) kinase and protein kinases that directly or indirectly phosphorylate it, Janus kinase (JAK1), cyclin dependent kinases, epidermal growth factor (EGF) receptor, platelet-derived growth factor (PDGF) receptor, fibroblast-derived growth factor receptor (FGF), insulin receptor and insulin-like growth factor (IGF) receptor); protein phosphatases (e.g., PTP1B, PP2A and PP2C); GDP/GTP binding proteins (e.g., Ras, Raf, ARF, Ran and Rho); GTPase activating proteins (GAPs); guanine nucleotide exchange factors (GEFs); proteases (e.g., caspase 3, 8 and 9), ubiquitin ligases (e.g., MDM2, an E3 ubiquitin ligase), acetylation and methyl
  • the protein sometimes is a nucleic acid-associated protein (e.g., histone, transcription factor, activator, repressor, co-regulator, polymerase or origin recognition (ORC) protein), which directly binds to a nucleic acid or binds to another protein bound to a nucleic acid.
  • the protein sometimes is useful as a detectable label, such as a green or blue fluorescent protein.
  • the protein or peptide sometimes is an antibody.
  • Antibodies sometimes are IgG, IgM, IgA, or IgE, sometimes are polyclonal or monoclonal, and sometimes are chimeric, humanized or bispecific versions of such antibodies.
  • polyclonal and monoclonal antibodies that bind specific antigens are commercially available, and methods for generating such antibodies are known.
  • polyclonal antibodies are produced by injecting an isolated antigen into a suitable animal (e.g., a goat or rabbit); collecting blood and/or other tissues from the animal containing antibodies specific for the antigen and purifying the antibody.
  • Methods for generating monoclonal antibodies include injecting an animal (e.g., often a mouse or a rat) with an isolated antigen; isolating splenocytes from the animal; fusing the splenocytes with myeloma cells to form hybridomas; isolating the hybridomas; and selecting hybridomas that produce monoclonal antibodies which specifically bind the antigen (e.g., Kohler & Milstein, Nature 256:495-497 (1975) and StGroth & Scheidegger, J Immunol Methods 5:1-21 (1980)).
  • Examples of monoclonal antibodies are anti-MDM2 antibodies, anti-p53 antibodies (pAB421, DO-1, and an antibody that binds phosphoryl-Ser15), anti-dsDNA antibodies and anti-BrdU antibodies, described hereafter.
  • Methods for generating chimeric and humanized antibodies also are known (see, e.g., U.S. Patent No. 5,530,101 (Queen, et al.), U.S. Patent No. 5,707,622 (Fung, et al.) and U.S. Patent Nos.
  • Antigen-binding regions of antibodies include a light chain and a heavy chain, and the variable region is composed of regions from the light chain and the heavy chain.
  • because the variable region of an antibody is formed from six complementarity-determining regions (CDRs) in the heavy and light chain variable regions, one or more CDRs from one antibody can be substituted (i.e., grafted) with a CDR of another antibody to generate chimeric antibodies.
  • humanized antibodies are generated by introducing amino acid substitutions that render the resulting antibody less immunogenic when administered to humans.
  • the protein or peptide sometimes is an antibody fragment, such as a Fab, Fab', F(ab')2, Dab, Fv or single-chain Fv (ScFv) fragment, and recombinant methods for generating antibody fragments are known (e.g., U.S. Patent Nos. 6,099,842 and 5,990,296 and PCT/GB00/04317).
  • single-chain antibody fragments are constructed by joining a heavy chain variable region with a light chain variable region by a polypeptide linker (e.g., the linker is attached at the C-terminus or N-terminus of each chain), and such fragments often exhibit specificities and affinities for an antigen similar to the original monoclonal antibodies.
  • Bifunctional antibodies sometimes are constructed by engineering two different binding specificities into a single antibody chain and sometimes are constructed by joining two Fab' regions together, where each Fab' region is from a different antibody (e.g., U.S. Patent No. 6,342,221).
  • Antibody fragments often comprise engineered regions such as CDR-grafted or humanized fragments.
  • the binding partner is an intact immunoglobulin, and in other embodiments the binding partner is a Fab monomer or a Fab dimer.
  • Proteins and peptides sometimes are chemically synthesized using known techniques (e.g., Creighton, 1983, Proteins, New York, N.Y.: W. H. Freeman and Company; and Hunkapiller et al., Nature 310(5973):105-11 (1984)).
  • a peptide can be synthesized by a peptide synthesizer.
  • non-classical amino acids or chemical amino acid analogs can be introduced as a substitution or addition into the fragment sequence.
  • Non-classical amino acids include but are not limited to D-isomers of the common amino acids, 2,4-diaminobutyric acid, α-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoroamino acids, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid analogs in general.
  • Each amino acid in the peptide often is L (levorotatory) and sometimes is D (dextrorotatory). Proteins often are produced by known recombinant methods, or sometimes are purified from natural sources. Native protein and peptide sequences sometimes are modified. For example, conservative amino acid modifications may be introduced at one or more positions in the amino acid sequences of target polypeptides. A "conservative amino acid substitution" is one in which the amino acid is replaced by another amino acid having a similar structure and/or chemical function. Families of amino acid residues having similar structures and functions are well known.
  • These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
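The side-chain families listed above lend themselves to a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes and treating a substitution as conservative when both residues share at least one listed family; that rule, and the names `FAMILIES`, `shared_families` and `is_conservative`, are illustrative assumptions rather than a definition taken from the text.

```python
# Side-chain families as listed above, written with one-letter codes.
FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a, residue_b):
    """Return the sorted names of families that contain both residues."""
    a, b = residue_a.upper(), residue_b.upper()
    return sorted(
        name for name, members in FAMILIES.items() if a in members and b in members
    )


def is_conservative(residue_a, residue_b):
    """True when two different residues share at least one family (assumed rule)."""
    if residue_a.upper() == residue_b.upper():
        return False  # not a substitution at all
    return bool(shared_families(residue_a, residue_b))


if __name__ == "__main__":
    print(is_conservative("K", "R"))   # both basic
    print(is_conservative("D", "K"))   # acidic vs. basic
    print(shared_families("V", "I"))   # residues can belong to more than one family
```

Because a residue can sit in more than one family (valine is both nonpolar and beta-branched), the lookup returns every shared family rather than a single class.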
  • Proteins or peptides may exist as chimeric or fusion polypeptides.
  • a "chimeric polypeptide" or "fusion polypeptide" includes a protein or peptide linked to a different polypeptide. The different polypeptide can be fused to the N-terminus or C-terminus of the target polypeptide.
  • Fusion polypeptides can include a moiety having high affinity for a ligand.
  • the fusion polypeptide can be a GST-target fusion polypeptide in which the protein or peptide sequences are fused to the C-terminus of the GST sequences, or a polyhistidine-target fusion polypeptide in which the protein or peptide is fused at the N- or C-terminus to a string of histidine residues (e.g., sometimes three to six histidines).
  • Such fusion polypeptides can facilitate purification of recombinant protein or peptide.
  • Expression vectors are commercially available that already encode a fusion moiety, and a nucleotide sequence encoding the peptide or polypeptide can be cloned into an expression vector such that the fusion moiety is linked in-frame to the target polypeptide.
  • the fusion polypeptide can be a protein or peptide containing a heterologous signal sequence at its N-terminus.
  • expression, secretion, cellular internalization, and cellular localization of a target polypeptide can be increased through use of a heterologous signal sequence.
  • Fusion polypeptides sometimes include all or a part of a serum polypeptide (e.g., an IgG constant region or human serum albumin).
  • the protein or peptide sometimes is modified by a process or with a moiety not typically incorporated into a protein during translation.
  • the protein or peptide comprises one or more moieties selected from an alkyl moiety (e.g., methyl moiety), an alkanoyl moiety (e.g., an acetyl group (e.g., an acetylated histone)), an alkanoic acid or alkanoate moiety (e.g., a fatty acid), a glyceryl moiety (e.g., a lipid), a phosphoryl moiety, a glycosyl moiety (e.g., N-linked or O-linked carbohydrate chains) or a ubiquitin moiety.
  • any of numerous chemical modifications may be carried out by known techniques, including but not limited to specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation, oxidation, reduction; metabolic synthesis in the presence of tunicamycin; and the like.
  • the N-terminal and/or C-terminal ends may be processed (e.g., the N-terminal methionine may not be present due to prokaryotic expression of the protein or peptide) and chemical moieties may be attached to the amino acid backbone.
  • Proteins and peptides sometimes are modified with a detectable label, such as an enzymatic, fluorescent, isotopic or affinity label to allow for detection and isolation of the polypeptide.
  • the protein or peptide often is modified to include an appropriate recognition sequence.
  • Where a recognition sequence is present in a native protein or peptide amino acid sequence, the recognition sequence often is removed unless it is located near the N-terminus or C-terminus.
  • a recognition sequence can be removed from a native amino acid sequence by synthesizing the sequence without the recognition sequence, or by modifying some or all of the nucleotide sequence encoding the amino acid recognition sequence by known recombinant techniques.
  • When a recognition sequence and non-native sequence useful for purification are introduced to the native protein or peptide amino acid sequence, the recognition sequence often is incorporated closer to the N-terminus than the non-native sequence useful for purification such that the latter is cleaved from the protein or peptide during ligation.
  • the transamidase also is modified with the non-native sequence and the ligated product is purified away from the reactants when contacted with a solid support that binds the non-native sequence.
  • Where the protein or peptide is modified with a detectable label or homing sequence for a detectable label, such sequences often are incorporated closer to the N-terminus than the recognition sequence so they are not cleaved from the protein or peptide in the ligation process.
  • A is a molecule of interest; N is an NH2-CH2- moiety; B is a protein or peptide; RS is a recognition sequence for the transamidase; C is a portion of the protein or peptide released after the transamidase reaction; RS' is a portion of the recognition sequence retained with the protein or peptide after the transamidase reaction and RS" is a portion of the recognition sequence released with C after the transamidase reaction.
  • C sometimes does not exist in embodiments where the recognition sequence is at the C-terminus of the peptide or protein.
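The roles of B, RS, RS', RS" and C defined above can be illustrated with a small string model. This is a sketch under stated assumptions, not the described process itself: it assumes an LPETG recognition sequence that is cleaved between threonine and glycine (the behavior generally reported for Staphylococcus aureus sortase A, which this passage does not state), and it represents the NH2-CH2- bearing molecule as a peptide with N-terminal glycines. The function `ligate`, the class `LigationResult` and the example sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LigationResult:
    conjugate: str  # B + RS' joined to the nucleophile (N-A)
    released: str   # RS" + C, the portion removed from the protein or peptide


def ligate(substrate, nucleophile, motif="LPETG", retained=4):
    """Model transamidation on one-letter sequences.

    substrate   -- B-RS-C, e.g. a protein followed by the motif and a purification tag
    nucleophile -- N-A, written here as a peptide with N-terminal glycine(s)
    retained    -- number of motif residues kept with B (RS'); 4 assumes
                   cleavage between T and G of LPETG
    """
    start = substrate.find(motif)
    if start < 0:
        raise ValueError("substrate does not contain the recognition sequence")
    cut = start + retained
    return LigationResult(
        conjugate=substrate[:cut] + nucleophile,
        released=substrate[cut:],
    )


if __name__ == "__main__":
    # Hypothetical substrate: protein + LPETG + six-histidine tag; triglycine nucleophile.
    result = ligate("MKTAYLPETGHHHHHH", "GGGK")
    print(result.conjugate)  # tag-free conjugate
    print(result.released)   # released fragment carrying the tag
```

In this model the purification tag lies on the C-terminal side of the recognition sequence, so it ends up in the released fragment; when the motif sits at the very C-terminus of the motif's retained portion plus nothing further, the released portion C is empty apart from RS".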
  • the protein or peptide sometimes includes more than one recognition sequence. Recognition sequences are described above.
  • the transamidase catalyzes formation of an amide linkage between an NH2-CH2- moiety, which is joined to and/or is in the molecule of interest, and a carboxyl moiety in the protein or peptide.
  • Suitable NH2-CH2- moieties are known and can be determined by performing the linkage processes described herein in a routine manner. Where a molecule of interest does not include a suitable NH2-CH2- moiety, one or more NH2-CH2- moieties are joined to the molecule. Methods for joining one or more NH2-CH2- moieties to a molecule of interest are known and can be developed.
  • NH2-CH2- moieties often utilized in the processes described herein are present in one or more glycine amino acids in or derivatized to the molecule of interest. In certain embodiments, between one and six glycines are present in or are incorporated into/onto the molecule of interest, and in specific embodiments, the molecule of interest is derivatized with three glycines.
  • the molecule of interest can be any molecule that leads to a useful molecule/protein or peptide conjugate. In certain embodiments, the molecule of interest is a protein or peptide.
  • the molecule of interest sometimes is an antibody epitope, an antibody, a recombinant protein, a synthetic peptide or polypeptide, a peptide comprising one or more D-amino acids, a peptide comprising all D-amino acids, a peptide comprising one or more unnatural or non-classical amino acids (e.g., ornithine), a peptide mimetic, or a branched peptide.
  • the molecule of interest is a peptide that confers enhanced cell penetrance to the protein or peptide (e.g., a greater amount of the protein or peptide conjugated to the peptide of interest is translocated across a cell membrane in a certain time frame as compared to the protein or peptide not conjugated to the peptide of interest), which is referred to herein as a "protein transduction domain (PTD)" peptide or "transduction peptide."
  • Any PTD can be conjugated to a protein or peptide using the methods described herein.
  • PTD peptides are known, and include amino acid subsequences from HIV-tat (e.g., U.S. Patent No. 6,316,003), sequences from a phage display library (e.g., U.S. 20030104622) and sequences rich in amino acids having positively charged side chains (e.g., guanidino-, amidino- and amino-containing side chains; e.g., U.S. Patent No. 6,593,292).
  • the PTD peptide sometimes is branched as described hereafter.
  • the molecule of interest sometimes is a detectable moiety. Any known and convenient detectable moiety can be utilized. In certain embodiments avidin, streptavidin, a fluorescent molecule, or a radioisotope is linked to the protein or peptide.
  • biotin or another vitamin, such as thiamine or folate is linked to the protein or peptide.
  • detectable moieties include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioisotopes.
  • suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
  • suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin;
  • suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin;
  • an example of a luminescent material includes luminol;
  • examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioisotopes include 125I, 131I, 35S or 3H.
  • the radioisotope sometimes is selected based upon its appropriate use in a nuclear medicinal procedure, such as Be-7, Mg-28, Co-57, Zn-65, Cu-67, Ge-68, Sr-82, Rb-83, Tc-95m, Tc-96, Pd-103, Cd-109, and Xe-127 for example.
  • Conjugates between a protein or peptide and a detectable label are useful in diagnostic procedures. Diagnostic procedures include, for example, nuclear medicinal procedures for locating diseased locations of a subject, and procedures for detecting specific components or pathogens in a biological sample from a subject. The conjugates also are useful as research tools.
  • the conjugates are useful in flow cytometry techniques and for detecting a cellular location of a specific protein or peptide in a cell.
  • an NH2-CH2-derivatized fluorophore and a transamidase are contacted with cells that express a protein linked to a sortase recognition sequence.
  • the transamidase, often a sortase, joins the fluorophore to the protein and allows detection of the cell, protein in the cell, or protein on the cell surface.
  • the transamidase is expressed by a nucleotide sequence that encodes it in the cell (e.g., the nucleotide sequence sometimes is in a plasmid), and in other embodiments useful for detecting a protein on a cell surface or in a fixed cell, exogenous transamidase protein, often isolated protein, is contacted with the cell.
  • the location of the protein in a cell sometimes is detected by a fluorescence imaging technique in cells fixed to a solid support. Imaging techniques include, for example, using a standard light microscope or a confocal microscope (e.g., U.S. Patent Nos. 5,283,433 and 5,296,703 (Tsien)).
  • Appropriate light microscopes are commercially available and are useful for probing cells in two dimensions (i.e., the height of a cell often is not resolved), and confocal microscopy is useful for probing cells in three-dimensions.
  • Many microscopy techniques are useful for determining the location of a protein in a cell (e.g., in the nucleus, cytoplasm, plasma cell membrane, nucleolus, mitochondria, vacuoles, endoplasmic reticulum or Golgi apparatus). Some microscopic techniques are useful for determining the location of molecular antigens in groups of cells, tissue samples, and organs. Cellular locations often are visualized by counter-staining for subcellular organelles.
  • cells expressing the protein are subjected to a known flow cytometry procedure, such as flow microfluorimetry (FMF) or fluorescence activated cell sorting (FACS) (see, e.g., U.S. Patent Nos. 6,090,919 (Cormack, et al.); 6,461,813 (Lorens); and 6,455,263 (Payan)).
  • the molecule of interest sometimes is a polymer or a small molecule. Polymers sometimes are useful for enhancing protein or peptide solubility, stability and circulating time, and/or for decreasing immunogenicity when the protein or peptide is administered to a subject.
  • the polymer sometimes is a water soluble polymer such as polyethylene glycol, ethylene glycol/propylene glycol copolymers, carboxymethylcellulose, dextran, polyvinyl alcohol and the like, for example.
  • the protein or peptide may include one, two, three or more attached polymer moieties after ligation.
  • the polymer may be of any molecular weight, and may be branched or unbranched. For polyethylene glycol, the molecular weight often is between about 1 kDa and about 100 kDa (the term "about" indicating that in preparations of polyethylene glycol, some molecules will weigh more, some less, than the stated molecular weight).
  • Any small molecule derivatized with or having an NH2-CH2- moiety can be linked to a protein or peptide having a transamidase recognition site.
  • Figure 1 shows an example of a method useful for derivatizing a small molecule with a triglycine moiety.
  • Compounds can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone, which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann et al., J. Med. Chem. 37: 2678-85 (1994)); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; "one-bead one-compound" library methods; and synthetic library methods using affinity chromatography selection.
  • Biolibrary and peptoid library approaches are typically limited to peptide libraries, while the other approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, Anticancer Drug Des. 12: 145, (1997)).
  • Examples of methods for synthesizing molecular libraries are described, for example, in DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90: 6909 (1993); Erb et al., Proc. Natl. Acad. Sci. USA 91 : 11422 (1994); Zuckermann et al., J. Med. Chem.
  • plasmids (Cull et al., Proc. Natl. Acad. Sci. USA 89: 1865-1869 (1992)).
  • Compounds may alter expression or activity of KIAA0861 polypeptides and may be a small molecule.
  • Small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.
  • the small molecule folate, spermine or puromycin is linked to a protein or peptide.
  • the small molecule is a modification moiety described above, such as a phosphoryl moiety, ubiquitin moiety or a glycosyl moiety, for example.
  • the molecule of interest is a nucleic acid, such as a deoxyribonucleic acid, a ribonucleic acid, a nucleic acid derivative, or a modified nucleic acid.
  • the nucleic acid may comprise or consist of DNA nucleotide sequences (e.g., genomic DNA (gDNA) and complementary DNA (cDNA)) or RNA nucleotide sequences (e.g., mRNA, tRNA, and rRNA).
  • the nucleic acid sometimes is about 8 to about 50 nucleotides in length, about 8 to about 35 nucleotides in length, and sometimes from about 10 to about 25 nucleotides in length.
  • Nucleic acids often are 40 or fewer nucleotides in length, and sometimes are 35 or fewer, 30 or fewer, 25 or fewer, 20 or fewer, and 15 or fewer nucleotides in length.
  • Synthetic oligonucleotides can be synthesized using standard methods and equipment, such as by using an ABI3900 High Throughput DNA Synthesizer, which is available from Applied Biosystems (Foster City, CA).
  • a nucleic acid sometimes is an analog or derivative nucleic acid, which can include backbone/linkage modifications (e.g., peptide nucleic acid (PNA) or phosphorothioate linkages) and/or nucleobase modifications. Examples of such modifications are set forth in U.S. Patent No. 6,455,308 (Freier et al.); in U.S. Patent Nos.
  • Nucleic acids may be modified by chemical linkages, moieties, or conjugates that enhance activity, cellular distribution, or cellular uptake of nucleic acid, and examples of modifications in modified nucleic acids are in U.S. Patent Nos. 6,455,308 (Freier), 6,455,307 (McKay et al.), 6,451,602 (Popoff et al.), and 6,451,538 (Cowsert).
  • the molecule of interest sometimes is a toxin.
  • any toxin may be selected, and often is selected for high cytotoxic activity.
  • Proteins or peptides ligated to a toxin often are useful as therapeutics.
  • a protein or peptide antibody or receptor that specifically binds to a cancer cell when linked to a toxin such as ricin is useful for treating cancer in subjects.
  • the toxin is selected from the group consisting of abrin, ricin A, pseudomonas exotoxin and diphtheria toxin.
  • the molecule of interest sometimes is a solid support.
  • the solid support often is derivatized with multiple NH2-CH2- moieties, such as triglycine moieties.
  • the solid support may be any described herein.
  • the solid support is a glass slide, a glass bead, a silicon wafer or a resin.
  • a resin such as EAH Sepharose is derivatized with triglycine moieties using an FMOC/EDC derivatization procedure.
  • the molecule of interest sometimes is a phage that expresses a NH2-CH2- moiety on the surface.
  • the phage expresses a protein or peptide comprising one or more N-terminal glycines, sometimes three or five glycines at the N-terminus, and the phage expressing such a protein is contacted with a protein or peptide containing a transamidase recognition motif and a transamidase, thereby producing a phage/protein or phage/peptide conjugate.
  • a protein or peptide expressed at the phage surface comprises the transamidase recognition motif, and the phage is contacted with a molecule of interest comprising a NH2-CH2- moiety and a transamidase, thereby producing a conjugate between the phage and the molecule of interest.
  • a protein or peptide comprising a NH2-CH2- moiety or a transamidase recognition sequence can be expressed on the surface of any other phage or virus, including but not limited to, murine leukemia virus (MLV), mouse mammary tumor virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29 (MC29), and Avian erythroblastosis virus (AEV), human immunodeficiency virus (HIV), simian immunodefici
  • the protein or peptide sometimes comprises the molecule of interest, such that the protein or peptide is cyclized in the ligation reaction.
  • the molecule of interest sometimes is an amino acid sequence located at the N-terminus of a protein or peptide, where the amino acid sequence initiates with one or more glycines.
  • Addition of a transamidase, such as a sortase, then cyclizes the linear peptide or protein. Cyclized proteins or peptides often exhibit advantageously enhanced stability as compared to the linear counterpart, and sometimes exhibit enhanced affinity for a target receptor as compared to the linear counterpart.
  • a fusion protein often comprises a solid phase association region, a target protein or target peptide, a peptidase, and a peptidase recognition sequence, where the peptidase is capable of cleaving the fusion protein at the recognition sequence.
  • the solid phase association region, peptidase, target protein or target peptide, and peptidase recognition sequence elements are in a contiguous amino acid sequence and these elements are arranged in any suitable orientation.
  • the solid phase association region sometimes is located at the N-terminus of a fusion protein, and in some embodiments, it is located at the C-terminus of a fusion protein.
  • the peptidase sequence sometimes is located closer to the N-terminus of the fusion protein than the target protein or target peptide sequence, and sometimes is located closer to the C-terminus of the fusion protein than the target protein or target peptide sequence.
  • the peptidase recognition sequence often is located between the target protein or target peptide sequence and the peptidase sequence.
  • a fusion protein sometimes comprises amino acid sequences in the following orientation (N-terminal to C-terminal orientation): solid phase association region, peptidase, peptidase recognition sequence, and target protein or target peptide.
  • a fusion protein sometimes comprises amino acid sequences in the following orientation (N-terminal to C-terminal orientation): target protein or target peptide, peptidase, peptidase recognition sequence, and solid phase association region.
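The two element orders listed above can be sketched as a simple concatenation of named elements in N-terminal to C-terminal order. This is only an illustration: the `Element` class, the `assemble` helper, and every sequence string below are hypothetical placeholders, not sequences from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str      # role of the element in the fusion protein
    sequence: str  # one-letter amino acid sequence (placeholder values only)


def assemble(elements):
    """Concatenate elements in N-terminal to C-terminal order."""
    return "".join(e.sequence for e in elements)


# Placeholder sequences chosen for illustration; none is a real enzyme or target.
tag = Element("solid phase association region", "HHHHHH")
peptidase = Element("peptidase", "MKKWTNRL")
site = Element("peptidase recognition sequence", "LPETG")
target = Element("target peptide", "GSAMDYKD")

# Orientation 1: association region, peptidase, recognition sequence, target.
n_terminal_tag = assemble([tag, peptidase, site, target])

# Orientation 2: target, peptidase, recognition sequence, association region.
c_terminal_tag = assemble([target, peptidase, site, tag])

print(n_terminal_tag)
print(c_terminal_tag)
```

Keeping the elements as separate named records makes it easy to try other orientations or to insert linker elements between any two of them.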
  • a fusion protein sometimes is in association with a solid support that specifically binds to the solid phase association region.
  • a fusion protein sometimes comprises sequences other than a solid phase association region, target protein or target peptide sequence, peptidase sequence, and peptidase recognition sequence.
  • a fusion protein sometimes comprises one or more linker sequences that flank elements of the fusion protein (e.g., flanking the solid phase association region, target protein or target peptide sequence, peptidase sequence, and/or peptidase recognition sequence).
  • a linker sequence sometimes is located between the peptidase sequence and the target protein or target peptide sequence.
  • a linker sequence sometimes is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25 or fewer, 30 or fewer, 40 or fewer, 50 or fewer, 60 or fewer, 70 or fewer, 80 or fewer, 90 or fewer or 100 or fewer amino acids in length.
  • a fusion protein sometimes includes an N-terminal sequence and/or C-terminal sequence other than a solid phase association region, target protein or target peptide sequence, peptidase sequence, and peptidase recognition sequence.
  • a fusion protein sometimes comprises a sequence that increases expression, secretion, cellular internalization, and cellular localization of the fusion protein. Fusion proteins sometimes include all or a part of a serum polypeptide (e.g., an IgG constant region or human serum albumin).
  • a fusion protein comprises a sequence capable of exporting the fusion protein to an intracellular compartment near a host cell surface (e.g., the fusion protein can be released from host cells by exposing the host cells to osmotic shock conditions).
  • a fusion protein comprises a sequence capable of secreting the fusion protein outside of a host cell (e.g., host cells need not be lysed as fusion proteins are secreted from the cell and readily collected). Examples of such sequences are known and sometimes are included in the nucleic acids and fusion proteins described herein (e.g., Izard et al., Mol Microbiol. 1994 Sep;13(5):765-73; Bolhuis et al., Microbiol Mol Biol Rev. 2000 Sep;64(3):515-47; Giga-Hama et al., Biotechnol. Appl. Biochem. 1999 30:235-244). [0070] A fusion protein comprises amino acids of any type.
  • Amino acids include, but are not limited to, D- or L-isomer amino acids, natural amino acids (e.g., any of the 20 naturally occurring L-isomer amino acids), unnatural or non-classical amino acids, and homologs of alpha amino acids such as beta2- and beta3-amino acids and gamma amino acids.
  • Unnatural or non-classical amino acids include, but are not limited to, ornithine, diaminobutyric acid, norleucine, pyrylalanine, thienylalanine, naphthylalanine, phenylglycine, alpha* and alpha-disubstituted* amino acids, N-alkyl amino acids*, lactic acid*, halide derivatives of natural amino acids such as trifluorotyrosine*, p-Cl-phenylalanine*, p-Br-phenylalanine*, p-I-phenylalanine*, L-allyl-glycine*, beta-alanine*, L-alpha-amino butyric acid*, L-gamma-amino butyric acid*, L-alpha-amino isobutyric acid*, L-epsilon-amino caproic acid#, 7- amino heptanoic acid*, L-methion
  • the notation * indicates a derivative having hydrophobic characteristics and # indicates a derivative having hydrophilic characteristics.
  • Methods for introducing unnatural or non-classical amino acids and amino acid homologs are known, which include, for example, processes utilizing a heterologous tRNA/synthetase pair in E. coli, where the tRNA recognizes an amber stop codon and is loaded with an unnatural amino acid (e.g., http address www.iupac.org/news/prize/2003/wang.pdf).
  • a fusion protein also may include suitable spacer groups inserted between any two amino acids, such as alkyl groups (e.g., methyl, ethyl or propyl groups) or amino acid spacers (e.g., glycine or beta-alanine), for example.
  • a fusion protein also may comprise peptoids.
  • the term "peptoids" refers to variant amino acid structures where the alpha-carbon substituent group is linked to the backbone nitrogen atom rather than the alpha-carbon. Processes for preparing peptides in peptoid form are known (e.g., Simon et al., PNAS (1992) 89(20), 9367-9371 and Horwell, Trends Biotechnol.
  • a fusion protein sometimes is modified by a process or with a moiety not typically incorporated into a protein during translation.
  • a fusion protein comprises one or more moieties selected from an alkyl moiety (e.g., methyl moiety), an alkanoyl moiety (e.g., an acetyl group (e.g., an acetylated histone)), an alkanoic acid or alkanoate moiety (e.g., a fatty acid), a glyceryl moiety (e.g., a lipid), a phosphoryl moiety, a glycosyl moiety (e.g., N-linked or O-linked carbohydrate chains) or a ubiquitin moiety.
  • any of numerous chemical modifications may be carried out by known techniques, including but not limited to specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation, oxidation, reduction; metabolic synthesis in the presence of tunicamycin; and the like.
  • the N-terminal and/or C-terminal ends may be processed (e.g., the N-terminal methionine may not be present due to prokaryotic expression of the protein or peptide) and chemical moieties may be attached to the amino acid backbone.
  • the resulting N-terminal amino acid sometimes is substituted or deleted or one or more amino acids are inserted before it when that amino acid is glycine.
  • Fusion proteins sometimes are modified with a detectable label, such as an enzymatic, fluorescent, isotopic or affinity label to allow for detection.
  • a solid phase association region, which sometimes is referred to herein as a "solid phase association sequence," includes any moiety or amino acid sequence suitable for associating a fusion protein with a solid support.
  • Any suitable solid phase for protein or peptide purification processes can be utilized (e.g., cellulose, plastic, glass, polystyrene), and the solid support often is derivatized with a binding pair member and the solid phase association region comprises, consists essentially of, or consists of the other binding pair member.
  • Any binding pair members can be utilized that allow for association of a fusion protein with a solid phase via the solid phase association region.
  • binding pair members include, but are not limited to, protein/ligand (e.g., maltose binding protein/maltose, glutathione S-transferase/glutathione); metal/metal-binding moiety (e.g., metal/polyhistidine amino acid sequence, nickel/His6); antibody/epitope (e.g., antibody/FLAG sequence); antibody/antigen; antibody/antibody; antibody/antibody fragment; antibody/antibody receptor; antibody/protein A or protein G; hapten/anti-hapten; biotin/avidin; biotin/streptavidin; folic acid/folate binding protein; vitamin B12/intrinsic factor; nucleic acid/complementary nucleic acid (e.g., DNA, RNA, PNA); and chemical reactive group/complementary chemical reactive group (e.g., sulfhydryl/maleimide, sulfhydryl/haloacetyl derivative,
  • the solid phase association region comprises, consists essentially of, or consists of a polyhistidine amino acid sequence (e.g., His6) and the solid phase is derivatized with nickel or copper ions.
  • the solid phase association region directly associates with the solid phase, and sometimes it is associated indirectly with the solid phase (e.g., the solid phase association region and derivatized solid phase are linked by a bifunctional linker moiety).
  • a solid phase association region sometimes is included in the fusion protein during fusion protein production.
  • a solid phase association region or a portion of it sometimes is incorporated in a nucleic acid that encodes the fusion protein.
  • an expressed fusion protein sometimes includes a polyhistidine tract that can bind a metal-derivatized solid phase, and a biotin moiety sometimes is included in a fusion protein produced by recombinant expression (e.g., pcDNA™6 BioEase™ Gateway® Biotinylation System (Invitrogen); an avidin- or streptavidin-derivatized solid phase can bind the biotin in the fusion protein).
  • a solid phase association region sometimes is added to a fusion protein after fusion protein production.
  • Methods for derivatizing a fusion protein with a solid phase associating agent are known (e.g., a biotin-derivatized antibody or antibody fragment that specifically binds the fusion protein sometimes is contacted with fusion protein and the product is contacted with an avidin- or streptavidin-derivatized solid phase).
  • an amino acid sequence for any peptidase, peptidase fragment or sequence variant thereof that is capable of cleaving the fusion protein can be incorporated in the fusion protein using known processes.
  • the term "peptidase" refers to any amino acid sequence capable of cleaving a backbone amide bond in the fusion protein, and a peptidase, in some embodiments, may be capable of performing additional types of reactions.
  • the peptidase has hydrolytic activity, transferase activity, ligase activity, splicing activity, cyclization activity, or a combination of the foregoing.
  • the peptidase sometimes modifies the fusion protein by linking the following to an end of a fusion protein cleavage product: a water molecule (e.g., hydrolytic activity), another molecule (e.g., an exogenous molecule; transferase or ligase activity), an end of another fusion protein cleavage product (e.g., splicing activity), or another end of the same fusion protein cleavage product (e.g., cyclization activity).
  • the peptidase selected sometimes cleaves the fusion protein by an intramolecular reaction (i.e., the peptidase cleaves the fusion protein in which it is located), sometimes cleaves by an intermolecular reaction (i.e., the peptidase cleaves another fusion protein molecule), and sometimes cleaves by intermolecular and intramolecular reactions.
  • the selected peptidase does not excise all or part of its own sequence from the fusion protein and link ends of the fusion protein cleavage products to one another.
  • a selected peptidase often has endopeptidase activity, where the peptidase cleaves within the fusion protein sequence, and often the peptidase does not cleave a terminal amino acid from the fusion protein.
  • the peptidase can cleave the fusion protein at any catalytic rate, and peptidases having moderate to slow catalytic activity often are selected.
  • a selected peptidase often has moderate to low hydrolytic activity.
  • the catalytic activity of the peptidase is expressed in terms of the kinetic parameter k_cat/K_M, and sometimes this parameter is in a range of about 0.5 M⁻¹s⁻¹ to about 50 M⁻¹s⁻¹, and sometimes is about 5 M⁻¹s⁻¹.
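To give a feel for what a specificity constant of this size implies, the following rough sketch estimates a substrate half-life. It assumes the parameter is k_cat/K_M in units of M⁻¹s⁻¹, an intermolecular reaction, and a substrate concentration far below K_M, so that the rate is approximately (k_cat/K_M)[E][S]. The 10 µM enzyme concentration and the function name are hypothetical, chosen only for illustration.

```python
import math


def pseudo_first_order_half_life(kcat_over_km, enzyme_molar):
    """Half-life in seconds when [S] << K_M, so that v = (kcat/K_M)[E][S]."""
    k_obs = kcat_over_km * enzyme_molar  # observed first-order rate constant, s^-1
    return math.log(2) / k_obs


# Assumed values: kcat/K_M = 5 M^-1 s^-1 and 10 micromolar enzyme (hypothetical).
half_life_s = pseudo_first_order_half_life(5.0, 10e-6)
half_life_h = half_life_s / 3600.0

print(f"half-life: {half_life_s:.0f} s ({half_life_h:.2f} h)")
```

Under these assumptions cleavage proceeds on a timescale of hours, which is consistent with selecting a slow peptidase so that little fusion protein is cleaved before it is bound to the solid phase.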
  • the peptidase has hydrolytic activity and ligase activity depending upon substrates available.
  • the peptidase sometimes is a sortase (described in further detail above) that hydrolyzes the fusion protein when the system is substantially free of NH2-CH2-containing substrate (e.g., a polyglycine such as triglycine), and/or cleaves and ligates a fusion protein cleavage product to a NH2-CH2-containing substrate present in the system.
  • the peptidase also comprises transamidase activity, and sometimes the peptidase is a sortase, such as a sortase described above. [0076]
  • peptidase sequence fragments or variants are selected to favor certain fusion protein parameters.
  • peptidase sequence fragments and variants sometimes are selected for lower catalytic activity and/or enhanced fusion protein production levels as compared to the corresponding full-length, native peptidase sequence.
  • Peptidase sequence fragments and variants having lower catalytic activity sometimes are selected to reduce rates of fusion protein cleavage during protein purification, described in further detail herein. The effects of a peptidase sequence fragment or variant in a fusion protein on such parameters sometimes are determined by producing fusion proteins with full-length, native peptidase sequences and fusion proteins with fragment and/or variant peptidase sequences and comparing parameters for the respective fusions.
  • a fragment sometimes consists of about 80% of the full-length peptidase amino acid sequence, and sometimes about 70%, about 60%, about 50%, about 40% or about 30% of the full-length peptidase amino acid sequence.
  • a peptidase sequence variant sometimes differs by one or more amino acid substitutions, insertions or deletions, such as 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid substitutions, insertions or deletions from the native sequence or subsequence, and sometimes is substantially identical to the native peptide sequence or subsequence.
  • substantially identical refers to sequences sharing one or more identical amino acid sequences. Included is an amino acid sequence that is 55% or more, 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more (each often within a 1%, 2%, 3% or 4% variability) identical to another amino acid sequence.
  • One test for determining whether two sequences are substantially identical is to determine the percent of identical sequences shared between nucleic acids, proteins or peptides. Sequence identity can be determined as described above.
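As a simplified illustration of a percent identity calculation, the sketch below compares two sequences that are assumed to be already aligned, with `-` marking gaps. It is not the identity procedure referred to above. The function name, the gap convention, and the choice to count every aligned column in the denominator are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy aligned sequences (hypothetical): 6 matching columns out of 8.
identity = percent_identity("LPETGAAK", "LPKTG-AK")
print(identity)  # 75.0
```

Other conventions, such as dividing by the length of the shorter sequence, give different percentages, so the denominator should be stated whenever an identity threshold is applied.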
  • a peptidase sequence can be utilized as a "query sequence" in database searches useful for identifying alternative peptidase sequences for use in fusion proteins that are substantially identical to a query peptidase sequence.
  • the peptidase sequence is a sortase fragment having catalytic activity, and the fragment sometimes is a SrtAc fragment.
  • a sortase fragment sometimes is a catalytic core region that recognizes and cleaves a threonine-glycine bond in a Leu-Pro-Xaa-Thr-Gly sequence.
  • a catalytic core region from SrtAc is utilized, and sometimes the catalytic core region is from about position 60 to about position 206 of native SrtAc.
  • the peptidase sequence sometimes is a sortase variant or sortase fragment variant, sometimes a SrtAc variant or SrtAc fragment variant, and sometimes is a variant with reduced activity compared to native sortase or native SrtAc.
  • the SrtAc variant or SrtAc fragment variant includes an amino acid substitution at Trp 194, such as an alanine at that position. [0079] As disclosed above, the peptidase is capable of cleaving the fusion protein at the recognition sequence.
  • the recognition sequence often is selected based upon the peptidase sequence in the fusion protein, and any recognition sequence that the peptidase can specifically recognize can be utilized.
  • Peptidase recognition sequences are known and can be incorporated into a fusion using known recombinant molecular biology processes.
  • the recognition sequence often comprises the amino acid sequence X1PX2X3G, where X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine.
  • X1 is leucine and X3 is threonine.
  • X2 is aspartate, glutamate, alanine, glutamine, lysine or methionine.
  • the recognition sequence often comprises the amino acid sequence NPX1TX2, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
  • the recognition sequence often is Leu-Pro-Xaa-Thr-Gly, where Xaa is any amino acid.
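The recognition motifs described above translate directly into regular expressions over one-letter amino acid codes. The sketch below is one possible encoding; the pattern names, the helper function, and the test sequence are hypothetical.

```python
import re

# [LIVM]-P-any-[TSA]-G, written as a lookahead so overlapping hits are reported.
LPXTG_LIKE = re.compile(r"(?=([LIVM]P[A-Z][TSA]G))")
# N-P-[QK]-T-[NG]
NPXTX_LIKE = re.compile(r"(?=(NP[QK]T[NG]))")


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched motif) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical test sequence containing one motif of each kind.
seq = "MGSLPETGHHNPQTN"
lpxtg_hits = find_motifs(seq, LPXTG_LIKE)
npxtx_hits = find_motifs(seq, NPXTX_LIKE)

print(lpxtg_hits)  # [(4, 'LPETG')]
print(npxtx_hits)  # [(11, 'NPQTN')]
```

Positions are reported 1-based to match the usual convention for numbering residues from the N-terminus.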
  • the term "at the recognition sequence” sometimes refers to the peptidase cleaving the fusion protein within the recognition sequence, such that each cleavage product includes a portion of the recognition sequence.
  • the term sometimes refers to the peptidase cleaving the fusion protein at a site in the fusion protein adjacent to the recognition sequence, sometimes at a position 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids away from an end of the recognition sequence, and sometimes at a position 15 or fewer, 20 or fewer, 50 or fewer or 100 or fewer amino acids away from an end of the recognition sequence.
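For the case where cleavage occurs within the recognition sequence, the sketch below splits a sequence at the threonine-glycine bond of the first Leu-Pro-Xaa-Thr-Gly motif, so that each product carries part of the motif. The function name and the example sequence are hypothetical, and only the first motif is considered.

```python
import re


def cleave_at_lpxtg(fusion):
    """Split at the Thr-Gly bond of the first LPXTG motif; None if no motif."""
    match = re.search(r"LP[A-Z]TG", fusion)
    if match is None:
        return None
    cut = match.end() - 1  # index of the Gly; the scissile bond precedes it
    return fusion[:cut], fusion[cut:]


# Hypothetical sequence: His tag, spacer, LPETG site, then a short target.
products = cleave_at_lpxtg("HHHHHHAAALPETGSSKKK")
print(products)  # ('HHHHHHAAALPET', 'GSSKKK')
```

The N-terminal product ends in LPET and the C-terminal product begins with the motif glycine.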
  • the recognition sequence often is located between the peptidase sequence and target protein sequence or target peptide sequence in a fusion protein.
  • a linker sequence sometimes is located between the peptidase sequence and target protein sequence or target peptide sequence in a fusion protein and the recognition sequence sometimes is located in the linker sequence.
  • the recognition sequence often is located closer to the target protein or target peptide sequence, where an end of the recognition sequence sometimes is 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids away from an end of the target protein or target peptide sequence, and sometimes is 15 or fewer, 20 or fewer, 50 or fewer or 100 or fewer amino acids away from an end of the target protein or target peptide sequence.
  • Target Proteins, Target Peptides, Nucleic Acids and Host Cells
  • Any protein or peptide amino acid sequence may be incorporated into a fusion protein as a target protein or target peptide for the one-step purification processes described herein. Proteins and peptides described above for ligation processes can be incorporated into fusion proteins for one-step purification processes, for example. Expressing the protein or peptide in a fusion sometimes increases the solubility of the protein or peptide as compared to when it is expressed alone and not part of a fusion protein.
  • a nucleic acid sometimes comprises a nucleotide sequence that encodes a fusion protein comprising a solid phase association region, a peptidase, and a peptidase recognition sequence, where the peptidase is capable of cleaving the fusion protein at the recognition sequence.
  • the nucleic acid is of any composition useful for generating further copies of the nucleic acid and/or fusion protein expression.
  • the nucleic acid often comprises, consists essentially of or consists of DNA or RNA, often is double-stranded, sometimes is single-stranded, sometimes is linear and sometimes is a plasmid.
  • a nucleic acid sometimes includes a region for inserting a target protein or target peptide sequence, such as one or more topoisomerase recognition sites, one or more sites adapted for an amplification process (e.g., polymerase chain reaction (PCR) process) and a nucleotide sequence with one or more restriction enzyme sites convenient for cloning the fusion protein-encoding nucleotide sequence into the nucleic acid, for example.
  • a nucleic acid sometimes includes a nucleotide sequence that encodes a target protein or target peptide.
  • a nucleic acid often includes one or more regulatory sequences operatively linked to the nucleotide sequence that encodes the fusion protein.
  • regulatory sequence includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals), for example. Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. Regulatory sequences are of viral origin in certain embodiments. For example, commonly used viral promoter sequences are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. A nucleic acid sometimes is capable of directing fusion protein expression in a particular cell type (e.g., tissue-specific regulatory elements are used to express the fusion protein).
  • tissue-specific promoters include, but are not limited to, an albumin promoter (liver-specific; Pinkert et al, Genes Dev. 1 : 268-277 (1987)), lymphoid-specific promoters (Calame & Eaton, Adv. Immunol. 43: 235-275 (1988)), promoters of T cell receptors (Winoto & Baltimore, EMBO J.
  • promoters of immunoglobulins (Banerji et al., Cell 33: 729-740 (1983); Queen & Baltimore, Cell 33: 741-748 (1983)), neuron-specific promoters (e.g., the neurofilament promoter; Byrne & Ruddle, Proc. Natl. Acad. Sci. USA 86: 5473-5477 (1989)), pancreas-specific promoters (Edlund et al., Science 230: 912-916 (1985)), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application Publication No. 264,166).
  • a nucleic acid sometimes is transiently or stably transfected or transformed into host cells.
  • a nucleic acid sometimes includes one or more integration nucleotide sequences for integrating a nucleotide sequence that encodes a fusion protein into a host cell genome, and such integration sequences often flank the fusion protein-encoding nucleotide sequence.
  • the design of the nucleic acid sometimes depends on such factors as the choice of host cell to be transformed, desired level of fusion protein expression, and the like.
  • the nucleic acid can be designed for fusion protein expression in prokaryotic and/or eukaryotic cells.
  • fusion proteins can be expressed in bacteria (e.g., E. coli), insect cells (e.g., Sf9 cells using baculovirus expression vectors), yeast cells, fungi cells, or mammalian cells.
  • Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990), for example.
  • nucleotide sequences encoding the fusion protein in the nucleic acid can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
  • a nucleotide sequence in the nucleic acid that encodes a fusion protein sometimes is codon-optimized depending upon the host organism selected for expression.
  • cleavage sites for the peptidase in the fusion protein other than the cleavage site that releases the target protein or target peptide from the fusion protein often are removed from or modified in the nucleic acid before the fusion protein is expressed.
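One way to locate the additional cleavage sites mentioned above is to scan the translated fusion sequence for every motif occurrence and report those that are not the intended release site. This is a sketch only: the function name, the default Leu-Pro-Xaa-Thr-Gly pattern, and the example sequence are assumptions, and deciding how to modify a flagged site is left to the designer.

```python
import re


def secondary_sites(fusion, intended_start, pattern=r"LP[A-Z]TG"):
    """Return 1-based starts of motif hits other than the intended release site."""
    lookahead = re.compile(f"(?=({pattern}))")  # lookahead reports overlapping hits
    starts = [m.start() + 1 for m in lookahead.finditer(fusion)]
    return [s for s in starts if s != intended_start]


# Hypothetical fusion: an unwanted LPATG at position 2, intended LPETG at 15.
extra = secondary_sites("MLPATGKKHHHHHHLPETGSAMD", intended_start=15)
print(extra)  # [2]
```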
  • Any host cell suitable for producing and expressing a fusion protein can be transfected or transformed with a nucleic acid described herein.
  • compositions comprising a host cell in combination with a nucleic acid described herein and compositions comprising a host cell in combination with a fusion protein described herein.
  • Host cells sometimes are a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells or a strain of mammalian cells.
  • a fusion protein described herein in a system which comprise contacting a system that comprises a nucleic acid encoding the fusion protein with conditions suitable for expressing the fusion protein.
  • the system sometimes is a cell-free environment for in vitro expression of the fusion protein.
  • the system often is an environment containing cells, such as a cell culture plate or flask containing host organism cells. Any host cell suitable for protein expression is utilized, including but not limited to, a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells.
  • a host cell sometimes is transiently transfected or transformed and sometimes stably transfected or transformed with a nucleic acid encoding the fusion protein.
  • Any condition suitable for expressing the fusion protein can be utilized, such as conditions in which host cells are multiplying or conditions in which host cells are not multiplying. Parameters such as media type, vessel type (e.g., flask, dish, fermentation reactor), temperature, humidity, agitation levels, inducer concentration, and time(s) of inducer addition sometimes are set according to standard procedures and sometimes are modified to optimize fusion protein expression.
  • An inducer often is selected based upon the type of promoter sequence(s) in the nucleic acid (e.g., the inducer isopropyl beta-D-thiogalactoside often is utilized with nucleic acids having a T7 polymerase promoter sequence).
  • the host organism is in suspension, and the fusion protein sometimes is produced at a level of 1 mg/L of cell culture or more, 2 mg/L of cell culture or more, 5 mg/L of cell culture or more, 10 mg/L of cell culture or more, 15 mg/L of cell culture or more, 20 mg/L of cell culture or more, 25 mg/L of cell culture or more, 30 mg/L of cell culture or more, 35 mg/L of cell culture or more, 40 mg/L of cell culture or more, 50 mg/L of cell culture or more, 75 mg/L of cell culture or more, 100 mg/L of cell culture or more, 250 mg/L of cell culture or more, 500 mg/L of cell culture or more, 750 mg/L of cell culture or more or 1000 mg/L of cell culture or more.
  • a fusion protein produced in a system often is purified and isolated.
  • a method for isolating a target protein or target peptide which comprises: contacting a fusion protein described herein with a solid phase capable of specifically binding to the solid phase association region of the fusion protein and collecting target protein or target peptide cleaved from the fusion protein associated with the solid support.
  • isolated and purified often refer to a target protein or target peptide activity having a specific activity of at least ten-fold greater than the corresponding activity present in a crude extract, lysate, or other state from which target protein or target peptide have not been removed.
  • isolated and purified also refer to the target protein or target peptide being substantially free from proteins and other components in host cell or cell-free systems.
  • substantially free often refers to a target protein or target peptide having 30% or less, 20% or less, 10% or less, and sometimes 5% or less (by dry weight) of non-target polypeptide (also referred to herein as a "contaminating protein"), or of chemical precursors or non-target chemicals.
  • Isolated and purified target protein or target peptide often is substantially free of culture medium, where culture medium represents less than about 20%, sometimes less than about 10%, and often less than about 5% of the volume of the target protein or target peptide preparation.
  • Isolated or purified target protein or target peptide preparations sometimes are 0.01 milligrams or more or 0.1 milligrams or more, and often 1.0 milligrams or more or 10 milligrams or more in dry weight. Purity of an isolated target peptide or target protein can be determined by a suitable method, such as mass spectrometry, gel electrophoresis and densitometry, for example.
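The purity criteria above (a specific activity at least ten-fold greater than that of the crude material, and a low percentage of contaminating protein by dry weight) can be checked with simple arithmetic. The sketch below uses invented numbers and hypothetical function names purely to show the calculation.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Activity per milligram of total protein."""
    return total_activity_units / total_protein_mg


def fold_purification(purified_sa, crude_sa):
    """Ratio of purified to crude specific activity."""
    return purified_sa / crude_sa


def percent_contaminant(target_mg, total_mg):
    """Non-target material as a percentage of total dry weight."""
    return 100.0 * (total_mg - target_mg) / total_mg


# Invented example values, not data from this document.
crude = specific_activity(5000.0, 1000.0)      # 5 units/mg
purified = specific_activity(3000.0, 20.0)     # 150 units/mg
fold = fold_purification(purified, crude)      # 30-fold
meets_tenfold = fold >= 10.0
contaminant = percent_contaminant(19.0, 20.0)  # 5% by dry weight

print(fold, meets_tenfold, contaminant)
```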
  • In embodiments where the fusion protein is produced in a host cell, the host cell sometimes is contacted with conditions that release the fusion protein, including but not limited to, cell lysis conditions and osmotic stress conditions.
  • the cells sometimes are contacted with reagents before, during or after the cells are exposed to the releasing conditions, such as contacting cells or cell lysates with one or more protease inhibitors.
  • the fusion protein sometimes is secreted from a host cell, and in such embodiments, medium surrounding the cells often is collected and fusion protein often is purified from the collected medium.
  • Any solid phase suitable for protein or peptide purification can be utilized for the purification processes, such as a solid phase described herein.
  • the solid phase is derivatized with metal ions, such as nickel ions, and the solid phase association region of the fusion protein includes an amino acid sequence that binds to a metal ion, such as a polyhistidine sequence (e.g., His6).
  • the solid phase is arranged in a configuration suitable for protein purification, such as in a vessel that retains the solid phase and allows liquid reagents to pass through (e.g., chromatography columns, centrifugation columns having a membrane, high performance liquid chromatography columns) and vessels for separating a solid phase from liquid phase by centrifugation (centrifugation vessels with an optional membrane).
  • a fusion protein is contacted with a solid phase under conditions that allow the fusion protein to associate with the solid phase, often by specific binding of the solid phase association region to the solid phase, and often in a buffer solution comprising low salt concentration.
  • components not associated with the solid phase often are separated from the solid phase using standard procedures.
  • the solid phase often is washed under conditions that maintain association of the fusion protein with the solid support.
  • a fusion protein in association with a solid phase often is contacted with a substance that facilitates or accelerates fusion protein cleavage by the peptidase.
  • any substance that facilitates or accelerates peptidase activity can be utilized, and in embodiments where the peptidase is a sortase, sortase fragment, or sequence variant thereof, the substance sometimes is an NH2-CH2-containing substance, optionally with calcium ions. Calcium ions can be introduced to the purification system by addition of calcium chloride, for example.
  • the NH2-CH2-containing substance sometimes is a polyglycine and sometimes is triglycine.
  • the cleaved target protein or target peptide often includes an added N-terminal glycine donated by the polyglycine substance for fusion proteins oriented with the peptidase region closer to the N-terminus of the fusion protein than the target protein or target peptide region.
  • the NH2-CH2-containing substance sometimes is characterized by the formula NH2-CH2-Z, where Z is a molecule of interest.
  • the cleaved target protein or target peptide often includes an added C-terminal Z moiety donated by the NH2-CH2-Z substance for fusion proteins oriented with the target protein or target peptide region closer to the N-terminus of the fusion protein than the peptidase region.
  • Z is any molecule of interest that yields a stable target protein-Z product or target peptide-Z product. Any suitable molecule of interest can be utilized, such as molecules of interest described above.
  • the temperature sometimes is room temperature (e.g., about 25°C), sometimes is colder than room temperature (e.g., about 4°C, which can beneficially reduce peptidase cleavage rates and stabilize eluted target protein or target peptide), and temperature can be optimized by performing purification procedures at different temperatures.
  • Purification processes often are performed in a system comprising an aqueous environment. Water with an appropriate buffer and/or salt content often is utilized.
  • An alcohol or organic solvent may be included in certain embodiments. The amount of an organic solvent often does not appreciably esterify the fusion protein, target protein or target peptide (e.g., esterified protein or peptide often increases only by 5% or less upon addition of an alcohol or organic solvent).
  • Alcohol and/or organic solvent contents sometimes are 20% or less, 15% or less, 10% or less or 5% or less, and in embodiments where a greater amount of an alcohol or organic solvent is utilized, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, or 80% or less alcohol or organic solvent is present.
  • the purification system includes only an alcohol or an organic solvent, with only limited amounts of water if it is present.
  • Target protein or target peptide released from the fusion protein sometimes is 90% or more pure, 91% or more pure, 92% or more pure, 93% or more pure, 94% or more pure, 95% or more pure, 96% or more pure or 97% or more pure, and sometimes is 98% or more pure or 99% or more pure.
  • kits which comprise one or more containers that include any of the compositions and products described herein, and often include instructions for producing a fusion protein and/or purifying a target protein or target peptide from the fusion protein.
  • one or more containers in the kit comprise a nucleic acid described herein.
  • a nucleic acid in the kit sometimes comprises a nucleotide sequence that encodes a fusion protein containing a solid phase association region, a peptidase region and a peptidase recognition sequence, and sometimes includes a sequence that facilitates incorporation of a nucleotide sequence encoding a target protein or target peptide into the nucleic acid.
  • a kit sometimes includes instructions for inserting a target protein-encoding or target peptide-encoding nucleotide sequence into a provided nucleic acid.
  • a kit sometimes includes one or more oligonucleotides, polymerases, isomerases (e.g., topoisomerase), restriction enzymes and/or ligases, which sometimes are utilized to insert a target protein-encoding or target peptide-encoding nucleotide sequence into the nucleic acid provided in the kit.
  • a nucleic acid in the kit sometimes includes a nucleotide sequence that encodes a target protein or target peptide.
  • a kit, in some embodiments, includes a solid support capable of binding a solid phase association region in a fusion protein, and sometimes the solid support is metal-derivatized for associating with a fusion protein containing a polyhistidine solid phase association region.
  • a kit sometimes includes host cells into which a nucleic acid can be transformed or transfected, and often is useful for expressing a fusion protein from the nucleic acid.
  • Such kits can include any type of suitable host cells, including a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells.
  • a kit sometimes includes one or more substances that facilitate cleavage of a fusion protein, such as a calcium ion-containing substance (e.g., CaCl2), and/or an NH2-CH2-containing substance, such as a polyglycine (e.g., triglycine) and/or an NH2-CH2-Z substance described herein.
  • Example 1 Protein Expression and Purification
  • the cell culture was grown in LB medium containing 50 mg/L ampicillin at 37°C. Protein expression was induced with 0.2 mM isopropyl β-D-thiogalactoside (IPTG) at 30°C for 3 hours.
  • the cell paste then was suspended in 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, and 1 mM β-mercaptoethanol and lysed by sonication. Nucleic acids were precipitated by the addition of 0.1% polyethylene imine (PEI). The supernatant of the lysate was applied to a 5-ml Ni-NTA (Qiagen) column.
  • the column was washed with 10 mM Tris-HCl, pH 7.5, 500 mM NaCl, 30 mM imidazole, and 1 mM β-mercaptoethanol.
  • the protein was eluted with 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 250 mM imidazole and 1 mM β-mercaptoethanol (BME). Pooled fractions were buffer exchanged into 50 mM Tris-HCl, pH 7.5 and 150 mM NaCl through a 10DG desalting column (BioRad).
  • the protein was subjected to MALDI-TOF mass spectroscopy (Bruker Daltonic Autoflex), which gave an average mass of 19,067 ± 20 (predicted 19,048).
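  • The comparison between an observed and a predicted mass can be written as a simple tolerance check. The sketch below is illustrative only: the function names are assumptions, and the tolerance of 20 is taken from the measurement uncertainty reported above.

```python
def mass_deviation(observed_da: float, predicted_da: float) -> float:
    """Return the signed difference between observed and predicted mass."""
    return observed_da - predicted_da


def within_tolerance(observed_da: float, predicted_da: float, tolerance_da: float) -> bool:
    """Return True when the observed mass lies within +/- tolerance of the prediction."""
    return abs(mass_deviation(observed_da, predicted_da)) <= tolerance_da


# Values reported above for the purified protein.
observed, predicted, tolerance = 19067.0, 19048.0, 20.0
print(mass_deviation(observed, predicted))               # 19.0
print(within_tolerance(observed, predicted, tolerance))  # True
```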
  • GGCCGCTATCAGTGGTGGTGGTGGTGGTGCTCGAGGCCGGTTTCCGGAAGG-3' were used to anneal and insert into the pGEX-4T-1 vector between the EcoRI and NotI sites, encoding GST with a C-terminal LPETGHHHHHH sequence.
  • the plasmid was transformed into Escherichia coli BL21 for protein expression.
  • the cell culture was grown in LB medium containing 50 mg/L ampicillin at 37°C.
  • the protein expression was induced with 0.2 mM IPTG at 30°C for 3 hours.
  • the cell paste was then suspended in 35 ml of 1X phosphate buffered saline (PBS) buffer containing 1 mM TCEP [tris(2-carboxyethyl)phosphine].
  • the lysate supernatant was applied to a 5-ml GST-binding column (Novagen).
  • the column was washed with 50 ml PBS containing 1 mM TCEP.
  • the protein was eluted with 10 mM reduced glutathione. Pooled fractions were buffer exchanged into 150 mM NaCl through a 10DG desalting column.
  • the protein mass was confirmed by MALDI-TOF spectroscopy.
  • GST-LPXTG-6His and GFP-LPXTG-6His (10 µM and 35 µM, respectively) were incubated with 10 µM of sortase in buffers containing 150 mM NaCl, 5 mM CaCl2, and 2 mM BME.
  • the buffer pH was controlled by 50 mM sodium acetate (pH 5.5), 50 mM MES (pH 6.5), 50 mM Tris-HCl (pH 7.5) or 50 mM Tris-HCl (pH 8.5). The reactions were incubated at 37°C for 20 hours.
  • the rate of substrate cleavage was measured with the fluorescence increase of Edans at an emission wavelength of 460 nm and an excitation wavelength of 360 nm on a fluorometer (Applied Biosystems CYTOFLUOR Series 4000).
  • the product formation was monitored by C-18 reverse phase HPLC (Vydac, Cat 218TP54) over the course of 28 hrs, using a gradient of 0.5% to 38% CH3CN in 0.1% trifluoroacetic acid in 40 minutes at a flow rate of 1 ml/min. Elution of peptides was monitored at 214 nm and fractions were collected for mass analysis on a MALDI-TOF mass spectrometer. Conjugation of Acetyl-RE(Edans)LPKTGK(Dabcyl)R with an L- or D-Tat (GYGRKKRRQRRR) also followed the same procedures.
  • Fluorescein-Ahx (aminohexanoic acid)-SKLPKTGSE ( -ahx-SKLPKTGSE) was dissolved in ddH2O and added to a final concentration of 1 mM.
  • the peptide was incubated with 10 µM SrtA in buffer containing 50 mM Tris-HCl, pH 8.5, 150 mM NaCl, 5 mM CaCl2, and 2 mM BME at 37°C. The products were analyzed on a MALDI-TOF mass spectrometer.
  • the protein substrate (GFP-LPXTG-6His or GST-LPXTG-6His) was used at concentrations ranging from 10 µM to 35 µM, and a peptide substrate was added in 5- to 10-fold excess.
  • the reactions were incubated at 37°C for 24 to 48 hours in the presence of 10 µM SrtA, 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM CaCl2, and 2 mM BME.
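  • The peptide concentrations implied by a 5- to 10-fold excess over 10 µM to 35 µM protein can be worked out as follows. This is a minimal sketch; the function names and the 100 µl reaction volume are assumptions made for illustration and do not appear in the procedure above.

```python
def peptide_concentration_um(protein_um: float, fold_excess: float) -> float:
    """Return the peptide concentration (µM) for a given molar excess over protein."""
    return protein_um * fold_excess


def amount_nmol(concentration_um: float, volume_ul: float) -> float:
    """Convert a concentration (µM) and volume (µl) to an amount in nmol.

    1 µM equals 1 pmol/µl, so µM x µl gives pmol; dividing by 1000 gives nmol.
    """
    return concentration_um * volume_ul / 1000.0


REACTION_VOLUME_UL = 100.0  # hypothetical reaction volume

for protein_um in (10.0, 35.0):
    for fold in (5.0, 10.0):
        peptide_um = peptide_concentration_um(protein_um, fold)
        print(protein_um, fold, peptide_um, amount_nmol(peptide_um, REACTION_VOLUME_UL))
```

  • Under these assumptions the peptide concentration spans 50 µM (10 µM protein, 5-fold excess) to 350 µM (35 µM protein, 10-fold excess).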
  • the ligation reactions were terminated by passing the reaction mixtures through a 0.5 ml Ni-NTA column equilibrated with 50 mM Tris-HCl pH 7.5 and 150 mM NaCl.
  • the protein ligation product was collected in the column flow through. The column flow through was further purified on 10DG desalting column to remove the unligated peptide.
  • NIH3T3 cells were seeded into twelve-well plates at a density that achieved approximately 80% confluency after an overnight incubation.
  • the cells were incubated with approximately 1 µM of protein in 500 µl of DMEM (Dulbecco's modified Eagle's medium) (Invitrogen) supplemented with 10% fetal bovine serum (FBS) or in 500 µl of Opti-MEM (Invitrogen).
  • the hydrolytic product GST-LPXT (calculated mass 27,276; observed mass 27,297 ± 20) was the major product across the pH range 6.5 to 8.5, and the hydrolysis was not significantly affected by the solution pH. In contrast, less than half of GFP-LPXTG-6His was hydrolyzed after twenty hours, and solution pH appeared to affect the hydrolytic efficiency. Slightly more hydrolytic product GFP-LPXT (calculated mass 27,382; observed mass 27,378 ± 20) was observed at lower pH (less than 7). This observed difference in hydrolysis between the two substrates may be due to the different accessibilities of the LPXTG motif on the proteins.
  • HMW conjugates may be formed by non-specific nucleophilic attacks from protein lysine side chains on the sortase intermediate. Increasing solution pH facilitates deprotonation of these amino groups, making them more nucleophilic. This may explain why at higher pH such as pH 8.5, significantly more HMW conjugates were observed.
  • lysine side chains are capable of performing nucleophilic attack on a sortase intermediate at high pH.
  • the only amino groups that are available for nucleophilic attack are the side chains of two lysines.
  • Both H2O and a free amino group on a protein can act as a nucleophile to slowly release the LPXTG-containing product from sortase.
  • the solution pH can be lowered below 7.
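  • The pH dependence described above can be illustrated with the Henderson-Hasselbalch relationship, which gives the fraction of an amino group present in its deprotonated, nucleophilic form. The sketch below assumes a side-chain pKa of about 10.5 for lysine, a textbook value that is not stated in this disclosure; pKa values in a folded protein can differ substantially.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Return the fraction of a basic group in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


LYSINE_SIDE_CHAIN_PKA = 10.5  # assumed textbook value, not from the source

# The four buffer pH values used in the hydrolysis experiments above.
for ph in (5.5, 6.5, 7.5, 8.5):
    print(ph, fraction_deprotonated(ph, LYSINE_SIDE_CHAIN_PKA))
```

  • Under this assumption, each one-unit increase in pH raises the deprotonated fraction roughly ten-fold, which is consistent with the observation of more HMW conjugates at pH 8.5 and with lowering the pH below 7 to suppress them.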
  • Example 3 Ligation with LPXTG-Containing Peptides and Proteins In Vitro
  • sortase-catalyzed transpeptidation was effected in vitro in the presence of a glycine tripeptide.
  • the native conjugation partner for LPXTG-containing protein in vivo is a pentaglycine cross bridge on cell walls.
  • pentaglycine cross bridge was required for efficient conjugation, and whether a peptide longer than the tripeptide Gly3 could be specifically linked to an LPXTG sequence.
  • Example 4 Conjugating a D-peptide to LPXTG-Containing Substrates
  • an L-polypeptide was effectively conjugated to the C-terminus of an LPXTG motif in the presence of an N-terminal glycine, and the number of glycines did not significantly affect conjugation. Because glycine is an achiral amino acid, there was a possibility that sortase would not discriminate the chirality of the amino acids C-terminal to the glycine.
  • an L- and a D-Tat peptide with identical sequence were synthesized.
  • D-Tat was able to conjugate to form KE(Edans)LPK ⁇ - ⁇ A (observed mass 2,631.5, calculated mass 2,631.4).
  • the amount of D-Tat ligation product was only slightly less than that of L-Tat.
  • the D-Tat peptide was ligated to the GFP-LPXTG-6His protein with a peptide to protein ratio of 5 to 1. Over the course of 48 hours, over 90% of the protein was conjugated to the D-peptide.
  • sortase could conjugate both L- and D-peptide substrates to an LPXTG motif.
  • Sortase appears to contain an elongated binding groove to recognize the LPXTG motif, but the binding site for the amino donor substrate is much shallower.
  • the chirality of the incoming peptide should not interfere with the conjugation, as long as it contains an N-terminal glycine that can serve as a nucleophile to attack the sortase-thioester intermediate.
  • Example 5 Conjugating NH2-CH2-Containing Compounds to LPXTG Substrates
  • Sortase activity was tested further with non-peptidyl substrates. Since an N-terminal glycine rather than amino acids with a branched alpha-carbon facilitates nucleophilic attack, it was possible that sortase might accommodate a substrate with an NH2-CH2- group.
  • sortase was able to use both spermine and NH2-PEG-COOH as substrates to form specific GFP conjugates.
  • the ligation efficiencies of both spermine and NH2-PEG-COOH are better than that of a free glycine, but lower than that of a peptide containing an N-terminal glycine.
  • Sortase-mediated side products were also found in both the spermine and the NH2-PEG-COOH ligation reactions.
  • Protein transduction domains (PTDs) are a class of cationic peptides able to facilitate efficient protein transduction in vitro and in vivo. Recombinant proteins with PTD fusions typically are expressed as aggregates or inclusion bodies in E. coli. In an effort to obtain soluble active PTD fusion protein, sortase-mediated ligation was utilized to conjugate a protein with PTDs.
  • the synthetic peptide RRQRRTSKLMKR (PTD5) has been shown to possess protein transduction activity, and was used as a PTD.
  • a fusion protein containing a single PTD5 sequence may be generated by recombinant expression
  • a protein containing more complex and efficient PTD moieties cannot be readily generated using either the recombinant expression or chemical synthesis.
  • the sortase-based conjugation method was used to generate a conjugate between the synthetic branched PTD peptide GGY-K-K(Ahx-PTD5)2 and the GFP-LPXTG-6His recombinant protein.
  • a conjugate of the linear peptide GGY-PTD5 with GFP-LPXTG-6His also was prepared. It was determined that the conjugation efficiency with the branched peptide was similar to that of the linear peptide (GGY-PTD5).
  • Example 7 Application of a Sortase Variant to Protein Ligation Processes
  • Nucleic acids encoding sortase B are prepared and isolated according to processes described in Mazmanian et al., Proc. Natl. Acad. Sci. USA 99: 2293-2298 (2002), US 2003/0153020 and documents referenced therein, from which sortase B enzyme is produced and isolated as described therein and above in Example 1.
  • Sortase B is utilized in the processes described in Examples 2-6, with target proteins and peptides having an NPX1TX2 recognition sequence, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
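  • The two recognition sequences discussed in these examples, LPXTG for sortase A and NPX1TX2 for sortase B, can be located in a protein sequence with regular expressions. The following is a minimal sketch; the pattern and function names are hypothetical, and X in LPXTG is taken to mean any of the twenty standard amino acids.

```python
import re

ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"

# Lookahead patterns so that overlapping occurrences are also reported.
SORTASE_A_MOTIF = re.compile(rf"(?=(LP{ANY_RESIDUE}TG))")
# X1 is glutamine (Q) or lysine (K); X2 is asparagine (N) or glycine (G).
SORTASE_B_MOTIF = re.compile(r"(?=(NP[QK]T[NG]))")


def find_motifs(sequence: str, pattern: re.Pattern) -> list:
    """Return (zero-based start index, matched motif) pairs for a one-letter sequence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Peptide and tag sequences that appear in the examples above.
print(find_motifs("SKLPKTGSE", SORTASE_A_MOTIF))    # [(2, 'LPKTG')]
print(find_motifs("LPETGHHHHHH", SORTASE_A_MOTIF))  # [(0, 'LPETG')]
```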
  • Example 8 One Step Protein Purification Systems
  • Described hereafter is a fusion system for one-step purification processes.
  • a component of the fusion, sortase A (SrtA), is a transpeptidase found in the cell envelope of Staphylococcus aureus.
  • SrtA uses calcium as a cofactor to first cleave the Thr-Gly bond at an LPXTG recognition motif on a surface protein and subsequently to form a peptide bond between the threonine and a pentaglycine on the cell wall peptidoglycan.
  • a purification scheme that combines affinity purification, SrtA cleavage, and separation of the fusion partner in a single IMAC chromatography step (e.g., Figure 2), and generates purified recombinant protein with an extra glycine only at its N-terminus.
  • pGHSL-emGFP: The gene encoding the SrtAc-LPETG-emGFP fusion was constructed using overlapping PCR. First, primers 1 and 2 (5'-GATATACATATGCAAGCTAAACCTCAAATTCCG-3' and 5'-GGATCCGGTTTCCGGAAGCTTTTTGACTTCTGTAGCTACAAAG-3', respectively) and template pET23b-SrtAc were used to PCR amplify the DNA sequence that encodes amino acids 60 to 206 of SrtA (GenBank Accession No. AF162687). An additional sequence encoding a KLPETGS linker was added at the 3' end of the SrtA gene. Primers 3 and 4 (5'-AAGCTTCCGGAAACCGGATCCATGGTGAGCAAGGGCG-3' and 5'-ATATACATATGGTGAGCAAGGGCG-3', respectively) and template pET24b-emGFP were used to PCR amplify a second DNA sequence encoding a KLPETGS linker and emerald GFP (emGFP). Subsequently, the first two PCR products were mixed together, and primers 1 and 4 were used to amplify the DNA sequence encoding the protein fusion SrtAc-KLPETGS-emGFP (SL-emGFP), which contains a BamHI site between the coding sequences of SrtAc-LPETG and emGFP.
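  • The overlap that allows the two first-round PCR products to be joined can be checked directly from the sequences of primers 2 and 3 given above. The sketch below is an illustration only: the helper names are hypothetical, and the standard genetic code is assumed for translation.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_2 = "GGATCCGGTTTCCGGAAGCTTTTTGACTTCTGTAGCTACAAAG"  # reverse primer, SrtAc fragment
PRIMER_3 = "AAGCTTCCGGAAACCGGATCCATGGTGAGCAAGGGCG"        # forward primer, emGFP fragment


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of a DNA sequence, starting at the first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def overlap_length(forward_primer: str, reverse_primer: str) -> int:
    """Length of the longest forward-primer prefix that ends the reverse primer's top strand."""
    top_strand = reverse_complement(reverse_primer)
    for n in range(min(len(top_strand), len(forward_primer)), 0, -1):
        if top_strand.endswith(forward_primer[:n]):
            return n
    return 0


n = overlap_length(PRIMER_3, PRIMER_2)
print(n, PRIMER_3[:n], translate(PRIMER_3[:n]))  # 21 AAGCTTCCGGAAACCGGATCC KLPETGS
```

  • The 21-nucleotide overlap shared by the two primers encodes the KLPETGS linker and ends with the GGATCC BamHI recognition sequence.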
  • pAHSL-emGFP: Two oligos (5'-GATATACCATGGCCAGCAGCCATCATC-3' and 3'-GATGATGGCTGCTGGCCATGGTATATC-5') were used to make the Gly2 to Ala mutation of the fusion in the plasmid pGHSL-emGFP using a Quick Change Mutagenesis kit (Stratagene).
  • pGHS'L-emGFP: Two oligos (5'-AATGAAAAGACAGGCGTTGCGGAAAAACGTAAAATCTTT-3' and 3'-AAAGATTTTACGTTTTTCCGCAACGCCTGTCTTTTCATT-5') were used to mutate Trp 194 of SrtA to Ala using the pGHSL-emGFP plasmid as a template. The resulting plasmid was used for expressing the fusion protein GHS'L-emGFP.
  • pAHS'L-emGFP: Similar to pGHS'L-emGFP, the same oligos were used to mutate Trp 194 of SrtA to Ala using the pAHSL-emGFP plasmid as a template. The resulting plasmid was used for expressing the fusion protein AHS'L-emGFP.
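  • Site-directed mutagenesis of this kind uses two fully complementary oligos. As a quick consistency check, the sketch below tests whether the second sequence of each pair, read left to right as printed, equals the reverse complement of the first. The function and variable names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_complementary_pair(first: str, second: str) -> bool:
    """True when `second`, read left to right, is the reverse complement of `first`."""
    return reverse_complement(first) == second.upper()


# Oligo pairs as printed above.
GLY2_TO_ALA = (
    "GATATACCATGGCCAGCAGCCATCATC",
    "GATGATGGCTGCTGGCCATGGTATATC",
)
TRP194_TO_ALA = (
    "AATGAAAAGACAGGCGTTGCGGAAAAACGTAAAATCTTT",
    "AAAGATTTTACGTTTTTCCGCAACGCCTGTCTTTTCATT",
)

print(is_complementary_pair(*GLY2_TO_ALA))    # True
print(is_complementary_pair(*TRP194_TO_ALA))  # True
```

  • Both pairs pass, which is the pattern expected of a complementary mutagenesis primer pair when both oligos are written in the 5' to 3' direction.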
  • pET15b-emGFP: Primers (5'-GGCAGCCATATGATGGTGAGCAAGGGCGAG-3' and 5'-CGGATCCTCGAGTCACTTGTACAGCTCGTCCATGC-3') were used to PCR amplify the DNA sequence encoding emGFP from a pET24b-emGFP vector. The PCR product was then digested with Nde I and Xho I and inserted into the pET15b vector to express a fusion protein H-emGFP that contains an N-terminal His6 tag and a thrombin cleavage site followed by emGFP.
  • Plasmid pGHSL-Cre was also constructed by overlapping PCR similarly to pGHSL-emGFP, except that the emGFP coding sequence was replaced with the bacteriophage P1 Cre (GenBank Accession No.
  • primers 5 and 6 (5'-AAGCTTCCGGAAACCGGATCCATGTCCAATTTACTGACCGTAC-3' and 5'-TCCTTACTCGAGTTAATCGCCATCTTCCAGCAG-3', respectively) and template P 24b-Tat-Cre in the first round of PCR, and primers 1 and 6 were used in the overlapping PCR to obtain the DNA sequence encoding the protein fusion SrtAc-LPETGS-Cre (SL-Cre). The resulting DNA fragment was digested with Nde I and Xho I and subsequently ligated into the pET15b vector.
  • Plasmids pAHSL-Cre, pGHS'L-Cre, and pAHS'L-Cre were generated through site-directed mutagenesis using similar approaches as the emGFP constructs.
  • Expression constructs for p27: Plasmid pGHSL-p27 was constructed similarly to pGHSL-emGFP, except that primers 7 and 8 (5'-AAGCTTCCGGAAACCGGATCCATGTCAAACGTGCGAGTGT-3' and 5'-TCCTTACTCGAGTTACGTTTGACGTCTTCTGAGG-3', respectively) and template pTAT-HA-p27 were used to PCR amplify the human cyclin-dependent kinase inhibitor 1B (p27) (GenBank Accession No. NM_004064), and primers 1 and 8 were used to amplify the DNA sequence encoding the protein fusion SrtAc-LPETGS-p27 (SL-p27). The resulting DNA fragment was digested with Nde I and Xho I and subsequently ligated into the pET15b vector.
  • Plasmids pAHSL-p27, pGHS'L-p27, and pAHS'L-p27 were generated through site-directed mutagenesis using similar approaches as the emGFP constructs.
  • Protein expression screening and cleavage analysis
  • Plasmid pGHSL-emGFP was transformed into E. coli BL21(DE3) and the cell culture was grown at 37 degrees Celsius in 3 ml LB medium containing 50 mg/L ampicillin and 0.4% glucose.
  • the GHSL-emGFP expression was induced at 30 degrees Celsius with isopropyl beta-D-thiogalactoside (IPTG) in concentrations ranging from 25 micromolar to 1 millimolar for 1 to 5 h. Aliquots of cells were lysed in 1x SDS protein loading buffer and protein expression was analyzed on a NOVEX 4-20% Tris-Glycine SDS PAGE gel.
  • the cell paste was suspended in 5 ml Buffer A (20 mM Tris-HCl, pH 7.5, 50 mM NaCl, and 5 mM beta-mercaptoethanol [BME]) and lysed by sonication. Nucleic acids were precipitated by the addition of 0.1% polyethylene imine (PEI). The supernatant of the lysate was loaded onto a 1 ml Ni-NTA (Qiagen) column and washed with ten column volumes of Buffer B (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 30 mM imidazole, and 5 mM BME) followed by five column volumes of Buffer A.
  • cleavage buffers (+/- 5 mM CaCl2 and 0 to 10 mM Gly3 in Buffer A) and incubated at 4 degrees Celsius or 25 degrees Celsius for 2 to 6 h. All reactions were stopped by mixing the protein samples with an equal volume of 2x SDS protein loading buffer and boiling for 5 minutes. The samples were subsequently analyzed on a NOVEX 4-20% Tris-glycine PAGE gel (Invitrogen). Alternatively, the cleavage reactions were incubated at 25 degrees Celsius for 6 h, and the protein released from induced cleavage was collected in the supernatant.
  • the resin was washed 5 times with Buffer A, and proteins that remained bound to the IMAC column were eluted with a 500 mM imidazole solution in Buffer A. Samples of the cleavage flow-through and the IMAC elution were analyzed subsequently on a NOVEX 4-20% Tris-glycine SDS PAGE gel. The intensities of protein bands were quantified on an Alpha Innotech FluoChem 9900 imaging system.
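  • A screen over calcium, Gly3, temperature and incubation time of the kind described above can be enumerated as a full factorial grid. In the sketch below the end points come from the text, while the intermediate Gly3 concentrations and time points are assumptions chosen only for illustration; the names are hypothetical.

```python
from itertools import product

CALCIUM_MM = (0.0, 5.0)           # with or without 5 mM CaCl2 (from the text)
GLY3_MM = (0.0, 1.0, 5.0, 10.0)   # 0 to 10 mM range from the text; 1 and 5 are assumed
TEMPERATURES_C = (4, 25)          # from the text
TIMES_H = (2, 4, 6)               # 2 to 6 h range from the text; 4 is assumed


def cleavage_conditions() -> list:
    """Return every combination of the screening variables as a list of dicts."""
    return [
        {"cacl2_mM": ca, "gly3_mM": gly, "temp_C": temp, "time_h": hours}
        for ca, gly, temp, hours in product(CALCIUM_MM, GLY3_MM, TEMPERATURES_C, TIMES_H)
    ]


conditions = cleavage_conditions()
print(len(conditions))
print(conditions[0])
```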
  • emGFP Protein purification
  • the protein was obtained from fusion cleavage of GHSL-emGFP, AHSL-emGFP, GHS'L-emGFP, or AHS'L-emGFP. Briefly, BL21(DE3) transformed with pGHSL-emGFP, pAHSL-emGFP, pGHS'L-emGFP, or pAHS'L-emGFP was grown in one liter of media. The cell growth, expression induction, and IMAC column purification conditions were scaled up from the procedures described above.
  • the SrtAc-mediated cleavage was induced by equilibrating the column with Buffer C (20 mM Tris-HCl, pH 7.5, 50 mM NaCl, 5 mM beta-mercaptoethanol, 5 mM CaCl2, and 5 mM Gly3) and incubating at 25 degrees Celsius for 4 to 6 hours.
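  • Preparing Buffer C from concentrated stocks is a direct application of C1 x V1 = C2 x V2. The sketch below uses the final concentrations listed above; the stock concentrations (including roughly 14.3 M for neat beta-mercaptoethanol) and the 50 ml batch size are assumptions for illustration, and the names are hypothetical.

```python
# Final concentrations of Buffer C in mM, as listed in the text.
BUFFER_C_MM = {
    "Tris-HCl, pH 7.5": 20.0,
    "NaCl": 50.0,
    "beta-mercaptoethanol": 5.0,
    "CaCl2": 5.0,
    "Gly3": 5.0,
}

# Hypothetical stock concentrations in mM (assumptions, not from the source).
STOCKS_MM = {
    "Tris-HCl, pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "beta-mercaptoethanol": 14300.0,
    "CaCl2": 1000.0,
    "Gly3": 100.0,
}


def stock_volumes_ml(final_mm: dict, stocks_mm: dict, final_volume_ml: float) -> dict:
    """Return the volume of each stock, plus water, needed for the final volume."""
    volumes = {}
    for name, target in final_mm.items():
        stock = stocks_mm[name]
        if target > stock:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final_volume_ml * target / stock  # V1 = C2 * V2 / C1
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for component, volume in stock_volumes_ml(BUFFER_C_MM, STOCKS_MM, 50.0).items():
    print(f"{component}: {volume:.3f} ml")
```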
  • the cleavage flow-through containing emGFP was collected at one-hour intervals.
  • the protein purity was analyzed on an SDS-PAGE gel and the molecular weight of the protein was analyzed by MALDI-TOF mass spectroscopy (Bruker Daltonic Autoflex).
  • emGFP was purified from thrombin cleavage of fusion protein H-emGFP. Briefly, plasmid pET15b-emGFP was transformed into BL21(DE3) to express fusion protein H-emGFP. The cell growth, expression induction, and IMAC column purification conditions were similar to those of GHSL-emGFP. After column washing with Buffer B, the fusion protein was eluted from the IMAC column using Buffer A containing 200 mM imidazole.
  • the pooled fractions were exchanged into Buffer D (50 mM Tris-HCl, pH 7.5, 5 mM CaCl2, and 150 mM NaCl) through a 10DG desalting column (BioRad) and concentrated to 5 mg/ml.
  • the fusion was incubated with 20 units of restriction grade thrombin (Novagen) at 4 degrees Celsius overnight.
  • the reaction mixture next was subjected to a second round of IMAC in which the His6 tag fragment and the uncleaved fusion were removed while the emGFP was collected in the column flow-through. No further purification was carried out to remove the thrombin from the purified emGFP.
  • the protein purity was analyzed on a SDS-PAGE gel and the molecular weight of the protein was analyzed by MALDI-TOF mass spectroscopy.
  • Cre: the recombinant free protein was obtained from expression, purification and fusion cleavage using pGHSL-Cre, pAHSL-Cre, pGHS'L-Cre, or pAHS'L-Cre. The expression and purification procedure was similar to that of emGFP.
  • p27: the recombinant free protein was obtained from expression, purification and fusion cleavage using pGHSL-p27, pAHSL-p27, pGHS'L-p27, or pAHS'L-p27. The expression and purification procedure was similar to that of emGFP.
  • a prototype plasmid pGHSL-emGFP was made in the pET15b vector to express the fusion protein GHSL-emGFP with an N-terminal His6 tag, SrtAc, and an LPETG linker followed by emGFP at the C-terminus (Figure 3). At least three factors were taken into consideration in designing the self-cleavable fusion protein: fusion stability, affinity purification, and on-column cleavage.
  • the fusion contains an unconventional design by linking SrtAc and its recognition sequence LPETG together with a single glycine spacer and by placing the recognition sequence near the SrtAc substrate-binding site.
  • This design carries a potential risk concerning fusion stability.
  • Because the cleavage activity of SrtAc is inducible and moderate, it was expected that the fusion would be stable during protein expression.
  • the His6 tag, which is small and does not interfere with SrtAc activity, provides an affinity purification means for the fusion protein.
  • the buffer used for induced cleavage is compatible with an IMAC column. Provided that the LPETG recognition sequence is accessible in the fusion, it is anticipated that emGFP will be released from the immobilized fusion via endogenous SrtAc cleavage (inter- or intra-molecular SrtAc cleavage) upon induction.
  • GHSL-emGFP was used as a model system for studying the fusion expression and stability, the IMAC purification, and the fusion cleavage mediated by intra- or inter-molecular SrtAc cleavage at the LPETG recognition junction.
  • emGFP as the first target provides the benefit of a visual marker to track the target protein during purification.
  • Other protein targets can also be cloned into the fusion construct using the BamHI sites flanking the emGFP gene in the pGHSL-emGFP plasmid (Figure 3).
  • fusion variants also were prepared to address two sequence components that could potentially affect the stability of the fusion protein.
  • the first deals with the N-terminal MGSS sequence encoded in the pGHSL-emGFP construct, because the endogenous E. coli aminopeptidase is efficient at removing the first methionine during post-translational processing.
  • the new N-terminal glycine can act as an intra- or inter-molecular nucleophile to cleave emGFP from the fusion and potentially reduce the stability of GHSL-emGFP.
  • pAHSL-emGFP (Figure 3), which encodes the fusion protein AHSL-emGFP with a Gly2 to Ala mutation, was prepared by quick-change mutagenesis of pGHSL-emGFP.
  • the second variant addresses the aforementioned concern that the wild type SrtAc might be too efficient at cleaving the LPETG sequence and neither the GHSL-emGFP nor the AHSL-emGFP fusion could be accumulated in sufficient quantity.
  • a Trp194 to Ala mutation, which reduces the transpeptidation activity of SrtAc, was incorporated into the constructs of pGHS'L-emGFP and pAHS'L-emGFP to express GHS'L-emGFP and AHS'L-emGFP, respectively (Figure 3).
  • Under similar induction conditions, more full-length AHSL-emGFP appeared to accumulate than GHSL-emGFP, and no obvious high molecular weight ligation products were observed. There also was an unexpected observation: one of the cleavage products (the AHSL fragment) migrated around 23 kDa, a position that was different from that of the GHSL fragment (which migrates around 20 kDa).
  • the N-terminal glycine in GHSL likely performed a nucleophilic attack on its own LPETG sequence and formed a cyclized GHSL, which could migrate faster than the linear AHSL in the SDS-PAGE gel.
  • The full-length GHSL-emGFP was the major product purified from the IMAC column, although the high-molecular weight conjugates (GHSL)n-emGFP (n > 2) and the cleaved N-terminal portion GHSL also were purified.
  • To determine the efficiency of on-column cleavage, the effects of temperature, presence of calcium, and concentration of Gly3 were systematically investigated. Some conclusions could be drawn from the analyses of the protein bands. First, cleavage efficiency was generally higher at 25°C than at 4°C.
  • Gly3 also facilitated the cleavage, noticeably by reducing the amount of higher molecular weight ligation products (GHSL)n-emGFP (n > 2) and pushing the equilibrium of cyclized GHSL (the band at 20 kDa) to the linearized GHSL-G3 (the band at 23 kDa).
  • emGFP was eluted in the flow-through after cleavage, while the un-cleaved fusion and the GHSL segment remained bound to the IMAC resin.
  • emGFP After having obtained emGFP through on-column cleavage of GHSL-emGFP, I next compared the efficiency and feasibility of protein production was compared for the four fusion constructs (i.e., GHSL-, AHSL-, GHS'L-, AHS'L-emGFP). Under the same expression and purification conditions, emGFP (observed mass 27,045 ⁇ 16 Da, calculated mass 27,039 Da) with homogeneity up to approximately 98% was obtained from all four fusions. Purified protein yields were different among these four constructs. The Gly 2 to Ala mutation reduced the AHSL-emGFP cleavage during protein expression and purification.
  • a pET15b-emGFP plasmid was constructed to express H-emGFP (an N-terminal His 6 tag and emGFP fusion linked by a thrombin cleavage site) and the untagged emGFP was obtained by a typical thrombin cleavage method.
  • the H-emGFP fusion was first purified on an IMAC column and eluted using an imidazole solution.
  • the H-emGFP fusion was concentrated and treated with thrombin overnight. The mixture then was passed through a second IMAC column to remove the His 6 tag and the un-cleaved fusion. However, separating the unmodified thrombin (33 kDa) away from emGFP (27 kDa) was not routine.
  • One solution was to use biotinylated thrombin, which can be subsequently removed by streptavidin agarose.
  • the purified emGFP was collected within 6 hours from a single IMAC purification using the GHSL-emGFP fusion, whereas three or more chromatography steps and more than 24 hours were required using the H-emGFP fusion. In the end, purities of the purified untagged emGFP were similar. These results showed that time and reagents are saved by the
  • SrtAc at the N-terminus promoted expression in E. coli. This phenomenon is particularly exemplified in the case of emGFP. While emGFP without an N-terminal tag expresses poorly in E. coli ( ⁇ 1 to 2 % of the total proteins in cell lysate), expression dramatically increased with the N-terminal SrtAc (> 30% of the total proteins). Small peptides (15 to 20 amino acids in length), which expressed poorly by themselves, also expressed at high levels when fused C-terminal to SrtAc (data not shown). Although SrtAc enhanced fusion expression, it did not affect fusion solubility.
  • emGFP and p27 fusions were very soluble and the Cre fusion was less soluble. Fusion solubility therefore likely is determined by the intrinsic property of the target protein.
  • One question raised for the fusion constructs concerns enzymatic properties of SrtAc.
  • the target protein or peptide released from the fusion is a result of hydrolysis and transpeptidation catalyzed by SrtAc.
  • SrtAc can catalyze a transpeptidation between an LPXTG sequence and an aminoglycine-containing substrate.
  • a fusion system was constructed for purifying untagged recombinant proteins via a single chromatography step.
  • the purity of the protein obtained from this method was 98% to 99%.
  • the purification method was cost effective, time efficient, and generally applicable to a variety of protein targets.

Abstract

Provided are enzyme-catalyzed methods, and related products and kits, for efficiently linking a molecule to a protein or peptide and conjugates generated therefrom. An enzyme often utilized has transamidase activity, such as a sortase or related enzyme. Provided also are products, processes and kits for purifying in one step a recombinant target protein or target peptide initially expressed as a fusion product. The fusion product includes a solid phase association region, a peptidase, a peptidase recognition sequence and a target protein or target peptide. The peptidase is capable of cleaving the fusion protein when the fusion is bound to a solid support that specifically binds the solid phase association region, whereby the target protein or target peptide is released from the solid support in a substantially purified form. The peptidase sometimes has hydrolase activity, such as a sortase or related enzyme for example.

Description

PROTEIN AND PEPTIDE LIGATION PROCESSES AND ONE-STEP PURIFICATION PROCESSES
Field of the Invention
[0001] The invention relates to efficient enzyme-catalyzed processes for linking a molecule of interest to a protein or peptide, and related kits and products. The invention relates also to processes, products and kits useful for expressing and purifying recombinant proteins and peptides.
Background
[0002] Protein engineering is becoming a widely used tool in many areas of protein biochemistry. One engineering method is controlled protein ligation, and over the past ten years some progress has been made. For instance, synthetic-based chemistry allows joining synthetic peptides together through native chemical ligation and a 166 amino-acid polymer-modified erythropoiesis protein has been synthesized using this method. However, native chemical ligation relies on efficient preparation of synthetic peptide esters, which can be technically difficult to prepare for large polypeptides such as proteins. For example, the reaction sometimes is performed in an organic solvent to produce the requisite protein ester for ligation. Other ligation methods not requiring the production of protein esters generate protein thioesters. An intein-based protein ligation system was used to generate a protein by thiolysis of a corresponding protein-intein thioester fusion. A prerequisite for this intein-mediated ligation method is that the target protein is expressed as a correctly folded fusion with intein, and that there is sufficient spacing between the target and intein to allow formation of the intein-thioester. In many instances, the intein-fusion proteins can only be obtained from inclusion bodies when expressed in Escherichia coli, which often cannot be refolded. This difficulty significantly limits the application of intein-based protein ligation.
[0003] Purification of a tag-free recombinant protein often is challenging and often requires multiple chromatography steps. A tag can be linked to a recombinant protein, and after purification, the tag on the fusion may be cleaved from the target protein by treatment with an exogenously added site-specific protease. Additional chromatographic steps then are required to separate the target protein from the un-cleaved fusion, the affinity tag, and the peptidase.
For example, a N-terminal His6 tag from a recombinant protein may be cleaved by an engineered His6-tagged aminopeptidase, and a subtractive immobilized metal-ion affinity chromatography (IMAC) step can be used to recover the untagged target. Other methods may require two or more chromatography steps and even special treatment of the exogenous peptidase, such as biotinylation, to facilitate its removal.
Summary
[0004] Efficient enzymic processes are described herein that are useful for ligating and purifying proteins and/or peptides. In some embodiments, an enzyme utilized for each process has transamidase catalytic activity. For example, an efficient enzyme-catalyzed protein ligation method has been developed for linking a protein or peptide to a molecule of interest, and it has been discovered that a transamidase enzyme can be utilized to catalyze the ligation process. In specific embodiments, the transamidase is a sortase enzyme. Sortases are present in Gram-positive bacteria and catalyze proteolytic cleavage and linkage of peptides to cell wall components of the bacteria in vivo. The linkage process is efficient, economical and widely applicable. Four characteristics of the process underscoring these features are (1) the reaction can be performed simply by combining a transamidase with the protein or peptide and the molecule of interest (e.g., isolating an enzyme-protein intermediate is not required), (2) the reaction can be performed in an aqueous solution and does not require esterification of the protein or peptide (i.e., an organic solvent is not required), (3) the process can be performed in a cell-free environment (e.g., ligation does not require bacterial cell wall components or an intact bacterial cell wall), and (4) relatively small concentrations of reagents are required.
[0005] Thus, featured herein are processes for linking a molecule of interest to a protein or peptide and the linked conjugates produced by these processes. One embodiment is a method for linking a molecule A to a protein or peptide B, which comprises contacting in a cell free system the molecule A, the protein or peptide B and a transamidase enzyme, whereby the transamidase enzyme links the molecule A to the protein or peptide B.
Another embodiment is a method for linking a molecule A to a protein or peptide B, which comprises contacting in a system (e.g., a cell-free system or a system containing cells) the molecule A, the protein or peptide B and a transamidase enzyme, where the molecule A is not a component of a bacterial cell wall in the system, whereby the transamidase enzyme links the molecule A to the protein or peptide B. In certain embodiments, the system is an aqueous system, the protein or peptide is not esterified significantly and/or is not purposefully esterified, the protein or peptide B is not part of an isolated enzyme conjugate, and the ratio of the enzyme to the protein or peptide B often is greater than 1 : 1000.
[0006] In the embodiments described herein, the transamidase often is a sortase, and sometimes sortase A or sortase B from S. aureus. The molecule A comprises a NH2-CH2- moiety, which often is present when the molecule is added to the system or sometimes is incorporated after the molecule is added to the system. Also, the protein or peptide often is synthesized or generated with an amino acid sequence that the transamidase recognizes and acts upon, which is referred to herein as a "recognition sequence" or "recognition motif." Examples of such recognition sequences are provided hereafter, and often are not native to the protein or peptide. The protein or peptide sometimes includes a non-native amino acid sequence that allows purification or identification of the ligated product (e.g., a polyhistidine tag that binds to a nickel-conjugated solid support and/or an antibody epitope (e.g., FLAG)) and the non-native amino acid sequence sometimes is cleaved from the protein or peptide upon ligation to the molecule of interest.
[0007] Another embodiment is a kit for performing the ligation processes described herein. The kit often includes a container that includes a DNA plasmid into which the user can clone a DNA sequence that encodes the protein or peptide B.
The protein or peptide B produced using the plasmid typically includes an N-terminal or C-terminal transamidase recognition motif (described in more detail hereafter). In some embodiments, the protein or peptide B is produced from the plasmid with a non-native sequence that allows purification or identification of the ligation product. The kit sometimes includes an isolated transamidase, often a sortase enzyme, useful for performing the ligation reaction, and may include a plasmid from which the user can prepare the enzyme. In certain embodiments, the kit includes an appropriately derivatized solid support (e.g., a derivatized glass slide, silicon chip, bead or resin) or an appropriately derivatized detectable label (e.g., a fluorescent molecule or radioisotope), to which the user can ligate a protein or peptide using the transamidase. The kit typically includes a set of instructions for performing one or more ligation processes described herein.
[0008] Also provided are one-step protein and peptide purification processes. Expressing a recombinant target protein as a fusion can potentially offer several advantages. Fusing a highly stable carrier protein at the N-terminus can increase the expression and solubility of the target protein. Also, incorporating an affinity-fusion tag can ease the capture of the target protein on an affinity column. Protein fusion systems have been developed that generate free recombinant protein or peptide in a single affinity chromatographic step, and are disclosed herein. The fusion proteins include a peptidase capable of cleaving the fusion sequence at an internal location, and the peptidase often has moderate catalytic activity. In some embodiments, the fusion protein comprises a similar enzyme or the same enzyme as used for ligation processes described herein. Because the peptidase is co-expressed as a fusion with the target protein or peptide, the purification system does not require a step of adding an exogenous peptidase.
The single-step purification systems and processes described herein are robust, cost-effective and time-efficient.
[0009] Thus, provided herein are fusion proteins and nucleic acids that encode them. A fusion protein often comprises a solid phase association region, a peptidase, a target protein or target peptide, and a peptidase recognition sequence. The solid phase association region, peptidase, target protein or target peptide, and peptidase recognition sequence elements are in a contiguous amino acid sequence and these elements are arranged in any convenient orientation. For example, the solid phase association region sometimes is at the N-terminus and it sometimes is at the C-terminus of a fusion protein. The peptidase sequence sometimes is located N-terminal of the target protein or target peptide sequence and sometimes is located C-terminal of the target protein or target peptide sequence. The peptidase recognition sequence often is located between the target protein or target peptide sequence and the peptidase sequence. A fusion protein sometimes comprises one or more linker sequences that flank elements of the fusion protein (e.g., flanking the solid phase association region, target protein or target peptide sequence, peptidase sequence, and/or peptidase recognition sequence), which sometimes are located between the peptidase and the target protein or target peptide sequences.
[0010] A fusion protein sometimes comprises an export sequence that exports the fusion protein to an intracellular compartment near a host cell surface or secretes fusion protein outside of a host cell. In some embodiments, the peptidase in the fusion protein is sortase A from Staphylococcus aureus (SrtAc), a fragment of SrtAc, or a variant of the foregoing, which recognizes and cleaves a threonine-glycine bond in an LPXTG sequence.
In certain embodiments, a SrtAc fragment in a fusion is a catalytic core region that recognizes and cleaves a threonine-glycine bond in an LPXTG sequence with moderate activity, which is referred to herein as a "SrtAc catalytic region." In specific embodiments, the SrtAc catalytic region is amino acids 60 to 206 of native SrtAc or a variant thereof. In some embodiments, the solid phase association region comprises or consists of a polyhistidine sequence (e.g., six contiguous histidine amino acids), which is capable of binding to an immobilized metal-ion affinity chromatography (IMAC) reagent. In certain embodiments, a fusion protein consists of an N-terminal His6 tag, SrtAc catalytic region, and an LPXTG linker followed by a target protein or target peptide at the C-terminus. Provided also is a fusion protein in combination with a solid support, where the solid support specifically binds to the solid phase association region in the fusion protein.
[0011] Provided also are nucleic acids that encode the fusion proteins disclosed herein. Nucleic acids that encode the fusion protein sometimes do not include a target protein- or target peptide-encoding nucleotide sequence, and the target protein-encoding or target peptide-encoding nucleotide sequence sometimes is inserted into the nucleic acid.
[0012] Also provided is a kit which comprises a container that includes a nucleic acid or fusion protein described herein, often with instructions for producing a fusion protein, and often purifying a target protein or target peptide. The nucleotide sequence that encodes the fusion protein is in any nucleic acid convenient for expressing the fusion protein, or for preparing a nucleic acid that expresses the fusion protein. The nucleic acid often is DNA, often is a plasmid, sometimes is linear, and sometimes is RNA.
The nucleic acid often does not include a target protein-encoding or target peptide-encoding sequence, and the kit sometimes includes instructions for inserting a target protein or target peptide sequence into the nucleic acid. In some embodiments, a kit includes a component useful for inserting a target protein or target peptide sequence into the nucleic acid, including but not limited to, one or more oligonucleotides, a polymerase for performing a polymerase chain reaction (PCR) procedure, a topoisomerase, one or more restriction enzymes and/or a ligase. The kit sometimes comprises a solid support capable of binding the solid phase association region in the fusion protein expressed from the nucleic acid, which sometimes is an IMAC solid support. In certain embodiments, a kit includes an organism useful for expressing the fusion protein from the nucleic acid, including but not limited to, a strain of bacteria, yeast, fungi, insect cells or mammalian cells, and optionally includes reagents and/or instructions for inserting the nucleic acid into the organism.
[0013] Provided also are processes for producing and sometimes purifying a fusion protein described herein. Fusion proteins can be expressed in any host organism that expresses the fusion protein in detectable amounts, including but not limited to, a strain of bacteria, yeast, fungi, insect cells or mammalian cells. In certain embodiments, a nucleic acid encoding the fusion protein is inserted into a host cell. Nucleotide sequences in the nucleic acid that flank the fusion protein-encoding nucleotide sequence often are selected according to the host organism chosen for expression. The nucleotide sequence that encodes the fusion protein sometimes is codon-optimized depending upon the host organism selected for expression.
In some embodiments, cleavage sites for the peptidase in the fusion not located between the peptidase and the target protein or peptide sequence often are removed from the nucleic acid before the fusion protein is expressed.
[0014] The fusion protein sometimes is secreted outside of the host organism after expression, in which cases host cells often are not lysed. In some embodiments in which a fusion protein is exported to regions near the cell surface, the host cells often are exposed to conditions more gentle than cell lysis, such as osmotic stress. In embodiments where a fusion protein is not secreted outside of the host cells, the host cells often are lysed after fusion protein expression. After expression, the fusion protein is contacted with a solid support, where the solid support often specifically binds to a solid phase association region in a fusion protein. Immobilized fusion proteins are cleaved by the peptidase on the solid support, and components that accelerate or facilitate cleavage sometimes are added to the system. For embodiments in which the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing, calcium ions and/or triglycine sometimes are added to the system as accelerants.
[0015] In embodiments where the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing and the peptidase is closer to the N-terminus of the fusion protein than the target protein or target peptide, a target protein or target peptide with a N-terminal glycine often is isolated and the N-terminal portion of the fusion protein remains associated with the solid phase. The released target protein or target peptide sometimes includes a N-terminal sequence with more amino acids added than just a glycine.
In embodiments where the peptidase in the fusion protein is a sortase, a sortase fragment, or a variant of the foregoing and the target protein or target peptide is closer to the N-terminus of the fusion protein than the peptidase, the target protein or target peptide with a C-terminal LPXTGGG sequence sometimes is isolated when the system is contacted with triglycine. In latter embodiments, the target protein or target peptide sometimes includes a C-terminal LPXTGZ moiety when the system is contacted with a NH2-CH2-Z substance. Z sometimes is referred to herein as "a molecule of interest" or a detectable "label," described in greater detail hereafter.
[0016] In some embodiments, the process yields an isolated or purified target protein or target peptide having 90% or more purity, 91% or more purity, 92% or more purity, 93% or more purity, 94% or more purity, 95% or more purity, 96% or more purity, 97% or more purity, 98% or more purity, or 99% or more purity. For embodiments in which host cells are grown and maintained in suspension, the process sometimes generates a target protein or target peptide yield of 1 mg/L of cell culture or more, 2 mg/L of cell culture or more, 5 mg/L of cell culture or more, 10 mg/L of cell culture or more, 15 mg/L of cell culture or more, 20 mg/L of cell culture or more, 25 mg/L of cell culture or more, 30 mg/L of cell culture or more, 35 mg/L of cell culture or more, 40 mg/L of cell culture or more, 50 mg/L of cell culture or more, 75 mg/L of cell culture or more, 100 mg/L of cell culture or more, 250 mg/L of cell culture or more, 500 mg/L of cell culture or more, 750 mg/L of cell culture or more, 1000 mg/L of cell culture or more, 2000 mg/L of cell culture or more, or 5000 mg/L of cell culture or more.
Brief Description of the Drawings
[0017] Figure 1 shows a synthetic scheme for generating a molecule A useful for protein or peptide ligation, as exemplified by the synthesis of a triglycine folate molecule.
[0018] Figure 2 shows an example of a one-step protein purification scheme of a self-cleavable sortase fusion. The fusion from crude cell lysate is captured on an IMAC column. The immobilized fusion protein is then equilibrated with a calcium-containing solution (± Gly3) to induce SrtAc-mediated cleavage at the LPXTG site. The target protein having an extra N-terminal glycine is collected in the cleavage flow through.
[0019] Figure 3 depicts cloning schemes in a pET15b vector and the coding sequences of plasmid pGHSL-emGFP and the three variations. Regions of His6 tag (H), SrtAc (S) and LPETG cleavage site (L), emGFP as well as restriction sites used for cloning are depicted in pGHSL-emGFP. The Trp194* designation refers to the Trp position in the full-length sortase A of S. aureus, and the actual amino acid position in the fusion is 156. Plasmid pAHSL-emGFP encodes a Gly2 to Ala mutation, pGHS'L-emGFP encodes a Trp194* to Ala mutation, and pAHS'L-emGFP encodes both Gly2 to Ala and Trp194* to Ala mutations. S', G, and A are designations for the SrtAc mutant, the N-terminal glycine, and the N-terminal alanine, respectively.
[0020] Figure 4 shows a fusion protein and a purification scheme in which a fusion comprising a target protein sequence, a peptidase recognition sequence, a peptidase and a protein solid phase association region is labeled using an amino glycine-derivative and purified on a solid support.
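The cleavage step described for Figure 2 can be illustrated at the sequence level with a minimal Python sketch. It assumes one-letter amino acid codes, treats X in LPXTG as any residue, and cuts between Thr and Gly so that the released fragment begins with glycine. The function name `cleave_at_lpxtg` and the toy sequence are hypothetical illustrations and are not taken from the disclosure; the sketch models only where the bond is cut, not enzyme kinetics or column behavior.

```python
import re

# Sortase A recognition motif: Leu-Pro-X-Thr-Gly, where X is any residue.
LPXTG = re.compile(r"LP.TG")


def cleave_at_lpxtg(fusion: str):
    """Split a one-letter amino acid sequence at the first LPXTG motif.

    The cut falls between Thr and Gly, so the first fragment ends in
    LPXT and the second fragment begins with Gly. Returns None when the
    sequence contains no LPXTG motif.
    """
    match = LPXTG.search(fusion)
    if match is None:
        return None
    cut = match.end() - 1  # index of the Gly that closes the motif
    return fusion[:cut], fusion[cut:]


# Hypothetical toy fusion (not a real sequence): Gly + His6 tag,
# placeholder "peptidase" residues, LPETG site, placeholder "target".
toy_fusion = "GHHHHHH" + "AAAA" + "LPETG" + "MVSK"
bound, released = cleave_at_lpxtg(toy_fusion)
print(bound)     # GHHHHHHAAAALPET
print(released)  # GMVSK
```

In this toy case the His6-containing fragment corresponds to the portion that would stay on the IMAC resin, and the second fragment corresponds to the target carrying the extra N-terminal glycine.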
Detailed Description
[0021] Processes have been developed for efficiently linking a protein or peptide to a molecule of interest. The development of these processes is significant as they are widely applicable to many proteins and peptides and a multitude of different modifying molecules. The processes are useful for ligating proteins or peptides to one another, ligating synthetic peptides to proteins, linking a reporting molecule to a protein or peptide, joining a nucleic acid to a protein or peptide, conjugating a protein or peptide to a solid support, and linking a protein or peptide to a toxin, for example. Provided also are fusion proteins capable of self-cleavage and one-step purification, and related products, processes, and kits. Such products and processes save cost and time associated with target protein or target polypeptide production, and are useful for conveniently linking a molecule of interest to a target protein or target peptide. The ligation products and purified products are useful in a variety of applications, such as diagnostic procedures (e.g., an antibody that specifically binds to a cancer cell epitope is joined to a radioisotope and the conjugate is administered to a patient to detect the presence or absence of cancer cells); therapeutic procedures (e.g., an antibody that specifically binds to a cancer cell epitope is joined to a toxin such as ricin A and the conjugate is administered to a patient to selectively treat the cancer), and research methods (e.g., a NH2-CH2-derivatized fluorophore and sortase are contacted with fixed cells that express a protein linked to a sortase recognition sequence and the location of the protein is detected by a fluorescence imaging technique), for example.
1. Ligation Processes and Products
Transamidase Enzymes
[0022] Protein and peptide ligation processes described herein often are catalyzed by a transamidase. A transamidase is an enzyme that can form a peptide linkage (i.e., amide linkage) between a protein or peptide and a molecule of interest containing a NH2-CH2- moiety. Sortases are enzymes having transamidase activity and have been isolated from Gram-positive bacteria. Gram-positive bacteria retain the crystal violet stain in the presence of alcohol or acetone. They have, as part of their cell wall structure, peptidoglycan as well as polysaccharides and/or teichoic acids. Gram-positive bacteria include the following genera: Actinomyces, Bacillus, Bifidobacterium, Cellulomonas, Clostridium, Corynebacterium, Micrococcus, Mycobacterium, Nocardia, Staphylococcus, Streptococcus and Streptomyces.
[0023] Enzymes identified as "sortases" from Gram-positive bacteria cleave and translocate proteins to proteoglycan moieties in intact cell walls. Two sortases have been isolated from Staphylococcus aureus, which are sortase A (Srt A) and sortase B (Srt B). As used herein, the term "isolated" often refers to having a specific activity of at least tenfold greater than the sortase-transamidase activity present in a crude extract, lysate, or other state from which proteins have not been removed and also in substantial isolation from proteins found in association with sortase-transamidase in the cell. An "isolated" or "purified" polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
In one embodiment, the term "substantially free" refers to preparing a target polypeptide having less than about 30%, 20%, 10% and sometimes 5% (by dry weight), of non-target polypeptide (also referred to herein as a "contaminating protein"), or of chemical precursors or non-target chemicals. When the target polypeptide or a biologically active portion thereof is recombinantly produced, it also often is substantially free of culture medium, where culture medium represents less than about 20%, sometimes less than about 10%, and often less than about 5% of the volume of the polypeptide preparation. Isolated or purified target polypeptide preparations sometimes are 0.01 milligrams or more or 0.1 milligrams or more, and often 1.0 milligrams or more and 10 milligrams or more in dry weight.
[0024] Amino acid sequences of Srt A and Srt B and the nucleotide sequences that encode them are disclosed in US 2003/0153020 A1, published on August 14, 2003, which are incorporated herein by reference. The amino acid sequences of SrtA and SrtB are homologous, sharing 22% sequence identity and 37% sequence similarity. The amino acid sequence of a sortase-transamidase from Staphylococcus aureus also has substantial homology with sequences of enzymes from other Gram-positive bacteria, and such transamidases can be utilized in the ligation processes described herein. For example, for SrtA there is about a 31% sequence identity (and about 44% sequence similarity) with best alignment over the entire sequenced region of the S. pyogenes open reading frame. There is about a 28% sequence identity with best alignment over the entire sequenced region of the A. naeslundii open reading frame. There is about a 27% sequence identity (and about 47% sequence similarity) with best alignment over the entire sequenced region of the S. mutans open reading frame. There is about a 25% sequence identity (and about 45% sequence similarity) with best alignment over the entire sequenced region of the E.
faecalis open reading frame. Similarly, there is significant homology to the entire sequenced region of the B. subtilis open reading frame. However, a higher sequence identity of 23% (and about 38% sequence similarity) exists between the B. subtilis and S. mutans amino acid sequences. Thus, in certain embodiments a transamidase bearing 18% or more sequence identity, 20% or more sequence identity, or 30% or more sequence identity with the S. pyogenes, A. naeslundii, S. mutans, E. faecalis or B. subtilis open reading frame can be screened, and enzymes having transamidase activity comparable to Srt A or Srt B from S. aureus can be utilized (e.g., comparable activity sometimes is 10% of Srt A or Srt B activity or more).
[0025] Transamidases from other organisms also can be utilized in the processes described herein. Such transamidases often are encoded by nucleotide sequences substantially identical or similar to the nucleotide sequences that encode Srt A and Srt B. A similar or substantially identical nucleotide sequence may include modifications to the native sequence, such as substitutions, deletions, or insertions of one or more nucleotides. Included are nucleotide sequences that sometimes are 55%, 60%, 65%, 70%, 75%, 80%, or 85% or more identical to a native quadruplex-forming nucleotide sequence, and often are 90% or 95% or more identical to the native quadruplex-forming nucleotide sequence (each identity percentage can include a 1%, 2%, 3% or 4% variance). One test for determining whether two nucleic acids are substantially identical is to determine the percentage of identical nucleotide sequences shared between the nucleic acids.
[0026] Calculations of sequence identity can be performed as follows. Sequences are aligned for optimal comparison purposes and gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment. Also, non-homologous sequences can be disregarded for comparison purposes.
The length of a reference sequence aligned for comparison purposes sometimes is 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70%, 80%, 90%, or 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions then are compared among the two sequences. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, the nucleotides are deemed to be identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences.
[0027] Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. Percent identity between two nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4:11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Percent identity between two nucleotide sequences can be determined using the GAP program in the GCG software package (available at http address www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A set of parameters often used is a BLOSUM 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frame shift gap penalty of 5.
[0028] Another manner for determining if two nucleic acids are substantially identical is to assess whether a polynucleotide homologous to one nucleic acid will hybridize to the other nucleic acid under stringent conditions. As used herein, the term "stringent conditions" refers to conditions for hybridization and washing.
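Returning briefly to the percent identity calculation of paragraphs [0026] and [0027], the position-by-position comparison can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned (for example by one of the programs named above) and that gaps are written as "-". Counting every alignment column, including gap columns, in the denominator is only one possible convention and is an assumption here; the named programs may normalize differently. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical when both sequences carry the same
    residue and that residue is not a gap character ("-"). Gap columns
    lower the score because they remain in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment: 7 of the 9 columns are identical
# (one gap column and one mismatch column are not).
print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 1))
```

A different denominator, such as the length of the shorter sequence or the number of ungapped columns, would give a different value for the same alignment, so the convention should be stated whenever identity percentages are compared.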
Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and non-aqueous methods are described in that reference and either can be used. An example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50°C. Another example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55°C. A further example of stringent conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60°C. Often, stringent conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65°C. Also, stringency conditions include hybridization in 0.5M sodium phosphate, 7% SDS at 65°C, followed by one or more washes at 0.2X SSC, 1% SDS at 65°C.
[0029] A variant sequence can depart from a native amino acid sequence in different manners. Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, helix-forming properties and/or amphipathic properties and the resulting variants are screened for antimicrobial activity. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine. Conservative substitutions may be made, for example, according to Table A.
Amino acids in the same block in the second column and in the same line in the third column may be substituted for one another in a conservative substitution. Certain conservative substitutions are substituting an amino acid in one row of the third column corresponding to a block in the second column with an amino acid from another row of the third column within the same block in the second column.
Table A
[Table A is reproduced as an image in the original publication: Figure imgf000011_0001]
[0030] In certain embodiments homologous substitution may occur, which is a substitution or replacement of like amino acids, such as basic for basic, acidic for acidic, polar for polar amino acids, and hydrophobic for hydrophobic, for example. Non-homologous substitutions can be introduced to a native sequence, such as from one class of residue to another (e.g., a non-hydrophobic to a hydrophobic amino acid), or by substituting a naturally occurring amino acid with an unnatural or non-classical amino acid.
[0031] Srt A and Srt B nucleotide sequences may be used as "query sequences" to perform a search against public databases to identify related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al., J. Mol. Biol. 215:403-410 (1990). BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain homologous nucleotide sequences. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul, et al., Nucleic Acids Res. 25(17):3389-3402 (1997). When utilizing BLAST and Gapped BLAST programs, default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used (see, http address www.ncbi.nlm.nih.gov).
[0032] Transamidase fragments having transamidation activity also can be utilized in the methods described herein. Such fragments can be identified by producing transamidase fragments by known recombinant techniques or proteolytic techniques, for example, and determining the rate of protein or peptide ligation. The fragment sometimes consists of about 80% of the full-length transamidase amino acid sequence, and sometimes about 70%, about 60%, about 50%, about 40% or about 30% of the full-length transamidase amino acid sequence.
[0033] Different transamidases specifically recognize different recognition sequences.
Accordingly, proteins and peptides utilized in the ligation processes described herein sometimes include or are modified with an appropriate sortase recognition motif. One or more appropriate sortase recognition sequences can be added to a protein or peptide not having one by known synthetic and recombinant techniques. In certain embodiments, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more sortase or transamidase recognition sequences are incorporated in the protein or peptide.
[0034] In embodiments where Srt A is utilized as a transamidase enzyme, the protein or peptide often comprises the amino acid sequence X1PX2X3G, where X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine. In specific embodiments, X1 is leucine and X3 is threonine. In certain embodiments, X2 is aspartate, glutamate, alanine, glutamine, lysine or methionine.
[0035] In embodiments where Srt B is utilized as a transamidase enzyme, the protein or peptide often comprises the amino acid sequence NPX1TX2, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
[0036] Transamidases utilized in the ligation methods described herein sometimes are isolated and incorporated into a kit. Any convenient method known can be utilized to isolate the sortase or transamidase. In certain embodiments, the sortase or transamidase is produced with an N-terminal or C-terminal amino acid sequence that facilitates purification. For example, the sortase or transamidase sometimes is produced with a C-terminal or N-terminal polyhistidine sequence, often six histidines, which has affinity for and binds to nickel-derivatized solid supports.
In alternative embodiments, a kit includes a plasmid having a sortase or transamidase-encoding nucleotide sequence that the user uses to produce the sortase or transamidase in a cell culture or cell free translation system, and often, then purifies the sortase or transamidase according to instructions provided with the kit.
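The recognition motifs given in paragraphs [0034] and [0035] lend themselves to a simple pattern search. The Python sketch below encodes the Srt A motif X1PX2X3G and the Srt B motif NPX1TX2 as regular expressions and reports where they occur in a one-letter amino acid sequence. The function name, the restriction of "any amino acid" to the twenty standard residues, and the example sequences are assumptions made for illustration and do not come from the specification.

```python
import re

# The twenty standard amino acids (one-letter codes); used for the
# "X2 is any amino acid" position of the Srt A motif.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Srt A motif X1-P-X2-X3-G: X1 = L, I, V or M; X3 = T, S or A.
SRT_A_MOTIF = f"[LIVM]P[{AMINO_ACIDS}][TSA]G"
# Srt B motif N-P-X1-T-X2: X1 = Q or K; X2 = N or G.
SRT_B_MOTIF = "NP[QK]T[NG]"


def find_motifs(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every motif occurrence.

    A lookahead is used so that overlapping occurrences are reported.
    """
    scanner = re.compile(f"(?=({motif}))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence carrying an LPETG motif and a His6 tag.
    print(find_motifs("MKTLPETGAAHHHHHH", SRT_A_MOTIF))
    # Hypothetical sequence carrying an NPQTN motif.
    print(find_motifs("AANPQTNAA", SRT_B_MOTIF))
```

A scan of this kind only shows whether a candidate sequence is present; whether a given transamidase acts on it would still need to be determined experimentally.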
Systems
[0037] In the ligation processes described herein, the transamidase, protein or peptide and molecule of interest are contacted with one another in a system. As used herein, the term "contacting" refers to placing the components of the process in close proximity to one another and allowing the molecules to collide by diffusion. Contacting these components with one another can be accomplished by adding them to one body of fluid and/or in one reaction vessel, for example. The components in the system may be mixed in a variety of manners, such as by oscillating a vessel, subjecting a vessel to a vortex generating apparatus, repeated mixing with a pipette or pipettes, or by passing fluid containing one assay component over a surface having another assay component immobilized thereon, for example. The components may be added in any order to the system.
[0038] As used herein, the term "system" refers to an environment that receives the ligation components, which includes, for example, microtiter plates (e.g., 96-well or 384-well plates), silicon chips having molecules immobilized thereon and optionally oriented in an array (see, e.g., U.S. Patent No. 6,261,776 and Fodor, Nature 364: 555-556 (1993)), and microfluidic devices (see, e.g., U.S. Patent Nos. 6,440,722; 6,429,025; 6,379,974; and 6,316,781). The system can include attendant equipment such as signal detectors, robotic platforms, and pipette dispensers.
[0039] The system often is cell free and often does not include bacterial cell wall components or intact bacterial cell walls. In some embodiments, the system includes one or more cells, often non-bacterial cells or non-Gram-positive bacterial cells. In such embodiments, one or more components often are expressed by one or more recombinant nucleotide sequences in a cell, which nucleotide sequences are integrated into the cell genome or non-integrated (e.g., in a plasmid).
Cells in such systems often are maintained in vitro, sometimes ex vivo, and sometimes in vivo.
[0040] The system is maintained at any convenient temperature at which the ligation reaction can be performed. The temperature often is room temperature (e.g., about 25°C) and the temperature can be optimized by repetitively performing the same ligation procedure at different temperatures and determining ligation rates. Any convenient assay volume and component ratio is utilized. In certain embodiments, a component ratio of 1:1000 or greater transamidase enzyme to protein or peptide is utilized, or a ratio of 1:1000 or greater transamidase enzyme to molecule of interest is utilized. In specific embodiments, the ratio of enzyme to protein or peptide or of enzyme to molecule of interest is about 1:1, including 1:2 or greater, 1:3 or greater, 1:4 or greater, 1:5 or greater, 1:6 or greater, 1:7 or greater, 1:8 or greater, and 1:9 or greater.
[0041] The ligation process often is performed in a system comprising an aqueous environment. Water with an appropriate buffer and/or salt content often is utilized. An alcohol or organic solvent may be included in certain embodiments. The amount of an organic solvent often does not appreciably esterify the protein or peptide in the ligation process (e.g., esterified protein or peptide often increases only by 5% or less upon addition of an alcohol or organic solvent). Alcohol and/or organic solvent contents sometimes are 20% or less, 15% or less, 10% or less or 5% or less, and in embodiments where a greater amount of an alcohol or organic solvent is utilized, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, or 80% or less alcohol or organic solvent is present. In certain embodiments, the system includes only an alcohol or an organic solvent, with only limited amounts of water if it is present.
[0042] One or more components for ligation or a ligation product may be immobilized to a solid support.
The attachment between an assay component and the solid support may be covalent or non-covalent (e.g., U.S. Patent No. 6,022,688 for non-covalent attachments). The solid support may be one or more surfaces of the system, such as one or more surfaces in each well of a microtiter plate, a surface of a silicon wafer, a surface of a bead (e.g., Lam, Nature 354: 82-84 (1991)) that is optionally linked to another solid support, or a channel in a microfluidic device, for example. Types of solid supports, linker molecules for covalent and non-covalent attachments to solid supports, and methods for immobilizing nucleic acids and other molecules to solid supports are known (e.g., U.S. Patent Nos. 6,261,776; 5,900,481; 6,133,436; and 6,022,688; and WIPO publication WO 01/18234).
Proteins and Polypeptides
[0043] Any protein or peptide may be utilized as a target in the ligation process described herein. The protein or peptide often is isolated when utilized in a cell-free system. The protein or peptide sometimes is a subregion of a protein, such as in the N-terminus, C-terminus, extracellular region, intracellular region, transmembrane region, active site (e.g., nucleotide binding region or a substrate binding region), a domain (e.g., an SH2 or SH3 domain) or a post-translationally modified region (e.g., phosphorylated, glycosylated or ubiquitinated region), for example. Peptides often are 50 amino acids or fewer in length (e.g., 45, 40, 35, 30, 25, 20, or 15 amino acids or fewer in length) and proteins sometimes are 100 or fewer amino acids in length, or 200, 300, 400, 500, 600, 700, or 900 or fewer amino acids in length. The protein or peptide sometimes includes the modification moiety or a portion thereof (e.g., the glycosyl group or a portion thereof). In certain embodiments, the protein is a signal transduction factor, cell proliferation factor, apoptosis factor, angiogenesis factor, or cell interaction factor. Examples of cell interaction factors include but are not limited to cadherins (e.g., cadherins E, N, BR, P, R, and M; desmocollins; desmogleins; and protocadherins); connexins; integrins; proteoglycans; immunoglobulins (e.g., ALCAM, NCAM-1 (CD56), CD44, intercellular adhesion molecules (e.g., ICAM-1 and ICAM-2), LFA-1, LFA-2, LFA-3, LECAM-1, VLA-4, ELAM and N-CAM); selectins (e.g., L-selectin (CD62L), E-selectin (CD62E), and P-selectin (CD62P)); agrin; CD34; and a cell surface protein that is cyclically internalized or internalized in response to ligand binding.
Examples of signal transduction factors include but are not limited to protein kinases (e.g., mitogen activated protein (MAP) kinase and protein kinases that directly or indirectly phosphorylate it, Janus kinase (JAK1), cyclin dependent kinases, epidermal growth factor (EGF) receptor, platelet-derived growth factor (PDGF) receptor, fibroblast-derived growth factor receptor (FGF), insulin receptor and insulin-like growth factor (IGF) receptor); protein phosphatases (e.g., PTP1B, PP2A and PP2C); GDP/GTP binding proteins (e.g., Ras, Raf, ARF, Ran and Rho); GTPase activating proteins (GAPs); guanine nucleotide exchange factors (GEFs); proteases (e.g., caspase 3, 8 and 9); ubiquitin ligases (e.g., MDM2, an E3 ubiquitin ligase); acetylation and methylation proteins (e.g., p300/CBP, a histone acetyltransferase); and tumor suppressors (e.g., p53, which is activated by factors such as oxygen tension, oncogene signaling, DNA damage and metabolite depletion). The protein sometimes is a nucleic acid-associated protein (e.g., histone, transcription factor, activator, repressor, co-regulator, polymerase or origin recognition (ORC) protein), which directly binds to a nucleic acid or binds to another protein bound to a nucleic acid. The protein sometimes is useful as a detectable label, such as a green or blue fluorescent protein.
[0044] The protein or peptide sometimes is an antibody. Antibodies sometimes are IgG, IgM, IgA, or IgE, sometimes are polyclonal or monoclonal, and sometimes are chimeric, humanized or bispecific versions of such antibodies. Polyclonal and monoclonal antibodies that bind specific antigens are commercially available, and methods for generating such antibodies are known. In general, polyclonal antibodies are produced by injecting an isolated antigen into a suitable animal (e.g., a goat or rabbit); collecting blood and/or other tissues from the animal containing antibodies specific for the antigen; and purifying the antibody.
Methods for generating monoclonal antibodies, in general, include injecting an animal (e.g., often a mouse or a rat) with an isolated antigen; isolating splenocytes from the animal; fusing the splenocytes with myeloma cells to form hybridomas; isolating the hybridomas and selecting hybridomas that produce monoclonal antibodies which specifically bind the antigen (e.g., Kohler & Milstein, Nature 256:495-497 (1975) and StGroth & Scheidegger, J Immunol Methods 5:1-21 (1980)). Examples of monoclonal antibodies are anti-MDM2 antibodies, anti-p53 antibodies (pAB421, DO-1, and an antibody that binds phosphoryl-Ser15), anti-dsDNA antibodies and anti-BrdU antibodies, described hereafter.
[0045] Methods for generating chimeric and humanized antibodies also are known (see, e.g., U.S. Patent No. 5,530,101 (Queen, et al.), U.S. Patent No. 5,707,622 (Fung, et al.) and U.S. Patent Nos. 5,994,524 and 6,245,894 (Matsushima, et al.)), which generally involve transplanting an antibody variable region from one species (e.g., mouse) into an antibody constant domain of another species (e.g., human). Antigen-binding regions of antibodies (e.g., Fab regions) include a light chain and a heavy chain, and the variable region is composed of regions from the light chain and the heavy chain. Given that the variable region of an antibody is formed from six complementarity-determining regions (CDRs) in the heavy and light chain variable regions, one or more CDRs from one antibody can be substituted (i.e., grafted) with a CDR of another antibody to generate chimeric antibodies. Also, humanized antibodies are generated by introducing amino acid substitutions that render the resulting antibody less immunogenic when administered to humans.
[0046] The protein or peptide sometimes is an antibody fragment, such as a Fab, Fab', F(ab')2, Dab, Fv or single-chain Fv (ScFv) fragment, and recombinant methods for generating antibody fragments are known (e.g., U.S. Patent Nos.
6,099,842 and 5,990,296 and PCT/GB00/04317). In general, single-chain antibody fragments are constructed by joining a heavy chain variable region with a light chain variable region by a polypeptide linker (e.g., the linker is attached at the C-terminus or N-terminus of each chain), and such fragments often exhibit specificities and affinities for an antigen similar to the original monoclonal antibodies. Bifunctional antibodies sometimes are constructed by engineering two different binding specificities into a single antibody chain and sometimes are constructed by joining two Fab' regions together, where each Fab' region is from a different antibody (e.g., U.S. Patent No. 6,342,221). Antibody fragments often comprise engineered regions such as CDR-grafted or humanized fragments. In certain embodiments the binding partner is an intact immunoglobulin, and in other embodiments the binding partner is a Fab monomer or a Fab dimer.
[0047] Proteins and peptides sometimes are chemically synthesized using known techniques (e.g., Creighton, 1983 Proteins. New York, N.Y.: W. H. Freeman and Company; and Hunkapiller et al., Nature 310(5973):105-11 (1984)). For example, a peptide can be synthesized by a peptide synthesizer. If desired, non-classical amino acids or chemical amino acid analogs can be introduced as a substitution or addition into the fragment sequence.
Non-classical amino acids include but are not limited to D-isomers of the common amino acids, 2,4-diaminobutyric acid, α-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoroamino acids, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid analogs in general. Each amino acid in the peptide often is L (levorotatory) and sometimes is D (dextrorotatory). Proteins often are produced by known recombinant methods, or sometimes are purified from natural sources.
[0048] Native protein and peptide sequences sometimes are modified. For example, conservative amino acid modifications may be introduced at one or more positions in the amino acid sequences of target polypeptides. A "conservative amino acid substitution" is one in which the amino acid is replaced by another amino acid having a similar structure and/or chemical function. Families of amino acid residues having similar structures and functions are well known. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Also, essential and non-essential amino acids may be replaced.
A "non-essential" amino acid is one that can be altered without abolishing or substantially altering the biological function of a target polypeptide, whereas altering an "essential" amino acid abolishes or substantially alters the biological function of a target polypeptide. Amino acids that are conserved among target polypeptides typically are essential amino acids. [0049] Proteins or peptides may exist as chimeric or fusion polypeptides. As used herein, a "chimeric polypeptide" or "fusion polypeptide" includes a protein or peptide linked to a different polypeptide. The different polypeptide can be fused to the N-terminus or C-terminus of the target polypeptide. Fusion polypeptides can include a moiety having high affinity for a ligand. For example, the fusion polypeptide can be a GST-target fusion polypeptide in which the protein or peptide sequences are fused to the C-terminus of the GST sequences, or a polyhistidine-target fusion polypeptide in which the protein or peptide is fused at the N- or C-terminus to a string of histidine residues (e.g., sometimes three to six histidines). Such fusion polypeptides can facilitate purification of recombinant protein or peptide. Expression vectors are commercially available that already encode a fusion moiety, and a nucleotide sequence encoding the peptide or polypeptide can be cloned into an expression vector such that the fusion moiety is linked in-frame to the target polypeptide. Further, the fusion polypeptide can be a protein or peptide containing a heterologous signal sequence at its N- terminus. In certain host cells (e.g., mammalian host cells), expression, secretion, cellular internalization, and cellular localization of a target polypeptide can be increased through use of a heterologous signal sequence. Fusion polypeptides sometimes include all or a part of a serum polypeptide (e.g., an IgG constant region or human serum albumin). 
[0050] As described above, the protein or peptide sometimes is modified by a process or with a moiety not typically incorporated into a protein during translation. In specific embodiments, the protein or peptide comprises one or more moieties selected from an alkyl moiety (e.g., methyl moiety), an alkanoyl moiety (e.g., an acetyl group (e.g., an acetylated histone)), an alkanoic acid or alkanoate moiety (e.g., a fatty acid), a glyceryl moiety (e.g., a lipid), a phosphoryl moiety, a glycosyl moiety (e.g., N-linked or O-linked carbohydrate chains) or a ubiquitin moiety. Further, any of numerous chemical modifications may be carried out by known techniques, including but not limited to specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation, oxidation, reduction; metabolic synthesis in the presence of tunicamycin; and the like. The N-terminal and/or C-terminal ends may be processed (e.g., the N-terminal methionine may not be present due to prokaryotic expression of the protein or peptide) and chemical moieties may be attached to the amino acid backbone. Proteins and peptides sometimes are modified with a detectable label, such as an enzymatic, fluorescent, isotopic or affinity label to allow for detection and isolation of the polypeptide.
[0051] When no transamidase or sortase recognition sequence is present, the protein or peptide often is modified to include an appropriate recognition sequence. When a recognition sequence is present in a native protein or peptide amino acid sequence, the recognition sequence often is removed unless it is located near the N-terminus or C-terminus. A recognition sequence can be removed in a native amino acid sequence by synthesizing it without the recognition sequence or by modifying some and/or all amino acids in the nucleotide sequence encoding the amino acid recognition sequence by known recombinant techniques.
When a recognition sequence and non-native sequence useful for purification are introduced to the native protein or peptide amino acid sequence, the recognition sequence often is incorporated closer to the N-terminus than the non-native sequence useful for purification such that the latter is cleaved from the protein or peptide during ligation. In such embodiments, the transamidase also is modified with the non-native sequence and the ligated product is purified away from the reactants when contacted with a solid support that binds the non-native sequence. In embodiments where the protein or peptide is modified with a detectable label or homing sequence for a detectable label, such sequences often are incorporated closer to the N-terminus than the recognition sequence so they are not cleaved from the protein or peptide in the ligation process.
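The arrangement described in paragraph [0051], in which the purification sequence sits C-terminal to the recognition sequence and is therefore released during ligation, can be illustrated with a small string model. The Python sketch below is not an account of the enzymology in this specification: it assumes, from general knowledge of S. aureus sortase A, that the LPXTG motif is cleaved between threonine and glycine, and it assumes an oligoglycine nucleophile. The function name and all sequences are hypothetical.

```python
def model_ligation(substrate: str, nucleophile: str, motif: str = "LPETG"):
    """Model a sortase-type ligation on one-letter amino acid strings.

    Assumption (not stated in this section): the motif is cleaved
    between its last two residues (T and G in LPETG). Everything
    C-terminal to the cut, including any purification tag, is released.
    Returns (ligated_product, released_fragment).
    """
    start = substrate.find(motif)
    if start < 0:
        raise ValueError("substrate lacks the recognition motif")
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile must begin with glycine (NH2-CH2-)")

    cut = start + len(motif) - 1       # index of the motif's final G
    retained = substrate[:cut]         # protein plus retained motif portion
    released = substrate[cut:]         # released motif portion plus tag
    return retained + nucleophile, released


if __name__ == "__main__":
    # Hypothetical protein "MKVAA" followed by LPETG, a spacer G and a
    # His6 tag; the nucleophile is a hypothetical triglycine peptide.
    product, released = model_ligation("MKVAALPETGGHHHHHH", "GGGYK")
    print(product)   # ligated product, no His6 tag
    print(released)  # released fragment carrying the His6 tag
```

In this model the ligated product no longer carries the polyhistidine sequence, whereas unreacted substrate and the released fragment still do, which is the property that lets a nickel-derivatized support separate product from reactants.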
Molecules of Interest
[0052] Linkage processes featured herein may proceed by the following reaction scheme:
B-RS-C + NA → B-RS'-NA + RS"-C
where A is a molecule of interest; N is a NH2-CH2- moiety; B is a protein or peptide; RS is a recognition sequence for the transamidase; C is a portion of the protein or peptide released after the transamidase reaction; RS' is a portion of the recognition sequence retained with the protein or peptide after the transamidase reaction and RS" is a portion of the recognition sequence released with C after the transamidase reaction. C sometimes does not exist in embodiments where the recognition sequence is at the C-terminus of the peptide or protein. The protein or peptide sometimes includes more than one recognition sequence. Recognition sequences are described above.
[0053] The transamidase catalyzes formation of an amide linkage between a NH2-CH2- moiety, which is joined to and/or is in the molecule of interest, and a carboxyl moiety in the protein or peptide. Suitable NH2-CH2- moieties are known and can be determined by performing the linkage processes described herein in a routine manner. Where a molecule of interest does not include a suitable NH2-CH2- moiety, one or more NH2-CH2- moieties are joined to the molecule. Methods for joining one or more NH2-CH2- moieties to a molecule of interest are known and can be developed. An example of a method for introducing a NH2-CH2- moiety to a molecule of interest is shown in Figure 1. NH2-CH2- moieties often utilized in the processes described herein are present in one or more glycine amino acids in or derivatized to the molecule of interest. In certain embodiments, between one and six glycines are present in or are incorporated into/onto the molecule of interest, and in specific embodiments, the molecule of interest is derivatized with three glycines.
[0054] The molecule of interest can be any molecule that leads to a useful molecule/protein or peptide conjugate. In certain embodiments, the molecule of interest is a protein or peptide.
The molecule of interest sometimes is an antibody epitope, an antibody, a recombinant protein, a synthetic peptide or polypeptide, a peptide comprising one or more D-amino acids, a peptide comprising all D-amino acids, a peptide comprising one or more unnatural or non-classical amino acids (e.g., ornithine), a peptide mimetic, or a branched peptide. In particular embodiments, the molecule of interest is a peptide that confers enhanced cell penetrance to the protein or peptide (e.g., a greater amount of the protein or peptide conjugated to the peptide of interest is translocated across a cell membrane in a certain time frame as compared to the protein or peptide not conjugated to the peptide of interest), which is referred to herein as a "protein transduction domain (PTD)" peptide or "transduction peptide." Any PTD can be conjugated to a protein or peptide using the methods described herein. PTD peptides are known, and include amino acid subsequences from HIV-tat (e.g., U.S. Patent No. 6,316,003), sequences from a phage display library (e.g., U.S. 20030104622) and sequences rich in amino acids having positively charged side chains (e.g., guanidino-, amidino- and amino-containing side chains; e.g., U.S. Patent No. 6,593,292). The PTD peptide sometimes is branched as described hereafter.
[0055] The molecule of interest sometimes is a detectable moiety. Any known and convenient detectable moiety can be utilized. In certain embodiments avidin, streptavidin, a fluorescent molecule, or a radioisotope is linked to the protein or peptide. In other embodiments, biotin or another vitamin, such as thiamine or folate, is linked to the protein or peptide. Examples of detectable moieties include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioisotopes.
Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioisotopes include 125I, 131I, 35S or 3H. The radioisotope sometimes is selected based upon its appropriate use in a nuclear medicinal procedure, such as Be-7, Mg-28, Co-57, Zn-65, Cu-67, Ge-68, Sr-82, Rb-83, Tc-95m, Tc-96, Pd-103, Cd-109, and Xe-127, for example.
[0056] Conjugates between a protein or peptide and a detectable label are useful in diagnostic procedures. Diagnostic procedures include, for example, nuclear medicinal procedures for locating diseased locations of a subject, and procedures for detecting specific components or pathogens in a biological sample from a subject. The conjugates also are useful as research tools. For example, the conjugates are useful in flow cytometry techniques and for detecting a cellular location of a specific protein or peptide in a cell. Thus, in certain embodiments, a NH2-CH2-derivatized fluorophore and a transamidase are contacted with cells that express a protein linked to a sortase recognition sequence. The transamidase, often a sortase, joins the fluorophore to the protein and allows detection of the cell, protein in the cell, or protein on the cell surface.
In certain embodiments, the transamidase is expressed by a nucleotide sequence that encodes it in the cell (e.g., the nucleotide sequence sometimes is in a plasmid), and in other embodiments useful for detecting a protein on a cell surface or in a fixed cell, exogenous transamidase protein, often isolated protein, is contacted with the cell.
[0057] The location of the protein in a cell sometimes is detected by a fluorescence imaging technique in cells fixed to a solid support. Imaging techniques include, for example, using a standard light microscope or a confocal microscope (e.g., U.S. Patent Nos. 5,283,433 and 5,296,703 (Tsien)). Appropriate light microscopes are commercially available and are useful for probing cells in two dimensions (i.e., the height of a cell often is not resolved), and confocal microscopy is useful for probing cells in three dimensions. Many microscopy techniques are useful for determining the location of a protein in a cell (e.g., in the nucleus, cytoplasm, plasma cell membrane, nucleolus, mitochondria, vacuoles, endoplasmic reticulum or Golgi apparatus). Some microscopic techniques are useful for determining the location of molecular antigens in groups of cells, tissue samples, and organs. Cellular locations often are visualized by counter-staining for subcellular organelles.
[0058] In other embodiments, cells expressing the protein are subjected to a known flow cytometry procedure, such as flow microfluorimetry (FMF) and fluorescence activated cell sorting (FACS) (see, e.g., U.S. Patent Nos. 6,090,919 (Cormack, et al.); 6,461,813 (Lorens); and 6,455,263 (Payan)). For use in these procedures, the protein often is expressed on the cell surface.
[0059] The molecule of interest sometimes is a polymer or a small molecule. Polymers sometimes are useful for enhancing protein or peptide solubility, stability and circulating time, and/or for decreasing immunogenicity when the protein or peptide is administered to a subject.
The polymer sometimes is a water soluble polymer such as polyethylene glycol, ethylene glycol/propylene glycol copolymers, carboxymethylcellulose, dextran, polyvinyl alcohol and the like, for example. The protein or peptide may include one, two, three or more attached polymer moieties after ligation. The polymer may be of any molecular weight, and may be branched or unbranched. For polyethylene glycol, the molecular weight often is between about 1 kDa and about 100 kDa (the term "about" indicating that in preparations of polyethylene glycol, some molecules will weigh more, some less, than the stated molecular weight). Other sizes may be used, depending on a desired therapeutic profile (e.g., the duration of sustained release desired, the effects if any on biological activity, the ease in handling, the degree or lack of antigenicity and other known effects of the polyethylene glycol to a therapeutic protein or analog). Attaching a small molecule to a protein or peptide can alter its activity or characteristics.
[0060] Any small molecule derivitized with or having a NH2-CH2- moiety can be linked to a protein or peptide having a transamidase recognition site. Figure 1 shows an example of a method useful for derivitizing a small molecule with a triglycine moiety. Small molecules sometimes are referred to herein as "compounds." Compounds can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone, which are resistant to enzymatic degradation but which nevertheless remain bioactive (see, e.g., Zuckermann et al., J. Med. Chem. 37: 2678-85 (1994))); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; "one-bead one-compound" library methods; and synthetic library methods using affinity chromatography selection.
Biological library and peptoid library approaches are typically limited to peptide libraries, while the other approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, Anticancer Drug Des. 12: 145 (1997)). Examples of methods for synthesizing molecular libraries are described, for example, in DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90: 6909 (1993); Erb et al., Proc. Natl. Acad. Sci. USA 91: 11422 (1994); Zuckermann et al., J. Med. Chem. 37: 2678 (1994); Cho et al., Science 261: 1303 (1993); Carell et al., Angew. Chem. Int. Ed. Engl. 33: 2059 (1994); Carell et al., Angew. Chem. Int. Ed. Engl. 33: 2061 (1994); and in Gallop et al., J. Med. Chem. 37: 1233 (1994). Libraries of compounds may be presented in solution (e.g., Houghten, Biotechniques 13: 412-421 (1992)), or on beads (Lam, Nature 354: 82-84 (1991)), chips (Fodor, Nature 364: 555-556 (1993)), bacteria or spores (Ladner, United States Patent No. 5,223,409), or plasmids (Cull et al., Proc. Natl. Acad. Sci. USA 89: 1865-1869 (1992)). Compounds may alter expression or activity of KIAA0861 polypeptides and may be a small molecule. Small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds. In certain embodiments, the small molecule folate, spermine or puromycin is linked to a protein or peptide.
In other embodiments, the small molecule is a modification moiety described above, such as a phosphoryl moiety, ubiquitin moiety or a glycosyl moiety, for example.
[0061] In some embodiments, the molecule of interest is a nucleic acid, such as a deoxyribonucleic acid, a ribonucleic acid, a nucleic acid derivative, or a modified nucleic acid. The nucleic acid may comprise or consist of DNA nucleotide sequences (e.g., genomic DNA (gDNA) and complementary DNA (cDNA)) or RNA nucleotide sequences (e.g., mRNA, tRNA, and rRNA). The nucleic acid sometimes is about 8 to about 50 nucleotides in length, about 8 to about 35 nucleotides in length, and sometimes from about 10 to about 25 nucleotides in length. Nucleic acids often are 40 or fewer nucleotides in length, and sometimes are 35 or fewer, 30 or fewer, 25 or fewer, 20 or fewer, and 15 or fewer nucleotides in length. Synthetic oligonucleotides can be synthesized using standard methods and equipment, such as by using an ABI3900 High Throughput DNA Synthesizer, which is available from Applied Biosystems (Foster City, CA).
[0062] A nucleic acid sometimes is an analog or derivative nucleic acid, which can include backbone/linkage modifications (e.g., peptide nucleic acid (PNA) or phosphorothioate linkages) and/or nucleobase modifications. Examples of such modifications are set forth in U.S. Patent No. 6,455,308 (Freier et al.); in U.S. Patent Nos. 4,469,863; 5,536,821; 5,541,306; 5,637,683; 5,637,684; 5,700,922; 5,717,083; 5,719,262; 5,739,308; 5,773,601; 5,886,165; 5,929,226; 5,977,296; 6,140,482; and in WIPO publications WO 00/56746 and WO 01/14398. Methods for synthesizing oligonucleotides comprising such analogs or derivatives are disclosed, for example, in the patent publications cited above, and in U.S. Patent Nos. 6,455,308; 5,614,622; 5,739,314; 5,955,599; 5,962,674; 6,117,992; and in WO 00/75372.
Nucleic acids may be modified by chemical linkages, moieties, or conjugates that enhance activity, cellular distribution, or cellular uptake of nucleic acid, and examples of modifications in modified nucleic acids are in U.S. Patent Nos. 6,455,308 (Freier), 6,455,307 (McKay et al.), 6,451,602 (Popoff et al.), and 6,451,538 (Cowsert).
[0063] The molecule of interest sometimes is a toxin. Any toxin may be selected, and often is selected for high cytotoxic activity. Proteins or peptides ligated to a toxin often are useful as therapeutics. For example, a protein or peptide antibody or receptor that specifically binds to a cancer cell when linked to a toxin such as ricin is useful for treating cancer in subjects. In certain embodiments, the toxin is selected from the group consisting of abrin, ricin A, pseudomonas exotoxin and diphtheria toxin.
[0064] The molecule of interest sometimes is a solid support. The solid support often is derivitized with multiple NH2-CH2- moieties, such as triglycine moieties. The solid support may be any described herein. In certain embodiments, the solid support is a glass slide, a glass bead, a silicon wafer or a resin. In specific embodiments, a resin such as EAH Sepharose is derivitized with triglycine moieties using a FMOC/EDC derivitization procedure.
[0065] The molecule of interest sometimes is a phage that expresses a NH2-CH2- moiety on the surface. In an embodiment, the phage expresses a protein or peptide comprising one or more N-terminal glycines, sometimes three or five glycines at the N-terminus, and the phage expressing such a protein is contacted with a protein or peptide containing a transamidase recognition motif and a transamidase, thereby producing a phage/protein or phage/peptide conjugate.
In another embodiment, a protein or peptide expressed at the phage surface comprises the transamidase recognition motif, and the phage is contacted with a molecule of interest comprising a NH2-CH2- moiety and a transamidase, thereby producing a conjugate between the phage and the molecule of interest. Methods for displaying a wide variety of peptides or proteins at the surface of a phage are known. The protein or peptide often is expressed as a fusion with a bacteriophage coat protein (Scott & Smith, Science 249: 386-390 (1990); Devlin, Science 249: 404-406 (1990); Cwirla et al., Proc. Natl. Acad. Sci. 87: 6378-6382 (1990); Felici, J. Mol. Biol. 222: 301-310 (1991)). Methods also are available for linking the test polypeptide to the N-terminus or the C-terminus of the phage coat protein. The original phage display system was disclosed, for example, in U.S. Patent Nos. 5,096,815 and 5,198,346. This system used the filamentous phage M13, which required that the cloned protein be generated in E. coli and required translocation of the cloned protein across the E. coli inner membrane. Lytic bacteriophage vectors, such as lambda, T4 and T7, are more practical since they are independent of E. coli secretion. T7 is commercially available and described in U.S. Patent Nos. 5,223,409; 5,403,484; 5,571,698; and 5,766,905.
A protein or peptide comprising a NH2-CH2- moiety or a transamidase recognition sequence can be expressed on the surface of any other phage or virus, including but not limited to, murine leukemia virus (MLV), mouse mammary tumor virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29 (MC29), and Avian erythroblastosis virus (AEV), human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), visna maedi virus (VMV), caprine arthritis-encephalitis virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency virus (FIV), bovine immunodeficiency virus (BIV), a hepatitis virus (e.g., hepatitis B or C), rhinovirus, varicella-zoster virus (VZV), herpes simplex virus (e.g., HSV-1 or HSV-2), cytomegalovirus (CMV), vaccinia virus, influenza virus, encephalitis virus, hantavirus, arbovirus, West Nile virus, human papilloma virus (HPV), Epstein-Barr virus, and respiratory syncytial virus.
[0066] The protein or peptide sometimes comprises the molecule of interest, such that the protein or peptide is cyclized in the ligation reaction. For example, the molecule of interest sometimes is an amino acid sequence located at the N-terminus of a protein or peptide, where the amino acid sequence initiates with one or more glycines. Addition of a transamidase, such as a sortase, then cyclizes the linear peptide or protein. Cyclized proteins or peptides often exhibit advantageously enhanced stability as compared to the linear counterpart, and sometimes exhibit enhanced affinity for a target receptor as compared to the linear counterpart.
2. Purification Processes and Products
Fusion Proteins
[0067] A fusion protein often comprises a solid phase association region, a target protein or target peptide, a peptidase, and a peptidase recognition sequence, where the peptidase is capable of cleaving the fusion protein at the recognition sequence. The solid phase association region, peptidase, target protein or target peptide, and peptidase recognition sequence elements are in a contiguous amino acid sequence and these elements are arranged in any suitable orientation. For example, the solid phase association region sometimes is located at the N-terminus of a fusion protein, and in some embodiments, it is located at the C-terminus of a fusion protein. The peptidase sequence sometimes is located closer to the N-terminus of the fusion protein than the target protein or target peptide sequence, and sometimes is located closer to the C-terminus of the fusion protein than the target protein or target peptide sequence. The peptidase recognition sequence often is located between the target protein or target peptide sequence and the peptidase sequence. A fusion protein sometimes comprises amino acid sequences in the following orientation (N-terminal to C-terminal orientation): solid phase association region, peptidase, peptidase recognition sequence, and target protein or target peptide. A fusion protein sometimes comprises amino acid sequences in the following orientation (N-terminal to C-terminal orientation): target protein or target peptide, peptidase, peptidase recognition sequence, and solid phase association region. A fusion protein sometimes is in association with a solid support that specifically binds to the solid phase association region.
[0068] A fusion protein sometimes comprises sequences other than a solid phase association region, target protein or target peptide sequence, peptidase sequence, and peptidase recognition sequence.
For example, a fusion protein sometimes comprises one or more linker sequences that flank elements of the fusion protein (e.g., flanking the solid phase association region, target protein or target peptide sequence, peptidase sequence, and/or peptidase recognition sequence). A linker sequence sometimes is located between the peptidase sequence and the target protein or target peptide sequence. A linker sequence sometimes is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25 or fewer, 30 or fewer, 40 or fewer, 50 or fewer, 60 or fewer, 70 or fewer, 80 or fewer, 90 or fewer or 100 or fewer amino acids in length. Also, a fusion protein sometimes includes an N-terminal sequence and/or C-terminal sequence other than a solid phase association region, target protein or target peptide sequence, peptidase sequence, and peptidase recognition sequence.
[0069] A fusion protein sometimes comprises a sequence that increases expression, secretion, cellular internalization, and cellular localization of the fusion protein. Fusion proteins sometimes include all or a part of a serum polypeptide (e.g., an IgG constant region or human serum albumin). In some embodiments, a fusion protein comprises a sequence capable of exporting the fusion protein to an intracellular compartment near a host cell surface (e.g., the fusion protein can be released from host cells by exposing the host cells to osmotic shock conditions). In certain embodiments, a fusion protein comprises a sequence capable of secreting the fusion protein outside of a host cell (e.g., host cells need not be lysed as fusion proteins are secreted from the cell and readily collected). Examples of such sequences are known and sometimes are included in the nucleic acids and fusion proteins described herein (e.g., Izard et al., Mol Microbiol. 1994 Sep;13(5):765-73; Bolhuis et al., Microbiol Mol Biol Rev. 2000 Sep;64(3):515-47; Giga-Hama et al., Biotechnol. Appl. Biochem. 1999 30:235-244).
[0070] A fusion protein comprises amino acids of any type. Amino acids include, but are not limited to, D- or L-isomer amino acids, natural amino acids (e.g., any of the 20 naturally occurring L-isomer amino acids), unnatural or non-classical amino acids, and homologs of alpha amino acids such as beta2- and beta3-amino acids and gamma amino acids. Unnatural or non-classical amino acids include, but are not limited to, ornithine, diaminobutyric acid, norleucine, pyrylalanine, thienylalanine, naphthylalanine, phenylglycine, alpha* and alpha-disubstituted* amino acids, N-alkyl amino acids*, lactic acid*, halide derivatives of natural amino acids such as trifluorotyrosine*, p-Cl-phenylalanine*, p-Br-phenylalanine*, p-I-phenylalanine*, L-allyl-glycine*, beta-alanine*, L-alpha-amino butyric acid*, L-gamma-amino butyric acid*, L-alpha-amino isobutyric acid*, L-epsilon-amino caproic acid#, 7-amino heptanoic acid*, L-methionine sulfone*, L-norleucine*, L-norvaline*, p-nitro-L-phenylalanine*, L-hydroxyproline#, L-thioproline*, methyl derivatives of phenylalanine (Phe) such as 4-methyl-Phe*, pentamethyl-Phe*, L-Phe (4-amino)#, L-Tyr (methyl)*, L-Phe (4-isopropyl)*, L-Tic (1,2,3,4-tetrahydroisoquinoline-3-carboxyl acid)*, L-diaminopropionic acid, L-Phe (4-benzyl)*, 2,4-diaminobutyric acid, 4-aminobutyric acid (gamma-Abu), 2-amino butyric acid (alpha-Abu), 6-amino hexanoic acid (epsilon-Ahx), 2-amino isobutyric acid (Aib), 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, fluoroamino acids, designer amino acids such as beta-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, naphthyl alanine, and the like. The notation * indicates a derivative having hydrophobic characteristics and # indicates a derivative having hydrophilic characteristics.
Methods for introducing unnatural or non-classical amino acids and amino acid homologs are known, which include, for example, processes utilizing a heterologous tRNA/synthetase pair in E. coli, where the tRNA recognizes an amber stop codon and is loaded with an unnatural amino acid (e.g., http address www.iupac.org/news/prize/2003/wang.pdf). A fusion protein also may include suitable spacer groups inserted between any two amino acids, such as alkyl groups (e.g., methyl, ethyl or propyl groups) or amino acid spacers (e.g., glycine or beta-alanine), for example. A fusion protein also may comprise peptoids. The term "peptoids" refers to variant amino acid structures where the alpha-carbon substituent group is linked to the backbone nitrogen atom rather than the alpha-carbon. Processes for preparing peptides in peptoid form are known (e.g., Simon et al., PNAS (1992) 89(20), 9367-9371 and Horwell, Trends Biotechnol. (1995) 13(4), 132-134).
[0071] A fusion protein sometimes is modified by a process or with a moiety not typically incorporated into a protein during translation. In specific embodiments, a fusion protein comprises one or more moieties selected from an alkyl moiety (e.g., methyl moiety), an alkanoyl moiety (e.g., an acetyl group (e.g., an acetylated histone)), an alkanoic acid or alkanoate moiety (e.g., a fatty acid), a glyceryl moiety (e.g., a lipid), a phosphoryl moiety, a glycosyl moiety (e.g., N-linked or O-linked carbohydrate chains) or a ubiquitin moiety. Further, any of numerous chemical modifications may be carried out by known techniques, including but not limited to specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation, oxidation, reduction; metabolic synthesis in the presence of tunicamycin; and the like.
The N-terminal and/or C-terminal ends may be processed (e.g., the N-terminal methionine may not be present due to prokaryotic expression of the protein or peptide) and chemical moieties may be attached to the amino acid backbone. In embodiments where the N-terminal methionine is capable of being removed from the fusion protein in a host cell, the resulting N-terminal amino acid sometimes is substituted or deleted or one or more amino acids are inserted before it when that amino acid is glycine. Fusion proteins sometimes are modified with a detectable label, such as an enzymatic, fluorescent, isotopic or affinity label to allow for detection.
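The element orientations described in this section can be sketched in code. The following is a minimal illustration only, assuming hypothetical names (`Element`, `assemble`, `recognition_between_peptidase_and_target`) and placeholder one-letter sequences that are not taken from this disclosure; it models a fusion protein as an ordered list of elements and checks whether the recognition sequence lies between the peptidase and the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One element of a fusion protein (hypothetical model, for illustration)."""
    kind: str      # "spa" (solid phase association), "peptidase", "recognition", "target"
    sequence: str  # placeholder one-letter amino acid codes, not real sequences


def assemble(elements):
    """Concatenate elements, listed N-terminus to C-terminus, into one contiguous sequence."""
    return "".join(e.sequence for e in elements)


def recognition_between_peptidase_and_target(elements):
    """Return True if the recognition element sits between the peptidase and the target."""
    kinds = [e.kind for e in elements]
    p, r, t = (kinds.index(k) for k in ("peptidase", "recognition", "target"))
    return min(p, t) < r < max(p, t)


# Placeholder elements (assumed values chosen only to make the example readable).
spa = Element("spa", "HHHHHH")
peptidase = Element("peptidase", "PEPTIDASE")
recognition = Element("recognition", "LPETG")
target = Element("target", "TARGET")

# The two N-terminal to C-terminal orientations listed in paragraph [0067].
layout_1 = [spa, peptidase, recognition, target]
layout_2 = [target, peptidase, recognition, spa]

print(assemble(layout_1))
print(recognition_between_peptidase_and_target(layout_1))
print(recognition_between_peptidase_and_target(layout_2))
```

In this sketch only the first orientation places the recognition sequence between the peptidase and the target, which matches the arrangement the text describes as the more frequent one; the second orientation is still a listed embodiment.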
Solid Phase Association Regions
[0072] A solid phase association region, which sometimes is referred to herein as a "solid phase association sequence," includes any moiety or amino acid sequence suitable for associating a fusion protein with a solid support. Any suitable solid phase for protein or peptide purification processes can be utilized (e.g., cellulose, plastic, glass, polystyrene), and the solid support often is derivitized with a binding pair member and the solid phase association region comprises, consists essentially of, or consists of the other binding pair member. Any binding pair members can be utilized that allow for association of a fusion protein with a solid phase via the solid phase association region. Examples of binding pair members include, but are not limited to, protein/ligand (e.g., maltose binding protein/maltose, glutathione S-transferase/glutathione); metal/metal-binding moiety (e.g., metal/polyhistidine amino acid sequence, nickel/His6); antibody/epitope (e.g., antibody/FLAG sequence); antibody/antigen; antibody/antibody; antibody/antibody fragment; antibody/antibody receptor; antibody/protein A or protein G; hapten/anti-hapten; biotin/avidin; biotin/streptavidin; folic acid/folate binding protein; vitamin B12/intrinsic factor; nucleic acid/complementary nucleic acid (e.g., DNA, RNA, PNA); and chemical reactive group/complementary chemical reactive group (e.g., sulfhydryl/maleimide, sulfhydryl/haloacetyl derivative, amine/isothiocyanate, amine/succinimidyl ester, and amine/sulfonyl halides). In some embodiments, the solid phase association region comprises, consists essentially of, or consists of a polyhistidine amino acid sequence (e.g., His6) and the solid phase is derivitized with nickel or copper ions.
In certain embodiments, the solid phase association region directly associates with the solid phase, and sometimes it is associated indirectly with the solid phase (e.g., the solid phase association region and derivitized solid phase are linked by a bifunctional linker moiety).
[0073] A solid phase association region sometimes is included in the fusion protein during fusion protein production. In some embodiments, a solid phase association region or a portion of it sometimes is incorporated in a nucleic acid that encodes the fusion protein. For example, an expressed fusion protein sometimes includes a polyhistidine tract that can bind a metal-derivitized solid phase, and a biotin moiety sometimes is included in a fusion protein produced by recombinant expression (e.g., pcDNA™6 BioEase™ Gateway® Biotinylation System (Invitrogen); an avidin- or streptavidin-derivitized solid phase can bind the biotin in the fusion protein). In some embodiments, a solid phase association region sometimes is added to a fusion protein after fusion protein production. Methods for derivitizing a fusion protein with a solid phase associating agent are known (e.g., a biotin-derivitized antibody or antibody fragment that specifically binds the fusion protein sometimes is contacted with fusion protein and the product is contacted with an avidin- or streptavidin-derivitized solid phase).
Peptidase Sequences and Peptidase Recognition Sequences
[0074] The amino acid sequence for any peptidase, peptidase fragment or sequence variant thereof that is capable of cleaving the fusion protein can be incorporated in the fusion protein using known processes. The term "peptidase" refers to any amino acid sequence capable of cleaving a backbone amide bond in the fusion protein, and a peptidase, in some embodiments, may be capable of performing additional types of reactions. In some embodiments, the peptidase has hydrolytic activity, transferase activity, ligase activity, splicing activity, cyclization activity, or a combination of the foregoing. For example, the peptidase sometimes modifies the fusion protein by linking the following to an end of a fusion protein cleavage product: a water molecule (e.g., hydrolytic activity), another molecule (e.g., an exogenous molecule; transferase or ligase activity), an end of another fusion protein cleavage product (e.g., splicing activity), or another end of the same fusion protein cleavage product (e.g., cyclization activity). The peptidase selected sometimes cleaves the fusion protein by an intramolecular reaction (i.e., the peptidase cleaves the fusion protein in which it is located), sometimes cleaves by an intermolecular reaction (i.e., the peptidase cleaves another fusion protein molecule), and sometimes cleaves by intermolecular and intramolecular reactions. In some embodiments, the selected peptidase does not excise all or part of its own sequence from the fusion protein and link ends of the fusion protein cleavage products to one another. A selected peptidase often has endopeptidase activity, where the peptidase cleaves within the fusion protein sequence, and often the peptidase does not cleave a terminal amino acid from the fusion protein. The peptidase can cleave the fusion protein at any catalytic rate, and peptidases having moderate to slow catalytic activity often are selected.
Examples of sequences having peptidase activity are known (e.g., transamidases, hydrolases, cysteine-type endopeptidases, aspartic-type endopeptidases, D-alanyl-D-alanine endopeptidases, elastases, metalloendopeptidases, mitochondrial inner membrane peptidases, serine-type endopeptidases and threonine endopeptidases; http address merops.sanger.ac.uk/; http address www.gramene.org/perl/ontology/search_term?id=GO:0003824; and http address www.ncbi.nlm.nih.gov). A selected peptidase often has moderate to low hydrolytic activity. In some embodiments, the catalytic activity of the peptidase is expressed in terms of the kinetic parameter kcat/KM, and sometimes this parameter is in a range of about 0.5 M⁻¹s⁻¹ to about 50 M⁻¹s⁻¹, and sometimes is about 5 M⁻¹s⁻¹.
[0075] In some embodiments, the peptidase has hydrolytic activity and ligase activity depending upon substrates available. In the latter embodiments, the peptidase sometimes is a sortase (described in further detail above) that hydrolyzes the fusion protein when the system is substantially free of NH2-CH2-containing substrate (e.g., a polyglycine such as triglycine), and/or cleaves and ligates a fusion protein cleavage product to a NH2-CH2-containing substrate present in the system. In some embodiments, the peptidase also comprises transamidase activity, and sometimes the peptidase is a sortase, such as a sortase described above.
[0076] In some embodiments, peptidase sequence fragments or variants are selected to favor certain fusion protein parameters. For example, peptidase sequence fragments and variants sometimes are selected for lower catalytic activity and/or enhanced fusion protein production levels as compared to the corresponding full-length, native peptidase sequence. Peptidase sequence fragments and variants having lower catalytic activity sometimes are selected to reduce rates of fusion protein cleavage during protein purification, described in further detail herein.
Effects of a peptidase sequence fragment or variant in a fusion protein on such parameters sometimes are determined by producing fusion proteins with full-length, native peptidase sequences and fusion proteins with fragment and/or variant peptidase sequences and comparing parameters for the respective fusions. Fragments of any length and sequence variants that have suitable catalytic activity and suitable cleavage specificity for protein purification processes described hereafter may be incorporated into a fusion protein. A fragment sometimes consists of about 80% of the full-length peptidase amino acid sequence, and sometimes about 70%, about 60%, about 50%, about 40% or about 30% of the full-length peptidase amino acid sequence.
[0077] A peptidase sequence variant sometimes differs by one or more amino acid substitutions, insertions or deletions, such as 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid substitutions, insertions or deletions from the native sequence or subsequence, and sometimes is substantially identical to the native peptide sequence or subsequence. The term "substantially identical" as used herein refers to sequences sharing one or more identical amino acid sequences. Included is an amino acid sequence that is 55% or more, 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more (each often within a 1%, 2%, 3% or 4% variability) identical to another amino acid sequence. One test for determining whether two sequences are substantially identical is to determine the percent of identical sequences shared between nucleic acids, proteins or peptides. Sequence identity can be determined as described above. Also, a peptidase sequence can be utilized as a "query sequence" in database searches useful for identifying alternative peptidase sequences for use in fusion proteins that are substantially identical to a query peptidase sequence.
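As a rough illustration of a percent identity comparison, the sketch below counts matching positions between two sequences that have already been aligned to equal length. This is an assumed, simplified calculation and is not necessarily the identity determination referred to above; the function name `percent_identity`, the gap character and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns that hold the same residue.

    Assumes both inputs are already aligned and padded to the same length.
    Columns where both sequences have a gap are skipped; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical five-residue example differing at one position.
identity = percent_identity("LPETG", "LPATG")
print(identity)
```

A variant could then be compared against one of the thresholds listed in paragraph [0077] (for example, 80% or more) by testing the returned value.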
[0078] In certain embodiments, the peptidase sequence is a sortase fragment having catalytic activity, and the fragment sometimes is a SrtAc fragment. A sortase fragment sometimes is a catalytic core region that recognizes and cleaves a threonine-glycine bond in a Leu-Pro-Xaa-Thr-Gly sequence. In some embodiments, a catalytic core region from SrtAc is utilized, and sometimes the catalytic core region is from about position 60 to about position 206 of native SrtAc. The peptidase sequence sometimes is a sortase variant or sortase fragment variant, sometimes a SrtAc variant or SrtAc fragment variant, and sometimes is a variant with reduced activity compared to native sortase or native SrtAc. In some embodiments, the SrtAc variant or SrtAc fragment variant includes an amino acid substitution at Trp 194, such as an alanine at that position.
[0079] As disclosed above, the peptidase is capable of cleaving the fusion protein at the recognition sequence. The recognition sequence often is selected based upon the peptidase sequence in the fusion protein, and any recognition sequence that the peptidase can specifically recognize can be utilized. Peptidase recognition sequences are known and can be incorporated into a fusion using known recombinant molecular biology processes. For example, in embodiments where Srt A is utilized as a peptidase, the recognition sequence often comprises the amino acid sequence X1PX2X3G, where X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine. In specific embodiments, X1 is leucine and X3 is threonine. In certain embodiments, X2 is aspartate, glutamate, alanine, glutamine, lysine or methionine. In embodiments where Srt B is utilized as a peptidase, the recognition sequence often comprises the amino acid sequence NPX1TX2, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
In embodiments where the peptidase is SrtAc, or a fragment or sequence variant thereof, the recognition sequence often is Leu-Pro-Xaa-Thr-Gly, where Xaa is any amino acid.
[0080] The term "at the recognition sequence" sometimes refers to the peptidase cleaving the fusion protein within the recognition sequence, such that each cleavage product includes a portion of the recognition sequence. The term sometimes refers to the peptidase cleaving the fusion protein at a site in the fusion protein adjacent to the recognition sequence, sometimes at a position 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids away from an end of the recognition sequence, and sometimes at a position 15 or fewer, 20 or fewer, 50 or fewer or 100 or fewer amino acids away from an end of the recognition sequence. The recognition sequence often is located between the peptidase sequence and target protein sequence or target peptide sequence in a fusion protein. A linker sequence sometimes is located between the peptidase sequence and target protein sequence or target peptide sequence in a fusion protein and the recognition sequence sometimes is located in the linker sequence. Where the peptidase cleaves within the recognition sequence, the recognition sequence often is located closer to the target protein or target peptide sequence, where an end of the recognition sequence sometimes is 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids away from an end of the target protein or target peptide sequence, and sometimes is 15 or fewer, 20 or fewer, 50 or fewer or 100 or fewer amino acids away from an end of the target protein or target peptide sequence.
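The recognition motifs described in this section lend themselves to a simple pattern search. The sketch below is illustrative only: it encodes the Srt A motif (X1-P-X2-X3-G) and the Srt B motif (N-P-X1-T-X2) as regular expressions over one-letter codes and splits a sequence within a matched Srt A site. The names `find_sites` and `cleave_within_srta_site` and the toy sequence are hypothetical, and placing the cut between the fourth and fifth motif residues for every allowed X3 is an assumption extrapolated from the threonine-glycine cleavage described for Leu-Pro-Xaa-Thr-Gly.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

# Srt A motif: X1 in {L, I, V, M}, P, X2 any residue, X3 in {T, S, A}, G
SRTA_MOTIF = re.compile(rf"[LIVM]P[{AA}][TSA]G")
# Srt B motif: N, P, X1 in {Q, K}, T, X2 in {N, G}
SRTB_MOTIF = re.compile(r"NP[QK]T[NG]")


def find_sites(sequence, motif):
    """Return (start_index, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in motif.finditer(sequence)]


def cleave_within_srta_site(sequence, start):
    """Split between the fourth and fifth motif residues (Thr-Gly in LPXTG).

    Assumption for illustration: the same cut position is used for any X3.
    """
    cut = start + 4
    return sequence[:cut], sequence[cut:]


fusion = "HHHHHHLPETGAAA"  # hypothetical toy sequence, not from this disclosure
sites = find_sites(fusion, SRTA_MOTIF)
n_part, c_part = cleave_within_srta_site(fusion, sites[0][0])
print(sites, n_part, c_part)
```

Here each cleavage product retains a portion of the recognition sequence, which corresponds to the first sense of "at the recognition sequence" given in paragraph [0080].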
Target Proteins, Target Peptides, Nucleic Acids and Host Cells
[0081] Any protein or peptide amino acid sequence may be incorporated into a fusion protein as a target protein or target peptide for the one-step purification processes described herein. Proteins and peptides described above for ligation processes can be incorporated into fusion proteins for one-step purification processes, for example. Expressing the protein or peptide in a fusion sometimes increases the solubility of the protein or peptide as compared to when it is expressed alone and not part of a fusion protein.
[0082] Provided are nucleic acids that encode a fusion protein described herein. A nucleic acid sometimes comprises a nucleotide sequence that encodes a fusion protein comprising a solid phase association region, a peptidase, and a peptidase recognition sequence, where the peptidase is capable of cleaving the fusion protein at the recognition sequence. The nucleic acid is of any composition useful for generating further copies of the nucleic acid and/or fusion protein expression. The nucleic acid often comprises, consists essentially of or consists of DNA or RNA, often is double-stranded, sometimes is single-stranded, sometimes is linear and sometimes is a plasmid.
[0083] A nucleic acid sometimes includes a region for inserting a target protein or target peptide sequence, such as one or more topoisomerase recognition sites, one or more sites adapted for an amplification process (e.g., polymerase chain reaction (PCR) process) and a nucleotide sequence with one or more restriction enzyme sites convenient for cloning the fusion protein-encoding nucleotide sequence into the nucleic acid, for example. A nucleic acid sometimes includes a nucleotide sequence that encodes a target protein or target peptide.
[0084] A nucleic acid often includes one or more regulatory sequences operatively linked to the nucleotide sequence that encodes the fusion protein.
The term "regulatory sequence" includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals), for example. Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. Regulatory sequences are of viral origin in certain embodiments. For example, commonly used viral promoter sequences are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. A nucleic acid sometimes is capable of directing fusion protein expression in a particular cell type (e.g., tissue-specific regulatory elements are used to express the fusion protein). Examples of tissue-specific promoters include, but are not limited to, an albumin promoter (liver-specific; Pinkert et al., Genes Dev. 1: 268-277 (1987)), lymphoid-specific promoters (Calame & Eaton, Adv. Immunol. 43: 235-275 (1988)), promoters of T cell receptors (Winoto & Baltimore, EMBO J. 8: 729-733 (1989)), promoters of immunoglobulins (Banerji et al., Cell 33: 729-740 (1983); Queen & Baltimore, Cell 33: 741-748 (1983)), neuron-specific promoters (e.g., the neurofilament promoter; Byrne & Ruddle, Proc. Natl. Acad. Sci. USA 86: 5473-5477 (1989)), pancreas-specific promoters (Edlund et al., Science 230: 912-916 (1985)), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters sometimes are utilized, which include, for example, the murine hox promoters (Kessel & Gruss, Science 249: 374-379 (1990)) and the alpha-fetopolypeptide promoter (Camper & Tilghman, Genes Dev. 3: 537-546 (1989)). [0085] A nucleic acid sometimes is transiently or stably transfected or transformed into host cells. 
The terms "transformation" and "transfection" refer to a variety of known techniques for introducing an exogenous nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, transduction/infection, DEAE-dextran-mediated transfection, lipofection, and electroporation, for example. For stable transfection and transformation embodiments, a nucleic acid sometimes includes one or more integration nucleotide sequences for integrating a nucleotide sequence that encodes a fusion protein into a host cell genome, and such integration sequences often flank the fusion protein-encoding nucleotide sequence. [0086] Design of a nucleic acid sometimes depends on such factors as the choice of host cell to be transformed, desired level of fusion protein expression, and the like. The nucleic acid can be designed for fusion protein expression in prokaryotic and/or eukaryotic cells. For example, fusion proteins can be expressed in bacteria (e.g., E. coli), insect cells (e.g., Sf9 cells using baculovirus expression vectors), yeast cells, fungi cells, or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990), for example. In some embodiments, nucleotide sequences encoding the fusion protein in the nucleic acid can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase. [0087] A nucleotide sequence in the nucleic acid that encodes a fusion protein sometimes is codon-optimized depending upon the host organism selected for expression. 
In some embodiments, cleavage sites for the peptidase in the fusion protein other than the cleavage site that releases the target protein or target peptide from the fusion protein (e.g., one or more cleavage sites not located between the peptidase and the target protein or target peptide sequence) often are removed from or modified in the nucleic acid before the fusion protein is expressed. [0088] Any host cell suitable for producing and expressing a fusion protein can be transfected or transformed with a nucleic acid described herein. Thus, provided are compositions comprising a host cell in combination with a nucleic acid described herein, and compositions comprising a host cell in combination with a fusion protein described herein. Host cells sometimes are a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells or a strain of mammalian cells.
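The removal of cleavage sites other than the one that releases the target can be supported by a simple sequence scan. The sketch below, which assumes an LPXTG-type recognition sequence and a known position for the intended site, lists any additional motif occurrences in a fusion protein sequence so that they can be considered for removal or modification. The function name, the example sequence and the intended-site index are hypothetical.

```python
import re

# LPXTG in one-letter code; the lookahead also reports overlapping hits.
MOTIF = re.compile(r"(?=(LP[A-Z]TG))")


def extra_cleavage_sites(fusion_seq, intended_start):
    """Return (start, motif) pairs for every motif other than the intended one.

    `intended_start` is the 0-based index of the recognition sequence that
    is meant to release the target protein or target peptide.
    """
    hits = [(m.start(), m.group(1)) for m in MOTIF.finditer(fusion_seq.upper())]
    return [hit for hit in hits if hit[0] != intended_start]


if __name__ == "__main__":
    # Hypothetical fusion: a His tag, a stretch carrying an unwanted LPATG,
    # then the intended LPETG site followed by the start of a target.
    fusion = "HHHHHH" + "AALPATGAA" + "GSLPETG" + "MKV"
    intended = fusion.index("LPETG")
    for start, motif in extra_cleavage_sites(fusion, intended):
        print(f"additional site {motif} at position {start}")
```

A scan of this kind only flags candidate sites; whether a flagged site is actually cleaved would need to be determined experimentally.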
Fusion Protein Production and Target Protein/Peptide Purification [0089] Provided are processes for producing a fusion protein described herein in a system, which comprise contacting a system that comprises a nucleic acid encoding the fusion protein with conditions suitable for expressing the fusion protein. The system sometimes is a cell-free environment for in vitro expression of the fusion protein. The system often is an environment containing cells, such as a cell culture plate or flask containing host organism cells. Any host cell suitable for protein expression is utilized, including but not limited to, a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells. As disclosed herein, a host cell sometimes is transiently transfected or transformed and sometimes stably transfected or transformed with a nucleic acid encoding the fusion protein. Any condition suitable for expressing the fusion protein can be utilized, such as conditions in which host cells are multiplying or conditions in which host cells are not multiplying. Parameters such as media type, vessel type (e.g., flask, dish, fermentation reactor), temperature, humidity, agitation levels, inducer concentration, and time(s) of inducer addition sometimes are set according to standard procedures and sometimes are modified to optimize fusion protein expression. An inducer often is selected based upon the type of promoter sequence(s) in the nucleic acid (e.g., the inducer isopropyl beta-D-thiogalactoside often is utilized with nucleic acids having a T7 polymerase promoter sequence). 
In some embodiments, the host organism is in suspension, and the fusion protein sometimes is produced at a level of 1 mg/L of cell culture or more, 2 mg/L of cell culture or more, 5 mg/L of cell culture or more, 10 mg/L of cell culture or more, 15 mg/L of cell culture or more, 20 mg/L of cell culture or more, 25 mg/L of cell culture or more, 30 mg/L of cell culture or more, 35 mg/L of cell culture or more, 40 mg/L of cell culture or more, 50 mg/L of cell culture or more, 75 mg/L of cell culture or more, 100 mg/L of cell culture or more, 250 mg/L of cell culture or more, 500 mg/L of cell culture or more, 750 mg/L of cell culture or more or 1000 mg/L of cell culture or more. [0090] A fusion protein produced in a system often is purified and isolated. Thus, provided is a method for isolating a target protein or target peptide, which comprises: contacting a fusion protein described herein with a solid phase capable of specifically binding to the solid phase association region of the fusion protein and collecting target protein or target peptide cleaved from the fusion protein associated with the solid support. The terms "isolated" and "purified" often refer to a target protein or target peptide activity having a specific activity of at least ten-fold greater than the corresponding activity present in a crude extract, lysate, or other state from which target protein or target peptide have not been removed. The terms "isolated" and "purified" also refer to the target protein or target peptide being substantially free from proteins and other components in host cell or cell-free systems. The term "substantially free" often refers to a target protein or target peptide having 30% or less, 20% or less, 10% or less, and sometimes 5% or less (by dry weight) of non-target polypeptide (also referred to herein as a "contaminating protein"), or of chemical precursors or non-target chemicals. 
Isolated and purified target protein or target peptide often is substantially free of culture medium, where culture medium represents less than about 20%, sometimes less than about 10%, and often less than about 5% of the volume of the target protein or target peptide preparation. Isolated or purified target protein or target peptide preparations sometimes are 0.01 milligrams or more or 0.1 milligrams or more, and often 1.0 milligrams or more and 10 milligrams or more in dry weight. Purity of an isolated target peptide or target protein can be determined by a suitable method, such as mass spectrometry, gel electrophoresis and densitometry, for example. [0091] In embodiments where the fusion protein is produced in a host cell, the host cell sometimes is contacted with conditions that release the fusion protein, including but not limited to, cell lysis conditions and osmotic stress conditions. The cells sometimes are contacted with reagents before, during or after the cells are exposed to the releasing conditions, such as contacting cells or cell lysates with one or more protease inhibitors. The fusion protein sometimes is secreted from a host cell, and in such embodiments, medium surrounding the cells often is collected and fusion protein often is purified from the collected medium. Reagents (e.g., one or more protease inhibitors) sometimes are added to culture media into which fusion proteins are secreted. [0092] Any solid phase suitable for protein or peptide purification can be utilized for the purification processes, such as a solid phase described herein. In certain embodiments, the solid phase is derivatized with metal ions, such as nickel ions, and the solid phase association region of the fusion protein includes an amino acid sequence that binds to a metal ion, such as a polyhistidine sequence (e.g., His6). 
The solid phase is arranged in a configuration suitable for protein purification, such as in a vessel that retains the solid phase and allows liquid reagents to pass through (e.g., chromatography columns, centrifugation columns having a membrane, high performance liquid chromatography columns) and vessels for separating a solid phase from liquid phase by centrifugation (centrifugation vessels with an optional membrane). A fusion protein is contacted with a solid phase under conditions that allow the fusion protein to associate with the solid phase, often by specific binding of the solid phase association region to the solid phase, and often in a buffer solution comprising low salt concentration. After the solid phase is contacted with the fusion protein, components not associated with the solid phase often are separated from the solid phase using standard procedures. The solid phase often is washed under conditions that maintain association of the fusion protein with the solid support. [0093] A fusion protein in association with a solid phase often is contacted with a substance that facilitates or accelerates fusion protein cleavage by the peptidase. Any substance that facilitates or accelerates peptidase activity can be utilized, and in embodiments where the peptidase is a sortase, sortase fragment, or sequence variant thereof, the substance sometimes is a NH2-CH2- containing substance, optionally with calcium ions. Calcium ions can be introduced to the purification system by addition of calcium chloride, for example. The NH2-CH2- containing substance sometimes is a polyglycine and sometimes is triglycine. Where a polyglycine substance is utilized, the cleaved target protein or target peptide often includes an added N-terminal glycine donated by the polyglycine substance for fusion proteins oriented with the peptidase region closer to the N-terminus of the fusion protein than the target protein or target peptide region. 
[0094] In some embodiments, the NH2-CH2- containing substance is characterized by the formula NH2-CH2-Z, where Z is a molecule of interest. Where a NH2-CH2-Z substance is utilized, the cleaved target protein or target peptide often includes an added C-terminal Z moiety donated by the NH2-CH2-Z substance for fusion proteins oriented with the target protein or target peptide region closer to the N-terminus of the fusion protein than the peptidase region. Z is any molecule of interest that yields a stable target protein-Z product or target peptide-Z product. Any suitable molecule of interest can be utilized, such as molecules of interest described above. [0095] The solid phase, solutions contacted with the solid phase and target protein or target peptide eluted from the solid phase are maintained at any suitable temperature. The temperature sometimes is room temperature (e.g., about 25°C), sometimes is colder than room temperature (e.g., about 4°C, which can beneficially reduce peptidase cleavage rates and stabilize eluted target protein or target peptide), and temperature can be optimized by performing purification procedures at different temperatures. Purification processes often are performed in a system comprising an aqueous environment. Water with an appropriate buffer and/or salt content often is utilized. An alcohol or organic solvent may be included in certain embodiments. The amount of an organic solvent often does not appreciably esterify the fusion protein, target protein or target peptide (e.g., esterified protein or peptide often increases only by 5% or less upon addition of an alcohol or organic solvent). Alcohol and/or organic solvent contents sometimes are 20% or less, 15% or less, 10% or less or 5% or less, and in embodiments where a greater amount of an alcohol or organic solvent is utilized, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, or 80% or less alcohol or organic solvent is present. 
In certain embodiments, the purification system includes only an alcohol or an organic solvent, with only limited amounts of water if it is present. Target protein or target peptide released from the fusion protein sometimes is 90% or more pure, 91% or more pure, 92% or more pure, 93% or more pure, 94% or more pure, 95% or more pure, 96% or more pure or 97% or more pure, and sometimes is 98% or more pure or 99% or more pure.
Kits [0096] Provided herein are kits which comprise one or more containers that include any of the compositions and products described herein, and often include instructions for producing a fusion protein and/or purifying a target protein or target peptide from the fusion protein. In some embodiments, one or more containers in the kit comprise a nucleic acid described herein. A nucleic acid in the kit sometimes comprises a nucleotide sequence that encodes a fusion protein containing a solid phase association region, a peptidase region and a peptidase recognition sequence, and sometimes includes a sequence that facilitates incorporation of a nucleotide sequence encoding a target protein or target peptide into the nucleic acid. A kit sometimes includes instructions for inserting a target protein-encoding or target peptide-encoding nucleotide sequence into a provided nucleic acid. A kit sometimes includes one or more oligonucleotides, polymerases, isomerases (e.g., topoisomerase), restriction enzymes and/or ligases, which sometimes are utilized to insert a target protein-encoding or target peptide-encoding nucleotide sequence into the nucleic acid provided in the kit. A nucleic acid in the kit sometimes includes a nucleotide sequence that encodes a target protein or target peptide. In some embodiments, a kit includes a solid support capable of binding a solid phase association region in a fusion protein, and sometimes the solid support is metal-derivatized for associating a fusion protein containing a polyhistidine solid phase association region. A kit sometimes includes host cells into which a nucleic acid can be transformed or transfected, and often is useful for expressing a fusion protein from the nucleic acid. Such kits can include any type of suitable host cells, including a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells. 
A kit sometimes includes one or more substances that facilitate cleavage of a fusion protein, such as a calcium ion containing substance (e.g., CaCl2), and/or a NH2-CH2-containing substance, such as a polyglycine (e.g., triglycine) and/or a NH2-CH2-Z substance described herein.
Examples [0097] The examples set forth below illustrate and do not limit the invention.
Example 1 Protein Expression and Purification [0098] The following procedures were utilized in the processes described in Example 2 through Example 6.
Sortase [0099] Primers (5'-CATATGGCTAGCCAAGCTAAACCTCAAATTCCG-3' and 5'-CCTAGGCTCGAGTTATTTGACTTCTGTAGCTACAA-3') were used to PCR amplify the srtA sequence from the genomic DNA of Staphylococcus aureus. The DNA fragment was digested with Nhe I and Xho I and subsequently ligated into a modified pET15b (Novagen) vector for expressing a sortase containing an N-terminal 6-His tag and a thrombin cleavage site. The plasmid was transformed into Escherichia coli BL21(DE3) for protein expression. The cell culture was grown in LB medium containing 50 mg/L ampicillin at 37°C. Protein expression was induced with 0.2 mM isopropyl β-D-thiogalactoside (IPTG) at 30°C for 3 hours. The cell paste then was suspended in 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, and 1 mM β-mercaptoethanol (BME) and lysed by sonication. Nucleic acids were precipitated by the addition of 0.1% polyethylene imine (PEI). The supernatant of the lysate was applied to a 5-ml Ni-NTA (Qiagen) column. The column was washed with 10 mM Tris-HCl, pH 7.5, 500 mM NaCl, 30 mM imidazole, and 1 mM BME. The protein was eluted with 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 250 mM imidazole and 1 mM BME. Pooled fractions were buffer exchanged into 50 mM Tris-HCl, pH 7.5 and 150 mM NaCl through a 10DG desalting column (BioRad). The protein was analyzed by MALDI-TOF mass spectrometry (Bruker Daltonics Autoflex), giving an average mass of 19,067 ± 20 (predicted 19,048).
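The comparison of an observed mass with a predicted mass, as reported for the sortase above (19,067 ± 20 observed versus 19,048 predicted), can be expressed as a short check. The following Python sketch is illustrative only; the function names are hypothetical and the tolerance is taken to be the stated ± value.

```python
def mass_deviation(observed, predicted):
    """Return the signed difference between observed and predicted mass."""
    return observed - predicted


def within_tolerance(observed, predicted, tolerance):
    """True when the predicted mass lies within observed +/- tolerance."""
    return abs(mass_deviation(observed, predicted)) <= tolerance


if __name__ == "__main__":
    # Values reported for the His-tagged sortase in paragraph [0099].
    observed, predicted, tolerance = 19067, 19048, 20
    print("deviation:", mass_deviation(observed, predicted))
    print("agrees within tolerance:", within_tolerance(observed, predicted, tolerance))
```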
GST-LPXTG-6His proteins [0100] Oligos (5'-AATTCCTTCCGGAAACCGGCCTCGAGCACCACCACCACCACCACTGATAGC-3' and
5'-GGCCGCTATCAGTGGTGGTGGTGGTGGTGCTCGAGGCCGGTTTCCGGAAGG-3') were annealed and inserted into the pGEX-4T-1 vector between the EcoRI and NotI sites, encoding GST with a C-terminal LPETGHHHHHH sequence. The plasmid was transformed into Escherichia coli BL21 for protein expression. The cell culture was grown in LB medium containing 50 mg/L ampicillin at 37°C. Protein expression was induced with 0.2 mM IPTG at 30°C for 3 hours. The cell paste was then suspended in 35 ml of 1X phosphate buffered saline (PBS) buffer containing 1 mM TCEP [tris(2-carboxyethyl)phosphine]. The lysate supernatant was applied to a 5-ml GST-binding column (Novagen). The column was washed with 50 ml PBS containing 1 mM TCEP. The protein was eluted with 10 mM reduced glutathione. Pooled fractions were buffer exchanged into 150 mM NaCl through a 10DG desalting column. The protein mass was confirmed by MALDI-TOF mass spectrometry.
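The annealing of the two oligonucleotides above can be checked computationally. The following Python sketch, offered only as an illustration, tests whether the two strands pair once a 4-nucleotide 5' overhang is set aside on each, and reports those overhangs (AATT is the overhang left by EcoRI and GGCC the overhang left by NotI). The function names and the assumed overhang length are not part of the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sticky_ends(top, bottom, overhang=4):
    """Return the two 5' overhangs if the remaining bases pair, else None.

    Both strands are written 5' to 3'. The first `overhang` bases of each
    strand are assumed to stay single-stranded after annealing.
    """
    if top[overhang:] == reverse_complement(bottom[overhang:]):
        return top[:overhang], bottom[:overhang]
    return None


if __name__ == "__main__":
    # Oligonucleotides from paragraph [0100].
    top = "AATTCCTTCCGGAAACCGGCCTCGAGCACCACCACCACCACCACTGATAGC"
    bottom = "GGCCGCTATCAGTGGTGGTGGTGGTGGTGCTCGAGGCCGGTTTCCGGAAGG"
    print(sticky_ends(top, bottom))
```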
GFP-LPXTG-6His [0101] Primers (5'-ATATACATATGGTGAGCAAGGGCG-3' and 5'-TGGTGCTCGAGACCGGTTTCCGGAAGCTTGTACAGCTCGTCCATGC-3') were used to PCR amplify eGFP from a pET24b-eGFP vector. The PCR product was digested and inserted between the Nde I and Xho I sites of pET23b vector to express eGFP with a C-terminal LPETGLEHHHHH sequence. The plasmid DNA was transformed into Escherichia coli BL21(DE3) for protein expression. The protein expression and purification procedure was similar to that of sortase. The purified protein was desalted into 150 mM NaCl.
Protein Hydrolysis [0102] GST-LPXTG-6His and GFP-LPXTG-6His (10 μM and 35 μM, respectively) were incubated with 10 μM of sortase in buffers containing 150 mM NaCl, 5 mM CaCl2, and 2 mM BME. The buffer pH was controlled by 50 mM sodium acetate (pH 5.5), 50 mM MES (pH 6.5), 50 mM Tris-HCl (pH 7.5) or 50 mM Tris-HCl (pH 8.5). The reactions were incubated at 37°C for 20 hours. During the course of incubation, aliquots of the reaction mixture were taken and hydrolysis products were analyzed on 4-20% Tris-Glycine SDS-PAGE gels. Ligation [0103] For peptide-peptide ligation, unless otherwise noted, all peptides were synthesized on an Applied Biosystems 431 Peptide Synthesizer. Acetyl-E(Edans)LYKTGK(Dabcyl)R {Edans, 5'-[(2'-aminoethyl)amino]naphthalene-1-sulfonic acid; Dabcyl, 4-{[4-(dimethylamino)phenyl]azo}benzoic acid} (synthesized by the Keck facility at Yale University) was dissolved in H2O and added to a final concentration of 100 μM. The peptide was incubated at 37°C in the absence or presence of 100 μM of various peptides {GnRRNRRTSKLMLR (n = 1, 2, 3, or 5); L- or D-Tat (GYGRKKRRQRRR)} and 10 μM SrtA in buffer containing 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM CaCl2, and 2 mM BME. The rate of substrate cleavage was measured by the fluorescence increase of Edans at an emission wavelength of 460 nm and an excitation wavelength of 360 nm on a fluorometer (Applied Biosystems CYTOFLUOR Series 4000). Product formation was monitored by C-18 reverse phase HPLC (Vydac, Cat 218TP54) over the course of 28 hours, using a gradient of 0.5% to 38% CH3CN in 0.1% trifluoroacetic acid in 40 minutes at a flow rate of 1 ml/min. Elution of peptides was monitored at 214 nm and fractions were collected for mass analysis on a MALDI-TOF mass spectrometer. Conjugation of Acetyl-RE(Edans)LPKTGK(Dabcyl)R with an L- or D-Tat (GYGRKKRRQRRR) also followed the same procedures. 
[0104] Fluorescein-Ahx (aminohexanoic acid)-SKLPKTGSE (f-ahx-SKLPKTGSE) was dissolved in ddH2O and added to a final concentration of 1 mM. The peptide was incubated with 10 μM SrtA in buffer containing 50 mM Tris-HCl, pH 8.5, 150 mM NaCl, 5 mM CaCl2, and 2 mM BME at 37°C. The products were analyzed on a MALDI-TOF mass spectrometer. [0105] For protein-peptide ligation, the protein substrate (GFP-LPXTG-6His or GST-LPXTG-6His) was used at concentrations ranging from 10 μM to 35 μM and a peptide substrate was added in 5 to 10-fold excess. The reactions were incubated at 37°C for 24 to 48 hours in the presence of 10 μM SrtA, 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM CaCl2, and 2 mM BME. The ligation reactions were terminated by passing the reaction mixtures through a 0.5 ml Ni-NTA column equilibrated with 50 mM Tris-HCl pH 7.5 and 150 mM NaCl. The protein ligation product was collected in the column flow through. The column flow through was further purified on a 10DG desalting column to remove the unligated peptide. [0106] A protein substrate was also incubated with sortase in the absence or presence of 5 mM glycine, 5 mM spermine (Sigma), 0.5 mM 3.4 kDa poly(ethylene glycol)-ω-amino-α-carboxyl (NH2-PEG-COOH) (Shearwater), or 0.5 mM peptide-1 (GnRRNRRTSKLMLR, n = 1, 3, or 5) in the ligation buffer. After 20 hours at 37°C, the ligation reactions were analyzed on a NOVEX 4-12% Bis-Tris gel with MES running buffer. The molecular weights of the ligation products were also determined by MALDI-TOF mass spectrometry.
Cell Culture [0107] NIH3T3 cells were seeded into twelve-well plates at a density that achieved approximately 80% confluency after an overnight incubation. The cells were incubated with approximately 1 μM of protein in 500 μl of DMEM (Dulbecco's modified Eagle's medium) (Invitrogen) supplemented with 10% fetal bovine serum (FBS) or in 500 μl of Opti-MEM (Invitrogen). After incubation for 2 hours at 37°C in a 5% CO2 incubator, the cells were washed with PBS and treated with trypsin (Invitrogen). The detached cells were pelleted and analyzed on a Becton Dickinson flow cytometer (BD Biosciences).
Example 2 Hydrolysis of LPXTG-Motif Containing Proteins In Vitro [0108] To determine the hydrolysis efficiency on proteins, sortase was incubated with two different LPXTG-containing substrates (GST-LPXTG-6His and GFP-LPXTG-6His) and the cleavage products were analyzed by SDS/PAGE and MALDI-TOF mass spectrometry. Sortase hydrolyzed both proteins specifically at the LPXTG sequence. More than a third of GST-LPXTG-6His was hydrolyzed within two hours of incubation at 37°C. Over 80% of the full length protein was hydrolyzed after about twenty hours. The hydrolytic product GST-LPXT (calculated mass 27,276; observed mass 27,297 ± 20) was the major product across the pH range 6.5 to 8.5, and the hydrolysis was not significantly affected by the solution pH. In contrast, less than half of GFP-LPXTG-6His was hydrolyzed after twenty hours, and solution pH appeared to affect the hydrolytic efficiency. Slightly more hydrolytic product GFP-LPXT (calculated mass 27,382; observed mass 27,378 ± 20) was observed at lower pH (less than 7). This observed difference in hydrolysis between the two substrates may be due to the different accessibilities of the LPXTG motif on the proteins. In GST-LPXTG-6His the recognition motif is separated from GST by a 16-amino-acid linker, whereas the motif in GFP-LPXTG-6His is directly C-terminal to GFP and thus likely to be less sterically accessible. This may explain why the former substrate is cleaved more efficiently than the latter. In addition to the expected hydrolytic product, other high molecular weight (HMW) conjugates were observed as well. Among them, the sortase-thioester intermediates (MW ~ 46 kDa) were observed throughout the incubation. It is likely that formation of the sortase-thioester with substrate is almost instantaneous whereas the release of hydrolytic products appears to be a much slower process. Other HMW conjugates may be formed by non-specific nucleophilic attacks from protein lysine side chains on the sortase intermediate. 
Increasing solution pH facilitates deprotonation of these amino groups, making them more nucleophilic. This may explain why at higher pH, such as pH 8.5, significantly more HMW conjugates were observed. [0109] To determine whether lysine side chains are capable of performing nucleophilic attack on a sortase intermediate at high pH, we generated a small peptide (f-ahx-SKLPKTGSE) lacking a free N-terminal amino group. The only amino groups that are available for nucleophilic attack are the side chains of two lysines. Over a prolonged incubation (48 hours) of the peptide with sortase at pH 8.5, high molecular weight products were observed. Two of them corresponded to the f-ahx-SKLPKT and f-ahx-SKLPKT conjugate (observed mass 2,252.9, calculated mass 2,253.2), and the f-ahx-SKLPKT and f-ahx-SKLPKTGSE conjugate (observed mass 2,544.02, calculated mass 2,544.22). [0110] The results from the proteins and peptides together show that sortase can readily hydrolyze the peptide bond between the threonine and the glycine of the LPXTG motif. However, the release of the LPXT-containing segment is a slow process. Both H2O and a free amino group on a protein can act as a nucleophile to slowly release the LPXTG containing product from sortase. To favor the formation of a hydrolysis product and minimize non-specific covalent adducts, such as those formed with lysines on a protein, the solution pH can be lowered below 7.
Example 3 Ligation with LPXTG-Containing Peptides and Proteins In Vitro [0111] In addition to hydrolysis, sortase-catalyzed transpeptidation was effected in vitro in the presence of a glycine tripeptide. The native conjugation partner for an LPXTG-containing protein in vivo is a pentaglycine cross bridge on cell walls. However, it was not clear whether the pentaglycine cross bridge was required for efficient conjugation, and whether a peptide longer than a glycine tripeptide could be specifically linked to an LPXTG sequence. To determine the number of N-terminal glycines required for effective conjugation, we synthesized four peptides with N-terminal glycines ranging from one to five (GnRRNRRTSKLMLR, n = 1, 2, 3, or 5). Each was incubated with an equal molar amount of an LPXTG-containing peptide, acetyl-RE(Edans)LPKTGK(Dabcyl)R. The efficiency of peptide cleavage was assessed by measuring the increase of Edans fluorescence. The cleavage rates were very similar in all four reactions containing amino glycine peptides, about four to five times faster than the sortase-mediated hydrolysis in the absence of an amino group donor. Additionally, the cleavage rate in the presence of the peptide with one N-terminal glycine was slightly slower than those with two or more glycines. This observation indicated that sortase-mediated cleavage of the LPXTG sequence favored the presence of one or more N-terminal glycines rather than H2O. The formation of the ligation product RE(Edans)LPKTGnRRNRRTSKLMLR (n = 1, 2, 3, or 5) next was determined by RP-HPLC and mass spectrometry analyses. Within the initial 30 minutes, a substantial amount of conjugate was formed in all four ligation reactions. It appeared that the efficiency of peptide conjugation was not significantly affected by the number of glycines present at the N-terminus. These results suggested that sortase catalyzes ligation more efficiently than hydrolysis. 
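The composition of the ligation products described above can be written out with a short sketch. The Python example below joins the portion of an LPXTG-containing substrate up to and including the threonine with a nucleophile peptide that begins with one or more glycines. It is an illustration under stated assumptions: the function name is hypothetical, the Edans and Dabcyl labels and the acetyl group are omitted, and only the amino acid backbone is represented.

```python
def ligation_product(substrate, nucleophile, motif="LPKTG"):
    """Join the substrate through Thr of the motif to a glycine-led peptide.

    The substrate is cut between Thr and Gly of the motif; the C-terminal
    fragment is discarded and the nucleophile peptide takes its place.
    """
    if not nucleophile.startswith("G"):
        raise ValueError("nucleophile peptide must begin with glycine")
    start = substrate.index(motif)  # raises ValueError if the motif is absent
    return substrate[: start + 4] + nucleophile


if __name__ == "__main__":
    substrate = "RELPKTGKR"  # backbone of the labeled peptide, labels omitted
    for n in (1, 2, 3, 5):
        nucleophile = "G" * n + "RRNRRTSKLMLR"
        print(n, ligation_product(substrate, nucleophile))
```

For each value of n the result has the form RELPKT-Gn-RRNRRTSKLMLR, matching the product formula given in the text.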
[0112] The sortase-mediated ligation method was applied to protein-peptide conjugation. Protein GFP-LPXTG-6His and a ten-fold excess of the peptide GGGGGRRNRRTSKLMLR were mixed and incubated in the presence of different amounts of sortase. Product formation was monitored by SDS/PAGE and MALDI-TOF mass spectrometry. A substantial amount of conjugated product GFP-LPXT-GGGGGRRNRRTSKLMLR (observed mass 29,235 ± 20, calculated mass 29,234) had formed over 24 hours at 37°C. The conjugation efficiency was affected by the amount of sortase present. At an approximately equal molar concentration of protein substrate and sortase, the ligation reaction reached over 90% completion within 24 hours. The rate of protein-peptide ligation was slower than that of peptide-peptide ligation. Example 4 Conjugating a D-peptide to LPXTG-Containing Substrates [0113] An L-polypeptide was effectively conjugated to the C-terminus of an LPXTG motif in the presence of an N-terminal glycine, and the number of glycines did not significantly affect conjugation. Because glycine is an achiral amino acid, there was a possibility that sortase would not discriminate the chirality of the amino acids C-terminal to the glycine. To compare the ligation efficiencies of peptides with opposite chirality, an L- and a D-Tat peptide with identical sequence (GYGRKKRRQRRR) were synthesized. Each was incubated with an equal molar amount of an LPXTG substrate, acetyl-RE(Edans)LPKTGK(Dabcyl)R (L-form), in the presence of sortase. The rate of sortase-mediated cleavage of the LPXTG substrate was measured by the increase of Edans fluorescence, and product formation was determined by RP-HPLC and MALDI-TOF mass spectrometry. The LPXTG cleavage rate in the presence of D-Tat was about half of that with L-Tat, but higher than in the presence of glycine or H2O. D-Tat was able to conjugate to form RE(Edans)LPKT-Tat (observed mass 2,631.5, calculated mass 2,631.4). 
The amount of D-Tat ligation product was only slightly less than that of L-Tat. [0114] To demonstrate that a D-peptide could be conjugated to a recombinant protein, the D-Tat peptide was ligated to the GFP-LPXTG-6His protein with a peptide to protein ratio of 5 to 1. Over the course of 48 hours, over 90% of the protein was conjugated to the D-peptide. These results showed that sortase could be used to conjugate a D-peptide to the C-terminus of an LPXTG motif effectively. The observation that sortase could conjugate both L- and D-peptide substrates to an LPXTG motif may be explained from the sortase structure. Sortase appears to contain an elongated binding groove to recognize the LPXTG motif, but the binding site for the amino donor substrate is much shallower. The chirality of the incoming peptide should not interfere with the conjugation, as long as it contains an N-terminal glycine that can serve as a nucleophile to attack the sortase-thioester intermediate.
Example 5 Conjugating NH2-CH2-Containing Compounds to LPXTG Substrates [0115] Sortase activity was tested further with non-peptidyl substrates. Since an N-terminal glycine rather than amino acids with a branched alpha-carbon facilitates nucleophilic attack, it was possible that sortase might accommodate a substrate with a NH2-CH2- group. Two non-peptidyl compounds, spermine (MW 202.3) and NH2-PEG-COOH (average MW 3,400), were subjected to a sortase ligation reaction with a protein substrate (GFP-LPXTG-6His). The conjugating efficiencies were compared to those of a free glycine and of a peptide GGRRNRRTSKLMLR, which represent two extremes of the sortase-mediated ligation. While a free glycine is slightly better than H2O, it is still a very poor substrate for ligation. Besides the ligation product GFP-LPXTG (calculated mass 27,438.8), sortase-mediated ligation also forms many other side products of higher molecular weight in the presence of a free glycine. In contrast, the peptide with an N-terminal glycine is an excellent substrate. Sortase mediates almost exclusive ligation to form GFP-LPXTGRRNRRTSKLMLR (calculated mass 29,007.6). Surprisingly, sortase was able to use both spermine and NH2-PEG-COOH as substrates to form specific GFP conjugates. The ligation efficiencies of both spermine and NH2-PEG-COOH are better than that of a free glycine, but lower than that of a peptide containing an N-terminal glycine. Sortase-mediated side products were also found in both the spermine and NH2-PEG-COOH ligation reactions. These results suggest that the nucleophilic attack on the thioester intermediate prefers an amino glycine of a peptide, but it can also use an NH2-CH2- group on a non-peptidyl compound.
Example 6 Application of Sortase in Improving Protein Transduction [0116] Protein transduction domains (PTDs) are a class of cationic peptides able to facilitate efficient protein transduction in vitro and in vivo. Recombinant proteins with PTD fusions typically are expressed as aggregates or inclusion bodies in E. coli. In an effort to obtain soluble, active PTD fusion protein, sortase-mediated ligation was utilized to conjugate a protein with PTDs. The synthetic peptide RRQRRTSKLMKR (PTD5) has been shown to possess protein transduction activity, and was used as a PTD. While a fusion protein containing a single PTD5 sequence may be generated by recombinant expression, a protein containing more complex and efficient PTD moieties cannot be readily generated using either recombinant expression or chemical synthesis. The sortase-based conjugation method was used to generate a conjugate between the synthetic branched PTD peptide GGY-K-K(Ahx-PTD5)2 and the GFP-LPXTG-6His recombinant protein. A conjugate of the linear peptide GGY-PTD5 with GFP-LPXTG-6His also was prepared. It was determined that the conjugation efficiency with the branched peptide was similar to that with the linear peptide (GGY-PTD5). Over 90% of the protein was conjugated after 48 hours of incubation. The single linear PTD5- and branched PTD5-conjugated proteins were subsequently purified and incubated with NIH3T3 cells. After incubation, the cells were treated with trypsin to remove the surface-bound proteins and subsequently analyzed by flow cytometry to measure the fluorescence intensity of the internalized GFP. Cells incubated with the GFP-branched PTD conjugate contained significantly more GFP fluorescence (13-fold) than those incubated with the GFP-linear PTD conjugate. Together these results demonstrated that sortase could be used to generate biologically useful protein conjugates that are difficult or impossible to make otherwise.
Example 7 Application of a Sortase Variant to Protein Ligation Processes [0117] Nucleic acids encoding sortase B are prepared and isolated according to processes described in Mazmanian et al., Proc. Natl. Acad. Sci. USA 99: 2293-2298 (2002), US 2003/0153020 and documents referenced therein, from which sortase B enzyme is produced and isolated as described therein and above in Example 1. Sortase B is utilized in the processes described in Examples 2-6, with target proteins and peptides having an NPX1TX2 recognition sequence, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
Example 8 One Step Protein Purification Systems [0118] Described hereafter is a fusion system for one-step purification processes. A component of the fusion, sortase A (SrtA), is a transpeptidase found in the cell envelope of Staphylococcus aureus. In vivo, SrtA uses calcium as a cofactor to first cleave the Thr-Gly bond at an LPXTG recognition motif on a surface protein and subsequently to form a peptide bond between the threonine and a pentaglycine on the cell wall peptidoglycan. Several recombinant proteins with various N-terminal deletions in SrtA retain enzymatic activities in vitro: they slowly hydrolyze the LPXTG-containing substrate in the absence of an amino donor group and catalyze a transpeptidyl reaction in the presence of a triglycine (Gly3). One of the recombinant proteins is the catalytic core of SrtA (SrtAc, amino acids 60 to 206), which has been expressed at very high levels in Escherichia coli (> 75 mg of purified protein per liter of culture in shake flasks) and is highly soluble. IMAC purification of SrtAc showed that neither an N-terminal nor a C-terminal His6 tag appears to affect its catalytic activities. Furthermore, the cleavage activity is moderate (kcat/KM ~ 6.08 M^-1 s^-1) and inducible in the presence of calcium. 
This suggests it is probably suitable to apply a His6 tagged SrtAc in a fusion construct to achieve affinity purification and fusion processing via one-step chromatography. In addition, having the highly soluble and stable SrtAc at the N-terminus may enhance the fusion expression. Presented here is a purification scheme that combines affinity purification, SrtA cleavage, and separation of the fusion partner in a single IMAC chromatography step (e.g., Figure 2), and generates purified recombinant protein with an extra glycine only at its N-terminus.
Plasmid Construction [0119] pGHSL-emGFP: The gene encoding the SrtAc-LPETG-emGFP fusion was constructed using overlapping PCR. First, primers 1 and 2 (5'-GATATACATATGCAAGCTAAACCTCAAATTCCG-3' and 5'-GGATCCGGTTTCCGGAAGCTTTTTGACTTCTGTAGCTACAAAG-3', respectively) and template pET23b-SrtAc were used to PCR amplify the DNA sequence that encodes amino acids 60 to 206 of SrtA (GenBank Accession No. AF162687). An additional sequence encoding a KLPETGS linker was added at the 3' end of the SrtA gene. Primers 3 and 4 (5'-AAGCTTCCGGAAACCGGATCCATGGTGAGCAAGGGCG-3' and 5'-ATATACATATGGTGAGCAAGGGCG-3', respectively) and template pET24b-emGFP were used to PCR amplify a second DNA sequence encoding a KLPETGS linker and emerald GFP (emGFP). Subsequently, the first two PCR products were mixed together, and primers 1 and 4 were used to amplify the DNA sequence encoding the protein fusion SrtAc-KLPETGS-emGFP (SL-emGFP), which contains a Bam HI site between the coding sequences of SrtAc-LPETG and emGFP. The resulting DNA fragment was digested with Nde I and Xho I and then ligated into the pET15b vector (Novagen) for expressing the fusion protein GHSL-emGFP, which contains an N-terminal His6 tag, SrtAc, and the LPETG recognition site followed by the protein emGFP.
[0120] pAHSL-emGFP: Two oligos (5'-GATATACCATGGCCAGCAGCCATCATC-3' and 3'-GATGATGGCTGCTGGCCATGGTATATC-5') were used to make the Gly2 to Ala mutation of the fusion in the plasmid pGHSL-emGFP using a Quick Change Mutagenesis kit (Stratagene). The resulting plasmid was used for expressing the fusion protein AHSL-emGFP.
[0121] pGHS'L-emGFP: Two oligos (5'-AATGAAAAGACAGGCGTTGCGGAAAAACGTAAAATCTTT-3' and 3'-AAAGATTTTACGTTTTTCCGCAACGCCTGTCTTTTCATT-5') were used to mutate Trp194 of SrtA to Ala using the pGHSL-emGFP plasmid as a template. The resulting plasmid was used for expressing the fusion protein GHS'L-emGFP.
[0122] pAHS'L-emGFP: As for pGHS'L-emGFP, the same oligos were used to mutate Trp194 of SrtA to Ala, here using the pAHSL-emGFP plasmid as a template. The resulting plasmid was used for expressing the fusion protein AHS'L-emGFP.
[0123] pET15b-emGFP: Primers (5'-GGCAGCCATATGATGGTGAGCAAGGGCGAG-3' and 5'-CGGATCCTCGAGTCACTTGTACAGCTCGTCCATGC-3') were used to PCR amplify the DNA sequence encoding emGFP from a pET24b-emGFP vector. The PCR product was then digested with Nde I and Xho I and inserted into the pET15b vector to express a fusion protein, H-emGFP, that contains an N-terminal His6 tag and a thrombin cleavage site followed by emGFP.
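Site-directed mutagenesis of this kind relies on two oligos that anneal to opposite strands at the same position, so each oligo in a pair is expected to be the reverse complement of the other. The following Python sketch checks that property for the Gly2-to-Ala and Trp194-to-Ala oligo pairs given above. It treats each sequence exactly as printed, read left to right; the function and dictionary names are illustrative assumptions and are not part of the described method.

```python
# Minimal sketch: test whether the two oligos of each mutagenesis pair
# are reverse complements of each other (sequences copied as printed).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical labels for the two oligo pairs from paragraphs [0120]-[0121].
OLIGO_PAIRS = {
    "Gly2->Ala": (
        "GATATACCATGGCCAGCAGCCATCATC",
        "GATGATGGCTGCTGGCCATGGTATATC",
    ),
    "Trp194->Ala": (
        "AATGAAAAGACAGGCGTTGCGGAAAAACGTAAAATCTTT",
        "AAAGATTTTACGTTTTTCCGCAACGCCTGTCTTTTCATT",
    ),
}


def pairs_are_complementary(pairs: dict) -> dict:
    """Map each pair label to True if oligo 2 is the reverse complement of oligo 1."""
    return {name: reverse_complement(a) == b for name, (a, b) in pairs.items()}


if __name__ == "__main__":
    for name, ok in pairs_are_complementary(OLIGO_PAIRS).items():
        print(f"{name}: reverse-complementary = {ok}")
```

A check of this kind is a quick way to catch transcription errors in primer tables before oligos are ordered.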
[0124] Expression constructs for Cre: Plasmid pGHSL-Cre also was constructed by overlapping PCR in the same manner as pGHSL-emGFP, except that the emGFP coding sequence was replaced with that of bacteriophage P1 Cre (GenBank Accession No. MYPICRE) using primers 5 and 6 (5'-AAGCTTCCGGAAACCGGATCCATGTCCAATTTACTGACCGTAC-3' and 5'-TCCTTACTCGAGTTAATCGCCATCTTCCAGCAG-3', respectively) and template P24b-Tat-Cre in the first round of PCR, and primers 1 and 6 were used in the overlapping PCR to obtain the DNA sequence encoding the protein fusion SrtAc-LPETGS-Cre (SL-Cre). The resulting DNA fragment was digested with Nde I and Xho I and subsequently ligated into the pET15b vector.
[0125] Plasmids pAHSL-Cre, pGHS'L-Cre, and pAHS'L-Cre were generated through site-directed mutagenesis using approaches similar to those used for the emGFP constructs.
[0126] Expression constructs for p27: Plasmid pGHSL-p27 was constructed similarly to pGHSL-emGFP, except that primers 7 and 8 (5'-AAGCTTCCGGAAACCGGATCCATGTCAAACGTGCGAGTGT-3' and 5'-TCCTTACTCGAGTTACGTTTGACGTCTTCTGAGG-3', respectively) and template pTAT-HA-p27 were used to PCR amplify the human cyclin-dependent kinase inhibitor 1B (p27) (GenBank Accession No. NM_004064), and primers 1 and 8 were used to amplify the DNA sequence encoding the protein fusion SrtAc-LPETGS-p27 (SL-p27). The resulting DNA fragment was digested with Nde I and Xho I and subsequently ligated into the pET15b vector.
[0127] Plasmids pAHSL-p27, pGHS'L-p27, and pAHS'L-p27 were generated through site-directed mutagenesis using approaches similar to those used for the emGFP constructs.
Protein expression screening and cleavage analysis [0128] Plasmid pGHSL-emGFP was transformed into E. coli BL21(DE3) and the cell culture was grown at 37 degrees Celsius in 3 ml LB medium containing 50 mg/L ampicillin and 0.4% glucose. GHSL-emGFP expression was induced at 30 degrees Celsius with isopropyl beta-D-thiogalactoside (IPTG) at concentrations ranging from 25 micromolar to 1 millimolar for 1 to 5 h. Aliquots of cells were lysed in 1x SDS protein loading buffer and protein expression was analyzed on a NOVEX 4-20% Tris-Glycine SDS PAGE gel. Similar procedures also were applied to BL21(DE3) transformed with plasmid pAHSL-emGFP, pGHS'L-emGFP, or pAHS'L-emGFP to monitor the expression of AHSL-emGFP, GHS'L-emGFP, or AHS'L-emGFP, respectively.
[0129] For small scale purification and cleavage analysis, BL21(DE3) transformed with pGHSL-emGFP was grown in 200 mL of media at 37 degrees Celsius. When the OD600 reached approximately 1.0, protein expression was induced with 200 micromolar IPTG at 30 degrees Celsius for 3 hours. The cells were harvested and stored at -80 degrees Celsius. The cell paste was suspended in 5 ml Buffer A (20 mM Tris-HCl, pH 7.5, 50 mM NaCl, and 5 mM beta-mercaptoethanol [BME]) and lysed by sonication. Nucleic acids were precipitated by the addition of 0.1% polyethylene imine (PEI). 
The supernatant of the lysate was loaded onto a 1 ml Ni-NTA (Qiagen) column and washed with ten column volumes of Buffer B (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 30 mM imidazole, and 5 mM BME) followed by five column volumes of Buffer A. After washing, aliquots of the protein-bound resin were equilibrated with different cleavage buffers (+/- 5 mM CaCl2 and 0 to 10 mM Gly3 in Buffer A) and incubated at 4 degrees Celsius or 25 degrees Celsius for 2 to 6 h. All reactions were stopped by mixing the protein samples with an equal volume of 2x SDS protein loading buffer and boiling for 5 minutes. The samples were subsequently analyzed on a NOVEX 4-20% Tris-glycine PAGE gel (Invitrogen). Alternatively, the cleavage reactions were incubated at 25 degrees Celsius for 6 h, and the protein released by induced cleavage was collected in the supernatant. The resin was washed 5 times with Buffer A, and proteins that remained bound to the IMAC column were eluted with 500 mM imidazole in Buffer A. Samples of the cleavage flow-through and the IMAC elution were analyzed subsequently on a NOVEX 4-20% Tris-glycine SDS PAGE gel. The intensities of protein bands were quantified on an Alpha Innotech FluoChem 9900 imaging system.
Protein purification [0130] For emGFP, the protein was obtained from fusion cleavage of GHSL-emGFP, AHSL-emGFP, GHS'L-emGFP, or AHS'L-emGFP. Briefly, BL21(DE3) transformed with pGHSL-emGFP, pAHSL-emGFP, pGHS'L-emGFP, or pAHS'L-emGFP was grown in one liter of media. The cell growth, expression induction, and IMAC column purification conditions were scaled up from the procedures described above. After washing, the SrtAc-mediated cleavage was induced by equilibrating the column with Buffer C (20 mM Tris-HCl, pH 7.5, 50 mM NaCl, 5 mM beta-mercaptoethanol, 5 mM CaCl2, and 5 mM Gly3) and incubating at 25 degrees Celsius for 4 to 6 hours. The cleavage flow-through containing emGFP was collected at one-hour intervals. The protein purity was analyzed on an SDS-PAGE gel and the molecular weight of the protein was analyzed by MALDI-TOF mass spectroscopy (Bruker Daltonic Autoflex). Protein yield was quantified by a Bradford assay (BioRad) using bovine serum albumin (BSA) as a standard.
[0131] In addition, emGFP was purified from thrombin cleavage of the fusion protein H-emGFP. Briefly, plasmid pET15b-emGFP was transformed into BL21(DE3) to express the fusion protein H-emGFP. The cell growth, expression induction, and IMAC column purification conditions were similar to those for GHSL-emGFP. After column washing with Buffer B, the fusion protein was eluted from the IMAC column using Buffer A containing 200 mM imidazole. The pooled fractions then were exchanged into Buffer D (50 mM Tris-HCl, pH 7.5, 5 mM CaCl2, and 150 mM NaCl) through a 10 DG desalting column (BioRad) and concentrated to 5 mg/ml. The fusion was incubated with 20 units of restriction grade thrombin (Novagen) at 4 degrees Celsius overnight. The reaction mixture next was subjected to a second round of IMAC in which the His6 tag fragment and the uncleaved fusion were removed while the emGFP was collected in the column flow-through. No further purification was carried out to remove the thrombin from the purified emGFP. The protein purity was analyzed on an SDS-PAGE gel and the molecular weight of the protein was analyzed by MALDI-TOF mass spectroscopy.
[0132] For Cre, the recombinant free protein was obtained from expression, purification and fusion cleavage using pGHSL-Cre, pAHSL-Cre, pGHS'L-Cre, or pAHS'L-Cre. The expression and purification procedure was similar to that for emGFP.
[0133] For p27, the recombinant free protein was obtained from expression, purification and fusion cleavage using pGHSL-p27, pAHSL-p27, pGHS'L-p27, or pAHS'L-p27. The expression and purification procedure was similar to that for emGFP.
Construction of self-cleavable SrtAc fusion [0134] A prototype plasmid pGHSL-emGFP was made in the pET15b vector to express the fusion protein GHSL-emGFP with an N-terminal His6 tag, SrtAc, and an LPETG linker followed by emGFP at the C-terminus (Figure 3). At least three factors were taken into consideration in designing the self-cleavable fusion protein: fusion stability, affinity purification, and on-column cleavage. First, the fusion has an unconventional design in that it links SrtAc and its recognition sequence LPETG together with a single glycine spacer and places the recognition sequence near the SrtAc substrate-binding site. This design carries a potential risk concerning fusion stability. However, because the cleavage activity of SrtAc is inducible and moderate, it was expected that the fusion would be stable during protein expression. Second, the His6 tag, which is small and does not interfere with SrtAc activity, provides an affinity purification means for the fusion protein. Third, the buffer used for induced cleavage is compatible with an IMAC column. Provided that the LPETG recognition sequence is accessible in the fusion, it is anticipated that emGFP will be released from the immobilized fusion via endogenous SrtAc cleavage (inter- or intra-molecular SrtAc cleavage) upon induction. Therefore, GHSL-emGFP was used as a model system for studying the fusion expression and stability, the IMAC purification, and the fusion cleavage mediated by intra- or inter-molecular SrtAc cleavage at the LPETG recognition junction. Choosing emGFP as the first target provides the benefit of a visual marker to track the target protein during purification. Other protein targets also can be cloned into the fusion construct using the Bam HI sites flanking the emGFP gene in the pGHSL-emGFP plasmid (Figure 3). 
[0135] In addition to GHSL-emGFP, fusion variants also were prepared to address two sequence components that could potentially affect the stability of the fusion protein. The first concerns the N-terminal MGSS sequence encoded in the pGHSL-emGFP construct, because the endogenous E. coli aminopeptidase is efficient at removing the first methionine during post-translational processing. As a result, the new N-terminal glycine can act as an intra- or inter-molecular nucleophile to cleave emGFP from the fusion and potentially reduce the stability of GHSL-emGFP. Hence pAHSL-emGFP (Figure 3), which encodes the fusion protein AHSL-emGFP with a Gly2 to Ala mutation, was prepared by quick-change mutagenesis in pGHSL-emGFP. The second variant addresses the aforementioned concern that the wild type SrtAc might be too efficient at cleaving the LPETG sequence, such that neither the GHSL-emGFP nor the AHSL-emGFP fusion could accumulate in sufficient quantity. Thus, a Trp194* to Ala mutation, which reduces the transpeptidation activity of SrtAc, was incorporated into the constructs pGHS'L-emGFP and pAHS'L-emGFP to express GHS'L-emGFP and AHS'L-emGFP, respectively (Figure 3).
Expression of the emGFP fusion proteins [0136] To test whether the full length GHSL-emGFP fusion could be expressed in E. coli BL21(DE3) transformed with pGHSL-emGFP, the influence of the IPTG concentration on protein expression was evaluated. The results showed that expression of the full-length protein (approximately 47 kDa) could be induced with as little as 50 micromolar IPTG, and expression levels increased with up to 200 micromolar IPTG, an optimized concentration (data not shown). Further, a majority of GHSL-emGFP appeared in the soluble fraction (data not shown). Using the optimized IPTG concentration, the stability of GHSL-emGFP during protein expression over the course of 5 hours was investigated next. At 30 degrees Celsius the full-length protein continued to accumulate and was the dominant product throughout the induction. However, two cleavage products (the GHSL fragment and emGFP) and at least one high molecular weight ligation product ((GHSL)2-emGFP) clearly started to appear about 3 h post-induction. Raising the induction temperature not only accelerated the full-length protein expression but also expedited the cleavage process, whereas decreasing the temperature had the opposite effect. These observations suggest that the SrtAc activity is not suppressed during protein expression, and that fusion stability was likely affected by the new N-terminal glycine on GHSL-emGFP derived from post-translational modification. Nevertheless, a majority of the GHSL-emGFP was sufficiently stable during protein expression.
[0137] Having demonstrated that the full length GHSL-emGFP could be expressed in E. coli, the effect of the sequence variations on fusion stability was assessed. Whether replacing the N-terminal glycine with an alanine alleviates the cleavage during expression was determined first. Under similar induction conditions, more full-length AHSL-emGFP appeared to accumulate than GHSL-emGFP, and no obvious high molecular weight ligation products were observed. There also was an unexpected observation: one of the cleavage products (the AHSL fragment) migrated around 23 kDa, a position that was different from that of the GHSL fragment (which migrates around 20 kDa). The N-terminal glycine in GHSL likely performed a nucleophilic attack on its own LPETG sequence and formed a cyclized GHSL, which could migrate faster than the linear AHSL in the SDS-PAGE gel. Taken together, these results suggest that the Gly2 to Ala mutation eliminated instability caused by the N-terminal glycine of GHSL-emGFP. Some cleavage products still formed during AHSL-emGFP expression, reflecting the cleavage activity of wild type SrtAc. Whether the Trp194* to Ala mutation could suppress the SrtAc cleavage activity and enhance protein stability also was determined. The results showed that there were few observable cleavage products for either GHS'L-emGFP or AHS'L-emGFP during expression, even 7.5 h post-induction. These observations confirmed that the Trp194* to Ala mutation stabilized both GHS'L-emGFP and AHS'L-emGFP by significantly attenuating the cleavage activity of SrtAc.
Purification and cleavage of the emGFP fusion protein [0138] Using GHSL-emGFP as a model system, the feasibility of affinity purification and on-column cleavage of the fusion protein was investigated. Based on the color indication from emGFP, it was clear that GHSL-emGFP was highly soluble in the cell lysate and bound well to the IMAC column. The column flow-through and imidazole wash also contained some green color, a direct result of the free emGFP released by SrtAc cleavage during protein expression. The full length GHSL-emGFP was the major product purified from the IMAC column, although the high-molecular weight conjugates (GHSL)n-emGFP (n > 2) and the cleaved N-terminal portion GHSL also were purified.
[0139] To determine the efficiency of on-column cleavage, the effects of temperature, presence of calcium, and concentration of Gly3 were systematically investigated. Some conclusions could be drawn from the analyses of the protein bands. First, cleavage efficiency was generally higher at 25°C than at 4°C. More specifically, approximately 45% of the full length GHSL-emGFP was cleaved after about 2 h at 25°C in the presence of 5 mM calcium and 1 mM Gly3, and 20% was cleaved at 4°C under the same conditions (data not shown). In 6 h more than 90% of the full length GHSL-emGFP was cleaved at 25°C and approximately 70% was cleaved at 4°C. The addition of 5 mM calcium likely promoted cleavage, although SrtAc may have bound calcium from the cell lysate. The addition of Gly3 also facilitated the cleavage, noticeably by reducing the amount of higher molecular weight ligation products (GHSL)n-emGFP (n > 2) and pushing the equilibrium from cyclized GHSL (the band at 20 kDa) to the linearized GHSL-G3 (the band at 23 kDa). In all the reactions, only emGFP was eluted in the flow-through after cleavage, while the un-cleaved fusion and the GHSL segment remained bound to the IMAC resin. 
[0140] After emGFP had been obtained through on-column cleavage of GHSL-emGFP, the efficiency and feasibility of protein production were compared for the four fusion constructs (i.e., GHSL-, AHSL-, GHS'L-, and AHS'L-emGFP). Under the same expression and purification conditions, emGFP (observed mass 27,045 ± 16 Da, calculated mass 27,039 Da) with homogeneity up to approximately 98% was obtained from all four fusions. Purified protein yields differed among the four constructs. The Gly2 to Ala mutation reduced cleavage of AHSL-emGFP during protein expression and purification. The net result was an approximate 50% increase in emGFP yield by switching from GHSL-emGFP to AHSL-emGFP (Table 1). The emGFP yields for the Trp194* to Ala mutants (i.e., GHS'L-emGFP) were lower than those for the corresponding wild type after about 6 h of induced cleavage at 25°C (Table 1), despite the fact that more full length GHS'L-emGFP and AHS'L-emGFP were captured on IMAC columns. Even after 16 hours, approximately 20 to 30% of the fusions still remained un-cleaved (data not shown).
Table 1. Recombinant proteins expressed and purified by the sortase fusion method

                   Yields from the fusions (mg/L culture)*
Target protein     GHSL-     AHSL-     GHS'L-    AHS'L-
emGFP              25.5      36.0      15.1      17.0
Cre                0.8       1.6       n.d.      n.d.
p27                1.9       2.9       n.d.      n.d.

*The yields were obtained from 6 hours of on-column cleavage; n.d., not determined (Cre and p27 from the GHS'L or the AHS'L fusions).
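The relative yields can be recomputed directly from the Table 1 values. The short Python sketch below does so, expressing each construct's yield as a ratio to the GHSL yield for the same target protein. The dictionary layout and the function name are illustrative assumptions; only the numeric yields come from Table 1.

```python
# Minimal sketch: yield ratios relative to the GHSL construct, from Table 1.
# Values are mg of purified protein per liter of culture; entries reported
# as "n.d." in the table are simply omitted.
YIELDS_MG_PER_L = {
    "emGFP": {"GHSL": 25.5, "AHSL": 36.0, "GHS'L": 15.1, "AHS'L": 17.0},
    "Cre": {"GHSL": 0.8, "AHSL": 1.6},
    "p27": {"GHSL": 1.9, "AHSL": 2.9},
}


def fold_change(target: str, construct: str, reference: str = "GHSL") -> float:
    """Yield of `construct` divided by yield of `reference` for one target."""
    row = YIELDS_MG_PER_L[target]
    return row[construct] / row[reference]


if __name__ == "__main__":
    for target, row in YIELDS_MG_PER_L.items():
        for construct in row:
            if construct != "GHSL":
                ratio = fold_change(target, construct)
                print(f"{target:6s} {construct:6s} / GHSL = {ratio:.2f}")
```

With the tabulated values, the AHSL-to-GHSL ratio for emGFP is 36.0/25.5, or about 1.41, and the corresponding ratio for Cre is 2.0.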
Comparison of emGFP purifications using one-column versus a conventional multi-column chromatography method [0141] To characterize advantages of the one-step purification processes described herein, a pET15b-emGFP plasmid was constructed to express H-emGFP (an N-terminal His6 tag and emGFP fusion linked by a thrombin cleavage site), and the untagged emGFP was obtained by a typical thrombin cleavage method. In this approach, the H-emGFP fusion was first purified on an IMAC column and eluted using an imidazole solution. Following a desalting (or alternatively dialysis) step to remove the excess imidazole, the H-emGFP fusion was concentrated and treated with thrombin overnight. The mixture then was passed through a second IMAC column to remove the His6 tag and the un-cleaved fusion. However, separating the unmodified thrombin (33 kDa) from emGFP (27 kDa) was not routine. One solution was to use biotinylated thrombin, which can be subsequently removed by streptavidin agarose. As summarized in Table 2, the purified emGFP was collected within 6 hours from a single IMAC purification using the GHSL-emGFP fusion, whereas three or more chromatography steps and more than 24 hours were required using the H-emGFP fusion. In the end, purities of the purified untagged emGFP were similar. These results showed that time and reagents are saved by the SrtAc fusion method without jeopardizing protein purity.

Table 2. Comparison of two methods used for emGFP purification.

          GHSL-emGFP                         H-emGFP
Step 1    IMAC column                        IMAC column
Step 2    On-column cleavage and collect     Imidazole elution
          the cleavage flow-through
Step 3                                       DG-10 column desalting
Step 4                                       Concentrate the desalted fractions
Step 5                                       Thrombin cleavage overnight
Step 6                                       Removal of His6 tag and un-cleaved
                                             fusion on IMAC column
Step 7                                       Removal of thrombin
Application of self-cleavable fusions for protein purification [0142] To illustrate the utility of the self-cleavable SrtAc fusion, two other protein targets were tested using this method. Without any optimization, both Cre and p27 were separately expressed from four fusion constructs (i.e., GHSL-, AHSL-, GHS'L-, and AHS'L-Cre or -p27) under growth and induction conditions similar to those for the emGFP fusions. Results showed that full-length fusions were sufficiently stable for both Cre and p27, and the untagged proteins were successfully purified through one-step IMAC. The yields from the AHSL constructs were the highest for both Cre and p27 (Table 1), and the cleavage rates for the GHS'L and AHS'L constructs were lower than those for the respective GHSL and AHSL constructs. These results are consistent with the previous observations for the emGFP purification. The average yields for Cre and p27 both were lower than for emGFP, likely a direct reflection of the reduced expression efficiencies of the corresponding target proteins.
[0143] Demonstrated in this example is a protein purification process for generating free recombinant protein using a self-cleavable SrtAc fusion expressed in E. coli. Besides providing on-column cleavage, SrtAc at the N-terminus promoted expression in E. coli. This phenomenon is particularly exemplified in the case of emGFP. While emGFP without an N-terminal tag expresses poorly in E. coli (~ 1 to 2% of the total proteins in cell lysate), expression increased dramatically with the N-terminal SrtAc (> 30% of the total proteins). Small peptides (15 to 20 amino acids in length), which expressed poorly by themselves, also expressed at high levels when fused C-terminal to SrtAc (data not shown). Although SrtAc enhanced fusion expression, it did not affect fusion solubility. Of the three fusions tested, the emGFP and p27 fusions were very soluble and the Cre fusion was less soluble. Fusion solubility therefore likely is determined by the intrinsic properties of the target protein.
[0144] One question raised for the fusion constructs concerns the enzymatic properties of SrtAc. The target protein or peptide released from the fusion is a result of hydrolysis and transpeptidation catalyzed by SrtAc. By nature SrtAc can catalyze a transpeptidation between an LPXTG sequence and an aminoglycine-containing substrate. This property was demonstrated during expression of GHSL-emGFP (by cyclization of the GHSL or tandem additions) as well as other GHSL fusions. It is possible that some native E. coli proteins with an N-terminal glycine co-purified through this one-step purification method. These impurities, however, have not been identified. If this type of co-purification did occur, the emGFP purified from GHS'L- or AHS'L-emGFP was expected to have fewer contaminants because of the reduced transpeptidation activity. When the same quantities of emGFP purified from GHSL-, AHSL-, GHS'L- and AHS'L-emGFP were compared on an SDS-PAGE gel, however, no clear differences in protein purity were observed. Further, emGFP, Cre, and p27 purified from GHSL fusions were compared and no apparent protein contaminants of identical molecular weight were observed on the gel. One possibility was that the minute quantity of the contaminants could not be detected in the gels. Nonetheless, the purity of the protein obtained from this method was as high as 98% to 99%.
[0145] Because the fusion cleavage by SrtAc resembled a peptidase digestion, the purification method could yield multiple cleavage products for fusion proteins having internal SrtAc sites. The specificity of SrtA recognition has been extensively studied, however, and SrtA shows strong selectivity for the Leu, Pro, Thr, and Gly residues found in the LPXTG consensus motif. Since the target proteins used in this study had no internal LPXTG sites, non-specific cleavage was not observed. When a target protein does carry an internal recognition sequence, it is possible to overcome this limitation by simple mutagenesis to eliminate the internal site.
[0146] In accordance with the results described herein, a fusion system was constructed for purifying untagged recombinant proteins via a single chromatography step. The purity of the protein obtained from this method was 98% to 99%. The purification method was cost effective, time efficient, and generally applicable to a variety of protein targets.
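Whether a candidate target protein carries an internal recognition sequence can be checked from its amino acid sequence before cloning. The Python sketch below scans a one-letter protein sequence for the LPXTG motif and, optionally, for a broader pattern in which the first position is Leu, Ile, Val or Met and the fourth position is Thr, Ser or Ala. The example sequence is hypothetical and serves only to exercise the code; the function and variable names are likewise assumptions made for illustration.

```python
import re

# Lookahead patterns so that overlapping matches would also be reported.
LPXTG = re.compile(r"(?=(LP[A-Z]TG))")                # strict LPXTG motif
BROAD_MOTIF = re.compile(r"(?=([LIVM]P[A-Z][TSA]G))")  # relaxed positions 1 and 4


def find_sites(protein: str, pattern: re.Pattern = LPXTG) -> list:
    """Return (1-based start position, matched motif) for every site found."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical sequence, not taken from any real protein.
EXAMPLE = "MKLPETGSAAVPQAGLLPKTGE"

if __name__ == "__main__":
    print("LPXTG sites:", find_sites(EXAMPLE))
    print("Broad-motif sites:", find_sites(EXAMPLE, BROAD_MOTIF))
```

Any site reported inside the target protein would be a candidate for removal by mutagenesis, as described above, before the fusion is constructed.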
[0147] Each patent, patent application and other publication and document referenced herein is hereby incorporated by reference herein in its entirety in jurisdictions allowing such incorporation. Also, each of the priority documents, U.S. patent application no. 60/524,152, filed 20 November 2003, entitled "Protein and peptide ligation methods," naming Mao et al. as inventors, and U.S. patent application no. 60/613,344, filed 27 September 2004, entitled "One-step protein and peptide purification processes and products" and naming Mao as an inventor, and each document cited therein, is incorporated herein by reference in its entirety. Citation of the above publications or documents is not an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents.
[0148] Modifications may be made to the foregoing without departing from the basic aspects of the invention. Although the invention has been described in substantial detail with reference to one or more specific embodiments, those of ordinary skill in the art will recognize that changes may be made to the embodiments specifically disclosed in this application, and yet these modifications and improvements are within the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element(s) not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising", "consisting essentially of", and "consisting of" may be replaced with either of the other two terms. Thus, the terms and expressions which have been employed are used as terms of description and not of limitation, equivalents of the features shown and described, or portions thereof, are not excluded, and it is recognized that various modifications are possible within the scope of the invention. Embodiments of the invention are set forth in the following claims.

Claims

What is claimed:
1. A method for linking a molecule A to a protein or peptide B, which comprises contacting in a cell free system the molecule A, the protein or peptide B and a transamidase enzyme, wherein the molecule A comprises a NH2-CH2- moiety and the protein or peptide B comprises a transamidase recognition sequence, whereby the transamidase enzyme links the molecule A to the protein or peptide B.
2. The method of claim 1, wherein the transamidase enzyme is a sortase.
3. The method of claim 2, wherein the sortase is sortase A or sortase B.
4. The method of claim 1, wherein the transamidase recognition sequence comprises the amino acid sequence X1PX2X3G, wherein X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine.
5. The method of claim 4, wherein X2 is aspartate, glutamate, alanine, glutamine, lysine or methionine.
6. The method of claim 1, wherein the transamidase recognition sequence comprises the amino acid sequence NPX1TX2, wherein X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
7. The method of claim 1, wherein the molecule A comprises one or more glycines.
8. The method of claim 7, wherein the molecule A comprises between one and six glycines.
9. The method of claim 8, wherein the molecule A comprises three glycines.
10. The method of claim 1, wherein the molecule A comprises a protein or peptide.
11. The method of claim 10, wherein the molecule A comprises a protein or peptide selected from the group consisting of an antibody epitope, an antibody, a recombinant protein, a peptide comprising one or more D-amino acids, a peptide comprising all D-amino acids and a branched peptide.
12. The method of claim 1, wherein the molecule A comprises a moiety selected from the group consisting of a vitamin, biotin, thiamine, folate, avidin, streptavidin, a fluorescent molecule, a radioisotope, a phosphoryl moiety, a glycosyl moiety, an unnatural amino acid and a peptide mimetic molecule.
13. The method of claim 1, wherein the molecule A comprises a moiety selected from the group consisting of polyethylene glycol, a peptide nucleic acid, a deoxyribonucleic acid, a ribonucleic acid, spermine and puromycin.
14. The method of claim 1, wherein the molecule A comprises a toxin moiety selected from the group consisting of abrin, ricin A, pseudomonas exotoxin and diphtheria toxin.
15. The method of claim 1, wherein the molecule A comprises a transduction peptide having an amino acid sequence selected from the group consisting of a subsequence of HIV-tat, GGY- K-K(Ahx-RRQRRTSKLMKR)2, RRQRRTSKLMKR, and GYGRKKRRQRRR.
16. The method of claim 1, wherein the molecule A comprises a solid support.
17. The method of claim 16, wherein the solid support is selected from the group consisting of a glass slide, a glass bead, a silicon wafer and a resin.
18. The method of claim 1, wherein the molecule A is an amino acid sequence in the protein or peptide B, whereby the sortase enzyme cyclizes the protein or peptide.
19. The method of claim 1, wherein the ratio of the enzyme to the protein or peptide B is greater than 1:1000.
20. A method for linking a molecule A to a protein or peptide B, which comprises contacting in a system the molecule A, the protein or peptide B and a transamidase enzyme, wherein the molecule A comprises a NH2-CH2- moiety and is not a component of a bacterial cell wall in the system, and wherein the protein or peptide B comprises a transamidase recognition sequence, whereby the transamidase enzyme links the molecule A to the protein or peptide B.
21. The method of claim 20, wherein the system is cell free.
22. The method of claim 20, wherein the transamidase recognition sequence comprises the amino acid sequence X1PX2X3G, wherein X1 is leucine, isoleucine, valine or methionine; X2 is any amino acid; X3 is threonine, serine or alanine; P is proline and G is glycine.
23. The method of claim 20, wherein the transamidase recognition sequence comprises the amino acid sequence NPX1TX2, wherein X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline and T is threonine.
24. The method of claim 20, wherein the molecule A comprises three glycines.
25. The method of claim 20, wherein the molecule A comprises a protein or peptide selected from the group consisting of an antibody epitope, an antibody, a recombinant protein, a peptide comprising one or more D-amino acids, a peptide comprising all D-amino acids and a branched peptide.
26. The method of claim 20, wherein the molecule A comprises a moiety selected from the group consisting of a vitamin, biotin, thiamine, folate, avidin, streptavidin, a fluorescent molecule, a radioisotope, a phosphoryl moiety, a glycosyl moiety, an unnatural amino acid and a peptide mimetic molecule.
27. The method of claim 20, wherein the molecule A comprises a moiety selected from the group consisting of polyethylene glycol, a peptide nucleic acid, a deoxyribonucleic acid, a ribonucleic acid, spermine and puromycin.
28. The method of claim 20, wherein the molecule A comprises a toxin moiety selected from the group consisting of abrin, ricin A, pseudomonas exotoxin and diphtheria toxin.
29. The method of claim 20, wherein the molecule A comprises a transduction peptide having an amino acid sequence selected from the group consisting of a subsequence of HIV-tat, GGY- K-K(Ahx-RRQRRTSKLMKR)2, RRQRRTSKLMKR, and GYGRKKRRQRRR.
30. The method of claim 20, wherein the molecule A comprises a solid support.
31. The method of claim 30, wherein the solid support is selected from the group consisting of a glass slide, a glass bead, a silicon wafer and a resin.
32. The method of claim 20, wherein the ratio of the enzyme to the protein or peptide B is greater than 1:1000.
33. A fusion protein which comprises a solid phase association region, a target protein or target peptide, a peptidase, and a peptidase recognition sequence, wherein the peptidase is capable of cleaving the fusion protein at the peptidase recognition sequence.
34. The fusion protein of claim 32, which further comprises a linker sequence between the peptidase and the target protein or target peptide.
35. The fusion protein of any one of claims 33-34, which further comprises a sequence capable of exporting the fusion protein to an intracellular compartment near a host cell surface or capable of secreting the fusion protein outside of a host cell.
36. The fusion protein of any one of claims 33-35, wherein the peptidase is sortase A from Staphylococcus aureus (SrtAc), a fragment of SrtAc, SrtAc sequence variant, or SrtAc fragment sequence variant, which recognizes and cleaves a Thr-Gly bond in a Leu-Pro-Xaa-Thr-Gly sequence.
37. The fusion protein of any one of claims 33-36, wherein the SrtAc fragment is a catalytic core region that recognizes and cleaves a threonine-glycine bond in a Leu-Pro-Xaa-Thr-Gly sequence.
38. The fusion protein of any one of claims 33-37, wherein the SrtAc fragment is amino acids 60 to 206 of native SrtAc, or a sequence variant thereof.
39. The fusion protein of any one of claims 33-38, wherein the SrtAc sequence variant or SrtAc fragment sequence variant includes an amino acid substitution at Trp 194.
40. The fusion protein of any one of claims 33-39, wherein the solid phase association region comprises or consists of a polyhistidine sequence.
41. The fusion protein of any one of claims 33-40, wherein the amino acid sequence regions are in the following order, N-terminus to C-terminus: solid phase association region, peptidase, peptidase recognition sequence, and target protein or target peptide.
42. The fusion protein of any one of claims 33-41, which is bound to a solid support that specifically binds to the solid phase association region.
43. A nucleic acid which comprises a nucleotide sequence that encodes a fusion protein of any one of claims 33-41.
44. A nucleic acid which comprises a nucleotide sequence that encodes a fusion protein comprising a solid phase association region, a peptidase, a peptidase recognition sequence, and a region for inserting a target protein or target peptide sequence, wherein the peptidase is capable of cleaving the fusion protein at the peptidase recognition sequence.
45. A host cell which comprises a nucleic acid of any one of claims 43-44.
46. A host cell which comprises a fusion protein of any one of claims 33-41.
47. A kit which comprises a container including a nucleic acid that encodes a fusion protein of any one of claims 33-41, and optionally, instructions for producing and/or purifying the fusion protein encoded by the nucleic acid.
48. The kit of claim 47, which comprises one or more components for inserting a target protein or target peptide sequence into the nucleic acid selected from the group consisting of one or more oligonucleotides, a polymerase, a topoisomerase, one or more restriction enzymes and a ligase.
49. A kit of any one of claims 47-48, which comprises a solid support capable of binding the solid phase association region in the fusion protein expressed from the nucleic acid.
50. The kit of claim 49, wherein the solid support is an IMAC solid support.
51. A kit of any one of claims 47-50, which comprises an organism useful for expressing the fusion protein from the nucleic acid.
52. The kit of claim 51, wherein the organism is selected from the group consisting of a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells.
53. A kit of any one of claims 47-52, which comprises a reagent and/or instructions for inserting the nucleic acid into an organism.
54. A method for producing a fusion protein of any one of claims 33-41 in a host organism, which comprises contacting a host organism that comprises a nucleic acid encoding the fusion protein with conditions suitable for expressing the fusion protein.
55. The method of claim 54, wherein the host organism is selected from the group consisting of a strain of bacteria, a strain of yeast, a strain of fungi, a strain of insect cells and a strain of mammalian cells.
56. The method of any one of claims 54-55, wherein the host organism is in suspension.
57. The method of claim 56, wherein the fusion protein is produced at levels selected from the group consisting of 1 mg/L of cell culture or more, 2 mg/L of cell culture or more, 5 mg/L of cell culture or more, 10 mg/L of cell culture or more, 15 mg/L of cell culture or more, 20 mg/L of cell culture or more, 25 mg/L of cell culture or more, 30 mg/L of cell culture or more or 35 mg/L of cell culture or more.
58. A method for purifying a target protein or target peptide, which comprises: contacting a fusion protein of any one of the preceding claims with a solid support capable of specifically binding to the solid phase association region and collecting target protein or target peptide cleaved from the fusion protein bound to the solid support.
59. The method of claim 58, wherein the solid phase association region comprises a polyhistidine sequence and the solid support is an IMAC solid support.
60. The method of any one of claims 58-59, wherein the target protein or target peptide comprises an N-terminal glycine.
61. The method of any one of claims 58-60, which comprises producing the fusion protein in a host cell containing a nucleotide sequence that encodes the fusion protein.
62. The method of claim 61, which comprises contacting the host cell with a condition that releases the fusion protein from the host cell.
63. The method of claim 62, wherein the condition that releases the fusion protein from the host cell is lysis and/or osmotic stress.
64. The method of any one of claims 58-63, wherein the fusion protein is contacted with a substance that enhances protein cleavage by the protein cleavage region of the fusion protein.
65. The method of claim 64, wherein the fusion protein is contacted with a polyglycine substance and/or calcium ions.
66. The method of claim 65, wherein the polyglycine substance is triglycine.
67. The method of any one of claims 58-66, wherein the target protein released from the fusion protein is 95% or more pure.
68. The method of claim 67, wherein the target protein released from the fusion protein is 98% or more pure.
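
The recognition sequences recited in claims 4 and 6 (and again in claims 22 and 23) can be illustrated as simple sequence patterns. The following is a minimal sketch, not part of the claimed subject matter, assuming one-letter amino acid codes and the twenty standard amino acids for the "any amino acid" position; the names MOTIF_X1PX2X3G, MOTIF_NPX1TX2 and find_recognition_sequences are hypothetical and chosen only for illustration.

```python
import re

# The twenty standard amino acids in one-letter code (assumption: "any amino
# acid" at position X2 is restricted to these for the purpose of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# X1-P-X2-X3-G: X1 is L, I, V or M; X2 is any amino acid; X3 is T, S or A.
MOTIF_X1PX2X3G = re.compile(rf"[LIVM]P[{AMINO_ACIDS}][TSA]G")

# N-P-X1-T-X2: X1 is Q or K; X2 is N or G.
MOTIF_NPX1TX2 = re.compile(r"NP[QK]T[NG]")


def find_recognition_sequences(sequence):
    """Return (motif_name, start_index, matched_text) tuples, ordered by position.

    Only non-overlapping matches of each pattern are reported.
    """
    sequence = sequence.upper()
    hits = []
    for name, pattern in (("X1PX2X3G", MOTIF_X1PX2X3G), ("NPX1TX2", MOTIF_NPX1TX2)):
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])
```

For example, a sequence containing LPETG would be reported under the first pattern and one containing NPQTN under the second; a match indicates only that the sequence fits the recited pattern, not that a given enzyme will act on it.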
PCT/US2004/039045 2003-11-20 2004-11-19 Protein and peptide ligation processes and one-step purification processes WO2005051976A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US52415203P 2003-11-20 2003-11-20
US60/524,152 2003-11-20
US61334404P 2004-09-27 2004-09-27
US60/613,344 2004-09-27

Publications (2)

Publication Number Publication Date
WO2005051976A2 (en) 2005-06-09
WO2005051976A3 (en) 2005-09-29

Family

ID=34636510

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/039045 WO2005051976A2 (en) 2003-11-20 2004-11-19 Protein and peptide ligation processes and one-step purification processes

Country Status (1)

Country Link
WO (1) WO2005051976A2 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030022178A1 (en) * 1999-04-15 2003-01-30 Olaf Schneewind Identification of sortase gene
US20030153020A1 (en) * 1999-04-15 2003-08-14 Olaf Schneewind Identification of sortase gene
US6773706B2 (en) * 1999-04-15 2004-08-10 The Regents Of The University Of California Identification of sortase gene

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TON-THAT H. ET AL: 'Anchoring of Surface Proteins to the Cell Wall of Staphylococcus Aureus: Cysteine 184 and Histidine 120 of Sortase Form a Thiolate-Imidazolium Ion Pair For Catalysis' J BIOL CHEM vol. 277, 01 March 2002, pages 7447 - 7452, XP002988217 *
TON-THAT H. ET AL: 'Purification and characterization of sortase, the transpeptidase that cleaves surface proteins of Staphylococcus aureus at the LPXTG motif' PROC NATL ACAD SCI vol. 96, 26 October 1999, pages 12424 - 12429, XP002253866 *

Cited By (77)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007108013A3 (en) * 2006-03-22 2008-03-13 Nat Inst Immunology Novel bioconjugates as therapeutic agent and synthesis thereof
WO2007108013A2 (en) * 2006-03-22 2007-09-27 National Institute Of Immunology Novel bioconjugates as therapeutic agent and synthesis thereof
US8148321B2 (en) 2006-03-22 2012-04-03 National Institute Of Immunology Bioconjugates as therapeutic agent and synthesis thereof
US8663632B1 (en) 2008-03-13 2014-03-04 University Of Kentucky Research Foundation Compositions and methods for selectively targeting cancer cells using a thiaminase compound
US8940501B2 (en) 2009-01-30 2015-01-27 Whitehead Institute For Biomedical Research Methods for ligation and uses thereof
WO2010087994A2 (en) 2009-01-30 2010-08-05 Whitehead Institute For Biomedical Research Methods for ligation and uses thereof
WO2010087994A3 (en) * 2009-01-30 2011-02-17 Whitehead Institute For Biomedical Research Methods for ligation and uses thereof
US20110321183A1 (en) * 2009-01-30 2011-12-29 Whitehead Institute For Biomedical Research Methods for ligation and uses thereof
US9890204B2 (en) 2009-04-07 2018-02-13 Hoffmann-La Roche Inc. Trivalent, bispecific antibodies
US10640555B2 (en) 2009-06-16 2020-05-05 Hoffmann-La Roche Inc. Bispecific antigen binding proteins
US9676845B2 (en) 2009-06-16 2017-06-13 Hoffmann-La Roche, Inc. Bispecific antigen binding proteins
US11673945B2 (en) 2009-06-16 2023-06-13 Hoffmann-La Roche Inc. Bispecific antigen binding proteins
US9994646B2 (en) 2009-09-16 2018-06-12 Genentech, Inc. Coiled coil and/or tether containing protein complexes and uses thereof
WO2011056911A3 (en) * 2009-11-04 2011-10-06 Alnylam Pharmaceuticals, Inc. Compositions and methods for enhancing production of a biological product
WO2011056911A2 (en) * 2009-11-04 2011-05-12 Alnylam Pharmaceuticals, Inc. Compositions and methods for enhancing production of a biological product
US10106600B2 (en) 2010-03-26 2018-10-23 Roche Glycart Ag Bispecific antibodies
WO2011133704A3 (en) * 2010-04-20 2012-04-19 Whitehead Institute For Biomedical Researh Modified polypeptides and proteins and uses thereof
US9879095B2 (en) 2010-08-24 2018-01-30 Hoffman-La Roche Inc. Bispecific antibodies comprising a disulfide stabilized-Fv fragment
US11618790B2 (en) 2010-12-23 2023-04-04 Hoffmann-La Roche Inc. Polypeptide-polynucleotide-complex and its use in targeted effector moiety delivery
US10611825B2 (en) 2011-02-28 2020-04-07 Hoffmann La-Roche Inc. Monovalent antigen binding proteins
US10793621B2 (en) 2011-02-28 2020-10-06 Hoffmann-La Roche Inc. Nucleic acid encoding dual Fc antigen binding proteins
WO2012142659A1 (en) * 2011-04-19 2012-10-26 Baker Idi Heart And Diabetes Institute Holdings Limited Site-selective modification of proteins
US11028185B2 (en) 2011-06-28 2021-06-08 Whitehead Institute For Biomedical Research Using sortases to install click chemistry handles for protein ligation
US10081684B2 (en) 2011-06-28 2018-09-25 Whitehead Institute For Biomedical Research Using sortases to install click chemistry handles for protein ligation
US9688758B2 (en) 2012-02-10 2017-06-27 Genentech, Inc. Single-chain antibodies and other heteromultimers
WO2013122823A1 (en) 2012-02-13 2013-08-22 Bristol-Myers Squibb Company Enediyne compounds, conjugates thereof, and uses and methods therefor
WO2013124473A1 (en) * 2012-02-24 2013-08-29 Novartis Ag Pilus proteins and compositions
CN104321335A (en) * 2012-02-24 2015-01-28 诺华股份有限公司 Pilus proteins and compositions
WO2013177221A1 (en) * 2012-05-21 2013-11-28 Massachusetts Institute Of Technology Protein retrosplicing enabled by a double ligation reaction
US9731029B2 (en) 2012-05-21 2017-08-15 Massachusetts Institute Of Technology Protein retrosplicing enabled by a double ligation reaction
US10106612B2 (en) 2012-06-27 2018-10-23 Hoffmann-La Roche Inc. Method for selection and production of tailor-made highly selective and multi-specific targeting entities containing at least two different binding entities and uses thereof
US11407836B2 (en) 2012-06-27 2022-08-09 Hoffmann-La Roche Inc. Method for selection and production of tailor-made highly selective and multi-specific targeting entities containing at least two different binding entities and uses thereof
WO2014001324A1 (en) * 2012-06-27 2014-01-03 Hoffmann-La Roche Ag Method for selection and production of tailor-made highly selective and multi-specific targeting entities containing at least two different binding entities and uses thereof
US11421022B2 (en) 2012-06-27 2022-08-23 Hoffmann-La Roche Inc. Method for making antibody Fc-region conjugates comprising at least one binding entity that specifically binds to a target and uses thereof
US9862779B2 (en) 2012-09-14 2018-01-09 Hoffmann-La Roche Inc. Method for the production and selection of molecules comprising at least two different entities and uses thereof
US8980824B2 (en) 2013-02-14 2015-03-17 Bristol-Myers Squibb Company Tubulysin compounds, methods of making and use
US9382289B2 (en) 2013-02-14 2016-07-05 Bristol-Myers Squibb Company Tubulysin compounds, methods of making and use
WO2014126836A1 (en) 2013-02-14 2014-08-21 Bristol-Myers Squibb Company Tubulysin compounds, methods of making and use
US9109008B2 (en) 2013-02-14 2015-08-18 Bristol-Myers Squibb Company Tubulysin compounds, methods of making and use
US9688721B2 (en) 2013-02-14 2017-06-27 Bristol-Myers Squibb Company Tubulysin compounds, methods of making and use
US20160032346A1 (en) * 2013-03-15 2016-02-04 The Trustees Of The University Of Pennsylvania Sortase-mediated protein purification and ligation
US11156608B2 (en) * 2013-03-15 2021-10-26 The Trustees Of The University Of Pennsylvania Method for the site-specific covalent cross-linking of antibodies to surfaces
US9631218B2 (en) * 2013-03-15 2017-04-25 The Trustees Of The University Of Pennsylvania Sortase-mediated protein purification and ligation
JP2016511279A (en) * 2013-03-15 2016-04-14 エヌビーイー セラピューティクス アクチェン ゲゼルシャフト Method for producing immunoligand / payload complex
US20160041157A1 (en) * 2013-03-15 2016-02-11 The Trustees Of The University Of Pennsylvania Method for the site-specific covalent cross-linking of antibodies to surfaces
WO2014145441A1 (en) * 2013-03-15 2014-09-18 The Trustees Of The University Of Pennsylvania Sortase-mediated protein purification and ligation
US10077299B2 (en) 2013-04-25 2018-09-18 Abtlas Co., Ltd. Method for refining protein including self-cutting cassette and use thereof
JP2016519118A (en) * 2013-04-25 2016-06-30 スクリプス コリア アンティボディー インスティテュート Protein purification method including self-cleaving cassette and use thereof
RU2639527C2 (en) * 2013-04-25 2017-12-21 ЭбТЛАС КО., Лтд. Method of cleaning protein included in self-immolative tape and its application
KR101808223B1 (en) * 2013-04-25 2017-12-13 주식회사 앱틀라스 Method for refining protein including self-cutting cassette and use thereof
US11266695B2 (en) 2013-05-10 2022-03-08 Whitehead Institute For Biomedical Research In vitro production of red blood cells with sortaggable proteins
US11492590B2 (en) 2013-05-10 2022-11-08 Whitehead Institute For Biomedical Research Protein modification of living cells using sortase
US10471099B2 (en) 2013-05-10 2019-11-12 Whitehead Institute For Biomedical Research In vitro production of red blood cells with proteins comprising sortase recognition motifs
US10260038B2 (en) 2013-05-10 2019-04-16 Whitehead Institute For Biomedical Research Protein modification of living cells using sortase
WO2015023879A1 (en) 2013-08-14 2015-02-19 William Marsh Rice University Derivatives of uncialamycin, methods of synthesis and their use as antitumor agents
US10323099B2 (en) 2013-10-11 2019-06-18 Hoffmann-La Roche Inc. Multispecific domain exchanged common variable light chain antibodies
US11850216B2 (en) 2013-11-13 2023-12-26 Whitehead Institute For Biomedical Research 18F labeling of proteins using sortases
US10556024B2 (en) 2013-11-13 2020-02-11 Whitehead Institute For Biomedical Research 18F labeling of proteins using sortases
US10053683B2 (en) 2014-10-03 2018-08-21 Whitehead Institute For Biomedical Research Intercellular labeling of ligand-receptor interactions
WO2016077260A1 (en) 2014-11-10 2016-05-19 Bristol-Myers Squibb Company Tubulysin analogs and methods of making and use
US10633457B2 (en) 2014-12-03 2020-04-28 Hoffmann-La Roche Inc. Multispecific antibodies
WO2016115201A1 (en) 2015-01-14 2016-07-21 Bristol-Myers Squibb Company Heteroarylene-bridged benzodiazepine dimers, conjugates thereof, and methods of making and using
US10766923B2 (en) 2016-01-26 2020-09-08 The Regents Of The University Of California Methods and compositions to increase the rate of ligation reactions catalyzed by a sortase
WO2017132395A1 (en) * 2016-01-26 2017-08-03 The Regents Of The University Of California Methods and compositions to increase the rate of ligation reactions catalyzed by a sortase
WO2018035391A1 (en) 2016-08-19 2018-02-22 Bristol-Myers Squibb Company Seco-cyclopropapyrroloindole compounds, antibody-drug conjugates thereof, and methods of making and use
WO2018075842A1 (en) 2016-10-20 2018-04-26 Bristol-Myers Squibb Company Condensed benzodiazepine derivatives and conjugates made therefrom
WO2019036023A1 (en) 2017-08-16 2019-02-21 Bristol-Myers Squibb Company 6-amino-7,9-dihydro-8h-purin-8-one derivatives as immunostimulant toll-like receptor 7 (tlr7) agonists
WO2019035970A1 (en) 2017-08-16 2019-02-21 Bristol-Myers Squibb Company 6-amino-7,9-dihydro-8h-purin-8-one derivatives as immunostimulant toll-like receptor 7 (tlr7) agonists
WO2019035969A1 (en) 2017-08-16 2019-02-21 Bristol-Myers Squibb Company Toll-like receptor 7 (tlr7) agonists having a tricyclic moiety, conjugates thereof, and methods and uses therefor
WO2019035968A1 (en) 2017-08-16 2019-02-21 Bristol-Myers Squibb Company 6-amino-7,9-dihydro-8h-purin-8-one derivatives as toll-like receptor 7 (tlr7) agonists as immunostimulants
WO2019035971A1 (en) 2017-08-16 2019-02-21 Bristol-Myers Squibb Company 6-amino-7,9-dihydro-8h-purin-8-one derivatives as immunostimulant toll-like receptor 7 (tlr7) agonists
CN110196322A (en) * 2018-02-27 2019-09-03 广东志道医药科技有限公司 Organophosphate and carbamate pesticide method for detecting residue and its test strips and preparation method
WO2019209811A1 (en) 2018-04-24 2019-10-31 Bristol-Myers Squibb Company Macrocyclic toll-like receptor 7 (tlr7) agonists
WO2020028610A1 (en) 2018-08-03 2020-02-06 Bristol-Myers Squibb Company 2H-PYRAZOLO[4,3-d]PYRIMIDINE COMPOUNDS AS TOLL-LIKE RECEPTOR 7 (TLR7) AGONISTS AND METHODS AND USES THEREFOR
WO2020028608A1 (en) 2018-08-03 2020-02-06 Bristol-Myers Squibb Company 1H-PYRAZOLO[4,3-d]PYRIMIDINE COMPOUNDS AS TOLL-LIKE RECEPTOR 7 (TLR7) AGONISTS AND METHODS AND USES THEREFOR
CN114480534A (en) * 2022-02-24 2022-05-13 清华大学 Protein semi-synthesis based on chemical enzymatic methods of transpeptidase
CN114480534B (en) * 2022-02-24 2024-02-13 清华大学 Protein semisynthesis by chemoenzymatic methods based on transpeptidase

Also Published As

Publication number Publication date
WO2005051976A3 (en) 2005-09-29

Similar Documents

Publication Publication Date Title
WO2005051976A2 (en) Protein and peptide ligation processes and one-step purification processes
Antos et al. Site‐specific protein labeling via sortase‐mediated transpeptidation
Cesaratto et al. Tobacco Etch Virus protease: A shortcut across biotechnologies
Proft Sortase-mediated protein ligation: an emerging biotechnology tool for protein modification and immobilisation
David et al. Expressed protein ligation: Method and applications
Tsukiji et al. Sortase‐mediated ligation: a gift from gram‐positive bacteria to protein engineering
Popp et al. Site‐specific protein labeling via sortase‐mediated transpeptidation
JP7469285B2 (en) Use of nucleosome-interacting protein domains to enhance targeted genome modification
US20200140835A1 (en) Engineered CRISPR-Cas9 Nucleases
Evans et al. Mechanistic and kinetic considerations of protein splicing
Evans Jr et al. Intein‐mediated protein ligation: Harnessing nature's escape artists
Perler Protein splicing mechanisms and applications
JP7109547B2 (en) An engineered Cas9 system for eukaryotic genome modification
Mao A self-cleavable sortase fusion for one-step purification of free recombinant proteins
EP3274459A1 (en) Platform for non-natural amino acid incorporation into proteins
WO2016148044A1 (en) MODIFIED AMINOACYL-tRNA SYNTHETASE AND USE THEREOF
Steinhagen et al. Large scale modification of biomolecules using immobilized sortase A from Staphylococcus aureus
Girish et al. Site-specific immobilization of proteins in a microarray using intein-mediated protein splicing
WO2021185360A1 (en) Novel truncated sortase variants
KR20200017479A (en) Synthetic guide RNA for CRISPR/Cas activator systems
Sancheti et al. “Splicing up” drug discovery: Cell-based expression and screening of genetically-encoded libraries of backbone-cyclized polypeptides
JP5497295B2 (en) Microginin-producing protein, nucleic acid encoding microginin gene cluster, and method for producing microginin
US10407672B2 (en) Compositions and methods comprising the use of cell surface displayed homing endonucleases
FR2973032A1 (en) PEPTIDES CAPABLE OF FORMING A COVALENT COMPLEX AND THEIR USES
US20210163899A1 (en) Fusion proteins for the detection of apoptosis

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

WWW WIPO information: withdrawn in national office

Country of ref document: DE

122 EP: PCT application non-entry in European phase