DETECTION OF SUBSTRATE RECOGNITION OF PROTEIN KINASES AND PHOSPHATASES
Field of the Invention
This invention relates to a method for the detection and identification of substrate recognition sites for protein kinases and protein phosphatases.
Background of the Invention
Phosphatases and kinases play important metabolic roles. The ability to modulate their activity is a major focus of interest for the chemical industry. The ability to identify the substrate specificity of phosphatases and kinases is of importance inter alia for the design of inhibitors of these enzymes.
Protein kinases phosphorylate either tyrosine, serine or threonine residues of a protein or short peptide, whereas protein phosphatases dephosphorylate phosphotyrosine, phosphoserine or phosphothreonine residues of a protein or short peptide.
Methods used so far to screen substrates for phosphatases include analysing dephosphorylation by detecting a UV change, use of radioactive phosphate, colourimetric assay and ELISA.
Methods used so far to screen substrates for protein kinases include the use of radioactive phosphate. For example, the screening of a bead-bound peptide library, by incubation with a protein kinase and radioactive ATP, has been disclosed. This results in the transfer of radioactive phosphate to those peptides which are substrates for the kinase. The radioactive beads are then isolated and the peptides sequenced.
It is known that phosphopeptide libraries have been used in free solution to yield information about the sequence recognition of SH2 domains by affinity selection (Songyang et al., Cell 71:767-778 (1993)). Affinity selection has also been used to screen phosphopeptide libraries in solution against a catalytically-inactive mutant protein tyrosine phosphatase. A limitation in solution assays has been the difficulty in obtaining sequence information about single phosphopeptide sequences that are positive.
Summary of the Invention
A novel assay system has now been developed that can be utilised to screen a resin-bound or solution-phase phosphotyrosine peptide library for substrate turnover by a selected protein tyrosine phosphatase (PTP) or kinase. The present invention is capable of identifying individual hits from a catalysis-based screen. It has the ability to identify individual sequences rather than simply a positional preference for particular residues. Furthermore, it has the advantage of not requiring radioactive reagents.
It has been known for some time that the proteolytic enzyme α-chymotrypsin (CT) is selective for amide bond cleavage on the C-terminal side of Phe, Tyr or Trp (and to a lesser extent Leu or Ile) in peptides. The basis of this invention is the discovery that whilst a tyrosine-containing peptide is a good substrate for CT, the analogous phosphotyrosyl peptide is not.
Preferably, the phosphopeptide substrate is synthesised on a solid phase by standard methods. A non-cleavable (alkylamine) linker may typically be used.
Description of the Invention
In a principal embodiment of the invention, described (by way of example only) for phosphotyrosine phosphatases, a library of phosphotyrosyl peptides is screened with a phosphatase. Those phosphotyrosyl peptides which are substrates for the phosphatase will be dephosphorylated, converting the phosphotyrosine to tyrosine. Those phosphopeptides which are not substrates will not be dephosphorylated. This library, which now includes both phosphopeptides and dephosphorylated peptides, may then be treated with a second enzyme, α-chymotrypsin. α-Chymotrypsin is a proteolytic enzyme that preferentially cleaves peptides adjacent to aromatic residues, e.g. tyrosine residues. It cleaves adjacent to phosphotyrosine residues at a very much reduced rate compared with non-phosphorylated tyrosine. The discrimination is so high that, for the purposes of the screen, α-chymotrypsin can be considered not to cleave at phosphotyrosine.
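The two-enzyme logic described above can be sketched as a small model. This is a minimal illustration only, not the assay itself: the lower-case "y" notation for phosphotyrosine, the function names and the toy substrate rule are all assumptions introduced here, and the model treats chymotrypsin cleavage as all-or-nothing at unphosphorylated tyrosine, ignoring slower cleavage at other residues.

```python
# Illustrative model only. Peptides are one-letter strings in which a
# hypothetical lower-case "y" stands for phosphotyrosine and "Y" for tyrosine.

def dephosphorylate(peptide, is_substrate):
    """Convert pTyr ("y") to Tyr ("Y") if the phosphatase accepts the peptide."""
    return peptide.replace("y", "Y") if is_substrate(peptide) else peptide


def chymotrypsin_cleaves(peptide):
    """All-or-nothing assumption: only unphosphorylated Tyr is cleaved."""
    return "Y" in peptide


def screen(library, is_substrate):
    """Return the peptides cleaved after phosphatase, then chymotrypsin."""
    hits = []
    for peptide in library:
        treated = dephosphorylate(peptide, is_substrate)
        if chymotrypsin_cleaves(treated):
            hits.append(peptide)
    return hits


def toy_rule(peptide):
    """Hypothetical substrate rule, used purely for demonstration."""
    return peptide.startswith("E")


library = ["EAEQyLIPQQG", "GAPGyLIPQQG"]
hits = screen(library, toy_rule)
print(hits)
```

In this model a peptide is reported as a hit only when both steps succeed, which mirrors the point that cleavage reports on prior dephosphorylation.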
The action of α-chymotrypsin leaves the phosphopeptides unchanged, but cleaves those peptides which, as phosphopeptides, were substrates for the phosphatase. The cleavage of the peptide can be detected in several ways, e.g.
(i) by modification of the exposed terminal amine at the cleavage site, e.g. using a fluorescent or chromogenic reagent specific for amines, e.g. fluorescein, 5(&6)-carboxyfluorescein succinimidyl ester or 4-dimethylaminobenzene-4'-carboxylic acid (Dabsyl). It is important to ensure that there are no free amines in the peptide prior to treatment with α-chymotrypsin. This may be achieved by using a blocking agent such as Fmoc to block the amino terminus prior to treatment with the enzymes.
(ii) using a FRET (Fluorescence Resonance Energy Transfer) assay.
This methodology can be used in connection with a FRET assay system in several ways. For example, a fluorescence donor is placed at one side of the phosphorylated site and an acceptor is placed at the other side. If the peptide is dephosphorylated on tyrosine, it becomes a substrate for α-chymotrypsin. Treatment with α-chymotrypsin then results in a fluorescence signal as the peptide is cleaved and the donor and acceptor separate. This methodology is applicable whether the target peptide is attached to a solid support (e.g. a resin bead) or is in solution as either an individual compound or part of a mixture.
Preferred solid supports are Kieselguhr and PEGA (bis(2-acrylamidoprop-1-yl) polyethylene glycol cross-linked dimethyl acrylamide and acryloyl sarcosine methyl ester) resins, which give excellent substrate turnover efficiencies for CT, PTP and proteins as large as 140 kDa.
The cleaved peptide is selected and its sequence determined. This can be achieved in several ways, e.g.
(i) Sequencing (by conventional peptide sequencing methods, e.g. Edman degradation or mass spectrometry; see US Patent Specification No. 5,470,753) of a sequencing strand, e.g. a sequence which has been made in parallel to the original sequence but in which the phosphotyrosine residue is replaced by a dummy residue, e.g. glycine. This renders the sequencing strand non-cleavable by α-chymotrypsin. The cleaved target strand is not sequenced because the first cycle of sequencing is blocked by the fluorescent tag. This method is most appropriate when dealing with phosphopeptide libraries on solid support or single peptides in solution.
(ii) Sequencing of remaining, uncleaved peptide, if the cleavage reaction with the proteolytic enzyme is only run to partial cleavage. This identification method is most appropriate when dealing with single peptides in solution or on a solid support.
(iii) Isolating the cleaved peptide from mixtures in solution phase libraries. One method is to make orthogonal small libraries. Target peptides show up as hits in both libraries. The libraries containing the cleaved peptide can be identified by FRET, and the identity of the peptide determined by, e.g. tandem liquid chromatography mass spectrometry of the mixture.
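The orthogonal-library idea in (iii) can be illustrated with a short sketch. The text does not specify how the small libraries are constructed, so the grid arrangement below (row pools and column pools), the function names and the placeholder peptide labels are all assumptions made for illustration; the point shown is only that a peptide present in a positive pool of both sets is narrowed down by intersection.

```python
from itertools import product


def make_orthogonal_pools(peptides, n_cols):
    """Lay peptides out on a grid; rows and columns form the two pool sets."""
    rows, cols = {}, {}
    for index, peptide in enumerate(peptides):
        r, c = divmod(index, n_cols)
        rows.setdefault(r, set()).add(peptide)
        cols.setdefault(c, set()).add(peptide)
    return rows, cols


def candidates(rows, cols, positive_rows, positive_cols):
    """Peptides found in a positive pool of both the row and column sets."""
    found = set()
    for r, c in product(positive_rows, positive_cols):
        found |= rows[r] & cols[c]
    return found


# Hypothetical 12-member library with placeholder names
peptides = [f"pep{i}" for i in range(12)]
rows, cols = make_orthogonal_pools(peptides, n_cols=4)

# Suppose (for illustration) FRET flags row pool 1 and column pool 2
result = candidates(rows, cols, positive_rows=[1], positive_cols=[2])
print(result)
```

With several positive pools in each set the intersection returns more than one candidate, which is where a follow-up method such as the mass spectrometry mentioned above would be needed.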
In a further embodiment of the invention, this system may be extended for screening kinases by running them in the reverse direction (i.e. as phosphatases). This can be achieved by treating the phosphotyrosine peptide library with a protein tyrosine kinase and ADP or water, under conditions where the kinase reaction operates in the reverse direction, to dephosphorylate the phosphopeptide and also form ATP or phosphate. After treatment with CT, any cleaved dephosphorylated peptides would be detected as described above for the protein tyrosine phosphatase assay.
In principle, the use of a double enzyme system can be extended to other kinds of assay, e.g. assays for liposylation, ADP-ribosylation or glycosylation.
Certain limitations must be considered if α-chymotrypsin is used as the proteolytic enzyme. The natural amino-acids Phe, Tyr and Trp must be excluded from the peptides, since they may promote CT-mediated cleavage and thus potentially give false positives. Other hydrophobic side-chains such as Leu and Ile could pose a similar problem. However, the kinetics of CT cleavage at these residues will be significantly slower than at Tyr, so optimisation of the screen should minimise false positives.
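A design check along these lines can be sketched as follows. This is an illustrative helper, not part of the described method: the function name and the idea of passing the target position as a zero-based index are assumptions. It lists residues other than the target tyrosine at which chymotrypsin might cleave, separating the strongly preferred residues (Phe, Tyr, Trp) from the weaker ones (Leu, Ile).

```python
STRONG_CT_RESIDUES = set("FYW")  # Phe, Tyr, Trp: to be excluded from the peptides
WEAK_CT_RESIDUES = set("LI")     # Leu, Ile: slower cleavage, possible false positives


def ct_liabilities(peptide, target_index):
    """Return (strong, weak) lists of (position, residue), skipping the target.

    Positions are reported 1-based; target_index is the 0-based index of the
    (phospho)tyrosine under study.
    """
    strong, weak = [], []
    for i, residue in enumerate(peptide):
        if i == target_index:
            continue
        if residue in STRONG_CT_RESIDUES:
            strong.append((i + 1, residue))
        elif residue in WEAK_CT_RESIDUES:
            weak.append((i + 1, residue))
    return strong, weak


# Example: an 11-residue peptide with the target tyrosine at position 5
strong, weak = ct_liabilities("DADEYLIPQQG", target_index=4)
print(strong, weak)
```

For this example the helper reports no strongly preferred residues and flags Leu and Ile at positions 6 and 7 as weaker liabilities, the kind of case where screen optimisation is relied on.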
The following Example illustrates the invention, and shows that α-chymotrypsin preferentially cleaves non-phosphorylated peptides.
The following abbreviations are used:
DCM - Dichloromethane
DIPEA - Diisopropylethylamine
DMF - Dimethylformamide
DTT - Dithiothreitol
HOBt - N-hydroxybenzotriazole
pNPP - p-Nitrophenyl phosphate
PyBOP - Benzotriazol-1-yloxy-tris-pyrrolidino-phosphonium hexafluorophosphate
TFA - Trifluoroacetic acid
TIS - Triisopropylsilane
Example
The Example investigates the substrate specificity of leukocyte antigen receptor (LAR) phosphatase. The undecapeptide corresponding to the autophosphorylation site of the epidermal growth factor receptor (EGFR988-998, SEQ ID NO:1) has been shown to be a good substrate for Yersinia and mammalian PTPases, and was used as a prototype sequence for the combinatorial library. EGFR988-998 contains 3 acidic residues on the N-terminal side of the phosphorylated tyrosine (Tyr). It is noted that several PTPs exhibit a general requirement for >4 residues on the N-terminal side of the phosphotyrosine, of which at least one should bear an acidic side-chain.
To investigate possible PTP substrate sites, a peptide library was constructed, with each peptide having the general formula
Fmoc-X1-A-X2-X3-Y-L-I-P-Q-Q-G-resin
wherein X1, X2 and X3 represent alternative amino-acids in the peptide library, and Y is a phosphorylated tyrosine.
Batch Synthesis of phosphopeptide library
All batch peptide syntheses were carried out on a VacMaster Processing Station (Jones Chromatography). This essentially involved an upper polyfluorinated polyethylene (PFTE) block with 20 filtration columns connected through PFTE tubes (via Luer taps) to a lower vacuum chamber. The phosphopeptide library was synthesised on Kieselguhr resin using a "split and mix" approach, inserting one of 8 amino-acids (D, E, G, V, S, M, Q, P) in each of the positions designated X1, X2 and X3, to generate a 512-member library. All reactions were carried out at room temperature (20°C) unless otherwise stated.
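The library size follows from 8 choices at each of 3 variable positions, 8 x 8 x 8 = 512. The sketch below simply enumerates the sequences defined by the general formula to confirm that count; it does not model the split-and-mix chemistry itself, in which each bead carries a single sequence. The lower-case "y" for phosphotyrosine and the variable names are notational assumptions introduced here.

```python
from itertools import product

VARIABLE_RESIDUES = "DEGVSMQP"        # the 8 amino-acids used at X1, X2 and X3
TEMPLATE = "{x1}A{x2}{x3}yLIPQQG"     # "y" = phosphotyrosine (illustrative notation)

# Every combination of residues at the three variable positions
library = [
    TEMPLATE.format(x1=a, x2=b, x3=c)
    for a, b, c in product(VARIABLE_RESIDUES, repeat=3)
]

print(len(library))              # number of library members
print("DADEyLIPQQG" in library)  # phosphorylated form of the EGFR prototype
```

The prototype sequence is itself a member of the library because D and E are both in the variable pool.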
First, the Kieselguhr resin (400 mg) was swollen in DMF for 30 minutes, washed with two volumes of DMF and coupled to the first amino-acid (5-fold equivalent (5 eq)) using PyBOP (5 eq), HOBt (5 eq) and DIPEA (10 eq) in DMF (2 ml). The reaction was gently agitated from time to time. After 90 minutes the resin was washed (3 x DMF, 1 x DCM, 1 x DMF). A Kaiser test was taken before the last wash. If the test was positive, then the coupling step was repeated. Otherwise the resin was treated with 20% piperidine in DMF (2 x 5 minutes), washed as before and another Kaiser test taken. If the test proved positive, then the next residue in the sequence was coupled using the same conditions as before.
Incorporation of the "dummy" residue was achieved by coupling Fmoc-glycine (1 eq) for 60 minutes. The loading was checked by Fmoc analysis. This was consistently found to be 30%. Fmoc-phosphotyrosine was then double-coupled using a 7.5-fold equivalent each time. All residues after this position were also double-coupled, but using a 5-fold equivalent each time.
The pool of variable amino-acids was chosen to include acidic, hydrophobic, hydrophilic, large and small side-chains, in order to explore the requirement for carboxyl side-chains in these positions for LAR PTP.
After the incorporation of phosphotyrosine and the removal of the terminal Fmoc protecting group, the resin was lightly dried (final wash was with DCM) and poured onto a sheet of aluminium foil. It was distributed evenly between 8 reaction vessels (by weight) and the appropriate amino-acid coupling solution added to each reaction vessel as before. At the end of each coupling following a negative Kaiser result, the resins were combined for Fmoc deprotection and redistribution.
The terminal Fmoc was left on the sequences. The final library was air-dried and then treated with a cocktail of TFA/H2O/TIS (95:2.5:2.5) for 3 hours, washed thoroughly, dried and stored desiccated at 4°C.
Dephosphorylation
The library (40 mg) was swollen in buffer (30 mM Hepes, 150 mM NaCl, 6.25 mM DTT, pH 7.4, 500 μl) at 37°C for 30 minutes. LAR solution (200 units in 500 μl Hepes buffer, where 1 unit hydrolyses 1 nmole of pNPP per minute in Hepes buffer (pH 7.4) at 37°C) was added and the reaction (total volume 1 ml) incubated at 37°C for 60 minutes with occasional agitation. The resin was then washed (water, methanol, DCM, DMF).
Capping false hits
The DMF-equilibrated library was treated with a solution of Fmoc-glycine (5.3 mg), PyBOP (9.4 mg), HOBt (2.8 mg) and DIPEA (6.3 μl) in DMF (1 ml) for 60 minutes to cap any free amino termini, and washed (DMF, DCM, methanol, water).
CT cleavage
The library was allowed to equilibrate in buffer (20 mM Tris/HCl, 160 mM NaCl, pH 8.0) for 30 minutes, drained and treated with the CT solution (100 units, based on the supplier's value of 52 units per mg of crystallised protein, in 1 ml Tris buffer) for 45 minutes with occasional agitation. After 30 minutes, the library was washed (water, methanol, DCM, DMF).
Fluorescence labelling
The library was allowed to equilibrate in DMF, treated with the coupling solution of 5(&6)-carboxyfluorescein (2.5 mg), PyBOP (3.5 mg), HOBt (1 mg) and DIPEA (2.3 μl) freshly dissolved in DMF (500 μl), and allowed to react in the dark for 3 hours, before washing (DMF, DCM, DMF).
Fmoc removal
The DMF-equilibrated library was treated with a solution of 20% piperidine/DMF (2 x 5 minutes), washed (DMF, DCM) and air-dried.
Selection
The library was poured out onto a Petri dish and swollen in Tris buffer. The colour intensities of the beads were observed by eye. There was generally a background of pale yellow beads. Deep orange beads were deemed "hits" and removed for peptide sequencing.
Analysis of the beads showed that an estimated 4% of the library was lightly labelled, with <1% of the beads heavily labelled. Six of the heavily stained beads were sequenced to give the results shown in SEQ ID NOS:2-7. The peptides having SEQ ID NOS:2, 3 and 4 were resynthesised and assayed individually (resin-bound) with LAR PTP for release of Pi against time (Lanzetta et al., Anal. Biochem. (1979) 100, 95-97). The release profile for all three sequences was comparable with that of EGFR988-998. Furthermore, the initial rates of dephosphorylation were an estimated 30-fold greater than for the sequence Fmoc-GAPGYLIPQQG-resin (SEQ ID NO:8), which has no acidic residues in positions X1, X2 or X3, and did not label strongly under the screening conditions.
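The contrast in acidic content between the hits and the poorly labelled control can be tallied directly from the sequences. The sketch below is only a convenience for reading the results: the sequences are transcribed in one-letter code from SEQ ID NOS:1-8 of the Sequence Listing, and the dictionary layout and function name are assumptions introduced here.

```python
ACIDIC = {"D", "E"}               # Asp and Glu
VARIABLE_POSITIONS = (0, 2, 3)    # X1, X2 and X3 as zero-based indices

# One-letter transcriptions of SEQ ID NOS:1-8 (Tyr shown unphosphorylated)
sequences = {
    "SEQ ID NO:1": "DADEYLIPQQG",  # EGFR prototype
    "SEQ ID NO:2": "EAEQYLIPQQG",
    "SEQ ID NO:3": "EADQYLIPQQG",
    "SEQ ID NO:4": "DAEPYLIPQQG",
    "SEQ ID NO:5": "EAQPYLIPQQG",
    "SEQ ID NO:6": "EADEYLIPQQG",
    "SEQ ID NO:7": "EAQDYLIPQQG",
    "SEQ ID NO:8": "GAPGYLIPQQG",  # control with no acidic residues
}


def acidic_count(sequence):
    """Number of acidic residues found at the three variable positions."""
    return sum(sequence[i] in ACIDIC for i in VARIABLE_POSITIONS)


counts = {name: acidic_count(seq) for name, seq in sequences.items()}
for name, count in counts.items():
    print(name, count)
```

Every sequenced hit carries at least one acidic residue at X1, X2 or X3, whereas the control sequence carries none.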
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Cambridge University Technical Services Ltd.
(B) STREET: The Old Schools, Trinity Lane
(C) CITY: Cambridge
(D) STATE: N/A
(E) COUNTRY: United Kingdom
(F) POSTAL CODE (ZIP): CB2 1TS
(ii) TITLE OF INVENTION: DETECTION OF SUBSTRATE RECOGNITION OF PROTEIN KINASES AND PHOSPHATASES
(iii) NUMBER OF SEQUENCES: 8
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(v) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
Asp Ala Asp Glu Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Glu Ala Glu Gln Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Glu Ala Asp Gln Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Asp Ala Glu Pro Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
Glu Ala Gln Pro Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Glu Ala Asp Glu Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
Glu Ala Gln Asp Tyr Leu Ile Pro Gln Gln Gly
1               5                   10
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Gly Ala Pro Gly Tyr Leu Ile Pro Gln Gln Gly
1               5                   10