WO1999037816A1 - Methods for identifying therapeutic targets - Google Patents

Methods for identifying therapeutic targets

Publication number
WO1999037816A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
cells
sample
neoplastic
gene
Application number
PCT/US1999/001463
Other languages
French (fr)
Inventor
Bruce L. Roberts
Srinivas Shankara
Original Assignee
Genzyme Corporation
Application filed by Genzyme Corporation
Priority to AU23391/99A (patent AU756357B2)
Priority to JP2000528722A (patent JP2002500896A)
Priority to EP99903346A (patent EP1053349A4)
Priority to CA002319148A (patent CA2319148A1)
Publication of WO1999037816A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1072 Differential gene expression library synthesis, e.g. subtracted libraries, differential screening
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1079 Screening libraries by altering the phenotype or phenotypic trait of the host
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection

Definitions

  • This invention is in the fields of molecular biology, cell biology and immunology. More particularly, the invention uses techniques of functional genomics to correlate the phenotype of a cell with its pattern of gene expression and to identify new therapeutic targets.
  • This invention provides methods for identifying therapeutically-relevant genes which are expressed differentially in one cell with respect to another.
  • the present invention broadly provides a method for correlating the phenotype of a cell with its “functional genotype,” that is, the constellation of expressed sequences in that cell.
  • the invention provides a means for identifying therapeutically-relevant genes and gene products.
  • This invention also provides computer-related systems and methods. More specifically, the invention provides a system and method for automatically generating a data base of gene tags from cell samples and using the data base for filtering the tag counts from the samples into meaningful candidates for further testing and analysis.
  • the present invention provides a method for identifying a gene associated with a selected phenotype. Knowledge of the sequence of such a gene will also provide the skilled artisan with knowledge of the sequence and structure of the protein product(s) of the gene. In a preferred embodiment, the action of the gene and/or its product will be causative or involved in some way with respect to the selected phenotype.
  • PCR 2: A PRACTICAL APPROACH (M.J. MacPherson, B.D. Hames and G.R. Taylor, eds. (1995)) and ANIMAL CELL CULTURE (R.I. Freshney, ed. (1987)).
  • a cell includes a plurality of cells, including mixtures thereof.
  • polynucleotide and “nucleic acid molecule” are used interchangeably to refer to polymeric forms of nucleotides of any length.
  • the polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their analogs.
  • Nucleotides may have any three-dimensional structure, and may perform any function, known or unknown.
  • polynucleotide includes, for example, single-, double-stranded and triple helical molecules, a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
  • a nucleic acid molecule may also comprise modified nucleic acid molecules.
  • the term “differentially expressed” refers to nucleotide sequences in a cell or tissue which are either more or less expressed than a control cell, or expressed where silent in a control cell or not expressed where expressed in a control cell.
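  • The four cases in the definition of “differentially expressed” can be sketched as a small classifier. This is an illustrative sketch only: the function name, the use of raw transcript counts, and the label strings are assumptions, and no normalization or statistical testing is applied.

```python
def classify_expression(sample_count: int, control_count: int) -> str:
    """Label a transcript by comparing its count in a sample cell with a control cell.

    Counts are assumed to be raw, non-negative observation counts
    (a hypothetical input format; the source does not specify one).
    """
    if sample_count < 0 or control_count < 0:
        raise ValueError("counts must be non-negative")
    if sample_count > 0 and control_count == 0:
        return "expressed where silent in control"
    if sample_count == 0 and control_count > 0:
        return "not expressed where expressed in control"
    if sample_count > control_count:
        return "more expressed"
    if sample_count < control_count:
        return "less expressed"
    return "not differentially expressed"


if __name__ == "__main__":
    for pair in [(12, 0), (0, 7), (30, 10), (2, 9), (5, 5)]:
        print(pair, classify_expression(*pair))
```

  • In practice a comparison of this kind would be made on normalized counts with a significance test; the sketch only mirrors the wording of the definition.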
  • Oligonucleotide refers to polynucleotides of between about 5 and about 100 nucleotides of single- or double-stranded DNA. Oligonucleotides are also known as oligomers or oligos and may be isolated from genes, or chemically synthesized by methods known in the art.
  • a “gene” is a hereditary unit that, in the classical sense, occupies a specific position (locus) within the genome or chromosome; a unit that has one or more specific effects upon the phenotype of the organism; a unit that can mutate to various allelic forms; a unit that recombines with other such units.
  • Three classes of polynucleotides are now recognized: (1) structural genes that are transcribed into mRNAs, which are then translated into polypeptide chains, (2) structural polynucleotides that are transcribed into rRNA or tRNA molecules which are used directly, and (3) regulatory sequences that are not transcribed, but serve as recognition sites for enzymes and other proteins involved in DNA replication and transcription.
  • a “primer” refers to an oligonucleotide, usually single-stranded, that provides a 3'-hydroxyl end for the initiation of enzyme-mediated nucleic acid synthesis.
  • the primer sequence need not reflect the exact sequence of the template.
  • PCR primers refer to primers used in “polymerase chain reaction” or “PCR,” a method for amplifying a DNA base sequence using a heat-stable polymerase such as Taq polymerase, and two oligonucleotide primers, one complementary to the (+)-strand at one end of the sequence to be amplified and the other complementary to the (-)-strand at the other end.
  • PCR also can be used to detect the existence of the defined sequence in a DNA sample.
  • a “sequence tag” or “SAGE tag” is a short sequence, generally under about 20 nucleotides, that occurs in a certain position in messenger RNA. The tag can be used to identify the corresponding transcript and gene from which it was transcribed.
  • a “ditag” is a dimer of two sequence tags.
  • the term “cDNAs” refers to complementary DNA, that is, mRNA molecules present in a cell or organism made into cDNA with an enzyme such as reverse transcriptase.
  • a “cDNA library” is a collection of all of the mRNA molecules present in a cell or organism, all turned into cDNA molecules with the enzyme reverse transcriptase, then inserted into “vectors” (other DNA molecules which can continue to replicate after addition of foreign DNA).
  • Exemplary vectors for libraries include bacteriophage (also known as "phage"), viruses that infect bacteria, for example, lambda phage.
  • the library can then be probed for the specific cDNA (and thus mRNA) of interest.
  • immune effector cells refers to cells capable of binding an antigen and which mediate an immune response. These cells include, but are not limited to, T cells, B cells, monocytes, macrophages, NK cells and cytotoxic T lymphocytes (CTLs), for example CTL lines, CTL clones, and CTLs from tumor, inflammatory, or other infiltrates. Certain diseased tissue expresses specific antigens and CTLs specific for these antigens have been identified. For example, approximately 80% of melanomas express the antigen known as GP-100.
  • T-lymphocytes denotes lymphocytes that are phenotypically CD3+, typically detected using an anti-CD3 monoclonal antibody in combination with a suitable labeling technique.
  • the T-lymphocytes of this invention are also generally positive for CD4, CD8, or both.
  • restriction endonucleases and “restriction enzymes” refer to bacterial enzymes which bind to a specific double-stranded DNA sequence termed a recognition site or recognition nucleotide sequence, and cut double-stranded DNA at or near the specific recognition site.
  • Type IIS restriction endonucleases are those which cleave at a defined distance (up to 20 bases away) from their recognition sites. Endonucleases will be known to those of skill in the art (see for example, Current Protocols in Molecular Biology, Vol. 2, 1995, Ed. Ausubel et al, Greene Publish. Assoc. & Wiley Interscience, Unit 3.1.15; New England Biolabs Catalog, 1995).
  • a “naïve” cell is a cell that has never been exposed to an antigen.
  • the term “culturing” refers to the in vitro propagation of cells or organisms on or in media of various kinds. It is understood that the descendants of a cell grown in culture may not be completely identical (morphologically, genetically, or phenotypically) to the parent cell. By “expanded” is meant any proliferation or division of cells.
  • a "subject” is a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets.
  • “Host cell” or “recipient cell” is intended to include any individual cell or cell culture which can be or have been recipients for vectors or the incorporation of exogenous nucleic acid molecules, polynucleotides and/or proteins. It also is intended to include progeny of a single cell, and the progeny may not necessarily be completely identical (in morphology or in genomic or total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation.
  • the cells may be procaryotic or eucaryotic, and include but are not limited to bacterial cells, yeast cells, animal cells, and mammalian cells, e.g., murine, rat, simian or human.
  • An “antibody” is an immunoglobulin molecule capable of binding an antigen.
  • the term encompasses not only intact immunoglobulin molecules, but also anti-idiotypic antibodies, mutants, fragments, fusion proteins, humanized proteins and modifications of the immunoglobulin molecule that comprise an antigen recognition site of the required specificity.
  • antibody complex is the combination of antibody (as defined above) and its binding partner or ligand.
  • a native antigen is a polypeptide, protein or a fragment containing an epitope, which induces an immune response in the subject.
  • isolated means separated from constituents, cellular and otherwise, in which the polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, are normally associated with in nature. As is apparent to those of skill in the art, a non-naturally occurring polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, does not require “isolation” to distinguish it from its naturally occurring counterpart.
  • a “concentrated”, “separated” or “diluted” polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof is distinguishable from its naturally occurring counterpart in that the concentration or number of molecules per volume is greater (“concentrated”) or less (“separated”) than that of its naturally occurring counterpart.
  • a non-naturally occurring polynucleotide is provided as a separate embodiment from the isolated naturally occurring polynucleotide.
  • a protein produced in a bacterial cell is provided as a separate embodiment from the naturally occurring protein isolated from a eucaryotic cell in which it is produced in nature.
  • an “isolated” or “enriched” population of cells is “substantially free” of cells and materials with which it is associated in nature.
  • substantially free or “substantially pure” means at least 50% of the population are the desired cell type, preferably at least 70%, more preferably at least 80%, and even more preferably at least 90%.
  • composition is intended to mean a combination of active agent and another compound or composition, inert (for example, a detectable agent, solid support or label) or active, such as an adjuvant.
  • a “pharmaceutical composition” is intended to include the combination of an active agent with a carrier, inert or active, making the composition suitable for diagnostic or therapeutic use in vitro, in vivo or ex vivo.
  • the term "pharmaceutically acceptable carrier” encompasses any of the standard pharmaceutical carriers, such as a phosphate buffered saline solution, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents.
  • the compositions also can include stabilizers and preservatives.
  • for stabilizers and adjuvants, see Martin, REMINGTON'S PHARM. SCI., 15th Ed. (Mack Publ. Co., Easton (1975)).
  • An "effective amount” is an amount sufficient to effect beneficial or desired results. An effective amount can be administered in one or more administrations, applications or dosages.
  • the present method identifies a polynucleotide fragment of a gene that confers or is involved in conferring a selected phenotype to a sample cell, cells, or tissue or presenting a potential therapeutic target.
  • the method requires identifying a unique polynucleotide, the unique polynucleotide representing a gene that is differentially expressed in a sample cell compared to a control cell.
  • the gene corresponding to the unique polynucleotide is identified and cloned, thereby providing the sequence and identity of the gene that confers the selected phenotype on the sample cell, or that is associated with, but not necessarily causative of, the selected phenotype.
  • the unique polynucleotide can represent or correspond to or be a fragment of a gene that is differentially expressed, overexpressed or underexpressed in the sample cell compared to the control cell. More than one sample cell type can be compared to a single control cell, or alternatively, more than one control cell type can be compared to a single sample cell. Therapeutic targets can be identified using the methods disclosed herein.
  • the polypeptides and proteins encoded by these polynucleotides and genes can further be produced, isolated and characterized. In one embodiment, the method is useful for identifying one or more secreted biological factors and/or the gene(s) encoding the factor(s) or fragments thereof.
  • the method involves the steps of: providing one or more sample cells that secrete the factor and one or more control cells that do not secrete the factor; obtaining a set of polynucleotides representing gene expression in the sample cells; obtaining a set of polynucleotides representing gene expression in the control cells; and identifying one or more unique polynucleotides, the unique polynucleotides being common to the sample cells and the unique polynucleotides being absent or expressed at lower levels in the control cells. Finally, by determining the genes corresponding to the unique polynucleotides, one or more secreted biological factors are identified.
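  • The selection step described above (polynucleotides common to the sample cells and absent or expressed at lower levels in the control cells) can be sketched with set operations over tag counts. This is a hedged sketch: the dictionary input format, the function name, and the `min_fold` threshold are assumptions introduced for illustration, since the text does not say how much lower "lower levels" must be.

```python
from typing import Dict, List

# Hypothetical input format: tag sequence -> observed count in one cell population.
TagCounts = Dict[str, int]


def unique_polynucleotides(
    samples: List[TagCounts],
    controls: List[TagCounts],
    min_fold: float = 2.0,  # assumed threshold for "expressed at lower levels"
) -> List[str]:
    """Return tags seen in every sample and absent or lower in all controls."""
    if not samples:
        return []
    # Tags with a non-zero count in every sample population.
    common = set.intersection(
        *({tag for tag, n in s.items() if n > 0} for s in samples)
    )
    hits = []
    for tag in sorted(common):
        lowest_sample = min(s[tag] for s in samples)
        highest_control = max((c.get(tag, 0) for c in controls), default=0)
        # Keep the tag if no control expresses it, or if even the weakest
        # sample exceeds the strongest control by the assumed fold threshold.
        if highest_control == 0 or lowest_sample >= min_fold * highest_control:
            hits.append(tag)
    return hits


if __name__ == "__main__":
    samples = [
        {"AAAA": 10, "CCCC": 5, "GGGG": 3, "ACGT": 4},
        {"AAAA": 8, "CCCC": 6, "TTTT": 1, "ACGT": 2},
    ]
    controls = [{"AAAA": 9, "CCCC": 1}]
    print(unique_polynucleotides(samples, controls))
```

  • The tags returned by such a filter would then be mapped back to their genes, which is the final step the method describes.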
  • the practice of the invention can be applied to the identification of gene(s) that are relevant to any property that differs between one cell (a sample cell) and another (a control cell). Such properties may include, but are not limited to, disease state, infection, drug resistance, cytokine secretion, secretory protein expression, state of differentiation, growth regulation, consequences of exposure to external environmental stimuli, etc.
  • the practice of the invention can be applied to any cell type including, but not limited to, plants, animals and microorganisms.
  • sample cells include, but are not limited to, neoplastic cells; drug-resistant neoplastic cells; neoplastic cells which promote angiogenesis; de-differentiated cells; differentiated cells; apoptotic cells; hyperproliferative cells; cells infected with a pathogen or drug-resistant cells infected with a pathogen.
  • Cancers from which cells can be obtained for use in the methods of the present invention include carcinomas, sarcomas, leukemias, and cancers derived from cells of the nervous system. These include, but are not limited to: brain tumors, such as astrocytoma, oligodendroglioma, ependymoma, medulloblastomas, and Primitive Neural Ectodermal Tumor (PNET); pancreatic tumors, such as pancreatic ductal adenocarcinomas; lung tumors, such as small and large cell adenocarcinomas, squamous cell carcinoma and bronchoalveolar carcinoma; colon tumors, such as epithelial adenocarcinoma and liver metastases of these tumors; liver tumors, such as hepatoma and cholangiocarcinoma; breast tumors, such as ductal and lobular adenocarcinoma; gynecologic tumors, such as
  • Tumor cells are typically obtained from a cancer patient by resection, biopsy, or endoscopic sampling; the cells may be used directly, stored frozen, or maintained or expanded in culture. Samples of both the tumor and the patient's blood or blood fraction should be thoroughly tested to ensure sterility before co-culturing of the cells. Standard sterility tests are known to those of skill in the art and are not described in detail herein.
  • the tumor cells can be cultured in vitro to generate a cell line. Conditions for reliably establishing short-term cultures and obtaining at least 10^8 cells from a variety of tumor types are described in Dillman et al. (1993) J. Immunother. 14:65-69. Alternatively, tumor cells can be dispersed from, for example, a biopsy sample, by standard mechanical means before use.
  • Tumor cells can be obtained by any method known in the art. The following is an example of one method employed by skilled artisans. Using sterile technique, solid tumors (10-30 g) excised from a patient are dissected into 5 mm pieces which are immersed in RPMI 1640 medium containing 0.01% hyaluronidase type V, 0.002% DNAse type I, 0.1% collagenase type IV, 50 IU/ml penicillin, 50 µg/ml streptomycin and 50 µg/ml gentamycin. This mixture is stirred for 6 to 24 hours at room temperature, after which it is filtered through a coarse wire grid to exclude undigested tissue fragments.
  • the resultant tumor cell suspension is then centrifuged at 400 x g for 10 minutes.
  • the pellet is washed twice with Hanks balanced salt solution (HBSS) without Ca or Mg or phenol red, then resuspended in HBSS and passed through Ficoll-Hypaque gradients.
  • the gradient interfaces, containing viable tumor cells, lymphocytes, and monocytes, are harvested and washed twice more with HBSS.
  • the cells may be frozen for storage in a type-compatible human serum containing 10% (v/v) DMSO.
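  • The enzyme and antibiotic concentrations given for the tumor digestion medium can be scaled to a working volume with simple arithmetic. The sketch below assumes the percentages are weight/volume (g per 100 ml), which the text does not state; the function and constant names are hypothetical.

```python
# Concentrations taken from the digestion medium described in the text.
# ASSUMPTION: percentages are weight/volume (x% = x g per 100 ml).
PERCENT_WV = {
    "hyaluronidase type V": 0.01,
    "DNAse type I": 0.002,
    "collagenase type IV": 0.1,
}
MICROGRAMS_PER_ML = {"streptomycin": 50, "gentamycin": 50}
PENICILLIN_IU_PER_ML = 50


def digestion_mix(volume_ml: float) -> dict:
    """Return {component: (amount, unit)} for a given volume of RPMI 1640."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    amounts = {}
    for name, percent in PERCENT_WV.items():
        # x% w/v = x g per 100 ml = 10 * x mg per ml
        amounts[name] = (percent * 10 * volume_ml, "mg")
    for name, ug_per_ml in MICROGRAMS_PER_ML.items():
        amounts[name] = (ug_per_ml * volume_ml / 1000, "mg")
    amounts["penicillin"] = (PENICILLIN_IU_PER_ML * volume_ml, "IU")
    return amounts


if __name__ == "__main__":
    for component, (amount, unit) in digestion_mix(100).items():
        print(f"{component}: {amount:g} {unit}")
```

  • If the percentages are instead meant as volume/volume or as activity units, the conversion factor in the first loop would need to change accordingly.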
  • neoplastic cell refers to cells that have undergone a malignant transformation that makes them pathological to the host organism.
  • Primary cancer cells (that is, cells obtained from near the site of malignant transformation)
  • the definition of a cancer cell includes not only a primary cancer cell, but any cell derived from a cancer cell ancestor. This includes metastasized cancer cells, and in vitro cultures and cell lines derived from cancer cells.
  • a "clinically detectable" tumor is one that is detectable on the basis of tumor mass; e.g., by such procedures as CAT scan, magnetic resonance imaging (MRI), X-ray, ultrasound or palpation. Biochemical or immunologic findings alone may be insufficient to meet this definition.
  • MDR multi-drug resistance
  • a drug-resistant cancer cell, for the purposes of the present invention, includes a cell which is resistant to a single antitumor chemotherapeutic agent, as well as a cell
  • Cytotoxic drugs as antitumor chemotherapeutic agents can be subdivided into several broad categories, including: 1) alkylating agents, such as mechlorethamine, cyclophosphamide, melphalan, uracil mustard, chlorambucil and carmustine; 2) antimetabolites such as methotrexate, fluorouracil, azaribine, mercaptopurine, thioguanine and adenine arabinoside; 3) natural product derivatives such as vinblastine, vincristine, doxorubicin, bleomycin, etoposide, teniposide and mitomycin-C; and 4) miscellaneous agents, such as hydroxyurea, procarbazine and mitotane.
  • Sample cells further include neoplastic cells which promote angiogenesis.
  • Angiogenic factors include the CXC family of chemokines (Arenberg et al. (1997) J. Leukocyte Biol. 62:554-562),
  • Sample cells also include those expressing an antigen, or those which specifically recognize an antigen and which induce an immune response such as a T-cell.
  • Sample cells also include antigen expressing cells such as “antigen presenting cells” or “APCs”, which include both intact whole cells as well as other molecules which are capable of inducing the presentation of one or more antigens, preferably in association with class I MHC molecules.
  • suitable APCs include, but are not limited to, whole cells such as macrophages, dendritic cells, B cells; purified MHC class I molecules complexed to β2-microglobulin; and foster antigen presenting cells.
  • Foster antigen presenting cells refers to any modified or naturally occurring cell (wild-type or mutant) with antigen presenting capability that is utilized in lieu of antigen presenting cells (“APC”) that normally contact the immune effector cells they are to react with. In other words, it is any functional APC that T cells would not normally encounter in vivo.
  • Foster antigen presenting cells can be derived as follows.
  • the human cell line 174xCEM.T2, referred to as T2, contains a mutation in its antigen processing
  • T2 cells are what will be referred to as “foster” APCs.
  • Sample cells include those transduced with a polynucleotide. The term “polynucleotide” refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes double- and single-stranded DNA and RNA.
  • DNA includes not only bases A, T, C, and G, but also includes any of their analogs or modified forms of these bases, such as methylated nucleotides, internucleotide modifications such as uncharged linkages and thioates, use of sugar analogs, and modified and/or alternative backbone structures, such as polyamides.
  • polynucleotides which encode one or more proteins, or which can be transcribed to generate antisense RNA or a ribozyme.
  • Suitable methods for manipulation of polynucleotides include those described in a variety of references, including, but not limited to, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Vol. 1-3, eds. Sambrook et al. Cold Spring Harbor Laboratory Press (1989); and CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, eds. Ausubel et al., Greene Publishing and Wiley-Interscience: New York (1987) and periodic updates.
  • any method in the art can be used for the transformation, or insertion, of an exogenous polynucleotide into a host cell, for example, lipofection, transduction, infection or electroporation, using either purified DNA, viral vectors, or DNA or RNA viruses.
  • the exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.
  • Sample cells include those infected with a pathogen.
  • Pathogen includes any microorganism which is potentially harmful to a cell, including prokaryotes, viruses and single-celled eukaryotes.
  • pathogens include, but are not limited to viruses such as human immunodeficiency virus, Epstein-Barr virus; fungi; bacteria capable of infecting mammalian cells, such as Chlamydia spp., Legionella pneumophila, Mycobacterium spp. (Sinai and Joiner (1997) Ann. Rev. Microbiol. 51:415-462), Salmonella typhosa, Brucella abortus; protozoan parasites such as Toxoplasma gondii, Leishmania donovani, Trypanosoma cruzi, malarial plasmodia.
  • the practice of the invention involves comparison of polynucleotides corresponding to expressed genes between a sample cell and a control cell.
  • the selection of the appropriate cell or cell type is dependent on the sample cell initially selected and the phenotype of the sample cell which is under investigation.
  • when the sample cell is a neoplastic cell, one or more counterparts, such as another neoplastic cell or non-neoplastic precursors of the sample cell, can be used as control cells.
  • Counterparts would include, for example, cell lines established from the same or related cells to those found in the sample cell population.
  • the control cell can be any of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
  • when a sample cell is selected based on the expression of a gene coding for a peptide which participates in recognition of the sample cell by an immune effector cell, e.g., an antigen presenting cell, a suitable control cell is one which has a compatible MHC complex but does not express the antigen.
  • control cells are compatible for lysis by a cytotoxic T-lymphocyte for example, but are not lysed by the cytotoxic T-lymphocyte.
  • a control cell is one that does not secrete the factor.
  • polynucleotide includes SAGE tags (defined above) as well as any other nucleic acid obtained from methods that yield quantitative/comparative gene expression data. Such methods include, but are not limited to cDNA subtraction, differential display and expressed sequence tag methods.
  • a further method utilizes differential display coupled with real-time PCR and representational difference analysis (described in Lisitsyn and Wigler (1995) Meth. Enzymol. 254:291-304).
  • Another approach is the technology known as Serial Analysis of Gene Expression (SAGE, described in U.S. Patent No. 5,695,937). Using SAGE, sequence tags (tags being used synonymously with polynucleotides) corresponding to expressed genes can be analyzed.
  • sequence tags or polynucleotides corresponding to the expressed genes are prepared essentially as follows. First, a sample containing the genes of interest is provided. Suitable sources of samples include cells, tissue, cellular extracts or the like. Preferably, the sample is taken from an individual having a particular disease state of interest or at a particular stage in its development.
  • cDNA Complementary DNA
  • cDNA is then isolated from the sample, for example using methods known to those skilled in the art.
  • the cDNA is synthesized from mRNA using a biotinylated oligo(dT) primer.
  • Smaller fragments of cDNA are then created using a restriction endonuclease, preferably one that would be expected to cleave most transcripts at least once.
  • a 4-base pair recognition site enzyme is used.
  • More than one restriction endonuclease can also be used, sequentially or in tandem.
  • the cleaved cDNA can then be isolated by binding to a capture medium for the label attached to the primer described above.
  • streptavidin beads are used to isolate the defined 3' nucleotide sequence polynucleotide when the oligo dT primer for cDNA synthesis is biotinylated.
  • Other capture systems (e.g., biotin/streptavidin, digoxigenin/anti-digoxigenin)
  • the isolated defined nucleotide sequence polynucleotides are separated into two pools of cDNA. Each pool is ligated using the appropriate linkers.
  • the linkers can be the same or different, although when the linkers have the same sequence, it is not necessary to separate the polynucleotides into pools.
  • the first oligonucleotide linker comprises a first sequence for hybridization of a PCR primer and the second oligonucleotide linker comprises a second sequence for hybridization of a PCR primer.
  • the linkers further comprise a second restriction endonuclease site.
  • the linkers are designed so that cleavage of the ligation products with the second restriction enzyme results in release of the linker having a defined nucleotide sequence polynucleotide (e.g., 3' of the restriction endonuclease cleavage site).
  • the defined nucleotide sequence polynucleotide may be from about 6 to 30 base pairs.
  • the polynucleotide is about 9 to 11 base pairs.
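  • The tag-generation steps above (cleavage with a 4-base recognition site enzyme, capture of the 3' fragment through the biotinylated oligo(dT) primer, release of a short defined sequence) can be sketched in silico. The recognition site `CATG` and the default tag length of 10 are illustrative assumptions not specified in the text; a 4-base site is expected about once every 4^4 = 256 bp in random sequence with equal base frequencies.

```python
from typing import Optional


def extract_tag(cdna: str, site: str = "CATG", tag_len: int = 10) -> Optional[str]:
    """Return the tag immediately 3' of the 3'-most recognition site.

    `site` and `tag_len` are assumed example values; the text only calls for
    a 4-base recognition site and a tag of about 9 to 11 base pairs.
    Returns None if the site is absent or too close to the 3' end.
    """
    cdna = cdna.upper()
    # The captured fragment is the one attached to the 3' oligo(dT) primer,
    # so the relevant cut is the last occurrence of the site.
    position = cdna.rfind(site)
    if position == -1:
        return None
    start = position + len(site)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None


if __name__ == "__main__":
    transcript = "GGCATGAAACCCGGGTTTCATGACGTACGTAATTTTTAAAAAAAA"
    print(extract_tag(transcript))
```

  • Transcripts lacking the site yield no tag, which is why the text prefers an enzyme expected to cleave most transcripts at least once.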
  • a ditag (i.e., the dimer of two sequence tags)
  • the second restriction endonuclease cleaves at a site distant from or outside of the recognition site.
  • the second restriction endonuclease can be a type IIS restriction enzyme.
  • Type IIS restriction endonucleases cleave at a defined distance up to 20 bp away from their
  • the ditag (ligated tag pair) having a first restriction endonuclease site upstream (5') and a first restriction endonuclease site downstream (3') of the ditag; a second restriction endonuclease cleavage site upstream and downstream of the ditag, and a linker oligonucleotide containing both a second restriction enzyme recognition site and an amplification primer hybridization site upstream and downstream of the ditag.
  • the ditag is flanked by the first restriction endonuclease site, the second restriction endonuclease cleavage site and the linkers, respectively.
  • the ditag can be amplified by utilizing primers which specifically hybridize to one strand of each linker.
  • the amplification is performed after the ditags have been ligated together using standard polymerase chain reaction (PCR) methods as described for example in U.S. Patent No. 4,683,195.
  • the ditags can be amplified by cloning in prokaryotic-compatible vectors or by other amplification methods known to those of skill in the art. Those of skill in the art can prepare similar primers for amplification based on the nucleotide sequence of the linkers without undue experimentation. Cleavage of the amplified PCR product with the first restriction endonuclease allows isolation of ditags which can then be concatenated by ligation. After ligation, it may be desirable to clone the concatemers, although it is not required. Analysis of the ditags or concatemers, whether or not amplification was performed, can be performed by standard sequencing methods. Concatemers generally consist of about 2 to 200 ditags and preferably from about
  • the number of ditags which can be concatenated will depend on the length of the individual tags and can be readily determined by those of skill in the art without undue experimentation.
  • multiple tags can be cloned into a vector for sequence analysis, or alternatively, ditags or concatemers can be directly sequenced without cloning by methods known to those of skill in the art, either manually or using automated methods.
  • the standard procedure for cloning the defined nucleotide sequence tags of the invention is insertion of the tags into vectors such as plasmids or phage.
  • the ditag or concatemers of ditags produced by the method described herein are cloned into recombinant vectors for further analysis, e.g. , sequence analysis, plaque/plasmid hybridization using the tags as probes, by methods known to those of skill in the art.
  • Vectors in which the ditags are cloned can be transferred into a suitable host cell.
  • "Host cells" are cells in which a vector can be propagated and its DNA expressed. The term also includes any progeny of the subject host cell.
  • progeny may not be identical to the parental cell since there may be mutations that occur during replication. However, such progeny are included when the term "host cell" is used.
  • Methods of stable transfer, meaning that the foreign DNA is continuously maintained in the host, are known in the art. Transformation of a host cell with a vector containing ditag(s) may be carried out by conventional techniques as are well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the CaCl2 method using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed by electroporation or other commonly used methods in the art.
  • the individual tags or ditags can be hybridized with oligonucleotides immobilized on a solid support (e.g., nitrocellulose filter, glass slide, silicon chip).
  • either the ditags or oligonucleotide probes are labeled with a detectable label, for example, with a radioisotope, a fluorescent compound, a bioluminescent compound, a chemi-luminescent compound, a metal chelator, or an enzyme.
  • PCR can be performed with labeled (e.g., fluorescein tagged) primers.
  • the ditags are separated into single-stranded molecules which are preferably serially diluted and added to a solid support (e.g., a silicon chip as described by Fodor et al. Science 251:767, 1991) containing oligonucleotides representing, for example, every possible permutation of a 10-mer (e.g., in each grid of a chip).
  • the solid support is then used to determine differential expression of the tags contained within that support (e.g., on a grid on a chip) by hybridization of the oligonucleotides on the solid support with tags produced from cells under different conditions (e.g., different stages of development, growth of cells in the absence and presence of a growth factor, normal versus transformed cells, comparison of different tissue expression, etc.).
  • with fluoresceinated end-labeled ditags, analysis of fluorescence is indicative of hybridization to a particular 10-mer.
  • the immobilized oligonucleotide is fluoresceinated, for example, a loss of fluorescence due to quenching (by the proximity of the hybridized ditag to the labeled oligo) is observed and is analyzed for the pattern of gene expression.
  • After the polynucleotide information is obtained, it is analyzed to identify polynucleotides that correspond to genes that are differentially expressed between the two or more cell types. It is within the scope of this invention to perform the method described above using previously identified and stored sequence information that defines and identifies expressed genes. This information can be obtained from private, publicly available and commercially available sequence databases.
  • a cell or tissue is selected for having a phenotype which is dependent on the presence of one gene product within the sample cells, e.g., cells that secrete a biological factor whose activity can be measured in an in vitro assay, cells that stain with an antibody that recognizes a specific antigen, or cells that are lysed by cytotoxic T cells that recognize a specific antigen; the cells are further selected to identify sample cells that exhibit extremes of the chosen phenotype and ideally are matched in all other respects or phenotypic characteristics.
  • test cells that are matched, e.g., from the same individual, would minimize having to deal with histocompatibility differences. Ideally one selects two examples of sample cells (say "A" and "B") that exhibit the chosen phenotype prominently and two examples of sample cells (say "C" and "D") that do not have the phenotype at all.
  • polynucleotides present in a library form from each cell sample are isolated and their relative expression noted.
  • the individual libraries are sequenced and the information regarding sequence and in some embodiments, relative expression, is stored in any functionally relevant program, e.g., in Compare Report using the SAGE software (available through Dr. Ken Kinzler at Johns Hopkins University).
  • the Compare Report provides a tabulation of the polynucleotide sequences and their abundance for the samples (say A, B, C and D above) normalized to a defined number of polynucleotides per library (say
  • GroupNormal = Normal1 + Normal2
  • GroupTumor = PrimaryTumor1 + TumorCellLine. Additional characteristic values are also calculated for each tag in the group (e.g., average count, minimum count, maximum count).
  • the researcher may calculate individual tag count ratios between groups, for example the ratio of the average GroupNormal count to the average GroupTumor count for each polynucleotide.
  • the researcher may calculate a statistical measure of the significance of observed differences in tag counts between groups.
  • a query to sort polynucleotide tags based on their abundance in the sample cells is run.
  • the output from the Query report lists specific polynucleotides (by sequence) that fit the sorting criteria and their abundance in the various sample cells
  • the sorting is based on the principle that the gene product of interest (and hence the corresponding polynucleotide) is more abundant in the samples that prominently exhibit the chosen phenotype than in samples that do not exhibit the phenotype.
  • a frequency of 1/5000 (5 copies of a SAGE tag normalized to a library size of 25,000) correlates with sufficient expression of a tumor antigen within the sample cell to render it sensitive to lysis by an antigen specific T cell while a frequency of 1/25,000 correlates with the cell being weakly sensitive to lysis.
  • Query Report and test them individually in an appropriate biological assay to determine if they confer the phenotype.
  • candidates that correspond to known genes it is a relatively easy task to obtain complementary DNAs for these candidates and test them individually to determine if they confer the specific phenotype in question when transferred into cells that do not exhibit the phenotype. If none of the known genes confer the phenotype, retrieve the cDNAs corresponding to the No Match sequences of the Query Report by PCR cloning and test the novel cDNAs individually for their ability to confer the phenotype.
  • the polynucleotide or gene sequence can also be compared to a sequence database, for example, using a computer method to match a sample sequence with known sequences.
  • Sequence identity can be determined by a sequence comparison using, e.g., sequence alignment programs that are known in the art, such as those described in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F.M. Ausubel et al., eds., 1987) Supplement 30, section 7.7.18, Table 7.7.1.
  • the BLAST program is available at the following Internet address: http://www.ncbi.nlm.nih.gov.
  • hybridization under conditions of high, moderate and low stringency can also indicate degree of sequence identity.
  • genes and gene products associated with cancer and neoplastic cells are determined. Additionally, the methods of the present invention can be used to establish correlations between the phenotype and the SAGE tag genotype of a variety of other types of cell. For example, in other aspects, the methods of the invention can be used in the identification of gene products associated with genetic disease, inherited disease and/or acquired diseases. Gene products associated with drug resistance and drug metabolism can also be identified. Identification of genes associated with drug metabolism will have important applications in the field of pharmacogenomics, wherein an individual's response to a particular therapeutic is determined, so as to maximize therapeutic value and minimize side effects. In additional aspects, the methods of the invention are used in the identification of gene products that confer some measurable biological activity on a mature or differentiated population of cells, wherein the activity is not exhibited by immature or undifferentiated precursors.
  • cytotoxic T-lymphocytes are able to recognize and lyse a target cell, whereas other types of T-lymphocyte are capable of recognition but incapable of lysis.
  • genes that are responsible for this difference, i.e., genes whose expression specifically enables lysis of a target cell by a cytotoxic T-lymphocyte.
  • a phenotype such as metastatic potential, which is likely to depend upon multiple factors, may be more difficult to establish than a phenotype whose magnitude is dependent on the relative abundance of a single specific transcript.
  • Hybridizing tags or preferably amplified ditags, against oligonucleotide sequences fixed to a solid matrix such as nitrocellulose filters, glass slides or silicon chips ("parallel sequence analysis", or PSA); or
  • Ditags are prepared, amplified and cleaved with the anchoring enzyme as defined by SAGE technology:
  • oligonucleotide sequences contain a CATG sequence at the 5' end:
  • the matrices are constructed of any material known in the art and the oligonucleotide-bearing chips are generated by any procedure known in the art, e.g. silicon chips containing oligonucleotides prepared by the VLSIP procedure. See, for example, U.S. Patent No. 5,424,186.
  • the oligonucleotide-bearing matrices are evaluated for the presence or absence of a fluorescent ditag at each position in the grid.
  • oligonucleotides on the grid of the general sequence CATGOOOOOOOOOO, such that every possible 10-base sequence is represented 3' to the CATG. Since there are estimated to be no more than 100,000 to 200,000 different expressed genes in the human genome, there are enough oligonucleotide sequences to identify all of the possible sequences adjacent to the 3'-most anchoring enzyme site observed in the cDNAs from the expressed genes in the human genome.
  • Library B that is expressed at low abundance in Library A.
  • 4D reflects a differentially-expressed, high abundance transcript restricted to Library A;
  • 5A reflects a transcript that is expressed at high abundance in Library A but only at low abundance in Library B;
  • 5E reflects a differentially-expressed (in Library B), low abundance transcript.
  • step 3 above does not involve the use of a fluorescent or other identifier; instead, at the last round of amplification of the ditags, fluoresceinated dNTPs are used so that half of the molecules are probed on the chips.
  • a particular portion of the transcript is used, e.g., the sequence between the 3' terminus of the transcript and the first anchoring enzyme site. In that particular case, a double-stranded cDNA reverse transcript is generated as described in WO 97/10363.
  • the transcripts are cut with the anchoring enzyme, a linker is added containing a PCR primer and amplification is initiated (using the primer at one end and the A tail at the other) while the transcripts are still on the streptavidin bead.
  • fluoresceinated dNTPs are used so that half of the molecules can be probed on the chip.
  • the linker-primer is optionally removed with the anchoring enzyme at this point in order to reduce the size of the fragments.
  • the soluble fragments are then melted and captured on solid matrices containing
  • Ditags or concatemers are diluted and added to wells or other receptacles so that on average the wells contain, statistically, less than one DNA molecule per well (as is done in limited dilution for cell cloning).
  • Each well then receives reagents for PCR or another amplification process and the DNA in each receptacle is sequenced, e.g., by mass spectroscopy.
  • the results are either a single sequence (there having been a single sequence in that receptacle), a "null" sequence (no DNA present) or a double sequence (more than one DNA molecule), which is discarded. Thereafter, assessment of differential expression is the same as defined by the SAGE technique.
  • tumor antigens which are self proteins over-expressed by tumor cells
  • viral antigens such as HPV16E6 and E7
  • cancer/testes family of antigens typified by MAGE
  • mutated proteins such as ras or p53.
  • differentiation antigens the vast majority are melanoma associated antigens and attempts to identify self antigens over-expressed by lung, prostate, breast or colon carcinomas that might be good candidates as targets for cytotoxic T cells have largely been unsuccessful.
  • the present invention calls for the use of genes differentially expressed in target cells in the design of a vaccine to generate an immune response against the
  • the inventors have applied a SAGE analysis (described in U.S. Patent No. 5,695,937), to identify a variety of transcripts that are differentially expressed in cancer cells, that have not previously been associated with tumor cells.
  • CTL gp100 specific cytotoxic T lymphocyte
  • the HLA-A2 negative cell lines were subjected to SAGE analysis and SAGE polynucleotides were sorted to identify polynucleotides common to lines that are susceptible to lysis that are less abundant in lines that are less susceptible to lysis (see Table 3). Of the two polynucleotides that matched the sorting criteria, one was the gp100 tag CCTGGTCAAG. Thus, by conducting the SAGE analysis of 6 different melanoma cell lines that are differentially susceptible to lysis by an HLA restricted CTL, one is able to focus on just 2 transcripts that were candidates for the cognate antigen, one of which was the desired target.
  • Example 2 Melanoma and breast cancer cell lines, exhibiting differential immunoreactivity to an anti-HER-2 antibody as judged by FACS analysis were subjected to SAGE analysis to determine which SAGE polynucleotides were shared amongst the cell lines that showed a high mean fluorescence signal that were less abundant in cell lines that showed a lower mean fluorescence signal.
  • SAGE polynucleotides matched the sorting criteria and were found to be represented at a higher level in cell lines 21PT and 21MT (that show a strong fluorescence signal) than in cell lines MDA-468, SK28, BA1, NM455 and 1300mel (that show a weaker fluorescence signal) (Table 4).
  • HER-2 has previously been identified as a target for patient derived T cells, it has not been reported that integrin alpha-3 can also be a target for patient derived immune effector cells or antibodies.
  • the gene encoding integrin alpha-3 or the corresponding gene product or peptide fragments thereof can be used to provoke an immune response to target cells that differentially express integrin alpha-3.
  • any differentially expressed gene or genes (identified by SAGE) and their corresponding proteins or peptide fragments could be used to provoke an anti-target cell immune response.
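
The group-level tag comparison described in the list above (grouping libraries such as GroupNormal and GroupTumor, computing average, minimum and maximum counts per tag, and taking the ratio of group averages) can be sketched in Python as follows. This is a minimal illustration and not the SAGE software's Compare Report: the function names, the choice to report a ratio of infinity when a tag is absent from the second group, and all counts are assumptions made for the example. The 25,000-tag normalization target is taken from the library size mentioned in the list; no statistical significance test is included.

```python
import math
from statistics import mean


def normalize(counts, library_size, target=25_000):
    """Scale raw tag counts so every library is expressed per `target` tags."""
    return {tag: n * target / library_size for tag, n in counts.items()}


def group_stats(libraries):
    """Average, minimum and maximum count of each tag across one group."""
    tags = set().union(*libraries)
    stats = {}
    for tag in tags:
        values = [lib.get(tag, 0.0) for lib in libraries]
        stats[tag] = {"avg": mean(values), "min": min(values), "max": max(values)}
    return stats


def average_ratio(group_a, group_b):
    """Ratio of average counts (group_a / group_b) for tags seen in either group."""
    ratios = {}
    for tag in set(group_a) | set(group_b):
        a = group_a.get(tag, {"avg": 0.0})["avg"]
        b = group_b.get(tag, {"avg": 0.0})["avg"]
        if a == 0 and b == 0:
            continue  # tag effectively absent from both groups
        # Assumption: a tag absent from group_b is reported as infinity
        # rather than being smoothed with a pseudocount.
        ratios[tag] = a / b if b else math.inf
    return ratios


# Invented counts for illustration only; the tag sequences are placeholders.
normal_1 = normalize({"AAAAAAAAAA": 10, "CCCCCCCCCC": 2}, library_size=50_000)
normal_2 = normalize({"AAAAAAAAAA": 3}, library_size=25_000)
tumor_1 = normalize({"AAAAAAAAAA": 1, "GGGGGGGGGG": 4}, library_size=25_000)
tumor_line = normalize({"AAAAAAAAAA": 1}, library_size=25_000)

group_normal = group_stats([normal_1, normal_2])   # GroupNormal = Normal1 + Normal2
group_tumor = group_stats([tumor_1, tumor_line])   # GroupTumor = PrimaryTumor1 + TumorCellLine
ratios = average_ratio(group_normal, group_tumor)
```

Sorting `ratios` by value would then give a simple ranked candidate list; a real analysis would add a significance measure before selecting candidates.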

Abstract

The present invention broadly provides a method for correlating the phenotype of a cell with its 'functional genotype', that is, the constellation of expressed sequences in that cell. In addition, the invention provides a means for identifying therapeutically-relevant genes and gene products.

Description

METHODS FOR IDENTIFYING THERAPEUTIC TARGETS
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority under 35 U.S.C. § 119(e) of U.S. Provisional Application Numbers 60/100,436; 60/077,853; and 60/103,230, filed January 26, 1998; March 13, 1998; and October 5, 1998, respectively, the contents of which are hereby incorporated by reference into the present disclosure.
TECHNICAL FIELD
This invention is in the fields of molecular biology, cell biology and immunology. More particularly, the invention uses techniques of functional genomics to correlate the phenotype of a cell with its pattern of gene expression and to identify new therapeutic targets.
BACKGROUND OF THE INVENTION
The imminent acquisition of the sequence of the entire human genome will provide a wealth of information on gene and genome structure and organization. In order to use this vast wealth of genetic information in the prediction and treatment of human disease, the next step is to develop methods for the analysis of the data. In particular, methods are required which will allow one to distinguish global patterns of differential gene expression between different cells, or between different pathological stages of the same cell. Methods of this type are often denoted functional genomics. It is well known that many, but not all genes present in a cell are expressed at any given time. Fundamental questions of biology require knowledge of which genes are transcribed and the relative abundance of transcripts in different cells. Typically, when and to what degree a given gene is expressed has been analyzed one gene at a time. Thus, information regarding the identity of all expressed genes in a cell and the level of expression of these genes would facilitate the study of many cellular processes such as activation, differentiation, aging, viral transformation, morphogenesis, and mitosis. A comparison of the expressed genes of a particular cell or the same cell from various individuals or species, under the same or different environmental stimuli, provides valuable insight into the molecular biology of the cell.
Accordingly, a method that provides a comparison of the expressed genes of a particular cell as compared to another cell would be of great value. This invention provides methods for identifying therapeutically-relevant genes which are expressed differentially in one cell with respect to another.
SUMMARY OF THE INVENTION
The present invention broadly provides a method for correlating the phenotype of a cell with its "functional genotype," that is, the constellation of expressed sequences in that cell. In addition, the invention provides a means for identifying therapeutically-relevant genes and gene products.
This invention also provides computer-related systems and methods. More specifically, the invention provides a system and method for automatically generating a data base of gene tags from cell samples and using the data base for filtering the tag counts from the samples into meaningful candidates for further testing and analysis.
MODES FOR CARRYING OUT THE INVENTION
Various publications, patents and published patent specifications are referenced by an identifying citation. The disclosures of these publications, patents and published patent specifications are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains.
The present invention provides a method for identifying a gene associated with a selected phenotype. Knowledge of the sequence of such a gene will also provide the skilled artisan with knowledge of the sequence and structure of the protein product(s) of the gene. In a preferred embodiment, the action of the gene and/or its product will be causative or involved in some way with respect to the selected phenotype.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, cell biology and recombinant DNA, which are within the skill of the art. See, e.g., Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel et al. eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.); PCR 2: A PRACTICAL APPROACH (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)) and ANIMAL CELL CULTURE (R.I. Freshney, ed. (1987)).
Definitions
As used in the specification and claims, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a cell" includes a plurality of cells, including mixtures thereof.
The terms "polynucleotide" and "nucleic acid molecule" are used interchangeably to refer to polymeric forms of nucleotides of any length. The polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their analogs. Nucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The term "polynucleotide" includes, for example, single-, double-stranded and triple helical molecules, a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A nucleic acid molecule may also comprise modified nucleic acid molecules.
The term "differentially expressed" refers to nucleotide sequences in a cell or tissue which are either more or less expressed than in a control cell, or expressed where silent in a control cell or not expressed where expressed in a control cell.
"Oligonucleotide" refers to polynucleotides of between about 5 and about 100 nucleotides of single- or double-stranded DNA. Oligonucleotides are also known as oligomers or oligos and may be isolated from genes, or chemically synthesized by methods known in the art.
A "gene" is a hereditary unit that, in the classical sense, occupies a specific position (locus) within the genome or chromosome; a unit that has one or more specific effects upon the phenotype of the organism; a unit that can mutate to various allelic forms; a unit that recombines with other such units. Three classes of polynucleotides are now recognized: (1) structural genes that are transcribed into mRNAs, which are then translated into polypeptide chains, (2) structural polynucleotides that are transcribed into rRNA or tRNA molecules which are used directly, and (3) regulatory sequences that are not transcribed, but serve as recognition sites for enzymes and other proteins involved in DNA replication and transcription.
A "primer" refers to an oligonucleotide, usually single-stranded, that provides a 3'-hydroxyl end for the initiation of enzyme-mediated nucleic acid synthesis. The primer sequence need not reflect the exact sequence of the template.
"PCR primers" refer to primers used in "polymerase chain reaction" or "PCR," a method for amplifying a DNA base sequence using a heat-stable polymerase such as Taq polymerase, and two oligonucleotide primers, one complementary to the (+)-strand at one end of the sequence to be amplified and the other complementary to the (-)-strand at the other end.
Because the newly synthesized DNA strands can subsequently serve as additional templates for the same primer sequences, successive rounds of primer annealing, strand elongation, and dissociation produce exponential and highly specific amplification of the desired sequence. PCR also can be used to detect the existence of the defined sequence in a DNA sample.
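As a rough worked illustration of the exponential amplification described above, the expected copy number after a given number of cycles can be sketched in Python. The function name and the `efficiency` parameter (the fraction of templates copied per cycle, with 1.0 meaning perfect doubling) are assumptions introduced for this example; real reactions plateau and do not follow this idealized model indefinitely.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of PCR.

    efficiency is a hypothetical per-cycle yield between 0 and 1;
    1.0 corresponds to every template being copied in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule, 30 perfect cycles: 2**30 copies.
print(pcr_copies(1, 30))
```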
A "sequence tag" or "SAGE tag" is a short sequence, generally under about 20 nucleotides, that occurs in a certain position in messenger RNA. The tag can be used to identify the corresponding transcript and gene from which it was transcribed. A "ditag" is a dimer of two sequence tags.
The term "cDNAs" refers to complementary DNA, that is, mRNA molecules present in a cell or organism made into cDNA with an enzyme such as reverse transcriptase. A "cDNA library" is a collection of all of the mRNA molecules present in a cell or organism, all turned into cDNA molecules with the enzyme reverse transcriptase, then inserted into "vectors" (other DNA molecules which can continue to replicate after addition of foreign DNA). Exemplary vectors for libraries include bacteriophage (also known as "phage"), viruses that infect bacteria, for example, lambda phage. The library can then be probed for the specific cDNA (and thus mRNA) of interest.
The term "immune effector cells" refers to cells capable of binding an antigen and which mediate an immune response. These cells include, but are not limited to, T cells, B cells, monocytes, macrophages, NK cells and cytotoxic T lymphocytes (CTLs), for example CTL lines, CTL clones, and CTLs from tumor, inflammatory, or other infiltrates. Certain diseased tissue expresses specific antigens and CTLs specific for these antigens have been identified. For example, approximately 80% of melanomas express the antigen known as GP-100.
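To make the "sequence tag" definition above concrete, the following Python sketch extracts a 10-base tag immediately 3' of the 3'-most CATG site in a cDNA sequence, the arrangement described elsewhere in this document for the anchoring enzyme site. It is a simplified illustration under stated assumptions: the sequence is given 5' to 3' as a plain string, the function name and constants are hypothetical, and a site with fewer than 10 bases after it simply yields no tag.

```python
ANCHOR_SITE = "CATG"   # anchoring enzyme recognition sequence used in the text
TAG_LENGTH = 10        # tag length used in the text's 10-mer examples


def extract_tag(cdna, anchor=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag 3' of the 3'-most anchor site, or None if unavailable."""
    seq = cdna.upper()
    pos = seq.rfind(anchor)          # rightmost (3'-most) occurrence
    if pos == -1:
        return None                  # no anchoring site in this sequence
    start = pos + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


# Invented toy sequence with two CATG sites; only the 3'-most one is used.
toy_cdna = "TT" + "CATG" + "GGGGGGGGGG" + "AA" + "CATG" + "ACGTACGTAC" + "AAAA"
print(extract_tag(toy_cdna))
```

With four possible bases per position, a 10-base tag can take 4**10 = 1,048,576 distinct values.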
The term "T-lymphocytes" as used herein denotes lymphocytes that are phenotypically CD3+, typically detected using an anti-CD3 monoclonal antibody in combination with a suitable labeling technique. The T-lymphocytes of this invention are also generally positive for CD4, CD8, or both.
As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes which bind to a specific double-stranded DNA sequence termed a recognition site or recognition nucleotide sequence, and cut double-stranded DNA at or near the specific recognition site. "Type IIS" restriction endonucleases are those which cleave at a defined distance (up to 20 bases away) from their recognition sites. Endonucleases will be known to those of skill in the art (see for example, Current Protocols in Molecular Biology, Vol. 2, 1995, Ed. Ausubel et al, Greene Publish. Assoc. & Wiley Interscience, Unit 3.1.15; New England Biolabs Catalog, 1995).
A "naïve" cell is a cell that has never been exposed to an antigen.
The term "culturing" refers to the in vitro propagation of cells or organisms on or in media of various kinds. It is understood that the descendants of a cell grown in culture may not be completely identical (morphologically, genetically, or phenotypically) to the parent cell. By "expanded" is meant any proliferation or division of cells.
A "subject" is a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets.
"Host cell" or "recipient cell" is intended to include any individual cell or cell culture which can be or have been recipients for vectors or the incorporation of exogenous nucleic acid molecules, polynucleotides and/or proteins. It also is intended to include progeny of a single cell, and the progeny may not necessarily be completely identical (in morphology or in genomic or total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation. The cells may be procaryotic or eucaryotic, and include but are not limited to bacterial cells, yeast cells, animal cells, and mammalian cells, e.g., murine, rat, simian or human. An "antibody" is an immunoglobulin molecule capable of binding an antigen. As used herein, the term encompasses not only intact immunoglobulin molecules, but also anti-idiotypic antibodies, mutants, fragments, fusion proteins, humanized proteins and modifications of the immunoglobulin molecule that comprise an antigen recognition site of the required specificity.
An "antibody complex" is the combination of antibody (as defined above) and its binding partner or ligand.
A native antigen is a polypeptide, protein or a fragment containing an epitope, which induces an immune response in the subject.
The term "isolated" means separated from constituents, cellular and otherwise, in which the polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, are normally associated with in nature. As is apparent to those of skill in the art, a non-naturally occurring polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, does not require "isolation" to distinguish it from its naturally occurring counterpart. In addition, a "concentrated", "separated" or "diluted" polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, is distinguishable from its naturally occurring counterpart in that the concentration or number of molecules per volume is greater than "concentrated" or less than "separated" than that of its naturally occurring counterpart. A polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, which differs from the naturally occurring counterpart in its primary sequence or for example, by its glycosylation pattern, need not be present in its isolated form since it is distinguishable from its naturally occurring counterpart by its primary sequence, or alternatively, by another characteristic such as glycosylation pattern. Although not explicitly stated for each of the inventions disclosed herein, it is to be understood that all of the above embodiments for each of the compositions disclosed below and under the appropriate conditions, are provided by this invention. Thus, a non-naturally occurring polynucleotide is provided as a separate embodiment from the isolated naturally occurring polynucleotide. A protein produced in a bacterial cell is provided as a separate embodiment from the naturally occurring protein isolated from a eucaryotic cell in which it is produced in nature.
An "isolated" or "enriched" population of cells is "substantially free" of cells and materials with which it is associated in nature. By "substantially free" or "substantially pure" is meant that at least 50% of the population are the desired cell type, preferably at least 70%, more preferably at least 80%, and even more preferably at least 90%.
A "composition" is intended to mean a combination of active agent and another compound or composition, inert (for example, a detectable agent, solid support or label) or active, such as an adjuvant.
A "pharmaceutical composition" is intended to include the combination of an active agent with a carrier, inert or active, making the composition suitable for diagnostic or therapeutic use in vitro, in vivo or ex vivo.
As used herein, the term "pharmaceutically acceptable carrier" encompasses any of the standard pharmaceutical carriers, such as a phosphate buffered saline solution, water, and emulsions, such as an oil/water or water/oil emulsion, and various types of wetting agents. The compositions also can include stabilizers and preservatives. For examples of carriers, stabilizers and adjuvants, see Martin, REMINGTON'S PHARM. SCI., 15th Ed. (Mack Publ. Co., Easton (1975)). An "effective amount" is an amount sufficient to effect beneficial or desired results. An effective amount can be administered in one or more administrations, applications or dosages.
As described in more detail below, the present method identifies a polynucleotide fragment of a gene that confers or is involved in conferring a selected phenotype to a sample cell, cells, or tissue or presenting a potential therapeutic target. The method requires identifying a unique polynucleotide, the unique polynucleotide representing a gene that is differentially expressed in a sample cell compared to a control cell. In one embodiment, the gene corresponding to the unique polynucleotide is identified and cloned, thereby providing the sequence and identity of the gene conferring the selected phenotype to the sample cell or is associated with a selected phenotype but not necessarily causative of the selected phenotype. The unique polynucleotide can represent or correspond to or be a fragment of a gene that is differentially, overexpressed or underexpressed in the sample cell compared to the control cell. More than one sample cell type can be compared to a single control cell, or alternatively, more than one control cell type can be compared to a single sample cell. Therapeutic targets can be identified using the methods disclosed herein. The polypeptides and proteins encoded by these polynucleotides and genes can further be produced, isolated and characterized. In one embodiment, the method is useful for identifying one or more secreted biological factors and/or the gene(s) encoding the factor(s) or fragments thereof. The method involves the steps of: providing one or more sample cells that secrete the factor and one or more control cells that do not secrete the factor; obtaining a set of polynucleotides representing gene expression in the sample cells; obtaining a set of polynucleotides representing gene expression in the control cells; and identifying one or more unique polynucleotides, the unique polynucleotides being common to the sample cells and the unique polynucleotides being absent or expressed at lower levels in the control cells. 
Finally, by determining the genes corresponding to the unique polynucleotides, one or more secreted biological factors are identified. The practice of the invention can be applied to the identification of gene(s) that are relevant to any property that differs between one cell (a sample cell) and another (a control cell). Such properties may include, but are not limited to, disease state, infection, drug resistance, cytokine secretion, secretory protein expression, state of differentiation, growth regulation, consequences of exposure to external environmental stimuli, etc. In addition, the practice of the invention can be applied to any cell type including, but not limited to, plants, animals and microorganisms.
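By way of non-limiting illustration, the identifying step described above can be sketched as a set comparison. The sketch assumes that each library is held as a mapping from polynucleotide (tag) sequence to count; the function name, the data layout and the min_fold parameter are hypothetical and are not part of the disclosed method.

```python
def unique_polynucleotides(sample_libraries, control_libraries, min_fold=2.0):
    """Return tags found in every sample library that are absent from, or
    expressed at lower levels in, every control library.

    Each library is a dict mapping tag sequence -> count (assumed layout).
    min_fold is an illustrative threshold for "expressed at lower levels".
    """
    # Tags common to all sample libraries (count greater than zero in each).
    common = {tag for tag, count in sample_libraries[0].items() if count > 0}
    for library in sample_libraries[1:]:
        common &= {tag for tag, count in library.items() if count > 0}

    unique = []
    for tag in sorted(common):
        lowest_sample = min(library[tag] for library in sample_libraries)
        highest_control = max(library.get(tag, 0) for library in control_libraries)
        # Keep the tag if it is absent from the controls, or if even the
        # lowest sample count exceeds the highest control count by min_fold.
        if highest_control == 0 or lowest_sample >= min_fold * highest_control:
            unique.append(tag)
    return unique
```

More than one sample library and more than one control library can be supplied, consistent with the comparisons of multiple cell types described above.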
Materials and Methods
Sample cells
The invention provides methods for identifying and obtaining polynucleotides, genes and fragments thereof, associated with a selected phenotype in a sample cell. "Sample cells" include, but are not limited to, neoplastic cells; drug-resistant neoplastic cells; neoplastic cells which promote angiogenesis; de-differentiated cells; differentiated cells; apoptotic cells; hyperproliferative cells; cells infected with a pathogen or drug-resistant cells infected with a pathogen.
Cancers from which cells can be obtained for use in the methods of the present invention include carcinomas, sarcomas, leukemias, and cancers derived from cells of the nervous system. These include, but are not limited to: brain tumors, such as astrocytoma, oligodendroglioma, ependymoma, medulloblastomas, and Primitive Neural Ectodermal Tumor (PNET); pancreatic tumors, such as pancreatic ductal adenocarcinomas; lung tumors, such as small and large cell adenocarcinomas, squamous cell carcinoma and bronchoalveolar carcinoma; colon tumors, such as epithelial adenocarcinoma and liver metastases of these tumors; liver tumors, such as hepatoma and cholangiocarcinoma; breast tumors, such as ductal and lobular adenocarcinoma; gynecologic tumors, such as squamous and adenocarcinoma of the uterine cervix, and uterine and ovarian epithelial adenocarcinoma; prostate tumors, such as prostatic adenocarcinoma; bladder tumors, such as transitional, squamous cell carcinoma; tumors of the reticuloendothelial system (RES), such as B and T cell lymphoma (nodular and diffuse), plasmacytoma and acute and chronic leukemia; skin tumors, such as melanoma; and soft tissue tumors, such as soft tissue sarcoma and leiomyosarcoma.
Tumor cells are typically obtained from a cancer patient by resection, biopsy, or endoscopic sampling; the cells may be used directly, stored frozen, or maintained or expanded in culture. Samples of both the tumor and the patient's blood or blood fraction should be thoroughly tested to ensure sterility before co-culturing of the cells. Standard sterility tests are known to those of skill in the art and are not described in detail herein. The tumor cells can be cultured in vitro to generate a cell line. Conditions for reliably establishing short-term cultures and obtaining at least 10^8 cells from a variety of tumor types are described in Dillman et al. (1993) J. Immunother. 14:65-69. Alternatively, tumor cells can be dispersed from, for example, a biopsy sample, by standard mechanical means before use.
Tumor cells can be obtained by any method known in the art. The following is an example of one method employed by skilled artisans. Using sterile technique, solid tumors (10-30 g) excised from a patient are dissected into 5 mm pieces which are immersed in RPMI 1640 medium containing 0.01% hyaluronidase type V, 0.002% DNAse type I, 0.1% collagenase type IV, 50 IU/ml penicillin, 50 μg/ml streptomycin and 50 μg/ml gentamicin. This mixture is stirred for 6 to 24 hours at room temperature, after which it is filtered through a coarse wire grid to exclude undigested tissue fragments. The resultant tumor cell suspension is then centrifuged at 400 x g for 10 minutes. The pellet is washed twice with Hanks balanced salt solution (HBSS) without Ca or Mg or phenol red, then resuspended in HBSS and passed through Ficoll-Hypaque gradients. The gradient interfaces, containing viable tumor cells, lymphocytes, and monocytes, are harvested and washed twice more with HBSS. The harvested cells may be frozen for storage in a type-compatible human serum containing 10% (v/v) DMSO.
The terms "neoplastic cell", "tumor cell", or "cancer cell", used either in the singular or plural form, refer to cells that have undergone a malignant transformation that makes them pathological to the host organism. Primary cancer cells (that is, cells obtained from near the site of malignant transformation) can be readily distinguished from non-cancerous cells by well-established techniques, particularly histological examination. The definition of a cancer cell, as used herein, includes not only a primary cancer cell, but any cell derived from a cancer cell ancestor. This includes metastasized cancer cells, and in vitro cultures and cell lines derived from cancer cells. When referring to a type of cancer that normally manifests as a solid tumor, a "clinically detectable" tumor is one that is detectable on the basis of tumor mass; e.g., by such procedures as CAT scan, magnetic resonance imaging (MRI), X-ray, ultrasound or palpation. Biochemical or immunologic findings alone may be insufficient to meet this definition.
The emergence of tumor cell resistance to chemotherapeutic agents poses a major problem in the treatment of malignancies of the blood and solid tumors. This resistance causes cancer patients to fail to respond to any antitumor agent, since the transformed tumor cells tend to exhibit clinical resistance to many drugs, a phenomenon known as multi-drug resistance (MDR). Several mechanisms can account for MDR at a molecular and cellular level, including decreased drug uptake or increased drug efflux, altered redox potential, enhanced DNA repair, and increased drug sequestration mechanisms or amplification of the drug-target protein. Drugs of proven antitumor chemotherapeutic value to which MDR has been observed include vinblastine, vincristine, etoposide, teniposide, doxorubicin (adriamycin), daunorubicin, plicamycin, and actinomycin D. Jones et al. (1993) Cancer (Suppl.) 72:3484-3488. Many tumors are intrinsically multi-drug resistant (e.g., adenocarcinomas of the colon and kidney) while other tumors acquire MDR during the course of therapy (e.g., neuroblastomas and childhood leukemias). "A drug-resistant cancer cell", for the purposes of the present invention, includes a cell which is resistant to a single antitumor chemotherapeutic agent, as well as a cell resistant to two or more antitumor chemotherapeutic agents. Cytotoxic drugs as antitumor chemotherapeutic agents can be subdivided into several broad categories, including: 1) alkylating agents, such as mechlorethamine, cyclophosphamide, melphalan, uracil mustard, chlorambucil and carmustine; 2) antimetabolites such as methotrexate, fluorouracil, azarabine, mercaptopurine, thioguanine and adenine arabinoside; 3) natural product derivatives such as vinblastine, vincristine, doxorubicin, bleomycin, etoposide, teniposide and mitomycin C; and 4) miscellaneous agents, such as hydroxyurea, procarbazine and mitotane.
Sample cells further include neoplastic cells which promote angiogenesis.
Tumors promote angiogenesis (or neovascularization) through a combination of overexpression of angiogenic factors and local inhibition of angiostatic factors. This strategy leads to an angiogenic environment that promotes tumor growth and metastases. Angiogenic factors include the CXC family of chemokines (Arenberg et al. (1997) J. Leukocyte Biol. 62:554-562),
Sample cells also include those expressing an antigen, or those which specifically recognize an antigen and which induce an immune response such as a T-cell. Sample cells also include antigen expressing cells such as "antigen presenting cells" or "APCs", which include both intact whole cells as well as other molecules which are capable of inducing the presentation of one or more antigens, preferably in association with class I MHC molecules. Examples of suitable APCs include, but are not limited to, whole cells such as macrophages, dendritic cells, B cells; purified MHC class I molecules complexed to β2-microglobulin; and foster antigen presenting cells. The term "foster antigen presenting cells" refers to any modified or naturally occurring cell (wild-type or mutant) with antigen presenting capability that is utilized in lieu of antigen presenting cells ("APC") that normally contact the immune effector cells they are to react with. In other words, it is any functional APC that T cells would not normally encounter in vivo. Foster antigen presenting cells can be derived as follows. The human cell line 174xCEM.T2, referred to as T2, contains a mutation in its antigen processing pathway that restricts the association of endogenous peptides with cell surface MHC class I molecules (Zweerink et al. (1993) J. Immunol. 150:1763-1771). This is due to a large homozygous deletion in the MHC class II region encompassing the genes TAP1, TAP2, LMP1, and LMP2 which are required for antigen presentation to MHC class I-restricted CD8+ CTLs. In effect, only "empty" MHC class I molecules are presented on the surface of these cells. Exogenous peptide added to the culture medium binds to these MHC molecules provided that the peptide contains the allele-specific binding motif. These T2 cells are what will be referred to as "foster" APCs.
Sample cells include those transduced with a polynucleotide. The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes double- and single-stranded DNA and RNA. As used herein, "DNA" includes not only bases A, T, C, and G, but also includes any of their analogs or modified forms of these bases, such as methylated nucleotides, internucleotide modifications such as uncharged linkages and thioates, use of sugar analogs, and modified and/or alternative backbone structures, such as polyamides. Included are polynucleotides which encode one or more proteins, or which can be transcribed to generate antisense RNA or a ribozyme. Suitable methods for manipulation of polynucleotides include those described in a variety of references, including, but not limited to, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Vol. 1-3, eds. Sambrook et al. Cold Spring Harbor Laboratory Press (1989); and CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, eds. Ausubel et al., Greene Publishing and Wiley-Interscience: New York (1987) and periodic updates.
Any method in the art can be used for the transformation, or insertion, of an exogenous polynucleotide into a host cell, for example, lipofection, transduction, infection or electroporation, using either purified DNA, viral vectors, or DNA or RNA viruses. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.
Sample cells include those infected with a pathogen. "Pathogen" includes any microorganism which is potentially harmful to a cell, including prokaryotes, viruses and single-celled eukaryotes. Such pathogens include, but are not limited to viruses such as human immunodeficiency virus, Epstein-Barr virus; fungi; bacteria capable of infecting mammalian cells, such as Chlamydia spp., Legionella pneumophila, Mycobacterium spp. (Sinai and Joiner (1997) Ann. Rev. Microbiol. 51:415-462), Salmonella typhosa, Brucella abortus; protozoan parasites such as Toxoplasma gondii, Leishmania donovani, Trypanosoma cruzi, malarial plasmodia.
Control cells
The practice of the invention involves comparison of polynucleotides corresponding to expressed genes between a sample cell and a control cell. The selection of the appropriate control cell or cell type is dependent on the sample cell initially selected and the phenotype of the sample cell which is under investigation.
In one aspect of the invention, the sample cell is a neoplastic cell, and one or more counterpart neoplastic cells or non-neoplastic precursors of the sample cell can be used as control cells. Counterparts would include, for example, cell lines established from the same or related cells to those found in the sample cell population. For example, the control cell can be any of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
Alternatively, when a sample cell, e.g., an antigen presenting cell, is selected based on the expression of a gene coding for a peptide which participates in recognition of the sample cell by an immune effector cell, a suitable control cell is one which has a compatible MHC complex but does not express the antigen. Such control cells are compatible for lysis by a cytotoxic T-lymphocyte, for example, but are not lysed by the cytotoxic T-lymphocyte. When the sample cell is a cell which secretes a biological factor, a control cell is one that does not secrete the factor.
Polynucleotide Fragments or Expression Tags
Practice of the method of this invention involves analysis of polynucleotide fragments of or corresponding to expressed genes. The polynucleotides are obtained from sample and control cells using methods well known in the art. Many methods are known in the art to identify differentially expressed polynucleotides and each can be used to provide these polynucleotides. As used herein, the term "polynucleotide" includes SAGE tags (defined above) as well as any other nucleic acid obtained from methods that yield quantitative/comparative gene expression data. Such methods include, but are not limited to, cDNA subtraction, differential display and expressed sequence tag methods. Techniques based on cDNA subtraction or differential display can be quite useful for comparing gene expression differences between two cell types (described in Hedrick et al. (1984) Nature 308:149 and Liang and Pardee (1992) Science 257:967). The expressed sequence tag (EST) approach is another valuable tool for gene discovery (described in Adams et al. (1991) Science 252:1651), as are Northern blotting, RNase protection, and reverse transcriptase-polymerase chain reaction (RT-PCR) analysis (described in Sambrook et al. (1989) supra; Alwine et al. (1977) PNAS 74:5350; Zinn et al. (1983) Cell 34:865; and Veres et al. (1987) Science 237:415). A further method utilizes differential display coupled with real time PCR and representational difference analysis (described in Lisitsyn and Wigler (1995) Meth. Enzymol. 254:291-304). Another approach is the technology known as Serial Analysis of Gene Expression (SAGE, described in U.S. Patent No. 5,695,937). Using SAGE, sequence tags (tags being used synonymously with polynucleotides) corresponding to expressed genes can be analyzed.
The sequence tags or polynucleotides corresponding to the expressed genes are prepared essentially as follows. First, a sample containing the genes of interest is provided. Suitable sources of samples include cells, tissue, cellular extracts or the like. Preferably, the sample is taken from an individual having a particular disease state of interest or at a particular stage in its development.
Complementary DNA (cDNA) is then isolated from the sample, for example using methods known to those skilled in the art. In one embodiment, the cDNA is synthesized from mRNA using a biotinylated oligo(dT) primer.
Smaller fragments of cDNA are then created using a restriction endonuclease, preferably one that would be expected to cleave most transcripts at least once. Preferably, a 4-base pair recognition site enzyme is used. More than one restriction endonuclease can also be used, sequentially or in tandem. The cleaved cDNA can then be isolated by binding to a capture medium for the label attached to the primer described above. For example, streptavidin beads are used to isolate the defined 3' nucleotide sequence polynucleotide when the oligo dT primer for cDNA synthesis is biotinylated. Other capture systems (e.g., biotin/streptavidin, digoxigenin/anti-digoxigenin) can also be employed.
In one aspect, the isolated defined nucleotide sequence polynucleotides are separated into two pools of cDNA. Each pool is ligated using the appropriate linkers. The linkers can be the same or different, although when the linkers have the same sequence, it is not necessary to separate the polynucleotides into pools. The first oligonucleotide linker comprises a first sequence for hybridization of a PCR primer and the second oligonucleotide linker comprises a second sequence for hybridization of a PCR primer. In addition, the linkers further comprise a second restriction endonuclease site. The linkers are designed so that cleavage of the ligation products with the second restriction enzyme results in release of the linker having a defined nucleotide sequence polynucleotide (e.g., 3' of the restriction endonuclease cleavage site). The defined nucleotide sequence polynucleotide may be from about 6 to 30 base pairs. Preferably, the polynucleotide is about 9 to 11 base pairs. Therefore, a ditag (i.e., the dimer of two sequence tags) is from about 12 to 60 base pairs, and preferably from 18 to 22 base pairs.
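By way of non-limiting illustration, the position of a tag relative to the first restriction endonuclease site can be sketched in silico as follows. The sketch assumes a cDNA sequence written 5' to 3' with the poly(A) tail at the 3' end; the recognition sequence CATG and the tag length of 10 are illustrative assumptions only, and the function names are hypothetical.

```python
def extract_tag(cdna, anchor_site="CATG", tag_length=10):
    """Return the tag_length bases immediately 3' of the 3'-most anchor site.

    Returns None when the transcript lacks the site or when fewer than
    tag_length bases follow it. anchor_site and tag_length are assumptions.
    """
    position = cdna.rfind(anchor_site)  # 3'-most occurrence of the site
    if position == -1:
        return None  # transcript is not cleaved by the chosen enzyme
    start = position + len(anchor_site)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def ditag_length(tag_length):
    """A ditag is the dimer of two tags, so it is twice the tag length."""
    return 2 * tag_length
```

With tags of 9 to 11 base pairs, ditag_length gives 18 to 22 base pairs, in agreement with the preferred ranges stated above.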
Typically, the second restriction endonuclease cleaves at a site distant from or outside of the recognition site. For example, the second restriction endonuclease can be a type IIS restriction enzyme. Type IIS restriction endonucleases cleave at a defined distance up to 20 bp away from their asymmetric recognition sites (Szybalski W. (1985) Gene 40:169). Examples of type IIS restriction endonucleases include BsmFI and FokI. Other similar enzymes will be known to those of skill in the art (see, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, supra). The pool of defined tags ligated to linkers having the same sequence, or the two pools of defined nucleotide sequence tags ligated to linkers having different nucleotide sequences, are randomly ligated to each other "tail to tail". The portion of the cDNA polynucleotide furthest from the linker is referred to as the "tail". This creates the ditag (ligated tag pair) having a first restriction endonuclease site upstream (5') and a first restriction endonuclease site downstream (3') of the ditag; a second restriction endonuclease cleavage site upstream and downstream of the ditag, and a linker oligonucleotide containing both a second restriction enzyme recognition site and an amplification primer hybridization site upstream and downstream of the ditag. In other words, the ditag is flanked by the first restriction endonuclease site, the second restriction endonuclease cleavage site and the linkers, respectively.
The ditag can be amplified by utilizing primers which specifically hybridize to one strand of each linker. Preferably, the amplification is performed after the ditags have been ligated together using standard polymerase chain reaction (PCR) methods as described for example in U.S. Patent No. 4,683,195.
Alternatively, the ditags can be amplified by cloning in prokaryotic-compatible vectors or by other amplification methods known to those of skill in the art. Those of skill in the art can prepare similar primers for amplification based on the nucleotide sequence of the linkers without undue experimentation. Cleavage of the amplified PCR product with the first restriction endonuclease allows isolation of ditags which can then be concatenated by ligation. After ligation, it may be desirable to clone the concatemers, although it is not required. Analysis of the ditags or concatemers, whether or not amplification was performed, can be performed by standard sequencing methods. Concatemers generally consist of about 2 to 200 ditags and preferably from about 8 to 20 ditags. While these are preferred concatemers, it will be apparent that the number of ditags which can be concatenated will depend on the length of the individual tags and can be readily determined by those of skill in the art without undue experimentation. After formation of concatemers, multiple tags can be cloned into a vector for sequence analysis, or alternatively, ditags or concatemers can be directly sequenced without cloning by methods known to those of skill in the art, either manually or using automated methods.
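By way of non-limiting illustration, a sequenced concatemer can be resolved into individual tags along the following lines. The sketch assumes that vector sequence has already been trimmed, that ditags are separated by the first restriction endonuclease site (CATG is used here purely as an example), and that the second tag of each ditag is read as the reverse complement of its 3' end; the names and the fixed tag length are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(_COMPLEMENT)[::-1]


def tags_from_concatemer(concatemer, anchor_site="CATG", tag_length=10):
    """Split a concatemer at each anchor site and return two tags per ditag."""
    tags = []
    for ditag in concatemer.split(anchor_site):
        if len(ditag) < 2 * tag_length:
            continue  # empty flank or fragment too short to hold two tags
        tags.append(ditag[:tag_length])  # tag read from the 5' end
        # Tag from the 3' end, reported on the opposite strand.
        tags.append(reverse_complement(ditag[-tag_length:]))
    return tags
```

Tallying the returned tags, for example with collections.Counter, gives the per-library tag counts used in the computational analysis.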
Among the standard procedures for cloning the defined nucleotide sequence tags of the invention is insertion of the tags into vectors such as plasmids or phage. The ditag or concatemers of ditags produced by the method described herein are cloned into recombinant vectors for further analysis, e.g., sequence analysis, plaque/plasmid hybridization using the tags as probes, by methods known to those of skill in the art. Vectors in which the ditags are cloned can be transferred into a suitable host cell. "Host cells" are cells in which a vector can be propagated and its DNA expressed. The term also includes any progeny of the subject host cell. It is understood that all progeny may not be identical to the parental cell since there may be mutations that occur during replication. However, such progeny are included when the term "host cell" is used. Methods of stable transfer, meaning that the foreign DNA is continuously maintained in the host, are known in the art. Transformation of a host cell with a vector containing ditag(s) may be carried out by conventional techniques as are well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the CaCl2 method using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed by electroporation or other commonly used methods in the art.
The individual tags or ditags can be hybridized with oligonucleotides immobilized on a solid support (e.g., nitrocellulose filter, glass slide, silicon chip). In addition, either the ditags or oligonucleotide probes are labeled with a detectable label, for example, with a radioisotope, a fluorescent compound, a bioluminescent compound, a chemi-luminescent compound, a metal chelator, or an enzyme. Those of ordinary skill in the art will know of other suitable labels for binding to the ditag, or will be able to ascertain such using routine experimentation. For example, PCR can be performed with labeled (e.g., fluorescein tagged) primers.
The ditags are separated into single-stranded molecules which are preferably serially diluted and added to a solid support (e.g., a silicon chip as described by Fodor et al. Science 251:767, 1991) containing oligonucleotides representing, for example, every possible permutation of a 10-mer (e.g., in each grid of a chip). The solid support is then used to determine differential expression of the tags contained within that support (e.g., on a grid on a chip) by hybridization of the oligonucleotides on the solid support with tags produced from cells under different conditions (e.g., different stages of development, growth of cells in the absence and presence of a growth factor, normal versus transformed cells, comparison of different tissue expression, etc.). In the case of fluoresceinated end labeled ditags, analysis of fluorescence is indicative of hybridization to a particular 10-mer. When the immobilized oligonucleotide is fluoresceinated, for example, a loss of fluorescence due to quenching (by the proximity of the hybridized ditag to the labeled oligo) is observed and is analyzed for the pattern of gene expression.
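By way of non-limiting illustration, the size of a support carrying every possible 10-mer follows from the four-letter nucleotide alphabet, there being 4^10 = 1,048,576 distinct sequences. The sketch below enumerates such sequences; the function names are hypothetical.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Yield every sequence of length k over the alphabet, in sorted order."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


def probe_count(k, alphabet_size=4):
    """Number of distinct oligonucleotides of length k."""
    return alphabet_size ** k
```

Because all_kmers is a generator, the full set of 10-mers can be iterated over without holding it in memory at once.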
Computational Analysis
After the polynucleotide information is obtained, it is analyzed to identify polynucleotides that correspond to genes that are differentially expressed between the two or more cell types. It is within the scope of this invention to perform the method described above using previously identified and stored sequence information that defines and identifies expressed genes. This information can be obtained from private, publicly available and commercially available sequence databases.
For example, a cell or tissue is first selected for having a phenotype which is dependent on the presence of one gene product within the sample cells, e.g., cells that secrete a biological factor whose activity can be measured in an in vitro assay, cells that stain with an antibody that recognizes a specific antigen, or cells that are lysed by cytotoxic T cells that recognize a specific antigen. The cells are then further selected to identify sample cells that exhibit extremes of the chosen phenotype and ideally are matched in all other respects or phenotypic characteristics. For example, cells that are matched, e.g., from the same individual, would minimize having to deal with histocompatibility differences. Ideally one selects two examples of sample cells (say "A" and "B") that exhibit the chosen phenotype prominently and two examples of sample cells (say "C" and "D") that do not have the phenotype at all.
Using the method of this invention, polynucleotides present in library form from each cell sample are isolated and their relative expression noted. The individual libraries are sequenced and the information regarding sequence and, in some embodiments, relative expression is stored in any functionally relevant program, e.g., in Compare Report using the SAGE software (available through Dr. Ken Kinzler at Johns Hopkins University). The Compare Report provides a tabulation of the polynucleotide sequences and their abundance for the samples (say A, B, C and D above) normalized to a defined number of polynucleotides per library (say 25,000). This is then imported into MS-ACCESS either directly or via copying the data into an Excel spreadsheet first and then from there into MS-ACCESS for additional manipulations. Other programs such as SYBASE or Oracle that permit the comparison of polynucleotide numbers could be used as alternatives to MS-ACCESS.
Enhancements to the software can be designed to incorporate these additional functions. These functions consist of standard Boolean, algebraic, and text search operations, applied in various combinations to reduce a large input set of polynucleotides to a manageable subset of polynucleotides of specifically defined interest. The researcher may create groups containing one or more project(s) by combining the counts of specific polynucleotides within a group (e.g., GroupNormal = Normal1 + Normal2, GroupTumor = PrimaryTumor1 + TumorCellLine). Additional characteristic values are also calculated for each tag in the group (e.g., average count, minimum count, maximum count). The researcher may calculate individual tag count ratios between groups, for example the ratio of the average GroupNormal count to the average GroupTumor count for each polynucleotide. The researcher may calculate a statistical measure of the significance of observed differences in tag counts between groups.
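By way of non-limiting illustration, the normalization, grouping and ratio calculations described above can be sketched outside of any particular database program. The sketch assumes that each library is a mapping from tag sequence to raw count; the function names and the handling of a zero denominator are assumptions, and the sketch is not the SAGE software or MS-ACCESS.

```python
def normalize(library, per_library=25_000):
    """Scale raw tag counts to a defined number of tags per library."""
    total = sum(library.values())
    return {tag: count * per_library / total for tag, count in library.items()}


def group_stats(libraries):
    """Combine libraries into a group; report sum, average, min, max per tag."""
    all_tags = set().union(*libraries)
    stats = {}
    for tag in all_tags:
        counts = [library.get(tag, 0) for library in libraries]
        stats[tag] = {
            "sum": sum(counts),
            "average": sum(counts) / len(counts),
            "minimum": min(counts),
            "maximum": max(counts),
        }
    return stats


def average_ratio(stats_a, stats_b, tag):
    """Ratio of the average count in group A to that in group B for one tag.

    Returns infinity when the tag is absent from group B but present in
    group A, and None when it is absent from both (an assumed convention).
    """
    average_a = stats_a[tag]["average"] if tag in stats_a else 0.0
    average_b = stats_b[tag]["average"] if tag in stats_b else 0.0
    if average_b == 0:
        return float("inf") if average_a > 0 else None
    return average_a / average_b
```

For instance, group_stats([normal1, normal2]) would play the role of GroupNormal, and average_ratio applied to the GroupNormal and GroupTumor statistics gives the per-tag ratio mentioned above.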
To identify the polynucleotides within MS-ACCESS, a query to sort polynucleotide tags based on their abundance in the sample cells is run. The output from the Query report lists specific polynucleotides (by sequence) that fit the sorting criteria and their abundance in the various sample cells.
The sorting is based on the principle that the gene product of interest (and hence the corresponding polynucleotide) is more abundant in the samples that prominently exhibit the chosen phenotype than in samples that do not exhibit the phenotype.
For example, one may query to identify polynucleotides that are present at a level of 10 or more in samples A and B and less than 1 in samples C and D. The results of the search might reveal that 5 different polynucleotides fit the sorting criteria; hence there are 5 candidate genes to be tested to determine whether they confer the phenotype when transferred into samples like C and D that do not have the phenotype.
The more stringent the sorting criteria, the more efficient the sorting should be. Thus if one asked for polynucleotides that are at 5 copies or more in samples A and B and less than 5 copies in samples C and D, a large number of candidates would be generated. However, if one can increase the differential because the samples manifest extremes of the phenotype (say >10 in samples A and B and <1 in samples C and D) this restricts the number of candidates that will be identified.
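By way of non-limiting illustration, the sorting query described above can be expressed as a simple filter. The sketch assumes normalized counts held as a mapping from tag to per-sample counts; the function name and parameters are hypothetical, and the default thresholds mirror the example of 10 or more in samples A and B and less than 1 in samples C and D.

```python
def query(counts, present_in, absent_in, min_present=10, max_absent=1):
    """Return tags at or above min_present in every sample of present_in
    and below max_absent in every sample of absent_in.

    counts maps tag -> {sample name: normalized count} (assumed layout).
    """
    hits = []
    for tag, by_sample in counts.items():
        high = all(by_sample.get(s, 0) >= min_present for s in present_in)
        low = all(by_sample.get(s, 0) < max_absent for s in absent_in)
        if high and low:
            hits.append(tag)
    return sorted(hits)
```

Calling query with min_present=5 and max_absent=5 on the same data returns a superset of the stringent result, which is the effect of stringency on the number of candidates described above.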
Prior knowledge of what amount of gene product (hence abundance of polynucleotides) is required to confer the phenotype is not essential as one can arbitrarily select a set of sorting parameters, run the data analysis, and identify and test candidates. If the desired candidate is not found, the stringency of the sorting criteria can be reduced (i.e., reduce the differential) and the new candidates that are found can be tested. Iterative cycles of sorting and testing candidates should eventually culminate in the successful recovery of the desired candidate.
Table 1

Cycle   Sorting Criteria                                   Number of     Number of Candidates
                                                           Candidates    to Evaluate
1       ≥10 in samples A and B; ≤1 in samples C and D      10            10
        (minimum differential 10x)
2       ≥5 in samples A and B; ≤2 in samples C and D       30            20*
        (minimum differential 2.5x)
3       ≥5 in samples A and B; ≤5 in samples C and D       80            50#
        (minimum differential 1x)

*Of the 30 candidates, 10 will have already been evaluated in cycle 1, so only 20 new candidates need to be evaluated.
#Of the 80 candidates, 30 will have already been evaluated (10 in cycle 1, 20 in cycle 2), so only 50 need to be evaluated.
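By way of non-limiting illustration, the iterative cycles of Table 1 can be sketched as a loop that relaxes the sorting criteria and skips candidates already evaluated in an earlier cycle. The thresholds are treated as inclusive bounds, which is an assumption; confers_phenotype is a hypothetical callback standing in for the biological assay, and all names are illustrative.

```python
def iterative_search(counts, cycles, confers_phenotype, present_in, absent_in):
    """Test candidates cycle by cycle until one confers the phenotype.

    counts maps tag -> {sample name: normalized count}.
    cycles is a list of (min_present, max_absent) pairs, most stringent first.
    Returns (tag or None, number of candidates evaluated).
    """
    evaluated = set()
    for min_present, max_absent in cycles:
        candidates = {
            tag
            for tag, by_sample in counts.items()
            if all(by_sample.get(s, 0) >= min_present for s in present_in)
            and all(by_sample.get(s, 0) <= max_absent for s in absent_in)
        }
        # Only candidates not tested in an earlier cycle need evaluation.
        for tag in sorted(candidates - evaluated):
            evaluated.add(tag)
            if confers_phenotype(tag):
                return tag, len(evaluated)
    return None, len(evaluated)
```

With cycles [(10, 1), (5, 2), (5, 5)] the loop follows the three cycles of Table 1, and the size of the evaluated set corresponds to the running total of candidates tested.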
Knowledge of what amount of gene product (hence abundance of polynucleotide) is required to confer the phenotype will permit the rational use of stringent sorting criteria and greatly accelerate the search process as the desired gene may be captured within a handful of candidates.
Establishing what amount of gene product is required to confer a specific phenotype will be dependent on the specific phenotype in question and the sensitivity of assays that measure that phenotype.
For instance, the inventor has found that a frequency of 1/5000 (5 copies of a SAGE tag normalized to a library size of 25,000) correlates with sufficient expression of a tumor antigen within the sample cell to render it sensitive to lysis by an antigen specific T cell while a frequency of 1/25,000 correlates with the cell being weakly sensitive to lysis.
Thus, one could use sorting criteria of >5 in sample cells that are susceptible to lysis and <1 in sample cells that are not susceptible to lysis to home in on a candidate tumor antigen.
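By way of non-limiting illustration, the correspondence between a tag frequency and a tag count at a given library size can be checked with exact arithmetic. The function name is hypothetical; the default library size of 25,000 is the normalization used in the example above.

```python
from fractions import Fraction


def copies_per_library(frequency, library_size=25_000):
    """Convert a tag frequency into an expected tag count per library."""
    return frequency * library_size


# A frequency of 1/5000 corresponds to 5 copies per 25,000 tags,
# and a frequency of 1/25,000 corresponds to 1 copy.
strong = copies_per_library(Fraction(1, 5000))
weak = copies_per_library(Fraction(1, 25_000))
```

Fractions are used here only so that the conversion is exact; ordinary floating point values would serve equally well for thresholding.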
Accordingly, one enters the individual polynucleotide sequences from the Query report into the program to determine if there is a match with any known genes or whether they are potentially novel (no match=NM). One then retrieves cDNAs corresponding to specific sequences from the Query Report and tests them individually in an appropriate biological assay to determine if they confer the phenotype. Of the candidates that correspond to known genes, it is a relatively easy task to obtain complementary DNAs for these candidates and test them individually to determine if they confer the specific phenotype in question when transferred into cells that do not exhibit the phenotype. If none of the known genes confer the phenotype, one retrieves the cDNAs corresponding to the No Match sequences of the Query Report by PCR cloning and tests the novel cDNAs individually for their ability to confer the phenotype. If the assumptions made up to this point are sound (i.e., a single gene product can confer the phenotype; the sorting criteria are not too stringent so as to exclude the desired candidate) then a cDNA corresponding to one of the candidates of the Query Report will be found to confer the phenotype and the search is over. If however none of the candidates are found to confer the phenotype then one may need to reduce the stringency of the sorting parameters to "cast a wider net" and capture more candidates to be tested as above.
In one embodiment, the polynucleotide or gene sequence can also be compared to a sequence database, for example, using a computer method to match a sample sequence with known sequences. Sequence identity can be determined by a sequence comparison using, e.g., sequence alignment programs that are known in the art, such as those described in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F.M. Ausubel et al., eds., 1987) Supplement 30, section 7.7.18, Table 7.7.1. A preferred alignment program is ALIGN Plus (Scientific and Educational Software, Pennsylvania), preferably using default parameters, which are as follows: mismatch = 2; open gap = 0; and extend gap = 2. Another preferred program is the BLAST program for alignment of two nucleotide sequences, using default parameters as follows: open gap = 50; extension gap = 2 penalties; gap x dropoff = 0; expect = 10; word size = 11. The BLAST program is available at the following Internet address: http://www.ncbi.nlm.nih.gov. Alternatively, hybridization under conditions of high, moderate and low stringency can also indicate degree of sequence identity.
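To make the role of such penalty parameters concrete, the following is a minimal sketch of a penalty-minimizing global alignment. It is not the algorithm used by ALIGN Plus or BLAST; it assumes that the quoted values are penalties, that a match costs nothing, and that with open gap = 0 a gap simply costs extend gap = 2 per base. The function name is hypothetical.

```python
def alignment_penalty(a, b, mismatch=2, gap=2):
    """Minimum total penalty of a global alignment of sequences a and b.

    Matches cost 0, mismatches cost `mismatch`, and every gapped base
    costs `gap` (assumption: open gap = 0, so gaps are charged per base).
    """
    # prev[j] holds the best penalty for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            substitute = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            delete = prev[j] + gap       # gap in b
            insert = curr[j - 1] + gap   # gap in a
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


print(alignment_penalty("CATG", "CATG"))
print(alignment_penalty("CATG", "CTTG"))
```

A lower penalty indicates a closer match between the sample sequence and the database sequence.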
Phenotypes amenable to study using the methods of the invention
In one aspect of the invention, genes and gene products associated with cancer and neoplastic cells are determined. Additionally, the methods of the present invention can be used to establish correlations between the phenotype and the SAGE tag genotype of a variety of other types of cell. For example, in other aspects, the methods of the invention can be used in the identification of gene products associated with genetic disease, inherited disease and/or acquired diseases. Gene products associated with drug resistance and drug metabolism can also be identified. Identification of genes associated with drug metabolism will have important applications in the field of pharmacogenomics, wherein an individual's response to a particular therapeutic is determined, so as to maximize therapeutic value and minimize side effects. In additional aspects, the methods of the invention are used in the identification of gene products that confer some measurable biological activity on a mature or differentiated population of cells, wherein the activity is not exhibited by immature or undifferentiated precursors.
For example, a class of T-lymphocytes known as cytotoxic T-lymphocytes is able to recognize and lyse a target cell, whereas other types of T-lymphocyte are capable of recognition but incapable of lysis. Using the methods of the invention, it is possible to identify genes that are responsible for this difference, i.e., genes whose expression specifically enables lysis of a target cell by a cytotoxic T-lymphocyte.
It will be clear to the skilled artisan that the ability to determine the phenotype of a cell, and hence to establish a correlation between its phenotype and its SAGE genotype, will be dependent upon the sensitivity, specificity and/or complexity of the assays used to establish the phenotype of the cell. For example, a phenotype, such as metastatic potential, which is likely to depend upon multiple factors, may be more difficult to establish than a phenotype whose magnitude is dependent on the relative abundance of a single specific transcript.
The following example is intended to illustrate, but not limit, the invention as defined herein.
There are several alternatives to the use of conventional sequencers to generate sequence information on the polynucleotides:
1) Hybridizing tags, or preferably amplified ditags, against oligonucleotide sequences fixed to a solid matrix such as nitrocellulose filters, glass slides or silicon chips ("parallel sequence analysis", or PSA); or
2) Performing limiting dilutions on the ditag (or concatenate) preparations and then sequencing individual DNAs, either with or without prior amplification, by techniques that include, for example, mass spectroscopy (clonal sequencing, or "CS").
PSA:
In a preferred embodiment of PSA, the following steps are carried out with ditags:
1) Ditags are prepared, amplified and cleaved with the anchoring enzyme as defined by SAGE technology:
OOOOOOOOOOXXXXXXXXXXCATG
GTACOOOOOOOOOOXXXXXXXXXX
3) 4-base oligomers containing an identifier (e.g., a fluorescent moiety, FL) are prepared that are complementary to the overhangs:
FL-CATG
4) The FL-CATG oligomers (in excess) are ligated to the ditags:
FL-CATGOOOOOOOOOOXXXXXXXXXXCATG
GTACOOOOOOOOOOXXXXXXXXXXGTAC-FL
5) The ditags are purified and melted to yield single-stranded DNAs:
FL-CATGOOOOOOOOOOXXXXXXXXXXCATG
GTACOOOOOOOOOOXXXXXXXXXXGTAC-FL
6) The mixture of single-stranded DNAs is serially diluted.
7) Each serial dilution is hybridized under appropriate stringency conditions with solid matrices containing gridded single-stranded oligonucleotides; all of the oligonucleotides contain a half-site of the anchoring enzyme cleavage sequence. In the example used herein, the oligonucleotide sequences contain a CATG sequence at the 5' end: CATGOOOOOOOOOO, CATGXXXXXXXXXX, etc. (or alternatively a GTAC sequence at the 3' end: OOOOOOOOOGTAC). The matrices are constructed of any material known in the art, and the oligonucleotide-bearing chips are generated by any procedure known in the art, e.g., silicon chips containing oligonucleotides prepared by the VLSIP procedure. See, for example, U.S. Patent No. 5,424,186.
8) The oligonucleotide-bearing matrices are evaluated for the presence or absence of a fluorescent ditag at each position in the grid.
In a preferred embodiment, there are 4^10, or 1,048,576, oligonucleotides on the grid of the general sequence CATGOOOOOOOOOO, such that every possible 10-base sequence is represented 3' to the CATG. Since there are estimated to be no more than 100,000 to 200,000 different expressed genes in the human genome, there are enough oligonucleotide sequences to identify all of the possible sequences adjacent to the 3'-most anchoring enzyme site observed in the cDNAs from the expressed genes in the human genome.
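The grid size follows directly from four possible bases at each of ten positions. The sketch below checks the arithmetic and shows one possible way to assign each 10-base sequence a unique grid index; the base-4 indexing scheme and the function names are assumptions made for illustration and are not described in the specification.

```python
BASES = "ACGT"
TAG_LENGTH = 10
ANCHOR = "CATG"  # anchoring enzyme half-site used in the example


def grid_size(tag_length=TAG_LENGTH):
    """Number of distinct sequences of the given length over four bases."""
    return len(BASES) ** tag_length


def tag_to_index(tag):
    """Read the tag as a base-4 number (A=0, C=1, G=2, T=3)."""
    index = 0
    for base in tag:
        index = index * len(BASES) + BASES.index(base)
    return index


def index_to_oligo(index, tag_length=TAG_LENGTH):
    """Inverse of tag_to_index, returning the full gridded oligonucleotide."""
    digits = []
    for _ in range(tag_length):
        index, remainder = divmod(index, len(BASES))
        digits.append(BASES[remainder])
    return ANCHOR + "".join(reversed(digits))


print(grid_size())
print(index_to_oligo(tag_to_index("CCTGGTCAAG")))
```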
A determination is made of differential expression by comparing the fluorescence profile on the grids at different dilutions among different libraries. For example:
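(FL = fluorescent signal detected at that grid position; - = no signal)

Library A, Ditags Diluted 1:10
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    -    -    FL
3    -    FL   FL   -    -
4    -    -    -    FL   -
5    FL   -    -    -    -

Library B, Ditags Diluted 1:10
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    FL   -    FL
3    -    FL   FL   -    -
4    -    -    -    -    -
5    FL   -    -    -    FL

Library A, Ditags Diluted 1:50
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    -    -    -
3    -    FL   -    -    -
4    -    -    -    FL   -
5    FL   -    -    -    -

Library B, Ditags Diluted 1:50
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    FL   -    -
3    -    FL   FL   -    -
4    -    -    -    -    -
5    -    -    -    -    -

Library A, Ditags Diluted 1:100
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    -    -    -
3    -    FL   -    -    -
4    -    -    -    FL   -
5    FL   -    -    -    -

Library B, Ditags Diluted 1:100
     A    B    C    D    E
1    FL   -    -    -    -
2    -    -    FL   -    -
3    -    FL   -    -    -
4    -    -    -    -    -
5    -    -    -    -    -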
The individual oligonucleotides thus hybridize to ditags with the following characteristics:
Dilution    1:10            1:50            1:100
            Lib A   Lib B   Lib A   Lib B   Lib A   Lib B
1A          +       +       +       +       +       +
2C          -       +       -       +       -       +
2E          +       +       -       -       -       -
3B          +       +       +       +       +       +
3C          +       +       -       +       -       -
4D          +       -       +       -       +       -
5A          +       +       +       -       +       -
5E          -       +       -       -       -       -
From the summary table, it is concluded that tags hybridizing to 1A and 3B reflect highly abundant mRNAs that are not differentially expressed (since the tags hybridize to both libraries at all dilutions); that 2C is a highly abundant mRNA, but only in Library B; and that 4D is highly abundant, but only in Library A. 2E reflects a low abundance transcript (since it is detected only at the lowest dilution, 1:10) that is not found to be differentially expressed; 3C reflects a moderately abundant transcript (since it is detected only at the lower two dilutions, 1:10 and 1:50) in Library B that is expressed at low abundance in Library A. 4D reflects a differentially-expressed, high abundance transcript restricted to Library A; 5A reflects a transcript that is expressed at high abundance in Library A but only at low abundance in Library B; and 5E reflects a differentially-expressed (in Library B), low abundance transcript.
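The scoring logic of the preceding paragraph can be written out as a small Python sketch. The detection patterns are transcribed from the summary of the example grids; the data layout, the level names and the function names are assumptions introduced here for illustration only.

```python
DILUTIONS = ("1:10", "1:50", "1:100")

# Detection pattern per grid position, one character per entry of DILUTIONS:
# (Library A, Library B); "+" = signal detected, "-" = no signal.
SUMMARY = {
    "1A": ("+++", "+++"),
    "2C": ("---", "+++"),
    "2E": ("+--", "+--"),
    "3B": ("+++", "+++"),
    "3C": ("+--", "++-"),
    "4D": ("+++", "---"),
    "5A": ("+++", "+--"),
    "5E": ("---", "+--"),
}

# A tag still detected at higher dilutions is scored as more abundant.
LEVELS = ("absent", "low", "moderate", "high")


def abundance(pattern):
    """Map the number of dilutions with a signal to an abundance level."""
    return LEVELS[pattern.count("+")]


def classify(position):
    """Return (level in Library A, level in Library B, differentially expressed?)."""
    level_a, level_b = (abundance(p) for p in SUMMARY[position])
    return level_a, level_b, level_a != level_b


for position in sorted(SUMMARY):
    print(position, classify(position))
```

Counting dilutions is the simplest possible scoring rule; a real analysis would presumably also check that the signal disappears monotonically as the dilution increases.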
In another PSA embodiment, step 3 above does not involve the use of a fluorescent or other identifier; instead, at the last round of amplification of the ditags, fluoresceinated dNTPs are used so that half of the molecules are probed on the chips. In yet another PSA embodiment, instead of ditags, a particular portion of the transcript is used, e.g., the sequence between the 3' terminus of the transcript and the first anchoring enzyme site. In that particular case, a double-stranded cDNA reverse transcript is generated as described in WO 97/10363. The transcripts are cut with the anchoring enzyme, a linker containing a PCR primer is added, and amplification is initiated (using the primer at one end and the A tail at the other) while the transcripts are still on the streptavidin bead. At the last round of amplification, fluoresceinated dNTPs are used so that half of the molecules can be probed on the chip. The linker-primer is optionally removed with the anchoring enzyme at this point in order to reduce the size of the fragments. The soluble fragments are then melted and captured on solid matrices containing CATGOOOOOOOOOO, as in the previous example. Analysis and scoring (only of the half of the fragments which contain fluoresceinated bases) are as described above.
CS:
Ditags or concatemers are diluted and added to wells or other receptacles so that, on average, the wells contain fewer than one DNA molecule per well (as is done in limiting dilution for cell cloning). Each well then receives reagents for PCR or another amplification process, and the DNA in each receptacle is sequenced, e.g., by mass spectroscopy. The result for each receptacle is either a single sequence (there having been a single DNA molecule in that receptacle), a "null" sequence (no DNA present) or a double sequence (more than one DNA molecule), which is discarded. Thereafter, assessment of differential expression is the same as defined by the SAGE technique.
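The expected proportions of null, single and double wells can be estimated with a short calculation. The sketch below assumes that molecules are distributed among wells according to a Poisson distribution, which is the usual model for limiting dilution but is not stated in the specification; the function name and the example mean of 0.3 molecules per well are hypothetical.

```python
import math


def well_outcome_probabilities(mean_molecules_per_well):
    """Poisson probabilities that a well holds 0, exactly 1, or 2+ DNA molecules."""
    lam = mean_molecules_per_well
    p_null = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multiple = 1.0 - p_null - p_single
    return {"null": p_null, "single": p_single, "multiple": p_multiple}


# Example: an assumed average of 0.3 molecules per well.
for outcome, probability in well_outcome_probabilities(0.3).items():
    print(f"{outcome}: {probability:.3f}")
```

Lowering the mean reduces the fraction of wells that must be discarded as double sequences, at the cost of more empty wells.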
The preceding discussion and examples are intended merely to illustrate the art. As is apparent to one of skill in the art, various modifications can be made to the above without departing from the spirit and scope of this invention.
Example 1
Investigators have sought to elicit antigen-specific T cell responses in the hope of creating an anti-tumor immune response that might lead to the eradication of tumor cells. To date, four classes of tumor antigens have been identified: differentiation antigens, which are self proteins over-expressed by tumor cells; viral antigens such as HPV16 E6 and E7; the cancer/testis family of antigens typified by MAGE; and mutated proteins such as ras or p53. Of the differentiation antigens, the vast majority are melanoma-associated antigens, and attempts to identify self antigens over-expressed by lung, prostate, breast or colon carcinomas that might be good candidates as targets for cytotoxic T cells have largely been unsuccessful. Thus the vast majority of cancer immunotherapy trials conducted to date have been for the treatment of melanoma, and little by way of immunotherapy is available to offer patients suffering from other malignant diseases.
The present invention calls for the use of genes differentially expressed in target cells in the design of a vaccine to generate an immune response against the target cells. The inventors have applied a SAGE analysis (described in U.S. Patent No. 5,695,937) to identify a variety of transcripts that are differentially expressed in cancer cells and that have not previously been associated with tumor cells. Melanoma cell lines differentially susceptible to lysis by a gp100-specific cytotoxic T lymphocyte (CTL) were subjected to SAGE analysis to determine which SAGE tags were shared amongst the cell lines that were susceptible to lysis and were absent or less abundant in cell lines that were not susceptible to lysis. Ten SAGE polynucleotides matched the sorting criteria and were found to be represented at a higher level in the cell lines identified as 624mel and 1300mel (which are susceptible to lysis) than in the cell lines identified as BA1 and A375 (which are not susceptible to lysis). Two different polynucleotides corresponding to the differentially spliced forms of the gp100 mRNA were identified within the set of differentially expressed genes, indicating that it is possible to rapidly narrow down the candidates; in addition, 8 other tag sequences were found, including a tag corresponding to cdc2-related protein kinase (Table 2). At the same time other differentially expressed genes were identified. Thus, by virtue of the fact that the identified genes were overexpressed, some may be candidates for use in immunotherapy.
Table 2
COMPARISON OF MELANOMA CELL LINE SAGE DATA

Criterion   <5    <5     >10   >10
Cell line   BA1   A375   624   1300   GENE
            0     0      206   92     gp100 melanocyte lineage-specific antigen
            0     0      65    18     gp100 melanocyte lineage-specific antigen
            0     0      60    16     calpain-skeletal muscle protein
            1     4      18    25     Mitochondrial
            1     4      18    11     Biliary glycoprotein
            3     3      47    34     microsomal epoxide hydrolase gene
            3     4      26    14     NM
            3     4      18    13     NM
            4     4      72    27     cdc2-related protein kinase mRNA
            4     4      20    11     ATP synthase subunit c
NM = no match
It was reasoned that if additional melanoma cell lines were included in the analysis, one might be able to restrict further the number of SAGE polynucleotides corresponding to expressed transcripts that could encode the cognate antigen. Two non-HLA-A2 melanoma cell lines (NM455 and SK28) were chosen, and their phenotypes (susceptibility to lysis by a gp100-specific CTL) were established following transduction with an adenoviral vector encoding HLA-A2. While SK28 was a good target for the gp100-specific CTL, NM455 was not. The HLA-A2 negative cell lines were subjected to SAGE analysis, and SAGE polynucleotides were sorted to identify polynucleotides common to lines that are susceptible to lysis and less abundant in lines that are less susceptible to lysis (see Table 3). Of the two polynucleotides that matched the sorting criteria, one was the gp100 tag CCTGGTCAAG. Thus, by conducting the SAGE analysis of 6 different melanoma cell lines that are differentially susceptible to lysis by an HLA-restricted CTL, one is able to focus on just 2 transcripts that were candidates for the cognate antigen, one of which was the desired target.
Table 3
Comparison of Melanoma Cell Line SAGE Data

Criterion   <2     <2    <2      >5     >5    >5
Cell line   A375   BA1   NM455   SK28   624   1300   GENE
            0      0     1       6      200   89     gp100 antigen
            0      1     0       8      6     7      tag 9
Example 2
Melanoma and breast cancer cell lines exhibiting differential immunoreactivity to an anti-HER-2 antibody, as judged by FACS analysis, were subjected to SAGE analysis to determine which SAGE polynucleotides were shared amongst the cell lines that showed a high mean fluorescence signal and were less abundant in cell lines that showed a lower mean fluorescence signal. Four SAGE polynucleotides matched the sorting criteria and were found to be represented at a higher level in cell lines 21PT and 21MT (which show a strong fluorescence signal) than in cell lines MDA-468, SK28, BA1, NM455 and 1300mel (which show a weaker fluorescence signal) (Table 4). One tag corresponding to HER-2 was identified; in addition, 3 other tag sequences were found, including a tag corresponding to integrin alpha-3. While HER-2 has previously been identified as a target for patient-derived T cells, it has not been reported that integrin alpha-3 can also be a target for patient-derived immune effector cells or antibodies. Thus, the gene encoding integrin alpha-3, or the corresponding gene product or peptide fragments thereof, can be used to provoke an immune response to target cells that differentially express integrin alpha-3.
While integrin alpha-3 was used for this example, any differentially expressed gene or genes (identified by SAGE) and their corresponding proteins or peptide fragments could be used to provoke an anti-target cell immune response.
Table 4
Identification of the Antigen Recognized by an Antibody

Cell Line   Mean Fluorescence
21PT        35.2
21MT        33.4
MDA-468     3.1
SK28        7.4
BA1         8.9
NM455       11.1
1300        14.7

>10         <5
A     B     C     D     E     F     G     Gene
66    11    2     0     0     0     1     NM
21    21    1     1     0     1     3     AL0096
11    25    0     0     1     2     2     HER2
11    15    0     0     4     3     0     integrin alpha-3
NM = no match

Claims

1. A method for identifying a polynucleotide fragment of a gene conferring a selected phenotype to a sample cell, wherein the method comprises the following steps:
(a) obtaining a set of polynucleotides representing gene expression in two or more sample cells;
(b) obtaining a set of polynucleotides representing gene expression in one or more control cells; and
(c) identifying a unique polynucleotide, the unique polynucleotide representing a gene that is common to the two or more sample cells and differentially expressed in the sample cells compared to the control cell.
2. The method of claim 1, further comprising identifying the gene corresponding to the unique polynucleotide identified in (c), thereby identifying the gene.
3. The method according to claim 1, wherein the unique polynucleotide represents a gene that is overexpressed or underexpressed in at least one of the sample cells compared to the control cell.
4. The method according to claim 1, wherein more than one control cell type is used.
5. The method according to claim 1, wherein at least one of the sample cells is a neoplastic cell.
6. The method according to claim 1, wherein at least one of the sample cells secretes a molecule, protein or factor.
7. The method according to claim 5, wherein the neoplastic cell is selected from the group consisting of a breast cancer cell, a colon cancer cell, a lung cancer cell, a pancreatic cancer cell, a prostate cancer cell, and a melanoma.
8. The method according to claim 5, wherein the neoplastic cell is selected from the group consisting of a leukemia cell, a lymphoma cell and a myeloma cell.
9. The method according to claim 5, wherein the control cell is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
10. The method according to claim 5, wherein the neoplastic cell is obtained from a tumor.
11. The method according to claim 5, wherein the neoplastic cell is selected from the group consisting of a breast cancer cell, a colon cancer cell, a lung cancer cell, a pancreatic cancer cell, a prostate cancer cell, and a melanoma.
12. The method according to claim 5, wherein the neoplastic cell is selected from the group consisting of a leukemia cell, a lymphoma cell and a myeloma cell.
13. The method according to claim 5, wherein at least one of the control cells is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
14. The method according to claim 1, wherein the gene encodes a peptide which participates in recognition of at least one of the sample cells by an immune effector cell.
15. The method according to claim 14, wherein the immune effector cell is a T-lymphocyte.
16. The method according to claim 14, wherein the immune effector cell is a B-lymphocyte.
17. The method according to claim 14, wherein the immune effector cell is a NK-cell.
18. The method according to claim 1, wherein one or more of the sample cells express a surface marker that is recognized by an immune effector cell.
19. The method according to claim 4, wherein the gene encodes a peptide which participates in recognition of at least one of the sample cells by an immune effector cell.
20. The method according to claim 19, wherein the immune effector cell is a T-lymphocyte.
21. The method according to claim 19, wherein the immune effector cell is a B-lymphocyte.
22. The method according to claim 19, wherein the immune effector cell is a NK cell.
23. The method according to claim 19, wherein at least one of the sample cells is a neoplastic cell that is lysed by a cytotoxic T-lymphocyte.
24. The method according to claim 23, wherein the control cell is a cell that is compatible for lysis by the cytotoxic T-lymphocyte but not lysed by the cytotoxic T-lymphocyte.
25. The method according to claim 24, wherein at least one of the control cells is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
26. The method according to claim 25, wherein at least one of the control cells is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
27. The method according to claim 1, wherein at least one of the sample cells is drug-resistant.
28. The method according to claim 5, wherein at least one of the sample cells is drug resistant.
29. The method according to claim 1, wherein at least one of the sample cells has the ability to stimulate angiogenesis.
30. The method according to claim 1, wherein at least one of the sample cells is infected with a pathogen.
31. The method according to claim 31, wherein the control cell comprises an uninfected cell.
32. The method according to claim 31, wherein the pathogen is resistant to a drug or antibiotic.
33. The method according to claim 31, wherein the pathogen confers resistance to a drug or antibiotic.
34. The method according to claim 1, wherein at least one of the sample cells is an apoptotic cell.
35. The method according to claim 1, wherein at least one of the sample cells is a hyperproliferative cell.
36. The method according to claim 1, wherein the selected phenotype is associated with a genetic disease.
37. The method according to claim 1, wherein the selected phenotype is associated with altered metabolic activity.
38. The method according to claim 1, wherein the selected phenotype is associated with senescence.
39. The method according to claim 1, wherein the selected phenotype is associated with apoptosis.
40. The method according to claim 1, wherein the selected phenotype is associated with drug metabolism.
41. The method according to claim 1, wherein the selected phenotype is associated with an allergic reaction.
42. The method according to claim 1, wherein the sample cell is an animal cell.
43. The method according to claim 42, wherein the sample cell is a mammalian cell.
44. The method according to claim 1, wherein the sample cell is a plant cell.
45. The method according to claim 1, wherein the sample cell is a microorganism.
46. The method according to claim 1, wherein at least one of the sample cells is a differentiated cell.
47. The method according to claim 46, wherein the control cell is a cell in an earlier state of differentiation than that of the differentiated sample cell.
48. The method according to claim 1, wherein the gene encodes a secreted biological factor.
49. A method for identifying one or more polynucleotides corresponding to one or more secreted biological factors, wherein the method comprises the following steps:
(a) obtaining a set of polynucleotides representing gene expression in one or more sample cells that secrete the factor;
(b) obtaining a set of polynucleotides representing gene expression in one or more control cells that do not secrete the factor;
(c) identifying one or more unique polynucleotides, wherein the unique polynucleotides are common to the sample cells, the unique polynucleotides being absent or expressed at lower levels in the control cells.
50. The method of claim 49, further comprising determining the genes corresponding to the polynucleotides identified in (c), thereby identifying one or more secreted biological factors.
51. A method for identifying a therapeutic target, wherein the method comprises the following steps:
(a) obtaining a set of polynucleotides representing gene expression in two or more sample cells;
(b) obtaining a set of polynucleotides representing gene expression in one or more control cells; and
(c) identifying a unique polynucleotide, the unique polynucleotide representing a gene that is common to the two or more sample cells and differentially expressed in the sample cells compared to the control cell.
52. The method of claim 51, further comprising determining the gene corresponding to the unique polynucleotide identified in (c), thereby identifying the gene.
53. The method according to claim 51, wherein the unique polynucleotide represents a gene that is overexpressed in at least one of the sample cells compared to the control cells.
54. The method according to claim 51, wherein more than one control cell type is used.
54. The method according to claim 51, wherein at least one of the sample cells is a neoplastic cell.
55. The method according to claim 53, wherein at least one of the control cells is a neoplastic cell.
56. The method according to claim 55, wherein the neoplastic cell is obtained from a tumor.
57. The method according to claim 54, wherein the neoplastic cell is selected from the group consisting of a breast cancer cell, a colon cancer cell, a lung cancer cell, a pancreatic cancer cell, a prostate cancer cell, and a melanoma.
58. The method according to claim 54, wherein the neoplastic cell is selected from the group consisting of a leukemia cell, a lymphoma cell and a myeloma cell.
59. The method according to claim 55, wherein the control cell is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
60. The method according to claim 51, wherein at least one of the sample cells and at least one of the control cells is a neoplastic cell of the same or different tumor type.
61. The method according to claim 60, wherein the neoplastic cell is selected from the group consisting of a breast cancer cell, a colon cancer cell, a lung cancer cell, a pancreatic cancer cell, a prostate cancer cell, and a melanoma.
62. The method according to claim 60, wherein the neoplastic cell is selected from the group consisting of a leukemia cell, a lymphoma cell and a myeloma cell.
63. The method according to claim 53, wherein at least one of the control cells is selected from the group consisting of a counterpart normal cell type, a counterpart benign cell type, a counterpart non-metastatic cell type and a non-neoplastic precursor of the neoplastic cell.
64. The method according to claim 60, wherein the gene encodes a peptide which participates in recognition of at least one of the sample cells by an immune effector cell.
65. The method according to claim 64, wherein the immune effector cell is a T-lymphocyte.
66. The method according to claim 64, wherein the immune effector cell is a B-lymphocyte.
67. The method according to claim 64, wherein the immune effector cell is a NK-cell.
68. The method according to claim 51, wherein one or more of the sample cells express a surface marker that is recognized by an immune effector cell.
69. The method according to claim 51, wherein the gene encodes a peptide which participates in recognition of at least one of the sample cells by an immune effector cell.
70. The method according to claim 68, wherein the immune effector cell is a T-lymphocyte.
71. The method according to claim 68, wherein the immune effector cell is a B-lymphocyte.
72. The method according to claim 68, wherein the immune effector cell is a NK cell.
73. The method according to claim 51, wherein at least one of the sample cells is a neoplastic cell that is lysed by a cytotoxic T-lymphocyte.
74. The method according to claim 73, wherein the control cell is a cell that is compatible for lysis by the cytotoxic T-lymphocyte but not lysed by the cytotoxic T-lymphocyte.
75. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype and that is overexpressed in a sample cell, the method comprising contacting the sample cell with an antibody raised against a polypeptide expressed by the gene identified by the method of claim 51.
76. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype, the method comprising contacting the sample cell with an effective amount of an antibody raised against a protein expressed by the gene identified by the method of claim 51.
77. A method for inducing an immune response against a cell not previously associated with a neoplastic phenotype, the method comprising contacting the sample cell with an effective amount of an immune effector cell generated by exposure to an antigen presenting cell which presents the protein expressed by the gene identified by the method of claim 20 on the surface of the antigen presenting cell in the context of an MHC molecule.
78. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype, the method comprising contacting the sample cell with an effective amount of a population of educated immune effector cells cultured in the presence and at the expense of the antigen presenting cell of claim 77.
79. The methods of any of claims 77 or 78, further comprising contacting the cell with an effective amount of a cytokine or co-stimulatory molecule.
80. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype, the method comprising administering to a suitable subject an effective amount of an antigen presenting cell which presents the protein expressed by the gene identified by the method of claim 51 on the surface of the antigen presenting cell in the context of an MHC molecule.
81. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype, the method comprising administering to a suitable subject an effective amount of a population of educated immune effector cells cultured in the presence and at the expense of the antigen presenting cell of claim 77.
82. The methods of any of claims 80 or 81, further comprising contacting the cell with an effective amount of a cytokine or co-stimulatory molecule.
83. A method for inducing an immune response against a polypeptide not previously associated with a neoplastic phenotype in a suitable subject, the method comprising administering to the subject an effective amount of an antibody raised against a protein expressed by the gene identified by the method of claim 51.
84. A method of creating a database of polynucleotide data resulting from processing a plurality of cell samples comprising:
a) transferring a plurality of sequence records that correspond to polynucleotides obtained from a sample of a plurality of cells electronically to a computer processor and creating a data raw file containing observed polynucleotide abundances related to the samples; and
b) creating a compare data file by combining the data raw file with other data raw files, the other data raw files having been created from other samples;
whereby the compare data file contains records combined from the data raw files, the data having been normalized to indicate percentage of sample for a number of occurrences of a polynucleotide in each of samples from the plurality of cells.
85. The method of claim 84, further comprising loading the compare data file into a relational database management (RDBMS).
86. The method of claim 85, further comprising applying queries based upon a desired selection criteria to the compare data file in the RDBMS to produce reports of polynucleotides which match the desired selection criteria.
87. A system for identifying selected polynucleotide records, the system comprising: a digital computer; a database coupled to the computer; a database coupled to the database server having data stored therein, the data comprising records of data combined from polynucleotide raw files, the data having been normalized to indicate percentage of sample for a number of occurrences of a same tag in each sample of a plurality of samples; and
a code mechanism for applying queries based upon a desired selection criteria to the data file in the database to produce reports of polynucleotide records which match the desired selection criteria.
88. A method for identifying selected polynucleotide records from a database, using a computer having a processor, memory, display, input/output devices, the method comprising the steps of: a) providing a database coupled to the computer having data stored therein, the data comprising representations of data combined from polynucleotide raw files, the data having been normalized to indicate percentage of sample for a number of occurrences of a same polynucleotide in each of a plurality of samples; and b) using a code mechanism for applying queries based upon a desired selection criteria to the data file in the database to produce reports of polynucleotide records which match the desired selection criteria.
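The data flow recited in claims 84 to 88 (per-sample raw files of observed abundances, a combined compare file normalized to percentage of sample, loading into an RDBMS, and querying by selection criteria) can be illustrated with a minimal sketch. This is not the applicants' implementation. The function names, the table name `compare_data`, its column names, the example tag sequences, and the fold-enrichment selection criterion are all hypothetical, and SQLite merely stands in for whatever RDBMS might be used.

```python
import sqlite3
from collections import Counter


def raw_counts(tags):
    """Hypothetical 'raw file' for one sample: tag -> observed count."""
    return Counter(tags)


def build_compare(raw_files):
    """Combine per-sample raw files into normalized compare records.

    raw_files maps a sample name to its Counter of tag counts.
    Each returned row is (tag, sample, count, percent_of_sample).
    """
    rows = []
    for sample, counts in raw_files.items():
        total = sum(counts.values())
        for tag, n in sorted(counts.items()):
            rows.append((tag, sample, n, 100.0 * n / total))
    return rows


def load_compare(rows):
    """Load compare records into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compare_data (tag TEXT, sample TEXT, n INTEGER, pct REAL)"
    )
    conn.executemany("INSERT INTO compare_data VALUES (?, ?, ?, ?)", rows)
    return conn


def enriched_tags(conn, sample, reference, min_fold):
    """Report tags whose percentage in `sample` is at least `min_fold`
    times their percentage in `reference`.

    Assumption: a tag absent from `reference` counts as 0 percent there,
    so it is always reported.
    """
    query = """
        SELECT a.tag, a.pct, COALESCE(b.pct, 0.0)
        FROM compare_data AS a
        LEFT JOIN compare_data AS b
               ON b.tag = a.tag AND b.sample = ?
        WHERE a.sample = ?
          AND a.pct >= ? * COALESCE(b.pct, 0.0)
        ORDER BY a.tag
    """
    return conn.execute(query, (reference, sample, min_fold)).fetchall()


if __name__ == "__main__":
    # Made-up tags for two made-up samples, for illustration only.
    raw_files = {
        "tumor": raw_counts(["AAAC", "AAAC", "AAAC", "GGTT"]),
        "normal": raw_counts(["AAAC", "GGTT", "GGTT", "GGTT"]),
    }
    conn = load_compare(build_compare(raw_files))
    for row in enriched_tags(conn, "tumor", "normal", 2.0):
        print(row)
```

Normalizing each count to a percentage of its own sample before combining is what makes samples with different total numbers of sequenced tags comparable within a single query.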
PCT/US1999/001463 1998-01-26 1999-01-25 Methods for identifying therapeutic targets WO1999037816A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU23391/99A AU756357B2 (en) 1998-01-26 1999-01-25 Methods for identifying therapeutic targets
JP2000528722A JP2002500896A (en) 1998-01-26 1999-01-25 Identification of therapeutic targets
EP99903346A EP1053349A4 (en) 1998-01-26 1999-01-25 Methods for identifying therapeutic targets
CA002319148A CA2319148A1 (en) 1998-01-26 1999-01-25 Methods for identifying therapeutic targets

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US10043698P 1998-01-26 1998-01-26
US7785398P 1998-03-13 1998-03-13
US10323098P 1998-10-05 1998-10-05
US60/100,436 1998-10-05
US60/077,853 1998-10-05
US60/103,230 1998-10-05

Publications (1)

Publication Number Publication Date
WO1999037816A1 (en) 1999-07-29

Family

ID=27373177

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/001463 WO1999037816A1 (en) 1998-01-26 1999-01-25 Methods for identifying therapeutic targets

Country Status (5)

Country Link
EP (1) EP1053349A4 (en)
JP (1) JP2002500896A (en)
AU (1) AU756357B2 (en)
CA (1) CA2319148A1 (en)
WO (1) WO1999037816A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1113382A1 (en) * 1999-12-27 2001-07-04 Applied Research Systems ARS Holding N.V. A method for the identification of gene transcripts with improved efficiency in the treatment of errors
EP1364066A2 (en) * 2001-02-02 2003-11-26 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Method for identifying functional nucleic acids

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004287619A (en) * 2003-03-19 2004-10-14 Ntt Data Corp Epidemiological information management device, epidemiological information management method, and program

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992010588A1 (en) * 1990-12-06 1992-06-25 Affymax Technologies N.V. Sequencing by hybridization of a target nucleic acid to a matrix of defined oligonucleotides
US5260191A (en) * 1992-01-30 1993-11-09 Agracetus, Inc. Method for diagnosing tumors
US5721098A (en) * 1986-01-16 1998-02-24 The Regents Of The University Of California Comparative genomic hybridization
US5776683A (en) * 1996-07-11 1998-07-07 California Pacific Medical Center Methods for identifying genes amplified in cancer cells
US5800992A (en) * 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US5807522A (en) * 1994-06-17 1998-09-15 The Board Of Trustees Of The Leland Stanford Junior University Methods for fabricating microarrays of biological samples
US5830645A (en) * 1994-12-09 1998-11-03 The Regents Of The University Of California Comparative fluorescence hybridization to nucleic acid arrays

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5695937A (en) * 1995-09-12 1997-12-09 The Johns Hopkins University School Of Medicine Method for serial analysis of gene expression

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HOELTKE H. J., ET AL.: "MULTIPLE NUCLEIC ACID LABELING AND RAINBOW DETECTION.", ANALYTICAL BIOCHEMISTRY., ACADEMIC PRESS INC., NEW YORK., vol. 207., no. 01., 15 November 1992 (1992-11-15), NEW YORK., pages 24 - 31., XP000323752, ISSN: 0003-2697, DOI: 10.1016/0003-2697(92)90494-R *
See also references of EP1053349A4 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1113382A1 (en) * 1999-12-27 2001-07-04 Applied Research Systems ARS Holding N.V. A method for the identification of gene transcripts with improved efficiency in the treatment of errors
WO2001048670A2 (en) * 1999-12-27 2001-07-05 Applied Research Systems Ars Holding N.V. A method for the identification of gene transcripts with improved efficiency in the treatment of errors
WO2001048670A3 (en) * 1999-12-27 2002-05-10 Applied Research Systems A method for the identification of gene transcripts with improved efficiency in the treatment of errors
AU771954B2 (en) * 1999-12-27 2004-04-08 Laboratoires Serono Sa A method for the identification of gene transcripts with improved efficiency in the treatment of errors
AU771954C (en) * 1999-12-27 2005-06-30 Laboratoires Serono Sa A method for the identification of gene transcripts with improved efficiency in the treatment of errors
US7101665B2 (en) 1999-12-27 2006-09-05 Applied Research Systems Ars Holdings N.V. Method for the identification of gene transcripts with improved efficiency in the treatment of errors
EP1364066A2 (en) * 2001-02-02 2003-11-26 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Method for identifying functional nucleic acids

Also Published As

Publication number Publication date
CA2319148A1 (en) 1999-07-29
AU756357B2 (en) 2003-01-09
AU2339199A (en) 1999-08-09
EP1053349A1 (en) 2000-11-22
EP1053349A4 (en) 2004-12-15
JP2002500896A (en) 2002-01-15

Similar Documents

Publication Publication Date Title
US10787706B2 (en) System and methods for massively parallel analysis of nucleic acids in single cells
US20070020618A1 (en) Process to study changes in gene expression in T lymphocytes
US20030049599A1 (en) Methods for negative selections under solid supports
CN105339503A (en) Transposition into native chromatin for personal epigenomics
CN110234772B (en) Enhanced immune cell receptor sequencing method
JP2018525034A (en) Methods for providing tumor-specific T cells
EP3807636B1 (en) A system for identification of antigens recognized by t cell receptors expressed on tumor infiltrating lymphocytes
AU756357B2 (en) Methods for identifying therapeutic targets
Marincola et al. The role of quantitative PCR for the immune monitoring of cancer patients
WO1995006750A1 (en) Methods for quantifying the number of cells containing a selected nucleic acid sequence in a heterogenous population of cells
Frazer et al. RDA of lymphocyte subsets
EP3548511B1 (en) Methods and materials for cloning functional t cell receptors from single t cells
Albertini et al. Clonal expansions of 6‐thioguanine resistant T lymphocytes in the blood and tumor of melanoma patients
WO2023240218A1 (en) Methods for detecting genomic abnormalities in cells
JP2022544578A (en) Targeted hybrid capture method for determining T cell repertoire
KR20210030929A (en) Amplification method and primer for use therein

Legal Events

Date Code Title Description
AK Designated states Kind code of ref document: A1; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW
AL Designated countries for regional patents Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase Ref document number: 23391/99; Country of ref document: AU
ENP Entry into the national phase Ref document number: 2319148; Country of ref document: CA; Ref country code: CA; Ref document number: 2319148; Kind code of ref document: A; Format of ref document f/p: F
ENP Entry into the national phase Ref country code: JP; Ref document number: 2000 528722; Kind code of ref document: A; Format of ref document f/p: F
NENP Non-entry into the national phase Ref country code: KR
WWE WIPO information: entry into national phase Ref document number: 1999903346; Country of ref document: EP
WWE WIPO information: entry into national phase Ref document number: 09733026; Country of ref document: US
WWP WIPO information: published in national office Ref document number: 1999903346; Country of ref document: EP
REG Reference to national code Ref country code: DE; Ref legal event code: 8642
WWG WIPO information: grant in national office Ref document number: 23391/99; Country of ref document: AU
WWW WIPO information: withdrawn in national office Ref document number: 1999903346; Country of ref document: EP