US 20050125161 A1
The invention relates to a method for staging embryos of plants. In particular, this invention relates to a method for creating a relational database by determining transcript levels of sets of genes expressed at predetermined stages in embryo development. This approach creates a method by which the developmental stage of embryos of unknown stage can be determined by comparing expression levels in those embryos to the expression levels found in the database. This approach further allows rapid identification of transcripts in an embryo to be staged by the utilization of probes corresponding to cDNAs comprising the database. Additionally, this invention relates to a method for selecting advantageous plant clones for future propagation. Specifically, this method relates to an approach to link the biochemical condition of an embryo to current culture conditions and thus provides a method for enhancing conditions to produce embryos with a desired biochemical state.
1. A relational database comprising the data of Table I.
2. A method of staging embryos comprising:
a) providing at least one embryo;
b) detecting the expression in the embryo of at least one RNA transcript of Table I; and
c) correlating the expression of said transcript to one or more embryonic stages.
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
10. A database comprising a multiplicity of nucleotide sequences shown in any one of Table I, including variants thereof, wherein said variants hybridize under conditions of high stringency to either strand of a denatured, double-stranded DNA comprising any of SEQ ID NOS: 1-327.
11. The database of
12. A DNA array comprising a multiplicity of nucleotide sequences shown in Table I, including variants thereof, wherein said variants hybridize under conditions of high stringency to either strand of a denatured, double-stranded DNA comprising any of SEQ ID NOS: 1-327.
13. The DNA array of
14. A method for staging plant embryos comprising:
a) selecting total RNA from a multiplicity of embryos of known developmental age;
b) correlating the embryonic expression pattern to the developmental age to develop a relational database;
c) determining levels of expression from embryos of unknown developmental age by hybridization to a DNA array comprising a multiplicity of the nucleotide sequences shown in any one of SEQ ID NOS: 1-327;
d) correlating the expression pattern from step 3 to the relational database to determine developmental stage for the unknown embryo.
15. The method of
16. The method of
17. An isolated nucleic acid variant of the nucleotide sequence shown in any one of SEQ ID NOS: 1-334, wherein said variant hybridizes under conditions of moderate stringency to either strand of a denatured, double-stranded DNA comprising any of SEQ ID NOS: 1-334.
18. An isolated polypeptide encoded by a nucleic acid molecule of
19. An isolated nucleic acid encoding the polypeptide of
20. Antibodies that specifically bind to the peptide of
21. The antibodies of
22. A recombinant vector that directs the expression of a nucleic acid of
23. A host cell transformed with the vector of
24. The host cell of
25. A method for staging plant embryos comprising:
a) selecting total RNA from at least one embryo of known developmental age;
b) determining the level of expression of a multiplicity of genes which hybridize to one or more of SEQ ID NOS: 1-327;
c) correlating the known developmental ages of the embryos from step 1) with the profile of expression measured in step 2);
d) applying the correlation of step 3) to a sample of embryo RNA from embryos to be staged; and
e) determining the embryo stage.
26. The method of
27. The method of
28. The method of
29. The method of
30. A method for selecting advantageous plant clones comprising:
a) selecting one or more samples of embryonic RNA from multiple clones of plants;
b) determining that at least one sampled clone has an advantageous characteristic;
c) comparing the embryonic levels of expression of genes which hybridize to one or more of SEQ ID NOS: 1-327 in samples from the advantageous clone with expression levels in at least one clone that does not show the advantageous characteristic; and
d) selecting additional clones which show an embryonic gene expression pattern more similar to that of the advantageous clone than to the pattern of at least one clone that does not show the advantageous characteristic.
31. Method of
32. Method of
33. The method of
34. The method of
35. The method of
36. The method of
37. A method of determining embryo fitness comprising:
a) creating a relational database with RNA expression values for genes listed in Table I for embryos of known developmental stages;
b) isolating total RNA from embryos of unknown stage development;
c) measuring expression levels of genes identified in Table I from the isolated total RNA; and
d) correlating the database of step 1) with the pattern of expression determined in steps 2) and 3) to assess proper embryo development.
38. The method of
39. The method of
40. The method of
41. The method of
42. A method for selecting advantageous growth conditions for embryo development comprising:
a) determining RNA expression profiles for staged embryos under control culture conditions;
b) altering culture conditions;
c) determining RNA expression profiles for staged embryos under altered culture conditions; and
d) correlating culture change to developmental effect in embryo.
43. The method of
44. The method of
45. The method of
46. The method of
47. A recombinant nucleic acid molecule encoding a product during embryo development comprising:
a) a first nucleic acid sequence which is the LP2-3 promoter; and
b) a second nucleic acid sequence encoding a product,
wherein the first nucleic acid is operatively linked to the second nucleic acid molecule whereby its expression is directed by the promoter sequence.
48. The recombinant nucleic acid molecule of
49. The recombinant nucleic acid molecule of
50. The recombinant nucleic acid molecule of
51. The recombinant nucleic acid molecule of
52. The recombinant nucleic acid molecule of
53. A plant cell comprising the recombinant nucleic acid molecule of
54. A method for producing a protein product during embryo development comprising:
a) operatively linking one or more stage-specific embryo promoter(s) to one or more nucleic acid molecules that encode a protein product,
b) delivering construct to developing embryos.
55. The method of
56. The method of
57. The method of
58. A method for staging embryos comprising:
a) providing one or more stage-specific embryo promoter(s) operatively linked to one or more nucleic acid molecules that encode a protein product to developing embryos,
b) monitoring expression of the protein product as the embryo matures through stage in which promoter functions.
59. The method of
60. The method of
61. The method of
This patent application claims benefit of priority of provisional application U.S. Ser. No. 60/239,250, filed Oct. 11, 2000, and claims benefit of priority of provisional application U.S. Ser. No. 60/260,882, filed Jan. 12, 2001.
The present invention relates to a relational database of cDNA molecules, including those corresponding to Loblolly Pine Major Intrinsic Protein (MIP), which are differentially expressed during plant embryogenesis. The present invention further relates to the use of DNA arrays for evaluating gene expression in somatic and zygotic embryos. The invention encompasses related nucleic acids, proteins, antigens, and antibodies derived from these cDNAs as well as the use of such molecules for the staging, characterization, and manipulation of plant embryogenesis, in particular conifer embryogenesis. The cDNAs and related nucleic acids, proteins, antigens, and antibodies derived from these cDNAs are useful in the design, selection, and cultivation of improved crops, specifically including coniferous trees, which provide raw materials for paper and wood products.
The world demand for paper is expected to increase nearly 50% by the year 2010 (McNutt and Rennel, Pulp Paper Intern 39: 48 (1997)). The United States' forest products industry faces a great challenge in order to keep pace with the growing demand for paper. This challenge is made greater by the decreasing availability of a forest land-base, resulting from environmental restrictions and urban growth, from which to harvest timber resources. Additionally, valuable wood resources are lost to environmental stresses and biotic diseases. Consequently, the push to secure a renewable and sustainable source of raw material for paper and other wood related products has become an important priority for the forest products industry.
Current forestry related research and development is focused on creating sustainable fiber farms or tree plantations. Farming trees with elite germplasms will increase growth rates and yields of wood per acre. However, creating improved tree stock requires the ability to identify and generate genetically superior trees and a way to propagate such superior trees without diluting their genetic quotient.
A. Breeding and Selection
Addressing the need to propagate genetically superior trees without genetic diminution demands full research attention. Traditional methods of tree propagation relied on selected breeding programs to achieve genetic gain, i.e., the development of a strain, sub-strain, or line having any heritable and economically valuable characteristic or combination of characteristics not found in the parents. Based on the results of progeny tests, superior maternal trees are selected and used in “seed orchards” for mass production of genetically improved seed. The genetic gain in such an open-pollinated sexual propagation strategy is, however, limited by the breeder's inability to control the paternal parent. Additional gains can also be achieved by control-pollination of the maternal tree with pollen from individual trees whose progeny have demonstrated superior growth characteristics. Nevertheless, even under controlled conditions where both parents of each seed are the same, sexual propagation results in a “family” of seeds, i.e., siblings, comprised of many different genetic combinations. As not all genotype combinations are favorable, the genetic gain in any particular progeny is frequently offset and obscured by the genetic variation among sibling seeds and those seedlings retaining undesirable or previously masked pre-cross traits.
In addition to the inherent genetic limitations of traditional breeding programs, large-scale production of control pollinated seeds is also expensive. Consequently, the economic and biological limitations of large-scale seed production have led the industry to turn towards methods of asexual reproduction, such as grafting, vegetative propagation and micropropagation, as more viable alternatives.
B. Asexual (Clonal) Propagation
Asexual propagation permits the application of very high selection intensity, resulting in the propagation of only those progeny showing a high genetic gain potential. These highly desirable progeny can have unique genetic combinations that result in superior growth and performance characteristics. Thus, with asexual propagation it is possible to genetically select individuals while avoiding a concomitant reduction of genetic gain due to intra-familial variation.
Asexual propagation of trees can be accomplished currently by grafting, vegetative propagation, and micropropagation. Grafting, widely used to propagate select individuals in limited quantities for seed orchard establishment, is not applicable to large-scale production for reforestation. Vegetative propagation, achieved by the rooting of cuttings, and micropropagation by somatic embryogenesis, currently hold the most potential for reforestation of conifers. Although vegetative propagation by rooted cuttings can be achieved in many coniferous species, large-scale production via this method is extremely costly due to difficulties in automating and mechanizing the process, not to mention the need for tremendous quantities of stock tissue. This propagation method is still further limited by the fact that the rooting potential of stock plants decreases with time, making it difficult to serially propagate from select genotypes over extended periods of time.
Micropropagation is the most promising method of asexual propagation for mass plantings. This process involves the production of somatic embryos in vitro from minute pieces of plant tissue or individual cells. The embryos are referred to as somatic because they are derived from the somatic (vegetative) tissue, rather than from the sexual process. Both vegetative propagation and micropropagation have the potential to capture all genetic gain of highly desirable genotypes. However, unlike conventional vegetative propagation methods, somatic embryogenesis is amenable to automation and mechanization, making it highly desirable for large-scale production of planting stock for reforestation. Moreover, somatic embryogenesis is particularly amenable to high intensity selection of a large number of clones. These advantages are compounded by the ability to safely preserve somatic embryogenic cultures in liquid nitrogen for long-term storage. Consequently, long-term cryogenic preservation offers immense advantages over other vegetative propagation systems that attempt to maintain the juvenility of stock plants. Techniques for somatic embryogenesis in a wide variety of plant species are well known in the art; exemplary methods for performing somatic embryogenesis in conifers are taught in U.S. Pat. Nos.: 5,036,007; 5,236,841; 5,294,549; 5,413,930; 5,491,090; 5,506,136; 5,563,061; 5,677,185; 5,731,203; 5,731,204; and 5,856,191, herein incorporated by reference in their entirety.
Thus, somatic embryogenesis has great potential for clonal production of conifer embryos to meet the increased demands of the pulp and paper industry. Assessment of embryo quality, however, needs improvement. The process of creating better tree stock begins with understanding the process of tree development from embryogenesis through full maturation.
In general, plant tissue culture is the broad science of growing plant tissues on or in a nutrient medium containing minerals, sugars, vitamins and plant hormones. By adjusting the composition of the media, cultured tissues can be induced to grow or differentiate into specific cell types or organs. “Somatic embryogenesis” is a type of plant tissue culture where a piece of a donor plant is excised, cultured and induced to form multiple embryos. An embryo is a discrete mass of cells with a well-defined structure that is capable of growing into a whole plant.
The methods generally in use for somatic embryogenesis today involve several steps. Prior to the tissue culture process, a suitable “explant” is harvested. A typical explant in conifer somatic embryogenesis is the “megagametophyte”, a haploid nutritive tissue of the conifer seed, which is extracted from the ovule of a pollinated female cone. This ovule contains single or multiple zygotic seed embryos. In the seeds of many coniferous species, one or more genetically unique embryos naturally undergo a process called cleavage polyembryony, where a zygotic embryo grows and divides to form a small clone of embryos.
The first step in somatic embryogenesis is the initiation step. The explant is placed on a suitable medium. When the explant is an ovule, a process called extrusion occurs. Extrusion involves the emergence or expulsion of a zygotic embryo or multiple embryos and embryogenic tissue out of the megagametophyte. If culture conditions are suitable, initiation proceeds and the extruded embryo or embryos undergo the process of cleavage polyembryony. This results in the formation of early stage somatic embryos in a glossy, mucilaginous mass.
After embryogenic cultures are initiated, the somatic embryos are transferred to a second medium with an appropriate composition of plant hormones and other factors to induce the somatic embryos to multiply. In the multiplication stage, cultures can double up to 2-6 times in one week. Once large numbers of embryos are obtained in the multiplication stage, the embryos are moved to a development and maturation medium. Here, the correct balance of plant hormones and other factors will induce the early-stage embryos to mature into late stage embryos. Following the maturation and development stage, embryos are germinated to form small seedlings. These seedlings are then acclimated for survival outside of the culture vessel. After acclimation, the seedlings are ready for planting.
The relative ability to propagate plants by somatic embryogenesis can vary greatly between species. Among conifers, for example, spruce (Picea) species and Douglas fir are easily propagated, while Pinus species are much more difficult. Many Pinus species, including Loblolly pine (Pinus taeda), do not readily initiate embryonic cultures. Typical initiation frequencies between 1% and 12% are reported for various Pinus species (Becwar et al., For. Sci. p1-18 (1988), Jain et al., Plant Sci. 65:233-241 (1989), Becwar et al., Can. J. For. Res. 20:810 (1990), Li and Huang, J. Tissue Cult. Assoc. 32:129 (1996)). Laine and David (Plant Sci. 69:215 (1990)), however, were able to obtain high frequencies of initiation (up to 59%) in Pinus caribaea, suggesting that not all Pinus species are recalcitrant. Also, one earlier report described initiation frequencies of 54% in White pine (Pinus strobus). Finer et al., Plant Cell Rep. 8:203 (1989). However, other workers were not able to duplicate this success. Michler et al., Plant Sci. 77:111 (1991). The results in the literature demonstrate the recalcitrance of Pinus species, especially Loblolly pine, in regeneration by somatic embryogenesis.
Nevertheless, once this process is understood from the standpoint of developmental genetics, breeders will then have the appropriate tools to monitor, intervene, and improve both the regeneration frequency and the overall quality of tree stock through genetic engineering. For example, both environmental requirements and responsiveness of a developing embryo change as the embryo passes various developmental milestones. Consequently, accurate and timely knowledge of the developmental stage of an embryonic culture would allow the skilled practitioner to beneficially adjust the growth media components and other environmental factors to achieve optimal embryo survival, growth, and maturation. In addition, an understanding of developmentally regulated genes would allow for early selection of advantageous clones and provide tools for developmentally regulated transgenic expression systems.
Currently, a reasonable determination of the precise developmental stage of an embryo requires a practiced, physical familiarity with the morphological appearance of embryos at different stages, which is further complicated by the presence of morphological variations between species. Consequently, visual determination is performed best by experts in the field. Thus, there is a need in the art for a staging method which can be reliably practiced by the ordinary practitioner. The current invention will allow one to stage embryos based on a relational database system profiling gene expression patterns instead of physical morphological differences, thereby permitting one less skilled in the art of visual staging to biologically determine the stages of embryogenesis.
The traditional morphological staging method provides only a crude indication of the underlying biochemical condition or state of an embryo. This level of information is insufficient for refining culture conditions, including media formulations, or for selecting potentially advantageous embryo clones for further development. Thus, there is a need in the art for a more sensitive staging method that precisely defines the physiological age, health, growth requirements, and potential fitness of a particular embryo. The current invention will allow definitive staging significantly beyond that currently practiced in the art, and provides a detailed analysis of the biochemical state and potential fitness of an embryo by comparison to developed relational database profiles.
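By way of illustration only, comparison of an embryo's expression profile to stored stage profiles might be sketched as follows. The sketch assumes that each stage is summarized by one vector of expression levels over the same ordered set of genes and that similarity is scored by Pearson correlation; the scoring rule, the function names, and all numeric values are assumptions made for this example and are not taken from the disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / (sd_x * sd_y)


def assign_stage(reference_profiles, unknown_profile):
    """Return the stage whose reference profile correlates best with the unknown."""
    return max(
        reference_profiles,
        key=lambda stage: pearson(reference_profiles[stage], unknown_profile),
    )


# Hypothetical reference profiles (stage -> levels for four genes); made-up values.
REFERENCE = {
    1: [9.0, 2.0, 1.0, 0.5],
    5: [3.0, 8.0, 4.0, 1.0],
    9: [0.5, 1.0, 6.0, 9.0],
}

unknown = [2.8, 7.5, 4.4, 1.2]  # made-up profile of an embryo to be staged
staged = assign_stage(REFERENCE, unknown)
```

A correlation-based score compares the shape of the profile and not its absolute magnitude, which is one plausible choice when overall signal intensity differs between hybridizations; other distance measures could be substituted.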
Visual staging methods depend on morphological markers to assign a numerical stage of 1-9 to an embryo. Nevertheless, it is well accepted that visually undetectable developmental changes occur in an embryo after it reaches stage 9. The current invention is particularly useful in providing means for monitoring and evaluating the developmental state of these older embryos, as genetic responses occur and are detectable up to and through an adult tree's life.
There further exists in the art a need for information regarding the proteins, genes, and gene expression patterns in plant embryo development, as well as a more thorough understanding of how this information relates to the physiology, developmental potential, and genetic quotient of a plant embryo. The relational database system provides a platform with which to monitor individual gene expression levels during embryo development while directly correlating expression with, for example, environmental conditions, age, and embryo fitness, as well as the protein identification achieved by BLAST searches of publicly available databases (e.g., GenBank) for desirable genes. Accordingly, the present invention provides the additional ability to correlate the direct, global gene expression response within the embryo system to a typically non-expressing gene driven by a stage-specific promoter.
The present invention addresses these needs by providing in a relational database format nucleic acid and protein sequences that are differentially expressed during various stages of plant embryogenesis. The invention encompasses a set of isolated nucleic acid molecules comprising the DNA sequence of any one of SEQ ID NOS: 1-334 and nucleic acid molecules related or complementary to any one of SEQ ID NOS: 1-334. (See Table I) As such, the invention includes both single-stranded and double-stranded RNA and DNA nucleic acids, including variants thereof. The nucleic acids of the invention can be used as an expression template in the form of DNA arrays, including for example, gene arrays, DNA chips, and dot array Southerns, against which to compare and evaluate expression in test samples. (See Table II) The nucleic acids of the invention can be further used as probes to detect the presence or level of both single-stranded and double-stranded RNA and DNA encoding variants of polypeptides or fragments of polypeptides encompassed by the invention. The nucleic acids of the invention can be further used as promoters for the expression of sense and antisense molecules at specific stages of embryo development. Data acquired through the use of the present invention can in turn be provided to the relational database for further development.
Isolated nucleic acid molecules that hybridize to a denatured, double-stranded DNA comprising the DNA sequence of any one of SEQ ID NOS: 1-334 under conditions of moderate or high stringency are also encompassed by the invention. The invention further encompasses synthetic and naturally-occurring variants of the nucleic acids described in SEQ ID NOS: 1-334, for example, isolated nucleic acid molecules derived by in vitro mutagenesis from SEQ ID NOS: 1-334. In vitro mutagenesis would include numerous techniques known in the art including, but not limited to, site-directed mutagenesis, random mutagenesis, and in vitro nucleic acid synthesis.
The invention also encompasses related molecules (variants) including isolated nucleic acid molecules degenerate from SEQ ID NOS: 1-334 as a result of the genetic code, for example, naturally-occurring or synthetic allelic variants of the genes encoding SEQ ID NOS: 1-334. Such related molecules also encompass both smaller and larger nucleic acids that contain sufficient sequence to support hybridization to any of SEQ ID NOS: 1-334 under conditions of moderate or high stringency. Consequently, recombinant vectors, including those that direct the expression of these nucleic acid molecules and host cells transformed or transfected with these vectors are herein defined as variants and are encompassed by the invention.
Another embodiment of this invention is the production of transgenic vectors and transgenic plants comprising vectors or other nucleic acids comprising any one of SEQ ID NOS: 1-334 and related molecules. Particularly preferred are those capable of expressing polypeptides or peptides encoded by any of SEQ ID NOS: 1-327. In a preferred embodiment, the transgene comprises SEQ ID NO: 327, or a variant thereof.
SEQ ID NO: 327 encodes a protein which corresponds to a novel Loblolly pine homolog of the plant Major Intrinsic Protein (MIP) family. MIPs comprise a large family of related proteins that function as membrane channels for the transport of water and possibly ions across cellular membranes. Henceforth, the encoded protein of SEQ ID NO: 327 may be referred to as Loblolly MIP. Variants, including naturally-occurring and artifactually-programmed allelic variants, vectors, and other nucleic acids which hybridize to SEQ ID NO: 327 under conditions of moderate or high stringency are encompassed by the invention. Also encompassed are plant cells, seeds, embryos and trees, transgenic for loblolly pine MIP, and variants thereof.
The invention also encompasses isolated polypeptides, or fragments thereof, encoded by any one of the nucleic acid molecules of SEQ ID NOS: 1-327, including variants thereof. The invention further encompasses the use of these peptide sequences as markers for staging, monitoring, and selecting embryos and embryo cultures. The invention also encompasses methods for the production of these polypeptides or fragments thereof including culturing a host cell under conditions promoting expression and recovering the polypeptide or peptide from the culture medium. In particular, the expression of polypeptides or peptides encoded by SEQ ID NOS: 1-327 in viral vectors, bacteria, yeast, plant, and animal cells is encompassed by the invention. Isolated polyclonal or monoclonal antibodies that bind to peptides encoded by SEQ ID NOS: 1-327 are also encompassed by the invention.
Further encompassed by this invention are methods for using the nucleic acid molecules of any one of SEQ ID NOS: 1-327 to obtain full length cDNA and genomic sequences of the corresponding genes, including cognate, homologous, or otherwise related genetic sequences, which hybridize to any of SEQ ID NOS: 1-327 under conditions of moderate or high stringency. Also provided by this invention are oligonucleotides derived from any one of SEQ ID NOS: 1-334 that can be used as probes and/or as primers in PCR, RT-PCR, and other assays to detect the presence or level of the nucleic acids of SEQ ID NOS: 1-334 and related molecules.
The primers and other probes of the invention may be used to monitor and characterize the development of plant embryos, in particular, pine tree embryos. Characterization of embryonic gene expression provides means for correlating gene expression with current and potential plant phenotypes. Consequently, the present invention encompasses means for monitoring and adjusting growth conditions (see
The relational database of the present invention allows expression information pertaining to embryo stages to be viewed as sequence data generated in accordance with the present invention. The invention includes a database that stores a plurality of sequence records against which embryo stages can be correlated. The method further involves providing an interface which allows a user to select one or more expression categories contained within the database.
The relational database is designed to include separate parts or cells for information storage. One cell or part may contain a gene expression database which contains the nucleic acid molecules of SEQ ID NOS: 1-327. Other cells or parts may contain descriptive information pertaining to each nucleic acid molecule of SEQ ID NOS: 1-327, additional sequence data related to the gene expression database, proteins encoded by the nucleic acids disclosed herein, similarity values to known proteins of other systems, and the conditions under which expression data was obtained.
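As a non-limiting sketch, the separate parts of such a database could be laid out as two linked tables, one holding descriptive information for each sequence and one holding expression measurements by stage and culture condition. The table names, column names, and inserted values below are hypothetical and chosen only for illustration; SQLite is used here merely because it ships with Python.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequence_record (
        seq_id      INTEGER PRIMARY KEY,  -- e.g. the SEQ ID NO
        description TEXT,                 -- descriptive information
        similarity  REAL                  -- similarity value to a known protein
    );
    CREATE TABLE expression (
        seq_id            INTEGER NOT NULL REFERENCES sequence_record(seq_id),
        stage             INTEGER NOT NULL,  -- embryo stage of the sample
        culture_condition TEXT NOT NULL,     -- conditions under which data was obtained
        level             REAL NOT NULL      -- measured expression level
    );
    """
)

# Placeholder rows only; these are not data from Table I.
conn.executemany(
    "INSERT INTO sequence_record VALUES (?, ?, ?)",
    [(1, "placeholder annotation", None), (2, "placeholder annotation", None)],
)
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?, ?)",
    [(1, 1, "control", 9.0), (2, 1, "control", 2.0),
     (1, 9, "control", 0.5), (2, 9, "control", 6.0)],
)


def stage_profile(connection, stage, culture_condition="control"):
    """Return {seq_id: level} for one stage under one culture condition."""
    rows = connection.execute(
        "SELECT seq_id, level FROM expression "
        "WHERE stage = ? AND culture_condition = ? ORDER BY seq_id",
        (stage, culture_condition),
    )
    return dict(rows.fetchall())
```

Keeping expression values in their own table keyed by sequence, stage, and condition lets new measurements be appended without altering the sequence annotations.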
The database system described in the present invention will allow identification or selection of particular genes of interest for further use with DNA arrays. Identification or selection of particular genes may include, for example, those related to patterns of expression, those identified with homology to known genes from other studies, and those sequences considered novel.
The three hundred and twenty-seven differentially expressed cDNAs isolated from plant specimens of known developmental ages are disclosed in SEQ ID NOS: 1-327. The seven stage-specific promoters isolated from plant specimens are disclosed in SEQ ID NOS: 328-334. The discovery of these cDNAs and promoters enables the design, isolation, and construction of related nucleic acids, proteins, antigens, antibodies, and other heterologous genes. Both the cDNAs and promoters facilitate the staging, characterization, and manipulation of plant embryogenesis, in particular, conifer embryogenesis. These molecules, and related nucleic acids, peptides, proteins, antigens, and antibodies are particularly useful when compiled into a relational database for the monitoring, design, selection, and cultivation of improved crop plants.
The cDNAs of SEQ ID NOS: 1-327, in addition to the promoters of SEQ ID NOS: 328-334, were originally derived from embryos of Pinus taeda, commonly known as the Loblolly Pine. Nevertheless, it is understood that the invention is applicable to and encompasses all plants, including all dicotyledonous plants, including all conifers, including all species of Pinus, Picea, and Pseudotsuga. Exemplary conifers may include Picea abies, Pseudotsuga menziesii, and Pinus taeda.
Nucleic Acid Molecules
In a particular embodiment, the invention relates to certain isolated nucleotide sequences including those that are substantially free from contaminating endogenous material. The terms “nucleic acid” or “nucleic acid molecule” refer to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides. A “nucleotide sequence” also refers to a polynucleotide molecule or oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid. The nucleotide sequence or molecule may also be referred to as a “nucleotide probe.” The nucleic acid molecules of the invention are derived from DNA or RNA isolated at least once in substantially pure form and in a quantity or concentration enabling identification, manipulation, and recovery of its component nucleotide sequence by standard biochemical methods. Examples of such methods, including methods for PCR protocols that may be used herein, are disclosed in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989), Current Protocols in Molecular Biology edited by F. M. Ausubel et al., John Wiley and Sons, Inc. (1987), and Innis, M. et al., eds., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990), each of which is herein incorporated by reference in its entirety.
As used herein a “nucleotide probe” is defined as an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, through complementary base pairing, or through hydrogen bond formation. As described above, the oligonucleotide probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, so long as it does not prevent hybridization. Thus, oligonucleotide probes may have constituent bases joined by peptide bonds rather than phosphodiester linkages.
A “target nucleic acid” herein refers to a nucleic acid to which the nucleotide probe or molecule can specifically hybridize. The probe is designed to determine the presence or absence of the target nucleic acid, and the amount of target nucleic acid. The target nucleic acid has a sequence that is complementary to the nucleic acid sequence of the corresponding probe directed to the target. As recognized by one of skill in the art, the probe may also contain additional nucleic acids or other moieties, such as labels, which may not specifically hybridize to the target. The term target nucleic acid may refer to the specific nucleotide sequence of a larger nucleic acid to which the probe is directed or to the overall sequence (e.g., gene or mRNA) whose expression level it is desired to detect. One skilled in the art will recognize the full utility under various conditions.
As described herein, the nucleic acid molecules of the invention include DNA in both single-stranded and double-stranded form, as well as the RNA complement thereof. DNA includes, for example, cDNA, genomic DNA, chemically synthesized DNA, DNA amplified by PCR, and combinations thereof. Genomic DNA, including translated, non-translated and control regions, may be isolated by conventional techniques, e.g., using any one of the cDNAs of SEQ ID NO: 1 through SEQ ID NO: 327, or suitable fragments thereof, as a probe, to identify a piece of genomic DNA which can then be cloned using methods commonly known in the art. In general, nucleic acid molecules within the scope of the invention include sequences that hybridize to sequences of SEQ ID NOS: 1-334 under hybridization and wash conditions of 5°, 10°, 15°, 20°, 25°, or 30° below the melting temperature of the DNA duplex of sequences of SEQ ID NOS: 1-334, including any range of conditions subsumed within these ranges.
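By way of illustration only, the wash temperatures recited above (5° to 30° below the melting temperature) can be tabulated once a melting temperature has been estimated. The sketch below assumes the Wallace rule (2° C. per A or T, 4° C. per G or C), a rough approximation that applies only to short oligonucleotides and is not taken from this specification; longer duplexes would call for a different estimate. The function names are hypothetical.

```python
# Illustrative sketch: wash temperatures at fixed offsets below an estimated Tm.
# ASSUMPTION: the Wallace rule is used for Tm; it is a rough estimate for short
# oligonucleotides only and is not part of the original disclosure.

OFFSETS_BELOW_TM = (5, 10, 15, 20, 25, 30)  # degrees below Tm, as recited in the text


def wallace_tm(oligo: str) -> float:
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def wash_temperatures(oligo: str) -> dict:
    """Map each offset below Tm to the corresponding wash temperature."""
    tm = wallace_tm(oligo)
    return {offset: tm - offset for offset in OFFSETS_BELOW_TM}


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer
    for offset, temp in wash_temperatures(probe).items():
        print(f"Tm - {offset:2d} -> {temp:.0f} C")
```

The offsets are kept as a tuple so that any intermediate range of conditions can be added without changing the functions.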
In a further embodiment, DNA arrays are used to identify hybridizing sequences from test samples. The term “DNA array” refers to “gene arrays,” “DNA chips,” “dot array Southerns,” etc. One of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. The DNA array will typically include one or a multiplicity of nucleic acid molecules derived from SEQ ID NO: 1 through SEQ ID NO: 327 that specifically hybridize to the nucleic acid whose expression is to be detected. In addition, the array may include one or more control probes to monitor the expression system. Control probes refer to known expression products present at each stage of expression, e.g., ribosomal gene products or the transcripts of other housekeeping genes. The organization of the DNA array will be known to facilitate interpretation of results. Examples in the art describing the uses and composition of DNA arrays can be found in U.S. Pat. Nos. 5,700,637, 5,837,832, 5,843,655, 5,874,219, 6,040,138, and 6,045,996, which are incorporated by reference.
Molecules That Hybridize to Identified Sequences
Thus, in a particular embodiment, this invention provides an isolated nucleic acid molecule selected from the group consisting of:
As used herein, stringency conditions in nucleic acid hybridizations can be readily determined by those having ordinary skill in the art based on, for example, the length and composition of the nucleic acid. In one embodiment, moderate stringency is herein defined as a nucleic acid having 10, 11, 12, 13, 14, 15, 16, or 17, contiguous nucleotides identical to any of the sequences of SEQ ID NOS: 1-334, or a complement thereof. Similarly, high stringency is hereby defined as a nucleic acid having 18, 19, 20, 21, 22, or more contiguous identical nucleotides, or a longer nucleic acid having at least 80, 85, 90, 95, or 99 percent identity with any of the sequences of SEQ ID NOS: 1-334; for sequences of at least 50, 100, 150, 200, or 250 nucleotides, high stringency may comprise an overall identity of at least 60, 65, 70 or 75 percent.
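By way of illustration only, the contiguous-identity thresholds of this embodiment (10 to 17 identical nucleotides for moderate stringency, 18 or more for high stringency) can be checked computationally. The sketch below finds the longest run of contiguous identical nucleotides shared by two sequences and classifies it. The function names are hypothetical, only the strand supplied is compared (the complement is not checked), and the percent-identity alternatives recited above are not evaluated.

```python
# Illustrative sketch: classify a candidate by its longest run of contiguous
# identical nucleotides shared with a reference sequence.
# ASSUMPTION: only the contiguous-run criterion is applied; the complement and
# the percent-identity alternatives in the text are not evaluated here.

MODERATE_MIN = 10  # 10-17 contiguous identical nucleotides
HIGH_MIN = 18      # 18 or more contiguous identical nucleotides


def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest common substring of query and reference."""
    q, r = query.upper(), reference.upper()
    best = 0
    prev = [0] * (len(r) + 1)
    for i in range(1, len(q) + 1):
        curr = [0] * (len(r) + 1)
        for j in range(1, len(r) + 1):
            if q[i - 1] == r[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def stringency_class(query: str, reference: str) -> str:
    """Return 'high', 'moderate', or 'below moderate' for the shared run."""
    run = longest_shared_run(query, reference)
    if run >= HIGH_MIN:
        return "high"
    if run >= MODERATE_MIN:
        return "moderate"
    return "below moderate"


if __name__ == "__main__":
    reference = "ATGGCGTACCTGAAGTCCGATTGACCGTAGGCTA"  # hypothetical sequence
    print(stringency_class(reference[3:23], reference))
    print(stringency_class(reference[3:15], reference))
```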
Generally, nucleic acid hybridization simply involves providing a denatured nucleotide molecule or probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. The nucleic acids that do not substantially form hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. It is further generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under lower stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches. One of skill in the art will appreciate that hybridization conditions may be selected to provide any degree of stringency.
As used herein, the percent identity between an amino acid sequence encoded by any of SEQ ID NOS: 1-334 and a potential hybridizing variant can be determined, for example, by comparing sequence information using the GAP computer program, version 6.0 described by Devereux et al. (Nucl. Acids Res. 12:387, 1984) and available from the University of Wisconsin Genetics Computer Group (UWGCG). The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), as revised by Smith and Waterman (Adv. Appl. Math 2:482, 1981). The preferred default parameters for the GAP program include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, and the weighted comparison matrix of Gribskov and Burgess (Nucl. Acids Res. 14:6745, 1986), as described by Schwartz and Dayhoff (eds., Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 353-358, 1979); (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps.
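By way of illustration only, the nucleotide scoring parameters listed above can be applied to a pair of sequences that has already been aligned. The sketch below does not perform the Needleman and Wunsch alignment search and is not the GAP program; it merely scores a given gapped alignment with the unary matrix, the 3.0 per-gap and 0.10 per-symbol penalties, and free end gaps. The function names, and the choice to compute percent identity over columns in which both sequences have a residue, are assumptions.

```python
# Illustrative sketch: score a pre-aligned nucleotide pair with the parameters
# recited in the text. This is NOT the GAP program and performs no alignment.
# ASSUMPTION: '-' marks a gap; percent identity is taken over paired columns.
import re

GAP_PENALTY = 3.0      # penalty for each gap
SYMBOL_PENALTY = 0.10  # additional penalty for each symbol in each gap


def internal_gap_lengths(row: str) -> list:
    """Lengths of gap runs that are not at either end of the aligned row."""
    return [len(run) for run in re.findall(r"-+", row.strip("-"))]


def alignment_score(row_a: str, row_b: str) -> float:
    """Identities (1 each) minus penalties for internal gaps; end gaps are free."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identities = sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")
    penalty = sum(
        GAP_PENALTY + SYMBOL_PENALTY * length
        for row in (row_a, row_b)
        for length in internal_gap_lengths(row)
    )
    return identities - penalty


def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of paired (ungapped) columns that are identical."""
    paired = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not paired:
        return 0.0
    return 100.0 * sum(x == y for x, y in paired) / len(paired)


if __name__ == "__main__":
    a = "ACGT--ACGTAC"  # hypothetical aligned rows
    b = "ACGTTTACG---"
    print(alignment_score(a, b), percent_identity(a, b))
```

In this example the two-symbol internal gap in the first row is penalized, whereas the trailing three-symbol gap in the second row is an end gap and carries no penalty.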
Alternatively, basic protocols for empirically determining hybridization stringency are set forth in section 2.10 of Current Protocols in Molecular Biology edited by F. A. Ausubel et al., John Wiley and Sons, Inc. (1987). Stringency conditions can be determined readily by the skilled artisan. An example of moderate stringency hybridization conditions would be hybridization in 5×SSC, 5× Denhardt's Solution, 50% (w/v) formamide, and 1% SDS at 42° C. with washing conditions of 0.2×SSC and 0.1% SDS at 42° C. An example of high stringency conditions can be defined as hybridization conditions as above, and with washing at approximately 68° C., in 0.1×SSC, and 0.1% SDS. The skilled artisan will recognize that the temperature and wash solution salt concentration can be adjusted as necessary according to factors such as the length of the probe.
Due to the degeneracy of the genetic code wherein more than one codon can encode the same amino acid, multiple DNA sequences can code for the same polypeptide. Such variant DNA sequences can result from genetic drift or artificial manipulation (e.g., occurring during PCR amplification or as the product of deliberate mutagenesis of a native sequence). The present invention thus encompasses any nucleic acid capable of encoding a protein derived from SEQ ID NOS: 1-327, or variants thereof.
Deliberate mutagenesis of a native sequence can be carried out using numerous techniques well known in the art. For example, oligonucleotide-directed site-specific mutagenesis procedures can be employed, particularly where it is desired to mutate a gene such that predetermined restriction nucleotides or codons are altered by substitution, deletion or insertion. Exemplary methods of making such alterations are disclosed by Walder et al. (Gene 42:133, 1986); Bauer et al. (Gene 37:73, 1985); Craik (BioTechniques, Jan. 12-19, 1985); Smith et al. (Genetic Engineering: Principles and Methods, Plenum Press, 1981); Kunkel (Proc. Natl. Acad. Sci. USA 82:488, 1985); Kunkel et al. (Methods in Enzymol. 154:367, 1987); and U.S. Pat. Nos. 4,518,584 and 4,737,462, all of which are incorporated by reference.
Thus, the invention further provides an isolated nucleic acid molecule selected from the group consisting of (1), (2), and (3) above, and further consisting of:
The cDNAs isolated and cloned through the differential display procedure will often only represent a partial sequence (generally the 3′ end) of the mRNA from which it was derived due to the nature of the arbitrary primer used in the differential display PCR reaction. Consequently, the cDNA sequences of SEQ ID NOS: 1-327 provide powerful tools for obtaining the sequences of full-length cDNAs. This can be accomplished by using a partial cDNA as a probe to identify and isolate the full length cDNA from a population of full length cDNAs or from a full length cDNA library. As is well known in the art, similar procedures can be used to identify corresponding genomic DNA sequences.
Alternatively, one can obtain the 5′ sequence of a partial cDNA by performing Rapid Amplification of cDNA Ends (RACE) procedures such as those disclosed in Frohman, Methods in Enzymology, 218:340-356 (1993) and Bertling et al., PCR Methods and Applications 3:95-99 (1993), which are hereby incorporated by reference. For example, Clontech Laboratories, Inc. (Palo Alto, Calif.) offers a SMART™ cDNA product line that allows one to generate high quality full length cDNAs and cDNA libraries. SMART™ technology can also be used to perform RACE. One skilled in the art will readily recognize that there are other equivalent products and procedures for obtaining full length cDNAs. Full length cDNAs may be sequenced and their sequences compared to sequences in public databases to assess their identities and/or homologies to other known sequences.
Cloned full length cDNAs can be used in the construction of expression vectors for the production and purification of pine tree polypeptides which contain the pine tree peptides encoded by the cDNAs of any one of SEQ ID NOS: 1-327.
Oligonucleotide Primers for PCR Assays
In another embodiment, the present invention encompasses oligonucleotide fragments derived from any one of SEQ ID NO: 1 through SEQ ID NO: 327 or from the reverse complement sequence of any one of SEQ ID NO: 1 through SEQ ID NO: 327. Such oligonucleotides would be useful as primers in the performance of RT-PCR assays to detect, or even quantify, pine embryo stage-specific transcripts. Such oligonucleotide primers will generally comprise from 10 to 25 nucleotides substantially complementary to the ends of the target sequence and may contain additional non-complementary nucleotides, for example, nucleotides that generate a restriction endonuclease site or cloning junction. Programs useful in selecting PCR primers may be used to design the oligonucleotides of this invention, but use of such programs is not necessary. By way of example, the Wisconsin Package™ software available from the Genetics Computer Group (Madison, Wis.) includes a program called Prime that can aid in selecting primers from a given template sequence. Protocols for the design and optimization of PCR reactions are commonly known in the art and are described in Saiki et al., Science 239:487 (1988); Recombinant DNA Methodology, Wu et al., eds., Academic Press, Inc., San Diego (1989), pp. 189-196; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds., Academic Press, Inc. (1990).
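By way of illustration only, a primer pair of the kind described above can be taken from the two ends of a template: the forward primer from the 5′ end of the given strand, and the reverse primer as the reverse complement of its 3′ end. The sketch below is not the Prime program and performs no optimization (melting temperature, secondary structure, and specificity are not considered). The function names and the default length of 20 nucleotides are assumptions; the 10 to 25 nucleotide bounds follow the text.

```python
# Illustrative sketch: derive end primers from a template sequence.
# ASSUMPTION: primers are taken directly from the template ends with no
# optimization; names and the default length are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers of the given length, written 5' to 3'."""
    if not 10 <= length <= 25:
        raise ValueError("primer length should be from 10 to 25 nucleotides")
    seq = template.upper()
    if len(seq) < 2 * length:
        raise ValueError("template is too short for non-overlapping primers")
    forward = seq[:length]                       # matches the 5' end of the strand
    reverse = reverse_complement(seq[-length:])  # anneals to the 3' end of the strand
    return forward, reverse


if __name__ == "__main__":
    template = "A" * 10 + "C" * 10 + "G" * 10  # hypothetical template
    print(end_primers(template, length=10))
```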
Antisense Nucleic Acid Molecules
Other useful fragments of the nucleic acids include antisense or sense oligonucleotides comprising a single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA (antisense) sequences. Antisense or sense oligonucleotides, according to the present invention, comprise a fragment of DNA from any one of SEQ ID NO: 1 through SEQ ID NO: 327. Such a fragment generally comprises at least about 14 nucleotides, preferably from about 14 to about 30 nucleotides. The ability to derive an antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein is described in, for example, Stein and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. (Bio/Techniques 6:958, 1988).
Binding of antisense or sense oligonucleotides to target nucleic acid sequences results in the formation of duplexes or other nucleic acid complexes inimical to efficient production of gene products. The antisense oligonucleotides thus may be used to block expression of proteins or the function of RNA. Antisense or sense oligonucleotides further comprise oligonucleotides having modified sugar-phosphodiester backbones (or other sugar linkages, such as those described in WO91/06629) and wherein such sugar linkages are resistant to endogenous nucleases. Such oligonucleotides with resistant sugar linkages are stable in vivo (i.e., capable of resisting enzymatic degradation) but retain sufficient sequence specificity to be able to bind to target nucleotide sequences.
Other examples of sense or antisense oligonucleotides include those oligonucleotides which are covalently linked to organic moieties, such as those described in WO 90/10448, and other moieties that increase affinity of the oligonucleotide for a target nucleic acid sequence, such as poly-(L-lysine). Further still, intercalating agents, such as ellipticine, and alkylating agents or metal complexes may be attached to sense or antisense oligonucleotides. Such modifications may modify binding specificities of the antisense or sense oligonucleotide for the target nucleotide sequence.
Antisense or sense oligonucleotides may be introduced into a cell containing the target nucleic acid sequence by any gene transfer method, including, for example, lipofection, CaPO4-mediated DNA transfection, electroporation, or by using gene transfer vectors such as Epstein-Barr virus or adenovirus.
Sense or antisense oligonucleotides also may be introduced into a cell containing the target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell surface receptors. In one embodiment, conjugation of the ligand binding molecule does not substantially interfere with the ability of the ligand binding molecule to bind to its corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide or its conjugated version into the cell.
Alternatively, a sense or an antisense oligonucleotide may be introduced into a cell containing the target nucleic acid sequence by formation of an oligonucleotide-lipid complex, as described in WO 90/10448. The sense or antisense oligonucleotide-lipid complex is preferably dissociated within the cell by an endogenous lipase.
Polypeptides Encoded by Differentially-Expressed cDNAs
The cDNAs of SEQ ID NOS: 1-327 can be translated into amino acid sequences potentially corresponding to portions of developmentally-regulated plant proteins. These amino acid sequences can be identified from sequences listed in Table I, below. The cDNAs encoding these predicted polypeptides are grouped into early, middle, and late transcripts according to the staged embryo population from which they were derived.
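By way of illustration only, conceptual translation of a cDNA into candidate amino acid sequences can be sketched as follows, using the standard genetic code and the three forward reading frames. The function names are hypothetical. Because the cDNAs may be partial and the correct frame or strand may not be known in advance, the output is only a set of candidate translations; the reverse strand is not translated here.

```python
# Illustrative sketch: translate a cDNA in the three forward reading frames
# using the standard genetic code. Function names are hypothetical.

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the codon table from the conventional TCAG ordering ('*' marks a stop).
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(cdna: str, frame: int = 0) -> str:
    """Translate from the given frame offset (0, 1, or 2); unknown codons give 'X'."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    seq = cdna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(cdna: str) -> dict:
    """Return the translation of each forward reading frame."""
    return {frame: translate(cdna, frame) for frame in (0, 1, 2)}


if __name__ == "__main__":
    print(three_frame_translation("ATGGCCTAA"))  # hypothetical sequence
```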
(See Table I)
Although the term “peptide” is generally understood to reference synthetic sequences, or fragments of larger proteins, and includes short amino acid sequences of between 2 and 10 amino acids, whereas “polypeptide” refers to larger sequences and full-length proteins, the terms are used interchangeably herein to indicate that the invention applies to peptides and polypeptides of any length and variants thereof. Moreover, the discovery of presumptive open reading frames in SEQ ID NOS: 1-327, and the ability to isolate additional cDNA sequence, enables the construction of expression vectors comprising nucleic acid sequences encoding those polypeptides. The cDNAs of the invention also enable the production of cells transfected or transformed with expression vectors driving the expression of the encoded polypeptides, and of antibodies reactive with the polypeptides.
In one embodiment, the invention provides for isolated polypeptides, preferably, pine tree polypeptides. As used herein, the term “polypeptides” refers to a genus of polypeptide or peptide fragments that encompass the amino acid sequences identified from Table I, as well as smaller fragments. Consequently, the invention encompasses any polypeptide fragment comprising at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 contiguous amino acids encoded by the cDNAs of any of SEQ ID NOS: 1-327, or comprising at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 contiguous amino acids of any amino acid sequence derived from Table I.
Alternatively, a polypeptide may be defined in terms of its antigenic relatedness to any peptide encoded by SEQ ID NOS: 1-327. Thus, in one embodiment a polypeptide within the scope of the invention is defined as an amino acid sequence comprising a linear or 3-dimensional epitope shared with any peptide encoded by the cDNAs of SEQ ID NOS: 1-327. Alternatively, a polypeptide within the scope of the invention is recognized by an antibody that specifically recognizes any peptide encoded by SEQ ID NOS: 1-327. Antibodies are defined to be specifically binding if they bind pine tree polypeptides with a Ka of greater than or equal to about 10⁷ M⁻¹, and preferably greater than or equal to 10⁸ M⁻¹.
A polypeptide “variant” as referred to herein means a polypeptide substantially homologous to a native polypeptide, but which has an amino acid sequence different from that encoded by any of SEQ ID NOS: 1-327 because of one or more deletions, insertions or substitutions. The variant amino acid sequence preferably is at least 80% identical to a native polypeptide amino acid sequence, preferably at least 90%, more preferably, at least 95% identical over at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21-25, or 26-30 contiguous amino acids. The percent identity between an amino acid sequence encoded by any of SEQ ID NOS: 1-327 and a potential variant can be determined manually, or, for example, by comparing sequence information using the GAP computer program, version 6.0 described by Devereux et al. (Nucl. Acids Res. 12:387, 1984) and available from the University of Wisconsin Genetics Computer Group (UWGCG). The GAP program, described above, utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), as revised by Smith and Waterman (Adv. Appl. Math 2:482, 1981).
Variants can comprise conservatively substituted sequences, meaning that a given amino acid residue is replaced by a residue having similar physicochemical characteristics. Examples of conservative substitutions include substitution of one aliphatic residue for another, such as Ile, Val, Leu, or Ala for one another, or substitutions of one polar residue for another, such as between Lys and Arg; Glu and Asp; or Gln and Asn. See Zubay, Biochemistry, Addison-Wesley Pub. Co., (1983) incorporated by reference in its entirety. The effects of such substitutions can be calculated using substitution score matrices such as PAM-120, PAM-200, and PAM-250 as discussed in Altschul, (J. Mol. Biol. 219:555-65, 1991). Other such conservative substitutions, for example, substitutions of entire regions having similar hydrophobicity characteristics, are well known.
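By way of illustration only, the example substitution groups named above (the aliphatic residues Ile, Val, Leu, and Ala, and the polar pairs Lys/Arg, Glu/Asp, and Gln/Asn) can be used to flag which differences between a native and a variant sequence are conservative. The sketch below uses only those example groups, compares sequences of equal length position by position, and does not apply a PAM matrix. The function names are hypothetical.

```python
# Illustrative sketch: flag conservative substitutions using only the example
# groups named in the text (one-letter amino acid codes).
# ASSUMPTION: sequences are of equal length and compared without gaps; no
# substitution score matrix is applied.

CONSERVATIVE_GROUPS = (
    frozenset("IVLA"),  # aliphatic: Ile, Val, Leu, Ala
    frozenset("KR"),    # Lys and Arg
    frozenset("ED"),    # Glu and Asp
    frozenset("QN"),    # Gln and Asn
)


def is_conservative(native_res: str, variant_res: str) -> bool:
    """True if two different residues fall within one of the example groups."""
    a, b = native_res.upper(), variant_res.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def list_substitutions(native: str, variant: str) -> list:
    """Return (position, native, variant, conservative) for each difference.

    Positions are 1-based.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b, is_conservative(a, b))
        for pos, (a, b) in enumerate(zip(native.upper(), variant.upper()), start=1)
        if a != b
    ]


if __name__ == "__main__":
    print(list_substitutions("MKVLE", "MRVAD"))  # hypothetical peptides
```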
Naturally-occurring peptide variants are also encompassed by the invention. Examples of such variants are proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides of Table I. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides encoded by the sequences of Table I (generally from 1-5 terminal amino acids).
As stated above, the invention provides recombinant and non-recombinant, isolated and purified polypeptides, preferably pine tree polypeptides. Variants and derivatives of native polypeptides can be obtained by isolating naturally-occurring variants, or the nucleotide sequence of variants, of other plant lines or species, or by artificially programming mutations of nucleotide sequences coding for native pine tree polypeptides. Alterations of the native amino acid sequence can be accomplished by any of a number of conventional methods. Mutations can be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion. Alternatively, oligonucleotide-directed site-specific mutagenesis procedures can be employed to provide an altered gene wherein predetermined codons can be altered by substitution, deletion or insertion. Exemplary methods of making such alterations are discussed supra.
The following sections are examples of the various expression vectors, host cells, and protein purification methods that are known in the art. These examples are provided merely as illustrative and should not be construed as the only means to express and purify the polypeptides and polypeptide variants of the invention.
Expression Vectors and Purified Proteins
Recombinant expression vectors containing a nucleic acid sequence encoding the polypeptides of the invention can be prepared using well known methods. In one embodiment, the expression vectors include a cDNA sequence encoding the polypeptide operably linked to suitable transcriptional or translational regulatory nucleotide sequences, such as those derived from a mammalian, microbial, viral, or insect gene. Examples of regulatory sequences include transcriptional promoters, operators, or enhancers, mRNA ribosomal binding sites, and appropriate sequences which control transcription and translation initiation and termination. Nucleotide sequences are “operably linked” when the regulatory sequence functionally relates to the cDNA sequence of the invention. Thus, a promoter nucleotide sequence is operably linked to a cDNA sequence if the promoter nucleotide sequence controls the transcription of the cDNA sequence. The ability to replicate in the desired host cells, usually conferred by an origin of replication, and a selection gene by which transformants are identified can additionally be incorporated into the expression vector.
In addition, sequences encoding appropriate signal peptides that are not naturally associated with the polypeptides of the invention can be incorporated into expression vectors. For example, a DNA sequence for a signal peptide (secretory leader) can be fused in-frame to the pine tree nucleotide sequence so that the polypeptides of the invention are initially translated as a fusion protein comprising the signal peptide. A signal peptide that is functional in the intended host cells enhances extracellular secretion of the expressed polypeptide. The signal peptide can be cleaved from the polypeptide upon secretion from the cell.
Fusions of additional peptide sequences at the amino and carboxyl terminal ends of the polypeptides of the invention can be used to enhance expression of the polypeptides or aid in the purification of the protein. Such peptides include, for example, poly-His or the antigenic identification peptides described in U.S. Pat. No. 5,011,912 and in Hopp et al. (Bio/Technology 6:1204, 1988).
Suitable host cells for expression of polypeptides of the invention include prokaryotes, yeast or higher eukaryotic cells. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, and mammalian cellular hosts are described, for example, in Pouwels et al., Cloning Vectors: A Laboratory Manual, Elsevier, N.Y., (1985). Cell-free translation systems could also be employed to produce the disclosed polypeptides using RNAs derived from DNA constructs disclosed herein.
Prokaryotic Expression Systems
Prokaryotes include gram negative or gram positive organisms, for example, E. coli or Bacilli. Suitable prokaryotic host cells for transformation include, for example, E. coli, Bacillus subtilis, Salmonella typhimurium, and various other species within the genera Pseudomonas, Streptomyces, and Staphylococcus. In a prokaryotic host cell, such as E. coli, the disclosed polypeptides can include an N-terminal methionine residue to facilitate expression of the recombinant polypeptide in the prokaryotic host cell. The N-terminal methionine can be cleaved from the expressed recombinant polypeptide.
Expression vectors for use in prokaryotic host cells generally comprise one or more phenotypic selectable marker genes. A phenotypic selectable marker gene is, for example, a gene encoding a protein that confers antibiotic resistance or that supplies an auxotrophic requirement. Examples of useful expression vectors for prokaryotic host cells include those derived from commercially available plasmids such as the cloning vector pBR322 (ATCC 37017). pBR322 contains genes for ampicillin and tetracycline resistance and thus provides simple means for identifying transformed cells. To construct an expression vector using pBR322, an appropriate promoter and a DNA sequence encoding one or more of the polypeptides of the invention are inserted into the pBR322 vector. Other commercially available vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and pGEM-1 (Promega Biotec, Madison, Wis., USA). Other commercially available vectors include those that are specifically designed for the expression of proteins; these would include pMAL-p2 and pMAL-c2 vectors that are used for the expression of proteins fused to maltose binding protein (New England Biolabs, Beverly, Mass., USA).
Promoter sequences commonly used for recombinant prokaryotic host cell expression vectors include β-lactamase (penicillinase), lactose promoter system (Chang et al., Nature 275:615, 1978; and Goeddel et al., Nature 281:544, 1979), tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids Res. 8:4057, 1980; and EP-A-36776), and tac promoter (Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, p. 412, 1982). A particularly useful prokaryotic host cell expression system employs a phage λ PL promoter and a cI857ts thermolabile repressor sequence. Plasmid vectors available from the American Type Culture Collection (“ATCC”), which incorporate derivatives of the PL promoter, include plasmid pHUB2 (resident in E. coli strain JMB9 (ATCC 37092)) and pPLc28 (resident in E. coli RR1 (ATCC 53082)).
DNA encoding one or more of the polypeptides of the invention may be cloned in-frame into the multiple cloning site of an ordinary bacterial expression vector. Ideally the vector would contain an inducible promoter upstream of the cloning site, such that addition of an inducer leads to high-level production of the recombinant protein at a time of the investigator's choosing. For some proteins, expression levels may be boosted by incorporation of codons encoding a fusion partner (such as hexahistidine) between the promoter and the gene of interest. The resulting “expression plasmid” may be propagated in a variety of strains of E. coli.
For expression of the recombinant protein, the bacterial cells are propagated in growth medium until reaching a pre-determined optical density. Expression of the recombinant protein is then induced, e.g., by addition of IPTG (isopropyl-β-D-thiogalactopyranoside), which activates expression of proteins from plasmids containing a lac operator/promoter. After induction (typically for 1-4 hours), the cells are harvested by pelleting in a centrifuge, e.g., at 5,000×g for 20 minutes at 4° C.
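By way of illustration only, a centrifugation step specified as a relative centrifugal force can be converted to a rotor speed with the standard relation RCF = 1.118 × 10⁻⁵ × r × N², where r is the rotor radius in centimeters and N is the speed in revolutions per minute. The rotor radius of 10 cm in the sketch below is an assumption chosen for the example and is not taken from this specification; the function names are hypothetical.

```python
# Illustrative sketch: convert between relative centrifugal force (x g) and
# rotor speed (rpm) using RCF = 1.118e-5 * radius_cm * rpm**2.
# ASSUMPTION: the rotor radius used below is hypothetical.
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) required to reach a given relative centrifugal force."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius = 10.0  # cm, hypothetical rotor radius
    print(f"{rpm_for_rcf(5000, radius):.0f} rpm")
```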
For recovery of the expressed protein, the pelleted cells may be resuspended in ten volumes of 50 mM Tris-HCl (pH 8)/1 M NaCl and then passed two or three times through a French press. Most highly expressed recombinant proteins form insoluble aggregates known as inclusion bodies. Inclusion bodies can be purified away from the soluble proteins by pelleting in a centrifuge at 5,000×g for 20 minutes, 4° C. The inclusion body pellet is washed with 50 mM Tris-HCl (pH 8)/1% Triton X-100 and then dissolved in 50 mM Tris-HCl (pH 8)/8 M urea/0.1 M DTT. Any material that cannot be dissolved in 50 mM Tris-HCl (pH 8)/8 M urea/0.1 M DTT may be removed by centrifugation (10,000×g for 20 minutes, 20° C.). The protein of interest will, in most cases, be the most abundant protein in the resulting clarified supernatant. This protein may be “refolded” into the active conformation by dialysis against 50 mM Tris-HCl (pH 8)/5 mM CaCl2/5 mM Zn(OAc)2/1 mM GSSG/0.1 mM GSH. After refolding, purification can be carried out by a variety of chromatographic methods such as ion exchange or gel filtration. In some protocols, initial purification may be carried out before refolding. As an example, hexahistidine-tagged fusion proteins may be partially purified on immobilized nickel.
While the preceding purification and refolding procedure assumes that the protein is best recovered from inclusion bodies, those skilled in the art of protein purification will appreciate that many recombinant proteins are best purified out of the soluble fraction of cell lysates. In these cases, refolding is often not required, and purification by standard chromatographic methods can be carried out directly.
Yeast Expression Systems
Polypeptides of the invention can also be expressed in yeast host cells, preferably from the Saccharomyces genus (e.g., S. cerevisiae). Other genera of yeast, such as Pichia or Kluyveromyces (e.g., K. lactis), can also be employed. Yeast vectors will often contain an origin of replication sequence from a 2μ yeast plasmid, an autonomously replicating sequence (ARS), a promoter region, sequences for polyadenylation, sequences for transcription termination, and a selectable marker gene. Suitable promoter sequences for yeast vectors include, among others, promoters for metallothionein, 3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem. 255:2073, 1980), or other glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7:149, 1968; and Holland et al., Biochem. 17:4900, 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Other suitable vectors and promoters for use in yeast expression are further described in Hitzeman, EPA-73,657 or in Fleer et al., Gene, 107:285-295 (1991); and van den Berg et al., Bio/Technology, 8:135-139 (1990). Another alternative is the glucose-repressible ADH2 promoter described by Russell et al. (J. Biol. Chem. 258:2674, 1982) and Beier et al. (Nature 300:724, 1982). Shuttle vectors replicable in both yeast and E. coli can be constructed by inserting DNA sequences from pBR322 for selection and replication in E. coli (Amp gene and origin of replication) into the above-described yeast vectors.
The yeast α-factor leader sequence can be employed to direct secretion of one or more of the disclosed polypeptides. The α-factor leader sequence is often inserted between the promoter sequence and the structural gene sequence. See, e.g., Kurjan et al., Cell 30:933, 1982; Bitter et al., Proc. Natl. Acad. Sci. USA 81:5330, 1984; U.S. Pat. No. 4,546,082; and EP 324,274. Other leader sequences suitable for facilitating secretion of recombinant polypeptides from yeast hosts are known to those of skill in the art. A leader sequence can be modified near its 3′ end to contain one or more restriction sites. This will facilitate fusion of the leader sequence to the structural gene.
Yeast transformation protocols are known to those of skill in the art. One such protocol is described by Hinnen et al., Proc. Natl. Acad. Sci. USA 75:1929, 1978. The Hinnen et al. protocol selects for Trp+ transformants in a selective medium, wherein the selective medium consists of 0.67% yeast nitrogen base, 0.5% casamino acids, 2% glucose, 10 μg/ml adenine, and 20 μg/ml uracil.
Yeast host cells transformed by vectors containing ADH2 promoter sequence can be grown for inducing expression in a “rich” medium. An example of a rich medium is one consisting of 1% yeast extract, 2% peptone, and 1% glucose supplemented with 80 μg/ml adenine and 80 μg/ml uracil. Derepression of the ADH2 promoter occurs when glucose is exhausted from the medium.
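As a rough worked example of the two recipes above, the following Python sketch converts the stated concentrations into amounts to weigh out for a chosen batch volume. It assumes that the percentages are weight per volume (grams per 100 ml), which the text does not state explicitly. The dictionary and function names are illustrative only.

```python
# Concentrations are taken from the text. Percentages are ASSUMED to be
# weight/volume (g per 100 ml); the text does not say so explicitly.
SELECTIVE_MEDIUM = {
    "yeast nitrogen base": ("percent", 0.67),
    "casamino acids": ("percent", 0.5),
    "glucose": ("percent", 2.0),
    "adenine": ("ug_per_ml", 10.0),
    "uracil": ("ug_per_ml", 20.0),
}

RICH_MEDIUM = {
    "yeast extract": ("percent", 1.0),
    "peptone": ("percent", 2.0),
    "glucose": ("percent", 1.0),
    "adenine": ("ug_per_ml", 80.0),
    "uracil": ("ug_per_ml", 80.0),
}


def grams_needed(recipe, volume_ml):
    """Return the grams of each component needed for volume_ml of medium."""
    amounts = {}
    for name, (unit, value) in recipe.items():
        if unit == "percent":
            grams = value / 100.0 * volume_ml
        elif unit == "ug_per_ml":
            grams = value * volume_ml / 1_000_000.0
        else:
            raise ValueError(f"unknown unit for {name}: {unit}")
        amounts[name] = round(grams, 6)
    return amounts


selective_1l = grams_needed(SELECTIVE_MEDIUM, 1000)
rich_500ml = grams_needed(RICH_MEDIUM, 500)
```

Under the weight-per-volume assumption, one liter of the selective medium would call for about 20 g of glucose and 0.01 g of adenine.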
Mammalian Expression Systems
Mammalian or insect host cell culture systems could also be employed to express recombinant polypeptides of the invention. Baculovirus systems for production of heterologous proteins in insect cells are reviewed by Luckow and Summers, Bio/Technology 6:47 (1988). Established cell lines of mammalian origin also can be employed. Examples of suitable mammalian host cell lines include the COS-7 line of monkey kidney cells (ATCC CRL 1651) (Gluzman et al., Cell 23:175, 1981), L cells, C127 cells, 3T3 cells (ATCC CCL 163), Chinese hamster ovary (CHO) cells, HeLa cells, and BHK (ATCC CRL 10) cell lines, and the CV-1/EBNA-1 cell line (ATCC CRL 10478) derived from the African green monkey kidney cell line CV-1 (ATCC CCL 70) as described by McMahan et al. (EMBO J. 10: 2821, 1991).
Established methods for introducing DNA into mammalian cells have been described (Kaufman, R. J., Large Scale Mammalian Cell Culture, 1990, pp. 15-69). Additional protocols using commercially available reagents, such as Lipofectamine (Gibco/BRL) or Lipofectamine-Plus, can be used to transfect cells (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417, 1987). In addition, electroporation can be used to transfect mammalian cells using conventional procedures, such as those in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2 ed. Vol. 1-3, Cold Spring Harbor Laboratory Press, 1989. Selection of stable transformants can be performed using resistance to cytotoxic drugs as a selection method. Kaufman et al., Meth. in Enzymology 185:487-511, 1990, describes several selection schemes, such as dihydrofolate reductase (DHFR) resistance. A suitable host strain for DHFR selection can be CHO strain DX-B11, which is deficient in DHFR (Urlaub and Chasin, Proc. Natl. Acad. Sci. USA 77:4216-4220, 1980). A plasmid expressing the DHFR cDNA can be introduced into strain DX-B11, and only cells that contain the plasmid can grow in the appropriate selective media. Other examples of selectable markers that can be incorporated into an expression vector include cDNAs conferring resistance to antibiotics, such as G418 and hygromycin B. Cells harboring the vector can be selected on the basis of resistance to these compounds.
Transcriptional and translational control sequences for mammalian host cell expression vectors can be excised from viral genomes. Commonly used promoter sequences and enhancer sequences are derived from polyoma virus, adenovirus 2, simian virus 40 (SV40), and human cytomegalovirus. DNA sequences derived from the SV40 viral genome, for example, the SV40 origin, early and late promoters, enhancer, splice, and polyadenylation sites, can be used to provide other genetic elements for expression of a structural gene sequence in a mammalian host cell. Viral early and late promoters are particularly useful because both are easily obtained from a viral genome as a fragment, which can also contain a viral origin of replication (Fiers et al., Nature 273:113, 1978; Kaufman, Meth. in Enzymology, 1990). Smaller or larger SV40 fragments can also be used, provided the approximately 250 bp sequence extending from the Hind III site toward the Bgl I site located in the SV40 viral origin of replication site is included.
Additional control sequences shown to improve expression of heterologous genes from mammalian expression vectors include such elements as the expression augmenting sequence element (EASE) derived from CHO cells (Morris et al., Animal Cell Technology, 1997, pp. 529-534) and the tripartite leader (TPL) and VA gene RNAs from Adenovirus 2 (Gingeras et al., J. Biol. Chem. 257:13475-13491, 1982). The internal ribosome entry site (IRES) sequences of viral origin allow dicistronic mRNAs to be translated efficiently (Oh and Sarnow, Current Opinion in Genetics and Development 3:295-300, 1993; Ramesh et al., Nucleic Acids Research 24:2697-2700, 1996). Expression of a heterologous cDNA as part of a dicistronic mRNA followed by the gene for a selectable marker (e.g., DHFR) has been shown to improve transfectability of the host and expression of the heterologous cDNA (Kaufman, Meth. in Enzymology, 1990). Exemplary expression vectors that employ dicistronic mRNAs are pTR-DC/GFP described by Mosser et al., Biotechniques 22:150-161, 1997, and p2A51 described by Morris et al., Animal Cell Technology, 1997, pp. 529-534.
A useful high expression vector, pCAVNOT, has been described by Mosley et al., Cell 59:335-348, 1989. Other expression vectors for use in mammalian host cells can be constructed as disclosed by Okayama and Berg (Mol. Cell. Biol. 3:280, 1983). A useful system for stable high level expression of mammalian cDNAs in C127 murine mammary epithelial cells can be constructed substantially as described by Cosman et al. (Mol. Immunol. 23:935, 1986). A useful high expression vector, PMLSV N1/N4, described by Cosman et al., Nature 312:768, 1984, has been deposited as ATCC 39890. Additional useful mammalian expression vectors are described in EP-A-0367566, and in U.S. patent application Ser. No. 07/701,415, filed May 16, 1991, incorporated by reference herein. The vectors can be derived from retroviruses. In place of the native signal sequence, a heterologous signal sequence can be added, such as the signal sequence for IL-7 described in U.S. Pat. No. 4,965,195; the signal sequence for IL-2 receptor described in Cosman et al., Nature 312:768 (1984); the IL-4 signal peptide described in EP 367,566; the type I IL-1 receptor signal peptide described in U.S. Pat. No. 4,968,607; and the type II IL-1 receptor signal peptide described in EP 460,846.
The polypeptides of the invention and the nucleic acid molecules encoding them can also be used as reagents to identify (a) proteins that the disclosed polypeptides or their constituent proteins regulate, and (b) other proteins with which they might interact. For this purpose, the disclosed polypeptides can be coupled to a recombinant protein or to an affinity matrix, or they can be used as a bait in the yeast two-hybrid system. The use of the yeast two-hybrid system developed by Stanley Fields and coworkers is well known in the art and described in Golemis, E., et al., Section 20.1 in: Current Protocols in Molecular Biology, ed. Ausubel, F. M., et al., John Wiley & Sons, NY, 1997 and in The Yeast Two-Hybrid System, ed. P. L. Bartel and S. Fields, Oxford University Press, 1997.
Antibodies and Peptide Binding Proteins
Purified polypeptides of the invention can be used to generate antibodies that bind to one or more epitopes of the disclosed polypeptide. Such anti-polypeptide antibodies include polyclonal antibodies, monoclonal antibodies, fragments thereof such as F(ab′)2 and Fab fragments, as well as any recombinantly produced binding partners. Antibodies are defined to be specifically binding if they bind pine tree polypeptides with a Ka of greater than or equal to about 10⁷ M⁻¹. Affinities of binding partners or antibodies can be readily determined using conventional techniques, for example, those described by Scatchard et al., Ann. N.Y. Acad. Sci., 51:660 (1949).
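To illustrate how the affinity threshold above might be checked, the following Python sketch estimates Ka from a Scatchard-style plot. For a single class of binding sites, bound/free plotted against bound is linear with a slope of −Ka. The function names, the single-site assumption, and the noise-free synthetic data are illustrative assumptions and are not taken from the specification.

```python
def scatchard_ka(bound, free):
    """Estimate Ka (in M^-1) from paired bound and free ligand concentrations (M).

    Fits bound/free = Ka * (Bmax - bound) by ordinary least squares.
    ASSUMES a single class of independent binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return -slope  # the Scatchard slope is -Ka


def is_specific(ka, threshold=1e7):
    """Apply the 'about 10^7 M^-1' criterion stated in the text."""
    return ka >= threshold


# Hypothetical, noise-free data generated from a known Ka for illustration.
TRUE_KA = 5e7   # M^-1 (invented value)
BMAX = 1e-9     # M, total binding sites (invented value)
free_conc = [1e-9, 5e-9, 1e-8, 5e-8, 1e-7]
bound_conc = [BMAX * TRUE_KA * f / (1 + TRUE_KA * f) for f in free_conc]

ka_estimate = scatchard_ka(bound_conc, free_conc)
```

Real binding data are noisy and may show curvature when more than one class of site is present, so a straight-line fit of this kind is only a first approximation.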
Polyclonal antibodies can be readily generated from a variety of sources, for example, horses, cows, goats, sheep, dogs, chickens, rabbits, mice, hamsters, guinea pigs, or rats, using procedures that are well-known in the art, for example, as described in U.S. Pat. No. 5,585,100, incorporated by reference herein. In general, a composition comprising at least one of the polypeptides of the invention is administered to the host animal, typically through intra-peritoneal or subcutaneous injection. In the case where a peptide is used as the immunogen, it is preferable to conjugate it to a suitable carrier molecule, such as a T-dependent antigen (Bovine Serum Albumin, cholera toxin, and the like). The immunogenicity of the disclosed polypeptides can also be enhanced through the use of an adjuvant, for example, Freund's complete or incomplete adjuvant or alum. Following booster immunizations, small samples of serum are collected and tested for reactivity to the disclosed polypeptides or their constituent epitopes. Examples of various assays useful for such determination include those described in: Antibodies: A Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor Laboratory Press, 1988; as well as procedures such as countercurrent immuno-electrophoresis (CIEP), radioimmunoassay, radio-immunoprecipitation, enzyme-linked immuno-sorbent assays (ELISA), dot blot assays, and sandwich assays, see U.S. Pat. Nos. 4,376,110 and 4,486,530, each of which is incorporated by reference in its entirety.
Monoclonal antibodies (or fragments thereof) directed against epitopes of the disclosed polypeptides can also be readily prepared using well-known procedures, such as, for example, the procedures described in U.S. Patent No. RE 32,011, U.S. Pat. Nos. 4,902,614, 4,543,439, and 4,411,993; Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses, Plenum Press, Kennett, McKearn, and Bechtol (eds.), 1980, each of which is incorporated by reference. Briefly, the host animals, such as mice, are injected intraperitoneally at least once, and preferably at least twice at about 3 week intervals, with isolated and purified polypeptides, optionally in the presence of adjuvant. Again, if peptide fragments are used they may need to be conjugated to a suitable carrier protein. Mouse sera are then assayed by conventional dot blot technique or antibody capture (ABC) to determine which animal is best to fuse. Approximately two to three weeks later, the mice are given an intravenous boost of pine tree polypeptides. Mice are later sacrificed and spleen cells fused with commercially available myeloma cells, such as Ag8.653 (ATCC), following established protocols. Briefly, the myeloma cells are washed several times in media and fused to mouse spleen cells at a ratio of about three spleen cells to one myeloma cell. The fusing agent can be any suitable agent used in the art, for example, polyethylene glycol (PEG). The fusion is plated out into plates containing media that allow for the selective growth of the fused cells. The fused cells can then be allowed to grow for approximately eight days. Supernatants from resultant hybridomas are collected and added to a plate that is first coated with goat anti-mouse Ig. Following washes, a label, such as 125I-labeled pine tree polypeptides, is added to each well followed by incubation. Positive wells can be subsequently detected by autoradiography.
Positive clones can be grown in bulk culture and supernatants are subsequently purified over a Protein A column (Pharmacia).
Monoclonal antibodies and specific-binding fragments of the invention can be produced using alternative techniques, such as those described by Alting-Mees et al., “Monoclonal Antibody Expression Libraries: A Rapid Alternative to Hybridomas”, Strategies in Molecular Biology 3:1-9 (1990), which is incorporated herein by reference. Similarly, binding partners can be constructed using recombinant DNA techniques to incorporate the variable regions of a gene that encodes a specific binding antibody. Such a technique is described in Larrick et al., Biotechnology, 7:394 (1989).
It is understood, of course, that many techniques could be used to generate antibodies against the polypeptides of the invention and that the above embodiments in no way limit the scope of the invention.
Nucleotides, Proteins, Antibodies, and Binding Proteins As Probes and Reagents
The disclosed nucleic acids, polypeptides, and antibodies directed against the disclosed polypeptides can be used in a variety of research protocols, such as in DNA arrays or as reagents. A sample of such research protocols is given in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2 ed. Vol. 1-3, Cold Spring Harbor Laboratory Press (1989), incorporated by reference. For example, the compiled sequences, polypeptides, etc., can serve as markers for cell specific or tissue specific expression of RNA or proteins. Similarly, this system can be used to investigate constitutive and transient expression of the genes encoding the cDNAs of SEQ ID NOS: 1-327 and the proteins encoded by these genes.
Further, the disclosed cDNA sequences can be used to determine the chromosomal location of the genomic DNA and to map genes in relation to this chromosomal location. The disclosed nucleotide sequence can be further used to identify additional genes related to the nucleotides of SEQ ID NOS: 1-334 and to establish evolutionary relatedness among species based on the comparison of sequences. The disclosed nucleotide and polypeptide sequences can be used to select for those genes or proteins that are homologous to the disclosed cDNAs or polypeptides, using well-established positive screening procedures such as Southern blotting and immunoblotting and negative screening procedures such as subtractive hybridization.
Method for Using Nucleic Acid Probes or Antibodies to Stage Embryos
Accurate staging of tree embryos is critical. It is known that different stages of tree embryos have different capacities as subjects for genetic transformation and genetic engineering. In addition, environmental requirements exhibited by embryos vary due to increasing physiologic age. Currently, the staging of tree embryogenesis is most accurately performed by an expert in the field who is very familiar with the morphological appearance of embryos at different stages. The cDNAs and related molecules of this invention can be used as markers for different stages of tree embryogenesis, thereby eliminating the need for a subjective eye to assess maturity and potentially allowing for more accurate staging of tree embryos. Moreover, by monitoring the expression of the underlying genes, it is possible to determine when an embryo has reached a certain level of development even if that level does not correspond to a visible difference in embryo morphology. The relational database of this invention aids the ability to monitor expression levels and tailor research approaches, such as the use of DNA arrays, to the specific needs of the objective, i.e., staging.
The information provided in this invention can be used in whole or in part to stage embryos. For example, one or a multiplicity of nucleic acid molecules from SEQ ID NOS: 1-327 having an expression profile consistent with a particular embryo stage can be used in this invention. A researcher may find it beneficial to use oligonucleotide probes or antibodies, for example, that specifically recognize proteins derived from genes expressed during middle embryonic stages, or that specifically monitor expression levels for embryos that have reached maturity associated with late developmental stages. A researcher can quickly determine that an embryo subset has progressed to or through an embryonic stage with the use of this invention and make appropriate changes in conditions if necessary, e.g. alter growth media or other environmental conditions.
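The comparison of an unstaged embryo against stage-specific expression profiles can be sketched in Python as follows. This is a minimal illustration rather than the method of the invention. The transcript identifiers (T1 to T4), the stage labels, and all expression values are invented, and Pearson correlation is only one of several similarity measures that could be used.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def assign_stage(sample, reference):
    """Return (stage, r) for the reference profile best correlated with sample.

    sample:    dict mapping transcript id -> expression level
    reference: dict mapping stage label -> dict of transcript id -> level
    """
    best_stage, best_r = None, -2.0
    for stage, profile in reference.items():
        shared = sorted(set(sample) & set(profile))
        if len(shared) < 2:
            continue  # not enough shared transcripts to compare
        r = pearson([sample[t] for t in shared], [profile[t] for t in shared])
        if r > best_r:
            best_stage, best_r = stage, r
    return best_stage, best_r


# Hypothetical reference database and unknown sample (all values invented).
reference_profiles = {
    "early": {"T1": 9.0, "T2": 2.0, "T3": 1.0, "T4": 1.0},
    "middle": {"T1": 3.0, "T2": 8.0, "T3": 4.0, "T4": 1.0},
    "late": {"T1": 1.0, "T2": 2.0, "T3": 7.0, "T4": 9.0},
}
unknown_embryo = {"T1": 2.5, "T2": 7.0, "T3": 4.5, "T4": 1.5}

stage, r = assign_stage(unknown_embryo, reference_profiles)
```

In this invented example the unknown sample correlates most strongly with the middle-stage profile. In practice the reference profiles would come from measured transcript levels of the sequences in Table I.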
Method for Monitoring, Enhancing, or Determining Expression of Stage-Specific Genes
Expression patterns of SEQ ID NOS: 1-327 indicate that gene activation can be classified as stage-specific, such as in the case of SEQ ID NO: 327, otherwise known as “LP2-3.” The promoter that drives such a gene can perform valuable functions. For example, a promoter from LP2-3 operatively linked to a reporter gene presented within an embryo system is expected to produce the reporter product under the conditions for expression of gene LP2-3. Thus, the system allows a rapid determination of stage specific embryos by a simple phenotypic reporter screen, perhaps by visualization of green fluorescent protein (GFP) or by loss of fluorescent protein product. Similarly, a set of promoters from known, differently staged genes operatively linked to reporter genes will be effective for monitoring developmental changes within the system as the embryos mature. The LP2-3 promoter is identified as SEQ ID NOS: 328-334 in Table I. The promoter expression pattern is that of the natively linked gene, LP2-3.
Virtually any indicator or reporter gene can be used for this approach or for other methods associated with this invention provided it is compatible with the system studied. Generally, reporter genes are genes that are typically not present in the recipient organism or tissue and that encode proteins resulting in some phenotypic change or enzymatic property. Examples of such genes and assays are provided by Schenborn, E. and Groskreutz, D., Mol. Biotechnol., 13:29, 1999; Helfand, S. L. and Rogina, B., Results Probl. Cell Differ., 29:67, 2000; Kricka, L. J., Methods Enzymol., 305:333, 2000; Himes, S. R. and Shannon, M. F., Methods Mol. Biol., 130:165, 2000; and Leffel, S. M. et al., Biotechniques, 23:912, 1997, which are incorporated in their entirety by reference. In one embodiment of this invention, the reporter used is GFP, or any variant of the fluorescent protein.
Additionally, one skilled in the art would recognize that a promoter, like that from LP2-3, has potential to stimulate production of products not ordinarily observed at a particular stage. A promoter derived from a gene that expresses during a known stage, for example an early stage, can be operatively linked to a gene that does not normally express during that stage, yielding controlled expression of any targeted gene. It may be shown that earlier or later expression, or prolonged expression of a particular gene may give a desirable genotype or phenotype in a mature plant, may result in increased vigor in culture, or may be sufficient to alter the normal maturation process of the embryo. Prolonged expression of any desired gene also may be achieved from linking a constitutively expressed promoter to the targeted gene. Further, the ability to manipulate gene expression during embryogenesis allows for a detailed study of the effects of an individual gene or multiple genes on embryogenesis, leading to a better understanding of the developmental processes involved in embryogenesis.
Method of Correlating Gene Expression with Improved Tree Stock or Culture Conditions
Importantly, the cDNAs and related molecules of the invention can also be used as markers to examine genetic heterogeneity and heredity through the use of techniques such as genetic fingerprinting. These markers can also be correlated with improved agronomic traits including good initiation frequency, embryonic maturation, high frequency of germination, rapid growth rates, herbicide tolerance, insect resistance, pathogen resistance, climate and environmental adaptability, wood quality, and wood fiber quality and content, to name a few. Additionally, the expression of these developmentally regulated genes can be compared among genetically identical clones grown under different culture conditions to determine the best protocols and media for somatic embryogenesis.
Cryogenic storage of pine tree embryos is effective for maintaining stocks of embryos determined by this invention to have the desired fitness traits or to exist at the appropriate developmental stage. With such storage, one can specifically target desirable embryos for expansion many years after they are frozen. For example, a culture of somatic embryos can be divided into at least three portions, one of which is cryogenically stored, one of which is used to study embryonic gene (and protein) expression, and one of which is used to produce seedlings for field testing. Clones producing valuable mature plants could be selected and expanded from frozen stocks. Additional clones exhibiting similar expression patterns could be selected for future expansion and cultivation.
As will be evident to the ordinary practitioner, there are numerous ways in which the nucleic acids, polypeptides and antibodies of this invention might be used to characterize the gene expression of embryos. Ideally, the stage-specific gene expression of embryos of several different genotypes and at several different stages of embryogenesis is characterized. For example, sets of oligonucleotide primers designed using any one of SEQ ID NOS: 1-327 may be used in RT-PCR assays to characterize expression of a gene product. In situ hybridization assays or antibody staining protocols may also be used to characterize RNA and/or protein expression and localization.
Embryos of the same genotype in which gene expression has been characterized may also be used to generate plantlets that are used in field testing. Once the embryos have developed into mature trees, the trees of the various genotypes can be evaluated for important traits such as growth rates, herbicide tolerance, insect resistance, pathogen resistance, climate and environmental adaptability, wood quality, and wood fiber quality and content, among others. Finally, the phenotypic data collected from the field testing can be correlated with gene expression during early embryogenesis to further enhance the database of the present invention. This will allow further identification of gene products whose expression is correlated, either positively or negatively, with commercially valuable tree characteristics.
It will be clear to those skilled in the art that identification of such gene products can have several uses. Determining the correlation between a desirable phenotype and a genotype would allow for the “pre-selection” of tree embryos for field testing. It would also be useful in evaluating experimental tissue culture conditions for somatic embryogenesis; in other words, the expression level of a gene known to correlate with the development of trees with desirable characteristics could serve as the criterion on which culture media is evaluated, as opposed to assessing the phenotype of fully matured trees. The ability to evaluate culture conditions without having to develop fully mature trees and do field testing would save a great deal of research time and expense. And of course, the knowledge of the correlation between gene expression and desirable tree phenotypes would serve to identify target genes for genetic engineering.
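The correlation of field phenotype with embryonic gene expression described above can be illustrated with a short Python sketch that ranks candidate genes by the strength of their correlation with a trait across clones. The gene names, growth rates, and expression values are all hypothetical, and a real analysis would also need replication and a significance test.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def rank_genes_by_trait(expression, trait):
    """Return (gene, r) pairs sorted by |r|, strongest correlation first.

    expression: dict mapping gene -> expression levels, one per clone
    trait:      trait values for the same clones, in the same order
    """
    scored = [(gene, pearson(levels, trait)) for gene, levels in expression.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical data for five clones (all values invented).
growth_rate = [1.0, 2.0, 3.0, 4.0, 5.0]
embryo_expression = {
    "gene_x": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with growth rate
    "gene_y": [9.0, 7.5, 5.0, 3.5, 1.0],  # falls with growth rate
    "gene_z": [4.0, 5.0, 4.0, 5.0, 4.0],  # unrelated to growth rate
}

ranked = rank_genes_by_trait(embryo_expression, growth_rate)
```

Genes at the top of such a ranking, whether positively or negatively correlated, would be the candidates for use as pre-selection markers or as criteria for evaluating culture media.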
Genetically Engineering Trees and Other Plants
There are several methods known in the art for the creation of transgenic plants. These include, but are not limited to: electroporation of plant protoplasts, liposome-mediated transformation, polyethylene-glycol-mediated transformation, microinjection of plant cells, and transformation using viruses. Because the invention is especially concerned with the transformation of woody species, the two prevalent methods for transforming forest trees, namely Agrobacterium-mediated transfer and direct gene transfer by particle bombardment, will be discussed in more detail, though it is understood that the present invention encompasses generation of transgenic plants via standard methods commonly known in the art.
Agrobacterium Mediated Transfer
A. tumefaciens and A. rhizogenes are two soil microorganisms that naturally infect a wide variety of plants including dicotyledonous plants, gymnosperms and some monocotyledonous plants. Infection by these organisms results in the growth of crown gall tumors or in hairy root disease, respectively. Each of these organisms carries a large plasmid, the tumor inducing (Ti) plasmid, in the case of A. tumefaciens and the root-inducing (Ri) plasmid in the case of A. rhizogenes. These plasmids have two critical features, a set of virulence genes and a segment of DNA called T-DNA that is delimited by conserved regions of approximately 25 base pairs known as the left and right borders. During infection, the T-DNA is transferred to the plant cell where it is able to stably integrate in single copy in the plant genome. Transfer of T-DNA requires the function of the virulence genes.
In its natural state, T-DNA contains genes that mediate progression of disease such as growth hormones or genes controlling root morphogenesis. Using recombinant DNA technology, however, T-DNA may be modified to contain an expression cassette encoding a foreign gene of interest. There are several T-DNA vector systems commonly in use for the transformation of plants. Several of these vector systems are reviewed in Hansen et al., Current Topics in Microbiology and Immunology 240: 21-57 (1999) which is hereby incorporated by reference. T-DNA vectors must include the left and right borders. In addition they must either be capable of replication in Agrobacterium or be designed so as to recombine with a plasmid that does so. The latter type of vector is known as a co-integrate vector. For transformation to proceed, there must also be a source of virulence (vir) genes. The vir genes may be on the same plasmid with the T-DNA or more likely supplied by a helper plasmid. For example, binary T-DNA vector systems are comprised of two plasmids, one containing the vir genes and the other containing T-DNA. Some plants known to be recalcitrant to Agrobacterium-mediated transformation may be transformed if additional copies of some or all virulence genes are provided. Extra copies of VirG and VirE can be particularly useful.
Additionally, it is convenient to include in the T-DNA a selectable marker that will allow identification and selection of transformed plant cells. The selectable marker should be one that works in both Agrobacterium and the target plant. For example, the genes encoding chloramphenicol acetyltransferase and neomycin phosphotransferase are suitable marker genes that confer resistance to chloramphenicol and kanamycin, respectively. Additionally, a selectable marker may be provided on a separate T-DNA from the T-DNA encoding the gene of interest. Co-transformed T-DNAs can integrate at separate sites in the plant genome. This can be useful because it will later allow segregation of the marker gene in progeny enabling the generation of transgenic trees expressing the gene of interest but not the marker gene.
The gene of interest and the selectable marker genes must also be under the control of promoters that function in the transformed plant cell. Examples of suitable promoters include, but are not limited to: the abscisic acid (ABA)-inducible promoter from the early methionine (Em) gene from wheat (Marcotte et al., Plant Cell 1:976-979 (1989)); the cauliflower mosaic virus (CaMV) 35S promoter (Odell et al., Nature 313:810-812 (1985)); and the nopaline synthase (nos) promoter (Sanders et al., Nucl. Acids Res. 15(4):1543-58 (1987)). Tissue-specific plant promoters or plant promoters responsive to chemical, hormone, heat or light treatments may be used. Additionally, the gene of interest may be expressed under the control of its endogenous promoter to ensure proper regulation.
The process of transformation requires plant cells that are competent and that are either embryogenic or organogenic. The plant cells to be transformed are then co-cultivated with Agrobacterium containing an engineered T-DNA vector system for 1-5 days. Following the co-cultivation period, the cells are incubated with the antibiotic against which the selectable marker confers resistance, and transformed lines are selected for further cultivation. The use of Agrobacterium mediated transfer in woody trees is described in Loopstra et al., Plant Molecular Biology 15:1-9 (1990), Gallardo et al., Planta 210:19-26 (1999) and Wenck et al., Plant Molecular Biology 39:407-419 (1999), each of which is hereby incorporated by reference.
Direct Gene Transfer by Particle Bombardment
Direct gene transfer by particle bombardment provides another method for transforming plant tissue. This method can be especially useful when plant species are recalcitrant to transformation by other means. In this technique a particle, or microprojectile, coated with DNA is shot through the physical barriers of the cell. Particle bombardment can be used to introduce DNA into any target tissue that is penetrable by DNA coated particles, but for stable transformation, it is imperative that regenerable cells be used. Typically, the particles are made of gold or tungsten. The particles are coated with DNA using either CaCl2 or ethanol precipitation methods which are commonly known in the art.
DNA coated particles are shot out of a particle gun. A suitable particle gun can be purchased from Bio-Rad Laboratories (Hercules, Calif.). Particle penetration is controlled by varying parameters such as the intensity of the explosive burst, the size of the particles, or the distance particles must travel to reach the target tissue.
The DNA used for coating the particles should comprise an expression cassette suitable for driving the expression of the gene of interest. Minimally this will comprise a promoter operably linked to the gene of interest. As with Agrobacterium-mediated transformation, suitable promoters include, but are not limited to, the abscisic acid (ABA)-inducible Em promoter from wheat (Marcotte et al., Plant Cell 1:976-979 (1989)), the CaMV 35S promoter (Odell et al., Nature 313:810-812 (1985)), and the nos promoter (Sanders et al., Nucl. Acids Res. 15(4):1543-58 (1987)).
Methods for performing direct gene transfer by particle bombardment are disclosed in U.S. Pat. No. 5,990,387 to Tomes et al. Additionally, Ellis et al. describe the successful use of direct gene transfer to white spruce and larch trees in Bio/Technology 11, 84-89 (1993).
Researchers skilled in the area of DNA or gene transformation will recognize that additional procedures, or combinations of procedures, may be useful for the successful transformation of genetic stock.
The cDNAs of the invention may be expressed in such a way as to produce either sense or antisense RNA. Antisense RNA is RNA that has a sequence which is the reverse complement of the mRNA (sense RNA) encoded by a gene. A vector that will drive the expression of antisense RNA is one in which the cDNA is placed in “reverse orientation” with respect to the promoter such that the non-coding strand (rather than the coding strand) is transcribed. The expression of antisense RNA can be used to down-modulate the expression of the protein encoded by the mRNA to which the antisense RNA is complementary. This phenomenon is also known as “antisense suppression.” It is believed that down-regulation of protein expression following antisense RNA expression is caused by the binding of the antisense RNA to the endogenous mRNA molecule to which it is complementary, thereby inhibiting or preventing translation of the endogenous mRNA.
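The relationship between sense and antisense RNA described above can be made concrete with a brief Python sketch. Given the coding-strand sequence of a cDNA, the sense RNA has the same sequence with U in place of T, and the antisense RNA is its reverse complement. The function names and the six-base example sequence are illustrative only and are not drawn from SEQ ID NOS: 1-334.

```python
# Complement table for DNA bases (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def sense_rna(coding_strand):
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand):
    """RNA produced when the cDNA is placed in reverse orientation.

    This is the reverse complement of the sense RNA.
    """
    return reverse_complement(coding_strand.upper()).replace("T", "U")


# Illustrative six-base coding-strand fragment (invented).
example_cdna = "ATGGCC"
example_sense = sense_rna(example_cdna)
example_antisense = antisense_rna(example_cdna)
```

For the invented fragment ATGGCC, the sense RNA is AUGGCC and the antisense RNA is GGCCAU, which can base-pair with the sense strand in antiparallel orientation.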
The antisense RNA expressed need not correspond to the full-length cDNA and need not be exactly complementary to the target mRNA. Generally, however, where the introduced sequence is of shorter length, a higher degree of homology to the endogenous mRNA will be needed for effective antisense suppression. Preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. The length of the antisense sequence in the vector may be greater than 100 nucleotides. Vectors producing antisense RNAs could be used to make transgenic plants, as described above, in situations where desirable tree characteristics are produced when the expression of a particular gene is reduced or inhibited.
The cDNAs of the current invention can be derived from any sets of plant tissue. The cDNAs of SEQ ID NOS: 1-334, for example, were originally derived from embryonic tissues of pine tree embryos staged 1-9.9 as classified in Pullman and Webb TAPPI R&D Division 1994 Biological Sciences Symposium, pages 31-34, which is hereby incorporated by reference. LPS and LPZ clones are derived from somatic and zygotic embryos, respectively. As noted, embryos may be of either somatic or zygotic derivation, and the embryos may be grown in either semi-solid or liquid tissue culture systems. Applicable methods for growing embryos in semi-solid or liquid tissue culture systems are disclosed in U.S. Pat. Nos.: 5,036,007; 5,236,841; 5,294,549; 5,413,930; 5,491,090; 5,506,136; 5,563,061; 5,677,185; 5,731,203; 5,731,204; and U.S. Patent Application 60/212,651 filed Jun. 19, 2000, which are hereby incorporated by reference.
In one embodiment, RNA isolated from staged cell populations provides the starting material for reverse transcription, differential display, and cloning of amplified cDNA. Methods and kits for isolating total RNA from cellular populations, or for generating poly(A)+ RNA, are commonly known in the art. For example, several procedures for isolating RNA are disclosed in Chapter 4 of Current Protocols in Molecular Biology edited by F. A. Ausubel et al., John Wiley and Sons, Inc. (1987) (incorporated herein by reference). As an example, the TRI Reagent® available from Molecular Research Center, Inc. (Cincinnati, Ohio) is a suitable reagent (used according to the manufacturer's instructions) for isolation of RNA from plant tissues.
Differential display provides a method to identify individual messenger RNAs that are differentially expressed among two or more cell populations. In the practice of the present invention, these cell populations may be provided by pine tree or other plant embryos of different developmental stages. The differential display procedure is taught in Liang et al., Science, 257:967-71 (1992) and in U.S. Pat. No. 5,262,311, which are hereby incorporated by reference. Briefly, mRNA sequences are PCR-amplified using two types of oligonucleotide primers known as “anchor” and “arbitrary” primers. Anchor primers are designed to recognize the polyadenylate tail of messenger RNAs. Arbitrary primers are short and arbitrary in sequence and anneal to complementary sequences in various mRNAs. Products amplified with these primers will vary in size and can be differentiated on an agarose or sequencing gel based on their size. If different cell populations are amplified with the same anchor and arbitrary primers, one can compare the amplification products to identify differentially expressed RNA sequences.
PCR-amplified bands representing differentially expressed RNA samples are excised from the gel, transferred to tubes and reamplified using the same primer pairs and PCR conditions as used in the differential display procedure. Methods for the cloning of PCR products are commonly known in the art and there are several commercially available reagents and kits for cloning PCR products. For instance, the pCR-Script™ Cloning kit (Stratagene, La Jolla, Calif.) is suitable for this purpose. Using this kit, E. coli transformants containing plasmids with PCR fragment inserts can rapidly be identified using blue/white color selection followed by plasmid purification and restriction digests. The pCR-Script vector contains T3 and T7 polymerase recognition sites allowing for in vitro transcription of the inserted fragment.
Methods for sequencing DNA, including cloned PCR products, are commonly known in the art. Cloning vectors having M13, T7 or T3 primer annealing sites flanking the PCR-amplified insert can be used in sequencing reactions directly. Most sequencing procedures in use today are modifications of Sanger's dideoxy chain termination sequencing reaction as disclosed in Sanger et al., Proceedings of the National Academy of Sciences, 74:5463-5467 (1977), which is hereby incorporated by reference.
Homology Searching and Identification of Protein Coding Sequences
As understood by one of ordinary skill in the art, the sequence of a cloned cDNA insert may be compared against public databases such as GenBank to discern any identity or homology to known sequences. Programs such as BLAST for performing such a search are available on the National Center for Biotechnology Information's web page located at http://www.ncbi.nlm.nih.gov. The results from a GenBank search may reveal the potential function of a polypeptide or RNA molecule encoded by the cDNA. In addition to searching gene sequence databases, the use of commercially available analysis software is well known in the art. For example, software packages such as the Wisconsin Package™ (Genetics Computer Group, Madison, Wis.) include programs such as FRAMES and CodonPreference that help to identify protein coding sequences in a query nucleotide sequence. FRAMES displays open reading frames for the six DNA translation frames, allowing one to quickly assess the presence or absence of stretches of open reading frames that are likely to be protein encoding regions. CodonPreference is a more sophisticated program that identifies and displays possible protein coding regions based on similarity of the codon usage in the sequence to a codon frequency table (Gribskov et al., 1984).
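The six-frame inspection described above may be approximated, by way of a minimal sketch, with the following code. It is not the FRAMES program. It merely reports, for each of the six reading frames, the length of the longest stretch running from an ATG to an in-frame stop codon. The function names and the test sequence are hypothetical.

```python
# Minimal six-frame ORF scan; a stand-in illustration, not the Wisconsin Package FRAMES program.
STOPS = {"TAA", "TAG", "TGA"}
_COMP = str.maketrans("ACGT", "TGCA")


def six_frames(seq):
    """Yield (strand, offset, subsequence) for the three forward and three reverse frames."""
    seq = seq.upper()
    rev = seq.translate(_COMP)[::-1]
    for strand, s in (("+", seq), ("-", rev)):
        for offset in range(3):
            yield strand, offset, s[offset:]


def longest_orf(frame_seq):
    """Length in nucleotides (stop codon included) of the longest ATG..stop stretch."""
    best, start = 0, None
    for i in range(0, len(frame_seq) - 2, 3):
        codon = frame_seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOPS:
            best = max(best, i + 3 - start)
            start = None
    return best


def orf_summary(seq):
    """Map each (strand, offset) frame to its longest complete ORF length."""
    return {(strand, off): longest_orf(s) for strand, off, s in six_frames(seq)}


if __name__ == "__main__":
    for frame, length in orf_summary("CCATGAAATTTTAGGG").items():
        print(frame, length)
```

A frame carrying a long uninterrupted ORF is a candidate protein-encoding region. A codon-usage comparison of the kind performed by CodonPreference would be a further, separate step.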
cDNA libraries were prepared from staged pine tree embryos, as described above. The differential display technique was used to identify 327 novel cDNAs that were preferentially-expressed during early, middle, or late stages of pine tree embryogenesis, as set forth below. Clone nomenclature is divided into subsets based on tissue type; a clone is designated LPS to indicate somatic origins and LPZ for zygotic origins.
Somatic embryos were collected at different stages of development. Cultures of somatic embryos were initiated from Loblolly pine immature zygotic embryos as described by Becwar et al., Forestry Science 44:287-301 (1994) (incorporated by reference) or with minor modifications in media mineral composition. Somatic embryos were grown in cell suspension culture medium 16 (Pullman and Webb, Tappi R&D Division 1994 Biological Sciences Symposium) and a maturation medium similar to standard maturation media. Resulting somatic embryos were selected and classified as stages 1-9 according to morphological development following the teachings of Pullman and Webb, Tappi R&D Division 1994 Biological Sciences Symposium pp. 31-34. Somatic embryos were sorted into tubes containing the same stages and stored at −70° C.
Total RNA was isolated from all stages of somatic embryos of loblolly pine and grouped into early, middle, and late phases of development. The early phase is represented by a liquid suspension culture containing embryos of stages 1 through stage 3. Middle phase contains embryos of stages 4 through stage 6, while stages 7 through 9 formed the late phase. 60-100 mg aliquots of staged frozen embryos were ground in 1.0 ml of TRI Reagent® Isolation Reagent (Molecular Research Center, Inc.), a commercial product that includes phenol and guanidine thiocyanate in a monophase solution, and extracted according to the manufacturer's instructions.
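For bookkeeping purposes, the grouping of morphological stages into phases described above can be expressed as a simple lookup. The following is only a sketch, and the function name is hypothetical.

```python
def phase_for_stage(stage: int) -> str:
    """Map a somatic embryo stage (1-9) to the phase used for pooling RNA."""
    if 1 <= stage <= 3:
        return "early"
    if 4 <= stage <= 6:
        return "middle"
    if 7 <= stage <= 9:
        return "late"
    raise ValueError(f"stage {stage} is outside the 1-9 range used here")


if __name__ == "__main__":
    for s in range(1, 10):
        print(s, phase_for_stage(s))
```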
Reverse Transcription of mRNA (RT-PCR)
The total RNA was used as a template to synthesize single-stranded cDNA with MMLV reverse transcriptase (100 U/μl). The method involves reverse transcription of the mRNA with an oligo-dT primer (H-T11G: 5′-AAGCTTTTTTTTTTTG-3′) anchored to the beginning of the poly(A) tail, followed by a PCR reaction in the presence of a second short (13-mer) primer which is arbitrary in sequence [AP1 (5′-AAGCTTGATTGCC-3′) or AP2 (5′-AAGCTTCGACTGT-3′)]. Reverse transcription and differential display were conducted using the GenHunter RNAimage Kit 1.
A 19 μl reverse transcription reaction (10 μl sterile water, 2.0 μl 5×RT buffer, 1.6 μl dNTP (250 μM), 2.0 μl anchored primer (2.0 μM), 2.0 μl RNA template at 100 ng/μl) was prepared for each embryo phase sample. The reaction mixture was heated to 65° C. for 5 minutes in a thermocycler, cooled to 37° C. and paused after 10 minutes while 1.0 μl MMLV was added. The program was allowed to resume at 37° C. for 50 minutes. The reaction was then heated to 75° C. for 5 minutes, cooled to 4° C. and stored at −20° C.
Incorporation of Radiolabeled Nucleotides by PCR
Differential Display PCR was performed in a 20 μl reaction containing 2 μl of the reverse-transcribed cDNA template, 10 μl sterile water, 2.0 μl 10×PCR buffer, 1.6 μl dNTP (25 μM), 2.0 μl anchored primer H-T11G (2.0 μM), 2.0 μl 13-mer arbitrary primer (AP1 or AP2) (2.0 μM), 0.2 μl Taq DNA polymerase, and 0.2 μl α32P-dATP (2000 Ci/mmole). The cDNA was amplified by PCR: 94° C. for 3 minutes, 40 cycles of 94° C. for 30 seconds, 40° C. for 2 minutes, and 72° C. for 30 seconds, followed by 72° C. for 5 minutes. The reaction was cooled to 4° C. and stored at −20° C.
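When several reactions are set up in parallel, the per-reaction volumes recited above may be scaled into a master mix. The sketch below is illustrative only. The 10% pipetting overage, the choice to add template separately, and the function name are assumptions not stated in this description.

```python
# Per-reaction volumes (in microliters) as recited above, excluding the cDNA template.
PER_REACTION_UL = {
    "sterile water": 10.0,
    "10x PCR buffer": 2.0,
    "dNTP (25 uM)": 1.6,
    "anchored primer H-T11G (2.0 uM)": 2.0,
    "arbitrary primer AP1 or AP2 (2.0 uM)": 2.0,
    "Taq DNA polymerase": 0.2,
    "alpha-32P-dATP": 0.2,
}
TEMPLATE_UL = 2.0  # reverse-transcribed cDNA, added to each tube individually


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale each component for n reactions; the overage fraction is an assumed allowance."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    total = sum(PER_REACTION_UL.values()) + TEMPLATE_UL
    print(f"per-reaction total: {total:.1f} ul")
    for name, vol in master_mix(3).items():
        print(f"{name}: {vol} ul")
```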
The PCR products were separated on a Stratagene (La Jolla, Calif.) pre-cast 6% polyacrylamide sequencing gel at 30 watts constant power for approximately 2.5 to 3 hours. 3.5 μl of sample was mixed with 2.0 μl of loading dye and incubated at 80° C. for 2 minutes immediately before loading onto the gel. The gel was rinsed in water and dried. Dilute 35P-dATP with loading dye was spotted at the corners as alignment markers and the gels were exposed to Kodak BioMax™ autoradiography film. An exemplary gel is shown in
Bands that appeared to be possible markers for phase specific gene expression were marked on the film and aligned over the gel. The bands were excised by cutting through the film. The gel pieces were scraped from the gel and transferred to tubes and re-amplified using the same primer pairs and PCR conditions as used for incorporation of radiolabeled nucleotides.
Cloning of DNA Fragments from Differential Display
The PCR products from the gel fragments were purified, polished, ligated and cloned into XL10-Gold Kan ultracompetent cells by heat shock with the Stratagene pCR-Script Amp SK(+) Supercompetent Cell Cloning Kit according to manufacturer's instructions. The transformed cells were spread on LB agar plates containing ampicillin, IPTG, and X-Gal each at 50 μg/ml. The plates were incubated overnight at 37° C. Plasmids containing PCR inserts were identified using blue-white colony screening. The presence of inserts was confirmed by digesting the clones with restriction endonucleases, Msc I and Nla ll, followed by standard DNA gel electrophoresis. Transformants representing early, middle, and late phase embryos were sequenced using standard dideoxy protocols known in the art with the T3 primer.
All sequences were analyzed using a program-database pair search of the NCBI BLAST 2.0 server, blastn-nr, blastn-others ests, and blastx-nr. In each case, the query sequence was filtered for low complexity regions by default and entered in FASTA format. Other formatting options were set by default; alignment view-pairwise, descriptions-100, and alignments-50. Using these parameter settings, significant similarity to known DNA, RNA, or protein sequences was found for several of the nucleic acid molecules of SEQ ID NOS: 1-334, for example, those described herein. (Alignment data not shown).
SEQ ID NO: 327, designated LP2-3, was first identified through differential display with T12MG and AP1 primers (GenHunter). The differential display band appeared to be present only in liquid suspension cultures of Loblolly Pine somatic embryos. The conditions for mRNA isolation, reverse-transcription, differential display-PCR, and gel separation/visualization for producing this band were all as described in Example 1. Likewise, the band containing the original LP2-3 fragment was excised from the differential display gel, amplified, and cloned into pCR-Script Amp SK(+) according to standard protocols known in the art.
Northern Hybridizations Demonstrating Early-Specific Expression
Northern analysis demonstrated that the LP2-3 differential display clone hybridized to an approximately 1.2 Kb mRNA from liquid suspension culture embryos but was undetectable in late (6-9) stage embryo RNA. (
LP2-3 Differential Display and ‘Full-Length’ cDNA Sequences
A ‘full-length’ cDNA was captured from SMART™ cDNA made from somatic embryo liquid suspension by using a biotinylated LP2-3 differential display fragment as a capture probe. The ‘full-length’ cDNA was cloned and sequenced according to standard protocols known in the art. This sequence was designated LP2-3+.
GenBank blastx searches conducted with the above sequence translated in all 6 reading frames indicated that LP2-3+ likely encodes a member of the major intrinsic protein (MIP) family. Proteins of this family form membrane channels for the transport of water and/or ions across cell membranes. They may play a significant role in osmoregulation and may play a role in the cellular responses to water and salt stresses. As is known in the art, the MIPs are induced by desiccation, flooding, and high levels of the plant hormone ABA. In contrast, the LP2-3 sequence was not detected in desiccated late-stage embryos which have high levels of ABA and, thus, appears to be regulated by some embryo-specific signal.
Currently the improvement of tissue culture practices arises via hypothesis, evaluation and adoption. Hypotheses arise from observation of size, shape, weight, etc. and physiological measurement of ion or sugar content (
To this end, mRNA levels of two cDNAs (LPZ-202 and LPZ-216), similar to “Late Embryogenesis Abundant” (LEA) proteins, identified in other plants, were monitored. These genes are induced by the plant hormone ABA. Two peaks of mRNA were observed in these clones rather than the typical single peak in most plants. (See
Zygotic and Somatic Loblolly Pine Embryos
Loblolly pine cones were collected weekly from a breeding orchard near Lake Charles, La., and shipped on ice for experimentation. Embryos were excised and evaluated for developmental stage (Pullman et al. 1994). Stage 9 embryos were separated by the week they were collected: 9.1 (week 1), 9.2 (week 2), etc. Staged zygotic embryos were sorted into vials partially immersed in liquid nitrogen and stored at −70° C. Somatic embryos for loblolly pine were initiated as described by Becwar et al. (1995) or with minor modifications. Somatic embryos were grown, selected, and staged as described by Pullman et al. (1994) and stored at −70° C.
cDNA Probe Preparation and Hybridization
30 ng of purified Lea protein cDNA fragments was labeled with 32P-dCTP using the Ready-To-Go cDNA Random Labeling kit (Pharmacia). The labeled cDNAs were purified using a NICK column (Pharmacia) and heat denatured for hybridization. The RNA slot blot was pre-hybridized in hybridization buffer (0.5 M sodium-phosphate, pH 7.2, 5% SDS, and 10 mM EDTA) at 65° C. for 2 hours in a hybridization oven (Model 400, Robbins Scientific, Sunnyvale, Calif.) and then hybridized in the same conditions with the cDNA probes. After hybridization, the membranes were washed at 65° C. in 0.2×SSC and 0.1% SDS. Each wash was 15 min. The membranes were then exposed to Image Plate.
The probes can be stripped from the RNA slot blot by pouring boiling 0.5% SDS onto the membrane twice and incubating without heating for 30 min. The stripped blot was then exposed to Image Plate overnight to check the completeness of the de-probing before the next round of hybridization.
To ensure equal loading of each RNA sample, the same membranes were stripped and hybridized with a 32P-dCTP labeled 26S rDNA fragment. These results were used as controls to normalize the Lea protein gene expression levels.
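The normalization step may be sketched as a simple ratio of the Lea probe signal to the 26S rDNA signal measured on the same slot. The sketch below assumes background-corrected intensities. The function name and the numerical values are hypothetical and do not represent measured data.

```python
def normalize_to_loading_control(target: dict, control: dict) -> dict:
    """Divide each sample's target-probe signal by its loading-control signal."""
    normalized = {}
    for sample, signal in target.items():
        ctrl = control[sample]
        if ctrl <= 0:
            raise ValueError(f"loading control for {sample!r} must be positive")
        normalized[sample] = signal / ctrl
    return normalized


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    lea = {"stage 5": 200.0, "stage 9.2": 60.0}
    rdna_26s = {"stage 5": 100.0, "stage 9.2": 120.0}
    print(normalize_to_loading_control(lea, rdna_26s))
```

Because both signals come from the same membrane, the ratio corrects for unequal RNA loading between slots but not for differences in probe specific activity between hybridizations.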
As a means of evaluating the usefulness of these arrays, we followed the expression of three cDNAs that have strong sequence similarity to late embryogenesis abundant (Lea) proteins from cotton (Baker et al. 1988). Lea proteins and mRNAs appear in embryos at a stage when ABA is high and the genes can be induced in vegetative tissue by application of ABA. The transcript level of Lea genes LPZ-202 and LPZ-216 showed two peaks, rising from stage 5 and returning to a base line about stage 9.2 then rising again around stage 9.5. (See
To confirm the fluctuation in lea transcript levels by Northern analysis, RNA was extracted from zygotic embryos at different stages of development. A ‘dehydrin’ cDNA from the North Carolina State University cDNA collection (http://www.cbc.med.umn.edu/ResearchProjects/Pine/DOE.pine/index.html) was used as probe for some experiments. Dehydrins are a class of lea protein, originally identified as water deficit inducible proteins. Since the expression of this class of protein is well characterized, in contrast to our lea genes, the dehydrin expression profile could act as a reference point. After probing with dehydrin, blots were stripped and probed with a 26S rDNA probe from Arabidopsis to check the loading of the original gel. The normalized expression pattern of dehydrin in the zygotic embryogenesis is illustrated in the top panel of
This pattern reveals two significant peaks at the early development of the embryos and high expression levels for the stage 9.6 and beyond. The expression pattern of these two lea genes in loblolly pine embryos is consistent with the changes in ABA concentration observed in pine during embryogenesis. (See
Embryos as Compared to Zygotic Embryos for Fitness Determination
The model and goal for somatic embryogenesis is to produce an embryo that in vigor, germinatability, etc., resembles a zygotic embryo. Standard measurements reveal relatively little about the embryos; thus the metabolic state of somatic and zygotic embryos is unknown. The metabolic state of zygotic (natural) embryos can be evaluated by DNA arrays containing the cDNA clones described in this application. A database of mRNA levels for the genes represented on the DNA arrays can then be established. Embryos growing under a new tissue culture protocol (
To illustrate this process, elevation of the plant hormone ABA in maturation medium was evaluated as a protocol modification, as described below. This modification proved beneficial, elevating the number and quality of the embryos produced. The mRNA abundance for cDNAs was assessed by DNA array using RNA isolated from control and elevated ABA conditions; several differences were observed in the mRNA levels of specific genes. Further, abundance of mRNA in the elevated ABA condition more closely resembled the mRNA abundance observed for these same genes in zygotic embryos. Thus a protocol which produces higher quality embryos produces, in these embryos, an mRNA profile that more closely resembles that observed in natural embryos.
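One way such a comparison might be quantified is sketched below. It scores each culture condition by the Pearson correlation between its mRNA profile and a zygotic reference profile drawn from the database. The choice of Pearson correlation, the function names, the gene labels, and all numerical values are assumptions made for illustration and are not results of the experiments described herein.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def closest_to_reference(reference: dict, candidates: dict):
    """Return the candidate whose profile correlates best with the reference, plus all scores."""
    genes = sorted(reference)
    ref = [reference[g] for g in genes]
    scores = {
        name: pearson(ref, [profile[g] for g in genes])
        for name, profile in candidates.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical normalized mRNA levels, for illustration only.
    zygotic = {"gene_A": 1.0, "gene_B": 2.0, "gene_C": 3.0, "gene_D": 4.0}
    conditions = {
        "control": {"gene_A": 4.0, "gene_B": 1.0, "gene_C": 3.0, "gene_D": 2.0},
        "elevated ABA": {"gene_A": 1.1, "gene_B": 2.1, "gene_C": 2.9, "gene_D": 4.2},
    }
    best, scores = closest_to_reference(zygotic, conditions)
    print(best, scores)
```

The same comparison, applied against reference profiles for each developmental stage rather than each culture condition, would suggest the stage that an embryo of unknown stage most closely resembles.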
Zygotic and Somatic Loblolly Pine Embryos
Loblolly pine cones were collected weekly from a breeding orchard near Lake Charles, La., and shipped on ice for experimentation. Embryos were excised and evaluated for developmental stage (Pullman et al. 1994). Stage 9 embryos were separated by the week they were collected: 9.1 (week 1), 9.2 (week 2), etc. Staged zygotic embryos were sorted into vials partially immersed in liquid nitrogen and stored at −70° C. Somatic embryos for loblolly pine were initiated as described by Becwar et al. (1995) or with minor modifications. Somatic embryos were grown, selected, and staged as described by Pullman et al. (1994) and stored at −70° C.
Mass Isolation of Genes Differentially Expressed in Loblolly Pine Zygotic Embryos
The following RNA differential display method is sensitive enough to produce banding patterns from one mid- to late-stage embryo or 10-20 early stage embryos. This technique, which extracts mRNA directly from tissue using oligo(dT) beads, avoids losses inherent in conventional RNA extraction methods and is fast, reliable, and inexpensive. Differences in gene expression during development, as well as between somatic and zygotic embryos, can be easily detected.
To achieve these results, 50-100 μl lysis buffer containing 100 mM Tris-HCl, pH 8.0, 500 mM LiCl, 10 mM EDTA, 1% SDS and 5 mM DTT was added to 10-100 mg of staged embryos in a 1.5 ml tube. The mixture was ground thoroughly with an electric drill containing a plastic pestle bit (VWR, Cat# KT95050-99) that had been sterilized by autoclaving. An additional 50-100 μl lysis buffer was added and the mixture was ground briefly. The grinder and vortex were washed with 100 μl lysis buffer. If multiple samples were processed, each was stored on ice until ready for the next step. The grinding tip was washed with sterile water and dried for the next sample.
After all the samples were ground, they were spun at 4° C. for 15 minutes in a bench top centrifuge at 14,000 rpm. 8 μl oligo(dT) coated Dynal beads (mRNA DIRECT Kit, Dynal, N.Y.) were placed in a 1.5 ml tube. The Dynal beads were washed twice with 100 μl of the above mentioned lysis buffer and suspended in an equal volume of the lysis buffer used in tissue grinding. If more than one sample is handled, the beads for all the samples can be washed together and dispensed in several 1.5-ml tubes. The cleared embryo lysate (after centrifugation) was added to the beads and mixed well.
The mixture was then incubated on ice for 5 min., placed on a magnetic stand (Promega) for 5 min., and partially dried by careful removal of the liquid. To this, 100 μl of washing buffer with LiDS containing 100 mM Tris-HCl, pH 8.0, 0.15 mM LiCl, 1.0 mM EDTA, and 0.1% SDS was added (mRNA DIRECT kit). The mix was transferred to a 200 μl PCR tube. The beads were washed once with 100 μl washing buffer with LiDS and once with 50 μl washing buffer containing 100 mM Tris-HCl, pH 8.0, 0.15 mM LiCl, and 1.0 mM EDTA (mRNA DIRECT kit). The beads were then washed quickly with 20 μl 1×RT Buffer (25 mM Tris-HCl, pH 8.3, 37.6 mM KCl, 2.5 mM MgCl2, and 5 mM DTT) and 20 μl RT Mix containing 1×RT Buffer and 20 μM dNTP was added. The tube was heated at 65° C. for 5 min. and cooled to 37° C. 1 μl MMLV reverse transcriptase (Promega) was added and the mixture was incubated at 37° C. for 1 h. with occasional shaking. Next, 20 μl of water was added to the RT reaction, mixed and a 1.0 μl to 20 μl aliquot of the PCR mix containing 1×Perkin-Elmer PCR buffer, 2.0 μM dNTP, 1.0 μM T12VN, 0.2 μM arbitrary 10-mer, 1 unit AmpliTaq (Perkin-Elmer), 50 μCi α35S-dATP (Amersham) was taken. PCR using temperature settings of 94° C. 30″, 40° C. 1′, 72° C. 2′, 40 cycles, and 72° C. 10′ extension was performed with the Perkin Elmer 9600 Thermal Cycler. All PCR product was run on appropriate gels for band visualization.
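As a planning aid, the programmed hold time of the cycling profile recited above (94° C. 30″, 40° C. 1′, 72° C. 2′ for 40 cycles, then 72° C. 10′) can be tallied as follows. The function name is hypothetical, and the figure excludes temperature ramping, which depends on the instrument.

```python
def program_seconds(cycles: int, step_seconds: list, initial: int = 0, final: int = 0) -> int:
    """Total programmed hold time of a PCR profile, ignoring ramp time between steps."""
    return initial + cycles * sum(step_seconds) + final


if __name__ == "__main__":
    total = program_seconds(cycles=40, step_seconds=[30, 60, 120], final=600)
    print(f"{total} s, i.e. {total / 60:.0f} min of hold time")
```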
cDNA cloning of Differential Display Bands
All dried gels were marked with radioactive ink prior to film exposure for proper alignment between the X-ray film and the dried gel plate. Appropriate bands were marked by puncturing. A scalpel blade was used to score the gel around each band to be excised. The excised gel pieces were placed into a PCR tube containing 2 μl water. PCR was performed using a 50 μl PCR mix (same as for differential display with the following modifications: the primer concentration was 1 μM, the dNTP concentration was 200 μM, and no α35S-dATP was added). The cycle settings were the same as above.
A portion of the PCR products was run on a gel to determine amount and size of PCR products; DNA that did not correspond to the size of the original differential display band was discarded. The remaining PCR fractions were purified using CHROMA SPIN-100 columns (Clontech, Palo Alto, Calif.) according to the manufacturer's instructions. The purified PCR fragments were cloned into the pCR2.1 TA cloning vector (Invitrogen) according to Invitrogen cloning protocols supplied with the vector. The only variation from the standard protocol was an increase in the molar concentration of PCR product to vector (over 100-fold); multiple insertions were not found to be a problem. All ligations were performed at 16° C. overnight, transformed into E. coli strain DH5α, and plated onto LB with X-gal/IPTG.
Five colonies were chosen for PCR verification; PCR products of expected size were selected. About 10 μl of the 30 μl PCR reaction was simultaneously digested with Nla III and Mse I overnight at 37° C. (a 5 h digestion was used as well.) cDNA clones were selected according to the colony PCR and the restriction enzyme digestion pattern.
The differential display protocol for finely staged zygotic embryos of loblolly pine, as described above, has produced more than 600 differential display patterns and more than 60,000 bands. Within that set of bands, we have identified bands that increased and/or decreased during embryo development. From those bands, cDNA clones of this invention were isolated and sequenced.
Detection of Gene Expression by Micro-Array Assay
In order to verify expression patterns of the cloned DNA in loblolly pine embryos, a micro-array assay was developed. The cloned cDNAs were amplified by PCR and adjusted to equal concentrations (0.1 μg/μl). The cDNAs were then dispensed in the wells of a 384-well plate, denatured in 0.3 M NaOH at 65° C. for 30 min. and neutralized with 2 volumes of 20×SSPE mixed with 0.00125% bromophenol blue and 0.0125% xylene cyanol FF (5% gel loading dye). The denatured DNAs were then blotted on to Hybond N+ membranes (Amersham) as arrays using a VP 386 pin blotter (V&P Scientific, Inc., San Diego, Calif.). Each DNA was dot-blotted four times as a quartet on the membrane. An example of quartet spotting is seen in
The cDNA array membranes were pre-hybridized in hybridization buffer (0.5 M Na-phosphate, pH 7.2, 5% SDS, and 10 mM EDTA) at 65° C. for 30′ in a hybridization oven (Model 400, Robbins Scientific, Sunnyvale, Calif.) and then hybridized under the same conditions with total cDNA probes made from mRNA. The membranes were washed twice at room temperature in 2×SSPE and 0.1% SDS, twice in 0.5×SSPE and 0.1% SDS, and twice in 0.1× hybridization buffer. Each wash was roughly 20 min. Each membrane was then exposed to Kodak Biomax MR films.
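Because each cDNA is spotted as a quartet, the four replicate signals for a clone may be reduced to a single value before comparison between membranes. The sketch below uses the median, which is an assumed choice made here so that one aberrant spot does not dominate. The clone label and intensities are hypothetical.

```python
from statistics import median


def summarize_quartets(spot_signals: dict) -> dict:
    """Collapse the four replicate spot intensities of each clone to their median."""
    summary = {}
    for clone, spots in spot_signals.items():
        if len(spots) != 4:
            raise ValueError(f"{clone!r}: expected 4 replicate spots, got {len(spots)}")
        summary[clone] = median(spots)
    return summary


if __name__ == "__main__":
    # Hypothetical densitometry readings; the fourth spot simulates a smeared dot.
    print(summarize_quartets({"clone_1": [10.0, 12.0, 11.0, 50.0]}))
```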
The total cDNA probes referred to above were made by initially creating the first strand cDNA. This was accomplished by mixing loblolly pine embryos (0.05-0.1 gm fresh weight) with 100 μl lysis buffer (containing 100 mM Tris-HCl, pH 8.0, 500 mM LiCl, 10 mM EDTA, 1% SDS and 5 mM DTT) in a 1.5 ml Eppendorf tube. The mix was then ground with an electric drill as described above. Another 100 μl lysis buffer was added and the lysate was ground again briefly. The drill pestle was washed with 100 μl lysis buffer that was pooled with the lysate. After centrifugation at 14K at 4° C. for 15 min. in a Beckman bench top centrifuge, the clear embryo lysate was mixed with 10 μl Dynal beads washed twice with lysis buffer. The suspension was incubated on ice for 5 min., with occasional mixing to allow binding of Poly (A) RNA to the oligo (dT) on the beads, and then left on a magnetic stand at room temperature for another 5 min. The liquid was removed and the beads were moved to a 0.2 ml PCR tube by suspending in 100 μl lysis buffer.
The beads were washed twice with 100 μl of washing buffer with LiDS and once with 50 μl of washing buffer. The mRNA was eluted from the beads in 6 μl water at 65° C. for 2′. One μl T21VN primer (10 μM) and 1 μl SCSP oligo (cap switch primer, 5′-ctcttaattaagtacgcggg-3′, 10 μM) were added to the mRNA eluate. The mixture was incubated at 70° C. for 2′ and cooled on ice. Three μl 5×First Strand Buffer, 1.5 μl DTT (20 mM), 1.5 μl dNTP (10 mM each) and 1 μl MMLV Superscript II (Gibco BRL) were added to the mRNA-primer mixture followed by incubation at 42° C. for 1 h to synthesize first strand cDNAs. The cDNA was heated to 72° C. for 1 min. to degrade RNA and then diluted to 100 μl with water. The lysis buffer, washing buffer and Dynal beads are components of the mRNA DIRECT kit (Dynal, N.Y.). The first strand buffer (5×), 20 mM DTT and 10 mM dNTP are components of the SMART PCR cDNA synthesis kit (Clontech, Palo Alto, Calif.).
The first strand cDNAs synthesized as described above contain a T21VN sequence at their 5′ ends and the SCSP sequence (see SMART™ cDNA, Clontech, Palo Alto, Calif.) at their 3′ termini. Total cDNA probes were made by PCR amplifying the first strand cDNAs using SMART cDNA PCR (Clontech, Palo Alto, Calif.) in the presence of labeling agent. Five μl of first strand cDNA solution was mixed with 5 μl 10×KlenTaq PCR buffer (Clontech), 5 μl dATP+dGTP+dUTP (5 μM each), 1 μl T21VN primer, 1 μl SCSP oligo, 1 μl KlenTaq Mix, 5 μl 32P-dCTP (10 mCi/ml, Amersham) and 27 μl water. The PCR was performed using the setting of 94° C. 2′, 15 cycles of 95° C. 15″, 52° C. 30″, 68° C. 6′. The PCR products were purified using a NICK column (Pharmacia) according to the manufacturer's instructions.
Currently, high-density array Southerns for both somatic and zygotic embryos at all the developmental stages have been performed. The dot array Southern data indicate that many transcripts present during late zygotic embryogenesis (ZE) are absent in somatic embryos, and that late stage somatic embryo gene expression patterns resemble the patterns of middle stage zygotic embryos.
Cairney et al. (In Vitro Cell. & Devel. Biol.-Plant. 36:155-162 (2000); Appl. Biochem. Biotech. 77-79:5-17 (1999)) have discussed how this gene expression information may be used to improve the process of somatic embryogenesis; these are incorporated in their entirety. As shown in
The evaluation of tissue culture modifications for pine somatic embryogenesis, depicted in
Table 4 describes several publicly available clones, Lec, Fie, and Pkl, used to provide a representative model for this example. Any clone within Table 1, SEQ ID NOS: 1-327, can be substituted for those in Table 4 to assay increased performance in tissue culture. Any promoter within Table 1, SEQ ID NOS: 328-334, can be incorporated with those in Table 4 or SEQ ID NOS: 1-327 to assay increased performance in tissue culture. In this scenario, Table 5, a representation of the information contained in
Immature zygotic seeds were collected from loblolly pine genotype 260 (mother tree BC-3, Boise Cascade). Somatic embryos were initiated as described by Becwar et al. (1990) or with modifications in media mineral composition. The early stage somatic embryos were grown in cell suspension culture medium 16 and sub-cultured every week (Pullman and Webb, 1994). The embryos collected from the suspension, which include stage 1 and stage 2 somatic embryos, are referred to as stage S embryos. At the end of the subculture week, the somatic embryos in the suspension were settled in a cylinder and transferred to maturation medium 240 (Pullman and Webb, 1994). Resulting somatic embryos were selected, staged, sorted into vials containing the same stage, and stored at −70° C. until analyses were performed.
For the following example analysis, RNA was isolated from embryos at different stages in development: early stage somatic embryos and late-stage somatic embryos. The cDNA probes used in this example are not contained in the SEQ ID NOS: 1-327, but rather, are generic, publicly available pine sequences obtained from the Pine Gene Discovery project located at (http://www.cbc.med.umn.edu/ResearchProjects/Pine/DOE.pine/index.html). These clones are homologs to the well-studied Arabidopsis genes that have been shown to have significant influence on embryo development in this plant. The pine clone names (first column) and corresponding references for the Arabidopsis homologs are shown in Table 4. The three clones listed, Lec, Fie, and Pkl, are for representative purposes within this example and it will be clear to one skilled in the art that any of the SEQ ID NOS: 1-327 could be substituted for those here as all will help identify conditions for improved performance in culture.
Probes were made by preparation of DNA using Wizard Minipreps (Promega, Madison, Wis.) and cDNA inserts isolated by restriction enzyme digestion. For the cDNA probes, 50 ng of the isolated cDNA insert DNA was used to make 32P-labeled probes with Ready-To-Go DNA labeling beads (Amersham Pharmacia Biotech) according to manufacturer's instructions. Blots were prehybridized (7% SDS, 1% BSA, 0.25 M NaPO4 (pH 7.2), 1.0 mM EDTA) for 3 hours at 65° C. and hybridized in fresh buffer at 65° C. for 12 to 18 hours (4). Each blot was washed 6 times with the following conditions: 1) RT, 2×SSC, 0.1% SDS, 15 min; 2) RT, 2×SSC, 0.1% SDS, 30 min; 3) 42° C., 0.2×SSC, 0.1% SDS, 15 min; 4) 42° C., 0.2×SSC, 0.1% SDS, 30 min; 5) 60° C., 0.2×SSC, 0.1% SDS, 30 min; 6) 60° C., 0.2×SSC, 0.1% SDS, 30 min. Blots were exposed to a phosphorimaging plate for 10 minutes. Screens were read with a BAS1800 (software v1.0) and images were manipulated with ImageGauge (v2.54) (Fuji Photo Film Co., Ltd., Kanagawa, Japan).
The hypothesis tested within this example is that genotypes that produce large numbers of embryos have high Lec expression and low Pkl expression, poor genotypes have the opposite pattern, and that Lec and Pkl expression act as indicators of embryogenic potential.
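The hypothesis above can be illustrated with a minimal sketch that compares the Lec and Pkl signals measured for one genotype. The function names, the use of a log2 ratio, and the twofold threshold are assumptions made for illustration only; they are not taken from this example, and the input values stand for background-corrected signal intensities read from a blot.

```python
import math


def lec_pkl_log_ratio(lec_signal, pkl_signal):
    """Return log2(Lec / Pkl); a positive value means Lec signal exceeds Pkl signal."""
    if lec_signal <= 0 or pkl_signal <= 0:
        raise ValueError("signal intensities must be positive")
    return math.log2(lec_signal / pkl_signal)


def predicted_potential(lec_signal, pkl_signal, threshold=1.0):
    """Classify a genotype from its Lec and Pkl signals.

    threshold is a hypothetical cutoff in log2 units (1.0 = twofold difference);
    it is an assumption and would have to be calibrated against real cultures.
    """
    ratio = lec_pkl_log_ratio(lec_signal, pkl_signal)
    if ratio >= threshold:
        return "high"           # high Lec, low Pkl: predicted good embryo producer
    if ratio <= -threshold:
        return "low"            # low Lec, high Pkl: predicted poor embryo producer
    return "indeterminate"      # difference too small to call either way
```

A genotype with a Lec signal of 800 and a Pkl signal of 100 (arbitrary units) would be classified as "high" under these assumed settings, and the reverse pattern as "low".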
The results described in the previous section of Example 5 reveal ways in which gene expression analyses based on several genes can be used to improve somatic embryogenesis. This principle applies equally when the assay is expanded to determine the expression of hundreds or thousands of genes simultaneously (e.g., by DNA arrays). We can create hypotheses stating that expression of a single specific gene can be used to determine the potential of a culture, or hypotheses stating that the expression of a group of genes (e.g., hypothetical genes A, B, C, D, E, F) acts as an indicator of high embryogenic potential. For example, all these genes may be expressed at a high level in cell lines that produce large numbers of embryos; thus we would select cell lines which exhibit this characteristic. Alternatively, specific levels of expression for genes A, B, C, D, E, and F may be required, and a combination of high and low expression of particular genes will identify desirable cultures. Alternatively, experience may show that certain exceptions can be tolerated.
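The multi-gene case described above can be sketched as a comparison of a cell line's expression profile against an expected high/low pattern, with a configurable number of tolerated exceptions. The gene names A, B, and C are the hypothetical genes of the text; the function name, the single cutoff separating "high" from "low", and the normalized expression values are assumptions for illustration, not part of the described method.

```python
def matches_signature(expression, signature, cutoff=1.0, max_exceptions=0):
    """Check whether an expression profile fits an expected high/low pattern.

    expression:     dict mapping gene name -> normalized expression level
    signature:      dict mapping gene name -> "high" or "low" (expected state)
    cutoff:         hypothetical level at or above which a gene counts as "high"
    max_exceptions: number of genes allowed to deviate from the pattern
    """
    mismatches = 0
    for gene, expected in signature.items():
        observed = "high" if expression[gene] >= cutoff else "low"
        if observed != expected:
            mismatches += 1
    return mismatches <= max_exceptions


# Hypothetical pattern: genes A and B high, gene C low.
signature = {"A": "high", "B": "high", "C": "low"}
good_line = {"A": 2.0, "B": 1.5, "C": 0.3}
odd_line = {"A": 2.0, "B": 0.5, "C": 0.3}

print(matches_signature(good_line, signature))                   # fits the pattern exactly
print(matches_signature(odd_line, signature))                    # gene B deviates
print(matches_signature(odd_line, signature, max_exceptions=1))  # one exception tolerated
```

Setting `max_exceptions` above zero corresponds to the case in which experience shows that certain deviations from the pattern can be tolerated.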
While the previous paragraphs discuss numbers of embryos produced, the principle applies to any desired characteristic, e.g., germination potential, embryo size, growth of plantlets in their first year, disease resistance of mature plants, environmental hardiness, or wood quality. Any such trait could be evaluated alongside these gene expression assays and its correlation with gene expression established, resulting in a molecular tool which could be used to predict desirable characteristics. Explicitly, we could use these gene expression tools to select cell lines which will produce high-quality plantlets months before they grow into plantlets, or cell lines or juvenile plantlets which will produce hardy trees with desirable wood quality, years before these traits are expressed.