US20030074363A1 - Method and system for providing a polymorphism database - Google Patents

Method and system for providing a polymorphism database

Info

Publication number
US20030074363A1
Authority
US
United States
Prior art keywords
records
item
computer
sequence
polymorphism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/219,021
Inventor
David Balaban
Jyoti Baid
Anthony Berno
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Affymetrix Inc
Original Assignee
Affymetrix Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Affymetrix Inc
Priority to US10/219,021
Publication of US20030074363A1
Priority to US11/038,624 (US20050164270A1)
Current legal status: Abandoned

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/006Call diverting means
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809Methods for determination or identification of nucleic acids involving differential detection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/30Microarray design
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2207/00Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place
    • H04M2207/20Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place hybrid systems
    • H04M2207/206Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place hybrid systems composed of PSTN and wireless network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/46Arrangements for calling a number of substations in a predetermined sequence until an answer is obtained
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/54Arrangements for diverting calls for one subscriber to another predetermined subscriber
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S128/00Surgery
    • Y10S128/92Computer assisted medical diagnostics
    • Y10S128/922Computer assisted medical diagnostics including image analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/912Applications of a database
    • Y10S707/923Intellectual property
    • Y10S707/924Patent procedure
    • Y10S707/925Drafting an application
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/953Organization of data
    • Y10S707/955Object-oriented
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • Y10S707/99934Query formulation, input preparation, or translation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99944Object-oriented database structure
    • Y10S707/99945Object-oriented database structure processing

Definitions

  • the present invention relates to the collection and storage of information pertaining to chips for processing biological samples and thereby identifying polymorphisms.
  • the variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral.
  • in some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism.
  • in other instances, a variant form confers an evolutionary advantage to the species, is eventually incorporated into the DNA of many or most members of the species, and effectively becomes the progenitor form.
  • in many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms.
  • an array of nucleic acid probes is fabricated at known locations on a chip or substrate.
  • a fluorescently labeled nucleic acid is then brought into contact with the chip and a scanner generates an image file indicating the locations where the labeled nucleic acids bound to the chip. Based upon the identities of the probes at these locations, it becomes possible to extract information such as the identity of polymorphic forms of DNA or RNA.
  • Such systems have been used to form, for example, arrays of DNA that may be used to study and detect mutations relevant to cystic fibrosis, the P53 gene (relevant to certain cancers), HIV, and other genetic characteristics.
  • the present invention provides systems and methods for organizing information relating to study of polymorphisms.
  • a database model is provided which interrelates information about one or more of, e.g., subjects from whom samples are extracted, primers used in extracting the DNA from the subjects, the samples themselves, experiments done on samples, particular oligonucleotide probe arrays used to perform experiments, analysis procedures performed on the samples, and analysis results.
  • the model is readily translatable into database languages such as SQL.
  • the database model scales to permit storage of information about large numbers of subjects, samples, experiments, chips, etc.
  • Applications include linkage studies to determine resistance to drugs, susceptibility to diseases, and study of every characteristic of humans and other organisms that is related to genetic variability.
  • Another application of a database constructed according to this model is quality control of the various steps of performing a polymorphism study. By preserving information about every step of a polymorphism study, one can assess the reliability of the results or use the preserved information as feedback to improve procedures.
  • FIG. 1 illustrates an overall system and process for forming and analyzing arrays of biological materials such as DNA or RNA.
  • FIG. 2A illustrates a computer system suitable for use in conjunction with the overall system of FIG. 1.
  • FIG. 2B illustrates a computer network suitable for use in conjunction with the overall system of FIG. 1.
  • FIG. 3 illustrates a key for interpreting a database model.
  • FIGS. 4A-4H illustrate a database model for maintaining information for the system and process of FIG. 1 according to one embodiment of the present invention.
  • Polymorphisms are detected in a target nucleic acid from an individual being analyzed.
  • For assays of genomic DNA, virtually any biological sample (other than pure red blood cells) is suitable.
  • Convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair.
  • For assays of expressed sequences such as mRNA, the tissue sample must be obtained from an organ in which the target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome P450, the liver is a suitable source.
  • LCR: ligase chain reaction
  • NASBA: nucleic acid based sequence amplification
  • the latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.
  • the first type of analysis is sometimes referred to as de novo characterization. This analysis compares target sequences in different individuals to identify points of variation, i.e., polymorphic sites. By analyzing groups of individuals representing the greatest ethnic diversity among humans and greatest breed and species variety in plants and animals, patterns characteristic of the most common alleles/haplotypes of the locus can be identified, and the frequencies of such alleles/haplotypes in the population determined. Additional allelic frequencies can be determined for subpopulations characterized by criteria such as geography, race, or gender.
  • the second type of analysis is determining which form(s) of a characterized polymorphism are present in individuals under test. There are a variety of suitable procedures, which are discussed in turn.
  • Allele-specific probes for analyzing polymorphisms are described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15 mer at the 7 position; in a 16 mer, at either the 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
  • Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.
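As a rough illustration of the paired-probe design described above, the following Python sketch builds a reference probe and a variant probe for a single polymorphic site, placing the site at a central position of the probe. The function names, the 0-based indexing, and the example target sequence are assumptions made for illustration; none of them come from the application.

```python
# Hypothetical helpers: a sketch of paired allele-specific probes, not the
# application's own design procedure.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_pair(target: str, site: int, variant_base: str, length: int = 15):
    """Build (reference_probe, variant_probe) for one polymorphic site.

    `site` is the 0-based index of the polymorphic base in `target`.
    The window is chosen so that the site falls at index length // 2.
    """
    center = length // 2
    start = site - center
    end = start + length
    if start < 0 or end > len(target):
        raise ValueError("site is too close to the end of the target")
    ref_window = target[start:end]
    var_window = ref_window[:center] + variant_base + ref_window[center + 1:]
    # Each probe is the reverse complement of the target window it should bind.
    return reverse_complement(ref_window), reverse_complement(var_window)


# Invented example sequence; the base at index 10 is treated as polymorphic.
target = "ACGTACGTACGTACGTACGTA"
ref_probe, var_probe = probe_pair(target, site=10, variant_base="T")
```

The two probes differ at exactly one base, so a sample carrying only one allele would be expected to hybridize strongly to one member of the pair under stringent conditions.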
  • the polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described by WO 95/11995 (incorporated by reference in its entirety for all purposes).
  • WO 95/11995 also describes subarrays that are optimized for detection of variant forms of a precharacterized polymorphism.
  • Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence.
  • the second group of probes is designed by the same principles as described in the Examples except that the probes exhibit complementarity to the second reference sequence.
  • a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (i.e., two or more mutations within 9 to 21 bases).
  • An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, leading to a detectable product signifying that the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed.
  • the method works best when the mismatch is included in the 3′-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer. See, e.g., WO 93/22456.
  • Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification, (W. H. Freeman and Co, New York, 1992), Chapter 7.
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989).
  • Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • the different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence difference between alleles of target sequences.
  • One embodiment of the present invention operates in the context of a system for analyzing biological or other materials using arrays that themselves include probes that may be made of biological materials such as RNA or DNA.
  • the VLSIPS™ and GeneChip™ technologies provide methods of making and using very large arrays of polymers, such as nucleic acids, on chips. See U.S. Pat. No. 5,143,854 and PCT Patent Publication Nos. WO 90/15070 and 92/10092, each of which is hereby incorporated by reference for all purposes.
  • Nucleic acid probes on the chip are used to detect complementary nucleic acid sequences in a sample nucleic acid of interest (the “target” nucleic acid).
  • FIG. 1 illustrates an overall system 100 for forming and analyzing arrays of biological materials such as RNA or DNA.
  • a part of system 100 is a polymorphism database 102 .
  • Polymorphism database 102 includes information about, e.g., biological sources, preparation of samples, design of arrays, raw data obtained from applying experiments to chips, analysis procedures applied, and analysis results, etc.
  • Polymorphism database 102 facilitates large scale study of polymorphisms.
  • a chip design system 104 is used to design arrays of polymers, for example biological polymers such as RNA or DNA.
  • Chip design system 104 may be, for example, an appropriately programmed Sun Workstation or personal computer or workstation, such as an IBM PC equivalent, including appropriate memory and a CPU.
  • Chip design system 104 obtains inputs from a user regarding chip design objectives including polymorphisms of interest, and other inputs regarding the desired features of the array.
  • chip design system 104 may also obtain inputs from external databases such as GenBank.
  • the output of chip design system 104 is a set of chip design computer files in the form of, for example, a switch matrix, as described in PCT application WO 92/10092, and other associated computer files.
  • the chip design computer files form a part of polymorphism database 102 .
  • Systems for designing chips for study of polymorphisms are disclosed in U.S. Pat. No. 5,571,639 and in PCT application WO 95/11995, the contents of which are herein incorporated by reference.
  • the chip design files are input to a mask design system (not shown) that designs the lithographic masks used in the fabrication of arrays of molecules such as DNA.
  • the mask design system designs the lithographic masks used in the fabrication of probe arrays.
  • the mask design system generates mask design files that are then used by a mask construction system (not shown) to construct masks or other synthesis patterns such as chrome-on-glass masks for use in the fabrication of polymer arrays.
  • the masks are used in a synthesis system (not shown).
  • the synthesis system includes the necessary hardware and software used to fabricate arrays of polymers on a substrate or chip.
  • the synthesis system includes a light source and a chemical flow cell on which the substrate or chip is placed.
  • a mask is placed between the light source and the substrate/chip, and the two are translated relative to each other at appropriate times for deprotection of selected regions of the chip.
  • Selected chemical reagents are directed through the flow cell for coupling to deprotected regions, as well as for washing and other operations.
  • the substrates fabricated by the synthesis system are optionally diced into smaller chips.
  • the output of the synthesis system is a chip ready for application of a target sample.
  • a biological source 112 is, for example, tissue from a plant or animal.
  • Various processing steps are applied to material from biological source 112 by a sample preparation system 114 . Operation of sample preparation system 114 in the context of a polymorphism study is discussed below in further detail.
  • the prepared samples include nucleic acid sequences such as DNA.
  • the nucleic acids may or may not bond to the probes.
  • the nucleic acids can be tagged with fluorescein labels to determine which probes have bonded to nucleotide sequences from the sample.
  • the prepared samples will be placed in a scanning system 118 .
  • Scanning system 118 includes a detection device such as a confocal microscope or CCD (charge-coupled device) that is used to detect the location where labeled receptors have bound to the substrate.
  • the output of scanning system 118 is an image file(s) indicating, in the case of fluorescein labeled receptor, the fluorescence intensity (photon counts or other related measurements, such as voltage) as a function of position on the substrate.
  • image files may also form a part of polymorphism database 102 . Since higher photon counts will be observed where the labeled nucleic acid(s) has bound more strongly to the array of probes, and since the monomer sequence of the probes on the substrate is known as a function of position, it becomes possible to analyze the sequence(s) of the nucleic acid(s) that are complementary to the probes.
  • the image files and the design of the chips are input to an analysis system 120 that, e.g., calls bases.
  • analysis techniques are described in EPO Pub. No. 0717113A, the contents of which are herein incorporated by reference.
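To make the reasoning above concrete (higher intensity where the labeled nucleic acid bound more strongly, with probe identity known by position), here is a minimal Python sketch of base calling. It assumes a simplified layout in which every interrogated position has four probes, one per candidate base, and it calls the base whose probe is brightest. The `call_bases` name, the input layout, and the `min_ratio` threshold are all assumptions for illustration; the actual techniques used by analysis system 120 are those of the incorporated reference, not this sketch.

```python
from typing import Dict, List


def call_bases(intensities: List[Dict[str, float]], min_ratio: float = 1.2) -> str:
    """Call one base per interrogated position from probe intensities.

    Each element maps a candidate target base ("A", "C", "G", "T") to the
    intensity measured at the probe that interrogates that base. A position
    is reported as "N" (no call) when the brightest probe does not exceed
    the runner-up by a factor of `min_ratio`.
    """
    calls = []
    for by_base in intensities:
        ranked = sorted(by_base.items(), key=lambda kv: kv[1], reverse=True)
        (best_base, best), (_, second) = ranked[0], ranked[1]
        calls.append(best_base if best >= min_ratio * second else "N")
    return "".join(calls)


# Invented intensities for three positions; the second one is ambiguous.
scan = [
    {"A": 900.0, "C": 50.0, "G": 40.0, "T": 60.0},
    {"A": 100.0, "C": 110.0, "G": 90.0, "T": 95.0},
    {"A": 30.0, "C": 20.0, "G": 800.0, "T": 25.0},
]
sequence = call_bases(scan)
```

Reporting an explicit no-call instead of forcing a base keeps ambiguous positions visible to later analysis steps.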
  • Chip design system 104, analysis system 120, and control portions of exposure system 116, sample preparation system 114, and scanning system 118 may be appropriately programmed computers such as a Sun workstation or IBM-compatible PC.
  • An independent computer for each system may perform the computer-implemented functions of these systems or one computer may combine the computerized functions of two or more systems.
  • One or more computers may maintain polymorphism database 102 independent of the computers operating the systems of FIG. 1, or polymorphism database 102 may be fully or partially maintained by these computers.
  • FIG. 2A depicts a block diagram of a host computer system 210 suitable for implementing the present invention.
  • Host computer system 210 includes a bus 212 which interconnects major subsystems such as a central processor 214, a system memory 216 (typically RAM), an input/output (I/O) adapter 218, an external device such as a display screen 224 via a display adapter 226, a keyboard 232 and a mouse 234 via I/O adapter 218, a SCSI host adapter 236, and a floppy disk drive 238 operative to receive a floppy disk 240.
  • SCSI host adapter 236 may act as a storage interface to a fixed disk drive 242 or a CD-ROM player 244 operative to receive a CD-ROM 246 .
  • Fixed disk 242 may be a part of host computer system 210 or may be separate and accessed through other interface systems.
  • a network interface 248 may provide a direct connection to a remote server via a telephone link or to the Internet.
  • Network interface 248 may also connect to a local area network (LAN) or other network interconnecting many computer systems. Many other devices or subsystems (not shown) may be connected in a similar manner.
  • it is not necessary for all of the devices shown in FIG. 2A to be present to practice the present invention, as discussed below.
  • the devices and subsystems may be interconnected in different ways from that shown in FIG. 2A.
  • the operation of a computer system such as that shown in FIG. 2A is readily known in the art and is not discussed in detail in this application.
  • Code to implement the present invention may be operably disposed or stored in computer-readable storage media such as system memory 216, fixed disk 242, CD-ROM 246, or floppy disk 240.
  • FIG. 2B depicts a network 260 interconnecting multiple computer systems 210 .
  • Network 260 may be a local area network (LAN), wide area network (WAN), etc.
  • Polymorphism database 102 and the computer-related operations of the other elements of FIG. 1 may be divided amongst computer systems 210 in any way, with network 260 being used to communicate information among the various computers.
  • Portable storage media such as floppy disks may be used to carry information between computers instead of network 260 .
  • Polymorphism database 102 is preferably a relational database with a complex internal structure.
  • the structure and contents of polymorphism database 102 will be described with reference to a logical model depicted in FIGS. 4A-4H that describes the contents of tables of the database as well as interrelationships among the tables.
  • a visual depiction of this model will be an Entity Relationship Diagram (ERD) which includes entities, relationships, and attributes.
  • a detailed discussion of ERDs is found in “ERwin version 3.0 Methods Guide” available from Logic Works, Inc. of Princeton, N.J., the contents of which are herein incorporated by reference.
  • Those of skill in the art will appreciate that automated tools such as Developer 2000 available from Oracle will convert the ERD from FIGS. 4A-4H directly into executable code such as SQL code for creating and operating the database.
  • FIG. 3 is a key to the ERD that will be used to describe the contents of polymorphism database 102.
  • a representative table 302 includes one or more key attributes 304 and one or more non-key attributes 306 .
  • Representative table 302 includes one or more records where each record includes fields corresponding to the listed attributes. The contents of the key fields taken together identify an individual record.
  • each table is represented by a rectangle divided by a horizontal line. The fields or attributes above the line are key while the fields or attributes below the line are non-key.
  • An identifying relationship 308 signifies that the key attribute of a parent table 310 is also a key attribute of a child table 312 .
  • a non-identifying relationship 314 signifies that the key attribute of a parent table 316 is also a non-key attribute of a child table 318. Where (FK) appears in parentheses, it indicates that an attribute of one table is a key attribute of another table. Both the depicted non-identifying and identifying relationships are one to one-or-more relationships, where one record in the parent table corresponds to one or more records in the child table.
  • An alternative non-identifying relationship 324 is a one to zero-or-more relationship where one record in a parent table 320 corresponds to zero or more records in a child table 322 .
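The difference between identifying and non-identifying relationships in the key above can be sketched with Python's standard `sqlite3` module. The table and column names (`parent`, `identified_child`, `other_child`, and so on) are hypothetical and chosen only to mirror the key; they are not tables from the application. The sketch also does not enforce the "one-or-more" cardinality, which a plain foreign key cannot express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE parent (
    parent_id INTEGER PRIMARY KEY,   -- key attribute
    name      TEXT                   -- non-key attribute
);
-- Identifying relationship: the parent's key is part of the child's key.
CREATE TABLE identified_child (
    parent_id INTEGER NOT NULL REFERENCES parent(parent_id),
    seq_no    INTEGER NOT NULL,
    PRIMARY KEY (parent_id, seq_no)
);
-- Non-identifying relationship: the parent's key is a non-key attribute (FK).
CREATE TABLE other_child (
    child_id  INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL REFERENCES parent(parent_id),
    note      TEXT
);
""")


def key_columns(table: str):
    """Return the primary-key column names of `table`, in key order."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Column 5 of table_info is the 1-based position in the key (0 = non-key).
    return [row[1] for row in sorted(info, key=lambda r: r[5]) if row[5] > 0]
```

Calling `key_columns("identified_child")` includes `parent_id`, while `key_columns("other_child")` does not, which is the distinction the diagram key draws with its two relationship styles.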
  • FIGS. 4A-4H are entity relationship diagrams (ERDs) showing elements of polymorphism database 102 according to one embodiment of the present invention. Each rectangle in the diagram corresponds to a table in database 102. First, the relationships and general contents of the various tables will be described.
  • FIG. 4A illustrates core elements of database 102 according to one embodiment of the present invention.
  • a subject table 402 lists organisms from which samples have been extracted for polymorphism analysis or other tissue sources. Samples may also be obtained from tissue collections not associated with any one identified organism. Information stored within subject table 402 includes the name, gender, family, position within family (e.g., father, mother, etc.), and ethnic group. For human subjects, the name and family will preferably be represented in coded form to assure privacy. Associated with each subject is a species as listed in a species table 404. Also, a relationship may be defined among subjects by a subject relationship table 406, which includes records corresponding to related subjects. These relationships may be father-mother, sibling, twins, etc.
  • Subjects may be part of a group that is being studied, e.g., a group with a congenital disease, or a toxic reaction to a particular drug.
  • the groups are listed in a subject group table 408 .
  • Participation of subjects in groups is defined by a subject participation table 410 which lists all group memberships.
  • Samples and their attributes are listed in a sample table 412. Each sample has an associated sample type. The sample types are listed in a sample type table 414. Possible sample types include blood, urine, etc. Companies or institutions that provide samples are listed in a sample source table 416.
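The subject, group, and sample tables described above might translate into SQL along the following lines. This is a sketch using Python's standard `sqlite3` module; every column name, the nullable `subject_id` on samples (to allow tissue collections with no identified organism), and the example rows are assumptions, since the actual attributes appear only in FIGS. 4A-4H.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (species_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE subject (
    subject_id   INTEGER PRIMARY KEY,
    coded_name   TEXT,               -- coded form, to protect privacy
    gender       TEXT,
    ethnic_group TEXT,
    species_id   INTEGER NOT NULL REFERENCES species(species_id)
);
CREATE TABLE subject_group (group_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE subject_participation (
    subject_id INTEGER NOT NULL REFERENCES subject(subject_id),
    group_id   INTEGER NOT NULL REFERENCES subject_group(group_id),
    PRIMARY KEY (subject_id, group_id)
);
CREATE TABLE sample_type (sample_type_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sample (
    sample_id      INTEGER PRIMARY KEY,
    subject_id     INTEGER REFERENCES subject(subject_id),  -- NULL: no identified organism
    sample_type_id INTEGER NOT NULL REFERENCES sample_type(sample_type_id)
);
""")

# Invented example rows.
conn.execute("INSERT INTO species VALUES (1, 'Homo sapiens')")
conn.executemany("INSERT INTO subject VALUES (?, ?, ?, ?, ?)",
                 [(1, "S-001", "F", "group-x", 1), (2, "S-002", "M", "group-x", 1)])
conn.execute("INSERT INTO subject_group VALUES (1, 'drug reaction cohort')")
conn.execute("INSERT INTO subject_participation VALUES (1, 1)")
conn.execute("INSERT INTO sample_type VALUES (1, 'blood')")
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)",
                 [(10, 1, 1), (11, 2, 1), (12, None, 1)])

# All samples drawn from members of group 1.
rows = conn.execute("""
    SELECT s.sample_id
    FROM sample AS s
    JOIN subject_participation AS p ON p.subject_id = s.subject_id
    WHERE p.group_id = 1
    ORDER BY s.sample_id
""").fetchall()
```

Keeping group membership in its own table lets one subject belong to any number of study groups without duplicating the subject record.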
  • Database 102 provides an item table 418 that includes records for items. There are various types of items that correspond to different stages of the sample preparation process. An “item derivation” transforms an item of one type into an item of another type. The following table lists various item types and item derivation types for a representative embodiment.
  • Item Type                                  Derived from       by Item Derivation Type
    Sample                                     other samples      pooling
    Sample                                     other sample       splitting
    Extracted DNA                              Sample             DNA Extraction
    Target (Sequences of interest amplified)   Extracted DNA      PCR
    Fluorescently Labeled Target               Target             Labeling
    Hybridized Chip                            Labeled Target     Hybridization (application of target to chip)
    Stained Hybridized Chip                    Hybridized Chip    Staining
  • Item derivations are listed in an item derivation table 420 . It should be noted that derivations need not produce a change between item types. Each item derivation occurs in accordance with a protocol that characterizes the step or steps in the derivation. Protocols are listed in a protocol table 428 . Each item derivation is performed by an employee listed in employee table 432 .
  • Unused chips are listed in a chip table 422 .
  • Hybridized chips, i.e., chips that have had target applied, are listed in a hybridized chip table 424 .
  • A hybridized sample map table 426 lists the relationships between hybridized chips and the samples that have been applied to them.
  • Stained hybridized chips are scanned in a process referred to here as a scan experiment.
  • Scan experiments are listed in a scan experiment table 430 .
  • The scan experiment occurs in accordance with a protocol listed in protocol table 428 .
  • The scan experiment is performed by an employee listed in employee table 432 .
  • FIG. 4B depicts further details of the data model for items and item derivations.
  • The various item types are listed in an item type table 434 and the various item derivation types are listed in an item derivation type table 436 .
  • The relationships between successive item types, e.g., sample and target, are defined in an item type derivation table 438 .
  • An item has associated attributes. For example, for a target, database 102 may store the concentration, volume, location and/or remaining amount. All item attributes are stored in an item attribute table 440 . Item attributes may be shared among multiple items. For example, a series of targets may all share a preparation date.
  • An item attribute item map table 442 implements a many-to-many relationship between item attributes and items.
  • Item attribute types, such as preparer, preparation date, etc., are listed in an item attribute type table 444 .
  • Each item type has corresponding attribute types.
  • Some attribute types are, however, shared among various item types. Accordingly, there is a many-to-many relationship among item attribute types and item types that is implemented by an item type map table 446 .
  • The tables of FIG. 4B represent a powerfully general model of the sample preparation process. Changes in process steps that require changes in the type of information that should be stored may be implemented by changing and adding table contents rather than providing new tables or changing relationships among tables.
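  • As an illustration only, the item and item derivation tables can be sketched with Python's built-in sqlite3 module. The column names follow the data dictionary of this description, but the reduced column set, the sample rows, and the lineage helper are assumptions made for this sketch and are not part of the described embodiment. The sketch shows how a chain of item derivations lets a stained hybridized chip be traced back to its originating sample.

```python
import sqlite3

# Simplified, hypothetical versions of the item, item type and item derivation tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ItemType (ItemTypeId INTEGER PRIMARY KEY, ItemTypeName TEXT);
CREATE TABLE Item (
    ItemId INTEGER PRIMARY KEY,
    ItemTypeId INTEGER REFERENCES ItemType,
    ItemName TEXT
);
CREATE TABLE ItemDerivation (
    Item1Id INTEGER REFERENCES Item,  -- derivation source
    Item2Id INTEGER REFERENCES Item,  -- derivation result
    PRIMARY KEY (Item1Id, Item2Id)
);
""")
conn.executemany("INSERT INTO ItemType VALUES (?, ?)", [
    (1, "Sample"), (2, "Extracted DNA"), (3, "Target"),
    (4, "Fluorescently Labeled Target"), (5, "Hybridized Chip"),
    (6, "Stained Hybridized Chip"),
])
# Invented example rows: one item per preparation stage.
conn.executemany("INSERT INTO Item VALUES (?, ?, ?)", [
    (1, 1, "blood sample"), (2, 2, "extracted DNA"), (3, 3, "PCR target"),
    (4, 4, "labeled target"), (5, 5, "hybridized chip"), (6, 6, "stained chip"),
])
conn.executemany("INSERT INTO ItemDerivation VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)])


def lineage(conn, item_id):
    """Return the item type names from item_id back to its earliest ancestor."""
    rows = conn.execute("""
        WITH RECURSIVE ancestry(ItemId, Depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT d.Item1Id, a.Depth + 1
            FROM ItemDerivation d JOIN ancestry a ON d.Item2Id = a.ItemId
        )
        SELECT t.ItemTypeName
        FROM ancestry a
        JOIN Item i ON i.ItemId = a.ItemId
        JOIN ItemType t ON t.ItemTypeId = i.ItemTypeId
        ORDER BY a.Depth
    """, (item_id,)).fetchall()
    return [row[0] for row in rows]


print(lineage(conn, 6))
```

  • In this sketch a new preparation step would be recorded by adding rows to ItemType and ItemDerivation rather than by adding tables, which is the property the model is intended to provide.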
  • FIG. 4C depicts a detailed data model for storing information about protocols according to the present invention.
  • Protocols as stored in protocol table 428 represent information about particular processes that have been performed including item derivations, analyses, and scan experiments.
  • Each protocol has an associated protocol template.
  • Protocol templates identify protocol types. For example, one protocol template may be a PCR template. All protocols associated with the PCR template identify parameters for performing a PCR procedure. Protocol templates are listed in a protocol template table 448 .
  • A parameter table 450 lists all the parameters and their values for all the protocols listed in protocol table 428 .
  • A parameter template table 452 lists the various parameter types along with default values. An example of a parameter template would be a PCR reaction temperature. The parameter template would include a default value for this parameter.
  • Parameter table 450 might then list many different PCR reaction temperature values that would be used by many different protocols. If a parameter value has not been modified by the user, it inherits the standard value of the associated parameter template.
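  • The inheritance of default values may be sketched as follows, using Python's sqlite3 module. Several details are assumptions made for illustration: an unmodified parameter is represented here by a NULL Value, the Name column on the parameter template is hypothetical, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ParameterTemplate (
    ParameterTemplateId INTEGER PRIMARY KEY,
    Name TEXT,            -- hypothetical column, added for readability
    StandardValue TEXT    -- default value for the parameter
);
CREATE TABLE Parameter (
    ParameterId INTEGER PRIMARY KEY,
    ParameterTemplateId INTEGER REFERENCES ParameterTemplate,
    ProtocolId INTEGER,
    Value TEXT            -- NULL is assumed to mean "not modified by the user"
);
""")
conn.executemany("INSERT INTO ParameterTemplate VALUES (?, ?, ?)", [
    (1, "ReactionTemperatureC", "55"),
    (2, "Cycles", "30"),
])
conn.executemany("INSERT INTO Parameter VALUES (?, ?, ?, ?)", [
    (1, 1, 100, "58"),   # protocol 100 overrides the temperature
    (2, 2, 100, None),   # protocol 100 keeps the default cycle count
    (3, 1, 101, None),   # protocol 101 keeps the default temperature
    (4, 2, 101, "35"),   # protocol 101 overrides the cycle count
])


def effective_parameters(conn, protocol_id):
    """Return {parameter name: value}, falling back to the template default."""
    rows = conn.execute("""
        SELECT t.Name, COALESCE(p.Value, t.StandardValue)
        FROM Parameter p
        JOIN ParameterTemplate t ON t.ParameterTemplateId = p.ParameterTemplateId
        WHERE p.ProtocolId = ?
        ORDER BY t.Name
    """, (protocol_id,)).fetchall()
    return dict(rows)


print(effective_parameters(conn, 100))
print(effective_parameters(conn, 101))
```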
  • A parameter template set is a set of parameter templates that are used for a particular purpose, e.g., in association with protocols according to one or more protocol templates. Parameter template sets are listed in a parameter template set table 454 . There are different types of parameter template set and these are listed in a parameter template set type table 456 .
  • A mapping between parameter template sets and protocol templates is defined by a protocol template set map table 458 .
  • Protocol templates may have associated lengthy verbal information about how to perform protocol steps.
  • A protocol template document table 460 stores references to documents that include instructions for performing protocols.
  • The data model for protocols defined by FIG. 4C is highly general and allows significant changes in the way item derivations, analyses, and experiments are performed without changing the underlying data model.
  • A fragment table 462 lists all the sequence fragments investigated in conjunction with database 102 . Associated with each fragment are one or more primer pairs used to amplify the fragment in a PCR process.
  • A primer pair table 464 lists all the primer pairs including information about whether the primer pair actually worked to amplify the fragment.
  • A PCR table 466 lists records identifying the outcome of multiple PCR operations. The individual PCR operations are identified by reference to item derivation table 420 .
  • A single PCR operation may be used to amplify many different fragments and thus employ many different primer pairs.
  • A single primer pair may be used in multiple PCR operations. There is therefore a many-to-many relationship between PCR operations and primer pairs that is recorded by a primer pair PCR map table 468 .
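  • The many-to-many relationship may be sketched with Python's sqlite3 module as shown here. Table and column names follow the data dictionary where available; the composite key on the map table, the reduced column set, and the example rows are assumptions made for this sketch. Each PCR operation is identified by the (Item1Id, Item2Id) pair of its item derivation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPrimerPair (
    PrimerPairId INTEGER PRIMARY KEY,
    FragmentId INTEGER,
    Worked INTEGER
);
CREATE TABLE tblPCR (
    Item1Id INTEGER,
    Item2Id INTEGER,
    ReactionWorked INTEGER,
    PRIMARY KEY (Item1Id, Item2Id)
);
CREATE TABLE PrimerPairPCRMap (
    PrimerPairId INTEGER REFERENCES tblPrimerPair,
    Item1Id INTEGER,
    Item2Id INTEGER,
    PRIMARY KEY (PrimerPairId, Item1Id, Item2Id),
    FOREIGN KEY (Item1Id, Item2Id) REFERENCES tblPCR (Item1Id, Item2Id)
);
""")
# Invented rows: two PCR operations sharing primer pair 2.
conn.executemany("INSERT INTO tblPrimerPair VALUES (?, ?, ?)",
                 [(1, 501, 1), (2, 502, 1), (3, 503, 0)])
conn.executemany("INSERT INTO tblPCR VALUES (?, ?, ?)",
                 [(11, 12, 1), (21, 22, 1)])
conn.executemany("INSERT INTO PrimerPairPCRMap VALUES (?, ?, ?)",
                 [(1, 11, 12), (2, 11, 12), (2, 21, 22), (3, 21, 22)])


def primer_pairs_for_pcr(conn, item1_id, item2_id):
    """List the primer pairs employed in one PCR operation."""
    rows = conn.execute(
        "SELECT PrimerPairId FROM PrimerPairPCRMap "
        "WHERE Item1Id = ? AND Item2Id = ? ORDER BY PrimerPairId",
        (item1_id, item2_id)).fetchall()
    return [row[0] for row in rows]


def pcrs_for_primer_pair(conn, primer_pair_id):
    """List the PCR operations, as (Item1Id, Item2Id), that used one primer pair."""
    return conn.execute(
        "SELECT Item1Id, Item2Id FROM PrimerPairPCRMap "
        "WHERE PrimerPairId = ? ORDER BY Item1Id",
        (primer_pair_id,)).fetchall()


print(primer_pairs_for_pcr(conn, 11, 12))
print(pcrs_for_primer_pair(conn, 2))
```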
  • Information about individual primers is stored in a primer table 470 . Also, each primer has an associated protocol in protocol table 428 that characterizes the primer preparation process. Information about primer orders is listed in a primer order table 472 . Each primer order is to a vendor and the vendors are listed in a vendor table 474 . Each primer order is made by an employee listed in employee table 432 .
  • A primer order design map table 476 implements a many-to-many relationship between primer orders and primers.
  • The data model described here thus preserves information about primers used in PCR reactions.
  • The information preserved here thus permits experimenters to make optimal use of expensive and time consuming PCR procedures.
  • A wafer table 478 lists wafers. When chips are produced, many chips are produced at the same time as part of a single wafer. Chip table 422 stores references to wafer table 478 for each chip and the location of each chip on its wafer at production time. Sometimes there is analytic significance associated with the location of a chip on the wafer. Each wafer is produced as part of a lot and the identity of the lot for each wafer is recorded by wafer table 478 as a reference to a lot table 480 that lists each lot.
  • FIG. 4D depicts further details of tables pertaining to chip design that are preferably maintained within polymorphism database 102 according to one embodiment of the present invention.
  • A tiling design table 482 lists tiling designs. Each tiling design represents the application of a particular tiling format to a sequence to be investigated. Tiling formats indicate probe orientation, probe length, and the position within a probe of a single nucleotide polymorphism being investigated. In a preferred embodiment, there may be very few tiling formats and they are listed in a tiling format table 484 .
  • A particular tiling design includes many atom designs, each specifying the design of a single atom.
  • An atom is a group of typically four probes used to investigate a single base position with each probe hybridizing to a sequence including a different base at that position.
  • Atom designs are listed in an atom design table 486 . Records identifying the designs of individual probes are listed in a probe design table 488 .
  • A probe design role table 490 indicates the roles of probes listed in probe design table 488 in the atom designs of atom design table 486 . For combinations of probe design and atom design, probe design role table 490 indicates which base the probe hybridizes to at the substitution position and whether the probe represents a match or a mismatch to the wild type.
  • A probe data table 492 gives the hybridization intensity values for particular probe designs as determined in particular scan experiments. Each record of the table also gives the number of pixels used to determine the intensity value and the standard deviation of intensity as measured among the pixels.
  • FIGS. 4 E- 4 G depict aspects of polymorphism database 102 related to analysis procedures and their results according to one embodiment of the present invention.
  • An analysis table 494 lists analyses performed. An analysis generally refers to a non-trivial transformation of data. Records of analysis table 494 include references to protocol table 428 to specify parameters used for each analysis. Analyses may take as their input raw data or the results of previous analyses.
  • An analysis dependency table 496 lists dependencies among analyses where one analysis depends on the data developed by another analysis.
  • An analysis input table 498 lists inputs for analyses listed in analysis table 494 .
  • A chip design sequence map table 500 correlates particular fragments with chip designs.
  • A sequence position table 502 lists investigated sequence positions indicating their positions on a fragment. Records of sequence position table 502 reference a genomic sequence position table 504 which gives sequence positions in the genome rather than within individual fragments.
  • A scan experiment set table 506 lists sets of scan experiments. This allows for groupings of experiments for individuals or populations to serve as the basis for polymorphism analysis.
  • A scan experiment used table 508 lists records indicating memberships of a scan experiment in a scan experiment set.
  • A tiling data table 510 lists records identifying tiling designs as implemented in particular chips measured by particular scan experiments.
  • An atom data table 512 lists the intensities measured for particular sequence positions as measured in scan experiments identified by the tiling data records.
  • A subject sequence position data table 514 lists combinations of sequence position and scan experiment.
  • A series of tables in FIGS. 4 E- 4 G correspond to different types of analysis that occur during the course of a polymorphism investigation. The types presented here are merely representative. A parallel series of tables provide the analysis results.
  • A polymorphism analysis table 516 lists references to analysis table 494 . The results of the performed polymorphism analyses are listed in a polymorphism position result table 518 . A record of this table gives a result for a polymorphism analysis for a particular position as determined based on a particular set of scan experiments. In one embodiment the result is whether a particular mutation is certain, likely, possible, or not possible at the position. The result may also be that the reference is wrong.
  • A user polymorphism analysis table 520 lists user interpretations of results as listed in polymorphism position result table 518 .
  • The records of user polymorphism analysis table 520 are references to analysis table 494 .
  • The user interpretations themselves are stored in a user polymorphism analysis result table 522 .
  • Each result is a likelihood of a particular mutation at a position as considered by a user plus an accompanying user comment.
  • A P-Hat analysis estimates the relative concentrations of wild type sequence and sequence having a particular mutation as determined in a particular scan experiment.
  • A P-Hat analysis table 524 lists references to analysis table 494 .
  • An atom result table 526 gives estimates of the relative concentration along with upper and lower bounds and a maximum intensity. For heterozygous mutations, the estimates of relative concentration will cluster around 0.5. For homozygous mutations, the estimates should cluster around 1.0.
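  • One way to read such estimates is to label each value by the nearest expected cluster centre. The Python sketch below is illustrative only: the wild type centre at 0.0, the tolerance of 0.15, and the function name are assumptions that do not come from this description, which states only the 0.5 and 1.0 cluster values.

```python
def classify_p_hat(p_hat, tolerance=0.15):
    """Label a relative-concentration estimate by its nearest expected centre.

    The 0.5 and 1.0 centres follow the description; the 0.0 centre for wild
    type and the tolerance are assumptions made for this sketch.
    """
    centres = {"wild type": 0.0, "heterozygous": 0.5, "homozygous mutant": 1.0}
    label, centre = min(centres.items(), key=lambda kv: abs(p_hat - kv[1]))
    # Values too far from every centre are reported as ambiguous.
    return label if abs(p_hat - centre) <= tolerance else "ambiguous"


for estimate in (0.03, 0.48, 0.75, 0.97):
    print(estimate, classify_p_hat(estimate))
```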
  • Base call analyses are determinations of the base at a particular position for a particular individual that may be based on more than one experiment.
  • A base call analysis table 528 lists references to analysis table 494 .
  • A base call result table 530 lists the called bases for particular combinations of sequence position and subject.
  • A P-Hat grouping analysis determines a measure of likelihood that data in a set of scan experiments results from separate genotypes.
  • P-hat grouping analyses are listed in a p-hat grouping analysis table 532 by reference to analysis table 494 .
  • P-hat grouping analysis results are listed in a mutation fraction result table 534 .
  • A group separation is given for various combinations of sequence position and scan experiment set.
  • A clustering analysis determines an alternative measure of likelihood that data in a set of scan experiments results from separate genotypes.
  • Clustering analyses are listed in a clustering analysis table 536 by reference to analysis table 494 .
  • Clustering analysis results are listed in a clustering result table 538 .
  • A clustering factor is given for various combinations of sequence position and scan experiment set.
  • FIG. 4F shows tables which support normalization and footprint finding operations that support the analyses referred to in FIG. 4E.
  • Hybridization intensity measurements made in scan experiments should be normalized over a set of scan experiments. The normalization should take into account differences in amplification level produced by different PCR processes.
  • Normalization is done by region of sequence.
  • A normalization region analysis determines the boundaries of a region to be normalized. The determination of boundaries takes into account that different fragments of sequence are amplified by different PCR procedures.
  • A normalization region analysis table 540 lists normalization region analyses by reference to analysis table 494 .
  • A normalization region result table 542 lists the boundaries for each determined normalization region.
  • Normalization values for identified normalization regions are themselves determined by normalization analyses. Normalization analyses are listed in a normalization analysis table 544 by reference to analysis table 494 . A normalization result table 546 lists the normalization values for regions.
  • A footprint analysis determines regions of sequence for which the hybridization intensity is elevated for the purposes of quality control. Footprint analyses are listed in a footprint analysis table 548 by reference to analysis table 494 . Footprints are identified by sequence starting point and ending point in a particular scan experiment in a footprint table 550 .
  • FIG. 4G depicts tables pertaining to measurement quality according to one embodiment of the present invention.
  • A tiling data quality analysis determines the quality of results from a scan experiment. These analyses are listed in a tiling data quality analysis table 552 by reference to analysis table 494 . Tiling data quality analysis results are listed in a tiling data quality result table 554 .
  • The results include an average hybridization intensity value for perfect match or mismatch probes.
  • A wild type call rate gives the fraction of atom data where the probe corresponding to the reference base has the highest hybridization intensity.
  • A wild type call rate of around 1.0 indicates good quality. Where the call rate is less than 0.75, the scan experiment should be rejected.
  • An accept data field indicates whether the analysis indicates rejection or acceptance.
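  • The wild type call rate and the accept decision may be sketched in Python as follows. The in-memory representation of an atom (a reference base paired with a mapping from base to intensity), the function names, and the intensity values are assumptions made for illustration; only the definition of the call rate and the 0.75 rejection threshold come from this description.

```python
def wild_type_call_rate(atoms):
    """Fraction of atoms whose brightest probe matches the reference base.

    `atoms` is a list of (reference_base, {base: intensity}) tuples, a
    hypothetical layout chosen for this sketch. Ties go to the first base
    in the mapping.
    """
    if not atoms:
        raise ValueError("at least one atom is required")
    calls = sum(1 for ref, intensities in atoms
                if max(intensities, key=intensities.get) == ref)
    return calls / len(atoms)


def accept_scan(atoms, threshold=0.75):
    """Reject the scan experiment when the call rate is below the threshold."""
    return wild_type_call_rate(atoms) >= threshold


# Invented intensities: three atoms call the reference base, one does not.
good_scan = [
    ("A", {"A": 900.0, "C": 120.0, "G": 95.0, "T": 110.0}),
    ("C", {"A": 80.0, "C": 750.0, "G": 60.0, "T": 70.0}),
    ("G", {"A": 100.0, "C": 90.0, "G": 640.0, "T": 85.0}),
    ("T", {"A": 300.0, "C": 310.0, "G": 295.0, "T": 290.0}),
]
# Two of four atoms call the reference base.
poor_scan = good_scan[:2] + [
    ("G", {"A": 500.0, "C": 90.0, "G": 240.0, "T": 85.0}),
    ("T", {"A": 300.0, "C": 310.0, "G": 295.0, "T": 290.0}),
]

print(wild_type_call_rate(good_scan), accept_scan(good_scan))
print(wild_type_call_rate(poor_scan), accept_scan(poor_scan))
```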
  • Analysis dependency table 496 indicates interrelationships among the various analyses of FIGS. 4 E- 4 G.
  • A footprint analysis may depend on a normalization analysis which may in turn depend on a normalization region analysis.
  • A basecall analysis or PHatGrouping analysis may depend on an atom analysis.
  • A polymorphism analysis may depend on any of these analyses and/or a user polymorphism analysis and/or a clustering analysis.
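  • Because each analysis may consume the results of others, the dependency records define an order in which analyses can be run. The Python sketch below uses the standard library's graphlib module (Python 3.9 or later) to derive one such order. The dictionary is a hypothetical in-memory stand-in for the analysis dependency table, and the particular set of dependencies chosen for the polymorphism analysis is only one of the combinations the description permits.

```python
from graphlib import TopologicalSorter

# Each analysis maps to the analyses whose results it consumes.
dependencies = {
    "normalization region": set(),
    "normalization": {"normalization region"},
    "footprint": {"normalization"},
    "atom": set(),
    "base call": {"atom"},
    "p-hat grouping": {"atom"},
    "clustering": set(),
    "polymorphism": {"footprint", "base call", "p-hat grouping", "clustering"},
}


def run_order(deps):
    """Return the analyses in an order where inputs precede their consumers."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

  • A cycle among the dependency records would raise graphlib.CycleError in this sketch, which is one way such an inconsistency could be detected.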
  • FIG. 4H shows tables of polymorphism database 102 related to efforts to seek patent protection according to one embodiment of the present invention.
  • A polymorphism patent sequence table 560 lists sequences for which patent protection is sought.
  • A patent application table 562 lists patent applications directed toward the protection of polymorphisms.
  • A polymorphism patent application sequence map table 564 implements a many-to-many relationship between patent applications and sequences.
  • A prior application table 566 lists relationships between patent applications and prior related patent applications.
  • An attorney table 568 lists attorneys responsible for preparing patent applications listed in patent application table 562 .
  • A law firm table 570 lists the law firms to which the attorneys listed in attorney table 568 belong.
  • An employee group table 572 lists groups of inventors for the patent applications listed in table 562 . Individual inventors are listed in employee table 432 .
  • An employee group map table 574 implements a many-to-many relationship between inventors and groups of inventors.
  • The data model of FIG. 4H greatly facilitates the process of securing patent protection for polymorphisms and thereby increases the commercial incentive for investigation of polymorphisms.
  • IsReference SMALLINT Whether or not subject is in a group.
  • tblSpecies SpeciesId INTEGER Species identifier. Name: VARCHAR2(30) Name of species.
  • SubjectRelationship Subject1: INTEGER First subject in relationship
  • Subject2 INTEGER Second subject in relationship. Position: VARCHAR2(2) Nature of relationship.
  • tblSubjectGroup GroupId INTEGER Identifier of group of subjects (not same as ethnic group).
  • GroupCode VARCHAR2(20) Code identifier for group. Comments: LONG VARCHAR User comments on group. upsize_ts: DATE Creation date for group.
  • tblSubjectParticipation SubjectId INTEGER Reference to subject table.
  • GroupId INTEGER Reference to subject group table.
  • tblSample SampleId INTEGER Sample identifier.
  • SubjectID INTEGER Reference to subject table.
  • SampleSourceId CHAR(18) Institutional source of sample. Code: VARCHAR2(20) Code representing individual subject. Recipient: VARCHAR2(20) Person accepting sample. Provider: VARCHAR2(20) Person or institution providing sample. DateReceived: DATE Date sample received.
  • ProtocolId INTEGER Reference to protocol table.
  • SampleTypeId INTEGER Reference to sample type table.
  • tblSampleType SampleTypeId INTEGER Sample type identifier. Description: VARCHAR2(50) Description of sample type.
  • SampleSourceId CHAR(18) Identifier of institutional sample source. ProviderName: VARCHAR2(20) Name of individual or institutional sample provider.
  • Item ItemId INTEGER Item identifier.
  • ItemTypeId INTEGER Item type identifier.
  • ItemName VARCHAR2(50) Name of item.
  • ItemDerivation Item1Id: INTEGER Derivation source.
  • Item2Id INTEGER Derivation result.
  • EmployeeId INTEGER Employee responsible for derivation.
  • DerivationTypeId INTEGER Derivation type identifier.
  • Protocolid VARCHAR2(18) Reference to protocol table. Date: DATE Date of derivation.
  • tblChip ChipId INTEGER Renamed reference to item table.
  • ChipDesignPlacementId INTEGER Placement on wafer.
  • LocationId INTEGER Location of chip.
  • WaferId INTEGER Wafer the chip was on.
  • tblHybedChip HybedChipId INTEGER Renamed reference to item table.
  • SubjectID INTEGER Reference to subject table.
  • ProtocolId INTEGER Reference to protocol table.
  • Repetition SMALLINT Refers to number of times chip has been washed and reused.
  • tblHybSampleMap ItemId INTEGER Reference to item table.
  • Protocol ProtocolId INTEGER Protocol identifier.
  • ProtocolTemplateId INTEGER Protocol template identifier.
  • VARCHAR2(100) Name of protocol.
  • tblScanExperiment ScanExptId INTEGER Scan experiment identifier.
  • ItemId INTEGER Reference to item table.
  • ScanCode VARCHAR2(25) File for scan results.
  • ProtocolId INTEGER Reference to protocol table.
  • ScanRatingId INTEGER Assessment of scan quality.
  • ExperimenterId INTEGER Experimenter identifier. Date: DATE Date of experiment.
  • ConversionTool VARCHAR2(50) Program used to convert from scan image to intensities.
  • ConversionDate DATE Date of conversion.
  • ScanStatus VARCHAR2(50) Whether or not scan image has been converted to intensities. Comments: LONG VARCHAR Comments.
  • Employee EmployeeId INTEGER Employee identifier.
  • EmployeeCode VARCHAR2(5) Code for employee
  • FName VARCHAR2(20) First name of employee.
  • MName VARCHAR2(20) Middle name of employee.
  • LName VARCHAR2(20) Last name of employee.
  • ItemType ItemId INTEGER Item type identifier.
  • ItemTypeName VARCHAR2(30) Name of item type.
  • FormName VARCHAR2(100) Reference to user interface form for item type.
  • ItemDerivationType DerivationTypeId INTEGER Derivation type identifier.
  • DerivationType VARCHAR2(50) Description of derivation type.
  • ItemTypeDerivation NextItemTypeId INTEGER Result type of derivation.
  • ItemTypeId INTEGER Source type of derivation.
  • ItemAttribute itemAttributeId INTEGER Item attribute identifier.
  • ItemAttributeTypeId INTEGER Reference to item attribute type table. Attribute: VARCHAR2(50) Attribute value.
  • ItemAttributeItemMap ItemAttributeId INTEGER Reference to item attribute table.
  • ItemId INTEGER Reference to item table.
  • ItemAttributeType ItemAttributetypeId INTEGER Item attribute identifier.
  • ItemAttributeName VARCHAR2(30) Name of item attribute type.
  • ItemTypeMap ItemAttributeTypeId INTEGER Reference to item attribute type table.
  • ItemTypeId INTEGER Reference to item type table.
  • ProtocolTemplate ProtocolTemplateId INTEGER Protocol template identifier. Name: VARCHAR2(100) Name of protocol template. DateCreated: DATE Date protocol template created. FormName: VARCHAR2(50) Name of the electronic form used for protocol template.
  • Parameter ParameterId INTEGER Parameter identifier.
  • ParameterTemplateId INTEGER Reference to parameter template table. Value: VARCHAR2(20) Value of parameter.
  • ProtocolID INTEGER Reference to protocol table.
  • ParameterTemplate ParameterTemplateId INTEGER Parameter template identifier.
  • ParamTemplateSetId INTEGER Reference to parameter template set table. StandardValue: VARCHAR2(100) Default value for parameter.
  • ParamTemplateSet ParamTemplateSetId INTEGER Parameter template set identifier.
  • TypeId INTEGER Renamed reference to parameter template set type table.
  • ParamTemplateSetType ParamTempSetTypeId INTEGER Parameter template set type identifier. Description: VARCHAR2(50) Description of parameter template set type.
  • ParameterTemplateSetMap ProtocolTemplateId INTEGER Reference to protocol template table.
  • ParamTemplateSetId INTEGER Reference to parameter template set table.
  • ProtocolTemplateDoc ProtocolDocId INTEGER Protocol Template document identifier.
  • ProtocolTemplateId INTEGER Reference to protocol template table. Name: VARCHAR2(100) Name of protocol template.
  • PathAndFileName VARCHAR2(50) File name for protocol template document.
  • AuthorName INTEGER Author of protocol template document.
  • CreationDate DATE Creation date of protocol template document.
  • tblFragment FragmentId INTEGER Fragment identifier. ChipSequence: LONG VARCHAR Sequence of fragment. Code: VARCHAR2(50) Code representing fragment.
  • tblPrimerPair PrimerPairId INTEGER Identifier for primer pair.
  • LeftPrimerId INTEGER Left primer identifier.
  • RightPrimerId INTEGER Right primer identifier.
  • PCRSize INTEGER Length of amplified fragment. Worked: SMALLINT Whether or not pair successfully amplified fragment.
  • FragmentId INTEGER Reference to fragment table.
  • tblPCR Item1Id INTEGER First part of reference to item derivation table.
  • Item2Id INTEGER Second part of reference to item derivation table.
  • Reactionworked SMALLINT Whether or not PCR reaction worked.
  • PrimerPairPCRMap PrimerPairId INTEGER Reference to primer pair table.
  • Item1Id INTEGER First part of referenced item derivation table.
  • Item2Id INTEGER Second part of referenced item derivation table.
  • tblPrimer PrimerId INTEGER Primer identifier. ProtocolId: INTEGER Reference to protocol table. OligoSeq: VARCHAR2(35) Sequence of primer. Position: INTEGER Position of primer on fragment. Length: INTEGER Length of primer. MeltingTemp: INTEGER Melting temperature of primer. Direction: VARCHAR2(20) Direction (forward or reverse).
  • tblPrimerOrder OrderId INTEGER Order identifier. EmployeeId: INTEGER Employee who made order. VendorId: INTEGER Vendor for order. OrderDate: DATE Date of order. Owner: VARCHAR2(50) Name of employee making order. Vendor: VARCHAR2(50) Name of vendor.
  • Vendor VARCHAR2(50) Name of vendor.
  • PhoneNumber VARCHAR2(15) Phone number of vendor. FaxNumber: VARCHAR2(15) Fax Number of vendor.
  • tblPrimerOrderDesignMap PrimerId INTEGER Reference to primer table.
  • OrderId INTEGER Reference to order table.
  • tblWafer WaferId INTEGER Wafer identifier.
  • LotId INTEGER Lot to which wafer belongs. Code: VARCHAR2(8) Code for wafer. SynthesisDate_delete: DATE Synthesis date for wafer. Released: DATE Date wafer available. Done: SMALLINT Whether wafer production is complete. ExpirationDate: DATE Expiration date of wafer. ExpectedLife: CHAR(18) Expected useful life of wafer.
  • tblLot LotId INTEGER Lot identifier. WaferDesignId: INTEGER Identifier for wafer design. LotNumber: VARCHAR2(12) Lot number. WaferPN: VARCHAR2(50) Part number for wafer.
  • TilingDesignID INTEGER Tiling design identifier.
  • ChipDesignSequenceMapID NUMBER Reference to chip design sequence map.
  • TilingFormatID INTEGER Reference to tiling format table. UnitNumber: INTEGER 1 for sense, 0 for antisense. AtomOffset: INTEGER Number to add to translate atom position in tiling to atom position in chip design.
  • tblTilingFormat TilingFormatID INTEGER Tiling format identifier. Orientation: CHAR(18) Orientation for tiling.
  • ProbeLength SMALLINT Length of probes.
  • SubstitutionPosition SMALLINT Substitution position for mutation base in probes.
  • tblAtomDesign AtomDesignId NUMBER Atom design identifier.
  • TilingDesignID INTEGER Reference to tiling design table. Position: INTEGER Position of atom in sequence.
  • tblProbeDesign ProbeDesignID NUMBER Probe design identifier.
  • ChipDesignId INTEGER Reference to chip design x: SMALLINT x position of probe.
  • y SMALLINT y position of probe.
  • tblProbeDesignRole ProbeDesignID NUMBER Reference to probe design table.
  • AtomDesignID NUMBER Reference to atom design table.
  • Substitution CHAR(18) Substitution position in probe design. Mismatches: NUMBER Whether probe is match or mismatch.
  • tblProbeData ProbeDesignID NUMBER Reference to probe design table.
  • ScanExptID INTEGER Reference to scan experiment table.
  • Intensity FLOAT Measured hybridization intensity for probe.
  • NPixels NUMBER Number of pixels used for intensity calculation.
  • StDev NUMBER Standard deviation for pixels.
  • tblAnalysis AnalysisId INTEGER Analysis identifier. Analysis VersionID: INTEGER Reference to version of analysis. ProtocolID: INTEGER Reference to protocol table. DatePerformed: DATE Date analysis performed. NeedsUpdate: NUMBER Whether analysis is current.
  • tblAnalysisDependency INTEGER Analysis providing input.
  • SubAnalysisId INTEGER Analysis receiving input.
  • Role VARCHAR2(20) Role of data provided by parent analysis.
  • TblAnalysisInput AnalysisinputID INTEGER Analysis input identifier.
  • AnalysisId INTEGER Analysis receiving input.
  • Inputtype VARCHAR2(20) Type of input.
  • ObjectID INTEGER Reference to input data.
  • tblChipDesignSequenceMap ChipDesignSequenceMapID NUMBER Chip design sequence map identifier. FragmentID: INTEGER Reference to fragment table.
  • ChipDesignId INTEGER Chip design identifier.
  • AtomOffset NUMBER # to add to translate atom position in tiling to atom position in chip design
  • tblSequencePosition SequencePositionID NUMBER Sequence position identifier.
  • ChipDesignSequenceMapID NUMBER Reference to chip design sequence map table. Position: NUMBER Position in fragment.
  • GenomicSequencePositionID INTEGER Reference to genomic sequence position table. RefBase: INTEGER Reference base.
  • tblGenomicSequencePosition GenomicSequencePositionID: INTEGER Genomic sequence position identifier.
  • tblScanExperimentSet ScanExperimentSetID NUMBER Scan experiment set identifier.
  • tblScanExperimentUsed ScanExptID INTEGER Reference to scan experiment table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table.
  • tblTilingData TilingDataID: NUMBER Tiling data identifier.
  • ScanExptID INTEGER Reference to scan experiment table.
  • TilingDesignID INTEGER Reference to tiling design table.
  • tblAtomData AtomDataID INTEGER Atom data identifier.
  • TilingDataID NUMBER Reference to tiling data table.
  • SubjectSequencePositionID INTEGER Reference to subject sequence position table.
  • tblSubjectSequencePosition SubjectSequencePositionID INTEGER Subject sequence position identifier. SubjectID: INTEGER Reference to subject table. SequencePositionID: NUMBER Reference to sequence position table.
  • tblPolymorphismAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblPolyPositionResult AnalysisId INTEGER Reference to analysis table. PolyPositionID: INTEGER Polymorphism position identifier. ScanExperimentSetID: NUMBER Reference to scan experiment set table. PolyPositiontypeID: INTEGER Refers to possibility of polymorphism at position, e.g., certain, likely, possible, mismatch (reference is wrong).
  • WTBase CHAR(18) Wild type base at position. MuBase: INTEGER Mutation base at position.
  • tblUserPolyanalysis AnalysisId INTEGER Reference to analysis table.
  • tblUserPolyanalysisResult AnalysisId INTEGER Reference to analysis table.
  • SequencePositionID NUMBER Reference to sequence position table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table.
  • PolyPositionTypeID INTEGER See polymorphism position result table.
  • UserComment VARCHAR2(256) User comment on polymorphism analysis.
  • tblAtomanalysis AnalysisId INTEGER Reference to analysis table.
  • tblAtomResult AnalysisId INTEGER Reference to analysis table.
  • AtomDataID INTEGER Reference to atom data table.
  • PHat FLOAT Relative concentration of mutant and wild type.
  • PHatUpperbound FLOAT Upperbound for relative concentration.
  • PHatLowerbound FLOAT Lowerbound for relative concentration.
  • MaxIntensity FLOAT Maximum measured intensity for atom.
  • WTIntensity FLOAT Measured wild type intensity.
  • MutIntensity FLOAT Measured mutation intensity.
  • LocalWTCallRate FLOAT Rate at which atoms associated with surrounding sequence call reference base.
  • IntensityRatio FLOAT Ratio of intensity of wild type probe over intensity of mutation probe.
  • tblBaseCallAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblBaseCallResult AnalysisId INTEGER Reference to analysis table.
  • SubjectSequencePositionID INTEGER Reference to sequence position table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table. CalledBase: VARCHAR2(1) Base called for subject based on experiment set.
  • SuggestCheck NUMBER Used to indicate whether this sample should be used for resequencing.
  • tblClusteringAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblClusteringResult AnalysisId INTEGER Reference to analysis table.
  • SequencePositionID NUMBER Reference to sequence position table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table.
  • ClusteringFactor FLOAT Result of clustering analysis.
  • tblNormalizationRegionAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblNormalizationRegion NormalizationRegionID INTEGER Normalization region identifier.
  • AnalysisId INTEGER Reference to analysis table.
  • ChipDesignSequenceMapID NUMBER Reference to chip design sequence map table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table.
  • RegionEnd INTEGER Indication of end of the normalization region.
  • RegionStart INTEGER Indication of beginning of the normalization region.
  • tblNormalizationAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblNormalizationResult NormalizationResultID INTEGER Normalization result identifier.
  • AnalysisId INTEGER Reference to analysis table.
  • TilingDataID INTEGER Reference to tiling data table.
  • NormalizationRegionResultID INTEGER Reference to normalization result.
  • NormalizationValue NUMBER Value used for normalization.
  • DataOK NUMBER Indication whether normalization result is usable.
  • tblFootprintAnalysis AnalysisId INTEGER Reference to analysis table.
  • tblFootprint FootprintID NUMBER Footprint identifier.
  • AnalysisId INTEGER Analysis identifier.
  • ChipDesignSequenceMapID NUMBER Reference to chip design sequence map table.
  • ScanExperimentSetID NUMBER Reference to scan experiment set table.
  • FFStart NUMBER Start of footprint in sequence.
  • FPEnd NUMBER End of footprint in sequence.
  • tblTilingDataQualityAnalysis AnalysisId INTEGER Reference to analysis table.
  • tbltilingDataQualityResult TilingDataID NUMBER Reference to tiling data table.
  • AnalysisId INTEGER Reference to analysis table.
  • AvgWTIntensity NUMBER Average wild type intensity.
  • WTCallRate NUMBER Fraction of atoms where the brightest of the probes is the one with the reference base.
  • AcceptData INTEGER Whether data is of acceptable quality.
  • tblDifficultRegionanalysis AnalysisId INTEGER Reference to analysis table.
  • tblDifficultRegionResult ScanExptId INTEGER Reference to scan experiment table.
  • AnalysisId INTEGER Reference to analysis table.
  • ChipDesignSequenceMapID NUMBER Reference to chip design sequence map table.
  • RgnStart NUMBER Beginning of difficult region in sequence.
  • RgnEnd NUMBER End of difficult region in sequence.
  • Reason INTEGER Code indicating reason for difficult region, e.g., two or more non-wild type bases and less than a probe length.
  • tblPolyPatentSeq PolyPatentSeqId NUMBER Polymorphism sequence identifier.
  • Polyscreen VARCHAR2(50) Reference to internal grouping of polymorphisms.
  • FragmentCode VARCHAR2(50) Fragment sequence found in
  • Position LONG Position of polymorphism.
  • RefAllel CHAR(2) Wild type base at position.
  • FreqP FLOAT Frequency of wild type.
  • AltAllele CHAR(2) Mutation base at position.
  • FreqQ FLOAT Frequency of mutation base.
  • Heterozygocity FLOAT Heterozygocity value.
  • SequenceTag VARCHAR2(50) Sequence containing polymorphism including ambiguity code at polymorphism position.
  • GeneName VARCHAR2(50) Name of gene.
  • ChromosomeNum VARCHAR2(20) Chromosome number.
  • ChromosomeLoc VARCHAR2(20) Location of gene on chromosome.
  • ForwardPrimer VARCHAR2(50) Identifier for forward primer used to amplify fragment.
  • ReversePrimer VARCHAR2(50) Identifier for reverse primer used to amplify fragment.
  • tblPatentApp PatentAppId NUMBER Patent application identifier.
  • GroupId NUMBER Reference to employee group table.
  • AttorneyId NUMBER Reference to attorney table.
  • DocketNum VARCHAR2(30) Docket number for patent application.
  • FilingDate DATE Filing date of patent application.
  • Classification VARCHAR2(30) Patent office classification for patent application.
  • SerialNumber VARCHAR2(50) Serial number assigned by patent office.
  • MiddleName VARCHAR2(5) Middle name of attorney.
  • LastName VARCHAR2(30) Last name of attorney.
  • RegistrationNum VARCHAR2(25) Patent office registration number of attorney.
  • tblLawFirm LawFirmId NUMBER Law firm identifier.
  • Company VARCHAR2(100) Name of law firm.
  • Address VARCHAR2(100) Address of law firm.
  • City VARCHAR2(30) City address of law firm.
  • State VARCHAR2(20) State address of law firm.
  • ZipCode VARCHAR2(15) Zip Code of law firm.
  • Country VARCHAR2(15) Country of law firm.
  • Telephone VARCHAR2(30) Telephone number of law firm.
  • Fax VARCHAR2(30) Facsimile number of law firm.
  • TELEX VARCHAR2(20) Telex number of law firm.
  • tblEmployeeGroup GroupId NUMBER Identifier for inventor group.
  • GroupName VARCHAR2(50) Name of inventor group.
  • Comments VARCHAR2(50) Comments.
  • GroupList VARCHAR2(255) Written out list of inventor names.
  • tblEmployeeGrpMap EmployeeId INTEGER Reference to employee table for inventor/employees.
  • GroupId NUMBER Reference to inventor group table.
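  • By way of illustration only, the field listing above can be expressed as SQL table definitions. The following is a minimal sketch, assuming SQLite accessed through Python's standard sqlite3 module rather than the Oracle environment suggested by types such as VARCHAR2 and NUMBER. The column names follow the tblAtomResult listing; the composite primary key, the helper name store_atom_result, and the sample intensities are assumptions made for the example and are not taken from the specification.

```python
import sqlite3

# Columns follow the tblAtomResult field list; the primary key choice
# and the use of SQLite are assumptions of this sketch.
DDL = """
CREATE TABLE tblAtomResult (
    AnalysisId      INTEGER NOT NULL,  -- reference to analysis table
    AtomDataID      INTEGER NOT NULL,  -- reference to atom data table
    PHat            FLOAT,
    PHatUpperbound  FLOAT,
    PHatLowerbound  FLOAT,
    WTIntensity     FLOAT,
    MutIntensity    FLOAT,
    LocalWTCallRate FLOAT,
    IntensityRatio  FLOAT,
    PRIMARY KEY (AnalysisId, AtomDataID)
)
"""


def store_atom_result(conn, analysis_id, atom_data_id, wt, mut):
    """Insert one row; IntensityRatio is wild type over mutation intensity."""
    ratio = wt / mut if mut else None  # avoid division by zero
    conn.execute(
        "INSERT INTO tblAtomResult "
        "(AnalysisId, AtomDataID, WTIntensity, MutIntensity, IntensityRatio) "
        "VALUES (?, ?, ?, ?, ?)",
        (analysis_id, atom_data_id, wt, mut, ratio),
    )
    return ratio


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
ratio = store_atom_result(conn, 1, 10, 1200.0, 300.0)  # hypothetical values
```

  • Storing the derived ratio alongside the raw intensities mirrors the listing, which carries both WTIntensity and MutIntensity and a separate IntensityRatio field.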

Abstract

Systems and methods for organizing information relating to study of polymorphisms. A database model is provided which interrelates information about one or more of, e.g., subjects from whom samples are extracted, primers used in extracting the DNA from the subjects, about the samples themselves, about experiments done on samples, about particular oligonucleotide probe arrays used to perform experiments, about analysis procedures performed on the samples, and about analysis results. The model is readily translatable into database languages such as SQL. The database model scales to permit storage of information about large numbers of subjects, samples, experiments, chips, etc.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application claims priority from U.S. Prov. App. No. 60/053,842 filed Jul. 25, 1997, entitled COMPREHENSIVE BIO-INFORMATICS DATABASE, from U.S. Prov. App. No. 60/069,198 filed on Dec. 11, 1997, entitled COMPREHENSIVE DATABASE FOR BIOINFORMATICS, and from U.S. Prov. App. No. 60/069,436, entitled GENE EXPRESSION AND EVALUATION SYSTEM, filed on Dec. 11, 1997. The contents of all three provisional applications are herein incorporated by reference. [0001]
  • The subject matter of the present application is related to the subject matter of the following three co-assigned applications filed on the same day as the present application: GENE EXPRESSION AND EVALUATION SYSTEM (Attorney Docket No. 018547-035010), METHOD AND APPARATUS FOR PROVIDING A BIOINFORMATICS DATABASE (Attorney Docket No. 018547-033810), and METHOD AND SYSTEM FOR PROVIDING A PROBE ARRAY CHIP DESIGN DATABASE (Attorney Docket No. 018547-033830). The contents of these three applications are herein incorporated by reference. [0002]
  • BACKGROUND OF THE INVENTION
  • The present invention relates to the collection and storage of information pertaining to chips for processing biological samples and thereby identifying polymorphisms. [0003]
  • The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral. In some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism. In other instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form. In many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms. [0004]
  • Despite the increased amount of nucleotide sequence data being generated in recent years, only a minute proportion of the total repository of polymorphisms in humans and other organisms has so far been identified. The paucity of polymorphisms hitherto identified is due to the large amount of work required for their detection by conventional methods. For example, a conventional approach to identifying polymorphisms might be to sequence the same stretch of oligonucleotides in a population of individuals by dideoxy sequencing. In this type of approach, the amount of work increases in proportion to both the length of sequence and the number of individuals in a population and becomes impractical for large stretches of DNA or large numbers of persons. [0005]
  • Devices and computer systems for forming and using arrays of materials on a substrate have been developed. These devices and systems have been used for identifying polymorphisms. For example, PCT application WO92/10588, incorporated herein by reference for all purposes, describes techniques for sequencing or sequence checking nucleic acids and other materials. Arrays for performing these operations may be formed according to, for example, the pioneering techniques disclosed in U.S. Pat. No. 5,143,854 and U.S. Pat. No. 5,571,639, both incorporated herein by reference for all purposes. [0006]
  • According to one aspect of the techniques described therein, an array of nucleic acid probes is fabricated at known locations on a chip or substrate. A fluorescently labeled nucleic acid is then brought into contact with the chip and a scanner generates an image file indicating the locations where the labeled nucleic acids bound to the chip. Based upon the identities of the probes at these locations, it becomes possible to extract information such as the identity of polymorphic forms of DNA or RNA. Such systems have been used to form, for example, arrays of DNA that may be used to study and detect mutations relevant to cystic fibrosis, the P53 gene (relevant to certain cancers), HIV, and other genetic characteristics. [0007]
  • It would be highly useful to apply such arrays to the study of polymorphisms on a large scale. For example, it would be useful to conduct large scale studies on the correlation between certain polymorphisms and individual characteristics such as susceptibility to diseases and effectiveness of drug treatments. To achieve these benefits, it is contemplated that the operations of chip design, construction, sample preparation, and analysis will occur on a very large scale. The quantity of information to be stored and correlated for each of these steps is vast. For large scale polymorphism studies, it will be necessary to store this information in a way that facilitates later querying and retrieval. What is needed is a system and method suitable for storing and organizing large quantities of information used in conjunction with polymorphism studies. [0008]
  • SUMMARY OF THE INVENTION
  • The present invention provides systems and methods for organizing information relating to study of polymorphisms. A database model is provided which interrelates information about one or more of, e.g., subjects from whom samples are extracted, primers used in extracting the DNA from the subjects, about the samples themselves, about experiments done on samples, about particular oligonucleotide probe arrays used to perform experiments, about analysis procedures performed on the samples, and about analysis results. The model is readily translatable into database languages such as SQL. The database model scales to permit storage of information about large numbers of subjects, samples, experiments, chips, etc. [0009]
  • Applications include linkage studies to determine resistance to drugs, susceptibility to diseases, and study of every characteristic of humans and other organisms that is related to genetic variability. Another application of a database constructed according to this model is quality control of the various steps of performing a polymorphism study. By preserving information about every step of a polymorphism study, one can assess the reliability of the results or use the preserved information as feedback to improve procedures. [0010]
  • A further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings. [0011]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an overall system and process for forming and analyzing arrays of biological materials such as DNA or RNA. [0012]
  • FIG. 2A illustrates a computer system suitable for use in conjunction with the overall system of FIG. 1. [0013]
  • FIG. 2B illustrates a computer network suitable for use in conjunction with the overall system of FIG. 1. [0014]
  • FIG. 3 illustrates a key for interpreting a database model. [0015]
  • FIGS. 4A-4H illustrate a database model for maintaining information for the system and process of FIG. 1 according to one embodiment of the present invention. [0016]
  • DESCRIPTION OF SPECIFIC EMBODIMENTS
  • Investigation of Polymorphisms [0017]
  • A. Preparation of Samples [0018]
  • Polymorphisms are detected in a target nucleic acid from an individual being analyzed. For assay of genomic DNA, virtually any biological sample (other than pure red blood cells) is suitable. For example, convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. For assay of cDNA or mRNA, the tissue sample must be obtained from an organ in which the target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome P450, the liver is a suitable source. [0019]
  • Many of the methods described below require amplification of DNA from target samples. This can be accomplished by, e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202 (each of which is incorporated by reference for all purposes). [0020]
  • Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990)), and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively. [0021]
  • B. Detection of Polymorphisms in Target DNA [0022]
  • There are two distinct types of analysis depending on whether a polymorphism in question has already been characterized. The first type of analysis is sometimes referred to as de novo characterization. This analysis compares target sequences in different individuals to identify points of variation, i.e., polymorphic sites. By analyzing groups of individuals representing the greatest ethnic diversity among humans and greatest breed and species variety in plants and animals, patterns characteristic of the most common alleles/haplotypes of the locus can be identified, and the frequencies of such alleles/haplotypes in the population determined. Additional allelic frequencies can be determined for subpopulations characterized by criteria such as geography, race, or gender. The second type of analysis is determining which form(s) of a characterized polymorphism are present in individuals under test. There are a variety of suitable procedures, which are discussed in turn. [0023]
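  • By way of illustration only, the determination of allelic frequencies from genotype calls can be sketched as follows. The sketch assumes diploid genotypes written as two-character strings and uses the standard expected-heterozygosity formula, one minus the sum of squared allele frequencies. The function names and the example calls are hypothetical, and the specification does not state how any stored frequency or heterozygosity value is actually computed.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotype strings such as 'AG'."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)  # each character is one allele copy
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical calls for four individuals at one polymorphic site.
calls = ["AA", "AG", "GG", "AG"]
freqs = allele_frequencies(calls)
het = expected_heterozygosity(freqs)
```

  • The same functions can be applied to a subset of individuals to obtain frequencies for a subpopulation.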
  • 1. Allele-Specific Probes [0024]
  • The design and use of allele-specific probes for analyzing polymorphisms is described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15 mer at the 7 position; in a 16 mer, at either the 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms. [0025]
  • Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence. [0026]
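  • By way of illustration only, the construction of a probe pair with the polymorphic site at a central position can be sketched as follows. The sketch assumes a 15 mer with the site at 0-based offset 7 within the probe window, and assumes each probe is the reverse complement of the target segment it is meant to match. The function names, the indexing convention, and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_pair(reference, site, variant_base, length=15, offset=7):
    """Return (reference_probe, variant_probe) for one polymorphic site.

    site   -- 0-based index of the polymorphic site in `reference`
    offset -- 0-based position of the site within the probe window
    """
    start = site - offset
    if start < 0 or start + length > len(reference):
        raise ValueError("probe window falls outside the reference sequence")
    ref_window = reference[start:start + length]
    var_window = ref_window[:offset] + variant_base + ref_window[offset + 1:]
    # Each probe is perfectly complementary to one allelic form.
    return reverse_complement(ref_window), reverse_complement(var_window)


# Hypothetical target with a G-to-A polymorphism at index 10.
target = "ACGTACGTACGTACGTACGT"
ref_probe, var_probe = probe_pair(target, site=10, variant_base="A")
```

  • The two probes differ at a single base, so a stringent hybridization distinguishes the reference form from the variant form.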
  • 2. Tiling Arrays [0027]
  • The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described by WO 95/11995 (incorporated by reference in its entirety for all purposes). WO 95/11995 also describes subarrays that are optimized for detection of variant forms of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence. The second group of probes is designed by the same principles as described in the Examples except that the probes exhibit complementarity to the second reference sequence. The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (i.e., two or more mutations within 9 to 21 bases). [0028]
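  • By way of illustration only, one simple tiling layout can be sketched as follows. The sketch assumes four probe targets per interrogated position, identical except for the base at the central position, and an odd probe length; the layouts actually described in WO 95/11995 may differ. The function name is hypothetical, and the sequences returned are written in the sense of the target, so physical probes would be their complements.

```python
BASES = "ACGT"


def tile(reference, length=15):
    """Yield (position, {base: window}) for each position that can be centered.

    Each window is the reference segment around `position` with the central
    base replaced by one of the four bases.
    """
    if length % 2 == 0:
        raise ValueError("this sketch assumes an odd probe length")
    half = length // 2
    for pos in range(half, len(reference) - half):
        window = reference[pos - half:pos + half + 1]
        yield pos, {b: window[:half] + b + window[half + 1:] for b in BASES}


# Hypothetical reference sequence.
reference = "ACGTACGTACGTACGTACGT"
tiles = dict(tile(reference))
```

  • A second group of probes for an allelic variant can be produced by calling the same function with the second reference sequence.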
  • 3. Allele-Specific Primers [0029]
  • An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers leading to a detectable product signifying the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3′-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer. See, e.g., WO 93/22456. [0030]
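  • By way of illustration only, the effect of placing the polymorphic site at the 3′-most position of the primer can be sketched as a simple string comparison. The sketch is a deliberate simplification: the primer is written in the same sense as the strand shown, and any mismatch is treated as blocking amplification. The function name and sequences are hypothetical.

```python
def primes_allele(primer, template, site):
    """Return True if `primer` ends at `site` and matches `template` exactly.

    Both sequences are written 5'->3' in the same sense, so the last
    character of `primer` is its 3'-most base and aligns with `site`.
    """
    start = site - len(primer) + 1
    if start < 0:
        return False
    return template[start:site + 1] == primer


# Hypothetical alleles differing only at index 9 (C versus T).
allele_c = "GATTACAGGC"
allele_t = "GATTACAGGT"
primer_c = "TACAGGC"  # 3'-most base matches the C allele
```

  • With these inputs the primer is accepted on the C allele and rejected on the T allele, corresponding to the presence or absence of a detectable product.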
  • 4. Direct-Sequencing [0031]
  • The direct analysis of the sequence of polymorphisms of the present invention can be accomplished using either the dideoxy chain termination method or the Maxam Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual, (Acad. Press, 1988)). [0032]
  • 5. Denaturing Gradient Gel Electrophoresis [0033]
  • Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification, (W. H. Freeman and Co, New York, 1992), Chapter 7. [0034]
  • 6. Single-Strand Conformation Polymorphism Analysis [0035]
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence difference between alleles of target sequences. [0036]
  • Biological Material Analysis System [0037]
  • One embodiment of the present invention operates in the context of a system for analyzing biological or other materials using arrays that themselves include probes that may be made of biological materials such as RNA or DNA. The VLSIPS™ and GeneChip™ technologies provide methods of making and using very large arrays of polymers, such as nucleic acids, on chips. See U.S. Pat. No. 5,143,854 and PCT Patent Publication Nos. WO 90/15070 and 92/10092, each of which is hereby incorporated by reference for all purposes. Nucleic acid probes on the chip are used to detect complementary nucleic acid sequences in a sample nucleic acid of interest (the “target” nucleic acid). [0038]
  • FIG. 1 illustrates an overall system 100 for forming and analyzing arrays of biological materials such as RNA or DNA. A part of system 100 is a polymorphism database 102. Polymorphism database 102 includes information about, e.g., biological sources, preparation of samples, design of arrays, raw data obtained from applying experiments to chips, analysis procedures applied, and analysis results. Polymorphism database 102 facilitates large scale study of polymorphisms. [0039]
  • A chip design system 104 is used to design arrays of polymers, e.g., biological polymers such as RNA or DNA. Chip design system 104 may be, for example, an appropriately programmed Sun Workstation or personal computer or workstation, such as an IBM PC equivalent, including appropriate memory and a CPU. Chip design system 104 obtains inputs from a user regarding chip design objectives including polymorphisms of interest, and other inputs regarding the desired features of the array. Optionally, chip design system 104 obtains information from external databases such as GenBank. The output of chip design system 104 is a set of chip design computer files in the form of, for example, a switch matrix, as described in PCT application WO 92/10092, and other associated computer files. The chip design computer files form a part of polymorphism database 102. Systems for designing chips for study of polymorphisms are disclosed in U.S. Pat. No. 5,571,639 and in PCT application WO 95/11995, the contents of which are herein incorporated by reference. [0040]
  • The chip design files are input to a mask design system (not shown) that designs the lithographic masks used in the fabrication of probe arrays of molecules such as DNA. The mask design system generates mask design files that are then used by a mask construction system (not shown) to construct masks or other synthesis patterns such as chrome-on-glass masks for use in the fabrication of polymer arrays. [0041]
  • The masks are used in a synthesis system (not shown). The synthesis system includes the necessary hardware and software used to fabricate arrays of polymers on a substrate or chip. The synthesis system includes a light source and a chemical flow cell on which the substrate or chip is placed. A mask is placed between the light source and the substrate/chip, and the two are translated relative to each other at appropriate times for deprotection of selected regions of the chip. Selected chemical reagents are directed through the flow cell for coupling to deprotected regions, as well as for washing and other operations. The substrates fabricated by the synthesis system are optionally diced into smaller chips. The output of the synthesis system is a chip ready for application of a target sample. [0042]
  • Information about the mask design, mask construction, and probe array synthesis is presented by way of background. A biological source 112 is, for example, tissue from a plant or animal. Various processing steps are applied to material from biological source 112 by a sample preparation system 114. Operation of sample preparation system 114 in the context of a polymorphism study is discussed below in further detail. [0043]
  • The prepared samples include nucleic acid sequences such as DNA. When the sample is applied to the chip by a sample exposure system 116, the nucleic acids may or may not bond to the probes. The nucleic acids can be tagged with fluorescein labels to determine which probes have bonded to nucleotide sequences from the sample. The prepared samples will be placed in a scanning system 118. Scanning system 118 includes a detection device such as a confocal microscope or CCD (charge-coupled device) that is used to detect the location where labeled receptors have bound to the substrate. The output of scanning system 118 is an image file(s) indicating, in the case of fluorescein labeled receptor, the fluorescence intensity (photon counts or other related measurements, such as voltage) as a function of position on the substrate. These image files may also form a part of polymorphism database 102. Since higher photon counts will be observed where the labeled nucleic acid(s) has bound more strongly to the array of probes, and since the monomer sequence of the probes on the substrate is known as a function of position, it becomes possible to determine the sequence(s) of the nucleic acid(s) that are complementary to the probes. [0044]
  • The image files and the design of the chips are input to an analysis system 120 that, e.g., calls bases. Such analysis techniques are described in EPO Pub. No. 0717113A, the contents of which are herein incorporated by reference. [0045]
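  • By way of illustration only, a very simple base-calling rule can be sketched as follows. The sketch assumes each interrogated position has four probe intensities, one per base, and calls the base whose probe is brightest; it then computes the fraction of positions at which the called base equals the reference base. This is not the analysis technique of EPO Pub. No. 0717113A, and the function names and intensity values are hypothetical.

```python
def call_base(intensities):
    """Return the base whose probe shows the highest measured intensity."""
    return max(intensities, key=intensities.get)


def wt_call_rate(atoms):
    """Fraction of (reference_base, intensities) pairs calling the reference."""
    if not atoms:
        return 0.0
    hits = sum(1 for ref, inten in atoms if call_base(inten) == ref)
    return hits / len(atoms)


# Hypothetical intensities for three positions.
atoms = [
    ("A", {"A": 900.0, "C": 100.0, "G": 120.0, "T": 80.0}),
    ("G", {"A": 60.0, "C": 75.0, "G": 640.0, "T": 90.0}),
    ("T", {"A": 500.0, "C": 40.0, "G": 55.0, "T": 300.0}),  # calls A, not T
]
calls = [call_base(inten) for _, inten in atoms]
rate = wt_call_rate(atoms)
```

  • A low rate over a region may indicate poor data quality or a genuine sequence difference, which is why both the raw intensities and the derived results are worth retaining.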
  • Chip design system 104, analysis system 120 and control portions of exposure system 116, sample preparation system 114, and scanning system 118 may be appropriately programmed computers such as a Sun workstation or IBM-compatible PC. An independent computer for each system may perform the computer-implemented functions of these systems or one computer may combine the computerized functions of two or more systems. One or more computers may maintain polymorphism database 102 independent of the computers operating the systems of FIG. 1 or polymorphism database 102 may be fully or partially maintained by these computers. [0046]
  • FIG. 2A depicts a block diagram of a host computer system 210 suitable for implementing the present invention. Host computer system 210 includes a bus 212 which interconnects major subsystems such as a central processor 214, a system memory 216 (typically RAM), an input/output (I/O) adapter 218, an external device such as a display screen 224 via a display adapter 226, a keyboard 232 and a mouse 234 via an I/O adapter 218, a SCSI host adapter 236, and a floppy disk drive 238 operative to receive a floppy disk 240. SCSI host adapter 236 may act as a storage interface to a fixed disk drive 242 or a CD-ROM player 244 operative to receive a CD-ROM 246. Fixed disk 242 may be a part of host computer system 210 or may be separate and accessed through other interface systems. A network interface 248 may provide a direct connection to a remote server via a telephone link or to the Internet. Network interface 248 may also connect to a local area network (LAN) or other network interconnecting many computer systems. Many other devices or subsystems (not shown) may be connected in a similar manner. [0047]
  • Also, it is not necessary for all of the devices shown in FIG. 2A to be present to practice the present invention, as discussed below. The devices and subsystems may be interconnected in different ways from that shown in FIG. 2A. The operation of a computer system such as that shown in FIG. 2A is readily known in the art and is not discussed in detail in this application. Code to implement the present invention may be operably disposed or stored in computer-readable storage media such as system memory 216, fixed disk 242, CD-ROM 246, or floppy disk 240. [0048]
  • FIG. 2B depicts a network 260 interconnecting multiple computer systems 210. Network 260 may be a local area network (LAN), wide area network (WAN), etc. Polymorphism database 102 and the computer-related operations of the other elements of FIG. 2B may be divided amongst computer systems 210 in any way with network 260 being used to communicate information among the various computers. Portable storage media such as floppy disks may be used to carry information between computers instead of network 260. [0049]
  • Overall Description of Database [0050]
  • Polymorphism database 102 is preferably a relational database with a complex internal structure. The structure and contents of polymorphism database 102 will be described with reference to a logical model depicted in FIGS. 4A-4H that describes the contents of tables of the database as well as interrelationships among the tables. A visual depiction of this model will be an Entity Relationship Diagram (ERD) which includes entities, relationships, and attributes. A detailed discussion of ERDs is found in “ERwin version 3.0 Methods Guide” available from Logic Works, Inc. of Princeton, N.J., the contents of which are herein incorporated by reference. Those of skill in the art will appreciate that automated tools such as Developer 2000 available from Oracle will convert the ERD from FIGS. 4A-4H directly into executable code such as SQL code for creating and operating the database. [0051]
  • FIG. 3 is a key to the ERD that will be used to describe the contents of polymorphism database 102. A representative table 302 includes one or more key attributes 304 and one or more non-key attributes 306. Representative table 302 includes one or more records where each record includes fields corresponding to the listed attributes. The contents of the key fields taken together identify an individual record. In the ERD, each table is represented by a rectangle divided by a horizontal line. The fields or attributes above the line are key while the fields or attributes below the line are non-key. An identifying relationship 308 signifies that the key attribute of a parent table 310 is also a key attribute of a child table 312. A non-identifying relationship 314 signifies that the key attribute of a parent table 316 is also a non-key attribute of a child table 318. Where (FK) appears in parentheses, it indicates that an attribute of one table is a key attribute of another table. Both the depicted non-identifying and identifying relationships are one to one-or-more relationships where one record in the parent table corresponds to one or more records in the child table. An alternative non-identifying relationship 324 is a one to zero-or-more relationship where one record in a parent table 320 corresponds to zero or more records in a child table 322. [0052]
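  • By way of illustration only, the two kinds of relationship in the key can be sketched as SQL table definitions. The sketch assumes SQLite accessed through Python's standard sqlite3 module; the table and column names are hypothetical and do not correspond to tables of FIGS. 4A-4H. In the identifying relationship the parent key forms part of the child's primary key, whereas in the non-identifying relationship it is an ordinary foreign-key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE parent (ParentId INTEGER PRIMARY KEY);

-- Identifying relationship: the parent's key is part of the child's key.
CREATE TABLE identified_child (
    ParentId INTEGER NOT NULL REFERENCES parent(ParentId),
    Seq      INTEGER NOT NULL,
    PRIMARY KEY (ParentId, Seq)
);

-- Non-identifying relationship: the parent's key is a non-key attribute.
CREATE TABLE plain_child (
    ChildId  INTEGER PRIMARY KEY,
    ParentId INTEGER NOT NULL REFERENCES parent(ParentId)
);
""")
conn.execute("INSERT INTO parent (ParentId) VALUES (1)")
conn.execute("INSERT INTO identified_child (ParentId, Seq) VALUES (1, 1)")
conn.execute("INSERT INTO plain_child (ChildId, ParentId) VALUES (1, 1)")


def orphan_rejected():
    """Return True if a child row pointing at a missing parent is refused."""
    try:
        conn.execute("INSERT INTO plain_child (ChildId, ParentId) VALUES (2, 99)")
    except sqlite3.IntegrityError:
        return True
    return False
```

  • Foreign keys alone express the zero-or-more case; requiring at least one child record per parent, as in the one to one-or-more relationships, would need additional application logic or triggers.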
  • Database Model [0053]
  • FIGS. 4A-4H are entity relationship diagrams (ERDs) showing elements of polymorphism database 102 according to one embodiment of the present invention. Each rectangle in the diagram corresponds to a table in database 102. [0054]
  • [0055] The interrelationships and general contents of the tables of database 102 will be described first. Then a chart will be presented listing and describing all of the fields of the various tables.
  • [0056] FIG. 4A illustrates core elements of database 102 according to one embodiment of the present invention. A subject table 402 lists organisms from which samples have been extracted for polymorphism analysis or other tissue sources. Samples may also be obtained from tissue collections not associated with any one identified organism. Information stored within subject table 402 includes the name, gender, family, position within the family (e.g., father, mother, etc.), and ethnic group. For human subjects, the name and family will preferably be represented in coded form to assure privacy. Associated with each subject is a species as listed in a species table 404. Also, a relationship may be defined among subjects in a subject relationship table 406 which includes records corresponding to related subjects. These relationships may be father-mother, sibling, twins, etc. Subjects may be part of a group that is being studied, e.g., a group with a congenital disease, or a toxic reaction to a particular drug. The groups are listed in a subject group table 408. Participation of subjects in groups is defined by a subject participation table 410 which lists all group memberships.
  • [0057] Samples and their attributes are listed in a sample table 412. Each sample has an associated sample type. The sample types are listed in a sample type table 414. Possible sample types include blood, urine, etc. Companies or institutions that provide samples are listed in a sample source table 416.
  • [0058] Database 102 provides an item table 418 that includes records for items. There are various types of items that correspond to different stages of the sample preparation process. An “item derivation” transforms an item of one type into an item of another type. The following table lists various item types and item derivation types for a representative embodiment.
    Item Type | Derived from | by Item Derivation Type
    Sample | other samples | pooling
    Sample | other sample | splitting
    Extracted DNA | Sample | DNA Extraction
    Target (Sequences of interest amplified) | Extracted DNA | PCR
    Fluorescently Labeled Target | Target | Labeling
    Hybridized Chip | Labeled Target | Hybridization (application of target to chip)
    Stained Hybridized Chip | Hybridized Chip | Staining
  • [0060] Item derivations are listed in an item derivation table 420. It should be noted that derivations need not produce a change between item types. Each item derivation occurs in accordance with a protocol that characterizes the step or steps in the derivation. Protocols are listed in a protocol table 428. Each item derivation is performed by an employee listed in employee table 432.
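  • Because each derivation record links a source item to a result item, the preparation history of any item can be recovered by walking the derivation records backwards. The following is a minimal sketch using Python's sqlite3 module. The Item and ItemDerivation names and the Item1Id/Item2Id fields follow the field chart given later in this document, but the schema is simplified: storing the derivation type as text and the sample rows are assumptions made for illustration, and lineage is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemId INTEGER PRIMARY KEY, ItemName TEXT);
CREATE TABLE ItemDerivation (
    Item1Id INTEGER REFERENCES Item(ItemId),  -- derivation source
    Item2Id INTEGER REFERENCES Item(ItemId),  -- derivation result
    DerivationType TEXT,                      -- simplified: text instead of a type id
    PRIMARY KEY (Item1Id, Item2Id)
);
""")
conn.executemany("INSERT INTO Item VALUES (?, ?)", [
    (1, "sample"), (2, "extracted DNA"), (3, "target"),
    (4, "labeled target"), (5, "hybridized chip"),
])
conn.executemany("INSERT INTO ItemDerivation VALUES (?, ?, ?)", [
    (1, 2, "DNA Extraction"), (2, 3, "PCR"),
    (3, 4, "Labeling"), (4, 5, "Hybridization"),
])


def lineage(item_id):
    """Return the names of all ancestors of an item, oldest first, ending with the item."""
    rows = conn.execute("""
        WITH RECURSIVE chain(ItemId, depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT d.Item1Id, chain.depth + 1
            FROM ItemDerivation d JOIN chain ON d.Item2Id = chain.ItemId
        )
        SELECT i.ItemName FROM chain JOIN Item i USING (ItemId)
        ORDER BY depth DESC
    """, (item_id,)).fetchall()
    return [name for (name,) in rows]
```

  • Calling lineage(5) on this sample data walks from the hybridized chip back to the original sample.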
  • [0061] Unused chips are listed in a chip table 422. Hybridized chips (i.e., chips that have had target applied) are listed in a hybridized chip table 424. A hybridized sample map table 426 lists the relationships between hybridized chips and the samples that have been applied to them.
  • [0062] Stained hybridized chips are scanned in a process referred to here as a scan experiment. Scan experiments are listed in a scan experiment table 430. The scan experiment occurs in accordance with a protocol listed in protocol table 428. The scan experiment is performed by an employee listed in employee table 432.
  • [0063] FIG. 4B depicts further details of the data model for items and item derivations. The various item types are listed in an item type table 434 and the various item derivation types are listed in an item derivation type table 436. The relationships between successive item types, e.g., sample and target, are defined in an item type derivation table 438. An item has associated attributes. For example, for a target, database 102 may store the concentration, volume, location and/or remaining amount. All item attributes are stored in an item attribute table 440. Item attributes may be shared among multiple items. For example, a series of targets may all share a preparation date. An item attribute item map table 442 implements a many-to-many relationship between item attributes and items. The various types of item attributes such as preparer, preparation date, etc. are listed in an item attribute type table 444. Each item type has corresponding attribute types. Some attribute types are, however, shared among various item types. Accordingly, there is a many-to-many relationship among item attribute types and item types that is implemented by an item type map table 446.
  • [0064] The tables of FIG. 4B represent a powerfully general model of the sample preparation process. Changes in process steps that require changes in the type of information that should be stored may be implemented by changing and adding table contents rather than providing new tables or changing relationships among tables.
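  • The generality of this attribute model can be sketched as follows, again with Python's sqlite3 module. Table and field names follow the field chart given later in this document, while the attribute values (a date and a concentration) and the helper attributes_of are invented for illustration. The point of the sketch is that a new kind of information is recorded by inserting rows, with no CREATE TABLE or ALTER TABLE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemId INTEGER PRIMARY KEY, ItemName TEXT);
CREATE TABLE ItemAttributeType (
    ItemAttributeTypeId INTEGER PRIMARY KEY,
    ItemAttributeName TEXT
);
CREATE TABLE ItemAttribute (
    ItemAttributeId INTEGER PRIMARY KEY,
    ItemAttributeTypeId INTEGER REFERENCES ItemAttributeType,
    Attribute TEXT
);
CREATE TABLE ItemAttributeItemMap (
    ItemAttributeId INTEGER REFERENCES ItemAttribute,
    ItemId INTEGER REFERENCES Item,
    PRIMARY KEY (ItemAttributeId, ItemId)
);
""")

# Two targets share a single preparation-date attribute (illustrative values).
conn.executemany("INSERT INTO Item VALUES (?, ?)", [(1, "target A"), (2, "target B")])
conn.execute("INSERT INTO ItemAttributeType VALUES (1, 'preparation date')")
conn.execute("INSERT INTO ItemAttribute VALUES (1, 1, '1998-03-02')")
conn.executemany("INSERT INTO ItemAttributeItemMap VALUES (?, ?)", [(1, 1), (1, 2)])

# A process change calls for a new kind of information: only rows are added.
conn.execute("INSERT INTO ItemAttributeType VALUES (2, 'concentration')")
conn.execute("INSERT INTO ItemAttribute VALUES (2, 2, '50 ng/uL')")
conn.execute("INSERT INTO ItemAttributeItemMap VALUES (2, 1)")


def attributes_of(item_id):
    """Return {attribute type name: value} for one item."""
    rows = conn.execute("""
        SELECT t.ItemAttributeName, a.Attribute
        FROM ItemAttributeItemMap m
        JOIN ItemAttribute a ON a.ItemAttributeId = m.ItemAttributeId
        JOIN ItemAttributeType t ON t.ItemAttributeTypeId = a.ItemAttributeTypeId
        WHERE m.ItemId = ?
    """, (item_id,)).fetchall()
    return dict(rows)
```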
  • [0065] FIG. 4C depicts a detailed data model for storing information about protocols according to the present invention. Protocols as stored in protocol table 428 represent information about particular processes that have been performed including item derivations, analyses, and scan experiments. Each protocol has an associated protocol template. Protocol templates identify protocol types. For example, one protocol template may be a PCR template. All protocols associated with the PCR template identify parameters for performing a PCR procedure. Protocol templates are listed in a protocol template table 448. A parameter table 450 lists all the parameters and their values for all the protocols listed in protocol table 428. A parameter template table 452 lists the various parameter types along with default values. An example of a parameter template would be a PCR reaction temperature. The parameter template would include a default value for this parameter. Parameter table 450 might then list many different PCR reaction temperature values that would be used by many different protocols. If a parameter value has not been modified by the user, it inherits the standard value of the associated parameter template. A parameter template set is a set of parameter templates that are used for a particular purpose, e.g., in association with protocols according to one or more protocol templates. Parameter template sets are listed in a parameter template set table 454. There are different types of parameter template set and these are listed in a parameter template set type table 456. A mapping between parameter template sets and protocol templates is defined by a protocol template set map table 458.
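  • The inheritance of default values from parameter templates can be sketched as a simple lookup. This sketch uses Python's sqlite3 module and the Parameter and ParameterTemplate field names from the field chart given later in this document. Representing an unmodified parameter as a NULL value is an assumption, since the text does not say how "not modified by the user" is stored, and the temperature values and the helper effective_value are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ParameterTemplate (
    ParameterTemplateId INTEGER PRIMARY KEY,
    Name TEXT,
    StandardValue TEXT      -- default value for the parameter
);
CREATE TABLE Parameter (
    ParameterId INTEGER PRIMARY KEY,
    ParameterTemplateId INTEGER REFERENCES ParameterTemplate,
    ProtocolID INTEGER,
    Value TEXT              -- assumption: NULL means "not modified by the user"
);
""")
conn.execute(
    "INSERT INTO ParameterTemplate VALUES (1, 'PCR reaction temperature', '55')")
conn.executemany("INSERT INTO Parameter VALUES (?, ?, ?, ?)", [
    (1, 1, 100, None),   # protocol 100 keeps the template default
    (2, 1, 101, "58"),   # protocol 101 overrides it
])


def effective_value(protocol_id, name):
    """Return the parameter value for a protocol, falling back to the template default."""
    row = conn.execute("""
        SELECT COALESCE(p.Value, t.StandardValue)
        FROM Parameter p JOIN ParameterTemplate t USING (ParameterTemplateId)
        WHERE p.ProtocolID = ? AND t.Name = ?
    """, (protocol_id, name)).fetchone()
    return row[0] if row else None
```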
  • [0066] Protocol templates may have associated lengthy verbal information about how to perform protocol steps. A protocol template document table 460 stores references to documents that include instructions for performing protocols.
  • [0067] As with the items, the data model for protocols defined by FIG. 4C is highly general and allows significant changes in the way item derivations, analyses, and experiments are performed without changing the underlying data model.
  • [0068] Referring again to FIG. 4A, there are tables to record information concerning the use of primers in PCR. A fragment table 462 lists all the sequence fragments investigated in conjunction with database 102. Associated with each fragment are one or more primer pairs used to amplify the fragment in a PCR process. A primer pair table 464 lists all the primer pairs including information about whether the primer pair actually worked to amplify the fragment. In order to develop the information about the effectiveness of primer pairs, there is a PCR table 466 that lists records identifying the outcome of multiple PCR operations. The individual PCR operations are identified by reference to item derivation table 420.
  • [0069] A single PCR operation may be used to amplify many different fragments and thus employ many different primer pairs. Of course, a single primer pair may be used in multiple PCR operations. There is therefore a many-to-many relationship between PCR operations and primer pairs that is recorded by a primer pair PCR map table 468.
  • [0070] Information about individual primers is stored in a primer table 470. Also, each primer has an associated protocol in protocol table 428 that characterizes the primer preparation process. Information about primer orders is listed in a primer order table 472. Each primer order is to a vendor and the vendors are listed in a vendor table 474. Each primer order is made by an employee listed in employee table 432. A primer order design map table 476 implements a many-to-many relationship between primer orders and primers.
  • [0071] The data model described here thus preserves information about primers used in PCR reactions. One can improve results by using primers that have successfully amplified a given fragment in the past. Sometimes particular groups of primer pairs cannot be multiplexed together in the same PCR process. The information preserved here thus permits experimenters to make optimal use of expensive and time-consuming PCR procedures.
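  • A query of the kind this model makes possible can be sketched as follows: for a given fragment, list the primer pairs marked as having worked, ranked by the number of successful PCR operations in which they took part. The sketch uses Python's sqlite3 module with the tblPrimerPair, tblPCR, and PrimePairPCRMap names from the field chart given later in this document. The columns are reduced to those needed here, the sample rows are invented, and proven_pairs is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPrimerPair (
    PrimerPairId INTEGER PRIMARY KEY,
    FragmentId INTEGER,
    Worked INTEGER               -- 1 if the pair amplified its fragment
);
CREATE TABLE tblPCR (
    Item1Id INTEGER, Item2Id INTEGER,
    Reactionworked INTEGER,      -- 1 if the PCR operation worked
    PRIMARY KEY (Item1Id, Item2Id)
);
CREATE TABLE PrimePairPCRMap (
    PrimerPairId INTEGER, Item1Id INTEGER, Item2Id INTEGER,
    PRIMARY KEY (PrimerPairId, Item1Id, Item2Id)
);
""")
conn.executemany("INSERT INTO tblPrimerPair VALUES (?, ?, ?)",
                 [(1, 7, 1), (2, 7, 0), (3, 7, 1), (4, 8, 1)])
conn.executemany("INSERT INTO tblPCR VALUES (?, ?, ?)",
                 [(10, 11, 1), (12, 13, 1), (14, 15, 0)])
conn.executemany("INSERT INTO PrimePairPCRMap VALUES (?, ?, ?)",
                 [(1, 10, 11), (1, 12, 13), (2, 14, 15), (3, 12, 13), (3, 14, 15)])


def proven_pairs(fragment_id):
    """Return (PrimerPairId, successful PCR count) for pairs that worked on a fragment."""
    return conn.execute("""
        SELECT pp.PrimerPairId, COALESCE(SUM(pcr.Reactionworked), 0) AS n
        FROM tblPrimerPair pp
        LEFT JOIN PrimePairPCRMap m ON m.PrimerPairId = pp.PrimerPairId
        LEFT JOIN tblPCR pcr
               ON pcr.Item1Id = m.Item1Id AND pcr.Item2Id = m.Item2Id
        WHERE pp.FragmentId = ? AND pp.Worked = 1
        GROUP BY pp.PrimerPairId
        ORDER BY n DESC, pp.PrimerPairId
    """, (fragment_id,)).fetchall()
```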
  • [0072] It is also useful to preserve information about the chip production process and the origin of individual chips. A wafer table 478 lists wafers. When chips are produced, many chips are produced at the same time as part of a single wafer. Chip table 422 stores references to wafer table 478 for each chip and the location of each chip on its wafer at production time. Sometimes there is analytic significance associated with the location of a chip on the wafer. Each wafer is produced as part of a lot and the identity of the lot for each wafer is recorded by wafer table 478 as a reference to a lot table 480 that lists each lot.
  • [0073] FIG. 4D depicts further details of tables pertaining to chip design that are preferably maintained within polymorphism database 102 according to one embodiment of the present invention. A tiling design table 482 lists tiling designs. Each tiling design represents the application of a particular tiling format to a sequence to be investigated. Tiling formats indicate probe orientation, probe length, and the position within a probe of a single nucleotide polymorphism being investigated. In a preferred embodiment, there may be very few tiling formats and they are listed in a tiling format table 484.
  • [0074] A particular tiling design includes many atom designs specifying the design of a single atom. In one embodiment, an atom is a group of typically four probes used to investigate a single base position with each probe hybridizing to a sequence including a different base at that position. Atom designs are listed in an atom design table 486. Records identifying the designs of individual probes are listed in a probe design table 488. A probe design role table 490 indicates the roles of probes listed in probe design table 488 in the atom designs of atom design table 486. For combinations of probe design and atom design, probe design role table 490 indicates which base the probe hybridizes to at the substitution position and whether the probe represents a match or a mismatch to the wild type.
  • [0075] A probe data table 492 gives the hybridization intensity values for particular probe designs as determined in particular scan experiments. Each record of the table also gives the number of pixels used to determine the intensity value and the standard deviation of intensity as measured among the pixels.
  • [0076] FIGS. 4E-4G depict aspects of polymorphism database 102 related to analysis procedures and their results according to one embodiment of the present invention. An analysis table 494 lists analyses performed. An analysis generally refers to a non-trivial transformation of data. Records of analysis table 494 include references to protocol table 428 to specify parameters used for each analysis. Analyses may take as their input raw data or the results of previous analyses. An analysis dependency table 496 lists dependencies among analyses where one analysis depends on the data developed by another analysis. An analysis input table 498 lists inputs for analyses listed in analysis table 494.
  • [0077] On the right side of FIG. 4E are various tables used to support analyses. A chip design sequence map table 500 correlates particular fragments with chip designs. A sequence position table 502 lists investigated sequence positions indicating their positions on a fragment. Records of sequence position table 502 reference a genomic sequence position table 504 which gives sequence positions in the genome rather than within individual fragments.
  • [0078] A scan experiment set table 506 lists sets of scan experiments. This allows for groupings of experiments for individuals or populations to serve as the basis for polymorphism analysis. A scan experiment used table 508 lists records indicating memberships of a scan experiment in a scan experiment set.
  • [0079] A tiling data table 510 lists records identifying tiling designs as implemented in particular chips measured by particular scan experiments. An atom data table 512 lists the intensities measured for particular sequence positions as measured in scan experiments identified by the tiling data records. A subject sequence position data table 514 lists combinations of sequence position and scan experiment.
  • [0080] A series of tables in FIGS. 4E-4G correspond to different types of analysis that occur during the course of a polymorphism investigation. The types presented here are merely representative. A parallel series of tables provides the analysis results. A polymorphism analysis table 516 lists references to analysis table 494. The results of the performed polymorphism analyses are listed in a polymorphism position result table 518. A record of this table gives a result for a polymorphism analysis for a particular position as determined based on a particular set of scan experiments. In one embodiment the result is whether a particular mutation is certain, likely, possible, or not possible at the position. The result may also be that the reference is wrong.
  • [0081] A user polymorphism analysis table 520 lists user interpretations of results as listed in polymorphism position result table 518. The records of user polymorphism analysis table 520 are references to analysis table 494. The user interpretations themselves are stored in a user polymorphism analysis result table 522. Each result is a likelihood of a particular mutation at a position as considered by a user plus an accompanying user comment.
  • [0082] A P-Hat analysis estimates the relative concentrations of wild type sequence and sequence having a particular mutation as determined in a particular scan experiment. A P-Hat analysis table 524 lists references to analysis table 494. An atom result table 526 gives estimates of the relative concentration along with upper and lower bounds and a maximum intensity. For heterozygous mutations, the estimates of relative concentration will cluster around 0.5. For homozygous mutations, the estimates should cluster around 1.0.
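  • One way such estimates might be interpreted is sketched below in Python. This is not the P-Hat analysis itself, whose computation is not described here. It only labels an estimate by the nearest expected cluster center. The centers 0.5 and 1.0 come from the text above; the wild type center of 0.0, the tolerance of 0.2, and the function name classify_p_hat are assumptions made for illustration.

```python
def classify_p_hat(p_hat, tolerance=0.2):
    """Label a relative-concentration estimate by the nearest expected cluster.

    The 0.5 and 1.0 centers follow the text; the 0.0 wild type center and the
    tolerance are illustrative assumptions.
    """
    centers = {"wild type": 0.0, "heterozygous": 0.5, "homozygous mutant": 1.0}
    label, center = min(centers.items(), key=lambda kv: abs(p_hat - kv[1]))
    # Estimates far from every center are reported as ambiguous rather than forced.
    return label if abs(p_hat - center) <= tolerance else "ambiguous"
```

  • In practice the atom result table also stores upper and lower bounds for each estimate, which a fuller treatment would take into account instead of a fixed tolerance.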
  • [0083] Base call analyses are determinations of the base at a particular position for a particular individual that may be based on more than one experiment. A base call analysis table 528 lists references to analysis table 494. A base call result table 530 lists the called bases for particular combinations of sequence position and subject.
  • [0084] A P-Hat grouping analysis determines a measure of likelihood that data in a set of scan experiments results from separate genotypes. P-hat grouping analyses are listed in a p-hat grouping analysis table 532 by reference to analysis table 494. P-hat grouping analysis results are listed in a mutation fraction result table 534. A group separation is given for various combinations of sequence position and scan experiment set.
  • [0085] A clustering analysis determines an alternative measure of likelihood that data in a set of scan experiments results from separate genotypes. Clustering analyses are listed in a clustering analysis table 536 by reference to analysis table 494. Clustering analysis results are listed in a clustering result table 538. A clustering factor is given for various combinations of sequence position and scan experiment set.
  • [0086] FIG. 4F shows tables which support normalization and footprint finding operations that support the analyses referred to in FIG. 4E. Hybridization intensity measurements made in scan experiments should be normalized over a set of scan experiments. The normalization should take into account differences in amplification level produced by different PCR processes.
  • [0087] Normalization is done by region of sequence. A normalization region analysis determines the boundaries of a region to be normalized. The determination of boundaries takes into account that different fragments of sequence are amplified by different PCR procedures. A normalization region analysis table 540 lists normalization region analyses by reference to analysis table 494. A normalization region result table 542 lists the boundaries for each determined normalization region.
  • [0088] Normalization values for identified normalization regions are themselves determined by normalization analyses. Normalization analyses are listed in a normalization analysis table 544 by reference to analysis table 494. A normalization result table 546 lists the normalization values for regions.
  • [0089] A footprint analysis determines regions of sequence for which the hybridization intensity is elevated for the purposes of quality control. Footprint analyses are listed in a footprint analysis table 548 by reference to analysis table 494. Footprints are identified by sequence starting point and ending point in a particular scan experiment in a footprint table 550.
  • [0090] FIG. 4G depicts tables pertaining to measurement quality according to one embodiment of the present invention. A tiling data quality analysis determines the quality of results from a scan experiment. These analyses are listed in a tiling data quality analysis table 552 by reference to analysis table 494. Tiling data quality analysis results are listed in a tiling data quality result table 554. The results include an average hybridization intensity value for perfect match or mismatch probes. A wild type call rate gives the fraction of atom data where the probe corresponding to the reference base has the highest hybridization intensity. A wild type call rate of around 1.0 indicates good quality. Where the call rate is less than 0.75, the scan experiment should be rejected. An accept data field indicates whether the analysis indicates rejection or acceptance.
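  • The wild type call rate and the acceptance rule can be sketched in a few lines of Python. The 0.75 threshold comes from the text above. The in-memory representation of atom data (a reference base paired with a mapping from base to intensity), the function names, and the handling of ties (the first base listed wins) are assumptions made for illustration.

```python
def wild_type_call_rate(atoms):
    """Fraction of atoms whose reference-base probe has the highest intensity.

    atoms: iterable of (reference_base, {base: intensity}) pairs
    (a hypothetical representation of atom data).
    """
    atoms = list(atoms)
    if not atoms:
        raise ValueError("no atom data")
    # For each atom, find the base whose probe is brightest and compare to the reference.
    calls = sum(
        1 for reference, intensities in atoms
        if max(intensities, key=intensities.get) == reference
    )
    return calls / len(atoms)


def accept_scan(atoms, min_rate=0.75):
    """Reject a scan experiment whose call rate is below the threshold."""
    return wild_type_call_rate(atoms) >= min_rate
```

  • The boolean returned by accept_scan corresponds to the accept data field described above.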
  • [0091] Where scan experiment measurements indicate two or more non-wild type bases within a probe length, this indicates a measurement problem for the affected region of sequence. These regions are identified by difficult region analyses listed in a difficult region analysis table 556 by reference to analysis table 494. A difficult region result table 558 lists the regions identified as being difficult.
  • [0092] Analysis dependency table 496 indicates interrelationships among the various analyses of FIGS. 4E-4G. A footprint analysis may depend on a normalization analysis which may in turn depend on a normalization region analysis. A basecall analysis or PHatGrouping analysis may depend on an atom analysis. A polymorphism analysis may depend on any of these analyses and/or a user polymorphism analysis and/or a clustering analysis.
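  • Dependencies of this kind determine an order in which analyses can be run or refreshed. The following Python sketch orders the analyses named above with the standard library's graphlib.TopologicalSorter (Python 3.9 or later). The dictionary is a hypothetical in-memory stand-in for records of the analysis dependency table, and the particular set of dependencies chosen for the polymorphism analysis is only one possibility among those the text allows.

```python
from graphlib import TopologicalSorter

# Each analysis maps to the analyses whose results it consumes
# (a hypothetical stand-in for records of the analysis dependency table).
depends_on = {
    "normalization": {"normalization region"},
    "footprint": {"normalization"},
    "base call": {"atom"},
    "p-hat grouping": {"atom"},
    "polymorphism": {"base call", "p-hat grouping", "footprint"},
}

# static_order() yields every analysis after all of the analyses it depends on.
run_order = list(TopologicalSorter(depends_on).static_order())
```

  • A cycle among the dependencies would raise graphlib.CycleError, which is a reasonable safeguard since an analysis cannot depend on its own output.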
  • [0093] Another aspect of the investigation of polymorphisms is seeking patent protection for identified polymorphisms. FIG. 4H shows tables of polymorphism database 102 related to efforts to seek patent protection according to one embodiment of the present invention. A polymorphism patent sequence table 560 lists sequences for which patent protection is sought. A patent application table 562 lists patent applications directed toward the protection of polymorphisms. A polymer patent application sequence map table 564 implements a many-to-many relationship between patent applications and sequences. A prior application table 566 lists relationships between patent applications and prior related patent applications. An attorney table 568 lists attorneys responsible for preparing patent applications listed in patent application table 562. A law firm table 570 lists the law firms to which the attorneys listed in attorney table 568 belong.
  • [0094] An employee group table 572 lists groups of inventors for the patent applications listed in table 562. Individual inventors are listed in employee table 432. An employee group map table 574 implements a many-to-many relationship between inventors and groups of inventors.
  • [0095] The data model of FIG. 4H greatly facilitates the process of securing patent protection for polymorphisms and thereby increases the commercial incentive for investigation of polymorphisms.
  • [0096] Database Contents
  • [0097] The contents of the tables introduced above will now be presented in greater detail in the following chart.
    TABLE FIELD COMMENT
    tblSubject SubjectId: INTEGER Identifies
    biological source
    of sample.
    SpeciesID: INTEGER Species of
    subject.
    Name: VARCHAR2(20) Name of subject
    (anonimized for
    human subjects).
    Gender: VARCHAR2(10) Gender of
    subject.
    Family: VARCHAR2(20) Family of subject
    (anonimized for
    human subjects).
    Member: SMALLINT Position in family
    (father, mother,
    etc.).
    Group: VARCHAR2(20) Ethnic group.
    CellLineID: VARCHAR2(20) Identifier for
    sample source not
    associated with
    particular
    organism.
    IsReference: SMALLINT Whether or not
    subject is in a
    group.
    tblSpecies SpeciesId: INTEGER Species identifier.
    Name: VARCHAR2(30) Name of species.
    SubjectRelationship Subject1: INTEGER First subject in
    relationship
    Subject2: INTEGER Second subject in
    relationship.
    Position: VARCHAR2(2) Nature of
    relationship.
    tblSubjectGroup GroupId: INTEGER Identifier of
    group of subjects
    (not same as
    ethnic group).
    GroupCode: VARCHAR2(20) Code identifier
    for group.
    Comments: LONG VARCHAR User comments
    on group.
    upsize_ts: DATE Creation date for
    group.
    tblSubjectParticipation SubjectId: INTEGER Reference to
    subject table.
    GroupId: INTEGER Reference to
    subject group
    table.
    tblSample SampleId: INTEGER Sample identifier.
    SubjectID: INTEGER Reference to
    subject table.
    SampleSourceId: CHAR(18) Institutional
    source of sample.
    Code: VARCHAR2(20) Code representing
    individual subject.
    Recipient: VARCHAR2(20) Person accepting
    sample.
    Provider: VARCHAR2(20) Person or
    institution
    providing sample.
    DateReceived: DATE Date sample
    received.
    ProtocolId: INTEGER Reference to
    protocol table.
    SampleTypeId: INTEGER Reference to
    sample type table.
    tblSampleType SampleTypeId: INTEGER Sample type
    identifier.
    Description: VARCHAR2(50) Description of
    sample type.
    tblSample Source SampleSourceId: CHAR(18) Identifier of
    institutional
    sample source.
    ProviderName: VARCHAR2(20) Name of
    individual or
    institutional
    sample provider.
    Item ItemId: INTEGER Item identifier.
    ItemTypeId: INTEGER Item type
    identifier.
    ItemName: VARCHAR2(50) Name of item.
    ItemDerivation Item1Id: INTEGER Derivation
    source.
    Item2Id: INTEGER Derivation result.
    EmployeeId: INTEGER Employee
    responsible for
    derivation.
    DerivationTypeId: INTEGER Derivation type
    identifier.
    Protocolid: VARCHAR2(18) Reference to
    protocol table.
    Date: DATE Date of
    derivation.
    tblChip ChipId: INTEGER Rename reference
    to item table.
    ChipDesignPlacementId: INTEGER Placement on
    wafer.
    LocationId: INTEGER Location of chip.
    WaferId: INTEGER Wafer the chip
    was on.
    tblHybedChip HybedChipId: INTEGER Rename reference
    to item table.
    SubjectID: INTEGER Reference to
    subject table.
    ProtocolId: INTEGER Reference to
    protocol table.
    Repetition: SMALLINT Refers to number
    of times chip has
    been washed and
    reused.
    tblHybSampleMap ItemId: INTEGER Reference to item
    table.
    Protocol ProtocolId: INTEGER Protocol
    identifier.
    ProtocolTemplateId: INTEGER Protocol template
    identifier.
    Name: VARCHAR2(100) Name of protocol.
    tblScanExperiment ScanExptId: INTEGER Scan experiment
    identifier.
    ItemId: INTEGER Reference to item
    table.
    ScanCode: VARCHAR2(25) File for scan
    results.
    ProtocolId: INTEGERP Reference to
    protocol table.
    ScanRatingId: INTEGER Assessment of
    scan quality.
    ExperimenterId: INTEGER Experimenter
    identifier.
    Date: DATE Date of
    experiment.
    ConversionTool: VARCHAR2(50) Program used to
    convert from scan
    image to
    intensities.
    ConversionDate: DATE Date of
    conversion.
    ScanStatus: VARCHAR2(50) whether or not
    scan image has
    been converted to
    intensities
    Comments: LONG VARCHAR Comments.
    Employee EmployeeId: INTEGER Employee
    identifier.
    EmployeeCode: VARCHAR2(5) Code for
    employee
    FName: VARCHAR2(20) First name of
    employee.
    MName: VARCHAR2(20) Middle name of
    employee.
    LName: VARCHAR2(20) Last name of
    employee.
    ItemType ItemId: INTEGER Item type
    identifier.
    ItemTypeName: VARCHAR2(30) Name of item
    type.
    FormName: VARCHAR2(100) Reference to user
    interface form for
    item type.
    ItemDerivationType DerivationTypeId: INTEGER Derivation type
    identifier.
    DerivationType: VARCHAR2(50) Description of
    derivation type.
    ItemTypeDerivation NextItemTypeId: INTEGER Result type of
    derivation.
    ItemTypeId: INTEGER Source type of
    derivation.
    ItemAttribute itemAttributeId: INTEGER Item attribute
    identifier.
    ItemAttributeTypeId: INTEGER Reference to item
    attribute type
    table.
    Attribute: VARCHAR2(50) Attribute value.
    ItemAttributeItemMap ItemAttributeId: INGEGER Reference to item
    attribute table.
    ItemId: INTEGER Reference to item
    table.
    ItemAttributeType ItemAttributetypeId: INTEGER Item attribute
    identifier.
    ItemAttributeName: VARCHAR2(30) Name of item
    attribute type.
    ItemTypeMap ItemAttributeTypeId: INTEGER Reference to item
    attribute type
    table.
    ItemTypeId: INTEGER Reference to item
    type table.
    ProtocolTemplate ProtocolTemplateId: INTEGER Protocol template
    identifier.
    Name: VARCHAR2(100) Name of protocol
    template.
    DateCreated: DATE Date protocol
    template created.
    FormName: VARCHAR2(50) Name of the
    electronic form
    used for protocol
    template.
    Parameter ParameterId: INTEGER Parameter
    identifier.
    ParameterTemplateId: INTEGER Reference to
    parameter
    template table.
    Value: VARCHAR2(20) Value of
    parameter.
    ProtocolID: INTEGER Reference to
    protocol table.
    ParameterTemplate ParameterTemplateId: INTEGER Parameter
    template
    identifier.
    Name: VARCHAR2(100) Name of
    parameter
    template.
    ParamTemplateSetId: INTEGER Reference to
    parameter
    template set table.
    StandardValue: VARCHAR2(100) Default value for
    parameter.
    ParamTemplateSet ParamTemplateSetId: INTEGER Parameter
    template set
    identifier.
    TypeId: INTEGER Renamed
    reference to
    parameter
    template set type
    table.
    Name: VARCHAR2(20) Name of
    parameter
    template set.
    ParamTemplateSetType ParamTempSetTypeId: INTEGER Parameter
    template set type
    identifier.
    Description: VARCHAR2(50) Description of
    parameter
    template set type.
    ParameterTemplateSetMap ProtocolTemplateId: INTEGER Reference to
    protocol template
    table.
    ParamTemplateSetId: INTEGER Reference to
    parameter
    template set table.
    ProtocolTemplateDoc ProtocolDocId: INTEGER Protocol Template
    document
    identifier.
    ProtocolTemplateId: INTEGER Reference to
    protcol template
    table.
    Name: VARCHAR2(100) Name of protocol
    template.
    PathAndFileName: VARCHAR2(50) File name for
    protocol template
    document.
    AuthorName: INTEGER Author of
    protocol template
    document.
    CreationDate: DATE Creation date of
    protocol template
    document.
    tbFragment FragmentId: INTEGER Fragment
    identifier.
    ChipSequence: LONG VARCHAR Sequence of
    fragment.
    Code: VARCHAR2(50) Code representing
    fragment.
    tblPrimerPair PrimerPairId: INTEGER Identifier for
    primer pair.
    LeftPrimerId: INTEGER Left primer
    identifier.
    RightPrimerId: INTEGER Right primer
    identifier.
    PCRSize: INTEGER length of
    amplified
    fragment
    Worked: SMALLINT Whether or not
    pair successfully
    amplified
    fragment.
    FragmentId: INTEGER Reference to
    fragment table.
    tblPCR Item1Id: INTEGER First part of
    reference to item
    derivation table.
    Item2Id: INTEGER Second part of
    reference to item
    derivation table.
    Reactionworked: SMALLINT Whether or not
    PCR reaction
    worked.
    PrimePairPCRMap PrimerPairId: INTEGER Reference to
    primer pair table.
    Item1Id: INTEGER First part of
    referenced item
    derivation table.
    Item2Id: INTEGER Second part of
    referenced item
    derivation table.
    tblPrimer PrimerId: INTEGER Primer identifier.
    ProtocolId: INTEGER Reference to
    protocol table.
    OligoSeq: VARCHAR2(35) Sequence of
    primer.
    Position: INTEGER Position of primer
    on fragment.
    Length: INTEGER Length of primer.
    MeltingTemp: INTEGER Melting
    temperature of
    primer.
    Direction: VARCHAR2(20) Direction
    (forward or
    reverse).
    tblPrimerOrder OrderId: INTEGER Order identifier.
    EmployeeId: INTEGER Employee who
    made order.
    VendorId: INTEGER Vendor for order.
    OrderDate: DATE Date of order.
    Owner: VARCHAR2(50) Name of
    employee making
    order.
    Vendor: VARCHAR2(50) Name of vendor.
    tbl Vendor VendorId: INTEGER Vendor identifier.
    Vendor: VARCHAR2(50) Name of vendor.
    PhoneNumber: VARCHAR2(15) Phone number of
    vendor.
    FaxNumber: VARCHAR2(15) Fax Number of
    vendor.
    Address: VARCHAR2(50) Address of
    vendor.
    City: VARCHAR2(50) City of vendor.
    State: VARCHAR2(50) State of vendor.
    Zip: VARCHAR2(50) Zip code of
    vendor.
    tblPrimerOrderDesignMap PrimerId: INTEGER Reference to
    primer table.
    OrderId: INTEGER Reference to
    order table.
    tblWafer WaferId: INTEGER Wafer identifier.
    LotId: INTEGER Lot to which
    wafer belongs.
    Code: VARCHAR2(8) Code for wafer.
    SynthesisDate_delete: DATE Synthesis date for
    wafer.
    Released: DATE Date wafer
    available.
    Done: SMALLINT Whether wafer
    production is
    complete.
    ExpirationDate: DATE Expiration date of
    wafer.
    ExpectedLife: CHAR(18) Expected useful
    life of wafer.
    tblLot LotId: INTEGER Lot identifier.
    WaferDesignId: INTEGER Identifier for
    wafer design.
    LotNumber: VARCHAR2(12) Lot number.
    WaferPN: VARCHAR2(50) Part number for
    wafer.
    tblTiling Design TilingDesignID: INTEGER Tiling design
    identifier.
    ChipDesignSequenceMapID: NUMBER Reference to chip
    design sequence
    map.
    TilingFormatID: INTEGER Reference to
    tiling format
    table.
    UnitNumber: INTEGER 1 for sense, 0 for
    antisense
    AtomOffset: INTEGER # to add to
    translate atom
    position in tiling
    to atom position
    in chip design
    tblTilingFormat TilingFormatID: INTEGER Tiling format
    identifier
    Orientation: CHAR(18) Orientation for
    tiling.
    ProbeLength: SMALLINT Length of probes.
    SubstitutionPosition: SMALLINT Substitution
    position for
    mutation base in
    probes.
    tblAtomDesign AtomDesignId: NUMBER Atom design
    identifier.
    TilingDesignID: INTEGER Reference to
    tiling design
    table.
    Position: INTEGER Position of atom
    in sequence.
    tblProbeDesign ProbeDesignID: NUMBER Probe design
    identifier.
    ChipDesignId: INTEGER Reference to chip
    design
    x: SMALLINT x position of
    probe.
    y: SMALLINT y position of
    probe.
    tblProbeDesignRole ProbeDesignID: NUMBER Reference to
    probe design
    table.
    AtomDesignID: NUMBER Reference to atom
    design table.
    Substitution: CHAR(18) Substitution
    position in probe
    design.
    Mismatches: NUMBER Whether probe is
    match or
    mismatch.
    tblProbeData ProbeDesignID: NUMBER Reference to
    probe design
    table.
    ScanExptID: INTEGER Reference to scan
    experiment table.
    Intensity: FLOAT Measured
    hybridization
    intensity for
    probe.
    NPixels: NUMBER Number of pixels
    used for intensity
    calculation.
    StDev: NUMBER Standard
    deviation for
    pixels.
    tblAnalysis AnalysisId: INTEGER Analysis
    identifier.
    AnalysisVersionID: INTEGER Reference to
    version of
    analysis.
    ProtocolID: INTEGER Reference to
    protocol table.
    DatePerformed: DATE Date analysis
    performed.
    NeedsUpdate: NUMBER Whether analysis
    is current.
    tblAnalysisDependency ParentAnalysisId: INTEGER Analysis
    providing input.
    SubAnalysisId: INTEGER Analysis receiving
    input.
    Role: VARCHAR2(20) Role of data
    provided by
    parent analysis.
    tblAnalysisInput AnalysisInputID: INTEGER Analysis input
    identifier.
    AnalysisId: INTEGER Analysis receiving
    input.
    Inputtype: VARCHAR2(20) Type of input.
    ObjectID: INTEGER Reference to input
    data.
    tblChipDesignSequenceMap ChipDesignSequenceMapID: NUMBER Chip design
    sequence map
    identifier.
    FragmentID: INTEGER Reference to
    fragment table.
    ChipDesignId: INTEGER Chip design
    identifier.
    AtomOffset: NUMBER # to add to
    translate atom
    position in tiling
    to atom position
    in chip design
    tblSequencePosition SequencePositionID: NUMBER Sequence position
    identifier.
    ChipDesignSequenceMapID: NUMBER Reference to chip
    design sequence
    map table.
    Position: NUMBER Position in
    fragment.
    GenomicSequencePositionID: INTEGER Reference to
    genomic sequence
    position table.
    RefBase: INTEGER Reference base.
    tblGenomicSequencePosition GenomicSequencePositionID: INTEGER Genomic
    sequence position
    identifier.
    tblScanExperimentSet ScanExperimentSetID: NUMBER Scan experiment
    set identifier.
    tblScanExperimentUsed ScanExptID: INTEGER Reference to scan
    experiment table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    tblTilingData TilingDataID: NUMBER Tiling data
    identifier.
    ScanExptID: INTEGER Reference to scan
    experiment table.
    TilingDesignID: INTEGER Reference to
    tiling design
    table.
    tblAtomData AtomDataID: INTEGER Atom data
    identifier.
    TilingDataID: NUMBER Reference to
    tiling data table.
    SubjectSequencePositionID: INTEGER Reference to
    subject sequence
    position table.
    tblSubjectSequencePosition SubjectSequencePositionID: INTEGER Subject sequence
    position identifier.
    SubjectID: INTEGER Reference to
    subject table.
    SequencePositionID: NUMBER Reference to
    sequence position
    table.
    tblPolymorphismAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblPolyPositionResult AnalysisId: INTEGER Reference to
    analysis table.
    PolyPositionID: INTEGER Polymorphism
    position identifier.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    PolyPositionTypeID: INTEGER Refers to
    possibility of
    polymorphism at
    position, e.g.,
    certain, likely,
    possible,
    mismatch
    (reference is
    wrong).
    WTBase: CHAR(18) Wild type base at
    position.
    MuBase: INTEGER Mutation base at
    position.
    tblUserPolyAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblUserPolyAnalysisResult AnalysisId: INTEGER Reference to
    analysis table.
    SequencePositionID: NUMBER Reference to
    sequence position
    table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    PolyPositionTypeID: INTEGER See
    polymorphism
    position result
    table.
    UserComment: VARCHAR2(256) User comment
    on
    polymorphism
    analysis.
    tblAtomAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblAtomResult AnalysisId: INTEGER Reference to
    analysis table.
    AtomDataID: INTEGER Reference to atom
    data table.
    PHat: FLOAT Relative
    concentration of
    mutant and wild
    type.
    PHatUpperbound: FLOAT Upperbound for
    relative
    concentration.
    PHatLowerbound: FLOAT Lowerbound for
    relative
    concentration.
    MaxIntensity: FLOAT Maximum
    measured
    intensity for
    atom.
    WTIntensity: FLOAT Measured wild
    type intensity.
    MutIntensity: FLOAT Measured
    mutation
    intensity.
    LocalWTCallRate: FLOAT rate at which
    atoms associated
    with surrounding
    sequence call
    reference base
    IntensityRatio: FLOAT Ratio of intensity
    of wild type probe
    over intensity of
    mutation probe.
    tblBaseCallAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblBaseCallResult AnalysisId: INTEGER Reference to
    analysis table.
    SubjectSequencePositionID: INTEGER Reference to
    sequence position
    table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    CalledBase: VARCHAR2(1) Base called for
    subject based on
    experiment set.
    SuggestCheck: NUMBER Used to indicate
    whether this
    sample should be
    used for
    resequencing
    tblClusteringAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblClusteringResult AnalysisId: INTEGER Reference to
    analysis table.
    SequencePositionID: NUMBER Reference to
    sequence position
    table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    ClusteringFactor: FLOAT Result of
    clustering
    analysis.
    tblNormalizationRegionAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblNormalizationRegion NormalizationRegionID: INTEGER Normalization
    region identifier.
    AnalysisId: INTEGER Reference to
    analysis table.
    ChipDesignSequenceMapID: NUMBER Reference to chip
    design sequence
    map table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    RegionEnd: INTEGER Indication of end
    of the
    normalization
    region.
    RegionStart: INTEGER Indication of
    beginning of the
    normalization
    region.
    tblNormalizationAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblNormalizationResult NormalizationResultID: INTEGER Normalization
    result identifier.
    AnalysisId: INTEGER Reference to
    analysis table.
    TilingDataID: INTEGER Reference to
    tiling data table.
    NormalizationRegionResultID: INTEGER Reference to
    normalization
    result.
    NormalizationValue: NUMBER Value used for
    normalization.
    DataOK: NUMBER Indication
    whether
    normalization
    result is usable.
    tblFootprintAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblFootprint FootprintID: NUMBER Footprint
    identifier.
    AnalysisId: INTEGER Analysis
    identifier.
    ChipDesignSequenceMapID: NUMBER Reference to chip
    design sequence
    map table.
    ScanExperimentSetID: NUMBER Reference to scan
    experiment set
    table.
    FPStart: NUMBER Start of footprint
    in sequence.
    FPEnd: NUMBER End of footprint
    in sequence.
    tblTilingDataQualityAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblTilingDataQualityResult TilingDataID: NUMBER Reference to
    tiling data table.
    AnalysisId: INTEGER Reference to
    analysis table.
    AvgWTIntensity: NUMBER Average wild type
    intensity.
    WTCallRate: NUMBER Fraction of atoms
    where brightest of
    probes is one with
    reference base.
    AcceptData: INTEGER Whether data is of
    acceptable
    quality.
    tblDifficultRegionAnalysis AnalysisId: INTEGER Reference to
    analysis table.
    tblDifficultRegionResult ScanExptId: INTEGER Reference to scan
    experiment table.
    AnalysisId: INTEGER Reference to
    analysis table.
    ChipDesignSequenceMapID: NUMBER Reference to chip
    design sequence
    map table.
    RgnStart: NUMBER Beginning of
    difficult region in
    sequence.
    RgnEnd: NUMBER End of difficult
    region in
    sequence.
    Reason: INTEGER Code indicating
    reason for
    difficult region,
    e.g., two or more
    non-wild type
    bases and less
    than a probe
    length.
    tblPolyPatentSeq PolyPatentSeqId: NUMBER Polymorphism
    sequence
    identifier.
    Polyscreen: VARCHAR2(50) reference to
    internal grouping
    of polymorphisms
    FragmentCode: VARCHAR2(50) Fragment
    sequence found in
    Position: LONG Position of
    polymorphism.
    RefAllele: CHAR(2) Wild type base at
    position.
    FreqP: FLOAT Frequency of wild
    type.
    AltAllele: CHAR(2) Mutation base at
    position.
    FreqQ: FLOAT Frequency of
    mutation base.
    Heterozygocity: FLOAT Heterozygosity
    value.
    SequenceTag: VARCHAR2(50) Sequence
    containing
    polymorphism
    including
    ambiguity code at
    polymorphism
    position.
    GeneName: VARCHAR2(50) Name of gene.
    ChromosomeNum: VARCHAR2(20) Chromosome
    number.
    ChromosomeLoc: VARCHAR2(20) Location of gene
    on chromosome.
    ForwardPrimer: VARCHAR2(50) Identifier for
    forward primer
    used to amplify
    fragment.
    ReversePrimer: VARCHAR2(50) Identifier of
    reverse primer used to
    amplify fragment.
    tblPatentApp PatentAppId: NUMBER Patent application
    identifier.
    GroupId: NUMBER Reference to
    employee group
    table.
    AttorneyId: NUMBER Reference to
    attorney table.
    DocketNum: VARCHAR2(30) Docket number
    for patent
    application.
    FilingDate: DATE Filing date for
    patent application.
    Classification: VARCHAR2(30) Patent office
    classification for
    patent application.
    SerialNumber: VARCHAR2(50) Serial number
    assigned by patent
    office.
    CountryCode: VARCHAR2(50) Country in which
    patent application
    was filed.
    InventionTitle: VARCHAR2(100) Title for patent
    application
    tblPolyPatentSeqMap PatentAppId: NUMBER Reference to
    patent application
    table.
    PolyPatentSeqId: NUMBER Reference to
    polymorphism
    patent sequence
    table.
    tblPriorApp PriorAppId: NUMBER Reference to
    related prior
    patent application
    in patent
    application table.
    AppId: NUMBER Reference to
    application to
    which prior
    application is
    related.
    tblAttorney AttorneyId: NUMBER Attorney
    identifier.
    LawFirmId: NUMBER Law firm where
    attorney works.
    FirstName: VARCHAR2(20) First name of
    attorney.
    MiddleName: VARCHAR2(5) Middle name of
    attorney.
    LastName: VARCHAR2(30) Last name of
    attorney.
    RegistrationNum: VARCHAR2(25) Patent office
    registration
    number of
    attorney.
    tblLawFirm LawFirmId: NUMBER Law firm
    identifier.
    Company: VARCHAR2(100) Name of law
    firm.
    Address: VARCHAR2(100) Address of law
    firm.
    City: VARCHAR2(30) City address of
    law firm.
    State: VARCHAR2(20) State address of
    law firm.
    ZipCode: VARCHAR2(15) Zip Code of law
    firm.
    Country: VARCHAR2(15) Country of law
    firm.
    Telephone: VARCHAR2(30) Telephone
    number of law
    firm.
    Fax: VARCHAR2(30) Facsimile number
    of law firm.
    TELEX: VARCHAR2(20) Telex number of
    law firm.
    tblEmployeeGroup GroupId: NUMBER Identifier for
    inventor group.
    GroupName: VARCHAR2(50) Name of inventor
    group.
    Comments: VARCHAR2(50) Comments.
    GroupList: VARCHAR2(255) Written out list of
    inventor names.
    tblEmployeeGrpMap EmployeeId: INTEGER Reference to
    employee table
    for
    inventor/employees.
    GroupId: NUMBER Reference to
    inventor group
    table.
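As a rough illustration of how the atom result rows listed above relate to subjects and sequence positions, the following Python sketch builds a trimmed subset of tblSubjectSequencePosition, tblAtomData, and tblAtomResult. The use of an in-memory SQLite database, the helper function names, and any sample values are assumptions made for illustration only; the table above lists Oracle-style column types and does not prescribe an implementation. IntensityRatio is computed as wild type intensity over mutation intensity, following the column description.

```python
import sqlite3

# Trimmed subset of the listed columns; SQLite is used here only for illustration.
ATOM_SCHEMA = """
CREATE TABLE tblSubjectSequencePosition (
    SubjectSequencePositionID INTEGER PRIMARY KEY,
    SubjectID INTEGER,
    SequencePositionID NUMBER
);
CREATE TABLE tblAtomData (
    AtomDataID INTEGER PRIMARY KEY,
    TilingDataID NUMBER,
    SubjectSequencePositionID INTEGER
        REFERENCES tblSubjectSequencePosition (SubjectSequencePositionID)
);
CREATE TABLE tblAtomResult (
    AnalysisId INTEGER,
    AtomDataID INTEGER REFERENCES tblAtomData (AtomDataID),
    PHat FLOAT,
    PHatUpperbound FLOAT,
    PHatLowerbound FLOAT,
    WTIntensity FLOAT,
    MutIntensity FLOAT,
    IntensityRatio FLOAT
);
"""


def open_atom_db():
    """Hypothetical helper: create the trimmed tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ATOM_SCHEMA)
    return conn


def store_atom_result(conn, analysis_id, atom_data_id, p_hat, lower, upper, wt, mut):
    """Insert one atom result row; IntensityRatio is wild type over mutation intensity."""
    ratio = wt / mut if mut else None  # stored as NULL when mutation intensity is zero
    conn.execute(
        "INSERT INTO tblAtomResult (AnalysisId, AtomDataID, PHat, PHatUpperbound,"
        " PHatLowerbound, WTIntensity, MutIntensity, IntensityRatio)"
        " VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (analysis_id, atom_data_id, p_hat, upper, lower, wt, mut, ratio),
    )


def results_for_subject(conn, subject_id):
    """Return (sequence position, PHat, lower bound, upper bound, ratio) per atom."""
    return conn.execute(
        """
        SELECT ssp.SequencePositionID, r.PHat, r.PHatLowerbound,
               r.PHatUpperbound, r.IntensityRatio
        FROM tblAtomResult AS r
        JOIN tblAtomData AS d ON d.AtomDataID = r.AtomDataID
        JOIN tblSubjectSequencePosition AS ssp
          ON ssp.SubjectSequencePositionID = d.SubjectSequencePositionID
        WHERE ssp.SubjectID = ?
        ORDER BY ssp.SequencePositionID
        """,
        (subject_id,),
    ).fetchall()
```

The join path (atom result to atom data to subject sequence position) follows the reference columns given in the table; everything else is a simplification.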
  • It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. For example, tables may be deleted, contents of multiple tables may be consolidated, or contents of one or more tables may be distributed among more tables than described herein to improve query speeds and/or to aid system maintenance. Also, the database architecture and data models described herein are not limited to biological applications but may be used in any application. All publications, patents, and patent applications cited herein are hereby incorporated by reference. [0098]
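The consolidation of table contents mentioned above can be sketched as follows. This Python example assumes an in-memory SQLite database and a hypothetical consolidated table name, tblPrimerOrderFlat, which is not part of the listed schema. It copies vendor details from tblVendor into each primer order row so that a later lookup needs no join. Only a few of the listed columns are kept.

```python
import sqlite3

# Trimmed versions of two tables from the listing above.
NORMALIZED = """
CREATE TABLE tblVendor (
    VendorId INTEGER PRIMARY KEY,
    Vendor VARCHAR2(50),
    PhoneNumber VARCHAR2(15)
);
CREATE TABLE tblPrimerOrder (
    OrderId INTEGER PRIMARY KEY,
    VendorId INTEGER REFERENCES tblVendor (VendorId),
    OrderDate DATE
);
"""


def consolidate_primer_orders(conn):
    """Build tblPrimerOrderFlat (hypothetical name) from the two normalized tables."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS tblPrimerOrderFlat;
        CREATE TABLE tblPrimerOrderFlat AS
        SELECT o.OrderId, o.OrderDate, v.Vendor, v.PhoneNumber
        FROM tblPrimerOrder AS o
        LEFT JOIN tblVendor AS v ON v.VendorId = o.VendorId;
        """
    )


def vendor_phone_for_order(conn, order_id):
    """Single-table lookup against the consolidated copy; None if no such order."""
    return conn.execute(
        "SELECT Vendor, PhoneNumber FROM tblPrimerOrderFlat WHERE OrderId = ?",
        (order_id,),
    ).fetchone()
```

The trade-off in this sketch is the usual one: the consolidated copy answers lookups from one table but must be rebuilt whenever the source tables change.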

Claims (23)

What is claimed is:
1. A computer-readable storage medium having stored thereon:
an item table listing a plurality of item records identifying items;
an item attribute table listing a plurality of item attribute records identifying attributes of said items; and
wherein there is a many-to-many relationship between item records and item attribute records.
2. The computer-readable storage medium of claim 1 wherein an item attribute item map table implements said many-to-many relationship between item records and item attribute records, said item attribute item map table listing a plurality of map records identifying both a particular item attribute and a particular item.
3. The computer-readable storage medium of claim 1 having further stored thereon:
an item derivation table listing a plurality of item derivation records identifying transformations between ones of said items used in biological analysis.
4. The computer-readable storage medium of claim 3 having further stored thereon:
a protocol table listing a plurality of protocol records specifying parameters of said transformation.
5. The computer-readable storage medium wherein said items are used in a biological analysis.
6. The computer-readable storage medium of claim 1 wherein said biological analysis comprises a polymorphism analysis.
7. A computer-readable storage medium having stored thereon:
an atom result table listing a plurality of atom result records, specifying relative wild-type and mutant sequence concentrations in targets; and
a subject sequence position table listing a plurality of subject sequence position records, specifying combinations of subjects from whom said targets are derived and sequence positions, each said atom result record being associated with one or more atom result records.
8. The computer-readable storage medium of claim 7 wherein said atom result records further specify upper and lower bounds for said concentrations.
9. The computer-readable storage medium of claim 7 having further stored thereon:
a subject table listing subject records specifying said subjects.
10. A computer-readable storage medium having stored thereon:
a polymorphism table listing polymorphism sequence records specifying sequences known to contain polymorphisms; and
a patent application table listing patent application records specifying one or more polymorphisms specified by said polymorphism sequence records.
11. The computer-readable storage medium of claim 10 wherein said polymorphism sequence records specify for each one of said polymorphisms a polymorphism position, a reference allele, and a base allele.
12. The computer-readable storage medium of claim 11 wherein said polymorphism sequence records further specify for each one of said polymorphisms a measured heterozygosity.
13. A computer-implemented method comprising:
creating an item table listing a plurality of item records identifying items used in biological analysis; and
creating an item attribute table listing a plurality of item attribute records identifying attributes of said items; and
wherein there is a many-to-many relationship between item records and item attribute records.
14. The computer-implemented method of claim 13 further comprising the step of:
creating an item attribute item map table that implements said many-to-many relationship between item records and item attribute records, said item attribute item map table listing a plurality of map records identifying both a particular item attribute and a particular item.
15. The computer-implemented method of claim 13 further comprising: creating an item derivation table listing a plurality of item derivation records identifying transformations between ones of said items used in biological analysis.
16. The computer-implemented method of claim 15 further comprising:
creating a protocol table listing a plurality of protocol records specifying parameters of said transformation.
17. The computer-implemented method of claim 13 wherein said biological analysis comprises a polymorphism analysis.
18. A computer-implemented method comprising:
creating an atom result table listing a plurality of atom result records, specifying relative wild-type and mutant sequence concentrations in targets; and creating a subject sequence position table listing a plurality of subject sequence position records, specifying combinations of subjects from whom said targets are derived and sequence positions, each said atom result record being associated with one or more atom result records.
19. The computer-implemented method of claim 18 wherein said atom result records further specify upper and lower bounds for said concentrations.
20. The computer-implemented method of claim 18 further comprising:
creating a subject table listing subject records specifying said subjects.
21. A computer-implemented method comprising:
creating a polymorphism table listing polymorphism sequence records specifying sequences known to contain polymorphisms; and
creating a patent application table listing patent application records specifying one or more polymorphisms specified by said polymorphism sequence records.
22. The computer-implemented method of claim 21 wherein said polymorphism sequence records specify for each one of said polymorphisms a polymorphism position, a reference allele, and a base allele.
23. The computer-implemented method of claim 22 wherein said polymorphism sequence records further specify for at least one of said polymorphisms a measured heterozygosity.
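
The many-to-many relationship recited in claims 1, 2, 13, and 14 can be illustrated with a short Python sketch using an in-memory SQLite database. The table and column names (tblItem, tblItemAttribute, tblItemAttributeItemMap, and their columns) and the helper functions are assumptions chosen to resemble the naming pattern of the schema listing; the claims themselves do not fix names, column sets, or a database engine.

```python
import sqlite3

# Hypothetical table and column names; only the three-table shape comes from the claims.
ITEM_SCHEMA = """
CREATE TABLE tblItem (
    ItemId INTEGER PRIMARY KEY,
    Name VARCHAR2(50)
);
CREATE TABLE tblItemAttribute (
    ItemAttributeId INTEGER PRIMARY KEY,
    Name VARCHAR2(50),
    AttributeValue VARCHAR2(50)
);
-- Each map record identifies one item and one item attribute; the composite
-- key keeps the same pairing from being recorded twice.
CREATE TABLE tblItemAttributeItemMap (
    ItemId INTEGER NOT NULL REFERENCES tblItem (ItemId),
    ItemAttributeId INTEGER NOT NULL REFERENCES tblItemAttribute (ItemAttributeId),
    PRIMARY KEY (ItemId, ItemAttributeId)
);
"""


def open_item_db():
    """Create the three illustrative tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ITEM_SCHEMA)
    return conn


def link(conn, item_id, attribute_id):
    """Add one map record tying an item to an item attribute."""
    conn.execute(
        "INSERT INTO tblItemAttributeItemMap (ItemId, ItemAttributeId) VALUES (?, ?)",
        (item_id, attribute_id),
    )


def attributes_of_item(conn, item_id):
    """All (name, value) attribute pairs mapped to one item."""
    return conn.execute(
        """
        SELECT a.Name, a.AttributeValue
        FROM tblItemAttribute AS a
        JOIN tblItemAttributeItemMap AS m ON m.ItemAttributeId = a.ItemAttributeId
        WHERE m.ItemId = ?
        ORDER BY a.ItemAttributeId
        """,
        (item_id,),
    ).fetchall()


def items_with_attribute(conn, attribute_id):
    """Names of all items mapped to one attribute."""
    rows = conn.execute(
        """
        SELECT i.Name
        FROM tblItem AS i
        JOIN tblItemAttributeItemMap AS m ON m.ItemId = i.ItemId
        WHERE m.ItemAttributeId = ?
        ORDER BY i.ItemId
        """,
        (attribute_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Because the relationship lives entirely in the map table, one item can carry several attributes and one attribute can be shared by several items without duplicating rows in either base table.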
US10/219,021 1997-07-25 2002-08-14 Method and system for providing a polymorphism database Abandoned US20030074363A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/219,021 US20030074363A1 (en) 1997-07-25 2002-08-14 Method and system for providing a polymorphism database
US11/038,624 US20050164270A1 (en) 1997-07-25 2005-01-18 Methods and system for providing a polymorphism database

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US5384297P 1997-07-25 1997-07-25
US6919897P 1997-12-11 1997-12-11
US6943697P 1997-12-11 1997-12-11
US09/122,169 US6484183B1 (en) 1997-07-25 1998-07-24 Method and system for providing a polymorphism database
US10/219,021 US20030074363A1 (en) 1997-07-25 2002-08-14 Method and system for providing a polymorphism database

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/122,169 Continuation US6484183B1 (en) 1997-07-25 1998-07-24 Method and system for providing a polymorphism database

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/038,624 Continuation US20050164270A1 (en) 1997-07-25 2005-01-18 Methods and system for providing a polymorphism database

Publications (1)

Publication Number Publication Date
US20030074363A1 true US20030074363A1 (en) 2003-04-17

Family

ID=27368502

Family Applications (10)

Application Number Title Priority Date Filing Date
US09/122,304 Expired - Lifetime US6188783B1 (en) 1997-07-25 1998-07-24 Method and system for providing a probe array chip design database
US09/122,169 Expired - Lifetime US6484183B1 (en) 1997-07-25 1998-07-24 Method and system for providing a polymorphism database
US09/122,167 Expired - Lifetime US6229911B1 (en) 1997-07-25 1998-07-24 Method and apparatus for providing a bioinformatics database
US09/122,434 Expired - Lifetime US6308170B1 (en) 1997-07-25 1998-07-24 Gene expression and evaluation system
US09/836,867 Expired - Lifetime US6567540B2 (en) 1997-07-25 2001-04-16 Method and apparatus for providing a bioinformatics database
US09/940,285 Expired - Lifetime US6532462B2 (en) 1997-07-25 2001-08-27 Gene expression and evaluation system using a filter table with a gene expression database
US10/219,021 Abandoned US20030074363A1 (en) 1997-07-25 2002-08-14 Method and system for providing a polymorphism database
US10/374,170 Expired - Lifetime US6882742B2 (en) 1997-07-25 2003-02-25 Method and apparatus for providing a bioinformatics database
US11/038,624 Abandoned US20050164270A1 (en) 1997-07-25 2005-01-18 Methods and system for providing a polymorphism database
US11/080,216 Expired - Fee Related US7215804B2 (en) 1997-07-25 2005-03-14 Method and apparatus for providing a bioinformatics database

Country Status (6)

Country Link
US (10) US6188783B1 (en)
EP (4) EP0998697A4 (en)
JP (6) JP3776728B2 (en)
AT (1) ATE264523T1 (en)
DE (1) DE69823206T2 (en)
WO (4) WO1999005574A1 (en)

Families Citing this family (323)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1223831A (en) 1982-06-23 1987-07-07 Dean Engelhardt Modified nucleotides, methods of preparing and utilizing and compositions containing the same
US5989835A (en) 1997-02-27 1999-11-23 Cellomics, Inc. System for cell-based screening
US7068830B2 (en) * 1997-07-25 2006-06-27 Affymetrix, Inc. Method and system for providing a probe array chip design database
US6826296B2 (en) * 1997-07-25 2004-11-30 Affymetrix, Inc. Method and system for providing a probe array chip design database
DE69823206T2 (en) * 1997-07-25 2004-08-19 Affymetrix, Inc. (a Delaware Corp.), Santa Clara METHOD FOR PRODUCING A BIO-INFORMATICS DATABASE
US6420108B2 (en) * 1998-02-09 2002-07-16 Affymetrix, Inc. Computer-aided display for comparative gene expression
US6349144B1 (en) * 1998-02-07 2002-02-19 Biodiscovery, Inc. Automated DNA array segmentation and analysis
US6990221B2 (en) * 1998-02-07 2006-01-24 Biodiscovery, Inc. Automated DNA array image segmentation and analysis
EP1066576A1 (en) * 1998-03-26 2001-01-10 Incyte Pharmaceuticals, Inc. System and methods for analyzing biomolecular sequences
US6324479B1 (en) * 1998-05-08 2001-11-27 Rosetta Inpharmatics, Inc. Methods of determining protein activity levels using gene expression profiles
US6606622B1 (en) * 1998-07-13 2003-08-12 James M. Sorace Software method for the conversion, storage and querying of the data of cellular biological assays on the basis of experimental design
US20040199544A1 (en) * 2000-11-02 2004-10-07 Affymetrix, Inc. Method and apparatus for providing an expression data mining database
US6185561B1 (en) * 1998-09-17 2001-02-06 Affymetrix, Inc. Method and apparatus for providing and expression data mining database
DE69924645T2 (en) 1998-11-13 2006-03-02 Cellomics, Inc. METHOD AND SYSTEM FOR EFFICIENTLY GAINING AND STORING EXPERIMENTAL DATA
US6453241B1 (en) * 1998-12-23 2002-09-17 Rosetta Inpharmatics, Inc. Method and system for analyzing biological response signal data
US6245511B1 (en) * 1999-02-22 2001-06-12 Vialogy Corp Method and apparatus for exponentially convergent therapy effectiveness monitoring using DNA microarray based viral load measurements
US6136541A (en) 1999-02-22 2000-10-24 Vialogy Corporation Method and apparatus for analyzing hybridized biochip patterns using resonance interactions employing quantum expressor functions
US20040111219A1 (en) * 1999-02-22 2004-06-10 Sandeep Gulati Active interferometric signal analysis in software
US6142681A (en) 1999-02-22 2000-11-07 Vialogy Corporation Method and apparatus for interpreting hybridized bioelectronic DNA microarray patterns using self-scaling convergent reverberant dynamics
US6507788B1 (en) * 1999-02-25 2003-01-14 Société de Conseils de Recherches et D'Applications Scientifiques (S.C.R.A.S.) Rational selection of putative peptides from identified nucleotide, or peptide sequences, of unknown function
US6215894B1 (en) * 1999-02-26 2001-04-10 General Scanning, Incorporated Automatic imaging and analysis of microarray biochips
JP2000258368A (en) * 1999-03-12 2000-09-22 Jeol Ltd X-ray microanalyzer having sound monitor function
EP1041514B1 (en) * 1999-03-30 2006-03-01 Fuji Photo Film Co., Ltd. Method and apparatus for selectively displaying measurement result and corresponding images
WO2000062205A1 (en) * 1999-04-13 2000-10-19 Schulze Michael D Method of obtaining an electronically-stored financial document
US6743576B1 (en) 1999-05-14 2004-06-01 Cytokinetics, Inc. Database system for predictive cellular bioinformatics
US20030228565A1 (en) * 2000-04-26 2003-12-11 Cytokinetics, Inc. Method and apparatus for predictive cellular bioinformatics
US7151847B2 (en) 2001-02-20 2006-12-19 Cytokinetics, Inc. Image analysis of the golgi complex
US6651008B1 (en) 1999-05-14 2003-11-18 Cytokinetics, Inc. Database system including computer code for predictive cellular bioinformatics
US6876760B1 (en) 2000-12-04 2005-04-05 Cytokinetics, Inc. Classifying cells based on information contained in cell images
US6334099B1 (en) 1999-05-25 2001-12-25 Digital Gene Technologies, Inc. Methods for normalization of experimental data
JP3469504B2 (en) * 1999-06-01 2003-11-25 日立ソフトウエアエンジニアリング株式会社 Microarray chip and indexing method thereof
EP1185701A1 (en) * 1999-06-11 2002-03-13 Clingenix, Inc. Gene specific arrays and the use thereof
US6716579B1 (en) 1999-06-11 2004-04-06 Narayan Baidya Gene specific arrays, preparation and use
US7058517B1 (en) 1999-06-25 2006-06-06 Genaissance Pharmaceuticals, Inc. Methods for obtaining and using haplotype data
US6931396B1 (en) 1999-06-29 2005-08-16 Gene Logic Inc. Biological data processing
US6631211B1 (en) * 1999-07-08 2003-10-07 Perkinelmer Las, Inc. Interactive system for analyzing scatter plots
US6470277B1 (en) 1999-07-30 2002-10-22 Agy Therapeutics, Inc. Techniques for facilitating identification of candidate genes
US7062076B1 (en) 1999-08-27 2006-06-13 Iris Biotechnologies, Inc. Artificial intelligence system for genetic analysis
JP2003508853A (en) * 1999-08-27 2003-03-04 アイリス・バイオ・テクノロジーズ・インコーポレイテッド Artificial intelligence system for gene analysis
WO2001018524A2 (en) 1999-08-30 2001-03-15 Illumina, Inc. Methods for improving signal detection from an array
US7099502B2 (en) * 1999-10-12 2006-08-29 Biodiscovery, Inc. System and method for automatically processing microarrays
JP2003519829A (en) 1999-10-13 2003-06-24 シークエノム・インコーポレーテツド Methods for creating a database and a database for identifying polymorphic genetic markers
AU1574801A (en) * 1999-10-26 2001-05-08 Genometrix Genomics Incorporated Process for requesting biological experiments and for the delivery of experimental information
NZ539430A (en) 1999-12-10 2006-09-29 Invitrogen Corp Use of multiple recombination sites with unique specificity in recombinational cloning
CN100350406C (en) * 2000-01-25 2007-11-21 阿菲梅特里克斯公司 Method, system and computer software for providing genomic web portal
US20030097222A1 (en) * 2000-01-25 2003-05-22 Craford David M. Method, system, and computer software for providing a genomic web portal
WO2001055951A2 (en) 2000-01-25 2001-08-02 Cellomics, Inc. Method and system for automated inference of physico-chemical interaction knowledge
US7582420B2 (en) 2001-07-12 2009-09-01 Illumina, Inc. Multiplex nucleic acid reactions
US20050214825A1 (en) * 2000-02-07 2005-09-29 John Stuelpnagel Multiplex sample analysis on universal arrays
US8076063B2 (en) 2000-02-07 2011-12-13 Illumina, Inc. Multiplexed methylation detection methods
US7955794B2 (en) * 2000-09-21 2011-06-07 Illumina, Inc. Multiplex nucleic acid reactions
US6770441B2 (en) * 2000-02-10 2004-08-03 Illumina, Inc. Array compositions and methods of making same
NZ521626A (en) 2000-03-29 2005-09-30 Cambia Methods for genotyping by hybridization analysis
DE10015816A1 (en) * 2000-03-30 2001-10-18 Infineon Technologies Ag Biosensor chip
AU2001251034A1 (en) * 2000-03-31 2001-10-15 Gene Logic, Inc Gene expression profiles in esophageal tissue
AU784944B2 (en) * 2000-04-18 2006-08-03 Combimatrix Corporation Automated system and process for custom-designed biological array design and analysis
US20030009295A1 (en) * 2001-03-14 2003-01-09 Victor Markowitz System and method for retrieving and using gene expression data from multiple sources
US20030171876A1 (en) * 2002-03-05 2003-09-11 Victor Markowitz System and method for managing gene expression data
US7020561B1 (en) 2000-05-23 2006-03-28 Gene Logic, Inc. Methods and systems for efficient comparison, identification, processing, and importing of gene expression data
US7577683B2 (en) * 2000-06-08 2009-08-18 Ingenuity Systems, Inc. Methods for the construction and maintenance of a knowledge representation system
US6772160B2 (en) * 2000-06-08 2004-08-03 Ingenuity Systems, Inc. Techniques for facilitating information acquisition and storage
US6741986B2 (en) * 2000-12-08 2004-05-25 Ingenuity Systems, Inc. Method and system for performing information extraction and quality control for a knowledgebase
US6931326B1 (en) 2000-06-26 2005-08-16 Genaissance Pharmaceuticals, Inc. Methods for obtaining and using haplotype data
US20020152196A1 (en) * 2000-07-07 2002-10-17 Westbrook Carol A. cDNA database and biochip for analysis of hematopoietic tissue
JP3517644B2 (en) * 2000-07-13 2004-04-12 孝 五條堀 Method and system for displaying expression phenomenon of living organisms and program
US20020059326A1 (en) * 2000-07-25 2002-05-16 Derek Bernhart System, method, and computer program product for management of biological experiment information
NL1016034C2 (en) 2000-08-03 2002-02-08 Tno Method and system for identifying and quantifying chemical components of a mixture of materials to be investigated.
US7198924B2 (en) 2000-12-11 2007-04-03 Invitrogen Corporation Methods and compositions for synthesis of nucleic acid molecules using multiple recognition sites
EP1573634A2 (en) 2000-08-22 2005-09-14 Affymetrix, Inc. System, method, and computer software product for controlling biological microarray scanner
WO2002017190A1 (en) * 2000-08-22 2002-02-28 Varro Technologies, Inc. Method and system for sharing biological information
US7062092B2 (en) 2000-08-22 2006-06-13 Affymetrix, Inc. System, method, and computer software product for gain adjustment in biological microarray scanner
GB0021286D0 (en) * 2000-08-30 2000-10-18 Gemini Genomics Ab Identification of drug metabolic capacity
US6539102B1 (en) * 2000-09-01 2003-03-25 Large Scale Proteomics Reference database
US6813615B1 (en) 2000-09-06 2004-11-02 Cellomics, Inc. Method and system for interpreting and validating experimental data with automated reasoning
WO2002023308A2 (en) * 2000-09-12 2002-03-21 Viaken Systems, Inc. Techniques for providing and obtaining research and development information technology on remote computing resources
WO2002034944A1 (en) * 2000-10-23 2002-05-02 Dia Chip Limited High precision and intellectual biochip arrayer having function of respotting
US7117095B2 (en) * 2000-11-21 2006-10-03 Affymetrix, Inc. Methods for selecting nucleic acid probes
US8255791B2 (en) * 2000-11-29 2012-08-28 Dov Koren Collaborative, flexible, interactive real-time displays
US7218764B2 (en) 2000-12-04 2007-05-15 Cytokinetics, Inc. Ploidy classification method
US6706867B1 (en) * 2000-12-19 2004-03-16 The United States Of America As Represented By The Department Of Health And Human Services DNA array sequence selection
US20020143768A1 (en) * 2000-12-21 2002-10-03 Anthony Berno Probe array data storage and retrieval
US20020192671A1 (en) * 2001-01-23 2002-12-19 Castle Arthur L. Method and system for predicting the biological activity, including toxicology and toxicity, of substances
US20020183936A1 (en) * 2001-01-24 2002-12-05 Affymetrix, Inc. Method, system, and computer software for providing a genomic web portal
WO2002059271A2 (en) * 2001-01-25 2002-08-01 Gene Logic, Inc. Gene expression profiles in breast tissue
US20030017455A1 (en) * 2001-01-29 2003-01-23 Webb Peter G. Chemical array fabrication with identity map
US6949638B2 (en) * 2001-01-29 2005-09-27 Affymetrix, Inc. Photolithographic method and system for efficient mask usage in manufacturing DNA arrays
US7315784B2 (en) 2001-02-15 2008-01-01 Siemens Aktiengesellschaft Network for evaluating data obtained in a biochip measurement device
US6956961B2 (en) 2001-02-20 2005-10-18 Cytokinetics, Inc. Extracting shape information contained in cell images
US7016787B2 (en) 2001-02-20 2006-03-21 Cytokinetics, Inc. Characterizing biological stimuli by response curves
JP3867046B2 (en) * 2001-02-23 2007-01-10 株式会社日立製作所 Analysis system
US7110885B2 (en) 2001-03-08 2006-09-19 Dnaprint Genomics, Inc. Efficient methods and apparatus for high-throughput processing of gene sequence data
US20020168651A1 (en) * 2001-03-12 2002-11-14 Affymetrix, Inc. Method and computer software product for determining orientation of sequence clusters
US6804679B2 (en) 2001-03-12 2004-10-12 Affymetrix, Inc. System, method, and user interfaces for managing genomic data
JP2002269114A (en) * 2001-03-14 2002-09-20 Kousaku Ookubo Knowledge database, and method for constructing knowledge database
US20040033502A1 (en) * 2001-03-28 2004-02-19 Amanda Williams Gene expression profiles in esophageal tissue
JP2002297617A (en) * 2001-03-29 2002-10-11 Hitachi Software Eng Co Ltd Method for displaying correlation between biopolymer and probe
US20030087259A1 (en) * 2001-04-18 2003-05-08 Clancy Brian M. Methods and compositions for regulating bone and cartilage formation
US7251568B2 (en) 2001-04-18 2007-07-31 Wyeth Methods and compositions for regulating bone and cartilage formation
US7155453B2 (en) * 2002-05-22 2006-12-26 Agilent Technologies, Inc. Biotechnology information naming system
US20070015146A1 (en) * 2001-05-22 2007-01-18 Gene Logic, Inc. Molecular nephrotoxicology modeling
CA2447357A1 (en) * 2001-05-22 2002-11-28 Gene Logic, Inc. Molecular toxicology modeling
US20030009294A1 (en) * 2001-06-07 2003-01-09 Jill Cheng Integrated system for gene expression analysis
AU2002316267A1 (en) * 2001-06-14 2003-01-02 Rigel Pharmaceuticals, Inc. Multidimensional biodata integration and relationship inference
WO2003001335A2 (en) * 2001-06-22 2003-01-03 Gene Logic, Inc. Platform for management and mining of genomic data
KR100794698B1 (en) * 2001-06-28 2008-01-14 (주)바이오니아 Quality control method for biological chip
AU2002320316A1 (en) * 2001-07-06 2003-01-21 Lipomics Technologies, Inc. Generating, viewing, interpreting, and utilizing a quantitative database of metabolites
WO2003068908A2 (en) * 2001-07-10 2003-08-21 Gene Logic, Inc. Cardiotoxin molecular toxicology modeling
US7447594B2 (en) * 2001-07-10 2008-11-04 Ocimum Biosolutions, Inc. Molecular cardiotoxicology modeling
CN1537229A (en) * 2001-07-31 2004-10-13 奥林巴斯株式会社 Gene inspection apparatus and target nucleic acid extraction method using the same
US7251642B1 (en) 2001-08-06 2007-07-31 Gene Logic Inc. Analysis engine and work space manager for use with gene expression data
JP3977038B2 (en) * 2001-08-27 2007-09-19 株式会社半導体エネルギー研究所 Laser irradiation apparatus and laser irradiation method
US20030096248A1 (en) * 2001-09-04 2003-05-22 Vitivity, Inc. Diagnosis and treatment of vascular disease
CA2459508A1 (en) * 2001-09-24 2003-04-03 Lipomics Technologies, Inc. Methods of using quantitative lipid metabolome data
JP2003099624A (en) * 2001-09-25 2003-04-04 Toyo Kohan Co Ltd Dna providing system
WO2003034064A2 (en) * 2001-10-12 2003-04-24 Duke University Image analysis of high-density synthetic dna microarrays
WO2003033128A2 (en) * 2001-10-12 2003-04-24 Duke University Methods for image analysis of high-density synthetic dna microarrays
US20060141493A1 (en) * 2001-11-09 2006-06-29 Duke University Office Of Science And Technology Atherosclerotic phenotype determinative genes and methods for using the same
US20040255136A1 (en) * 2001-11-12 2004-12-16 Alexey Borisovich Fadyushin Method and device for protecting information against unauthorised use
KR100474840B1 (en) * 2001-11-15 2005-03-08 삼성전자주식회사 Method and system with directory for providing a genotyping microarray probe design
CN1281324C (en) * 2001-12-19 2006-10-25 阿菲梅特里克斯公司 Manufacturing process for array plate assembly
US20030177143A1 (en) * 2002-01-28 2003-09-18 Steve Gardner Modular bioinformatics platform
US7225183B2 (en) * 2002-01-28 2007-05-29 Ipxl, Inc. Ontology-based information management system and method
US9418204B2 (en) * 2002-01-28 2016-08-16 Samsung Electronics Co., Ltd Bioinformatics system architecture with data and process integration
US8793073B2 (en) 2002-02-04 2014-07-29 Ingenuity Systems, Inc. Drug discovery methods
EP1490822A2 (en) * 2002-02-04 2004-12-29 Ingenuity Systems Inc. Drug discovery methods
US20030162183A1 (en) * 2002-02-27 2003-08-28 Robert Kincaid Array design system and method
WO2003082078A2 (en) * 2002-03-28 2003-10-09 Medical College Of Ohio Method and compositions for the diagnosis and treatment of non-small cell lung cancer using gene expression profiles
AU2002364707A1 (en) * 2002-04-23 2003-11-10 Duke University Atherosclerotic phenotype determinative genes and methods for using the same
US7006680B2 (en) * 2002-05-03 2006-02-28 Vialogy Corp. System and method for characterizing microarray output data
US20030220844A1 (en) * 2002-05-24 2003-11-27 Marnellos Georgios E. Method and system for purchasing genetic data
US6763308B2 (en) 2002-05-28 2004-07-13 Sas Institute Inc. Statistical outlier detection for gene expression microarray data
US20030229848A1 (en) * 2002-06-05 2003-12-11 Udo Arend Table filtering in a computer user interface
US7504215B2 (en) 2002-07-12 2009-03-17 Affymetrix, Inc. Nucleic acid labeling methods
US20050112689A1 (en) * 2003-04-04 2005-05-26 Robert Kincaid Systems and methods for statistically analyzing apparent CGH data anomalies and plotting same
US20050216459A1 (en) * 2002-08-08 2005-09-29 Aditya Vailaya Methods and systems, for ontological integration of disparate biological data
US9898578B2 (en) * 2003-04-04 2018-02-20 Agilent Technologies, Inc. Visualizing expression data on chromosomal graphic schemes
US7941542B2 (en) * 2002-09-06 2011-05-10 Oracle International Corporation Methods and apparatus for maintaining application execution over an intermittent network connection
US7512496B2 (en) * 2002-09-25 2009-03-31 Soheil Shams Apparatus, method, and computer program product for determining confidence measures and combined confidence measures for assessing the quality of microarrays
DE10393406T5 (en) * 2002-09-30 2005-12-22 Nimblegen Systems, Inc., Madison Parallel loading of arrays
CA2500783C (en) * 2002-10-01 2012-07-17 Nimblegen Systems, Inc. Microarrays having multiple oligonucleotides in single array features
US20040259105A1 (en) * 2002-10-03 2004-12-23 Jian-Bing Fan Multiplex nucleic acid analysis using archived or fixed samples
US9453251B2 (en) 2002-10-08 2016-09-27 Pfenex Inc. Expression of mammalian proteins in Pseudomonas fluorescens
AU2003218345A1 (en) * 2002-11-06 2004-06-03 Mount Sinai School Of Medicine Treatment of amyotrophic lateral sclerosis with nimesulide
EP2112229A3 (en) 2002-11-25 2009-12-02 Sequenom, Inc. Methods for identifying risk of breast cancer and treatments thereof
WO2004050840A2 (en) * 2002-11-27 2004-06-17 The Government Of The United States As Represented By The Secretary Of The Department Of Health And Human Services, Centers For Disease Control And Prevention Integration of gene expression data and non-gene data
JP2004191160A (en) * 2002-12-11 2004-07-08 Yokogawa Electric Corp Biochip measuring method and device
KR100506089B1 (en) * 2003-02-05 2005-08-05 삼성전자주식회사 System for designing probe array using heterogeneous genomic information and method of the same
US7750908B2 (en) * 2003-04-04 2010-07-06 Agilent Technologies, Inc. Focus plus context viewing and manipulation of large collections of graphs
US7825929B2 (en) * 2003-04-04 2010-11-02 Agilent Technologies, Inc. Systems, tools and methods for focus and context viewing of large collections of graphs
WO2004096842A2 (en) * 2003-04-28 2004-11-11 Public Health Agency Of Canada Sars virus nucleotide and amino acid sequences and uses thereof
US20040249791A1 (en) * 2003-06-03 2004-12-09 Waters Michael D. Method and system for developing and querying a sequence driven contextual knowledge base
WO2005000098A2 (en) 2003-06-10 2005-01-06 The Trustees Of Boston University Detection methods for disorders of the lung
WO2005003301A2 (en) * 2003-06-17 2005-01-13 Signal Pharmaceuticals, Inc. Methods, compositions, and kits for predicting the effect of compounds on hot flash symptoms
GB2423988A (en) 2003-07-18 2006-09-13 Cytokinetics Inc Characterizing biological stimuli by response curves
US20050014217A1 (en) 2003-07-18 2005-01-20 Cytokinetics, Inc. Predicting hepatotoxicity using cell based assays
US7235353B2 (en) 2003-07-18 2007-06-26 Cytokinetics, Inc. Predicting hepatotoxicity using cell based assays
US7353116B2 (en) * 2003-07-31 2008-04-01 Agilent Technologies, Inc. Chemical array with test dependent signal reading or processing
US20050026306A1 (en) * 2003-07-31 2005-02-03 Robert Kincaid Method and system for generating virtual-microarrays
US20050026154A1 (en) * 2003-07-31 2005-02-03 Laurakay Bruhn Masking chemical arrays
US7475087B1 (en) 2003-08-29 2009-01-06 The United States Of America As Represented By The Secretary Of Agriculture Computer display tool for visualizing relationships between and among data
US20050049796A1 (en) * 2003-09-03 2005-03-03 Webb Peter G. Methods for encoding non-biological information on microarrays
US20050048506A1 (en) * 2003-09-03 2005-03-03 Fredrick Joseph P. Methods for encoding non-biological information on microarrays
US9394565B2 (en) 2003-09-05 2016-07-19 Agena Bioscience, Inc. Allele-specific sequence variation analysis
JP2007512838A (en) 2003-12-01 2007-05-24 インヴィトロジェン コーポレーション Nucleic acid molecules containing recombination sites and methods of use thereof
CA2497324A1 (en) 2004-02-17 2005-08-17 Affymetrix, Inc. Methods for fragmenting and labelling dna
US7660709B2 (en) * 2004-03-18 2010-02-09 Van Andel Research Institute Bioinformatics research and analysis system and methods associated therewith
US9249456B2 (en) 2004-03-26 2016-02-02 Agena Bioscience, Inc. Base specific cleavage of methylation-specific amplification products in combination with mass analysis
KR100632973B1 (en) * 2004-04-29 2006-10-12 주식회사 메딘텔 Circular matching system and method for circular matching
US20060003335A1 (en) * 2004-06-30 2006-01-05 Crispino John D Methods for diagnosing acute megakaryoblastic leukemia
US7323318B2 (en) 2004-07-15 2008-01-29 Cytokinetics, Inc. Assay for distinguishing live and dead cells
US20080199480A1 (en) * 2004-07-22 2008-08-21 Sequenom, Inc. Methods for Identifying Risk of Type II Diabetes and Treatments Thereof
US8603824B2 (en) * 2004-07-26 2013-12-10 Pfenex, Inc. Process for improved protein expression by strain engineering
US8484000B2 (en) * 2004-09-02 2013-07-09 Vialogy Llc Detecting events of interest using quantum resonance interferometry
US20060073506A1 (en) 2004-09-17 2006-04-06 Affymetrix, Inc. Methods for identifying biological samples
EP1645640B1 (en) 2004-10-05 2013-08-21 Affymetrix, Inc. Method for detecting chromosomal translocations
US20060083609A1 (en) * 2004-10-14 2006-04-20 Augspurger Murray D Fluid cooled marine turbine housing
US7682782B2 (en) 2004-10-29 2010-03-23 Affymetrix, Inc. System, method, and product for multiple wavelength detection using single source excitation
EP1652580A1 (en) 2004-10-29 2006-05-03 Affymetrix, Inc. High throughput microarray, package assembly and methods of manufacturing arrays
WO2006088445A2 (en) * 2005-02-11 2006-08-24 Southern Illinois University Metabolic primers for the detection of (per)chlorate-reducing bacteria and methods of use thereof
US20060211004A1 (en) 2005-02-15 2006-09-21 Ilsley Diane D Methods and compositions for determining non-specific cytotoxicity of a transfection agent
US20090203547A1 (en) * 2005-02-18 2009-08-13 Albert Banes Gene and Cognate Protein Profiles and Methods to Determine Connective Tissue Markers in Normal and Pathologic Conditions
CA2601922C (en) * 2005-02-18 2020-11-24 Monogram Biosciences, Inc. Methods and compositions for determining anti-hiv drug susceptibility and replication capacity of hiv
US8178291B2 (en) 2005-02-18 2012-05-15 Monogram Biosciences, Inc. Methods and compositions for determining hypersusceptibility of HIV-1 to non-nucleoside reverse transcriptase inhibitors
US20070118295A1 (en) * 2005-03-02 2007-05-24 Al-Murrani Samer Waleed Khedhe Methods and Systems for Designing Animal Food Compositions
US20060286569A1 (en) * 2005-03-10 2006-12-21 Bar-Or Yuval A Method, apparatus, and system for authentication using labels containing nucleotide sequences
JP2008537875A (en) 2005-03-14 2008-10-02 ザ ボード オブ トラスティーズ オブ ザ リーランド スタンフォード ジュニア ユニバーシティ Methods and compositions for assessing graft survival in solid organ transplant recipients
EP2360278A1 (en) 2005-04-14 2011-08-24 Trustees Of Boston University Diagnostic for lung disorders using class prediction
WO2006116455A2 (en) 2005-04-26 2006-11-02 Applera Corporation System for genetic surveillance and analysis
EP1896618A4 (en) * 2005-05-27 2009-12-30 Monogram Biosciences Inc Methods and compositions for determining resistance of hiv-1 to protease inhibitors
WO2006133267A2 (en) 2005-06-06 2006-12-14 Monogram Biosciences, Inc. Methods and compositions for determining altered susceptibility of hiv-1 to anti-hiv drugs
WO2006133266A2 (en) 2005-06-06 2006-12-14 Monogram Biosciences, Inc. Methods for determining resistance or susceptibility to hiv entry inhibitors
US8159959B2 (en) * 2005-11-07 2012-04-17 Vudu, Inc. Graphic user interface for playing video data
US7634363B2 (en) 2005-12-07 2009-12-15 Affymetrix, Inc. Methods for high throughput genotyping
US20090215055A1 (en) * 2005-12-13 2009-08-27 Erasmus University Medical Center Rotterdam Genetic Brain Tumor Markers
US7646450B2 (en) * 2005-12-29 2010-01-12 Lg Display Co., Ltd. Light emitting diode array, method of manufacturing the same, backlight assembly having the same, and LCD having the same
US20070198729A1 (en) * 2006-02-07 2007-08-23 Yechuri Sitaramarao S SQL network gadget
US20090061454A1 (en) 2006-03-09 2009-03-05 Brody Jerome S Diagnostic and prognostic methods for lung disorders using gene expression profiles from nose epithelial cells
US20100285973A1 (en) * 2006-05-30 2010-11-11 Synergenz Bioscience Limited of Sea Meadow House Methods and compositions for assessment of pulmonary function and disorders
EP3617321A3 (en) 2006-05-31 2020-04-29 Sequenom, Inc. Methods and compositions for the extraction and amplification of nucleic acid from a sample
AU2007257162A1 (en) 2006-06-05 2007-12-13 Cancer Care Ontario Assessment of risk for colorectal cancer
US20080033985A1 (en) * 2006-06-09 2008-02-07 Gulfstream Bioinformatics Corporation Biomedical Information Modeling
US7700756B2 (en) * 2006-07-27 2010-04-20 Southern Illinois University Metabolic primers for the detection of perchlorate-reducing bacteria and methods of use thereof
US20080033819A1 (en) * 2006-07-28 2008-02-07 Ingenuity Systems, Inc. Genomics based targeted advertising
AU2007284651B2 (en) 2006-08-09 2014-03-20 Institute For Systems Biology Organ-specific proteins and methods of their use
JP2010506588A (en) * 2006-10-17 2010-03-04 シナージェンズ バイオサイエンス リミティド Methods and compositions for assessment of lung function and disorders
US9845494B2 (en) 2006-10-18 2017-12-19 Affymetrix, Inc. Enzymatic methods for genotyping on arrays
EP2450456A3 (en) 2006-11-02 2012-08-01 Yale University Assessment of oocyte competence
NZ551157A (en) * 2006-11-08 2008-06-30 Rebecca Lee Roberts Method of identifying individuals at risk of thiopurine drug resistance and intolerance - GMPS
US8293684B2 (en) * 2006-11-29 2012-10-23 Exiqon Locked nucleic acid reagents for labelling nucleic acids
US7902345B2 (en) 2006-12-05 2011-03-08 Sequenom, Inc. Detection and quantification of biomolecules using mass spectrometry
CA2673092A1 (en) * 2006-12-19 2008-06-26 Synergenz Bioscience Limited Methods and compositions for the assessment of cardiovascular function and disorders
CN101657217B (en) 2007-01-30 2013-11-20 药品循环公司 Methods for determining cancer resistance to histone deacetylase inhibitors
WO2008098142A2 (en) 2007-02-08 2008-08-14 Sequenom, Inc. Nucleic acid-based tests for rhd typing, gender determination and nucleic acid quantification
US9581595B2 (en) 2007-02-26 2017-02-28 Laboratory Corporation Of America Holdings Compositions and methods for determining whether a subject would benefit from co-receptor inhibitor therapy
CA2679954A1 (en) 2007-03-05 2008-09-12 Cancer Care Ontario Assessment of risk for colorectal cancer
US8652780B2 (en) 2007-03-26 2014-02-18 Sequenom, Inc. Restriction endonuclease enhanced polymorphic sequence detection
WO2008134461A2 (en) 2007-04-27 2008-11-06 Dow Global Technologies, Inc. Method for rapidly screening microbial hosts to identify certain strains with improved yield and/or quality in the expression of heterologous proteins
US9580719B2 (en) 2007-04-27 2017-02-28 Pfenex, Inc. Method for rapidly screening microbial hosts to identify certain strains with improved yield and/or quality in the expression of heterologous proteins
US8200440B2 (en) * 2007-05-18 2012-06-12 Affymetrix, Inc. System, method, and computer software product for genotype determination using probe array data
US9404150B2 (en) 2007-08-29 2016-08-02 Sequenom, Inc. Methods and compositions for universal size-specific PCR
US8765368B2 (en) * 2007-09-17 2014-07-01 The University Of Toledo Cancer risk biomarker
NZ587179A (en) 2008-01-25 2012-07-27 Theranostics Lab Detection of polymorphisms CYP2C19*17 and CYP2C19*3 in CYP2C19 gene related to antiplatelet drug metabolism (e.g. for clopidogrel metabolism)
CN101918597B (en) 2008-03-11 2013-09-18 国立癌中心 Method for measuring chromosome, gene or specific nucleotide sequence copy numbers using SNP array
US8709726B2 (en) 2008-03-11 2014-04-29 Sequenom, Inc. Nucleic acid-based tests for prenatal gender determination
CA2718137A1 (en) 2008-03-26 2009-10-01 Sequenom, Inc. Restriction endonuclease enhanced polymorphic sequence detection
JP2011525106A (en) * 2008-06-04 2011-09-15 ジ・アリゾナ・ボード・オブ・リージェンツ・オン・ビハーフ・オブ・ザ・ユニバーシティ・オブ・アリゾナ Markers for diffuse B large cell lymphoma and methods of use thereof
EP2324129A4 (en) 2008-08-18 2012-06-20 Univ Leland Stanford Junior Methods and compositions for determining a graft tolerant phenotype in a subject
US8476013B2 (en) * 2008-09-16 2013-07-02 Sequenom, Inc. Processes and compositions for methylation-based acid enrichment of fetal nucleic acid from a maternal sample useful for non-invasive prenatal diagnoses
US8962247B2 (en) * 2008-09-16 2015-02-24 Sequenom, Inc. Processes and compositions for methylation-based enrichment of fetal nucleic acid from a maternal sample useful for non invasive prenatal diagnoses
EP3260123A1 (en) 2008-11-06 2017-12-27 University of Miami Role of soluble upar in the pathogenesis of proteinuric kidney disease
US8039794B2 (en) * 2008-12-16 2011-10-18 Quest Diagnostics Investments Incorporated Mass spectrometry assay for thiopurine-S-methyl transferase activity and products generated thereby
KR101025848B1 (en) * 2008-12-30 2011-03-30 삼성전자주식회사 The method and apparatus for integrating and managing personal genome
CA2749103A1 (en) 2009-01-07 2010-07-15 Steve Stone Cancer biomarkers
KR20110138340A (en) * 2009-01-20 2011-12-27 더 보드 어브 트러스티스 어브 더 리랜드 스탠포드 주니어 유니버시티 Single cell gene expression for diagnosis, prognosis and identification of drug targets
ES2805347T3 (en) 2009-02-11 2021-02-11 Caris Mpi Inc Molecular profiling of tumors
US8832581B2 (en) * 2009-03-05 2014-09-09 Ming Zhang Gene expression browser for web-based search and visualization of characteristics of gene expression
CN102428191A (en) * 2009-03-18 2012-04-25 塞昆纳姆股份有限公司 Use Of Thermostable Endonucleases For Generating Reporter Molecules
EP3211095B1 (en) 2009-04-03 2019-01-02 Sequenom, Inc. Nucleic acid preparation compositions and methods
AU2010315400B2 (en) 2009-10-27 2016-07-21 Caris Mpi, Inc. Molecular profiling for personalized medicine
US20110201008A1 (en) * 2009-12-01 2011-08-18 University Of Miami Assays, methods and kits for measuring response to therapy and predicting clinical outcome in patients with b-cell lymphoma
EP3185013B1 (en) 2009-12-02 2019-10-09 The Board of Trustees of the Leland Stanford Junior University Biomarkers for determining an allograft tolerant phenotype
US8501122B2 (en) 2009-12-08 2013-08-06 Affymetrix, Inc. Manufacturing and processing polymer arrays
US8835358B2 (en) 2009-12-15 2014-09-16 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
US9926593B2 (en) 2009-12-22 2018-03-27 Sequenom, Inc. Processes and kits for identifying aneuploidy
US9798855B2 (en) 2010-01-07 2017-10-24 Affymetrix, Inc. Differential filtering of genetic data
WO2011091435A2 (en) 2010-01-25 2011-07-28 Mount Sinai School Of Medicine Methods of treating liver disease
CA2794255C (en) 2010-03-25 2020-01-14 Minnie M. Sarwal Protein and gene biomarkers for rejection of organ transplants
WO2011139714A2 (en) 2010-04-26 2011-11-10 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of cysteinyl-trna synthetase
US8961960B2 (en) 2010-04-27 2015-02-24 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of isoleucyl tRNA synthetases
EP2563911B1 (en) 2010-04-28 2021-07-21 aTyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of alanyl trna synthetases
AU2011248490B2 (en) 2010-04-29 2016-11-10 Pangu Biopharma Limited Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of Asparaginyl tRNA synthetases
US9034320B2 (en) 2010-04-29 2015-05-19 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of Valyl-tRNA synthetases
EP2566495B1 (en) 2010-05-03 2017-03-01 aTyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of phenylalanyl-alpha-trna synthetases
JP5976638B2 (en) 2010-05-03 2016-08-23 エータイアー ファーマ, インコーポレイテッド Innovative discovery of therapeutic, diagnostic and antibody compositions related to protein fragments of arginyl tRNA synthetase
US8981045B2 (en) 2010-05-03 2015-03-17 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of methionyl-tRNA synthetases
US9062302B2 (en) 2010-05-04 2015-06-23 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of p38 multi-tRNA synthetase complex
EP2568996B1 (en) 2010-05-14 2017-10-04 aTyr Pharma, Inc. Therapeutic, diagnostic, and antibody compositions related to protein fragments of phenylalanyl-beta-trna synthetases
AU2011258106B2 (en) 2010-05-27 2017-02-23 Pangu Biopharma Limited Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of glutaminyl-tRNA synthetases
CN103118694B (en) 2010-06-01 2016-08-03 Atyr医药公司 Discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of lysyl-tRNA synthetase
US20120053253A1 (en) 2010-07-07 2012-03-01 Myriad Genetics, Incorporated Gene signatures for cancer prognosis
EP2593125B1 (en) 2010-07-12 2017-11-01 aTyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of glycyl-trna synthetases
US9029506B2 (en) 2010-08-25 2015-05-12 Atyr Pharma, Inc. Innovative discovery of therapeutic, diagnostic, and antibody compositions related to protein fragments of tyrosyl-tRNA synthetases
WO2012033961A2 (en) * 2010-09-09 2012-03-15 Abbott Laboratories Systems and methods for displaying molecular probes and chromosomes
KR20130115250A (en) 2010-09-15 2013-10-21 알막 다이아그노스틱스 리미티드 Molecular diagnostic test for cancer
CA2834218C (en) 2011-04-29 2021-02-16 Sequenom, Inc. Quantification of a minority nucleic acid species using inhibitory oligonucleotides
WO2013112216A1 (en) 2012-01-24 2013-08-01 Cd Diagnostics, Llc System for detecting infection in synovial fluid
CA2864300A1 (en) 2012-02-16 2013-08-22 Atyr Pharma, Inc. Histidyl-trna synthetases for treating autoimmune and inflammatory diseases
EP2820174B1 (en) 2012-02-27 2019-12-25 The University of North Carolina at Chapel Hill Methods and uses for molecular tags
EP3321378B1 (en) 2012-02-27 2021-11-10 Becton, Dickinson and Company Compositions for molecular counting
EP3401399B1 (en) 2012-03-02 2020-04-22 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
US9920361B2 (en) 2012-05-21 2018-03-20 Sequenom, Inc. Methods and compositions for analyzing nucleic acid
JP2015521862A (en) 2012-07-13 2015-08-03 セクエノム, インコーポレイテッド Process and composition for enrichment based on methylation of fetal nucleic acid from maternal samples useful for non-invasive prenatal diagnosis
US8766754B2 (en) 2012-07-18 2014-07-01 The Regents Of The University Of California Concave nanomagnets with widely tunable anisotropy
EP4190918A1 (en) 2012-11-16 2023-06-07 Myriad Genetics, Inc. Gene signatures for cancer prognosis
EP2925886B1 (en) 2012-11-27 2019-04-24 Pontificia Universidad Católica de Chile Compositions and methods for diagnosing thyroid tumors
US9896728B2 (en) 2013-01-29 2018-02-20 Arcticrx Ltd. Method for determining a therapeutic approach for the treatment of age-related macular degeneration (AMD)
EP3597774A1 (en) 2013-03-13 2020-01-22 Sequenom, Inc. Primers for dna methylation analysis
DK2971156T3 (en) 2013-03-15 2020-10-19 Myriad Genetics Inc GENES AND GENSIGNATURES FOR DIAGNOSIS AND TREATMENT OF MELANOMA
US10535420B2 (en) 2013-03-15 2020-01-14 Affymetrix, Inc. Systems and methods for probe design to detect the presence of simple and complex indels
KR101520615B1 (en) 2013-03-20 2015-05-18 서울대학교산학협력단 Markers for diagnosis of liver cancer
US10390724B2 (en) 2013-06-26 2019-08-27 The Penn State Research Foundation Three-dimensional bio-medical probe sensing and contacting structures with addressibility and tunability
AU2014306867B2 (en) 2013-08-12 2017-10-26 Genentech, Inc. Compositions and method for treating complement-associated conditions
KR101527283B1 (en) 2013-08-13 2015-06-10 서울대학교산학협력단 Method for screening cancer marker based on de-glycosylation of glycoproteins and marker for HCC
CA2938731A1 (en) 2014-02-08 2015-08-13 Genentech, Inc. Methods of treating alzheimer's disease
US11365447B2 (en) 2014-03-13 2022-06-21 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
US11505836B2 (en) 2014-04-22 2022-11-22 Envirologix Inc. Compositions and methods for enhancing and/or predicting DNA amplification
EP3143160B1 (en) 2014-05-13 2019-11-06 Myriad Genetics, Inc. Gene signatures for cancer prognosis
ES2946681T3 (en) 2014-07-02 2023-07-24 Myriad Mypath Llc Genes and gene signatures for the diagnosis and treatment of melanoma
WO2016061252A1 (en) 2014-10-14 2016-04-21 The University Of North Carolina At Chapel Hill Methods and compositions for prognostic and/or diagnostic subtyping of pancreatic cancer
JP2017536813A (en) 2014-10-20 2017-12-14 エンバイロロジックス インコーポレイテッド Compositions and methods for detecting RNA viruses
JP7065610B6 (en) 2014-10-24 2022-06-06 コーニンクレッカ フィリップス エヌ ヴェ Medical prognosis and prediction of therapeutic response using multiple cellular signaling pathway activities
US11640845B2 (en) 2014-10-24 2023-05-02 Koninklijke Philips N.V. Bioinformatics process for identifying at risk subject populations
US10016159B2 (en) 2014-10-24 2018-07-10 Koninklijke Philips N.V. Determination of TGF-β pathway activity using unique combination of target genes
BR112017016350A2 (en) 2015-01-30 2018-03-27 Envirologix Inc. substrate molecule
WO2016126253A1 (en) 2015-02-05 2016-08-11 The Penn State Research Foundation Nano-pore arrays for bio-medical, environmental, and industrial sorting, filtering, monitoring, or dispensing
EP3285224A4 (en) * 2015-04-13 2018-10-10 National Institute of Advanced Industrial Science and Technology Experimental data recording device, computer program, experimental data, experimental data recording method, experimental data display device and experimental data display method
GB201512869D0 (en) 2015-07-21 2015-09-02 Almac Diagnostics Ltd Gene signature for minute therapies
CA2994416A1 (en) 2015-08-04 2017-02-09 Cd Diagnostics, Inc. Methods for detecting adverse local tissue reaction (altr) necrosis
KR102618536B1 (en) * 2015-08-12 2023-12-27 삼성전자주식회사 Method and device for mutation prioritization for personalized therapy of one or more patients
US10720227B2 (en) * 2015-08-12 2020-07-21 Samsung Electronics Co., Ltd. Method and device for mutation prioritization for personalized therapy
BR112018002848A2 (en) 2015-08-14 2018-11-06 Koninklijke Philips Nv method, apparatus, non-transient storage media, computer program, kit for measuring expression levels of six or more cell signaling target genes
EP3362580B1 (en) 2015-10-18 2021-02-17 Affymetrix, Inc. Multiallelic genotyping of single nucleotide polymorphisms and indels
EP3377650A1 (en) 2015-11-19 2018-09-26 Susanne Wagner Signatures for predicting cancer immune therapy response
US9836444B2 (en) * 2015-12-10 2017-12-05 International Business Machines Corporation Spread cell value visualization
EP3400312A4 (en) 2016-01-06 2019-08-28 Myriad Genetics, Inc. Genes and gene signatures for diagnosis and treatment of melanoma
JP6707181B2 (en) 2016-04-20 2020-06-10 Ldx Prognostics Limited Co. Kits or packages for identifying pregnant women at risk of preterm birth and use of such kits or packages
WO2017193062A1 (en) 2016-05-06 2017-11-09 Myriad Genetics, Inc. Gene signatures for renal cancer prognosis
WO2017197573A1 (en) 2016-05-17 2017-11-23 Ldx Prognostics Limited Co. Methods and compositions for providing preeclampsia assessment
CN106066948B (en) * 2016-06-07 2018-09-28 Peking University Method and device for displaying gene expression levels
EP3532854A4 (en) * 2016-09-27 2020-10-14 BAE SYSTEMS Information and Electronic Systems Integration Inc. Techniques for implementing a portable spectrum analyzer
US20180165414A1 (en) * 2016-12-14 2018-06-14 FlowJo, LLC Applied Computer Technology for Management, Synthesis, Visualization, and Exploration of Parameters in Large Multi-Parameter Data Sets
KR102116178B1 (en) 2017-05-10 2020-05-27 Seoul National University R&DB Foundation Biomarker for monitoring or detecting early onset of liver cancer from patient having high risk of liver cancer and its use
EP3631417B1 (en) 2017-05-25 2024-02-14 Flowjo, LLC Visualization, comparative analysis, and automated difference detection for large multi-parameter data sets
EP3461915A1 (en) 2017-10-02 2019-04-03 Koninklijke Philips N.V. Assessment of jak-stat1/2 cellular signaling pathway activity using mathematical modelling of target gene expression
EP3502279A1 (en) 2017-12-20 2019-06-26 Koninklijke Philips N.V. Assessment of mapk-ap 1 cellular signaling pathway activity using mathematical modelling of target gene expression
EP3728321A1 (en) 2017-12-22 2020-10-28 F. Hoffmann-La Roche AG Use of pilra binding agents for treatment of a disease
GB2574582A (en) * 2018-05-28 2019-12-18 Rainer Gabriel Schweiger Martin Method for simulating a technical device
WO2019246160A2 (en) 2018-06-18 2019-12-26 Igenomix, S.L. Methods, compositions, and kits for assessing endometrial transformation
EP3864165A4 (en) 2018-10-09 2022-08-03 Genecentric Therapeutics, Inc. Detecting cancer cell of origin
MX2021006234A (en) 2018-11-30 2021-09-10 Caris Mpi Inc Next-generation molecular profiling.
EP3956476A1 (en) 2019-04-17 2022-02-23 Igenomix S.L. Improved methods for the early diagnosis of uterine leiomyomas and leiomyosarcomas
IL293489A (en) 2019-12-02 2022-08-01 Caris Mpi Inc Pan-cancer platinum response predictor
WO2023178295A1 (en) 2022-03-18 2023-09-21 Ludwig Institute For Cancer Research Ltd Methods and systems for analyzing chromatins

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5570291A (en) * 1994-08-24 1996-10-29 Wallace Computer Services, Inc. Custom product estimating and order processing system
US5724575A (en) * 1994-02-25 1998-03-03 Actamed Corp. Method and system for object-based relational distributed databases
US5744575A (en) * 1995-06-06 1998-04-28 Ube Industries, Ltd. Aromatic polyimide and gas separation
US5871697A (en) * 1995-10-24 1999-02-16 Curagen Corporation Method and apparatus for identifying, classifying, or quantifying DNA sequences in a sample without sequencing
US6324533B1 (en) * 1998-05-29 2001-11-27 International Business Machines Corporation Integrated database and data-mining system
US6338071B1 (en) * 1999-08-18 2002-01-08 Affymetrix, Inc. Method and system for providing a contract management system using an action-item table
US6339767B1 (en) * 1997-06-02 2002-01-15 Aurigin Systems, Inc. Using hyperbolic trees to visualize data generated by patent-centric and group-oriented data processing
US6484183B1 (en) * 1997-07-25 2002-11-19 Affymetrix, Inc. Method and system for providing a polymorphism database
US6754666B1 (en) * 1999-08-19 2004-06-22 A2I, Inc. Efficient storage and access in a database management system

Family Cites Families (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4683202A (en) 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
NO870613L (en) 1986-03-05 1987-09-07 Molecular Diagnostics Inc DETECTION OF MICROORGANISMS IN A SAMPLE CONTAINING NUCLEIC ACID.
US4740611A (en) * 1986-10-30 1988-04-26 The Standard Oil Company N,N'-disubstituted ureas
US4705864A (en) * 1986-11-10 1987-11-10 The Standard Oil Company Aryl oxime derivatives of hydantoins
EP0307476A4 (en) 1986-12-20 1990-12-12 Kukita, Takeshi Bilirubin antigen, monoclonal antibody therefor, process for their preparation, and their use
US5202231A (en) 1987-04-01 1993-04-13 Drmanac Radoje T Method of sequencing of genomes by hybridization of oligonucleotide probes
US5525464A (en) 1987-04-01 1996-06-11 Hyseq, Inc. Method of sequencing by hybridization of oligonucleotide probes
US5700637A (en) 1988-05-03 1997-12-23 Isis Innovation Limited Apparatus and method for analyzing polynucleotide sequences and method of generating oligonucleotide arrays
JP2897959B2 (en) 1988-05-20 1999-05-31 F. Hoffmann-La Roche AG Immobilized sequence-specific probe
US5206137A (en) * 1988-09-08 1993-04-27 Lifecodes Corporation Compositions and methods useful for genetic analysis
US6203977B1 (en) * 1988-11-15 2001-03-20 Yale University Delineation of individual human chromosomes in metaphase and interphase cells by in situ suppression hybridization
JPH02299598A (en) 1989-04-14 1990-12-11 Ro Inst For Molecular Genetics & Geneteic Res Determination by means of hybridization, together with oligonucleotide probe of all or part of extremely short sequence in sample of nucleic acid connecting with separate particle of microscopic size
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
US5925525A (en) * 1989-06-07 1999-07-20 Affymetrix, Inc. Method of identifying nucleotide differences
US5800992A (en) * 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US6040138A (en) 1995-09-15 2000-03-21 Affymetrix, Inc. Expression monitoring by hybridization to high density oligonucleotide arrays
US5871928A (en) * 1989-06-07 1999-02-16 Fodor; Stephen P. A. Methods for nucleic acid analysis
JPH06504997A (en) 1990-12-06 1994-06-09 Affymetrix, Inc. Synthesis of immobilized polymers on a very large scale
DE69132843T2 (en) 1990-12-06 2002-09-12 Affymetrix Inc N D Ges D Staat Identification of nucleic acids in samples
GR1000797B (en) 1991-06-10 1993-01-25 Emmanouil E Petromanolakis Wave-making energy absorber during vessel s propulsion
CA2093659C (en) * 1991-08-13 2001-01-16 Gregory T. Bleck Dna sequence encoding bovine .alpha.-lactalbumin and methods of use
US5846717A (en) * 1996-01-24 1998-12-08 Third Wave Technologies, Inc. Detection of nucleic acid sequences by invader-directed cleavage
EP0655090B1 (en) 1992-04-27 2000-12-27 The Trustees Of Dartmouth College Detection of gene sequences in biological fluids
US6251920B1 (en) * 1993-05-13 2001-06-26 Neorx Corporation Prevention and treatment of cardiovascular pathologies
US6395494B1 (en) * 1993-05-13 2002-05-28 Neorx Corporation Method to determine TGF-β
US5524070A (en) * 1992-10-07 1996-06-04 The Research Foundation Of State University Of New York Local adaptive contrast enhancement
US5632282A (en) * 1993-07-20 1997-05-27 Hay; S. Hutson Ocular disease detection apparatus
DE69433180T2 (en) 1993-10-26 2004-06-24 Affymetrix, Inc., Santa Clara FIELDS OF NUCLEIC ACID PROBE ON ORGANIC CHIPS
WO1995011755A1 (en) 1993-10-28 1995-05-04 Houston Advanced Research Center Microfabricated, flowthrough porous apparatus for discrete detection of binding reactions
US6096503A (en) * 1993-11-12 2000-08-01 The Scripps Research Institute Method for simultaneous identification of differentially expresses mRNAs and measurement of relative concentrations
EP0667586A3 (en) * 1994-02-14 1996-08-28 Digital Equipment Corp Database generator.
US5571639A (en) * 1994-05-24 1996-11-05 Affymax Technologies N.V. Computer-aided engineering system for design of sequence arrays and lithographic masks
US5795716A (en) 1994-10-21 1998-08-18 Chee; Mark S. Computer-aided visualization and analysis system for sequence evaluation
US6600996B2 (en) 1994-10-21 2003-07-29 Affymetrix, Inc. Computer-aided techniques for analyzing biological sequences
JPH11501741A (en) * 1995-01-27 1999-02-09 Incyte Pharmaceuticals, Inc. Computer system for storing and analyzing microbiological data
US5961923A (en) * 1995-04-25 1999-10-05 Irori Matrices with memories and uses thereof
US5707806A (en) 1995-06-07 1998-01-13 Genzyme Corporation Direct sequence identification of mutations by cleavage- and ligation-associated mutation-specific sequencing
US5777888A (en) 1995-08-09 1998-07-07 Regents Of The University Of California Systems for generating and analyzing stimulus-response output signal matrices
GB9522615D0 (en) 1995-11-03 1996-01-03 Pharmacia Spa 4-Phenyl-4-oxo-butanoic acid derivatives with kynurenine-3-hydroxylase inhibiting activity
US5778200A (en) 1995-11-21 1998-07-07 Advanced Micro Devices, Inc. Bus arbiter including aging factor counters to dynamically vary arbitration priority
JP2002515738A (en) 1996-01-23 2002-05-28 Affymetrix, Inc. Nucleic acid analysis
JP2000504575A (en) 1996-02-08 2000-04-18 Affymetrix, Inc. Chip-based speciation and phenotypic characterization of microorganisms
US6108635A (en) * 1996-05-22 2000-08-22 Interleukin Genetics, Inc. Integrated disease information system
US5989835A (en) * 1997-02-27 1999-11-23 Cellomics, Inc. System for cell-based screening
CA2270527A1 (en) * 1996-11-04 1998-05-14 3-Dimensional Pharmaceuticals, Inc. System, method, and computer program product for the visualization and interactive processing and analysis of chemical data
US5968784A (en) * 1997-01-15 1999-10-19 Chugai Pharmaceutical Co., Ltd. Method for analyzing quantitative expression of genes
US6205447B1 (en) * 1997-06-30 2001-03-20 International Business Machines Corporation Relational database management of multi-dimensional data
US6032151A (en) * 1997-11-17 2000-02-29 Sun Microsystems, Inc. Database system employing polymorphic entry and entry matching
US6025194A (en) * 1997-11-19 2000-02-15 Geron Corporation Nucleic acid sequence of senescence asssociated gene
US5991766A (en) * 1997-12-02 1999-11-23 Electronic Data Systems Corporation Method and system for managing redundant objects in a distributed object system
US20030036855A1 (en) * 1998-03-16 2003-02-20 Praelux Incorporated, A Corporation Of New Jersey Method and apparatus for screening chemical compounds
AU2342900A (en) * 1998-09-23 2000-05-01 Cleveland Clinic Foundation, The Novel interferon stimulated and repressed genes
US6140054A (en) * 1998-09-30 2000-10-31 University Of Utah Research Foundation Multiplex genotyping using fluorescent hybridization probes
US6251601B1 (en) * 1999-02-02 2001-06-26 Vysis, Inc. Simultaneous measurement of gene expression and genomic abnormalities using nucleic acid microarrays
US6284465B1 (en) * 1999-04-15 2001-09-04 Agilent Technologies, Inc. Apparatus, systems and method for locating nucleic acids bound to surfaces
AU6909300A (en) * 1999-08-20 2001-03-19 Merck & Co., Inc. Substituted ureas as cell adhesion inhibitors
US7072896B2 (en) * 2000-02-16 2006-07-04 Verizon Laboratories Inc. System and method for automatic loading of an XML document defined by a document-type definition into a relational database including the generation of a relational schema therefor
US6569615B1 (en) * 2000-04-10 2003-05-27 The United States Of America As Represented By The Department Of Veteran's Affairs Composition and methods for tissue preservation
US20020062258A1 (en) * 2000-05-18 2002-05-23 Bailey Steven C. Computer-implemented procurement of items using parametric searching
US7731904B2 (en) * 2000-09-19 2010-06-08 Canon Kabushiki Kaisha Method for making probe support and apparatus used for the method
AU2002322775A1 (en) * 2001-07-27 2003-02-17 The Regents Of The University Of California Stk15 (stk6) gene polymorphism and methods of determining cancer risk
US20030126139A1 (en) * 2001-12-28 2003-07-03 Lee Timothy A. System and method for loading commercial web sites

Also Published As

Publication number Publication date
US6882742B2 (en) 2005-04-19
US6229911B1 (en) 2001-05-08
DE69823206D1 (en) 2004-05-19
EP1009861A1 (en) 2000-06-21
US20030125882A1 (en) 2003-07-03
JP2005100389A (en) 2005-04-14
ATE264523T1 (en) 2004-04-15
EP0998697A4 (en) 2002-07-24
EP1002264A2 (en) 2000-05-24
EP0998697A1 (en) 2000-05-10
US6188783B1 (en) 2001-02-13
DE69823206T2 (en) 2004-08-19
US20020062319A1 (en) 2002-05-23
JP2005353057A (en) 2005-12-22
WO1999005591A2 (en) 1999-02-04
US6484183B1 (en) 2002-11-19
EP1002264B1 (en) 2004-04-14
WO1999005574A1 (en) 1999-02-04
US6567540B2 (en) 2003-05-20
US7215804B2 (en) 2007-05-08
EP1007737A4 (en) 2002-07-03
JP2001511546A (en) 2001-08-14
WO1999005324A1 (en) 1999-02-04
JP2001511550A (en) 2001-08-14
WO1999005323A1 (en) 1999-02-04
US6308170B1 (en) 2001-10-23
US20020012456A1 (en) 2002-01-31
US6532462B2 (en) 2003-03-11
JP2001515234A (en) 2001-09-18
EP1007737A1 (en) 2000-06-14
US20050164270A1 (en) 2005-07-28
EP1002264A4 (en) 2002-07-03
EP1009861A4 (en) 2002-10-16
JP3776728B2 (en) 2006-05-17
WO1999005591A3 (en) 1999-05-20
US20050157915A1 (en) 2005-07-21
JP2001511529A (en) 2001-08-14

Similar Documents

Publication Publication Date Title
US20030074363A1 (en) Method and system for providing a polymorphism database
US6826296B2 (en) Method and system for providing a probe array chip design database
National Research Council DNA technology in forensic science
US6185561B1 (en) Method and apparatus for providing and expression data mining database
US20030220844A1 (en) Method and system for purchasing genetic data
CA2345441A1 (en) Complexity management and analysis of genomic dna
US20030028501A1 (en) Computer based method for providing a laboratory information management system
US20060271513A1 (en) Method and apparatus for providing an expression data mining database
US20060212229A1 (en) Method and system for providing a probe array chip design database
Charru et al. HYPERGENE: a clinical and genetic database for genetic analysis of human hypertension
US20060259251A1 (en) Computer software products for associating gene expression with genetic variations
WO2000016220A9 (en) Method and apparatus for providing an expression data mining database and laboratory information management
EP1396800A2 (en) Method and apparatus for providing a bioinformatics database
JP2003526133A6 (en) Method and apparatus for providing expression data mining database and laboratory information management
JP2003526133A (en) Method and apparatus for providing expression data mining database and laboratory information management
Gilbert et al. Strategies for genotype generation
Carroll Mobile elements: Genome-wide distribution and complexity
Grant A Microarray Database
EP1038245A1 (en) Method and apparatus for providing an expression data mining database and laboratory information management
National Research Council (US) Committee on Human Genome Diversity Sampling Issues

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation. Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION