WO2002089015A1 - Pharmacovigilance database - Google Patents

Pharmacovigilance database

Info

Publication number
WO2002089015A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
database
verbatim
terms
source data
Prior art date
Application number
PCT/US2002/013662
Other languages
French (fr)
Inventor
Victor V. Gogolak
Original Assignee
Qed Solutions, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qed Solutions, Inc.
Publication of WO2002089015A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • G16H20/13 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients delivered from dispensers
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99944 Object-oriented database structure
    • Y10S707/99945 Object-oriented database structure processing

Definitions

  • the present invention relates generally to systems and methods for developing a pharmacovigilance database from source data, both publicly available and privately developed, and reference data.
  • SRS Spontaneous Reporting System
  • the SRS contains adverse drug reaction reports from a variety of sources over a period covering 1969 through 1997.
  • This data is available in an ASCII flat file from the FDA.
  • the flat file by its nature is not a relationally structured database amenable to typical queries.
  • AERS Adverse Event Reporting System
  • AERS is a non-cumulative database of post-market drug adverse events. Its purpose is to serve as an early warning indicator or signaling system for adverse drug reactions not detected during pre-market testing.
  • the data, without a search engine, is available on CD ROM from the federal government in a combination of ASCII delimited flat file and SGML format.
  • the files include: demographic and administrative information; along with drug, reaction, patient outcome, and source for each case.
  • an antidepressant may be Prozac, a fluoxetine, a serotonin reuptake inhibitor, or a serotonin receptor specific modulator.
  • antidepressants include many other drugs, such as lithium and other catecholaminergic drugs, and there are serotonin reuptake inhibitors in addition to Prozac.
  • Even "standardized" terminology can differ between databases. For example, some adverse event databases request reaction terminology consistent with the Medical Dictionary for Regulatory Activities (MedDRA™), while other databases request, or already contain, input consistent with World Health Organization Adverse Reaction Terminology (WHO-ART) or Coding Symbols for a Thesaurus of Adverse Reaction Terms (COSTART) developed and maintained by the FDA's Center for Drug Evaluation and Research.
  • WHO-ART World Health Organization Adverse Reaction Terminology
  • COSTART Coding Symbols for a Thesaurus of Adverse Reaction Terms
  • data corruption in databases such as SRS and AERS is acknowledged, but not quantified, by the proponents.
  • Data corruption at the database field level can include extraneous non-alpha characters, noise words, misspellings, and dislocations (e.g., data that is valid for one field, erroneously entered into another, inappropriate field).
  • Databases that allow entry of free text information are especially susceptible to data corruption.
  • existing adverse event databases have been known to contain redundant cases documenting the same adverse event.
  • FIM Data Site Filtering And Translation For Heterogeneous Databases
  • the system includes a user interface for generating a global query to search the virtual database, a smart dictionary database that contains configuration data, a data information manager that decomposes the global query into local queries, and a plurality of local information managers that execute the local queries to search for and retrieve data from the enumerated databases.
  • a filter generates a list of those local databases that contain information relevant to the global query. As a result, the data information manager only generates local queries for the enumerated local databases.
  • An input translator converts the global query into the respective local formats for the local databases so that the system provides true integration of heterogeneous databases.
  • An output translator converts the data retrieved from each local database into a uniform input/output format so that the data presented to the user is integrated. The user typically selects the input/output format as his or her local format or a global format associated with the virtual database.
  • Defined Data Items From Medical Service Records Generated By Health Care Providers discloses a central medical record repository for a managed health care organization that accepts and stores medical record documents in any format from medical service providers.
  • the repository identifies the document using information automatically extracted from the document and stores the extracted data in a document database.
  • the repository links the document to a patient by extracting from the document demographic data identifying the patient and matching it to data stored in a patient database.
  • Data is extracted automatically from medical records containing "unstructured" or free form text by identifying conventional organization components in the text and is organized by executing rules that extract data with the aid of such information.
  • Documents for a patient are retrieved by identifying the patient using demographic data.
  • U.S. Patent No. 5,845,255 to Mayaud, "Prescription Management System," discloses a wirelessly deployable, electronic prescription creation system for physician use which captures into a prescription a patient condition-objective of the prescribed treatment and provides for patient record assembly from source elements, with privacy controls for patient and doctor, adverse indication review and online access to comprehensive drug information including scientific literature. Extensions to novel multi-drug packages and dispensing devices, and an "intelligent network" remote data retrieval architecture as well as onscreen physician-to-pharmacy and physician-to-physician e-mail are also provided.
  • U.S. Patent No. 5,911,132 to Sloane, "Method (Of) Using (A) Central Epidemiological Database," discloses a method in which patient disease is diagnosed and/or treated using electronic data communications between not only the physician and his/her patient, but via the use of electronic data communications between the physician and one or more entities which can contribute to the patient's diagnosis and/or treatment, such electronic data communications including information that was previously received electronically from the patient and/or was developed as a consequence of an electronic messaging interaction that occurred between the patient and the physician.
  • Such other entities illustratively include a medical diagnostic center and an epidemiological database computer facility that collects epidemiological transaction records from physicians, hospitals, and other institutions that have medical facilities, such as schools and large businesses.
  • the epidemiological transaction record illustratively includes various medical, personal, and epidemiological data relevant to the patient and his/her present symptoms, including test results, as well as the diagnosis, if one has already been arrived at by the e-doc.
  • the epidemiological database computer facility can correlate this information with the other epidemiological transaction records that it receives over time in order to help physicians make and/or confirm diagnoses as well as to identify and track epidemiological events and/or trends.
  • U.S. Patent No. 5,924,074 to Evans, "Electronic Medical Records System," discloses a medical records system that creates and maintains all patient data electronically.
  • the system captures patient data, such as patient complaints, lab orders, medications, diagnoses, and procedures, at its source at the time of entry using a graphical user interface having touch screens.
  • authorized healthcare providers can access, analyze, update, and electronically annotate patient data even while other providers are using the same patient record.
  • the system likewise permits instant, sophisticated analysis of patient data to identify relationships among the data considered.
  • the system includes the capability to access reference databases for consultation regarding allergies, medication interactions, and practice guidelines.
  • the system also includes the capability to incorporate legacy data, such as paper files and mainframe data, for a patient.
  • Litigation discloses a medical database and associated methods that are especially suited for compiling information in a medical malpractice situation.
  • a general medical database is provided and specific medical information corresponding to a given situation is entered. Entry of the information automatically cross-references some terms of the entered data to definitions in the general medical database. Terms are readily looked up when reviewing specific medical information and definitions are easily inserted where desired.
  • a drug reference display provides two-way lookup from drugs to their side effects (or contraindications or interactions) and back. Significant information from an entered medical chronology is easily copied to a significant information section when a reviewer finds the information important.
  • U.S. Patent No. 6,219,674 to Classen, "System for creating and managing proprietary product data," discloses systems and methods for creating and using product data to enhance the safety of a medical or non-medical product.
  • the systems receive vast amounts of data regarding adverse events associated with a particular product and analyze the data in light of already known adverse events associated with the product.
  • the system develops at least one proprietary database of newly discovered adverse event information and new uses for the product and may catalog adverse event information for a large number of population sub-groups.
  • the system may also be programmed to incorporate the information into intellectual property and contract documents. Manufacturers can include the information in consumer product information that they provide to consumers or, in the case of certain medical products, prescribers of the medical products.
  • None of the above references, alone or in combination, addresses improving the quality of the underlying verbatim adverse drug event data. Nor do the references address mapping this underlying data to accepted pharmaceutical community terms and hierarchies. Specifically, the references do not address parsing of flat file adverse drug event data into a relational database structure to support efficient query. The problem of differing terminology in the data fields of disparate databases also remains unaddressed, as does the problem of data corruption in the form of misspelling and extraneous characters, along with resolution of redundant cases.
  • the present invention is a method for developing a pharmacovigilance database from source data and reference data.
  • the unedited source data contains verbatim terms.
  • the method includes parsing source data into a relational database; performing cleanup on the relational database; and mapping verbatim terms from the cleaned database to at least one token from at least one reference source.
  • Cleanup includes removing redundant entries, correcting misspellings, removing irrelevant non-alpha characters and noise words, and relocating dislocated terms.
  • preferred embodiments standardize and map historical terms to current terms. Where the choice of the reference data is itself an option, preferred embodiments incorporate a method for selecting the reference data source and the automatically propagated correction and mapping rules associated with that choice.
  • Mapping verbatim terms to tokens includes nominating tokens from the source data, choosing tokens from the reference sources, and linking chosen tokens to corresponding verbatim terms. In one embodiment, the history of clean up and mapping is saved as the pedigree of the verbatim-to-token mapping.
  • Figure 1 illustrates a method of the present invention for development of a pharmacovigilance database.
  • Figure 2 illustrates a specific implementation of a pharmacovigilance database of the present invention.
  • Figure 3 illustrates mapping of cleaned verbatim source data drug terms to trade and generic canonical terms from the National Drug Code Directory and the Food and Drug Administration Orange Book.
  • Figure 4 is a sample window illustrating how a list of unresolved drug verbatim terms is presented to an operator along with suggestions for resolution.
  • Figure 5 is a sample window illustrating how an individual unresolved drug verbatim entry may be presented to an operator.
  • Figure 6 is a sample window illustrating how an operator may effect resolution of an unresolved drug verbatim entry.
  • Figure 7 illustrates mapping of cleaned verbatim source data reaction terms to WHOART, COSTART and MedDRA reaction terms and hierarchies.
  • Figure 8 illustrates mapping of source data drug terms to reference source "map to" tokens.
  • Figure 9 illustrates mapping of source data reaction terms to reference source "map to" tokens for the MedDRA reaction hierarchy.
  • the present invention includes a preferred method for developing a pharmacovigilance database 100 from source data 200.
  • the method includes the steps of parsing 310 source data 200 into a relational database structure, performing cleanup 330 on the parsed data, and mapping 340 cleaned parsed source data to reference data 230. While the preferred methods of the present invention parse source data 200 prior to performing cleanup 330, cleanup 330 can be performed independently of, and prior to, parsing.
  • the source data 200 is already in a relational database structure amenable to embodiments of the present invention.
  • the source data includes, but is not limited to, SRS 210 and AERS 220.
  • AERS and SRS call for reports on outcomes, report source, and concomitant drugs.
  • Other data and sources can serve as inputs to the process.
  • Preferred embodiments accommodate adverse event data from pharmaceutical corporations, hospitals, physicians, and health insurers, along with data from state, federal, and international agencies.
  • the primary sources of the pharmaceutical industry data are individual adverse event databases of the pharmaceutical corporation safety departments. In each case, source data may be focused on clinical trials, post-market surveillance, research databases, or the like.
  • the unedited data in each source database is referred to as "verbatim."
  • reference data from accepted canonical references, e.g., MedDRA™ 231, National Drug Code Directory 232, and FDA Orange Book 233, is used in preferred embodiments of the present invention.
  • Preferred embodiments of the invention also link to genomic and proteomic data.
  • Preferred embodiments of the invention provide means to substitute and manage both source and reference data.
  • the method illustrated in Figure 2 includes parsing 310 source data 210,
  • transformation from raw source data 210, 220 to a relational structure preferably includes parsing each data source into an image 122, 124 with fields tailored to its corresponding source. Subsequently, the images 122, 124 are consolidated 320 into a single safety tablespace 110. Since the database can be simple or complex, the present invention provides the ability to add many "dimensions" (e.g., age, sex, dates, reactions, doses, outcomes, report source, concomitant drugs): some structured, some narrative, some numerical, and many categorical variables such as reaction. Hierarchies in all dimensions (in both preferred and custom paths) are definable as required by the particular end user.
  • the safety tablespace 110 provides a common set of fields for the parsed source data 122, 124.
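The parsing step described above can be sketched as follows. This is a minimal illustration only, assuming a hypothetical "$"-delimited extract with three invented columns (`case_id`, `drug`, `reaction`); the actual SRS and AERS file layouts and the patent's table design are not given here, and the table names are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical "$"-delimited extract; real SRS/AERS layouts differ.
RAW = (
    "case_id$drug$reaction\n"
    "1001$Prozac$Nausea\n"
    "1001$Prozac$Headache\n"
    "1002$Tylenol$Rash\n"
)


def parse_to_relational(raw_text, conn):
    """Load one flat-file image into case, drug, and reaction tables."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS cases (case_id INTEGER PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS drugs (
            case_id INTEGER, verbatim TEXT, UNIQUE (case_id, verbatim));
        CREATE TABLE IF NOT EXISTS reactions (
            case_id INTEGER, verbatim TEXT, UNIQUE (case_id, verbatim));
        """
    )
    for row in csv.DictReader(io.StringIO(raw_text), delimiter="$"):
        case_id = int(row["case_id"])
        # INSERT OR IGNORE keeps repeated flat-file rows from duplicating keys.
        conn.execute("INSERT OR IGNORE INTO cases VALUES (?)", (case_id,))
        conn.execute("INSERT OR IGNORE INTO drugs VALUES (?, ?)", (case_id, row["drug"]))
        conn.execute("INSERT OR IGNORE INTO reactions VALUES (?, ?)", (case_id, row["reaction"]))
    conn.commit()


conn = sqlite3.connect(":memory:")
parse_to_relational(RAW, conn)
```

Once the data is relational, a query such as `SELECT case_id FROM drugs WHERE verbatim = 'Prozac'` becomes possible, which the flat file alone does not support.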
  • Data cleanup may be performed independently of parsing source data into a safety database. This allows cleanup to be continual, ongoing, and iterative; either before or after one or more source databases are processed into the pharmacovigilance database.
  • Adverse event database cleanup is an incremental process, proceeding from automated cleanup of certain errors, through human-assisted cleanup of ambiguous entries, to human correction of identified gross errors.
  • Specific cleanup tasks include noise reduction (e.g., suppression of non-alpha characters, noise words, and combination words); adjustment for misspellings; adjustment for dislocations; and resolution of possible redundant entries.
  • reactions, drugs, and counts of the occurrence (by case and absolute) of each are extracted 331 from the parsed AERS data 124.
  • the counts are then grouped 331; in this embodiment, grouping is by order of magnitude of the count.
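Grouping by order of magnitude can be illustrated with a short sketch. It assumes a plain list of verbatim terms and tallies absolute counts only (the per-case count mentioned above is omitted); the function name is hypothetical.

```python
from collections import Counter, defaultdict


def group_by_magnitude(terms):
    """Bucket each distinct term by the order of magnitude of its count."""
    groups = defaultdict(list)
    for term, count in Counter(terms).items():
        # The digit count minus one is floor(log10(count)) for count >= 1.
        magnitude = len(str(count)) - 1
        groups[magnitude].append((term, count))
    return dict(groups)


sample = ["nausea"] * 12 + ["rash"] * 3 + ["headache"] * 100
groups = group_by_magnitude(sample)
```

Here bucket 0 holds terms seen 1 to 9 times, bucket 1 holds 10 to 99, and so on, so the most frequent verbatim terms can be reviewed first.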
  • the bulk of data cleanup 330 is performed on a computing platform separate from database storage.
  • a spreadsheet application such as Microsoft Excel is used to track cleanup operations.
  • the first column in such a spreadsheet may contain the verbatim term; the second column may contain a noise-suppressed verbatim term; the fourth column may contain the spell-checked verbatim term, and so on.
  • Other data cleanup applications such as Metaphone (discussed infra), also reside on this separate computing platform in the illustrated embodiment. However, cleanup applications need not reside on a separate computing platform, or may be accessible via the Internet or other computer network.
  • Noise reduction involves suppression of words and characters that are typically unnecessary in determining the correct name for drug or reaction verbatim.
  • Noise words and characters include, but are not limited to, non-alpha characters (such as numbers, diacriticals, brackets, and control characters), words (e.g., "mg" or "tablet"), and combination words (e.g., "20mg" with no space). For example, both "Tylenol (500 mg)" and "Tylenol Capsules" would be reduced to "Tylenol."
  • a list of noise words and noise punctuation is stored in database tables associated with lexical processing. Non-alpha characters, such as control characters, are also suppressed at this stage.
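A minimal sketch of noise reduction, assuming a small in-memory noise-word set rather than the database tables described above. The word list and function name are illustrative assumptions.

```python
import re

# Hypothetical noise-word list; the described system stores it in database tables.
NOISE_WORDS = {"mg", "tablet", "tablets", "capsule", "capsules"}


def suppress_noise(verbatim):
    """Strip non-alpha characters and noise words from a verbatim term."""
    # Replacing every run of non-ASCII-letter characters with a space also
    # splits combination words such as "20mg" into the bare noise word "mg".
    letters_only = re.sub(r"[^A-Za-z]+", " ", verbatim)
    kept = [w for w in letters_only.split() if w.lower() not in NOISE_WORDS]
    return " ".join(kept)
```

With this list, `suppress_noise("Tylenol (500 mg)")` and `suppress_noise("Tylenol Capsules")` both return `"Tylenol"`, matching the example in the text.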
  • misspellings are detected and adjusted for using known tools such as spell checkers, sound-alike suggestion programs, a verbatim replacement table, and human inspection.
  • a preferred spell checker operates on noise-suppressed verbatim terms, making a series of spelling variations on terms not found in the reference sources. These variations are used as the basis for searching reference sources and suggesting candidate canonical terms.
  • Reference sources include standard and special-purpose dictionaries.
  • the variations introduced include: adding an extra character to the term, e.g., allowing noise-suppressed verbatim such as "proza" to be searched as "Prozac;" removing a character from the term, e.g., allowing noise-suppressed verbatim such as "prozzac" to be searched as "Prozac;" swapping adjacent characters, e.g., allowing noise-suppressed verbatim such as "rpozac" to be searched as "Prozac."
  • a sound-alike program such as Metaphone or Soundex is employed to suggest variations. Metaphone is a published algorithm similar to Soundex. It was originally published in the December 1990 issue of Computer Language magazine.
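The three variations listed above (add a character, remove a character, swap adjacent characters) can be sketched as below. The reference list is a stand-in for the reference sources, and both function names are hypothetical.

```python
import string


def spelling_variations(term):
    """Generate single-edit variants: one insertion, deletion, or adjacent swap."""
    term = term.lower()
    variants = set()
    for i in range(len(term) + 1):
        for ch in string.ascii_lowercase:
            variants.add(term[:i] + ch + term[i:])  # add an extra character
    for i in range(len(term)):
        variants.add(term[:i] + term[i + 1:])  # remove a character
    for i in range(len(term) - 1):
        # swap the characters at positions i and i + 1
        variants.add(term[:i] + term[i + 1] + term[i] + term[i + 2:])
    return variants


def suggest(term, reference_terms):
    """Return reference terms reachable from the verbatim by one variation."""
    lookup = {ref.lower(): ref for ref in reference_terms}
    return sorted(lookup[v] for v in spelling_variations(term) if v in lookup)
```

For a reference list containing "Prozac", the inputs "proza", "prozzac", and "rpozac" each yield "Prozac" as a candidate canonical term.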
  • Metaphone suggester calculates the Metaphone value for each entry in the reference sources and for each unresolved verbatim term.
  • Those reference source terms having a Metaphone value matching that of an unresolved verbatim term will be offered as a suggestion to a database developer for resolution.
  • the Metaphone value for both "prosac" and "prozack" is "PRSK."
  • the Metaphone value for both "Claritin” and "Klariton” is "KLRT.”
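The suggester logic can be sketched with Soundex, which the text names as an alternative sound-alike program. This is not Metaphone: Soundex is swapped in here only because it is short enough to show in full, and the function names are assumptions.

```python
# Standard American Soundex digit classes.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(term):
    """Return the four-character Soundex key for a term."""
    letters = [c for c in term.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def sound_alike_suggestions(verbatim, reference_terms):
    """Offer reference terms whose phonetic key matches the unresolved verbatim."""
    key = soundex(verbatim)
    return [ref for ref in reference_terms if soundex(ref) == key]
```

Because Soundex always keeps the first letter, "Klariton" and "Claritin" receive different keys here, whereas the text reports that Metaphone assigns both the same value. That difference is one reason an implementation might prefer Metaphone.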
  • Preferred embodiments of the invention include steps for capturing and using domain-specific lexical knowledge not easily applied through noise reduction or spell checking. At the basic level, this amounts to use of a replacement table, containing mappings from known errors to corrected canonical terms. On a more sophisticated level, as domain-specific knowledge is accumulated, autocoders are employed to capture human decision-making experience regarding cleanup.
  • Dislocation errors are identified in preferred embodiments of the present invention where a term does not fit the type of the field it is found in, but nonetheless exists in reference sources outside the scope of the particular field.
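Dislocation detection can be sketched as a lookup across per-field reference sets. The two tiny reference sets and the function name below are illustrative assumptions, not the patent's data.

```python
# Hypothetical per-field reference vocabularies.
REFERENCE = {
    "drug": {"prozac", "tylenol"},
    "reaction": {"nausea", "eye pain"},
}


def find_dislocation(field, term):
    """Return the field a term belongs in if it is dislocated, else None."""
    term = term.lower()
    if term in REFERENCE.get(field, set()):
        return None  # the term fits the field it was found in
    for other_field, terms in REFERENCE.items():
        if other_field != field and term in terms:
            return other_field  # valid term, wrong field
    return None  # unknown everywhere: not a dislocation, needs other cleanup
```

A reaction such as "Nausea" entered in the drug field is reported as belonging to the reaction field, while a term unknown to every reference set is left for spell checking or human review.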
  • case includes all data regarding the adverse events experienced by one person taking a drug.
  • a sequence of events regarding a person taking a drug should not be recorded as separate cases (potentially duplicating the adverse events associated with the case). This is important for correct statistical views of the data.
  • the present invention provides tools to operators for identification and consolidation of redundant cases. In preferred embodiments of the present invention, multiple cases involving the same person over a contiguous period are presented to an operator for a determination whether or not such entries actually represent one case with multiple (or possibly single-occurrence, multiple-reported) events.
  • if a case concerning an "eye pain" reaction is amended fifteen times, only one instance of eye pain should be aggregated for this individual case.
  • preferred embodiments of the present invention match successor reports with their predecessors using data inherent in the records, and comparing other information in the records to gauge the quality of the match. For example, two cases may match on a "case identification" field, or a "drug manufacturer identification" field, or a "report date." Those cases known to be redundant, and those cases showing a link between records, are presented to researchers for resolution. In alternate embodiments, resolution between likely redundant cases is accomplished via an expert system.
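One way to surface likely redundant cases for review is to score pairs of records on the fields named above. The field names, the simple count-of-matching-fields score, and the threshold are all assumptions made for illustration.

```python
# Hypothetical field names for the three match fields mentioned in the text.
MATCH_FIELDS = ("case_id", "manufacturer_id", "report_date")


def match_score(a, b):
    """Count how many match fields are present and equal in both records."""
    return sum(
        1 for f in MATCH_FIELDS if a.get(f) is not None and a.get(f) == b.get(f)
    )


def candidate_duplicates(cases, threshold=1):
    """Return (index, index, score) for record pairs worth showing to a reviewer."""
    pairs = []
    for i in range(len(cases)):
        for j in range(i + 1, len(cases)):
            score = match_score(cases[i], cases[j])
            if score >= threshold:
                pairs.append((i, j, score))
    # Strongest matches first, so reviewers see the likeliest duplicates early.
    return sorted(pairs, key=lambda p: -p[2])
```

The pairwise loop is quadratic, which is acceptable for a sketch; a production system would presumably block on a key such as case identification before comparing.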
  • verbatim terms e.g., drug and reaction terms, that have been parsed into a safety database and cleaned, are mapped to "tokens" from the reference data sources.
  • the word "token" refers to the specific term(s), from one or more of the reference sources, that is associated with one or more verbatim terms in a fashion that allows a search for the token to return results containing the verbatim term(s) linked to the token.
  • verbatim term source or cleaned
  • the verbatim term is mapped to the reference term as token.
  • no exact match is found between verbatim (cleaned or otherwise) and reference data terms
  • preferred embodiments of the present invention present a series of steps for resolving such unmatched terms.
  • source data verbatim terms are nominated as token candidates; frequency of occurrence and absolute count being typical bases for nominating a term as a token candidate.
  • verbatim drug and reaction terms are grouped by order of magnitude of absolute count 331.
  • token candidates are chosen from accepted reference sources such as MedDRA, COSTART, and WHOART.
  • token candidates are chosen from corresponding canonical sources such as the National Drug Code Directory (NDCD), WHODRUG, and the Orange Book.
  • NDCD National Drug Code Directory
  • mapping enables those searches of the pharmacovigilance database focused on tokenized fields, e.g., drug and reaction fields, to be executed with greater confidence.
  • variability in adverse event data entry, typically a difficult-to-control aspect of data collection on a large scale, is mitigated as a source of error.
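The effect of linking verbatim terms to a token can be sketched with a small in-memory index. The class name, record layout, and sample terms are hypothetical; the point is only that a search on the token returns records stored under any linked verbatim term.

```python
from collections import defaultdict


class TokenMap:
    """Hypothetical index from a reference token to its linked verbatim terms."""

    def __init__(self):
        self._verbatim_by_token = defaultdict(set)

    def link(self, verbatim, token):
        """Record that a verbatim term maps to a reference token."""
        self._verbatim_by_token[token].add(verbatim)

    def search(self, token, records, field="drug"):
        """Return records whose field holds the token or any linked verbatim."""
        wanted = self._verbatim_by_token.get(token, set()) | {token}
        return [r for r in records if r[field] in wanted]


token_map = TokenMap()
token_map.link("Fluoxetine Hcl", "PROZAC")
token_map.link("prozack", "PROZAC")
```

A search for "PROZAC" then finds records entered as "prozack" or "Fluoxetine Hcl", which is how tokenized fields reduce the impact of data-entry variability.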
  • Figure 2 indicates a stage for mapping 340 SRS and AERS corrected verbatim to NDCD 232, MedDRA 231, and Orange Book 233 canonical terms and structures.
  • a verbatim term source or cleaned
  • the verbatim term is mapped to the reference term as token.
  • Figure 3 illustrates mapping of cleaned 330 source data verbatim drug terms to trade names, and generic/compound names found in NDCD 232 and the FDA Orange Book 233.
  • Figure 4 is a sample interactive screen for resolving non-exact matches.
  • a user is presented with a number of assigned unresolved entries.
  • Preferred embodiments of the invention present the user with any suggestions identified by lexical processing (e.g., Metaphone, fixed list) for each unresolved verbatim term. The user may then select from this list or, as illustrated in Figure 5, enter a surrogate term. After selecting a candidate term (or entering a surrogate term and choosing "consider surrogate"), a list of generic drug names will be shown (if the matched term was indeed a trade name rather than a generic). As illustrated in Figure 6, at this point, a user can either save the mapping or modify the list of generic terms. This last option will allow a user to override the list of generics.
  • Figure 7 illustrates mapping of cleaned source data reaction terms to standardized hierarchies such as WHOART 234, COSTART 235, and MedDRA 231.
  • cleaned source data reaction terms are mapped to multiple levels (and possibly multiple entries within a level) of the hierarchy.
  • mapping of cleaned verbatim reaction terms proceeds in a fashion similar to mapping of drug terms.
  • mapping may be performed on uncleaned (or even unparsed) source data. Transparency in the process of moving from source data verbatim terms to a cleaned safety database with verbatim terms mapped to tokens is important to both database developers/operators and to end users.
  • Preferred embodiments of the present invention capture the way source data terms have been cleaned and mapped as the "pedigree" of each term.
  • the "pedigree" of a term is the link between the mapped term and the decisions made during data cleanup. End users typically wish to verify the pedigree of the data they use.
  • retained data includes one or more of the following as appropriate: verbatim term, token mapped to, source of the verbatim term, number of occurrences of the verbatim term, number of cases in which the verbatim term appears, which type of cleanup (if any) was performed, a cross-reference to where the token is defined, and dates of the earliest and latest reported occurrence.
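The retained pedigree data listed above maps naturally onto a record type. The sketch below assumes hypothetical field names, and the counts and dates in the example instance are invented purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Pedigree:
    """One verbatim-to-token mapping together with its cleanup history."""

    verbatim: str                  # term as found in the source data
    token: str                     # reference term it was mapped to
    source: str                    # source database of the verbatim term
    incidents: int                 # number of occurrences of the verbatim term
    case_count: int                # number of cases containing the term
    cleanup_steps: List[str] = field(default_factory=list)
    cross_reference: str = ""      # where the token is defined
    first_reported: Optional[date] = None
    last_reported: Optional[date] = None


# Illustrative values only; the counts and dates are not from the source.
example = Pedigree(
    verbatim="Fluoxetine Hcl",
    token="PROZAC",
    source="AERS",
    incidents=42,
    case_count=40,
    cleanup_steps=["noise word correction"],
    cross_reference="Orange Book",
    first_reported=date(1998, 1, 15),
    last_reported=date(1999, 6, 30),
)
```

Keeping the cleanup steps alongside the mapping is what lets an end user trace a token back to the decisions that produced it.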
  • An exemplary pedigree screen from an illustrative embodiment of the invention disclosed in a related patent application is presented in Figure 8.
  • the screen illustrates the nature of mapping in accordance with the present invention, and a manner in which the pedigree of a drug term can be used.
  • the "Map To" column 600 shows the generic name or trade name token, e.g., "PROZAC," to which the "Verbatim" 601 term, e.g., "Fluoxetine Hcl," is mapped.
  • the verbatim term can be any form of the name under which this drug was found in the "Source" 602, e.g., "AERS" data, including misspellings, variations, etc.
  • the "Incidents" column 603 represents the number of times the verbatim term occurs in the indicated source data, while the "Case Count" column 604 discloses the number of cases in which the verbatim term appears in the source data.
  • the "QEDRx Processing" column 605 indicates the type of cleanup that has been performed on the data. In this particular embodiment, the sub-columns in order under "QEDRx Processing" indicate: spelling correction; noise word correction; combo word correction; removal of numerics; and removal of marks.
  • the "Cross-Reference" column 606 indicates which reference source the "Map To" term is associated with. Finally, "First/Last Reported Reactions" 607 indicates the date range from the earliest to latest cases containing the verbatim term.
  • An exemplary screen from an illustrative embodiment of an invention disclosed in a related application is presented in Figure 9.
  • the screen illustrates, among other things, the nature of mapping of verbatim reaction data to the MedDRA reaction hierarchy.
  • the verbatim data is identified under the heading "as reported," e.g., "Hypotension NOS."
  • Subsequent columns map the verbatim to MedDRA preferred terms (e.g., Hypotension NOS), high level terms (e.g., Hypotension), high level group term (e.g., decreased and nonspecific blood pressure disorders and shock), and system/organ/class term (e.g., vascular disorders).
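The hierarchy mapping in this example can be sketched as a lookup from the reported term to its path through the four levels named above. The single entry reuses the example values from the text; the type and function names are hypothetical, and the sketch assumes one path per term even though a term may map to multiple entries within a level.

```python
from typing import NamedTuple, Optional


class MedDRAPath(NamedTuple):
    preferred_term: str
    high_level_term: str
    high_level_group_term: str
    system_organ_class: str


# One illustrative entry, using the example values given in the text.
HIERARCHY = {
    "hypotension nos": MedDRAPath(
        preferred_term="Hypotension NOS",
        high_level_term="Hypotension",
        high_level_group_term=(
            "Decreased and nonspecific blood pressure disorders and shock"
        ),
        system_organ_class="Vascular disorders",
    )
}


def roll_up(as_reported: str, level: str) -> Optional[str]:
    """Return the hierarchy term at the requested level, or None if unmapped."""
    path = HIERARCHY.get(as_reported.strip().lower())
    return getattr(path, level) if path else None
```

Rolling verbatim terms up to a higher level lets a query on "Vascular disorders" include every case reported as "Hypotension NOS".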
  • Preferred embodiments of the present invention include those implemented on a single computer or across a network of computers, e.g., a local area network or the Internet.
  • Preferred embodiments include implementations on computer-readable media storing a computer program product performing one or more of the steps described herein.
  • Such a computer program product contains modules implementing the steps as functions inter-related as described herein.
  • Preferred embodiments of the invention include the unique data structures described herein, encoded on a computer-readable medium and computer signals transmissible over a computer/communications network.

Abstract

The present invention includes a preferred method for developing a pharmacovigilance database (100) from source data (200). The method includes the steps of parsing (310) source data (200) into a relational database structure, performing cleanup (330) on the parsed data, and mapping (340) cleaned parsed source data to reference data (230). While the preferred methods of the present invention parse source data (200) prior to performing cleanup (330), cleanup (330) can be performed independently of, and prior to, parsing (310).

Description

SPECIFICATION
[PHARMACOVIGILANCE DATABASE]
Cross Reference to Related Applications
This application is related to the following co-pending applications, each filed May 2, 2001, and incorporates the disclosure of these applications by reference in their entirety: A Method and System for Analyzing Drug Adverse Effects; A Method and System for Web-Based Analysis of Drug Adverse Effects; Method and System for Graphically Depicting Drug Adverse Effects Risk; A Method and System for Analyzing Drug Adverse Effects Employing Multivariate Statistical Analysis.
Background of Invention
[0001] FIELD OF THE INVENTION. The present invention relates generally to systems and methods for developing a pharmacovigilance database from source data, both publicly available and privately developed, and reference data.
[0002] DESCRIPTION OF THE RELATED ART. In September 1997, information regarding cardiopulmonary disease related to the use of fenfluramine and phentermine ("fen-phen") prompted the United States Food and Drug Administration (FDA) to request the manufacturers of these drugs to voluntarily withdraw both treatments for obesity from the market. Subsequent studies show a 25 percent incidence of heart valve disease apparently resulting from diet drug use. Thus, up to 1,250,000 people may have sustained heart valve damage from these diet drugs, and the FDA indicates that this may be the largest adverse drug effect it has ever dealt with.
[0003] Under existing federal regulations, post-marketing safety reports are to be submitted to the FDA for serious and unexpected adverse experiences from all sources (domestic and foreign); and spontaneously reported adverse experiences that occur domestically and that are: serious and expected; or non-serious and unexpected; or non-serious and expected.
[0004] To facilitate reporting and data analysis, the FDA created the Spontaneous Reporting System (SRS), a pharmacovigilance database. The SRS contains adverse drug reaction reports from a variety of sources over a period covering 1969 through 1997. This data is available in an ASCII flat file from the FDA. However, the flat file, by its nature, is not a relationally structured database amenable to typical query.
[0005] Over the past several years (1997-2001), the FDA has implemented a follow-on system to SRS, i.e., the Adverse Event Reporting System (AERS). AERS is a non-cumulative database of post-market drug adverse events. Its purpose is to serve as an early warning indicator or signaling system for adverse drug reactions not detected during pre-market testing. The data, without a search engine, is available on CD ROM from the federal government in a combination of ASCII delimited flat file and SGML format. The files include: demographic and administrative information; along with drug, reaction, patient outcome, and source for each case.
[0006] Beyond SRS and AERS, pharmaceutical companies, hospitals, and other entities have also been known to track adverse drug effects; often using unique database structures. The existence of these various databases using different structures presents an obstacle to efficient use of potentially valuable data. As with SRS and AERS, database structure can vary within an organization over time, and also between concurrent adverse event databases. Such variability makes it cumbersome to query across databases.
[0007] In addition, differing terminology employed by disparate databases also makes conventional queries cumbersome and the results unreliable. This problem is acute in the area of medical information related to substances such as drugs. Drugs and other prescription and non-prescription therapeutic substances may be known by a variety of names. In addition to the chemical name, many drugs have several clinical names recognized by health care professionals in the field. It is not uncommon for a drug to have several different trade names depending on the manufacturer. This matter is further complicated by one or more functional names that may be associated with a drug or other substance. For example, an antidepressant may be Prozac, a fluoxetine, a serotonin reuptake inhibitor, or a serotonin receptor specific modulator. However, antidepressants include many other drugs, such as lithium and other catecholaminergic drugs, and there are serotonin reuptake inhibitors in addition to Prozac. Even "standardized" terminology can differ between databases. For example, some adverse event databases request reaction terminology consistent with the Medical Dictionary for Regulatory Activities (MedDRA™), while other databases request, or already contain, input consistent with World Health Organization Adverse Reaction Terminology (WHO-ART) or Coding Symbols for a Thesaurus of Adverse Reaction Terms (COSTART) developed and maintained by the FDA's Center for Drug Evaluation and Research.
[0008] Further, data corruption in databases such as SRS and AERS is acknowledged, but not quantified, by the proponents. Data corruption at the database field level can include extraneous non-alpha characters, noise words, misspellings, and dislocations (e.g., data that is valid for one field, erroneously entered into another, inappropriate field). Databases that allow entry of free text information are especially susceptible to data corruption. At a higher level, existing adverse event databases have been known to contain redundant cases documenting the same adverse event.
[0009] U.S. Patent No. 5,634,053 to Noble et al., "Federated Information Management (FIM) System And Method For Providing Data Site Filtering And Translation For Heterogeneous Databases" discloses an information management system that integrates data from a plurality of interconnected local databases to provide users with access to a virtual database. The system includes a user interface for generating a global query to search the virtual database, a smart dictionary database that contains configuration data, a data information manager that decomposes the global query into local queries, and a plurality of local information managers that execute the local queries to search for and retrieve data from the enumerated databases. A filter generates a list of those local databases that contain information relevant to the global query. As a result, the data information manager only generates local queries for the enumerated local databases. An input translator converts the global query into the respective local formats for the local databases so that the system provides true integration of heterogeneous databases. An output translator converts the data retrieved from each local database into a uniform input/output format so that the data presented to the user is integrated. The user typically selects the input/output format as his or her local format or a global format associated with the virtual database.
[0010] U.S. Patent No. 5,664,109 to Johnson et al., "Method For Extracting Pre-Defined Data Items From Medical Service Records Generated By Health Care Providers" discloses a central medical record repository for a managed health care organization that accepts and stores medical record documents in any format from medical service providers. The repository then identifies the document using information automatically extracted from the document and stores the extracted data in a document database. The repository links the document to a patient by extracting from the document demographic data identifying the patient and matching it to data stored in a patient database. Data is extracted automatically from medical records containing "unstructured" or free form text by identifying conventional organization components in the text and is organized by executing rules that extract data with the aid of such information. Documents for a patient are retrieved by identifying the patient using demographic data.
[0011] U.S. Patent No. 5,845,255 to Mayaud, "Prescription Management System" discloses a wirelessly deployable, electronic prescription creation system for physician use which captures into a prescription a patient condition-objective of the prescribed treatment and provides for patient record assembly from source elements, with privacy controls for patient and doctor, adverse indication review and online access to comprehensive drug information including scientific literature. Extensions to novel multi-drug packages and dispensing devices, and an "intelligent network" remote data retrieval architecture as well as onscreen physician-to-pharmacy and physician-to-physician e-mail are also provided.
[0012] U.S. Patent No. 5,911,132 to Sloane, "Method (Of) Using (A) Central Epidemiological Database" discloses a method in which patient disease is diagnosed and/or treated using electronic data communications between not only the physician and his/her patient, but via the use of electronic data communications between the physician and one or more entities which can contribute to the patient's diagnosis and/or treatment, such electronic data communications including information that was previously received electronically from the patient and/or was developed as a consequence of an electronic messaging interaction that occurred between the patient and the physician. Such other entities illustratively include a medical diagnostic center and an epidemiological database computer facility that collects epidemiological transaction records from physicians, hospitals, and other institutions that have medical facilities, such as schools and large businesses. The epidemiological transaction record illustratively includes various medical, personal, and epidemiological data relevant to the patient and his/her present symptoms, including test results, as well as the diagnosis, if one has already been arrived at by the e-doc. The epidemiological database computer facility can correlate this information with the other epidemiological transaction records that it receives over time in order to help physicians make and/or confirm diagnoses as well as to identify and track epidemiological events and/or trends.
[0013] U.S. Patent No. 5,924,074 to Evans, "Electronic Medical Records System" discloses a medical records system that creates and maintains all patient data electronically. The system captures patient data, such as patient complaints, lab orders, medications, diagnoses, and procedures, at its source at the time of entry using a graphical user interface having touch screens. Using pen-based portable computers with wireless connections to a computer network, authorized healthcare providers can access, analyze, update, and electronically annotate patient data even while other providers are using the same patient record. The system likewise permits instant, sophisticated analysis of patient data to identify relationships among the data considered. Moreover, the system includes the capability to access reference databases for consultation regarding allergies, medication interactions, and practice guidelines. The system also includes the capability to incorporate legacy data, such as paper files and mainframe data, for a patient.
[0014] U.S. Patent No. 6,076,088 to Paik et al., "Information Extraction System And Method Using Concept Relation Concept (CRC) Triples" discloses an information extraction system that allows users to ask questions about documents in a database, and responds to queries by returning possibly relevant information that is extracted from the documents. The system is domain-independent, and automatically builds its own subject knowledge base. It can be applied to any new corpus of text with quick results, and no requirement for lengthy manual input. For this reason, it is also a dynamic system that can acquire new knowledge and add it to the knowledge base immediately by automatically identifying new names, events, or concepts.
[0015] U.S. Patent No. 6,128,620 to Pissanos et al., "Medical Database For Litigation" discloses a medical database and associated methods that are especially suited for compiling information in a medical malpractice situation. A general medical database is provided and specific medical information corresponding to a given situation is entered. Entry of the information automatically cross-references some terms of the entered data to definitions in the general medical database. Terms are readily looked up when reviewing specific medical information and definitions are easily inserted where desired. A drug reference display provides two-way lookup from drugs to their side effects (or contraindications or interactions) and back. Significant information from an entered medical chronology is easily copied to a significant information section when a reviewer finds the information important.
[0016] U.S. Patent No. 6,219,674 to Classen, "System for creating and managing proprietary product data" discloses systems and methods for creating and using product data to enhance the safety of a medical or non-medical product. The systems receive vast amounts of data regarding adverse events associated with a particular product and analyze the data in light of already known adverse events associated with the product. The system develops at least one proprietary database of newly discovered adverse event information and new uses for the product and may catalog adverse event information for a large number of population sub-groups. The system may also be programmed to incorporate the information into intellectual property and contract documents. Manufacturers can include the information in consumer product information that they provide to consumers or, in the case of certain medical products, prescribers of the medical products.
[0017] None of the above references, alone or in combination, addresses improving the quality of the underlying verbatim adverse drug event data. Nor do the references address mapping this underlying data to accepted pharmaceutical community terms and hierarchies. Specifically, the references do not address parsing of flat file adverse drug event data into a relational database structure to support efficient query. The problem of differing terminology in the data fields of disparate databases also remains unaddressed, as does the problem of data corruption in the form of misspelling and extraneous characters, along with resolution of redundant cases.
[0018] In view of the above-described deficiencies associated with data concerning drugs and other substances associated with medical databases, especially known adverse event databases, there is a need to solve these problems and enhance the quality and accuracy of such data. These enhancements and benefits are described in detail herein below with respect to several alternative embodiments of the present invention.
Summary of Invention
[0019] The present invention in its several disclosed embodiments alleviates the drawbacks described above with respect to existing adverse event databases and incorporates several additionally beneficial features.
[0020] In a preferred embodiment, the present invention is a method for developing a pharmacovigilance database from source data and reference data. The unedited source data contains verbatim terms. The method includes parsing source data into a relational database; performing cleanup on the relational database; and mapping verbatim terms from the cleaned database to at least one token from at least one reference source.
[0021] Cleanup includes removing redundant entries, correcting misspellings, removing irrelevant non-alpha characters and noise words, and relocating dislocated terms. When the source or reference data spans more than one generation, preferred embodiments standardize and map historical terms to current terms. Where the choice of the reference data is itself an option, preferred embodiments incorporate a method for selecting the reference data source and the automatically propagated correction and mapping rules associated with that choice. Mapping verbatim terms to tokens includes nominating tokens from the source data, choosing tokens from the reference sources, and linking chosen tokens to corresponding verbatim terms. In one embodiment, the history of cleanup and mapping is saved as the pedigree of the verbatim-to-token mapping.
[0022] It will be appreciated that such a system and method for developing a pharmacovigilance database is advantageous to the various risk assessors in the pharmaceutical field. Pharmaceutical industry personnel would have higher quality data (i.e., more current, complete, and accurate data) with which to monitor and manage drugs in the marketplace. Marketing and sales personnel could employ such a database to understand and position a drug to optimum advantage to patients and physicians. Research & development personnel could assess drugs planned for market introduction in light of adverse event reports of other drugs in the same chemical or therapeutic class. Regulators and the public could benefit from higher quality data available as a basis for labeling drugs.
[0023] It is an object of the present invention to integrate disparate adverse drug effect databases into a structure amenable to efficient query.
[0024] It is a further object of the present invention to mitigate the effect of data corruption on adverse drug event databases.
[0025] It is an object of the present invention to develop an adverse drug effect database amenable to query using canonical terms accepted in the pharmaceutical industry. Linking cases to standard vocabulary for data such as drug name and reaction enables meaningful statistical comparisons to be made.
[0026] The beneficial effects described above apply generally to the exemplary systems and methods for developing a pharmacovigilance database. The specific structures through which these benefits are delivered will be described in detail hereinbelow.
Brief Description of Drawings
[0027] The invention will now be described in detail, by way of example without limitation thereto and with reference to the attached figures.
[0028] Figure 1 illustrates a method of the present invention for development of a pharmacovigilance database.
[0029] Figure 2 illustrates a specific implementation of a pharmacovigilance database of the present invention.
[0030] Figure 3 illustrates mapping of cleaned verbatim source data drug terms to trade and generic canonical terms from the National Drug Code Directory and the Food and Drug Administration Orange Book.
[0031] Figure 4 is a sample window illustrating how a list of unresolved drug verbatim terms is presented to an operator along with suggestions for resolution.
[0032] Figure 5 is a sample window illustrating how an individual unresolved drug verbatim entry may be presented to an operator. Figure 6 is a sample window illustrating how an operator may effect resolution of an unresolved drug verbatim entry.
[0033] Figure 7 illustrates mapping of cleaned verbatim source data reaction terms to WHOART, COSTART and MedDRA reaction terms and hierarchies.
[0034] Figure 8 illustrates mapping of source data drug terms to reference source "map to" tokens.
[0035] Figure 9 illustrates mapping of source data reaction terms to reference source "map to" tokens for the MedDRA reaction hierarchy.
Detailed Description
[0036] As required, detailed preferred embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale, some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present invention.
[0037] Referring to Figure 1, the present invention includes a preferred method for developing a pharmacovigilance database 100 from source data 200. The method includes the steps of parsing 310 source data 200 into a relational database structure, performing cleanup 330 on the parsed data, and mapping 340 cleaned parsed source data to reference data 230. While the preferred methods of the present invention parse source data 200 prior to performing cleanup 330, cleanup 330 can be performed independently of, and prior to, parsing. In one embodiment, the source data 200 is already in a relational database structure amenable to embodiments of the present invention.
[0038] Referring to Figure 2, a specific implementation of a preferred method of the present invention is illustrated. In this embodiment, the source data includes, but is not limited to, SRS 210 and AERS 220. In addition to demographic information, AERS and SRS call for reports on outcomes, report source, and concomitant drugs. Other data and sources (domestic, foreign, or international in scope) can serve as inputs to the process. Preferred embodiments accommodate adverse event data from pharmaceutical corporations, hospitals, physicians, and health insurers, along with data from state, federal, and international agencies. The primary sources of the pharmaceutical industry data are individual adverse event databases of the pharmaceutical corporation safety departments. In each case, source data may be focused on clinical trials, post-market surveillance, research databases, or the like. The unedited data in each source database is referred to as "verbatim." In addition to source data, reference data from accepted canonical references, e.g., MedDRA™ 231, National Drug Code Directory 232, and FDA Orange Book 233, is used in preferred embodiments of the present invention. Preferred embodiments of the invention also link to genomic and proteomic data. Preferred embodiments of the invention provide means to substitute and manage both source and reference data.
[0039] The method illustrated in Figure 2 includes parsing 310 source data 210, 220 into a relational database structure. For data sources such as SRS and AERS not already in a relational database structure, transformation from raw source data 210, 220 to a relational structure preferably includes parsing each data source into an image 122, 124 with fields tailored to its corresponding source. Subsequently, the images 122, 124 are consolidated 320 into a single safety tablespace 110. Since the database can be simple or complex, the present invention provides the ability to add many "dimensions" (e.g., age, sex, dates, reactions, doses, outcomes, report source, concomitant drugs): some structured, some narrative, some numerical, and many categorical variables such as reaction. Hierarchies in all dimensions (in both preferred and custom paths) are definable as required by the particular end user.
[0040] Since several of the most favored data sources are not published in a format that lends itself to direct query, e.g., SRS is available from the U.S. Government only as delimited ASCII data, parsing such data into a relational database model allows the use of data management tools that are ineffective on flat files. In preferred embodiments of the present invention, the safety tablespace 110 provides a common set of fields for the parsed source data 122, 124.
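By way of illustration only, the parsing step can be sketched as follows. This is a minimal sketch, not the disclosed implementation: the "$" delimiter, the field names, the table name "safety", and the sample rows are all assumptions made for the example and do not reflect the actual SRS or AERS layouts.

```python
import csv
import io
import sqlite3

# Hypothetical delimited flat-file extract. The "$" delimiter and the field
# names are illustrative assumptions, not the real SRS/AERS record layout.
FLAT_FILE = """case_id$drug$reaction
1001$Tylenol (500 mg)$HEADACHE
1002$prozzac$Hypotension NOS
1002$prozzac$NAUSEA
"""

def parse_to_relational(text, conn):
    """Load delimited rows into one relational table; verbatim values are kept unedited."""
    conn.execute(
        "CREATE TABLE safety (case_id INTEGER, drug_verbatim TEXT, "
        "reaction_verbatim TEXT)"
    )
    reader = csv.DictReader(io.StringIO(text), delimiter="$")
    rows = [(int(r["case_id"]), r["drug"], r["reaction"]) for r in reader]
    conn.executemany("INSERT INTO safety VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
loaded = parse_to_relational(FLAT_FILE, conn)

# Once the data is relational, ordinary SQL applies, e.g. counting how often
# a verbatim term occurs (incidents) and in how many distinct cases.
incidents, cases = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT case_id) FROM safety "
    "WHERE drug_verbatim = ?",
    ("prozzac",),
).fetchone()
```

The final query shows the kind of question that is awkward to ask of a flat file but direct once the data is in tables.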
[0041] Data cleanup may be performed independently of parsing source data into a safety database. This allows cleanup to be continual, ongoing, and iterative; either before or after one or more source databases are processed into the pharmacovigilance database. Adverse event database cleanup is an incremental process, proceeding from automated cleanup of certain errors, through human-assisted cleanup of ambiguous entries, to human correction of identified gross errors. Specific cleanup tasks include noise reduction (e.g., suppression of non-alpha characters, noise words, and combination words); adjustment for misspellings; adjustment for dislocations; and resolution of possible redundant entries. In the preferred embodiment illustrated in Figure 2, reactions, drugs, and counts of the occurrence (by case and absolute) of each are extracted 331 from the parsed AERS data 124. The counts are then grouped 331; in this embodiment, grouping is by order of magnitude of the count. In the preferred embodiment illustrated in Figure 2, the bulk of data cleanup 330 is performed on a computing platform separate from database storage. A spreadsheet application, such as Microsoft Excel, is used to track cleanup operations. For example, the first column in such a spreadsheet may contain the verbatim term; the second column may contain a noise-suppressed verbatim term; the fourth column may contain the spell-checked verbatim term, and so on. Other data cleanup applications, such as Metaphone (discussed infra), also reside on this separate computing platform in the illustrated embodiment. However, cleanup applications need not reside on a separate computing platform, or may be accessible via the Internet or other computer network. Noise reduction involves suppression of words and characters that are typically unnecessary in determining the correct name for drug or reaction verbatim. Noise words and characters include, but are not limited to, non-alpha characters (such as numbers, diacriticals, brackets, and control characters), words (e.g., "mg" or "tablet"), and combination words (e.g., "20mg" with no space). For example, both "Tylenol (500 mg)" and "Tylenol Capsules" would be reduced to "Tylenol." A list of noise words and noise punctuation is stored in database tables associated with lexical processing. Non-alpha characters, such as control characters, are also suppressed at this stage.
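The noise reduction described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the noise-word list is a hypothetical stand-in for the database tables mentioned in the text, and the function name is invented for the example. The function returns a new string, so the underlying verbatim term is left unchanged.

```python
import re

# Illustrative noise-word list. The specification stores its list in database
# tables; these particular entries are assumptions for the example.
NOISE_WORDS = {"mg", "ml", "tablet", "tablets", "capsule", "capsules"}

def suppress_noise(verbatim):
    """Return a noise-suppressed copy of a verbatim term."""
    # Replace every run of non-letters (digits, brackets, control characters)
    # with a space. This also splits combination words such as "20mg".
    letters_only = re.sub(r"[^A-Za-z]+", " ", verbatim)
    # Drop any remaining word that appears in the noise-word list.
    kept = [w for w in letters_only.split() if w.lower() not in NOISE_WORDS]
    return " ".join(kept)

examples = ["Tylenol (500 mg)", "Tylenol Capsules", "Prozac 20mg"]
cleaned = [suppress_noise(v) for v in examples]
```

Here the first two examples from the text both reduce to "Tylenol", and the combination word "20mg" is removed from the third.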
[0042] After noise reduction, misspellings are detected and adjusted for using known tools such as spell checkers, sound-alike suggestion programs, a verbatim replacement table, and human inspection.
[0043] A preferred spell checker operates on noise-suppressed verbatim terms, making a series of spelling variations on terms not found in the reference sources. These variations are used as the basis for searching reference sources and suggesting candidate canonical terms. Reference sources include standard and special-purpose dictionaries. The variations introduced include: adding an extra character to the term, e.g., allowing noise-suppressed verbatim such as "proza" to be searched as "Prozac;" removing a character from the term, e.g., allowing noise-suppressed verbatim such as "prozzac" to be searched as "Prozac;" swapping adjacent characters, e.g., allowing noise-suppressed verbatim such as "rpozac" to be searched as "Prozac." In addition to a spelling suggester, a sound-alike program, such as Metaphone or Soundex, is employed to suggest variations. Metaphone is a published algorithm similar to Soundex. It was originally published in the December 1990 issue of Computer Language magazine. Every word has a four-letter Metaphone value that can be calculated. The Metaphone suggester calculates the Metaphone value for each entry in the reference sources and for each unresolved verbatim term. Those reference source terms having a Metaphone value matching that of an unresolved verbatim term will be offered as a suggestion to a database developer for resolution. For example, the Metaphone value for both "prosac" and "prozack" is "PRSK;" the Metaphone value for both "Claritin" and "Klariton" is "KLRT." Where no candidates satisfy the developer, an option is provided for accepting a surrogate term from the developer.
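The three spelling variations described above (adding a character, removing a character, and swapping adjacent characters) can be sketched as follows. This is a minimal sketch only: the small reference set and the function names are assumptions for the example, and the sound-alike (Metaphone or Soundex) suggester is not shown.

```python
import string

# Hypothetical stand-in for the reference sources (lower-cased for matching).
REFERENCE_TERMS = {"prozac", "claritin", "tylenol"}

def spelling_variations(term):
    """Generate single-edit variations of a noise-suppressed term."""
    term = term.lower()
    variations = set()
    # Add one extra character at every position ("proza" -> "prozac").
    for i in range(len(term) + 1):
        for ch in string.ascii_lowercase:
            variations.add(term[:i] + ch + term[i:])
    # Remove one character at every position ("prozzac" -> "prozac").
    for i in range(len(term)):
        variations.add(term[:i] + term[i + 1:])
    # Swap each pair of adjacent characters ("rpozac" -> "prozac").
    for i in range(len(term) - 1):
        variations.add(term[:i] + term[i + 1] + term[i] + term[i + 2:])
    return variations

def suggest(term):
    """Return reference terms reachable from the term by one variation."""
    return sorted(spelling_variations(term) & REFERENCE_TERMS)

suggestions = {t: suggest(t) for t in ("proza", "prozzac", "rpozac")}
```

Each of the three misspellings from the text yields the single candidate "prozac", which would then be offered to the developer rather than applied automatically.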
[0044] Preferred embodiments of the invention include steps for capturing and using domain-specific lexical knowledge not easily applied through noise reduction or spell checking. At the basic level, this amounts to use of a replacement table, containing mappings from known errors to corrected canonical terms. On a more sophisticated level, as domain-specific knowledge is accumulated, autocoders are employed to capture human decision-making experience regarding cleanup.
[0045] Human interaction is particularly useful in identification and correction of dislocation errors, i.e., where a term valid in one field (e.g., headache/reaction) appears in a field where it is not valid (e.g., headache/drug). Dislocation errors are identified in preferred embodiments of the present invention where a term does not fit the type of the field it is found in, but nonetheless exists in reference sources outside the scope of the particular field.
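The dislocation test described above can be sketched as a lookup across per-field reference vocabularies. This is an illustrative sketch: the vocabularies, field names, and function name are assumptions for the example, and in the described embodiments a flagged term would be passed to a human for correction.

```python
# Hypothetical per-field reference vocabularies (lower-cased for matching).
REFERENCE = {
    "drug": {"prozac", "tylenol"},
    "reaction": {"headache", "nausea"},
}

def find_dislocation(field, term):
    """Return the field a term appears to belong to, or None.

    A term is flagged only when it is invalid for the field it was found in
    but does exist in the reference vocabulary of some other field.
    """
    term = term.lower()
    if term in REFERENCE[field]:
        return None  # valid where it was found
    for other_field, vocabulary in REFERENCE.items():
        if other_field != field and term in vocabulary:
            return other_field  # candidate dislocation for review
    return None  # unknown everywhere: a spelling problem, not a dislocation

flagged = find_dislocation("drug", "Headache")
```

In this example "Headache" found in the drug field is flagged as belonging to the reaction field.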
[0046] Redundant entries are identified and removed with operator assistance. A "case" includes all data regarding the adverse events experienced by one person taking a drug. A sequence of events regarding a person taking a drug should not be recorded as separate cases (potentially duplicating the adverse events associated with the case). This is important for correct statistical views of the data. The present invention provides tools to operators for identification and consolidation of redundant cases. In preferred embodiments of the present invention, multiple cases involving the same person over a contiguous period are presented to an operator for a determination whether or not such entries actually represent one case with multiple (or possibly single-occurrence, multiple-reported) events.
[0047] If a case concerning an "eye pain" reaction is amended fifteen times, only one instance of eye pain should be aggregated for this individual case. Through record linking, preferred embodiments of the present invention match successor reports with their predecessors using data inherent in the records, and comparing other information in the records to gauge the quality of the match. For example, two cases may match on a "case identification" field, a "drug manufacturer identification" field, or a "report date." Those cases known to be redundant, and those cases showing a link between records, are presented to researchers for resolution. In alternate embodiments, resolution between likely redundant cases is accomplished via an expert system.
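The aggregation rule in the "eye pain" example can be sketched as follows. This is a minimal sketch under stated assumptions: reports are linked on a single hypothetical "case_id" field, whereas the text contemplates matching on several fields and gauging match quality; the records and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical reports: case C-17 was amended twice after the initial report.
reports = [
    {"report_id": 1, "case_id": "C-17", "reaction": "eye pain"},
    {"report_id": 2, "case_id": "C-17", "reaction": "eye pain"},
    {"report_id": 3, "case_id": "C-17", "reaction": "eye pain"},
    {"report_id": 4, "case_id": "C-42", "reaction": "eye pain"},
]

def link_by_case(reports):
    """Group successor reports with their predecessors by case identifier."""
    linked = defaultdict(list)
    for report in reports:
        linked[report["case_id"]].append(report)
    return linked

def reaction_case_counts(reports):
    """Count each reaction at most once per linked case."""
    counts = defaultdict(int)
    for case_reports in link_by_case(reports).values():
        for reaction in {r["reaction"] for r in case_reports}:
            counts[reaction] += 1
    return dict(counts)

counts = reaction_case_counts(reports)
# Cases with more than one linked report are candidates for operator review.
review_queue = sorted(c for c, rs in link_by_case(reports).items() if len(rs) > 1)
```

Four reports mention eye pain, but only two cases do, so the aggregated count is two, and case C-17 is queued for review.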
[0048] Note that the underlying verbatim terms are not changed by application of noise suppression, the use of spell checkers, the resolution of dislocations, or the resolution of redundant entries. Verbatim terms, e.g., drug and reaction terms, that have been parsed into a safety database and cleaned, are mapped to "tokens" from the reference data sources. The word "token" refers to the specific term(s), from one or more of the reference sources, that is associated with one or more verbatim terms in a fashion that allows a search for the token to return results containing the verbatim term(s) linked to the token. Where an exact match exists between a verbatim term (source or cleaned) and a reference term, the verbatim term is mapped to the reference term as token. Where no exact match is found between verbatim (cleaned or otherwise) and reference data terms, preferred embodiments of the present invention present a series of steps for resolving such unmatched terms.
[0049] In addition to corruption in verbatim data, valid variations in terminology may also be resolved through mapping to reference data tokens. For example, "PROZAC" and other trade names for fluoxetine are preferably mapped to the generic "fluoxetine." In another example, luliberin, gonadotropin releasing hormone, GnRH, gonadotropin releasing factor, luteinizing hormone releasing hormone, LHRH, and LH-FSH RH are equivalents and may be considered as such for analyzing adverse effects. Furthermore, different chemical derivatives, such as acidic or basic forms of the same drug, may be grouped together, where a reference data term exists, under the same token in order to analyze adverse drug events. In some embodiments of the invention, source data verbatim terms are nominated as token candidates; frequency of occurrence and absolute count being typical bases for nominating a term as a token candidate. In Figure 2, verbatim drug and reaction terms are grouped by order of magnitude of absolute count 331. For reactions, token candidates are chosen from accepted reference sources such as MedDRA, COSTART, and WHOART. For drugs, token candidates are chosen from corresponding canonical sources such as the National Drug Code Directory (NDCD), WHODRUG, and the Orange Book. Individual verbatim terms are then mapped to the selected tokens. In preferred embodiments, this process is used for multiple database dimensions in addition to drug and reaction, e.g., outcomes where the definition of "serious" outcomes can differ over time and between reference sources. This mapping enables those searches of the pharmacovigilance database focused on tokenized fields, e.g., drug and reaction fields, to be executed with greater confidence. Using the mapping approach, variability in adverse event data entry, typically a difficult-to-control aspect of data collection on a large scale, is mitigated as a source of error.
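The nomination of token candidates by absolute count, and the grouping of verbatim terms by order of magnitude of that count, might be sketched as below. The function names and the `min_count` cutoff are illustrative assumptions; the text does not specify how the threshold is chosen.

```python
from collections import Counter


def magnitude(count: int) -> int:
    """Order of magnitude of a positive count: 1-9 -> 0, 10-99 -> 1, and so on."""
    return len(str(count)) - 1


def group_by_magnitude(counts):
    """Group verbatim terms by the order of magnitude of their absolute count."""
    groups = {}
    for term, n in counts.items():
        groups.setdefault(magnitude(n), []).append(term)
    return {m: sorted(terms) for m, terms in groups.items()}


def nominate(counts, min_count):
    """Nominate terms occurring at least min_count times, most frequent first."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, n in ranked if n >= min_count]
```

For example, given `counts = Counter(terms)` over a column of verbatim drug terms, `nominate(counts, 10)` lists the terms frequent enough to be reviewed against the reference sources first.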
[0050] Figure 2 indicates a stage for mapping 340 SRS and AERS corrected verbatim to NDCD 232, MedDRA 231, and Orange Book 233 canonical terms and structures. As noted earlier, where an exact match exists between a verbatim term (source or cleaned) and a reference term, the verbatim term is mapped to the reference term as a token. Where no exact match is found between verbatim (cleaned or otherwise) and reference data terms, preferred embodiments of the present invention present a series of steps for resolving such unmatched terms. Figure 3 illustrates mapping of cleaned 330 source data verbatim drug terms to trade names, and generic/compound names found in NDCD 232 and the FDA Orange Book 233. Figure 4 is a sample interactive screen for resolving non-exact matches. In this sample screen, a user is presented with a number of assigned unresolved entries. Preferred embodiments of the invention present the user with any suggestions identified by lexical processing (e.g., Metaphone, fixed list) for each unresolved verbatim term. The user may then select from this list or, as illustrated in Figure 5, enter a surrogate term. After selecting a candidate term (or entering a surrogate term and choosing "consider surrogate"), a list of generic drug names will be shown (if the matched term was indeed a trade name rather than a generic). As illustrated in Figure 6, at this point, a user can either save the mapping or modify the list of generic terms. This last option will allow a user to override the list of generics.
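The suggestion step for unresolved terms can be illustrated with a short sketch. The text names Metaphone and fixed lists as examples of lexical processing; the sketch below substitutes the string-similarity matcher from the Python standard library (`difflib.get_close_matches`), which is not Metaphone and is used here only because it needs no third-party package. The `TRADE_TO_GENERIC` table and both function names are hypothetical.

```python
import difflib

# Hypothetical lookup from trade name to generic name(s).
TRADE_TO_GENERIC = {"Prozac": ["fluoxetine"]}


def suggest(verbatim, reference_terms, n=3, cutoff=0.6):
    """Suggest reference terms that are lexically close to an unresolved term.

    Stand-in for the lexical processing named in the text; similarity is
    computed on case-folded strings and results are ordered best first.
    """
    lookup = {term.casefold(): term for term in reference_terms}
    close = difflib.get_close_matches(
        verbatim.casefold(), list(lookup), n=n, cutoff=cutoff
    )
    return [lookup[c] for c in close]


def generics_for(selected_term):
    """List the generic names shown once a trade name has been selected.

    An empty list means the selected term is not a known trade name.
    """
    return list(TRADE_TO_GENERIC.get(selected_term, []))
```

In an interactive setting, the output of `suggest` would populate the selection list, and the user's choice (or a surrogate term entered by hand) would be passed to `generics_for` before the mapping is saved.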
[0051] Figure 7 illustrates mapping of cleaned source data reaction terms to standardized hierarchies such as WHOART 234, COSTART 235, and MedDRA 231. Specifically, cleaned source data reaction terms are mapped to multiple levels (and possibly multiple entries within a level) of the hierarchy. In preferred embodiments, mapping of cleaned verbatim reaction terms proceeds in a fashion similar to mapping of drug terms. Also note that while the illustrated preferred embodiments perform mapping on cleaned source data, mapping may be performed on uncleaned (or even unparsed) source data. Transparency in the process of moving from source data verbatim terms to a cleaned safety database with verbatim terms mapped to tokens is important to both database developers/operators and to end users. Preferred embodiments of the present invention capture the way source data terms have been cleaned and mapped as the "pedigree" of each term. The "pedigree" of a term is the link between the mapped term and the decisions made during data cleanup. End users typically wish to verify the pedigree of the data they use. In those embodiments, retained data includes one or more of the following as appropriate: verbatim term, token mapped to, source of the verbatim term, number of occurrences of the verbatim term, number of cases in which the verbatim term appears, which type of cleanup (if any) was performed, a cross-reference to where the token is defined, and dates of the earliest and latest reported occurrence.
[0052] An exemplary pedigree screen from an illustrative embodiment of the invention disclosed in a related patent application is presented in Figure 8. The screen illustrates the nature of mapping in accordance with the present invention, and a manner in which the pedigree of a drug term can be used. Referring to the fourth entry from the bottom of Figure 4 as an example, the "Map To" column 600 shows the generic name or trade name token, e.g., "PROZAC," to which the "Verbatim" 601 term, e.g., "Fluoxetine Hcl," is mapped. The verbatim term can be any form of the name under which this drug was found in the "Source" 602, e.g., "AERS" data, including misspellings, variations, etc. The "Incidents" column 603 represents the number of times the verbatim term occurs in the indicated source data, while the "Case Count" column 604 discloses the number of cases in which the verbatim term appears in the source data. The "QEDRx Processing" column 605 indicates the type of cleanup that has been performed on the data. In this particular embodiment, the sub-columns in order under "QEDRx Processing" indicate: spelling correction; noise word correction; combo word correction; removal of numerics; and removal of marks. The "Cross-Reference" column 606 indicates which reference source the "Map To" term is associated with. Finally, "First/Last Reported Reactions" 607 indicates the date range from the earliest to latest cases containing the verbatim term.
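One way to picture a pedigree entry is as a record whose fields follow the columns just described. The sketch below is an assumption-laden illustration, not the disclosed data structure: the `Pedigree` class, the `build_pedigree` helper, and the representation of occurrences as (case identifier, report date) pairs are all introduced here. It shows how "Incidents" (every occurrence) differs from "Case Count" (distinct cases).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Pedigree:
    verbatim: str         # term as it appeared in the source data
    map_to: str           # token the verbatim term is mapped to
    source: str           # e.g. "AERS"
    incidents: int        # number of occurrences of the verbatim term
    case_count: int       # number of distinct cases containing the term
    cleanup: tuple        # cleanup types applied, e.g. ("spelling",)
    cross_reference: str  # reference source that defines the token
    first_reported: date
    last_reported: date


def build_pedigree(verbatim, map_to, source, occurrences, cleanup, cross_reference):
    """Build a pedigree record from (case_id, report_date) occurrences.

    Assumes at least one occurrence; an empty list would make the
    earliest and latest dates undefined.
    """
    dates = [reported for _, reported in occurrences]
    return Pedigree(
        verbatim=verbatim,
        map_to=map_to,
        source=source,
        incidents=len(occurrences),
        case_count=len({case_id for case_id, _ in occurrences}),
        cleanup=tuple(cleanup),
        cross_reference=cross_reference,
        first_reported=min(dates),
        last_reported=max(dates),
    )
```

Keeping the verbatim term and the cleanup types alongside the token is what allows an end user to trace a mapped term back to the decisions made during cleanup.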
[0053] An exemplary screen from an illustrative embodiment of an invention disclosed in a related application is presented in Figure 9. The screen illustrates, among other things, the nature of mapping of verbatim reaction data to the MedDRA reaction hierarchy. The verbatim data is identified under the heading "as reported," e.g., "Hypotension NOS." Subsequent columns map the verbatim to MedDRA preferred terms (e.g., Hypotension NOS), high level terms (e.g., Hypotension), high level group term (e.g., decreased and nonspecific blood pressure disorders and shock), and system/organ/class term (e.g., vascular disorders).
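The mapping of a reported reaction to the levels of a reaction hierarchy can be sketched as a lookup that returns one term per level. The single hierarchy entry below reuses the "Hypotension NOS" example from the text; the `ReactionPath` type, the `HIERARCHY` table, and both functions are hypothetical, and the sketch holds one path per verbatim term although the text allows multiple entries within a level.

```python
from typing import NamedTuple, Optional


class ReactionPath(NamedTuple):
    preferred_term: str
    high_level_term: str
    high_level_group_term: str
    system_organ_class: str


# Illustrative table with the one example given in the text.
HIERARCHY = {
    "hypotension nos": ReactionPath(
        "Hypotension NOS",
        "Hypotension",
        "Decreased and nonspecific blood pressure disorders and shock",
        "Vascular disorders",
    ),
}


def map_reaction(verbatim: str) -> Optional[ReactionPath]:
    """Return the hierarchy path for a reported reaction, or None if unmapped."""
    return HIERARCHY.get(verbatim.strip().casefold())


def roll_up(verbatims, level: str) -> dict:
    """Count reported reactions at a chosen hierarchy level."""
    counts = {}
    for verbatim in verbatims:
        path = map_reaction(verbatim)
        if path is not None:
            key = getattr(path, level)
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Rolling counts up to a coarser level, such as the system/organ/class term, lets reactions reported under different verbatim wordings be analyzed together.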
[0054] Preferred embodiments of the present invention include those implemented on a single computer or across a network of computers, e.g., a local area network or the Internet. Preferred embodiments include implementations on computer-readable media storing a computer program product performing one or more of the steps described herein. Such a computer program product contains modules implementing the steps as functions inter-related as described herein. Preferred embodiments of the invention include the unique data structures described herein, encoded on a computer-readable medium and computer signals transmissible over a computer/communications network.
[0055] A method and system for developing a pharmacovigilance database has been described herein. These and other variations, which will be appreciated by those skilled in the art, are within the intended scope of this invention as claimed below. As previously stated, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various forms.

Claims

1. A method for developing a pharmacovigilance database from source data having verbatim terms and from reference data, the method comprising:
parsing source data into a relational database;
performing cleanup on the relational database; and
mapping verbatim terms from the cleaned safety database to at least one token from at least one reference source.
2. A method for developing a pharmacovigilance database from source data having verbatim terms and from reference data, the method comprising:
performing cleanup on the relational database;
parsing source data into a relational database; and
mapping verbatim terms from the cleaned safety database to at least one token from at least one reference source.
3. The method of Claim 1 wherein parsing further comprises parsing publicly available source data into the relational database.
4. The method of Claim 3 wherein the publicly available source data comprises Spontaneous Reporting System (SRS) data.
5. The method of Claim 3 wherein the publicly available source data comprises Adverse Event Reporting System (AERS) data.
6. The method of Claim 1 wherein parsing further comprises, parsing privately available source data into the relational database.
7. The method of Claim 1 wherein parsing further comprises, parsing a combination of publicly and privately available source data into the relational database.
8. The method of Claim 1 wherein performing cleanup further comprises, suppressing at least one redundant entry.
9. The method of Claim 1 wherein performing cleanup further comprises, suppressing printable and non-printable non-alpha characters.
10. The method of Claim 1 wherein performing cleanup further comprises suppressing numeric characters.
11. The method of Claim 1 wherein performing cleanup further comprises, suppressing noise words.
12. The method of Claim 1 wherein performing cleanup further comprises, suppressing combination words.
13. The method of Claim 1 wherein performing cleanup further comprises, suppressing misspellings.
14. The method of Claim 13 wherein suppressing misspellings further comprises:
interactively identifying likely misspelled terms to an operator;
accepting direction from an operator; and
editing the likely misspelled term for which direction was accepted in accordance with the direction.
15. The method of Claim 14 wherein suppressing misspelling further comprises nominating at least one correction for at least one likely misspelled term to an operator.
16. The method of Claim 1 wherein performing cleanup further comprises mapping at least one dislocated valid entry to the proper field for that entry.
17. The method of Claim 1 wherein mapping verbatim terms from the source database to tokens further comprises:
identifying at least one token, and
associating at least one verbatim term from at least one data source with an identified token.
18. The method of Claim 17 wherein mapping verbatim terms from the source database to tokens further comprises retaining pedigree information regarding the mapping between at least one verbatim term and its corresponding token.
PCT/US2002/013662 2001-05-02 2002-05-01 Pharmacovigilance database WO2002089015A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/681,587 US6778994B2 (en) 2001-05-02 2001-05-02 Pharmacovigilance database
US09/681,587 2001-05-02

Publications (1)

Publication Number Publication Date
WO2002089015A1 (en) 2002-11-07

Family

ID=24735922

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/013662 WO2002089015A1 (en) 2001-05-02 2002-05-01 Pharmacovigilance database

Country Status (2)

Country Link
US (4) US6778994B2 (en)
WO (1) WO2002089015A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1717722A3 (en) * 2005-04-25 2008-05-21 Ingenix Inc. System and method for early identification of safety concerns of new drugs
EP1999671A2 (en) * 2006-03-24 2008-12-10 IntelliDOT Corporation Electronic data capture in a medical workflow system
US7856362B2 (en) 2005-04-25 2010-12-21 Ingenix, Inc. System and method for early identification of safety concerns of new drugs
EP2601633A4 (en) * 2010-08-06 2016-11-30 Cardiomems Inc Systems and methods for using physiological information

Families Citing this family (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7912689B1 (en) 1999-02-11 2011-03-22 Cambridgesoft Corporation Enhancing structure diagram generation through use of symmetry
US7295931B1 (en) * 1999-02-18 2007-11-13 Cambridgesoft Corporation Deriving fixed bond information
US7356419B1 (en) * 2000-05-05 2008-04-08 Cambridgesoft Corporation Deriving product information
US7272509B1 (en) * 2000-05-05 2007-09-18 Cambridgesoft Corporation Managing product information
US7440904B2 (en) * 2000-10-11 2008-10-21 Malik M. Hanson Method and system for generating personal/individual health records
US7801777B2 (en) * 2001-01-23 2010-09-21 Oracle International Corporation System and method for managing the development and manufacturing of a beverage
US7275070B2 (en) * 2001-01-23 2007-09-25 Conformia Software, Inc. System and method for managing the development and manufacturing of a pharmaceutical drug
US20020165806A1 (en) * 2001-01-23 2002-11-07 Kataria Anjali Rani System and method for managing a regulated industry
US7487182B2 (en) * 2001-01-23 2009-02-03 Conformia Software, Inc. Systems and methods for managing the development and manufacturing of a drug
US7925612B2 (en) * 2001-05-02 2011-04-12 Victor Gogolak Method for graphically depicting drug adverse effect risks
US7542961B2 (en) * 2001-05-02 2009-06-02 Victor Gogolak Method and system for analyzing drug adverse effects
US6778994B2 (en) * 2001-05-02 2004-08-17 Victor Gogolak Pharmacovigilance database
US7130861B2 (en) * 2001-08-16 2006-10-31 Sentius International Corporation Automated creation and delivery of database content
US7461006B2 (en) * 2001-08-29 2008-12-02 Victor Gogolak Method and system for the analysis and association of patient-specific and population-based genomic data with drug safety adverse event data
US20040054679A1 (en) * 2002-06-04 2004-03-18 James Ralston Remotely invoked metaphonic database searching capability
US7698157B2 (en) * 2002-06-12 2010-04-13 Anvita, Inc. System and method for multi-dimensional physician-specific data mining for pharmaceutical sales and marketing
US7398279B2 (en) * 2002-06-28 2008-07-08 Francis J. Muno, Jr. Method, routines and system for identification of imprints on dosage forms
US7171620B2 (en) * 2002-07-24 2007-01-30 Xerox Corporation System and method for managing document retention of shared documents
US7970621B2 (en) * 2002-10-18 2011-06-28 Cerner Innovation, Inc. Automated order entry system and method
US20040172285A1 (en) * 2003-02-18 2004-09-02 Gibson Jerry Tyrone Systems and methods for selecting drugs
US7657443B2 (en) * 2003-12-19 2010-02-02 Carefusion 303, Inc. Intravenous medication harm index system
US7376644B2 (en) * 2004-02-02 2008-05-20 Ram Consulting Inc. Knowledge portal for accessing, analyzing and standardizing data
US7870046B2 (en) 2004-03-04 2011-01-11 Cae Solutions Corporation System, apparatus and method for standardized financial reporting
US20100299320A1 (en) * 2004-03-26 2010-11-25 Ecapable, Inc. Method and System to Facilitate Decision Point Information Flow and to Improve Compliance with a Given Standardized Vocabulary
US20110231206A1 (en) * 2004-03-26 2011-09-22 Ecapable, Inc. Method which creates a community-wide health information infrastructure
US20060224573A1 (en) * 2004-03-26 2006-10-05 Ecapable, Inc. Method and system to facilitate decision point information flow and to improve compliance with a given standardized vocabulary
US20070214018A1 (en) * 2004-03-26 2007-09-13 Ecapable, Inc. Method which creates a community-wide health information infrastructure
US8918432B2 (en) * 2004-07-19 2014-12-23 Cerner Innovation, Inc. System and method for management of drug labeling information
US20060085216A1 (en) * 2004-10-15 2006-04-20 Guerrero John M Method and apparatus for discouraging non-meritorious lawsuits and providing recourse for victims thereof
US20100153134A1 (en) * 2005-03-24 2010-06-17 Ecapable, Inc. National Health Information and Electronic Medical Record System and Method
NZ563292A (en) * 2005-05-11 2010-12-24 Carefusion 303 Inc Evaluating drug data sets against aggregate sets from multiple institutions
US20080004504A1 (en) * 2006-06-30 2008-01-03 Kimmo Uutela System for detecting allergic reactions resulting from a chemical substance given to a patient
US8025634B1 (en) * 2006-09-18 2011-09-27 Baxter International Inc. Method and system for controlled infusion of therapeutic substances
US8170972B2 (en) * 2007-05-02 2012-05-01 General Electric Company Conflicting rule resolution system
US20090037215A1 (en) * 2007-08-02 2009-02-05 Clinical Trials Software Ltd Screening method
US9390160B2 (en) * 2007-08-22 2016-07-12 Cedric Bousquet Systems and methods for providing improved access to pharmacovigilance data
US20090144266A1 (en) * 2007-12-04 2009-06-04 Eclipsys Corporation Search method for entries in a database
WO2010024116A1 (en) * 2008-08-26 2010-03-04 インターナショナル・ビジネス・マシーンズ・コーポレーション Search device, search method and search program using open search engine
US11244745B2 (en) 2010-01-22 2022-02-08 Deka Products Limited Partnership Computer-implemented method, system, and apparatus for electronic patient care
US11881307B2 (en) 2012-05-24 2024-01-23 Deka Products Limited Partnership System, method, and apparatus for electronic patient care
US10453157B2 (en) 2010-01-22 2019-10-22 Deka Products Limited Partnership System, method, and apparatus for electronic patient care
US20110313789A1 (en) 2010-01-22 2011-12-22 Deka Products Limited Partnership Electronic patient monitoring system
US10911515B2 (en) 2012-05-24 2021-02-02 Deka Products Limited Partnership System, method, and apparatus for electronic patient care
US10108785B2 (en) 2010-01-22 2018-10-23 Deka Products Limited Partnership System, method, and apparatus for electronic patient care
US10242159B2 (en) 2010-01-22 2019-03-26 Deka Products Limited Partnership System and apparatus for electronic patient care
US11210611B2 (en) 2011-12-21 2021-12-28 Deka Products Limited Partnership System, method, and apparatus for electronic patient care
US11164672B2 (en) 2010-01-22 2021-11-02 Deka Products Limited Partnership System and apparatus for electronic patient care
US8515921B2 (en) * 2010-08-03 2013-08-20 Oracle International Corporation Data rationalization
US11869671B1 (en) 2011-09-14 2024-01-09 Cerner Innovation, Inc. Context-sensitive health outcome surveillance and signal detection
US11380440B1 (en) 2011-09-14 2022-07-05 Cerner Innovation, Inc. Marker screening and signal detection
US9799040B2 (en) * 2012-03-27 2017-10-24 Iprova Sarl Method and apparatus for computer assisted innovation
US20140164265A1 (en) * 2012-12-06 2014-06-12 Oracle International Corporation Clinical trial adverse event reporting system
CA2921182A1 (en) 2013-08-12 2015-02-19 Ironwood Medical Information Technologies, LLC Medical data system and method
EP3069282A4 (en) * 2013-11-12 2017-03-29 ECBIG (e-Commerce&business Integration) APS System and method of combining medicinal product information of medicaments
US10614196B2 (en) 2014-08-14 2020-04-07 Accenture Global Services Limited System for automated analysis of clinical text for pharmacovigilance
US10474702B1 (en) 2014-08-18 2019-11-12 Street Diligence, Inc. Computer-implemented apparatus and method for providing information concerning a financial instrument
US11144994B1 (en) 2014-08-18 2021-10-12 Street Diligence, Inc. Computer-implemented apparatus and method for providing information concerning a financial instrument
US20170011312A1 (en) * 2015-07-07 2017-01-12 Tyco Fire & Security Gmbh Predicting Work Orders For Scheduling Service Tasks On Intrusion And Fire Monitoring
US10762169B2 (en) 2017-06-16 2020-09-01 Accenture Global Solutions Limited System and method for determining side-effects associated with a substance
US10325020B2 (en) * 2017-06-29 2019-06-18 Accenture Global Solutions Limited Contextual pharmacovigilance system
JP7260315B2 (en) * 2018-09-06 2023-04-18 Phcホールディングス株式会社 Medication guidance support device and medication guidance support system
US11874946B2 (en) 2020-08-05 2024-01-16 International Business Machines Corporation Database map restructuring for data security
US11550800B1 (en) * 2020-09-30 2023-01-10 Amazon Technologies, Inc. Low latency query processing and data retrieval at the edge
WO2022226045A1 (en) * 2021-04-23 2022-10-27 Lexisnexis Risk Solutions Fl Inc. Referential data grouping and tokenization for longitudinal use of de-identified data
US11907305B1 (en) * 2021-07-09 2024-02-20 Veeva Systems Inc. Systems and methods for analyzing adverse events of a source file and arranging the adverse events on a user interface

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5642731A (en) * 1990-01-17 1997-07-01 Informedix, Inc. Method of and apparatus for monitoring the management of disease
US6000828A (en) * 1997-08-22 1999-12-14 Power Med Incorporated Method of improving drug treatment
US6219674B1 (en) * 1999-11-24 2001-04-17 Classen Immunotherapies, Inc. System for creating and managing proprietary product data

Family Cites Families (78)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4384329A (en) * 1980-12-19 1983-05-17 International Business Machines Corporation Retrieval of related linked linguistic expressions including synonyms and antonyms
US5371807A (en) 1992-03-20 1994-12-06 Digital Equipment Corporation Method and apparatus for text classification
US5299121A (en) 1992-06-04 1994-03-29 Medscreen, Inc. Non-prescription drug medication screening system
WO1994000817A1 (en) 1992-06-22 1994-01-06 Health Risk Management, Inc. Health care management system
US5502576A (en) 1992-08-24 1996-03-26 Ramsay International Corporation Method and apparatus for the transmission, storage, and retrieval of documents in an electronic domain
US5692171A (en) 1992-11-20 1997-11-25 Bull S.A. Method of extracting statistical profiles, and use of the statistics created by the method
US5337919A (en) 1993-02-11 1994-08-16 Dispensing Technologies, Inc. Automatic dispensing system for prescriptions and the like
US5594637A (en) 1993-05-26 1997-01-14 Base Ten Systems, Inc. System and method for assessing medical risk
US5495604A (en) 1993-08-25 1996-02-27 Asymetrix Corporation Method and apparatus for the modeling and query of database structures using natural language-like constructs
US5833599A (en) 1993-12-13 1998-11-10 Multum Information Services Providing patient-specific drug information
US5580728A (en) 1994-06-17 1996-12-03 Perlin; Mark W. Method and system for genotyping
US5737539A (en) 1994-10-28 1998-04-07 Advanced Health Med-E-Systems Corp. Prescription creation system
US5845255A (en) 1994-10-28 1998-12-01 Advanced Health Med-E-Systems Corporation Prescription management system
US5758095A (en) 1995-02-24 1998-05-26 Albaum; David Interactive medication ordering system
US5911132A (en) 1995-04-26 1999-06-08 Lucent Technologies Inc. Method using central epidemiological database
US5664109A (en) 1995-06-07 1997-09-02 E-Systems, Inc. Method for extracting pre-defined data items from medical service records generated by health care providers
US5659731A (en) * 1995-06-19 1997-08-19 Dun & Bradstreet, Inc. Method for rating a match for a given entity found in a list of entities
US6076083A (en) 1995-08-20 2000-06-13 Baker; Michelle Diagnostic system utilizing a Bayesian network model having link weights updated experimentally
US5634053A (en) 1995-08-29 1997-05-27 Hughes Aircraft Company Federated information management (FIM) system and method for providing data site filtering and translation for heterogeneous databases
US6209004B1 (en) 1995-09-01 2001-03-27 Taylor Microtechnology Inc. Method and system for generating and distributing document sets using a relational database
US6112182A (en) * 1996-01-16 2000-08-29 Healthcare Computer Corporation Method and apparatus for integrated management of pharmaceutical and healthcare services
US6076088A (en) 1996-02-09 2000-06-13 Paik; Woojin Information extraction system and method using concept relation concept (CRC) triples
US5804803A (en) 1996-04-02 1998-09-08 International Business Machines Corporation Mechanism for retrieving information using data encoded on an object
FR2747027B1 (en) 1996-04-09 1998-05-29 Cohen Laroque Emmanuel S METHOD FOR DETERMINING THE DEPTH OF ANESTHESIA AND DEVICE FOR CARRYING OUT SAID METHOD
US5978804A (en) 1996-04-11 1999-11-02 Dietzman; Gregg R. Natural products information system
US6108635A (en) 1996-05-22 2000-08-22 Interleukin Genetics, Inc. Integrated disease information system
US5864789A (en) 1996-06-24 1999-01-26 Apple Computer, Inc. System and method for creating pattern-recognizing computer structures from example text
US5924074A (en) 1996-09-27 1999-07-13 Azron Incorporated Electronic medical records system
US6246975B1 (en) 1996-10-30 2001-06-12 American Board Of Family Practice, Inc. Computer architecture and process of patient generation, evolution, and simulation for computer based testing system
US6226564B1 (en) 1996-11-01 2001-05-01 John C. Stuart Method and apparatus for dispensing drugs to prevent inadvertent administration of incorrect drug to patient
US6151581A (en) 1996-12-17 2000-11-21 Pulsegroup Inc. System for and method of collecting and populating a database with physician/patient data for processing to improve practice quality and healthcare delivery
US5860917A (en) 1997-01-15 1999-01-19 Chiron Corporation Method and apparatus for predicting therapeutic outcomes
US6098062A (en) 1997-01-17 2000-08-01 Janssen; Terry Argument structure hierarchy system and method for facilitating analysis and decision-making processes
JP3084618B2 (en) 1997-02-14 2000-09-04 システムコンサルティングサービス株式会社 Pharmacy / pharmacy information system
US6082776A (en) 1997-05-07 2000-07-04 Feinberg; Lawrence E. Storing personal medical information
US6466923B1 (en) 1997-05-12 2002-10-15 Chroma Graphics, Inc. Method and apparatus for biomathematical pattern recognition
NL1006141C1 (en) 1997-05-27 1998-12-01 Holland Ind Diamantwerken Bv Grinding machine.
US6137911A (en) 1997-06-16 2000-10-24 The Dialog Corporation Plc Test classification system and method
US5991729A (en) 1997-06-28 1999-11-23 Barry; James T. Methods for generating patient-specific medical reports
US6055528A (en) 1997-07-25 2000-04-25 Claritech Corporation Method for cross-linguistic document retrieval
US6587829B1 (en) 1997-07-31 2003-07-01 Schering Corporation Method and apparatus for improving patient compliance with prescriptions
US6697783B1 (en) * 1997-09-30 2004-02-24 Medco Health Solutions, Inc. Computer implemented medical integrated decision support system
GB9726654D0 (en) * 1997-12-17 1998-02-18 British Telecomm Data input and retrieval apparatus
US6055538A (en) 1997-12-22 2000-04-25 Hewlett Packard Company Methods and system for using web browser to search large collections of documents
US20020010595A1 (en) * 1998-02-27 2002-01-24 Kapp Thomas L. Web-based medication management system
JPH11282934A (en) 1998-03-30 1999-10-15 System Yoshii:Kk Medicine preparation and medical diagnosis support device
US6014631A (en) 1998-04-02 2000-01-11 Merck-Medco Managed Care, Llc Computer implemented patient medication review system and process for the managed care, health care and/or pharmacy industry
BR9909906A (en) * 1998-04-03 2000-12-26 Triangle Pharmaceuticals Inc Computer program systems, methods and products to guide the selection of therapeutic treatment regimens
US6092072A (en) 1998-04-07 2000-07-18 Lucent Technologies, Inc. Programmed medium for clustering large databases
US6273854B1 (en) 1998-05-05 2001-08-14 Body Bio Corporation Medical diagnostic analysis method and system
US6253169B1 (en) 1998-05-28 2001-06-26 International Business Machines Corporation Method for improvement accuracy of decision tree based text categorization
EP1125224A4 (en) 1998-10-02 2006-10-25 Ncr Corp Techniques for deploying analytic models in parallel
US6067524A (en) 1999-01-07 2000-05-23 Catalina Marketing International, Inc. Method and system for automatically generating advisory information for pharmacy patients along with normally transmitted data
US6128620A (en) 1999-02-02 2000-10-03 Lemed Inc Medical database for litigation
US6684221B1 (en) * 1999-05-06 2004-01-27 Oracle International Corporation Uniform hierarchical information classification and mapping system
US6507829B1 (en) * 1999-06-18 2003-01-14 Ppd Development, Lp Textual data classification method and apparatus
US20020077756A1 (en) 1999-11-29 2002-06-20 Scott Arouh Neural-network-based identification, and application, of genomic information practically relevant to diverse biological and sociological problems, including drug dosage estimation
US6658396B1 (en) * 1999-11-29 2003-12-02 Tang Sharon S Neural network drug dosage estimation
US20020040282A1 (en) * 2000-03-22 2002-04-04 Bailey Thomas C. Drug monitoring and alerting system
US6542902B2 (en) * 2000-03-24 2003-04-01 Bridge Medical, Inc. Method and apparatus for displaying medication information
US20020142815A1 (en) 2000-12-08 2002-10-03 Brant Candelore Method for creating a user profile through game play
US6876966B1 (en) 2000-10-16 2005-04-05 Microsoft Corporation Pattern recognition training method and apparatus using inserted noise followed by noise reduction
US20040015372A1 (en) 2000-10-20 2004-01-22 Harris Bergman Method and system for processing and aggregating medical information for comparative and statistical analysis
US20020073042A1 (en) 2000-12-07 2002-06-13 Maritzen L. Michael Method and apparatus for secure wireless interoperability and communication between access devices
US20020129031A1 (en) * 2001-01-05 2002-09-12 Lau Lee Min Managing relationships between unique concepts in a database
US6993402B2 (en) * 2001-02-28 2006-01-31 Vigilanz Corporation Method and system for identifying and anticipating adverse drug events
US6789091B2 (en) 2001-05-02 2004-09-07 Victor Gogolak Method and system for web-based analysis of drug adverse effects
US20020183965A1 (en) 2001-05-02 2002-12-05 Gogolak Victor V. Method for analyzing drug adverse effects employing multivariate statistical analysis
US7925612B2 (en) 2001-05-02 2011-04-12 Victor Gogolak Method for graphically depicting drug adverse effect risks
US6778994B2 (en) 2001-05-02 2004-08-17 Victor Gogolak Pharmacovigilance database
US7542961B2 (en) 2001-05-02 2009-06-02 Victor Gogolak Method and system for analyzing drug adverse effects
US20020169771A1 (en) * 2001-05-09 2002-11-14 Melmon Kenneth L. System & method for facilitating knowledge management
US6950755B2 (en) 2001-07-02 2005-09-27 City Of Hope Genotype pattern recognition and classification
US6958211B2 (en) 2001-08-08 2005-10-25 Tibotech Bvba Methods of assessing HIV integrase inhibitor therapy
WO2003019455A2 (en) * 2001-08-22 2003-03-06 Keystone Therapeutics, Inc. System, method and computer program for monitoring and managing medications
US7461006B2 (en) 2001-08-29 2008-12-02 Victor Gogolak Method and system for the analysis and association of patient-specific and population-based genomic data with drug safety adverse event data
US20040010511A1 (en) 2002-07-11 2004-01-15 Gogolak Victor V. Method and system for drug utilization review
US8180653B2 (en) * 2006-01-18 2012-05-15 Catalina Marketing Corporation Pharmacy network computer system and printer

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5642731A (en) * 1990-01-17 1997-07-01 Informedix, Inc. Method of and apparatus for monitoring the management of disease
US6000828A (en) * 1997-08-22 1999-12-14 Power Med Incorporated Method of improving drug treatment
US6219674B1 (en) * 1999-11-24 2001-04-17 Classen Immunotherapies, Inc. System for creating and managing proprietary product data

Non-Patent Citations (7)

Title
"Vaccine adverse event reporting system (VAERS)", US DEPARTMENT OF HEALTH AND HUMAN SERVICES, July 2001 (2001-07-01), pages 1 - 12, XP002951657, Retrieved from the Internet <URL:www.vaers.org/search/README> *
"VAERS data: guide to interpreting case report information obtained from the vaccine adverse event reporting system (VAERS)", 11 June 2002 (2002-06-11), pages 1 - 2, XP002951655, Retrieved from the Internet <URL:www.vaers.org/info> *
DUMOUCHEL, W.: "Bayesian data mining in large frequency tables with an application to the FDA spontaneous reporting system", THE AMERICAN STATISTICIAN, vol. 53, no. 3, April 1999 (1999-04-01), pages 177 - 190 (1-30), XP002951660 *
MEDWATCH: "Post-marketing surveillance for adverse events after vaccination: the national vaccine adverse event reporting system (VAERS)", November 1998 (1998-11-01), pages 1 - 12, XP002951661 *
MOOTREY ET AL.: "Surveillance for adverse events following vaccination", 1999, VPD SURVEILLANCE MANUAL, XP002951658 *
SZARFMAN A.: "New methods for signal detection", PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON PHARMACOEPIDEMIOLOGY, 28 August 1999 (1999-08-28), pages 1 - 57, XP002951659 *
SZARFMAN, A.: "Application of screening algorithms and computer systems to efficiently signal combinations of drugs and events in FDA's spontaneous reports", PROCEEDINGS OF THE 129TH ANNUAL MEETING OF THE AMERICAN PUBLIC HEALTH ASSOCIATION, 22 October 2001 (2001-10-22), pages 1 - 2, XP002951656 *

Cited By (9)

Publication number Priority date Publication date Assignee Title
EP1717722A3 (en) * 2005-04-25 2008-05-21 Ingenix Inc. System and method for early identification of safety concerns of new drugs
US7856362B2 (en) 2005-04-25 2010-12-21 Ingenix, Inc. System and method for early identification of safety concerns of new drugs
US7917374B2 (en) 2005-04-25 2011-03-29 Ingenix, Inc. System and method for early identification of safety concerns of new drugs
US7966196B2 (en) 2005-04-25 2011-06-21 Ingenix, Inc. System and method for early identification of safety concerns of new drugs
US8285562B2 (en) 2005-04-25 2012-10-09 Ingenix, Inc. System and method for early identification of safety concerns of new drugs
US8473309B2 (en) 2005-04-25 2013-06-25 Optuminsight, Inc. System and method for early identification of safety concerns of new drugs
EP1999671A2 (en) * 2006-03-24 2008-12-10 IntelliDOT Corporation Electronic data capture in a medical workflow system
EP1999671A4 (en) * 2006-03-24 2013-04-24 Patientsafe Solutions Inc Electronic data capture in a medical workflow system
EP2601633A4 (en) * 2010-08-06 2016-11-30 Cardiomems Inc Systems and methods for using physiological information

Also Published As

Publication number Publication date
US20100125615A1 (en) 2010-05-20
US6778994B2 (en) 2004-08-17
US20020188465A1 (en) 2002-12-12
US20120209864A1 (en) 2012-08-16
US7539684B2 (en) 2009-05-26
US8131769B2 (en) 2012-03-06
US20020165853A1 (en) 2002-11-07
US8694555B2 (en) 2014-04-08

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: COMMUNICATION UNDER RULE 69 EPC (EPO FORM 1205A DATED 05.03.2004)

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP