US20060004595A1 - Data integration method - Google Patents

Data integration method

Publication number
US20060004595A1
Authority
US
United States
Prior art keywords
data
driver
business
linkage
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/137,821
Inventor
Jan Rowland
Robin Davies
Michael Prevoznak
Charles Benke
Ahmad Sharif
Sandra Stoker
Robert Porreca
Eric Gustafson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40 Data acquisition and logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling

Definitions

  • the present invention relates to a process of collecting and enhancing commercial data and, more particularly, to quality assurance and five quality drivers.
  • One aspect of the present invention is a method of data integration comprising collecting information comprising primary data.
  • The primary data is tested for accuracy and processed to produce secondary data, and enhanced information comprising the primary data and the secondary data is provided.
  • primary and/or secondary data is sampled periodically thereby generating sample data.
  • the sample data is evaluated against at least one predetermined condition. Based upon this evaluation, testing and/or processing steps are adjusted.
  • Testing comprises at least one of the following steps: (a) determining if the primary data matches stored data and (b) assigning an identification number to the primary data. If the primary data does not match the stored data in step (a), it is determined whether the primary data meets a first threshold condition before an identification number is assigned in step (b).
  • The first threshold condition is that multiple sources confirm that a business associated with the primary data exists.
  • the identification number is an entity identifier.
  • If the primary data does not meet the first threshold condition, it is stored in a separate repository and assigned an identification number. When additional primary data is received and the primary data and the additional primary data together meet the first threshold condition, the entity is moved into the multi-source repository.
  • the system includes a data generator, a testing unit, a first processing unit, and a second processing unit.
  • the data generator is capable of gathering primary data from at least one data source.
  • the testing unit is capable of testing the primary data for accuracy.
  • the first processing unit is capable of analyzing the primary data and generating secondary data from the result of the analysis.
  • the second processing unit is capable of merging the primary data and the secondary data to form enhanced information.
  • the testing unit, first processing unit, and the second processing unit may be the same or independent of one another.
  • the testing unit comprises at least one of a data matching unit and entity identifier unit.
  • the first processing unit comprises at least one of a corporate linkage unit and a predictive indicator unit.
  • Another aspect of the present invention is a machine-readable medium for storing executable instructions for data integration.
  • the instructions include collecting information comprising primary data, testing the primary data for accuracy, processing the primary data to produce secondary data, and providing enhanced information comprising the primary data and the secondary data.
  • the primary and/or secondary data is sampled periodically, thereby generating sample data.
  • the sample data is evaluated against at least one predetermined condition.
  • the testing and/or processing is adjusted based upon the evaluation.
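  • The sample, evaluate, and adjust cycle described above can be sketched as a simple feedback loop. This is an illustration under stated assumptions: the "accurate" flag on each record, the 95% accuracy condition, and the idea of tightening a matching threshold are hypothetical stand-ins for whatever predetermined condition and adjustment a real system would use.

```python
def sample_every(records, n):
    """Take every n-th record as the periodic sample."""
    return records[::n]


def evaluate_sample(sample_data, min_accuracy=0.95):
    """Check the sample against a predetermined condition (assumed here to be
    a minimum share of records flagged as accurate)."""
    if not sample_data:
        return True
    accurate = sum(1 for record in sample_data if record["accurate"])
    return accurate / len(sample_data) >= min_accuracy


def adjust_threshold(match_threshold, passed, step=0.05, ceiling=0.99):
    """Tighten a hypothetical matching threshold when the sample fails."""
    if passed:
        return match_threshold
    return min(ceiling, round(match_threshold + step, 2))
```

  • In use, the result of evaluate_sample on each periodic sample would be passed to adjust_threshold, and the returned value would govern the next round of testing or processing.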
  • FIG. 1 is a block diagram of the method of data integration according to the present invention.
  • FIG. 2 is a block diagram of a system for data integration according to the present invention.
  • FIG. 3 is a block diagram of a system for data integration according to the present invention.
  • FIG. 4 is a logic diagram depicting the method of data integration according to the present invention.
  • FIG. 5 is a block diagram of example sources of data collection according to the present invention.
  • FIG. 6 is a block diagram of more example sources of data collection according to the present invention.
  • FIGS. 7 and 8 are block diagrams of entity matching according to the present invention.
  • FIG. 9 is a block diagram of entity matching where no match is found in existing databases of traditional businesses, but through sourcing internal and external data stores an emerging business match is made.
  • FIG. 10 is a block diagram of entity matching where matched data is delivered to one database or matched through internal and external data sources or assigned an identification number and housed in a single source repository.
  • FIGS. 11 and 12 are block diagrams of a method of entity matching according to the present invention.
  • FIGS. 13-15 are block diagrams of corporate linking according to the present invention.
  • FIG. 16 is a block diagram of corporate linkage following a merger/acquisition event.
  • FIG. 17 is a logic diagram of an example method of performing corporate linkage according to the present invention.
  • FIG. 18 is a block diagram of corporate linkage where relationships are outside of the definition of legal ownership to show other types of linkages.
  • FIGS. 19 and 20 are block diagrams of an example method of providing a predictive indicator according to the present invention.
  • FIG. 1 shows an overview of a method of data processing according to the present invention.
  • the foundation of the method is quality assurance 102 , which is the continuous data auditing, validating, normalizing, correcting, and updating done to ensure quality all along the process.
  • The five quality drivers are: a data collection driver 108, an entity matching driver 110, an identification (ID) number driver 112, a corporate linkage driver 114, and a predictive indicators driver 116.
  • These five drivers interface with a database 118 .
  • Database 118 is an organized collection of data and database management tools, such as a relational database, an object-oriented database, or any other kind of database. Data in database 118 is continually refined and enhanced based on customer feedback and quality assurance testing and procedures.
  • Data collection driver 108 brings together data from a variety of sources worldwide. Then, the data is integrated into database 118 through entity matching driver 110 , resulting in a single, more accurate picture of each business entity.
  • identification number driver 112 applies an identification number as a unique means of identifying and tracking a business globally through any changes it goes through.
  • Corporate linkage driver 114 then builds corporate families to enable a view of total corporate risk and opportunity.
  • predictive indicators driver 116 uses statistical analysis to rate a business' past performance and indicate the likelihood of a business to perform in a specific way in the future.
  • FIGS. 2 and 3 show two example embodiments of systems for data integration according to the present invention, although other systems would also be suitable for practicing the present invention.
  • FIG. 2 shows a network configuration while FIG. 3 shows a computer system configuration.
  • a network 200 facilitates communication among the other system components, including a computer system 202 .
  • the five quality drivers, data collection driver 108 , entity matching driver 110 , identification number driver 112 , corporate linkage driver 114 , and predictive indicators driver 116 , and quality assurance 102 work sequentially to enhance the incoming data 104 to turn it into quality information 106 stored in database 204 .
  • a computer system 300 has a processor 302 with access to memory 304 via a bus 306 .
  • Memory 304 stores an operating system program 308 , a data integration program 310 , and data 312 .
  • FIG. 4 shows detail around Quality Assurance for each driver as another embodiment of a method of data integration according to the present invention.
  • This method includes five main drivers of data integration: data collection 108 , entity matching 110 , identification number 112 , corporate linkage 114 , and predictive indicators 116 to produce quality information 106 .
  • Quality information 410 is produced as a result of quality assurance performed by each driver.
  • Quality assurance 400 is performed for data collection 108 to verify legal name and ownership to identify potential fraud, to update contact information, to update and make changes based on events, to verify and enhance third party information, and to ensure accuracy, completeness, timeliness and cross-border consistency. Quality assurance 400 continually refines and enhances data collection 108 .
  • Quality assurance 402 is performed for entity matching with manual and automated quality checks to ensure accurate matches and eliminate duplicates. Based on customer feedback and matching learnings, quality assurance 402 for entity matching 110 is continually refined and enhanced.
  • With identification number 112, businesses are uniquely identified and tracked. Quality assurance 404 is performed for identification number 112 by retaining an identification number for the life of a business and by being recognized as an industry standard. The identification number allows verification of information in each of the five drivers. For data collection 108, if data is not linked to an identification number, it indicates the possibility of a new business. For entity matching 110, the identification number allows new data to be accurately matched to existing businesses. For corporate linkage 114, corporate families are assembled based on each business' identification number. For predictive indicators 116, numbered data is used to build predictive tools. A verification process assigns an identification number when commercial activity is confirmed. Quality assurance 404 for identification number 112 includes validating and protecting against duplication. The identification number assignment process is continually refined and enhanced.
  • Quality assurance 406 for corporate linkage 114 includes building corporate families globally and updating them after mergers, acquisitions, and other events. Quality assurance 406 for corporate linkage 114 includes increasing completeness and accuracy of corporate families by having a dedicated team review corporate families and by matching corporate families. Based on customer feedback, the corporate linkage 114 is continually refined and enhanced.
  • Quality assurance 408 for predictive indicators 116 includes continually monitoring and adjusting predictive indicators 116 to reflect new information. Based on customer feedback, the predictive indicators 116 are continually refined and enhanced.
  • the five main components or drivers work together to integrate the data collected into quality information 106 that is useful for making business decisions.
  • the process is continually enhanced to continually improve quality based on feedback, learnings and experience spanning over the past 160 years.
  • Each of the five drivers is examined in more detail below, starting with data collection driver 108 .
  • FIG. 5 shows some sources of data used in data collection driver 108 .
  • Data is collected about customers, prospects, and suppliers with the goal of collecting the most complete data possible.
  • database 118 is a global database.
  • database 118 has data for millions of businesses worldwide and is updated daily.
  • database 118 contains direct investigations, news, and media 502 , payment and financial data (trade data) 504 , public records and government registries 506 , and web sources and directories 508 .
  • Payment and financial data includes trade records updated frequently, complete coverage of public company financials, and coverage of financial statements on privately held companies.
  • Public records and government registries include suits, liens, judgments, uniform commercial code filings, bankruptcy filings, and business registrations.
  • Web sources and directories include uniform resource locators (URLs), updates from domains, and customers providing online updates.
  • Data can also be collected from other strategic data partners. These strategic partners provide data from international markets and conduct an agreed upon amount of due diligence on the data, prior to delivery into the global database. The inclusion of data from strategic partners enables comprehensive global coverage.
  • Top news providers are monitored every day to uncover changes and updates that affect the risk level and/or marketing attributes of the user's customers, prospects, and suppliers. This data is focused on publicly traded companies with additional coverage devoted to mergers and/or acquisitions and high risk or business deterioration. News is posted within 24 hours of release.
  • the types of events include mergers and acquisitions, control changes, purchase or sale of assets, officer, name or location changes, earnings updates, and business closings.
  • the benefit is updated information that affects the risk level of companies the user does business with and indications of key changes that can be used for marketing purposes.
  • In an example database 118, payment experiences from companies are collected to help the user predict future payment habits of prospects and customers.
  • Accounts receivable data on U.S.-based companies provides an overall evaluation of how quickly and completely a company made payment to each vendor.
  • Many reports created using database 118 include payment data.
  • This payment experiences data has many benefits. Users get a picture of how a company is paying their vendors, bank loans, and other financial obligations. It enables showing payment trends over time. It enables creation of predictive scores for use in applications such as automated credit approvals. It helps pre-screen potential customers based on their ability to pay on time. Payment experiences are summarized to show the user how different industries are paid and credit limits.
  • An example database 118 has public records from U.S. courts and legal filing offices to provide critical insights into the risk of a company.
  • This data includes U.S.-based company information on suits, liens, judgments, bankruptcies, and U.C.C. filings (collectively called public records); information obtained from courts and recording offices; and company filings for bankruptcy protection under Chapter 11 (reorganization) or Chapter 7 (liquidation).
  • This data captures a majority of the U.S. public filings and has many benefits. Over 10 years of historical coverage enable predictive credit ratings and scores. Users understand legal actions that could affect a company's ability to continue as an ongoing concern. A company's rating is negatively impacted when a bankruptcy takes place. Users are notified about all companies affected in a corporate family when a bankruptcy occurs within the corporate family.
  • In an example database 118, complete coverage of public company financial statements and many privately held company financial statements help the user to understand financial strength.
  • This data includes balance sheet and income statements and private company financial statements collected from certified public accountants (CPAs) or from corporate officers.
  • the database 118 has complete coverage on public companies. Most financial statements are on privately-held companies. This data has many benefits. Users understand financial strength, ability to pay on time and ability to continue as an ongoing concern. This data helps target prospects by size or financial strength.
  • An example database 118 has data from telephone calls that verify and enhance the third party information, leading to over one and one-half million updates to the database 118 every day. This data includes interviews with business principals to verify and enhance information from other sources. Every public company is monitored daily. There is a focus on collecting value-added data (e.g., business name, address, telephone number, SIC, employee number, sales, CEO/owner name). This has many benefits. It serves as an additional check on the accuracy of the data, helps validate third party data, builds content on small businesses, and makes the data consistent across the globe. Consistency of data enables customers to rely on the same high quality of information country to country, creating opportunity for growth, consistency in credit and marketing policies globally, understanding risk exposure, marketing opportunity and reliance on suppliers globally.
  • The URL file is collected from external and internal sources. Each URL is mined several times a year to confirm its status (live, parked, under construction, redirect, inactive) and to verify that it belongs to the company it has been assigned to, using the name, address, or telephone number from the existing database. Besides verification several times a year, additional data elements such as security data, certificate data, strength of encryption, and other data are collected from the URL. The verified URLs are populated in the database using one-down linkage to expand coverage across family tree members.
  • FIG. 6 shows some additional sources of data used by data collection driver 108 for increased accuracy, such as telephone company data 602 , internet, news and media 604 , direct investigations 606 , company financial information 608 , payment data 610 , courts and legal filings offices 612 , government registries 614 , and diversity data 616 .
  • This completeness of information aids profitable business decisions.
  • In risk management, a user assesses risk from non-United States (U.S.) companies with the resulting information. Risk from small business customers can be more completely identified. The user can make more informed risk decisions when they are based on more complete information.
  • In sales and marketing, the user can identify new prospects from data drawn from multiple sources.
  • the user can gain access to international customers and prospects and cherry pick a prospect list with value-added information such as standard industrial classification (SIC) and contact name.
  • the user may assess risk from foreign suppliers with the resulting information and identify the risk from suppliers more completely.
  • the user gains a fresher more complete picture of each customer, prospect, and supplier because of daily updates to database 118 .
  • telephone company data is collected to identify new businesses, changes in existing records and to provide updated contact information. Businesses request new listings when initiating phone service.
  • the benefits of this data include indication of a new business or change in phone number and enabling creation of new records or enhancing existing ones, providing the most recent address, phone number, and line of business (SIC) information.
  • database 118 includes business registrations from state government registries to verify legal name and ownership to identify potential frauds.
  • Database 118 has information on business registrations filed at the time a company is incorporated. This has many benefits. It enables verification of the existence of registered businesses; confirms information such as a company's organizational structure and its date and state of incorporation (or organization); aids fraud investigation through review of names, principals, and business standing within a state; and identifies all changed records and new-to-file records.
  • Quality assurance 102 of database 118 ensures accuracy, completeness, timeliness, and cross-border consistency of global data. Quality assurance includes standardizing data, correcting and updating data, ensuring phone numbers connect and mailing addresses deliver to the intended recipient, and conducting manual reviews.
  • Quality assurance 102 includes standardizing data. Numerous quality edits and validations are made at the time of data entry. Data is validated to ensure consistency between branch and headquarters names and reasonability between number of employees, sales volume, and line of business, to prevent duplication of records, to validate out-of-business status changes, and more. Global cleansing software is used to standardize marketable records and ensure consistency in presentation of records. Addresses are standardized before inclusion in the database.
  • Quality assurance 102 includes correcting and updating data.
  • In an example, the status of suits, liens, judgments, and bankruptcy filings is reviewed and updated. Data flows between internal teams to ensure information is consistently updated between areas of news, risk, ratings, and delivery. Constantly updating and refreshing the data leads to high response rates on customer acquisition promotions, high match rates between files, and high quality data in the database 118.
  • Quality assurance 102 includes manual reviews. Third party data is validated with manual reasonability reviews. Payment re-checks are manually performed on trade references appearing abnormal or exaggerated. Financial statements are reviewed to identify high risk businesses, ensure accuracy and apply capital strength ratings consistently across the universe of records. Comparisons of merger/acquisition update volumes are done with externally published numbers to ensure complete coverage.
  • Data is continually refined and enhanced through quality assurance 102 and global data collection 108 .
  • FIG. 7 shows how multiple unmatched pieces of data 702 may be turned into a complete single business 704 .
  • Entity matching driver 110 checks the incoming data 104 to see if it belongs to any existing business in database 118 .
  • ABC, Inc., Chuck's Mini-Mart, and Charles Smith appear to be separate companies, but after entity matching, it is clear that they are all part of one enterprise, ABC Inc. and Chuck's Mini-Mart.
  • The different addresses and other associated information are also reconciled into a complete single business 704.
  • Entity matching driver 110 detects similarities in incoming data and combines it into a single business. Queries are more likely to be accurate, customer, supplier, and prospect information is consolidated to provide more complete and accurate profiles, and there are fewer duplicate records.
  • the customer can receive information about the quality of their matched records via D&B's matching feedback mechanisms, allowing the customer to decide how to use the matched information in their business processes.
  • Another benefit is that the customer receives a consistent answer as the matching process is repeatable and defined.
  • FIG. 8 shows how incoming data 104 that matches a business in database 118 is appended to that business through entity matching driver 110 .
  • Entity matching driver 110 is designed to match data to the right business every time, thus increasing efficiency. Entity matching driver 110 provides more complete and accurate profiles of customers, prospects, and suppliers and ensures far fewer duplicate businesses.
  • FIG. 11 shows an example method of matching via entity matching driver 110.
  • This method includes cleaning, parsing, and standardizing 1102, performing candidate retrieval 1104, and evaluation and decision making 1106.
  • Cleaning and parsing 1102 includes identifying key components of inquiry data 1108 , normalizing and standardizing name, address, and city 1110 , performing name consistency 1112 , and performing address standardization 1114 .
  • Candidate retrieval 1104 includes gathering possible match candidates from a reference database 1116, using optimized keys to improve retrieval quality and throughput 1118, and optimizing retrieval based on data provided in the inquiry data, observations of existing reference data, and ongoing tuning 1120.
  • Evaluation and decision making 1106 includes evaluating matches according to a consistent standard 1122 , applying a match grade 1124 , applying a confidence code 1126 , and applying a confidence percentile 1128 .
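  • The three stages above (cleaning and standardizing, candidate retrieval, evaluation and decision making) can be sketched as follows. This is a minimal illustration, not the matching technology the text describes: the suffix list, the three-character retrieval key, the per-field A/B/F grade letters, the 0 to 10 confidence scale, and the use of Python's difflib string similarity are all assumptions chosen for brevity.

```python
import re
from difflib import SequenceMatcher

# Illustrative legal suffixes dropped during standardization (assumed list).
SUFFIXES = {"INC", "CO", "CORP", "LLC", "LTD"}


def standardize(text):
    """Uppercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", text.upper()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def retrieval_key(name):
    """Hypothetical retrieval key: first three characters of the clean name."""
    return standardize(name)[:3]


def retrieve_candidates(inquiry, reference):
    """Gather possible match candidates that share the inquiry's key."""
    key = retrieval_key(inquiry["name"])
    return [r for r in reference if retrieval_key(r["name"]) == key]


def evaluate_candidate(inquiry, candidate):
    """Score name and address, then derive a match grade and confidence code."""
    scores = [
        SequenceMatcher(None, standardize(inquiry[f]), standardize(candidate[f])).ratio()
        for f in ("name", "address")
    ]
    # One letter per field: A = same, B = similar, F = different (assumed scale).
    grade = "".join("A" if s >= 0.95 else "B" if s >= 0.7 else "F" for s in scores)
    confidence = round(10 * sum(scores) / len(scores))  # assumed 0-10 scale
    return grade, confidence


def best_match(inquiry, reference, min_confidence=8):
    """Return (candidate, grade, confidence) for the best passing match, or None."""
    best = None
    for candidate in retrieve_candidates(inquiry, reference):
        grade, confidence = evaluate_candidate(inquiry, candidate)
        if confidence >= min_confidence and (best is None or confidence > best[2]):
            best = (candidate, grade, confidence)
    return best
```

  • Because every inquiry passes through the same standardization and scoring rules, the outcome is repeatable for a given reference file; min_confidence plays the role of a quality threshold the caller can tune.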
  • To ensure quality assurance 102 of entity matching 110, manual and automated checks are performed. Samples of matched records are manually reviewed. Based on experience, customer feedback, and learnings, entity matching 110 is recalibrated. Entity matching 110 allows and corrects for variations in spelling, formats, trade names, addresses, and the like. Entity matching 110 uses a match grade and confidence code to determine if the match passes the quality threshold. Entity matching 110 provides a consistent, repeatable process that is not based on human judgment. The benefits are more accurate matches and fewer duplicates.
  • Quality assurance 102 of entity matching 110 includes continually refining and enhancing entity matching 110 based on customer feedback. Samples of matched records are manually reviewed, and technology allows for corrections in spelling, formats, trade names, and addresses. Technology also interprets the context of key parts of the inquiry to better find difficult matches (e.g., interpreting parts of the sound, geographic position, implied line of business, acronyms). Quality is also ensured by using a customized retrieval approach for each inquiry that looks at the best way to find a match to optimize the result for each unique inquiry (e.g., some matches are better made by using sound algorithms, while other matches are better made by using exact name matches). As enhancements are made, they become available both online and in batch systems to ensure consistency. The benefits of these improvements are increased search candidates, additional functionality, and increased throughput. In other words, more hits, better hits, and better hits faster. Matching capabilities include matches to a proprietary database containing multiple names and addresses per record, the ability to identify matches that don't look exactly like each other, and the ability to select by the quality of the match.
  • Identification (ID) number driver 112 appends a unique identification number to every business location so it can be easily and accurately identified.
  • This identification number is non-indicative.
  • One example of the unique identification number is the D-U-N-S® Number available from Dun & Bradstreet, headquartered in Short Hills, N.J., which is a nine-digit number that allows business locations to be easily tracked through changes and updates. The identification number is retained for the life of a business. No two business locations ever receive the same identification number and the identification numbers are never recycled.
  • the identification number acts as an industry standard for business identification. It is endorsed by the United Nations, the European Commission, and over fifty industry groups.
  • the identification number is a central concept in the data processing method according to the present invention. For quality assurance, the identification number allows verification of information at every stage of the process. For data collection driver 108 , if data is not linked to an existing identification number, it indicates the possibility of a new business. For entity matching driver 110 , the identification number allows new data to be accurately matched to existing businesses. For corporate linkage driver 114 , corporate families are assembled based on each business' identification number. For predictive indicators driver 116 , the identification number is used to build predictive tools.
  • the identification number opens new areas of opportunity to a user's business by helping to verify that a business exists and validating the business location. Users are provided a complete view of prospects, customers, and suppliers. Existing data is clarified, duplication is identified, and related businesses are shown to be related. Users can more easily manage large groups of customers or suppliers when the identification number is appended to the user's information.
  • the identification number enables fast and easy data updates when appended to the user's information.
  • the identification number provides a complete view of prospects and customers by placing businesses, where applicable, within their domestic and global corporate ‘families’, identifying penetration and opportunities for up-sell and cross-sell. The identification number also helps aggregate data from multiple and disparate systems to gain better insight with one complete view of prospects, customers and suppliers.
  • the identification number not only helps identify duplication in files within the database, but also enables customers with a unique key that can be used to identify duplication in the customer's existing portfolio of accounts.
  • FIG. 13 shows an example method of identification number driver 112 .
  • Data collection 108 provides input data that is pre-processed 1300 by de-duping, appending phone and SIC, validating address and town, and checking for branches and franchises. This processed data is matched to a unique identifier file 1302 . If a match is found, data is appended to an existing record in the multi-sourced file 1304 . If a match is not found, the data is included in a single source repository 1306 and, then, unique identifier assignment rules are applied 1308 . As new files are received and additional sources validate a record in the single source repository, that record then becomes included in the multi source file. If new sources do not validate the record, the record stays in the single source repository.
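  • The flow between the single source repository and the multi-sourced file can be sketched as a small registry. This is one possible reading of the steps above, offered as an illustration: the class name, the pre-processed match key, the rule that two distinct sources validate a record, and the point at which the identifier is assigned are assumptions, since the actual assignment rules 1308 are not spelled out here.

```python
import itertools


class IdentifierRegistry:
    """Toy model of a single-source repository and a multi-sourced file."""

    def __init__(self, min_sources=2):
        self.min_sources = min_sources      # assumed validation threshold
        self.multi_source = {}              # match key -> record
        self.single_source = {}             # match key -> record
        self._counter = itertools.count(1)  # monotonic, so ids are never reused

    def ingest(self, key, source):
        """Record that `source` reported the business identified by `key`."""
        if key in self.multi_source:
            # Match found: append to the existing multi-sourced record.
            self.multi_source[key]["sources"].add(source)
            return self.multi_source[key]["id"]
        # No match: hold the record in the single-source repository.
        record = self.single_source.setdefault(key, {"id": None, "sources": set()})
        record["sources"].add(source)
        if record["id"] is None:
            record["id"] = next(self._counter)
        # Promote once enough distinct sources validate the record.
        if len(record["sources"]) >= self.min_sources:
            self.multi_source[key] = self.single_source.pop(key)
        return record["id"]
```

  • A record reported twice by the same source stays in the single-source repository, because the set of sources only grows when a new, distinct source confirms it.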
  • Quality assurance 102 includes how identification numbers are managed.
  • an identification number is retained for the life of a business. No two businesses ever receive the same identification number. Identification numbers are never recycled. The identification number is retained when a company moves anywhere within the same country.
  • the identification number is preferably an industry standard for business identification.
  • Quality assurance 102 of identification number driver 112 includes validation and protection against duplication. Rigorous processing is done to identify duplicate identification numbers, including using duplicate scoring systems, implementing controls around bulk file building, and undergoing validations prior to entering the database. In an example, every business is validated before it is included in database 118 so that the address is based on postal standards, incoming records are validated in relation to a town file (e.g., address, city, ZIP, state, and telephone number), and phone number and line of business are verified. Multiple sources are used for validation because business registrations alone sometimes do not indicate that a business has begun operations.
  • Quality assurance 102 of identification number driver 112 includes refining and enhancing the identification number assignment process.
  • FIGS. 14-16 show how corporate linkage driver 114 builds corporate linkage to reveal how companies are related. Without corporate linkage, the companies L Refinery Div. 1402, C Stores Inc. 1404, and G Storage Div. 1406 in FIG. 14 appear to be unrelated.
  • L Inc. 1504 has two branches, L Storage Div. 1510 and L Refinery Div. 1402 (shown in FIG. 14).
  • C Inc. 1506 has two branches, Industrial Co. 1512 and Building Co. 1514, and a subsidiary, C Stores Inc. 1404 (shown in FIG. 14).
  • G Inc. 1508 has two branches, G Storage Div. 1406 (shown in FIG. 14) and G Refinery Div. 1516.
  • C Stores Inc. has four branches, North Store Inc. 1518, South Store Inc. 1520, West Store Inc. 1522, and East Store Inc. 1524. Building extensive corporate linkage allows a business information provider to be an industry leader by providing this complete detail.
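  • As an illustration, the relationships listed above can be held in a simple child-to-parent map and walked upward to find the top of each family. The map and the function names `top_of_family` and `family_members` are assumptions for this sketch; only the links named in the text are modeled.

```python
# Child -> parent links taken from the description of the example family trees.
PARENT = {
    "L Storage Div.": "L Inc.",
    "L Refinery Div.": "L Inc.",
    "Industrial Co.": "C Inc.",
    "Building Co.": "C Inc.",
    "C Stores Inc.": "C Inc.",
    "G Storage Div.": "G Inc.",
    "G Refinery Div.": "G Inc.",
    "North Store Inc.": "C Stores Inc.",
    "South Store Inc.": "C Stores Inc.",
    "West Store Inc.": "C Stores Inc.",
    "East Store Inc.": "C Stores Inc.",
}


def top_of_family(entity: str) -> str:
    """Follow parent links until an entity with no recorded parent is reached."""
    seen = set()
    while entity in PARENT:
        if entity in seen:
            raise ValueError("cycle in linkage data")
        seen.add(entity)
        entity = PARENT[entity]
    return entity


def family_members(top: str) -> list:
    """List every linked entity whose chain of parents ends at `top`."""
    return sorted(e for e in PARENT if top_of_family(e) == top)
```

  • With this map, a store several levels down resolves to the same top entity as its sibling branches, which is the kind of relationship that is invisible when each record is viewed alone.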
  • FIG. 16 shows how corporate linkage driver 114 updates family trees after mergers and acquisitions.
  • ABC 1602 and XYZ 1604 exist before a merger and each has its own subsidiaries and branches.
  • ABC XYZ 1606 has two subsidiaries, ABC subsidiary 1608 and XYZ subsidiary 1610, each with its own branches and/or subsidiaries.
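  • A family tree update after a merger might be sketched as re-parenting the two former top entities under the merged entity. The function names and the branch names below are hypothetical, and the former tops stand in for the ABC and XYZ subsidiaries described above.

```python
def top_of(parent_of: dict, entity: str) -> str:
    """Walk child -> parent links to the top of the family."""
    while entity in parent_of:
        entity = parent_of[entity]
    return entity


def merge_families(parent_of: dict, first_top: str, second_top: str,
                   merged_name: str) -> dict:
    """Return a new linkage map in which both former tops report to merged_name."""
    for top in (first_top, second_top):
        if top in parent_of:
            raise ValueError(f"{top} is not the top of its family")
    updated = dict(parent_of)  # leave the pre-merger map untouched
    updated[first_top] = merged_name
    updated[second_top] = merged_name
    return updated


# Hypothetical pre-merger linkage: each company has one branch.
before = {"ABC branch": "ABC", "XYZ branch": "XYZ"}
after = merge_families(before, "ABC", "XYZ", "ABC XYZ")
```

  • Because only the two top-level links change, every existing branch and subsidiary keeps its place in the tree and automatically resolves to the merged entity.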
  • Corporate linkage driver 114 opens up profitable opportunities in risk management, sales and marketing, and supply management for a user. It allows the user to understand the total risk exposure and regulatory and statutory compliance implications across a corporate family. The user recognizes the relationship between bankruptcy or financial stress in one company and the rest of its corporate family. The user increases sales by up-selling and cross-selling with a corporate family. The user reduces expenses by reducing research time. The user can maximize the opportunity based on revenues from an entire corporate family. The user can understand where purchase decisions are made. The user can identify possible conflicts of interest. The user can determine its total spend with a corporate family to better negotiate.
  • FIG. 17 shows an example method of performing corporate linkage driver 114.
  • it shows a method of updating family tree linkage 1700 where the goal is to correctly link all subsidiaries and branches of each entity having an identification number with consistent names, tradestyles, and correct employee numbers, while resolving all look-a-likes (LALs).
  • members include a global ultimate, a domestic ultimate, parent corporations, subsidiaries, headquarters, and branches.
  • a global ultimate is the highest ranking member of a corporate family globally.
  • a domestic ultimate is the highest ranking member of a corporate family within a specific country.
  • a parent corporation is a company that owns more than half of another company.
  • a subsidiary is a company that is more than half owned by a parent company.
  • Headquarters is a company with reporting branches or divisions.
  • a branch is a secondary location or operation, not a separate entity.
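  • The definitions of global ultimate and domestic ultimate above can be illustrated with a short sketch. The `Entity` type, the sample family, and the rule used for the domestic ultimate (the highest member in an entity's chain of parents that shares its country) are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Entity(NamedTuple):
    name: str
    country: str
    parent: Optional[str]  # None for the top of the family


def chain(entities: dict, name: str) -> list:
    """Return the entity followed by each of its ancestors, bottom to top."""
    links = []
    current: Optional[str] = name
    while current is not None:
        entity = entities[current]
        links.append(entity)
        current = entity.parent
    return links


def global_ultimate(entities: dict, name: str) -> str:
    """Highest ranking member of the family worldwide."""
    return chain(entities, name)[-1].name


def domestic_ultimate(entities: dict, name: str) -> str:
    """Highest ranking member in the chain within the entity's own country."""
    links = chain(entities, name)
    home = links[0].country
    return [e for e in links if e.country == home][-1].name


# Hypothetical family spanning two countries.
FAMILY = {e.name: e for e in [
    Entity("Global Holdings", "GB", None),
    Entity("US Parent Inc.", "US", "Global Holdings"),
    Entity("US Sub Inc.", "US", "US Parent Inc."),
    Entity("US Branch", "US", "US Sub Inc."),
]}
```

  • For the hypothetical branch here, the global ultimate is the foreign holding company while the domestic ultimate is the highest U.S. parent.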
  • FIG. 18 shows how other relationships among DUNS numbered entities can be overlaid onto the Corporate Linkage view to enrich the overall understanding of a group of otherwise potentially independent entities. Examples of this include franchise relationships, associations, co-ops, agents, dealers, chapters and affiliated concerns.
  • Quality assurance 102 during corporate linkage 114 increases the completeness and accuracy of corporate families.
  • a dedicated team reviews corporate families. This ensures business names, tradestyles, and SICs are consistent within a corporate family.
  • Quality assurance 102 includes checking for duplicates. There are central review and updates for the largest global family trees. Changes are monitored to identify and track mergers and acquisitions and other major events.
  • Quality assurance 102 includes matching of corporate families. There are quality programs to ensure business entities are linked properly and to handle linkage breaks within a corporate tree.
  • Corporate linkage is done through legal ownership.
  • Quality assurance 102 of corporate linkage 114 includes continually refining and enhancing corporate linkage based on customer feedback.
  • corporate linkage 114 capabilities include global cross-border linkage, U.S.
  • Quality assurance processes include using a validation tool to identify erroneously unlinked records or ‘look-a-likes’. The quality assurance processes are continually refined and enhanced based on learnings, feedback and reviews.
  • Predictive indicator driver 116 summarizes the information collected on a business and uses it to predict future performance. Predictive indicators use statistical analysis to indicate the likelihood of a business to perform in a specific way in the future. There are many benefits to predictive indicators. Users can make faster, more consistent decisions by allowing automated decisions for increased efficiency. Users can free up resources to look at time-intensive borderline decisions. Users can make more consistent decisions across the entire organization. Users can allow faster processing of large volumes of transactions. Users can apply scores across an entire portfolio to quickly identify risk and opportunity. Users can help estimate demand to target the right prospects and reduce acquisition costs.
  • Descriptive ratings summarize how a customer has historically been paying bills.
  • Predictive scores are a prediction of how likely it is for a business to pay promptly or continue as an ongoing concern.
  • Demand estimators estimate how much of a product a business is likely to buy in total (response, approval, look-a-like models).
  • Predictive indicators help a user to accelerate and impact profitability in all areas of its business.
  • descriptive ratings and predictive scores help the user grant or approve credit.
  • a rating indicates creditworthiness of a company based on past financial performance.
  • a score indicates likelihood of a business to continue as an ongoing concern or pay on time.
  • Predictive scores can be applied across the user's whole portfolio to quickly identify high-risk accounts and begin aggressive collection immediately or to evaluate the credit worthiness of each applicant.
  • a commercial credit score predicts the likelihood of a business paying slow over the next twelve months.
  • a financial stress score predicts the likelihood of a business failing over the next twelve months.
  • look-a-like models, response models and demand estimators let a user: identify prospects that look like their best customers, identify who is likely to respond to an offer, and/or how much product they will buy so that it can prioritize opportunities among customers or prospects.
  • Examples of demand estimators include number of personal computers and local or long distance spending.
  • predictive scores can be applied to all of a user's suppliers to quickly understand their risk of failing in the future.
  • predictive scores may be customized according to a user's specific need and criteria.
  • criteria may be used, such as (1) what behavior does the user want to predict; (2) what is the size of the business the user wants to assess; and (3) what are the decision rules based on the user's risk tolerance to translate risk assessment in to a credit decision or risk management or marketing action.
  • Predictive indicators are enabled by analytic capability and data capability. For example, a dedicated team of experienced business-to-business (B2B) expert PhDs may build the underlying predictive models and have access to industry-specific knowledge, financial and payment information, and extensive historical information for analysis.
  • FIGS. 19 and 20 show an example method of creating a predictive indicator. It starts with market analysis 1802 and then there is a business decision on model development 1804. This decision involves the type of score to be developed and output at the end, such as a failure risk score, a delinquency risk score, or an industry specific score.
  • the failure risk score is the likelihood that a company will cease operations.
  • the delinquency risk score is the likelihood that a company will pay late.
  • the industry specific score predicts something particular, such as the likelihood of using copiers or truckers or whether a company is a good credit risk.
  • Input data 1806 is gathered from an archive of credit database 1808 and a trade tape database 1810, which provide historical data related to credit.
  • the next step refers to a risk to be evaluated, such as a financial stress score that predicts the likelihood of a negative failure in the next twelve months.
  • a development sample is selected from a business universe 1814, a demographic profile is created of the business universe 1816, and exploratory data analysis is performed 1818 (univariate analysis of all variables). Tasks are performed such as determining the relationships between the variable and what is being predicted, the range of a variable, the type of variable, including or not including variables, and other functions related to understanding what to put in the model. Variables may be selected in accordance with the observation period and the performance period and weights may be assigned to indicate accuracy or representativeness. Trends are factored in. Quality assurance includes periodically checking to see if anything in the business universe affects the initial model and taking a score and running it against a prior period to check that it is still indicative or predictive.
  • statistical analysis and model development processes including logistic regression and other estimation techniques 1820 are performed. This step includes applying the appropriate models, formulas, and statistics. Next, statistical coefficients are converted into a scorecard 1822. Models are tested and validated 1824, and technical specifications are developed 1826. Finally, the model is implemented 1828 and tested 1830. Data is run through the model to generate a score. Periodically, checks are performed to verify that the score is still valid and to determine if the scorecard needs to be updated.
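  • The conversion of logistic regression coefficients into a scorecard is not spelled out in the text. One widely used convention, shown here purely as an assumption, scales the model's log-odds so that a fixed number of points doubles the odds. The parameter values (`pdo`, `base_score`, `base_odds`) and the feature names in the usage are hypothetical, and the log-odds are taken to be those of the favorable outcome so that a higher score means lower risk.

```python
import math


def scorecard_scaling(pdo: float = 20.0, base_score: float = 600.0,
                      base_odds: float = 50.0) -> tuple:
    """Return (factor, offset) so that score = offset + factor * log_odds.

    base_score is the score assigned at base_odds, and every additional
    pdo points corresponds to a doubling of the odds.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score(features: dict, coefficients: dict, intercept: float,
          pdo: float = 20.0, base_score: float = 600.0,
          base_odds: float = 50.0) -> float:
    """Turn a fitted logistic model's linear predictor into scorecard points."""
    # Linear predictor of the logistic model: log-odds of the favorable outcome.
    log_odds = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    factor, offset = scorecard_scaling(pdo, base_score, base_odds)
    return offset + factor * log_odds
```

  • Under this convention a business whose modeled odds equal `base_odds` receives exactly `base_score`, and each doubling of the odds adds `pdo` points, which makes the resulting scores easier to interpret than raw coefficients.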
  • Quality assurance 102 of predictive indicators 116 includes continually monitoring and adjusting predictive indicators to reflect new information. In an example, this includes periodic testing of predictiveness, continuous manual refinement and recalibration, automated changes, monthly audits and annual validation, and analyzing data for each model with respect to its predictive qualities and importance whenever models are created or updated. Also, predictive indicators are continually refined and enhanced based on customer feedback. Predictive indicators 116 has data depth, including demographic data, payment information, detailed public record information, such as suits, liens, judgments, bankruptcies, and UCC filings, public and private company financial information, and linkage data used to assign risk to the responsible entity (i.e., score branches with HQ data). An independent group of reviewers check and validate the results of the scores, from which continual refinement and enhancement is realized. Customer needs and industry trends are also considered when quality assurance processes are done to continually improve the models and scores.
  • a global database used to perform a method of data integration encompasses millions of records and is updated daily. Users gain a fresher, more complete picture of each of their customers, prospects, and suppliers, because of the large number of daily updates to the database. Users are able to assess the risk of non-U.S. companies, because the database has global data. Users can more completely identify the risk from small business customers. Users make more informed risk decisions. Users identify new prospects from data drawn from multiple sources. Users gain access to international customers, suppliers and prospects. Users receive enhanced prospect lists with value-added information, such as line of business and contact name. Users can assess risk from foreign suppliers. Users can more completely identify the risk from suppliers.

Abstract

A data integration method involves a unique method of collecting raw business data and processing it to produce highly useful and highly accurate information to enable business decisions. This process includes collecting global data, entity matching, applying an identification number, performing corporate linkage, and providing predictive indicators. These process steps work in series to filter and organize the raw business data and provide quality information to customers. In addition, the information is enhanced by quality assurance at each step in this process to ensure the high quality of the resulting data.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application is a continuation-in-part of and claims the benefit of application Ser. No. 10/368072, filed Feb. 18, 2003, entitled “Data Integration Method,” which is currently pending.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to a process of collecting and enhancing commercial data and, more particularly, to quality assurance and five quality drivers.
  • 2. Description of the Related Art
  • To be successful, businesses need to make informed decisions. In risk management, businesses need to understand and manage total risk exposure. They need to identify and aggressively collect on high-risk accounts. In addition, they need to approve or grant credit quickly and consistently. They also need to verify prospect, customer and supplier data to ensure compliance with government regulations. In sales and marketing, businesses need to determine the most profitable customers and prospects to target, as well as incremental opportunity in an existing customer base. They need to understand who and how big their most important customers are, acquire new high-growth customers that look like their best customers and reallocate their sales force based on growth and opportunity. In supply management, businesses need to understand the total amount being spent with suppliers to negotiate better. They also need to uncover risks and dependencies on suppliers to reduce exposure to supplier failure.
  • The success of these business decisions depends largely on the quality of the information behind them. Quality is determined by whether the information is accurate, complete, timely, and Cross-Border Consistent. Accuracy is defined as having the right information on the right business. Completeness is defined as providing breadth and depth of data. Timeliness is making frequent updates to keep the information fresh. Cross-Border Consistency is providing consistent data across the globe. With thousands of sources of data available, it is a challenge to determine which is the quality information a business should rely on to make decisions. This is particularly true when businesses change so frequently. In the next 60 minutes in the U.S., 251 businesses will have a suit, lien, or judgment filed against them, 58 business addresses will change, 246 business telephone numbers will change or be disconnected, 81 directorship (CEO, CFO, etc.) changes will occur, 41 new businesses will open their doors, 7 corporations will file for bankruptcy, and 11 companies will change their name.
  • Conventional methods of providing business data are incomplete. Some providers collect incomplete data, fail to completely match entities, have incomplete numbering systems that recycle numbers, fail to provide corporate family information or provide incomplete corporate family information, and merely provide incomplete value-added predictive data. It is an object of the present invention to provide more complete, timely, accurate, and consistent business data. This includes data collection, entity matching, identification number assignment, corporate linkage, and predictive indicators. This produces high quality business information that provides insights so businesses can trust and decide with confidence.
  • SUMMARY OF THE INVENTION
  • One aspect of the present invention is a method of data integration comprising collecting information comprising primary data. The primary data is tested for accuracy and processed to produce secondary data and enhanced information comprising the primary data and the secondary data is provided. In some embodiments, primary and/or secondary data is sampled periodically thereby generating sample data. The sample data is evaluated against at least one predetermined condition. Based upon this evaluation, testing and/or processing steps are adjusted.
  • In some embodiments, testing comprises at least one of the following steps: (a) determining if the primary data matches stored data and (b) assigning an identification number to the primary data. It is determined if the primary data meets a first threshold condition before assigning an identification number in step (b) if the primary data does not match the stored data in step (a). The first threshold condition is that multiple sources confirm that a business associated with the primary data exists. The identification number is an entity identifier. The primary data is stored in a separate repository and assigned an identification number if it does not meet the first threshold condition. When additional primary data is received, it is determined whether the primary data and the additional primary data together meet the first threshold condition; if they do, the entity is moved into the multi-source repository.
  • Another aspect of the present invention is a system for data integration. The system includes a data generator, a testing unit, a first processing unit, and a second processing unit. The data generator is capable of gathering primary data from at least one data source. The testing unit is capable of testing the primary data for accuracy. The first processing unit is capable of analyzing the primary data and generating secondary data from the result of the analysis. The second processing unit is capable of merging the primary data and the secondary data to form enhanced information. The testing unit, first processing unit, and the second processing unit may be the same or independent of one another. In some embodiments, the testing unit comprises at least one of a data matching unit and entity identifier unit. The first processing unit comprises at least one of a corporate linkage unit and a predictive indicator unit.
  • Another aspect of the present invention is a machine-readable medium for storing executable instructions for data integration. The instructions include collecting information comprising primary data, testing the primary data for accuracy, processing the primary data to produce secondary data, and providing enhanced information comprising the primary data and the secondary data.
  • In some embodiments, the primary and/or secondary data is sampled periodically, thereby generating sample data. The sample data is evaluated against at least one predetermined condition. The testing and/or processing is adjusted based upon the evaluation.
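  • The periodic sampling and adjustment described above might look like the following sketch. Everything specific in it is an assumption for illustration: the function name `audit_and_adjust`, the use of a match threshold as the setting being adjusted, the accuracy target, and the step size.

```python
import random
from typing import Callable, Optional, Sequence


def audit_and_adjust(records: Sequence, is_accurate: Callable[[object], bool],
                     match_threshold: float, *, sample_size: int = 100,
                     min_accuracy: float = 0.95, step: float = 0.05,
                     rng: Optional[random.Random] = None) -> float:
    """Sample records, evaluate them against a predetermined condition,
    and return a possibly tightened match threshold."""
    rng = rng or random.Random()
    sample = rng.sample(list(records), min(sample_size, len(records)))
    if not sample:
        return match_threshold  # nothing to evaluate
    accuracy = sum(1 for r in sample if is_accurate(r)) / len(sample)
    if accuracy < min_accuracy:
        # Predetermined condition not met: make matching stricter, capped at 1.0.
        return min(1.0, match_threshold + step)
    return match_threshold
```

  • A caller would run this on a schedule and feed the returned threshold back into the testing or processing step, so that the process tightens when sampled quality drops.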
  • These and other features, aspects, and advantages of the present invention will become better understood with reference to the drawings, description, and claims.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of the method of data integration according to the present invention;
  • FIG. 2 is a block diagram of a system for data integration according to the present invention;
  • FIG. 3 is a block diagram of a system for data integration according to the present invention;
  • FIG. 4 is a logic diagram depicting the method of data integration according to the present invention;
  • FIG. 5 is a block diagram of example sources of data collection according to the present invention;
  • FIG. 6 is a block diagram of more example sources of data collection according to the present invention;
  • FIGS. 7 and 8 are block diagrams of entity matching according to the present invention;
  • FIG. 9 is a block diagram of entity matching where no match is found in existing databases of traditional businesses, but through sourcing internal and external data stores an emerging business match is made;
  • FIG. 10 is a block diagram of entity matching where matched data is delivered to one database or matched through internal and external data sources or assigned an identification number and housed in a single source repository;
  • FIGS. 11 and 12 are block diagrams of a method of entity matching according to the present invention;
  • FIGS. 13-15 are block diagrams of corporate linking according to the present invention;
  • FIG. 16 is a block diagram of corporate linkage following a merger/acquisition event;
  • FIG. 17 is a logic diagram of an example method of performing corporate linkage according to the present invention;
  • FIG. 18 is a block diagram of corporate linkage where relationships are outside of the definition of legal ownership to show other types of linkages; and
  • FIGS. 19 and 20 are block diagrams of an example method of providing a predictive indicator according to the present invention.
  • DETAILED DESCRIPTION OF THE INVENTION
  • In the following detailed description, reference is made to the accompanying drawings. These drawings form a part of this specification and show, by way of example, specific preferred embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. Other embodiments may be used and structural, logical, and electrical changes may be made without departing from the spirit and scope of the present invention. Therefore, the following detailed description is not to be taken in a limiting sense and the scope of the present invention is defined only by the appended claims.
  • FIG. 1 shows an overview of a method of data processing according to the present invention. The foundation of the method is quality assurance 102, which is the continuous data auditing, validating, normalizing, correcting, and updating done to ensure quality all along the process. There are five quality drivers that work sequentially to enhance the incoming data 104 to turn it into quality information 106. These five drivers are: a data collection driver 108, an entity matching driver 110, an identification (ID) number driver 112, a corporate linkage driver 114, and a predictive indicators driver 116. These five drivers interface with a database 118. Database 118 is an organized collection of data and database management tools, such as a relational database, an object-oriented database, or any other kind of database. Data in database 118 is continually refined and enhanced based on customer feedback and quality assurance testing and procedures.
  • Data collection driver 108 brings together data from a variety of sources worldwide. Then, the data is integrated into database 118 through entity matching driver 110, resulting in a single, more accurate picture of each business entity. Next, identification number driver 112 applies an identification number as a unique means of identifying and tracking a business globally through any changes it goes through. Corporate linkage driver 114 then builds corporate families to enable a view of total corporate risk and opportunity. Finally, predictive indicators driver 116 uses statistical analysis to rate a business' past performance and indicate the likelihood of a business to perform in a specific way in the future.
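  • The sequential arrangement of the five drivers, with quality assurance applied along the way, can be sketched as a simple pipeline. The `run_pipeline` function, the stub drivers, and the shape of the quality check are hypothetical; real drivers would do the work described for each step.

```python
from typing import Callable, Iterable, Tuple

Record = dict
Driver = Callable[[Record], Record]

# Order of the five quality drivers as described in the text.
DRIVER_ORDER = [
    "data collection",
    "entity matching",
    "identification number",
    "corporate linkage",
    "predictive indicators",
]


def make_stub(step: str) -> Driver:
    """Build a placeholder driver that only records that it ran."""
    def driver(record: Record) -> Record:
        return {**record, "steps": record.get("steps", []) + [step]}
    return driver


def run_pipeline(record: Record, drivers: Iterable[Tuple[str, Driver]],
                 quality_check: Callable[[str, Record], bool]) -> Record:
    """Apply each driver in order, running a quality check after every step."""
    for name, driver in drivers:
        record = driver(record)
        if not quality_check(name, record):
            raise ValueError(f"quality check failed after {name}")
    return record
```

  • Running the stubs in `DRIVER_ORDER` over an input record yields a record whose `steps` list matches that order, and a failing quality check stops processing at the step where the problem was found.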
  • FIGS. 2 and 3 show two example embodiments of systems for data integration according to the present invention, although other systems would also be suitable for practicing the present invention. FIG. 2 shows a network configuration while FIG. 3 shows a computer system configuration. In FIG. 2, a network 200 facilitates communication among the other system components, including a computer system 202. The five quality drivers, data collection driver 108, entity matching driver 110, identification number driver 112, corporate linkage driver 114, and predictive indicators driver 116, and quality assurance 102 work sequentially to enhance the incoming data 104 to turn it into quality information 106 stored in database 204. In FIG. 3, a computer system 300 has a processor 302 with access to memory 304 via a bus 306. Memory 304 stores an operating system program 308, a data integration program 310, and data 312.
  • FIG. 4 shows detail around Quality Assurance for each driver as another embodiment of a method of data integration according to the present invention. This method includes five main drivers of data integration: data collection 108, entity matching 110, identification number 112, corporate linkage 114, and predictive indicators 116 to produce quality information 106. Quality information 410 is produced as a result of quality assurance performed by each driver.
  • For data collection 108, a very large amount of global data is collected from a variety of sources for increased accuracy. Quality assurance 400 is performed for data collection 108 to verify legal name and ownership to identify potential fraud, to update contact information, to update and make changes based on events, to verify and enhance third party information, and to ensure accuracy, completeness, timeliness and cross-border consistency. Quality assurance 400 continually refines and enhances data collection 108.
  • For entity matching 110, incoming data is matched to data in database 118. Quality assurance 402 is performed for entity matching with manual and automated quality checks to ensure accurate matches and eliminate duplicates. Based on customer feedback and matching learnings, quality assurance 402 for entity matching 110 is continually refined and enhanced.
  • For identification number 112, businesses are uniquely identified and tracked. Quality assurance 404 is performed for identification number 112 by retaining an identification number for the life of a business and by being recognized as an industry standard. The identification number allows verification of information in each of the five drivers. For data collection 108, if data is not linked to an identification number, it indicates the possibility of a new business. For entity matching 110, the identification number allows new data to be accurately matched to existing businesses. For corporate linkage 114, corporate families are assembled based on each business' identification number. For predictive indicators 116, numbered data is used to build predictive tools. A verification process assigns an identification number when commercial activity is confirmed. Quality assurance 404 for identification number 112 includes validating and protecting against duplication. The identification number assignment process is continually refined and enhanced.
  • For corporate linkage 114, corporate families are built to provide a view of total risk and opportunity. Quality assurance 406 for corporate linkage 114 includes building corporate families globally and updating them after mergers, acquisitions, and other events. Quality assurance 406 for corporate linkage 114 includes increasing completeness and accuracy of corporate families by having a dedicated team review corporate families and by matching corporate families. Based on customer feedback, the corporate linkage 114 is continually refined and enhanced.
  • For predictive indicators 116, statistical analysis is used to indicate the likelihood of a business to perform in a specific way in the future. Quality assurance 408 for predictive indicators 116 includes continually monitoring and adjusting predictive indicators 116 to reflect new information. Based on customer feedback, the predictive indicators 116 are continually refined and enhanced.
  • Thus, the five main components or drivers work together to integrate the data collected into quality information 106 that is useful for making business decisions. The process is continually enhanced to continually improve quality based on feedback, learnings and experience spanning over the past 160 years. Each of the five drivers is examined in more detail below, starting with data collection driver 108.
  • Global Data Collection
  • FIG. 5 shows some sources of data used in data collection driver 108. Data is collected about customers, prospects, and suppliers with the goal of collecting the most complete data possible. Preferably, database 118 is a global database. For example, database 118 has data for millions of businesses worldwide and is updated daily. In this example, database 118 contains direct investigations, news, and media 502, payment and financial data (trade data) 504, public records and government registries 506, and web sources and directories 508. Payment and financial data includes trade records updated frequently, complete coverage of public company financials, and coverage of financial statements on privately held companies. Public records and government registries include suits, liens, judgments, uniform commercial code filings, bankruptcy filings, and business registrations. Web sources and directories include uniform resource locators (URLs), updates from domains, and customers providing online updates. Data can also be collected from other strategic data partners. These strategic partners provide data from international markets and conduct an agreed upon amount of due diligence on the data, prior to delivery into the global database. The inclusion of data from strategic partners enables comprehensive global coverage.
  • In an example database 118, top news providers are monitored every day to uncover changes and updates that affect the risk level and/or marketing attributes of the user's customers, prospects, and suppliers. This data is focused on publicly traded companies with additional coverage devoted to mergers and/or acquisitions and high risk or business deterioration. News is posted within 24 hours of release. The types of events include mergers and acquisitions, control changes, purchase or sale of assets, officer, name or location changes, earnings updates, and business closings. The benefit is updated information that affects the risk level of companies the user does business with and indications of key changes that can be used for marketing purposes.
  • In an example database 118, payment experiences from companies are collected to help the user predict future payment habits of prospects and customers. Accounts receivable data on U.S.-based companies provides an overall evaluation of how quickly and completely a company made payment to each vendor. Many reports created using database 118 include payment data. This payment experiences data has many benefits. Users get a picture of how a company is paying their vendors, bank loans, and other financial obligations. It enables showing payment trends over time. It enables creation of predictive scores for use in applications such as automated credit approvals. It helps pre-screen potential customers based on their ability to pay on time. Payment experiences are summarized to show the user how different industries are paid and credit limits.
  • An example database 118 has public records from U.S. courts and legal filing offices to provide critical insights into the risk of a company. This data includes U.S.-based company information on suits, liens, judgments, bankruptcies, and U.C.C. filings (collectively called public records), information obtained from courts and recording offices, company filing for bankruptcy protection under Chapter 11 (re-organization) or Chapter 7 (liquidation). This data captures a majority of the U.S. public filings and has many benefits. Over 10 years of historical coverage enable predictive credit ratings and scores. Users understand legal actions that could affect a company's ability to continue as an ongoing concern. A company's rating is negatively impacted when a bankruptcy takes place. Users are notified about all companies affected in a corporate family when a bankruptcy occurs within the corporate family.
  • In an example database 118, complete coverage of public company financial statements and many privately held company financial statements help the user to understand financial strength. This data includes balance sheet and income statements and private company financial statements collected from certified public accountants (CPAs) or from corporate officers. In the US, for example, public company financial information is obtained from the Securities and Exchange Commission (SEC) or annual reports, 10K's and 10Q's. The database 118 has complete coverage on public companies. Most financial statements are on privately-held companies. This data has many benefits. Users understand financial strength, ability to pay on time and ability to continue as an ongoing concern. This data helps target prospects by size or financial strength.
  • An example database 118 has data from telephone calls that verify and enhance the third party information, leading to over one and one-half million updates to the database 118 every day. This data includes interviews with business principals to verify and enhance information from other sources. Every public company is monitored daily. There is a focus on collecting value-added data (e.g., business name, address, telephone number, SIC, employee number, sales, CEO/owner name). This has many benefits. It serves as an additional check on the accuracy of the data, helps validate third party data, builds content on small businesses, and makes the data consistent across the globe. Consistency of data enables customers to rely on the same high quality of information country to country, creating opportunity for growth, consistency in credit and marketing policies globally, understanding risk exposure, marketing opportunity and reliance on suppliers globally.
  • The URL file is collected from external and internal sources. Each URL is mined several times a year to confirm its status (live, parked, under construction, redirect, inactive) and to verify that it belongs to the company it has been assigned to, using the name, address, or telephone number from the existing database. Besides verification several times a year, additional data elements such as security data, certificate data, strength of encryption, and other data are collected from the URL. The verified URLs are populated in the database using one-down linkage to expand coverage across family tree members.
  • FIG. 6 shows some additional sources of data used by data collection driver 108 for increased accuracy, such as telephone company data 602, internet, news and media 604, direct investigations 606, company financial information 608, payment data 610, courts and legal filings offices 612, government registries 614, and diversity data 616. This completeness of information aids profitable business decisions. In risk management, a user assesses risk from non-United States (U.S.) companies with the resulting information. Risk from small business customers can be more completely identified. The user can make more informed risk decisions when they are based on more complete information. In sales and marketing, the user can identify new prospects from data drawn from multiple sources. The user can gain access to international customers and prospects and cherry pick a prospect list with value-added information such as standard industrial classification (SIC) and contact name. In supply management, the user may assess risk from foreign suppliers with the resulting information and identify the risk from suppliers more completely. The user gains a fresher, more complete picture of each customer, prospect, and supplier because of daily updates to database 118.
  • In an example, telephone company data is collected to identify new businesses, changes in existing records and to provide updated contact information. Businesses request new listings when initiating phone service. The benefits of this data include indication of a new business or change in phone number and enabling creation of new records or enhancing existing ones, providing the most recent address, phone number, and line of business (SIC) information.
  • In an example, database 118 includes business registrations from state government registries to verify legal name and ownership and to identify potential frauds. Database 118 has information on business registrations filed at the time a company is incorporated. This has many benefits. It enables verification of the existence of registered businesses, confirms information, such as a company's organizational structure, date, and state of incorporation (or organization), aids fraud investigation through review of names and principals and business standing within a state, and enables identification of all changed records and new-to-file records.
  • Quality assurance 102 of database 118 ensures accuracy, completeness, timeliness, and cross-border consistency of global data. Quality assurance includes standardizing data, correcting and updating data, ensuring phone numbers connect and mailing addresses deliver to the intended recipient, and conducting manual reviews.
  • Quality assurance 102 includes standardizing data. Numerous quality edits and validations are made at the time of data entry. Data is validated to ensure consistency between branch and headquarters names, to check reasonability among number of employees, sales volume, and line of business, to prevent duplication of records, to validate out-of-business status changes, and more. Global cleansing software is used to standardize marketable records and ensure consistency in presentation of records. Addresses are standardized before inclusion in the database.
  • Quality assurance 102 includes correcting and updating data. In an example, the status of suits, liens, judgments and bankruptcy filings is reviewed and updated. Data flows between internal teams to ensure information is consistently updated between areas of news, risk, ratings and delivery. Constantly updating and refreshing the data leads to high response rates on customer acquisition promotions, high match rates between files and high quality data in the database 118.
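  • As a minimal sketch of the entry-time validations described above, the checks could be expressed as follows. The field names, the rule that a branch name should contain its headquarters name, and the sales-per-employee ceiling are all illustrative assumptions; none of them is specified in this description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BusinessRecord:
    # Field names are illustrative assumptions, not the schema of database 118.
    name: str
    employees: int
    annual_sales: float
    is_branch: bool = False
    headquarters_name: Optional[str] = None


# Hypothetical ceiling on sales per employee used for the reasonability check.
MAX_SALES_PER_EMPLOYEE = 5_000_000.0


def validate(record: BusinessRecord, existing_names: set) -> List[str]:
    """Return a list of validation problems (empty when the record passes)."""
    problems = []
    # Branch and headquarters names should be consistent (assumed rule:
    # the headquarters name appears inside the branch name).
    if record.is_branch and record.headquarters_name:
        if record.headquarters_name.lower() not in record.name.lower():
            problems.append("branch name inconsistent with headquarters name")
    # Number of employees and sales volume should be reasonable together.
    if record.employees <= 0:
        problems.append("employee count must be positive")
    elif record.annual_sales / record.employees > MAX_SALES_PER_EMPLOYEE:
        problems.append("sales volume unreasonable for employee count")
    # Prevent duplication of records (exact lower-cased name comparison only).
    if record.name.lower() in existing_names:
        problems.append("possible duplicate record")
    return problems
```

  • A record that returns an empty list would proceed to address standardization; any other result would be routed to correction or manual review.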
  • Quality assurance 102 includes manual reviews. Third party data is validated with manual reasonability reviews. Payment re-checks are manually performed on trade references appearing abnormal or exaggerated. Financial statements are reviewed to identify high risk businesses, ensure accuracy and apply capital strength ratings consistently across the universe of records. Comparisons of merger/acquisition update volumes are done with externally published numbers to ensure complete coverage.
  • Data is continually refined and enhanced through quality assurance 102 and global data collection 108.
  • Entity Matching
  • FIG. 7 shows how multiple unmatched pieces of data 702 may be turned into a complete single business 704. Entity matching driver 110 checks the incoming data 104 to see if it belongs to any existing business in database 118. In this example, ABC, Inc., Chuck's Mini-Mart, and Charles Smith appear to be separate companies, but after entity matching, it is clear that they are all part of one enterprise, ABC Inc. and Chuck's Mini-Mart. The different addresses and other associated information are also reconciled into a complete single business 704.
  • There are many benefits from entity matching driver 110. Entity matching driver 110 detects similarities in incoming data and combines it into a single business. Queries are more likely to be accurate, customer, supplier, and prospect information is consolidated to provide more complete and accurate profiles, and there are fewer duplicate records. In addition, the customer can receive information about the quality of their matched records via D&B's matching feedback mechanisms, allowing the customer to decide how to use the matched information in their business processes. Another benefit is that the customer receives a consistent answer as the matching process is repeatable and defined.
  • FIG. 8 shows how incoming data 104 that matches a business in database 118 is appended to that business through entity matching driver 110. Another case is shown in FIG. 9, where incoming data 104 that does not match any business in database 118 is sourced through internal and external data sources and matched to an emerging business or, as shown in FIG. 10, is assigned an identification number and held in a single source repository as learnings are gained on the entity. Entity matching driver 110 is designed to match data to the right business every time, thus increasing efficiency. Entity matching driver 110 provides more complete and accurate profiles of customers, prospects, and suppliers and ensures far fewer duplicate businesses.
  • FIG. 11 shows an example method of matching via entity matching driver 110. This method includes cleaning, parsing, and standardizing 1102, performing candidate retrieval 1104, and evaluation and decision making 1106. Cleaning, parsing, and standardizing 1102 includes identifying key components of inquiry data 1108, normalizing and standardizing name, address, and city 1110, performing name consistency 1112, and performing address standardization 1114. Candidate retrieval 1104 includes gathering possible match candidates from a reference database 1116, using optimized keys to improve retrieval quality and throughput 1118, and optimizing retrieval based on data provided in the inquiry data, observations of existing reference data, and ongoing tuning 1120. Evaluation and decision making 1106 includes evaluating matches according to a consistent standard 1122, applying a match grade 1124, applying a confidence code 1126, and applying a confidence percentile 1128.
  • To ensure quality assurance 102 of entity matching 110, manual and automated checks are performed. Samples of matched records are manually reviewed. Based on experience, customer feedback and learnings, entity matching 110 is recalibrated. Entity matching 110 allows and corrects for variations in spelling, formats, trade names, addresses, and the like. Entity matching 110 uses a match grade and confidence code to determine if the match passes the quality threshold. Entity matching 110 provides a consistent, repeatable process that is not based on human judgment. The benefits are more accurate matches and fewer duplicates.
  • Quality assurance 102 of entity matching 110 includes continually refining and enhancing entity matching 110 based on customer feedback. Samples of matched records are manually reviewed. Technology allows for corrections in spelling, formats, trade names, and addresses. Technology also interprets context of key parts of the inquiry to better find difficult matches (e.g., interpreting parts of the sound, geographic position, implied line of business, acronyms). Quality assurance is also ensured by using a customized retrieval approach for each inquiry that looks at the best way to find a match to optimize the result for each unique inquiry (e.g., some matches are better made by using sound algorithms, other matches are better made by using exact name matches). As enhancements are made, they become available both online and in batch systems to ensure consistency. The benefits of these improvements are increased search candidates, additional functionality and increased throughput. In other words, more hits, better hits, and better hits faster. Matching capabilities include matches to a proprietary database containing multiple names and addresses per record, the ability to identify matches that don't look exactly like each other, and the ability to select by the quality of the match.
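  • The three stages of FIG. 11 can be illustrated with a short sketch. It is not the matching technology described here: the abbreviation table, the retrieval key (city plus first letter of the name), the A/B/F component grades, the 0.8 similarity cut-off, and the 1-10 confidence code formula are all assumptions chosen to keep the example small, and the confidence percentile 1128 is omitted.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table used for standardization.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "incorporated": "inc", "company": "co"}


def standardize(text):
    """Stage 1: lower-case, strip punctuation, apply standard abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", "", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)


def retrieval_key(record):
    """Stage 2 key (assumed): standardized city plus first letter of the name."""
    return (standardize(record["city"]), standardize(record["name"])[:1])


def grade(a, b):
    """Grade one component: A (same), B (similar), F (different)."""
    ratio = SequenceMatcher(None, a, b).ratio()
    return "A" if ratio == 1.0 else "B" if ratio >= 0.8 else "F"


def match(inquiry, reference):
    """Return (best_record, match_grade, confidence_code), or None if no candidate."""
    key = retrieval_key(inquiry)
    candidates = [r for r in reference if retrieval_key(r) == key]
    best = None
    for cand in candidates:
        # Stage 3: grade name, address, and city with one consistent standard.
        grades = "".join(
            grade(standardize(inquiry[f]), standardize(cand[f]))
            for f in ("name", "address", "city")
        )
        # Assumed confidence code on a 1-10 scale derived from the grades.
        code = 1 + 3 * grades.count("A") + grades.count("B")
        if best is None or code > best[2]:
            best = (cand, grades, code)
    return best
```

  • A caller would then accept the match only when the confidence code meets its own quality threshold, which is the decision step the description leaves to the user of the matched data.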
  • DUNS Number
  • Identification (ID) number driver 112 appends a unique identification number to every business location so it can be easily and accurately identified. This identification number is non-indicative. One example of the unique identification number is the D-U-N-S® Number available from Dun & Bradstreet, headquartered in Short Hills, N.J., which is a nine-digit number that allows business locations to be easily tracked through changes and updates. The identification number is retained for the life of a business. No two business locations ever receive the same identification number and the identification numbers are never recycled. The identification number acts as an industry standard for business identification. It is endorsed by the United Nations, the European Commission, and over fifty industry groups.
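  • The properties listed above (nine digits, non-indicative, unique, never recycled) can be sketched with a small registry. This is a hypothetical illustration only; the description does not say how actual numbers are generated, and the random draw below is an assumption used simply because a random number carries no information about the business.

```python
import secrets


class IdentifierRegistry:
    """Issues nine-digit identifiers that are unique and never reused."""

    def __init__(self):
        self._issued = set()  # every number ever issued, including retired ones
        self._active = {}     # number -> business location currently holding it

    def assign(self, location):
        while True:
            # Random draw (assumption), so the number reveals nothing
            # about the business it identifies.
            number = f"{secrets.randbelow(10**9):09d}"
            if number not in self._issued:
                break
        self._issued.add(number)
        self._active[number] = location
        return number

    def retire(self, number):
        # The number leaves active use but stays in _issued,
        # so it can never be assigned to another location.
        self._active.pop(number, None)
```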
  • The identification number is a central concept in the data processing method according to the present invention. For quality assurance, the identification number allows verification of information at every stage of the process. For data collection driver 108, if data is not linked to an existing identification number, it indicates the possibility of a new business. For entity matching driver 110, the identification number allows new data to be accurately matched to existing businesses. For corporate linkage driver 114, corporate families are assembled based on each business' identification number. For predictive indicators driver 116, the identification number is used to build predictive tools.
  • Additionally, the identification number opens new areas of opportunity to a user's business by helping to verify that a business exists and validating the business location. Users are provided a complete view of prospects, customers, and suppliers. Existing data is clarified, duplication is identified, and related businesses are shown to be related. Users can more easily manage large groups of customers or suppliers when the identification number is appended to the user's information. The identification number enables fast and easy data updates when appended to the user's information. The identification number provides a complete view of prospects and customers by placing businesses, where applicable, within their domestic and global corporate ‘families’, identifying penetration and opportunities for up-sell and cross-sell. The identification number also helps aggregate data from multiple and disparate systems to gain better insight with one complete view of prospects, customers and suppliers.
  • The identification number not only helps identify duplication in files within the database, but also enables customers with a unique key that can be used to identify duplication in the customer's existing portfolio of accounts.
  • FIG. 13 shows an example method of identification number driver 112. Data collection 108 provides input data that is pre-processed 1300 by de-duping, appending phone and SIC, validating address and town, and checking for branches and franchises. This processed data is matched to a unique identifier file 1302. If a match is found, data is appended to an existing record in the multi-sourced file 1304. If a match is not found, the data is included in a single source repository 1306 and, then, unique identifier assignment rules are applied 1308. As new files are received and additional sources validate a record in the single source repository, that record then becomes included in the multi source file. If new sources do not validate the record, the record stays in the single source repository.
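  • The flow of FIG. 13 can be sketched as a toy model. Matching is reduced here to an exact lookup on a normalized name and address, pre-processing 1300 is reduced to whitespace and case normalization, and a simple counter stands in for the unique identifier assignment rules 1308; all of these simplifications, and the class and field names, are assumptions.

```python
from itertools import count


class IdentificationNumberDriver:
    """Toy model of the FIG. 13 flow; not the actual assignment rules."""

    def __init__(self):
        self.multi_source = {}     # match key -> record validated by 2+ sources
        self.single_source = {}    # match key -> record seen from one source only
        self._sequence = count(1)  # stand-in for the identifier assignment rules

    @staticmethod
    def preprocess(raw):
        # De-duping, phone/SIC append, and address validation are reduced
        # to whitespace and case normalization in this sketch.
        return (raw["name"].strip().lower(), raw["address"].strip().lower())

    def ingest(self, raw, source):
        key = self.preprocess(raw)
        if key in self.multi_source:
            # Match found: append the data to the existing multi-sourced record.
            record = self.multi_source[key]
            record["sources"].add(source)
            return record["id"]
        if key in self.single_source:
            record = self.single_source[key]
            record["sources"].add(source)
            if len(record["sources"]) > 1:
                # An additional source validates the record: promote it.
                self.multi_source[key] = self.single_source.pop(key)
            return record["id"]
        # No match: hold the record in the single source repository
        # and assign it an identifier.
        record = {"id": f"{next(self._sequence):09d}", "sources": {source}}
        self.single_source[key] = record
        return record["id"]
```

  • Ingesting the same business from a second, different source returns the same identifier and moves the record into the multi-source file; repeated files from the original source alone leave it in the single source repository.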
  • Quality assurance 102 includes how identification numbers are managed. In an example, an identification number is retained for the life of a business. No two businesses ever receive the same identification number. Identification numbers are never recycled. The identification number is retained when a company moves anywhere within the same country. The identification number is preferably an industry standard for business identification.
  • Quality assurance 102 of identification number driver 112 includes validation and protection against duplication. Rigorous processing is done to identify duplicate identification numbers including using duplicate scoring systems, implementing controls around bulk file building and undergoing validations prior to entering the database. In an example, every business is validated before it is included in database 118 so that the address is based on postal standards, incoming records are validated in relation to a town file (e.g., address, city, ZIP, state, and telephone number), and phone number and line of business are verified. There is multiple source validation because, for example, business registrations sometimes do not indicate that a business has begun operations.
  • Quality assurance 102 of identification number driver 112 includes refining and enhancing the identification number assignment process.
  • Corporate Linkage
  • FIGS. 14-16 show how corporate linkage driver 114 builds corporate linkage to reveal how companies are related. Without corporate linkage, the companies, L Refinery Div. 1402, C Stores Inc. 1404, and G Storage Div. 1406 in FIG. 14 appear to be unrelated.
  • As shown in FIG. 15, however, applying corporate linkage allows the entire corporate family to be viewable without limit in depth or breadth. Parent company U Products Group Corp. 1502 has three subsidiaries under it, L Inc. 1504, C Inc. 1506, and G Inc. 1508. L Inc. 1504 has two branches, L Storage Div. 1510 and L Refinery Div. 1402 (shown in FIG. 14). C Inc. 1506 has two branches, Industrial Co. 1512 and Building Co. 1514, and a subsidiary, C Stores Inc. 1404 (shown in FIG. 14). G Inc. 1508 has two branches, G Storage Div. 1406 (shown in FIG. 14) and G Refinery Div. 1516. C Stores Inc. has four branches, North Store Inc. 1518, South Store Inc. 1520, West Store Inc. 1522, and East Store Inc. 1524. Building extensive corporate linkage allows a business information provider to be an industry leader by providing this complete detail.
  • FIG. 16 shows how corporate linkage driver 114 updates family trees after mergers and acquisitions. In this example, two separate businesses, ABC 1602 and XYZ 1604 exist before a merger and each have their own subsidiaries and branches. After the merger, ABC XYZ 1606 has two subsidiaries, ABC subsidiary 1608 and XYZ subsidiary 1610, each with their own branches and/or subsidiaries.
  • Corporate linkage driver 114 opens up profitable opportunities in risk management, sales and marketing, and supply management for a user. It allows the user to understand the total risk exposure and regulatory and statutory compliance implications across a corporate family. The user recognizes the relationship between bankruptcy or financial stress in one company and the rest of its corporate family. The user increases sales by up-selling and cross-selling with a corporate family. The user reduces expenses by reducing research time. The user can maximize the opportunity based on revenues from an entire corporate family. The user can understand where purchase decisions are made. The user can identify possible conflicts of interest. The user can determine its total spend with a corporate family to better negotiate.
  • FIG. 17 shows an example method of performing corporate linkage driver 114. Generally, it shows a method of updating family tree linkage 1700 where the goal is to correctly link all subsidiaries and branches of each entity having an identification number with consistent names, tradestyles, and correct employee numbers, while resolving all look-a-likes (LALs).
  • Members of a corporate family are identified by their relationship to other members. In an example, members include a global ultimate, a domestic ultimate, parent corporations, subsidiaries, headquarters, and branches. A global ultimate is the highest ranking member of a corporate family globally. A domestic ultimate is the highest ranking member of a corporate family within a specific country. A parent corporation is a company that owns more than half of another company. A subsidiary is a company that is more than half owned by a parent company. A headquarters is a company with reporting branches or divisions. A branch is a secondary location or operation, not a separate entity.
  • FIG. 18 shows how other relationships among DUNS numbered entities can be overlaid onto the Corporate Linkage view to enrich the overall understanding of a group of otherwise potentially independent entities. Examples of this include franchise relationships, associations, co-ops, agents, dealers, chapters and affiliated concerns.
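  • The definitions above can be sketched with child-to-parent links. The links below reuse a few names from the FIG. 15 example; the country codes are invented for illustration, and reading "domestic ultimate" as the highest ancestor located in the member's own country is an assumed interpretation. The last function shows one way a user might total its spend across a corporate family.

```python
# Child -> parent links, using a subset of the FIG. 15 example names.
PARENT = {
    "L Inc.": "U Products Group Corp.",
    "C Inc.": "U Products Group Corp.",
    "L Refinery Div.": "L Inc.",
    "C Stores Inc.": "C Inc.",
    "North Store Inc.": "C Stores Inc.",
}

# Country codes are hypothetical; the figures do not give locations.
COUNTRY = {
    "U Products Group Corp.": "US",
    "L Inc.": "US",
    "C Inc.": "US",
    "L Refinery Div.": "US",
    "C Stores Inc.": "CA",
    "North Store Inc.": "CA",
}


def chain(member):
    """Yield the member and then each ancestor up to the top of the tree."""
    while member is not None:
        yield member
        member = PARENT.get(member)


def global_ultimate(member):
    """Highest ranking member of the family, regardless of country."""
    return list(chain(member))[-1]


def domestic_ultimate(member):
    """Highest ranking member located in the same country as `member`."""
    home = COUNTRY[member]
    return [m for m in chain(member) if COUNTRY[m] == home][-1]


def family_total(spend, ultimate):
    """Sum spend over every account whose global ultimate is `ultimate`."""
    return sum(
        amount for member, amount in spend.items()
        if global_ultimate(member) == ultimate
    )
```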
  • Quality assurance 102 during corporate linkage 114 increases the completeness and accuracy of corporate families. In an example, a dedicated team reviews corporate families. This ensures business names, tradestyles, and SICs are consistent within a corporate family. Quality assurance 102 includes checking for duplicates. There are central review and updates for the largest global family trees. Changes are monitored to identify and track mergers and acquisitions and other major events. Quality assurance 102 includes matching of corporate families. There are quality programs to ensure business entities are linked properly and to handle linkage breaks within a corporate tree. Corporate linkage is done through legal ownership. Quality assurance 102 of corporate linkage 114 includes continually refining and enhancing corporate linkage based on customer feedback. Corporate linkage 114 capabilities include global cross-border linkage, U.S. linkage, public company linkage, private company linkage, and linkage defined by legal ownership versus business name. Quality assurance processes include using a validation tool to identify erroneously unlinked records or ‘look-a-likes’. The quality assurance processes are continually refined and enhanced based on learnings, feedback and reviews.
  • Predictive Indicators
  • Predictive indicator driver 116 summarizes the information collected on a business and uses it to predict future performance. Predictive indicators use statistical analysis to indicate the likelihood of a business to perform in a specific way in the future. There are many benefits to predictive indicators. Users can make faster, more consistent decisions by allowing automated decisions for increased efficiency. Users can free up resources to look at time-intensive borderline decisions. Users can make more consistent decisions across the entire organization. Users can allow faster processing of large volumes of transactions. Users can apply scores across an entire portfolio to quickly identify risk and opportunity. Users can help estimate demand to target the right prospects and reduce acquisition costs.
  • There are three types of predictive indicators: descriptive ratings, predictive scores, and demand estimators. Descriptive ratings summarize how a customer has historically been paying bills. Predictive scores are a prediction of how likely it is for a business to pay promptly or continue as an ongoing concern. Demand estimators estimate how much of a product a business is likely to buy in total (response, approval, look-a-like models).
  • Predictive indicators help a user to accelerate and impact profitability in all areas of its business. In risk management, descriptive ratings and predictive scores help the user grant or approve credit. A rating indicates creditworthiness of a company based on past financial performance. A score indicates likelihood of a business to continue as an ongoing concern or pay on time. Predictive scores can be applied across the user's whole portfolio to quickly identify high-risk accounts and begin aggressive collection immediately or to evaluate the credit worthiness of each applicant. A commercial credit score predicts the likelihood of a business paying slow over the next twelve months. A financial stress score predicts the likelihood of a business failing over the next twelve months. In sales and marketing, look-a-like models, response models and demand estimators let a user: identify prospects that look like their best customers, identify who is likely to respond to an offer, and/or how much product they will buy so that it can prioritize opportunities among customers or prospects. Examples of demand estimators include number of personal computers and local or long distance spending. In supply management, predictive scores can be applied to all of a user's suppliers to quickly understand their risk of failing in the future.
  • In addition, predictive scores may be customized according to a user's specific need and criteria. For example, criteria may be used, such as (1) what behavior does the user want to predict; (2) what is the size of the business the user wants to assess; and (3) what are the decision rules based on the user's risk tolerance to translate risk assessment into a credit decision or risk management or marketing action.
  • Predictive indicators are enabled by analytic capability and data capability. For example, a dedicated team of experienced business-to-business (B2B) expert PhDs may build the underlying predictive models and have access to industry-specific knowledge, financial and payment information, and extensive historical information for analysis.
  • FIGS. 19 and 20 show an example method of creating a predictive indicator. It starts with market analysis 1802 and then there is a business decision on model development 1804. This decision involves the type of score to be developed and output at the end, such as a failure risk score, a delinquency risk score, or an industry specific score. The failure risk score is the likelihood that a company will cease operations. The delinquency risk score is the likelihood that a company will pay late. The industry specific score predicts something particular, such as the likelihood of using copiers or truckers or whether a company is a good credit risk. Input data 1806 is gathered from an archive of credit database 1808 and a trade tape database 1810, which provide historical data related to credit. There are two time periods of concern: an observation period, which is a historical look at all the facts, and a performance period, which is a time period just after that to see what happened. For example, given data in the previous year, how did a company perform with respect to a certain time period in the current year? The next step refers to a risk to be evaluated, such as a financial stress score that predicts the likelihood of failure in the next twelve months.
  • A development sample is selected from a business universe 1814, a demographic profile is created of the business universe 1816, and exploratory data analysis is performed 1818 (univariate analysis of all variables). Tasks are performed such as determining the relationships between the variable and what is being predicted, the range of a variable, the type of variable, including or not including variables, and other functions related to understanding what to put in the model. Variables may be selected in accordance with the observation period and the performance period and weights may be assigned to indicate accuracy or representativeness. Trends are factored in. Quality assurance includes periodically checking to see if anything in the business universe affects the initial model and taking a score and running it against a prior period to check that it is still indicative or predictive.
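  • The split between the observation period and the performance period can be sketched as follows. The event tuple layout and the event kinds ("late_payment", "on_time_payment", "failure") are hypothetical stand-ins for the archived credit and trade data; only the windowing idea comes from the description.

```python
from collections import defaultdict
from datetime import date


def build_sample(events, observation_end, performance_end):
    """Turn dated events into one row of inputs and outcome per business.

    events: iterable of (business_id, event_date, kind) tuples.
    """
    late_counts = defaultdict(int)
    failed = set()
    for business_id, event_date, kind in events:
        if event_date <= observation_end:
            # Observation period: facts known when the score is produced.
            late_counts[business_id] += 1 if kind == "late_payment" else 0
        elif event_date <= performance_end and kind == "failure":
            # Performance period: what actually happened afterwards.
            failed.add(business_id)
    # Only businesses seen during the observation period enter the sample.
    return {
        b: {"late_payments": n, "failed": int(b in failed)}
        for b, n in late_counts.items()
    }
```

  • The resulting rows pair predictors drawn only from the observation period with an outcome drawn only from the performance period, which keeps later information from leaking into the inputs.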
  • Continuing on FIG. 19, statistical analysis and model development processes including logistic regression and other estimation techniques 1820 are performed. This step includes applying the appropriate models, formulas, and statistics. Next, statistical coefficients are converted into a scorecard 1822. Models are tested and validated 1824, and technical specifications are developed 1826. Finally, the model is implemented 1828 and tested 1830. Data is run through the model to generate a score. Periodically, checks are performed to verify that the score is still valid and to determine if the scorecard needs to be updated.
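  • Step 1822, converting statistical coefficients into a scorecard, is not detailed in this description. One common convention in credit scoring, assumed here purely for illustration, scales the log-odds of a logistic model so that a fixed number of points doubles the odds of a good outcome. The base score, base odds, points-to-double-odds value, and the example coefficient are all hypothetical.

```python
import math


def to_scorecard(intercept, coefficients, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Convert logistic coefficients (log-odds of a good outcome) to points.

    A score of `base_score` corresponds to odds of `base_odds` to 1, and every
    `pdo` points doubles the odds. These scaling values are assumptions.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    base_points = offset + factor * intercept
    points = {name: factor * beta for name, beta in coefficients.items()}
    return base_points, points


def score(base_points, points, features):
    """Score one business: base points plus points per unit of each feature."""
    return round(base_points + sum(points[n] * features[n] for n in points))
```

  • For example, with an intercept of ln 50 and a coefficient of -ln 2 per late payment, a business with no late payments scores 600 and each late payment halves the odds and removes 20 points.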
  • Quality assurance 102 of predictive indicators 116 includes continually monitoring and adjusting predictive indicators to reflect new information. In an example, this includes periodic testing of predictiveness, continuous manual refinement and recalibration, automated changes, monthly audits and annual validation, and analyzing data for each model with respect to its predictive qualities and importance whenever models are created or updated. Also, predictive indicators are continually refined and enhanced based on customer feedback. Predictive indicators 116 has data depth, including demographic data, payment information, detailed public record information, such as suits, liens, judgments, bankruptcies, and UCC filings, public and private company financial information, and linkage data used to assign risk to the responsible entity (i.e., score branches with HQ data). An independent group of reviewers checks and validates the results of the scores, from which continual refinement and enhancement is realized. Customer needs and industry trends are also considered when quality assurance processes are done to continually improve the models and scores.
  • The present invention has many advantages. Preferably, a global database used to perform a method of data integration encompasses millions of records and is updated daily. Users gain a fresher, more complete picture of each of their customers, prospects, and suppliers, because of the large number of daily updates to the database. Users are able to assess the risk of non-U.S. companies, because the database has global data. Users can more completely identify the risk from small business customers. Users make more informed risk decisions. Users identify new prospects from data drawn from multiple sources. Users gain access to international customers, suppliers and prospects. Users receive enhanced prospect lists with value-added information, such as line of business and contact name. Users can assess risk from foreign suppliers. Users can more completely identify the risk from suppliers.
  • It is to be understood that the above description is intended to be illustrative and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. Various embodiments for performing data collection, performing entity matching, applying an identification number, performing corporate linkage and providing predictive indicators are described. The present invention has applicability to applications outside the business information industry. Therefore, the scope of the present invention should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

Claims (33)

1. A system of data integration, comprising:
a data collection driver for collecting business information;
an entity matching driver for matching said business information to data in a database that is associated with a business entity;
an identification number driver for assigning a unique identifier to said business entity;
a corporate linkage driver for providing linkage of said business entity to a corporate family; and
a predictive indicator driver for providing statistical and analytical information about the likelihood that said business will perform in a specific way in the future;
wherein said data collection driver, said entity matching driver, said identification number driver, said corporate linkage driver, and said predictive indicator driver produce quality information for business decisions.
2. The system according to claim 1, wherein said data collection driver collects data from a plurality of sources, said sources including private company financial data, public company financial data, courts and legal filing office data, telephone company data, direct investigation data, government registries, diversity data, payment data, news, media and the Internet.
3. The system according to claim 1, wherein said data collection driver involves updating the data daily to ensure fresh, accurate data.
4. The system according to claim 1, wherein said data collection driver includes collecting the data from strategic partners, in addition to collecting raw data.
5. The system according to claim 1, wherein said data collection driver encompasses global information to ensure data is consistently available across the globe.
6. The system according to claim 1, wherein said data collection driver involves rigorous quality assurance practices to ensure the data coming in from various sources is accurate, timely, complete and consistent.
7. The system according to claim 1, wherein said entity matching driver matches to a proprietary database containing multiple names and addresses per record.
8. The system according to claim 1, wherein said entity matching driver identifies and removes duplicates.
9. The system according to claim 1, wherein said entity matching driver is able to identify matches that are not identical.
10. The system according to claim 1, wherein said entity matching driver is able to select a match by a confidence measure.
11. The system according to claim 1, wherein said entity matching driver uses a process of cleaning, parsing and standardizing, performing candidate retrieval, and evaluation and decision making as each match is made.
12. The system according to claim 1, wherein said entity matching driver is quality assured by constantly refining and retuning match parameters based on customer feedback, manual reviews, match learnings and advances in match technologies.
13. The system according to claim 1, wherein said identification number driver assigns said unique identifier to a single, unique business location and never recycles said unique identifier.
14. The system according to claim 1, wherein said identification number driver uses said unique identifier to track said business through mergers, acquisitions, control changes and the entire business life cycle, including business discontinuance.
15. The system according to claim 1, wherein said identification number driver is used to aggregate data and identify duplicates in said database.
16. The system according to claim 1, wherein said identification number driver is a nonindicative number.
17. The system according to claim 1, wherein said identification number driver can be used to identify duplication in customer files as well as in said database.
18. The system according to claim 1, wherein said identification number driver plays a key role in each of the five drivers involved in this data integration process as it is an identification key for a business record as it travels through the entire process.
19. The system according to claim 1, wherein said identification number driver undergoes rigorous quality assurance procedures to protect against duplication and maintain the integrity of the numbering system.
20. The system according to claim 1, wherein said corporate linkage driver provides cross-border linkage.
21. The system according to claim 1, wherein said corporate linkage driver provides public company linkage.
22. The system according to claim 1, wherein said corporate linkage driver provides private company linkage.
23. The system according to claim 1, wherein said corporate linkage driver defines linkage by ownership.
24. The system according to claim 1, wherein said corporate linkage driver can also be referenced to provide details of relationships outside of corporate ownership, including affiliations, associations, co-ops, franchises, agents, dealers and chapters.
25. The system according to claim 1, wherein said corporate linkage driver uses quality assurance procedures to improve the accuracy, completeness, timeliness and consistency of corporate linkage data.
26. The system according to claim 1, wherein said predictive indicator driver uses payment information.
27. The system according to claim 1, wherein said predictive indicator driver uses public records.
28. The system according to claim 1, wherein said predictive indicator driver uses public and private company financial information.
29. The system according to claim 1, wherein said predictive indicator driver uses linkage data to assign risk to a responsible entity.
30. The system according to claim 1, wherein said predictive indicator driver can be selected from the group consisting of: descriptive ratings, predictive scores, and demand estimators.
31. The system according to claim 1, wherein said predictive indicator driver is enabled by analytic capability and data capability.
32. The system according to claim 1, wherein said predictive indicator driver is continually monitored and adjusted to improve the overall quality and accuracy.
33. A method of data integration, comprising:
collecting business information;
matching said business information to data in a database that is associated with a business entity;
assigning a unique identifier to said business entity;
providing linkage of said business entity to a corporate family;
providing statistical and analytical information as predictive indicators about the likelihood that said business will perform in a specific way in the future; and
producing quality business and financial information for business decisions.
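The matching step recited in claims 9 through 11 (cleaning, parsing and standardizing, candidate retrieval, then evaluation and decision by a confidence measure) might be sketched as follows. This is only an assumption-laden illustration, not the matching method of the application: the function names, the suffix list, the shared-token retrieval rule and the 0.8 threshold are all hypothetical, and the confidence measure used here is simply the similarity ratio from Python's standard difflib module.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple

# Assumed list of legal suffixes to drop during standardization.
SUFFIXES = {"INC", "CORP", "CO", "LLC", "LTD"}


def standardize(name: str) -> str:
    # Clean punctuation, upper-case, split into tokens, drop legal suffixes.
    tokens = re.sub(r"[^A-Za-z0-9 ]+", " ", name).upper().split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def retrieve_candidates(query: str, index: Dict[int, str]) -> List[int]:
    # Candidate retrieval: keep records sharing at least one token with the query.
    query_tokens = set(query.split())
    return [eid for eid, name in index.items() if query_tokens & set(name.split())]


def best_match(raw: str, database: Dict[int, str],
               threshold: float = 0.8) -> Optional[Tuple[int, float]]:
    # Evaluation and decision: score each candidate, accept the best one
    # only if its confidence reaches the (assumed) threshold.
    query = standardize(raw)
    index = {eid: standardize(name) for eid, name in database.items()}
    scored = [(eid, SequenceMatcher(None, query, index[eid]).ratio())
              for eid in retrieve_candidates(query, index)]
    if not scored:
        return None
    eid, confidence = max(scored, key=lambda pair: pair[1])
    return (eid, confidence) if confidence >= threshold else None


database = {1: "Acme Widgets, Inc.", 2: "Apex Widget Co"}
print(best_match("Acme Widgits Inc", database))  # misspelled, still matched to record 1
print(best_match("Zenith Bakery", database))     # no candidate shares a token
```

Returning the confidence alongside the identifier lets a caller apply a stricter cutoff or route borderline matches to manual review.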
US11/137,821 2003-02-18 2005-05-25 Data integration method Abandoned US20060004595A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/137,821 US20060004595A1 (en) 2003-02-18 2005-05-25 Data integration method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/368,072 US7822757B2 (en) 2003-02-18 2003-02-18 System and method for providing enhanced information
US11/137,821 US20060004595A1 (en) 2003-02-18 2005-05-25 Data integration method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/368,072 Continuation-In-Part US7822757B2 (en) 2003-02-18 2003-02-18 System and method for providing enhanced information

Publications (1)

Publication Number Publication Date
US20060004595A1 (en) 2006-01-05

Family

ID=32850091

Family Applications (3)

Application Number Title Priority Date Filing Date
US10/368,072 Expired - Fee Related US7822757B2 (en) 2003-02-18 2003-02-18 System and method for providing enhanced information
US11/137,821 Abandoned US20060004595A1 (en) 2003-02-18 2005-05-25 Data integration method
US12/892,496 Expired - Lifetime US8346790B2 (en) 2003-02-18 2010-09-28 Data integration method and system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/368,072 Expired - Fee Related US7822757B2 (en) 2003-02-18 2003-02-18 System and method for providing enhanced information

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/892,496 Expired - Lifetime US8346790B2 (en) 2003-02-18 2010-09-28 Data integration method and system

Country Status (9)

Country Link
US (3) US7822757B2 (en)
EP (1) EP1599778A4 (en)
JP (1) JP4996242B2 (en)
KR (1) KR101006889B1 (en)
CN (1) CN1826578B (en)
AU (1) AU2004214217B2 (en)
CA (1) CA2516390C (en)
SG (1) SG157229A1 (en)
WO (1) WO2004074981A2 (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050192891A1 (en) * 2004-02-27 2005-09-01 Dun & Bradstreet, Inc. System and method for providing access to detailed payment experience
US20070226099A1 (en) * 2005-12-13 2007-09-27 General Electric Company System and method for predicting the financial health of a business entity
US20080221973A1 (en) * 2005-10-24 2008-09-11 Megdal Myles G Using commercial share of wallet to rate investments
US20090216717A1 (en) * 2008-02-25 2009-08-27 United Parcel Service Of America, Inc. Systems and Methods of Profiling Data For Integration
US20100088132A1 (en) * 2008-10-08 2010-04-08 Oracle International Corporation Merger and acquisition data validation
US7987159B2 (en) 2006-09-15 2011-07-26 Microsoft Corporation Detecting and managing changes in business data integration solutions
US20120016709A1 (en) * 2009-05-21 2012-01-19 Accenture Global Services Limited Enhanced postal data modeling framework
US20120317075A1 (en) * 2011-06-13 2012-12-13 Suresh Pasumarthi Synchronizing primary and secondary repositories
US20130097160A1 (en) * 2009-12-21 2013-04-18 Clear Channel Management Services, Inc. Enterprise data matching
US8626766B1 (en) 2011-09-28 2014-01-07 Google Inc. Systems and methods for ranking and importing business listings
US20140067803A1 (en) * 2012-09-06 2014-03-06 Sap Ag Data Enrichment Using Business Compendium
WO2014085567A1 (en) * 2012-11-30 2014-06-05 The Dun & Bradstreet Corporation System and method for updating organization family tree information
US20140156606A1 (en) * 2012-07-16 2014-06-05 Qatar Foundation Method and System for Integrating Data Into a Database
US8868456B1 (en) * 2004-09-29 2014-10-21 At&T Intellectual Property Ii, L.P. Method and apparatus for managing financial control validation processes
WO2014179552A1 (en) * 2013-05-02 2014-11-06 The Dun & Bradstreet Corporation A system and method using multi-dimensional rating to determine an entity's future commercial viability
US20150039399A1 (en) * 2013-08-01 2015-02-05 American Express Travel Related Services Company, Inc. System and method for liquidation management of a company
US8996391B2 (en) * 2013-03-14 2015-03-31 Credibility Corp. Custom score generation system and methods
US20150106244A1 (en) * 2013-10-15 2015-04-16 Mastercard International Incorporated Systems and methods for associating related merchants
US20160063521A1 (en) * 2014-08-29 2016-03-03 Accenture Global Services Limited Channel partner analytics
US9898497B2 (en) 2015-03-31 2018-02-20 Oracle International Corporation Validating coherency between multiple data sets between database transfers
US10140352B2 (en) 2014-07-17 2018-11-27 Oracle International Corporation Interfacing with a relational database for multi-dimensional analysis via a spreadsheet application
US11580571B2 (en) * 2016-02-04 2023-02-14 LMP Software, LLC Matching reviews between customer feedback systems
US11636417B2 (en) * 2020-12-17 2023-04-25 International Business Machines Corporation Cognitive analysis for enterprise decision meta model

Families Citing this family (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7822757B2 (en) * 2003-02-18 2010-10-26 Dun & Bradstreet, Inc. System and method for providing enhanced information
US7822660B1 (en) * 2003-03-07 2010-10-26 Mantas, Inc. Method and system for the protection of broker and investor relationships, accounts and transactions
US7970688B2 (en) * 2003-07-29 2011-06-28 Jp Morgan Chase Bank Method for pricing a trade
US20050108631A1 (en) * 2003-09-29 2005-05-19 Amorin Antonio C. Method of conducting data quality analysis
US20090299909A1 (en) * 2003-11-04 2009-12-03 Levi Andrew E System and method for comprehensive management of company equity structures and related company documents with financial and human resource system integration
US20050154751A1 (en) * 2003-11-04 2005-07-14 Boardroom Software, Inc. System and method for comprehensive management of company equity structures and related company documents with financial and human resource system integration
US20090293104A1 (en) * 2003-11-04 2009-11-26 Levi Andrew E System and method for comprehensive management of company equity structures and related company documents withfinancial and human resource system integration
US8458073B2 (en) * 2003-12-02 2013-06-04 Dun & Bradstreet, Inc. Enterprise risk assessment manager system
US8036907B2 (en) * 2003-12-23 2011-10-11 The Dun & Bradstreet Corporation Method and system for linking business entities using unique identifiers
CA2602720A1 (en) * 2005-03-24 2006-09-28 Accenture Global Services Gmbh Risk based data assessment
US7708196B2 (en) * 2005-10-05 2010-05-04 Dun And Bradstreet Corporation Modular web-based ASP application for multiple products
US20070112667A1 (en) * 2005-10-31 2007-05-17 Dun And Bradstreet System and method for providing a fraud risk score
JP5274259B2 (en) * 2005-11-23 2013-08-28 ダン アンド ブラッドストリート インコーポレイテッド System and method for searching and matching data having ideographic content
US8073724B2 (en) * 2005-12-02 2011-12-06 Saudi Arabian Oil Company Systems program product, and methods for organization realignment
EP1966729A4 (en) * 2005-12-27 2011-05-04 Dun & Bradstreet Corp Method and system for providing enhanced matching from customer driven queries
WO2007127468A2 (en) 2006-04-28 2007-11-08 Barclays Capital Inc. Method and system for implementing portal
US20080140438A1 (en) * 2006-12-08 2008-06-12 Teletech Holdings, Inc. Risk management tool
US20080147425A1 (en) * 2006-12-15 2008-06-19 American Express Travel Related Services Company, Inc. Strategic Partner Recognition
US8694361B2 (en) * 2006-12-15 2014-04-08 American Express Travel Related Services Company, Inc. Identifying and managing strategic partner relationships
US8473354B2 (en) 2007-11-14 2013-06-25 Panjiva, Inc. Evaluating public records of supply transactions
US9898767B2 (en) 2007-11-14 2018-02-20 Panjiva, Inc. Transaction facilitating marketplace platform
US8024428B2 (en) * 2008-03-19 2011-09-20 The Go Daddy Group, Inc. Methods for updating WHOIS with information collected from non-controlling party
US20100037299A1 (en) * 2008-08-08 2010-02-11 American Express Travel Related Services Company, Inc. Method, System, And Computer Program Product For Identifying An Authorized Officer Of A Business
US8171033B2 (en) * 2008-09-30 2012-05-01 Vmware, Inc. Methods and systems for the determination of thresholds via weighted quantile analysis
US9910875B2 (en) * 2008-12-22 2018-03-06 International Business Machines Corporation Best-value determination rules for an entity resolution system
CA2757232A1 (en) * 2009-03-27 2010-09-30 The Dun And Bradstreet Corporation Method and system for dynamically producing detailed trade payment experience for enhancing credit evaluation
CA2746898A1 (en) * 2010-07-20 2012-01-20 Accenture Global Services Limited Digital analytics platform
WO2012142158A2 (en) 2011-04-11 2012-10-18 Credibility Corp. Visualization tools for reviewing credibility and stateful hierarchical access to credibility
AU2012335994A1 (en) 2011-11-08 2014-05-29 Google Inc. Systems and methods for generating and displaying hierarchical search results
US10242330B2 (en) * 2012-11-06 2019-03-26 Nice-Systems Ltd Method and apparatus for detection and analysis of first contact resolution failures
US9963954B2 (en) 2012-11-16 2018-05-08 Saudi Arabian Oil Company Caliper steerable tool for lateral sensing and accessing
US9122710B1 (en) 2013-03-12 2015-09-01 Groupon, Inc. Discovery of new business openings using web content analysis
US8712907B1 (en) 2013-03-14 2014-04-29 Credibility Corp. Multi-dimensional credibility scoring
WO2014179088A1 (en) 2013-05-02 2014-11-06 The Dun & Bradstreet Corporation Apparatus and method for total loss prediction
US20150095210A1 (en) * 2013-09-27 2015-04-02 Brian Grech Merchant loan management and processing
US20150100366A1 (en) * 2013-10-08 2015-04-09 immixSolutions, Inc. System and method for managing a law enforcement investigation
US20150127501A1 (en) * 2013-11-07 2015-05-07 Strategic Exits Corp. System and Method for Capturing Exit Transaction Data
US9953105B1 (en) 2014-10-01 2018-04-24 Go Daddy Operating Company, LLC System and method for creating subdomains or directories for a domain name
US9785663B2 (en) 2014-11-14 2017-10-10 Go Daddy Operating Company, LLC Verifying a correspondence address for a registrant
US9779125B2 (en) 2014-11-14 2017-10-03 Go Daddy Operating Company, LLC Ensuring accurate domain name contact information
CN105608087B (en) * 2014-11-19 2020-01-31 菜鸟智能物流控股有限公司 resource scheduling method and device
US10725985B2 (en) * 2015-02-20 2020-07-28 Metropolitan Life Insurance Co. System and method for enterprise data quality processing
US11514096B2 (en) 2015-09-01 2022-11-29 Panjiva, Inc. Natural language processing for entity resolution
US10203984B2 (en) * 2016-07-26 2019-02-12 Sap Se Data structure for multiple executable tasks
CA3039374A1 (en) 2016-10-06 2018-04-12 The Dun & Bradstreet Corporation Machine learning classifier and prediction engine for artificial intelligence optimized prospect determination on win/loss classification
US11386435B2 (en) 2017-04-03 2022-07-12 The Dun And Bradstreet Corporation System and method for global third party intermediary identification system with anti-bribery and anti-corruption risk assessment
US11551244B2 (en) 2017-04-22 2023-01-10 Panjiva, Inc. Nowcasting abstracted census from individual customs transaction records
US10454878B2 (en) 2017-10-04 2019-10-22 The Dun & Bradstreet Corporation System and method for identity resolution across disparate distributed immutable ledger networks
CN107993010B (en) * 2017-12-02 2022-05-06 中科钢研节能科技有限公司 Project due diligence automatic analysis system and automatic analysis method
WO2019113124A1 (en) 2017-12-04 2019-06-13 Panjiva, Inc. Mtransaction processing improvements
US11580330B2 (en) * 2018-12-28 2023-02-14 Microsoft Technology Licensing, Llc Machine learning framework with model performance tracking and maintenance
CA3041600A1 (en) 2019-04-25 2020-10-25 The Dun & Bradstreet Corporation Machine learning classifier for identifying internet service providers from website tracking
CN110363401B (en) * 2019-06-26 2022-05-03 北京百度网讯科技有限公司 Integrated viscosity evaluation method and device, computer equipment and storage medium
JP2020149716A (en) * 2020-05-26 2020-09-17 株式会社ファーマクラウド System, method, and program for supporting distribution of pharmaceuticals
CN112001710A (en) * 2020-09-07 2020-11-27 山东钢铁集团日照有限公司 Big data reading and integrating system in steel product production process
WO2022060809A1 (en) * 2020-09-17 2022-03-24 Mastercard International Incorporated Continuous learning for seller disambiguation, assessment, and onboarding to electronic marketplaces

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010032117A1 (en) * 2000-04-07 2001-10-18 Persky Robert E. Continuous and updatable revenue sharing process for lists
US20020161778A1 (en) * 2001-02-24 2002-10-31 Core Integration Partners, Inc. Method and system of data warehousing and building business intelligence using a data storage model
US20030033155A1 (en) * 2001-05-17 2003-02-13 Randy Peerson Integration of data for user analysis according to departmental perspectives of a customer
US6523019B1 (en) * 1999-09-21 2003-02-18 Choicemaker Technologies, Inc. Probabilistic record linkage model derived from training data
US20030061232A1 (en) * 2001-09-21 2003-03-27 Dun & Bradstreet Inc. Method and system for processing business data

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5588147A (en) * 1994-01-14 1996-12-24 Microsoft Corporation Replication facility
US6460036B1 (en) * 1994-11-29 2002-10-01 Pinpoint Incorporated System and method for providing customized electronic newspapers and target advertisements
US5758257A (en) * 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US5806074A (en) * 1996-03-19 1998-09-08 Oracle Corporation Configurable conflict resolution in a computer implemented distributed database
US5819291A (en) * 1996-08-23 1998-10-06 General Electric Company Matching new customer records to existing customer records in a large business database using hash key
US6026391A (en) * 1997-10-31 2000-02-15 Oracle Corporation Systems and methods for estimating query response times in a computer system
KR20020022670A (en) * 1999-05-21 2002-03-27 아이엔씨 인터그레이티드 네트워크 코포레이션 Loop access system with graphical user interface
EP1102225A3 (en) 1999-11-22 2002-08-21 Ncr International Inc. Method of processing electronic payment data and an apparatus therefor
JP3866466B2 (en) * 1999-12-13 2007-01-10 株式会社東芝 Data structure management device, data structure management system, data structure management method, and recording medium for storing data structure management program
US6810429B1 (en) * 2000-02-03 2004-10-26 Mitsubishi Electric Research Laboratories, Inc. Enterprise integration system
AU2001278122A1 (en) 2000-07-31 2002-02-13 Eliyon Technologies Corporation Method for maintaining people and organization information
US7103586B2 (en) * 2001-03-16 2006-09-05 Gravic, Inc. Collision avoidance in database replication systems
US7953219B2 (en) * 2001-07-19 2011-05-31 Nice Systems, Ltd. Method apparatus and system for capturing and analyzing interaction based content
US7403942B1 (en) * 2003-02-04 2008-07-22 Seisint, Inc. Method and system for processing data records
US7822757B2 (en) * 2003-02-18 2010-10-26 Dun & Bradstreet, Inc. System and method for providing enhanced information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6523019B1 (en) * 1999-09-21 2003-02-18 Choicemaker Technologies, Inc. Probabilistic record linkage model derived from training data
US20010032117A1 (en) * 2000-04-07 2001-10-18 Persky Robert E. Continuous and updatable revenue sharing process for lists
US20020161778A1 (en) * 2001-02-24 2002-10-31 Core Integration Partners, Inc. Method and system of data warehousing and building business intelligence using a data storage model
US20030033155A1 (en) * 2001-05-17 2003-02-13 Randy Peerson Integration of data for user analysis according to departmental perspectives of a customer
US20030061232A1 (en) * 2001-09-21 2003-03-27 Dun & Bradstreet Inc. Method and system for processing business data

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050192891A1 (en) * 2004-02-27 2005-09-01 Dun & Bradstreet, Inc. System and method for providing access to detailed payment experience
US10387890B2 (en) 2004-09-29 2019-08-20 Lyft, Inc. Method and apparatus for managing financial control validation processes
US8868456B1 (en) * 2004-09-29 2014-10-21 At&T Intellectual Property Ii, L.P. Method and apparatus for managing financial control validation processes
US20080221973A1 (en) * 2005-10-24 2008-09-11 Megdal Myles G Using commercial share of wallet to rate investments
US20070226099A1 (en) * 2005-12-13 2007-09-27 General Electric Company System and method for predicting the financial health of a business entity
US7987159B2 (en) 2006-09-15 2011-07-26 Microsoft Corporation Detecting and managing changes in business data integration solutions
US20090216717A1 (en) * 2008-02-25 2009-08-27 United Parcel Service Of America, Inc. Systems and Methods of Profiling Data For Integration
US7912867B2 (en) 2008-02-25 2011-03-22 United Parcel Services Of America, Inc. Systems and methods of profiling data for integration
US20110137873A1 (en) * 2008-02-25 2011-06-09 Russell Suereth Systems and methods of profiling data for integration
US8321469B2 (en) 2008-02-25 2012-11-27 United Parcel Service Of America, Inc. Systems and methods of profiling data for integration
US8725701B2 (en) 2008-10-08 2014-05-13 Oracle International Corporation Merger and acquisition data validation
US20100088132A1 (en) * 2008-10-08 2010-04-08 Oracle International Corporation Merger and acquisition data validation
US9126235B2 (en) * 2009-05-21 2015-09-08 Accenture Global Services Limited Enhanced postal data modeling framework
US20120016709A1 (en) * 2009-05-21 2012-01-19 Accenture Global Services Limited Enhanced postal data modeling framework
US8725742B2 (en) * 2009-12-21 2014-05-13 Clear Channel Management Services, Inc. Enterprise data matching
US8682905B2 (en) * 2009-12-21 2014-03-25 Clear Channel Management Services, Inc. Enterprise data matching
US20130097160A1 (en) * 2009-12-21 2013-04-18 Clear Channel Management Services, Inc. Enterprise data matching
US9619821B2 (en) 2009-12-21 2017-04-11 Iheartmedia Management Services, Inc. Enterprise data re-matching
US20130290092A1 (en) * 2009-12-21 2013-10-31 Clear Channel Management Services, Inc. Processes to learn enterprise data matching
US8782057B2 (en) * 2009-12-21 2014-07-15 Clear Channel Management Services, Inc. Processes to learn enterprise data matching
US20130117068A1 (en) * 2009-12-21 2013-05-09 Clear Channel Management Services, Inc. Enterprise data matching
US20120317075A1 (en) * 2011-06-13 2012-12-13 Suresh Pasumarthi Synchronizing primary and secondary repositories
US8862543B2 (en) * 2011-06-13 2014-10-14 Business Objects Software Limited Synchronizing primary and secondary repositories
US8626766B1 (en) 2011-09-28 2014-01-07 Google Inc. Systems and methods for ranking and importing business listings
US20140156606A1 (en) * 2012-07-16 2014-06-05 Qatar Foundation Method and System for Integrating Data Into a Database
US9720986B2 (en) * 2012-07-16 2017-08-01 Qatar Foundation Method and system for integrating data into a database
US9582555B2 (en) * 2012-09-06 2017-02-28 Sap Se Data enrichment using business compendium
US20140067803A1 (en) * 2012-09-06 2014-03-06 Sap Ag Data Enrichment Using Business Compendium
US20150120347A1 (en) * 2012-11-30 2015-04-30 The Dun & Bradstreet Corporation System and method for updating organization family tree information
WO2014085567A1 (en) * 2012-11-30 2014-06-05 The Dun & Bradstreet Corporation System and method for updating organization family tree information
US8996391B2 (en) * 2013-03-14 2015-03-31 Credibility Corp. Custom score generation system and methods
WO2014179552A1 (en) * 2013-05-02 2014-11-06 The Dun & Bradstreet Corporation A system and method using multi-dimensional rating to determine an entity's future commercial viability
US20150039399A1 (en) * 2013-08-01 2015-02-05 American Express Travel Related Services Company, Inc. System and method for liquidation management of a company
US20150106244A1 (en) * 2013-10-15 2015-04-16 Mastercard International Incorporated Systems and methods for associating related merchants
US10521866B2 (en) * 2013-10-15 2019-12-31 Mastercard International Incorporated Systems and methods for associating related merchants
US11393044B2 (en) 2013-10-15 2022-07-19 Mastercard International Incorporated Systems and methods for associating related merchants
US10140352B2 (en) 2014-07-17 2018-11-27 Oracle International Corporation Interfacing with a relational database for multi-dimensional analysis via a spreadsheet application
US20160063521A1 (en) * 2014-08-29 2016-03-03 Accenture Global Services Limited Channel partner analytics
US9898497B2 (en) 2015-03-31 2018-02-20 Oracle International Corporation Validating coherency between multiple data sets between database transfers
US11580571B2 (en) * 2016-02-04 2023-02-14 LMP Software, LLC Matching reviews between customer feedback systems
US11636417B2 (en) * 2020-12-17 2023-04-25 International Business Machines Corporation Cognitive analysis for enterprise decision meta model

Also Published As

Publication number Publication date
US20110055173A1 (en) 2011-03-03
EP1599778A2 (en) 2005-11-30
SG157229A1 (en) 2009-12-29
KR20050115238A (en) 2005-12-07
US20040162742A1 (en) 2004-08-19
WO2004074981A3 (en) 2005-12-08
JP2006518512A (en) 2006-08-10
CN1826578A (en) 2006-08-30
JP4996242B2 (en) 2012-08-08
AU2004214217A1 (en) 2004-09-02
US7822757B2 (en) 2010-10-26
KR101006889B1 (en) 2011-01-12
CA2516390C (en) 2015-01-06
WO2004074981A2 (en) 2004-09-02
CA2516390A1 (en) 2004-09-02
AU2004214217B2 (en) 2009-10-29
CN1826578B (en) 2014-07-02
US8346790B2 (en) 2013-01-01
EP1599778A4 (en) 2006-11-15

Similar Documents

Publication Publication Date Title
US20060004595A1 (en) Data integration method
US11928697B2 (en) Methods and systems for using multiple data sets to analyze performance metrics of targeted companies
Berger et al. Debt maturity, risk, and asymmetric information
US9898779B2 (en) Consumer behaviors at lender level
US8639616B1 (en) Business to contact linkage system
US9324087B2 (en) Method, system, and computer program product for linking customer information
US8275700B2 (en) Lender rating system and method
US20070112667A1 (en) System and method for providing a fraud risk score
US20080301016A1 (en) Method, System, and Computer Program Product for Customer Linking and Identification Capability for Institutions
US20110137760A1 (en) Method, system, and computer program product for customer linking and identification capability for institutions
US7801808B1 (en) Database structure for financial products with unique, consistent identifier for parties that assume roles with respect to the products and methods of using the database structure
CN101194286A (en) Risk based data assessment
JPH11259578A (en) Analysis and strategy execution tool corresponding to data base
US20150302406A1 (en) Methods and systems for improving accurancy of merchant aggregation
JP2004523836A (en) System and method for managing financial account information
Liao et al. What If Borrowers Were Informed about Credit Reporting? Two Natural Field Experiments
Sadok et al. The Contribution of AI-Based Analysis and Rating Models to Financial Inclusion: The Lenddo Case for Women-Led SMEs in Developing Countries
Ho et al. The Effect of Ownership and Financing on Firm's Inventory and Profitability: An Empirical Analysis
KR20090057194A (en) Method for preliminarily selecting insolvent credit transaction company
ADEJUWON et al. Corporate Governance and Anti-Corruption Disclosure Quality in Nigeria
김병용 A customer data quality management strategy
Sharma Impact of Data Mining in Finance and Banking Sector: A Scientific Mechanism

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION