US20030225757A1 - Displaying portions of text from multiple documents over multiple database related to a search query in a computer network - Google Patents
- Publication number
- US20030225757A1 US10/387,747
- Authority
- US
- United States
- Prior art keywords
- query
- documents
- database
- text
- databases
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99934—Query formulation, input preparation, or translation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99935—Query augmenting and refining, e.g. inexact access
Definitions
- This invention relates in general to computer databases.
- this invention relates to locating and generating connections between concepts identified in a source document and data objects distributed throughout multiple databases in a computer network.
- Hyperlinks are ways of connecting the text of two documents together. Hyperlinks operate on a page image shown to a database user. A phrase or text section on the page image will be highlighted. When a user selects this phrase (clicks on it with a mouse), the user is immediately shown related text from another document.
- These hyperlinks are hardcoded links between a specific term and a specific set of text within a database or text on another network.
- the hyperlinks are useful because they allow a user to quickly retrieve documents related to the highlighted phrase without manually constructing and executing different searches.
- An example of conventional hyperlinks is U.S. Pat. No. 5,603,025 to Tabb, et al.
- a hypertext report writing module is created in which hypertext links are automatically embedded in documents from the database.
- Since hyperlinks are pre-determined relationships between specified terms in databases, it is generally not feasible to categorize many large databases to make predetermined relationships for all items of potential interest.
- conventional hypertext links are normally static. That is, even if there were enough resources to hardcode enough hypertext links to make them useful in a database, the process of hardcoding the links would only occur once. Thus, databases with hardcoded hyperlinks would not be linked to new data. These hyperlinks miss updates in the data. They also miss the addition of new databases to networks.
- the pre-determined and static nature of the hyperlinks as they currently exist makes them inappropriate for dynamically changing databases and difficult to use in distributed databases for information retrieval on wide ranging subjects. Accordingly, conventional search techniques have failed to address the need for a process capable of automatically generating connections between texts in different documents across multiple databases. Additionally, conventional search techniques have failed to provide a connection generating technique that can adapt to databases that are modified on a real time basis.
- the system of the present invention provides a method of and apparatus for displaying portions of text from multiple documents over multiple databases related to a search query.
- the initial step in this method is to identify a search query. Based on this identification, a search against multiple databases is initiated.
- the computer system identifies auxiliary databases either within a network or between networks that are likely to contain documents relating to terms in the search query.
- the databases are then searched to identify those documents relating to the identified query.
- the various sets of identified documents from multiple databases are then returned and processed to create an ordered ranking for the returned documents. Text portions from the highest ranking documents across the multiple databases are then automatically displayed to the user.
- FIG. 1 is an illustration of a computer system that operates according to the present invention for displaying text portions from multiple databases.
- FIG. 2 is a flowchart that illustrates a process according to an embodiment of the present invention for displaying text portions relating to a query from multiple databases.
- FIG. 3 is a flowchart that illustrates a process according to an embodiment of the present invention for inverting a database.
- FIG. 4 is an illustration of a listing of text that results from a noun phrase parsing process.
- FIG. 5 is a flowchart that illustrates a process according to an embodiment of the present invention for scoring subdocuments.
- FIG. 6 is a flowchart that illustrates a process according to an embodiment of the present invention for sorting.
- FIG. 1 illustrates a computer system for searching databases.
- the computer 220 is connected to a display 210, an input system 205 (including, for example, a keyboard and mouse), a memory system 230 and a communications link 280.
- the communications link is a simple modem. It could also be a higher rate direct connection between computers or another device for interconnecting computer systems.
- the communications link 280 is in turn connected to a network of M other computers each having their own memory systems.
- the memory system 230 associated with computer 220 has a memory section 240 that contains a target database and it includes N memory sections that store a series of N auxiliary databases.
- the target database in memory section 240 stores information that a user is currently interested in searching.
- the remaining N memory sections store auxiliary databases related to a variety of topics.
- the M computers attached to communications link 280 each have similar memory sections that store N auxiliary databases.
- memory section 250 of memory system 230 stores a list of database addresses and identifiers.
- the computer system of FIG. 1 operates to display information from a target file or database to a user.
- a user will often recognize a specific idea or concept from the displayed information that may or may not be directly relevant to the general information currently being displayed. The user will desire to access or link to information about this specific concept without losing access to the general information currently being displayed.
- the computer system of FIG. 1 operates to provide links between identified concepts and information contained in multiple databases.
- the computer system of FIG. 1 provides these links by causing the computer 220 to receive a query and identify databases having information relevant to the query. Once the databases are identified, computer 220 causes them to be searched such that they return documents or passages of documents relevant to the query.
- the computer 220 then organizes the returned documents or passages thereof and displays at least a portion of the text associated with those documents.
- FIG. 2 illustrates a process for operating the computer system of FIG. 1 according to the present invention.
- a query is identified in Step 10 of FIG. 2. This can be done by highlighting and selecting (through a conventional graphical user interface) a portion of text that the computer is already displaying. The query could also just be an input to the computer 220 made through a keyboard.
- the text is converted into a search request in step 20 of FIG. 2. Converting the identified query text into a search request involves the conventional steps of parsing the query text into terms and then making use of the terms to form a query. The form of the query will depend on the type of search technique that will be used to search the databases.
- Most search techniques use Boolean combinations of terms as the query. As a result, these techniques ‘AND’ the query terms together to form a query.
- Other search techniques make use of vector space analysis. In this case, the list of terms forms a query because the vector space algorithm does not use logical operators to form the query.
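The two query forms described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the regular-expression tokenizer is an assumed stand-in for whatever conventional parsing step the system actually uses.

```python
import re

def parse_terms(query_text):
    """Split query text into lowercase word terms (a simple stand-in parser)."""
    return re.findall(r"[a-z0-9]+", query_text.lower())

def to_boolean_query(terms):
    """AND the terms together, as a Boolean search technique would."""
    return " AND ".join(terms)

def to_term_list_query(terms):
    """A vector space technique takes the bare term list, with no operators."""
    return list(terms)

terms = parse_terms("Heap sort, ranking")
boolean_query = to_boolean_query(terms)    # "heap AND sort AND ranking"
vector_query = to_term_list_query(terms)   # ["heap", "sort", "ranking"]
```

The same parsed terms feed both forms, so one selected passage can be sent to search engines that expect different query syntaxes.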
- step 30 of FIG. 2 selects the databases that will be searched.
- the computer system of FIG. 1 includes a memory space 250 that stores information to identify databases (and the types of information they store) or general database search engines. Since general database search engines, such as the Lycos™ engine on the World Wide Web, have their own resources for selecting the particular databases to search for a given query, Step 30 merely transmits a Boolean combination of query terms to these search engines (unless a user opts out of such a selection). For other databases identified in memory space 250 of FIG. 1, a Boolean combination of query terms is compared against the description of the databases listed in memory space 250. As a result of this comparison, a set of auxiliary databases is selected that will be searched against the query.
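A rough sketch of the comparison made in Step 30 is given below. The catalog dictionary is a hypothetical stand-in for the list held in memory space 250, and the choice between OR-ing and AND-ing the terms is an assumption, since the text only says "a Boolean combination".

```python
def select_databases(terms, catalog, require_all=False):
    """Return identifiers of databases whose description matches the terms.

    catalog maps a database identifier to a free-text description.
    require_all=True ANDs the terms; the default ORs them.
    """
    combine = all if require_all else any
    selected = []
    for db_id, description in catalog.items():
        words = set(description.lower().split())
        if combine(term in words for term in terms):
            selected.append(db_id)
    return selected

# Hypothetical catalog entries, for illustration only.
catalog = {
    "aux-1": "medical journals and drug trials",
    "aux-2": "patent filings and legal opinions",
    "aux-3": "sports scores",
}
chosen = select_databases(["patent", "drug"], catalog)
```

With the default OR combination, `chosen` holds `aux-1` and `aux-2`. Passing `require_all=True` would select nothing here, because no single description contains both terms.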
- Step 40 begins the search process for the auxiliary databases selected in Step 30.
- the target database will not be searched because the user is, presumably, already searching that database for the concepts of interest.
- the target database could also be selected in Step 30 and searched as well.
- the search process is started by transmitting a query to each of the selected auxiliary databases that are associated with computer 220.
- Computer 220 will also transmit instructions and one or more forms of the search query to the M computers through the communications link 280 .
- the instructions sent by computer 220 could, for example, instruct computer 300 to use the Lycos™ search engine to search databases on the World Wide Web for documents having a Boolean combination of the terms in the search query.
- the instructions sent by computer 220 could also, for example, instruct computer 400 to use a vector space search technique to search its associated auxiliary database N to retrieve documents related to the list of query terms.
- the documents retrieved in Step 40 from the auxiliary databases associated with the M computers are returned to computer 220 through communication link 280.
- Step 50 of FIG. 2 determines a rank order of the documents for display.
- the processing of step 50 is completely independent of the processing used to retrieve the documents.
- the retrieved documents, in effect, form an independent database that is analyzed by the computer 220.
- various search techniques for retrieving documents across computer networks can be utilized, but all the returned documents are analyzed according to an independent process.
- the processing of step 50 can be as simple as selecting the documents for display that are returned first.
- the processing of Step 50 ranks the order of the returned documents according to a hierarchy of the databases in which the documents were located.
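The hierarchy-based alternative for Step 50 could look roughly like the sketch below. The data shapes (pairs of database identifier and document, and an ordered list of identifiers) are assumptions made for illustration.

```python
def rank_by_database_hierarchy(returned, hierarchy):
    """Order returned documents by the rank of their source database.

    returned is a list of (database_id, document) pairs in arrival order.
    hierarchy lists database ids from most to least preferred.
    sorted() is stable, so arrival order is kept within one database,
    and documents from unlisted databases go last.
    """
    position = {db_id: i for i, db_id in enumerate(hierarchy)}
    return sorted(returned, key=lambda pair: position.get(pair[0], len(hierarchy)))

returned = [("web", "d1"), ("aux-1", "d2"), ("web", "d3"), ("aux-2", "d4")]
ranked = rank_by_database_hierarchy(returned, ["aux-1", "aux-2", "web"])
```

Because the sort is stable, this also covers the simpler alternative of showing documents in the order they were returned: with an empty hierarchy the list comes back unchanged.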
- Still another processing alternative for Step 50 is to perform a vector space analysis on the returned documents. This analysis will rank the returned documents based on their relevance to the query.
- a vector space analysis computes a similarity score between the terms in the query and each of the returned documents by evaluating the shared and disjoint features of the query terms and a document over an orthogonal space of T terms of the document.
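The scoring formula itself is not reproduced in this text. As an illustration only, the sketch below uses cosine similarity over raw term counts, a common vector space measure; it is an assumption and may differ from the formula the applicants intended.

```python
import math
from collections import Counter

def cosine_similarity(query_terms, doc_terms):
    """Cosine of the angle between two term-count vectors (assumed measure)."""
    q = Counter(query_terms)
    d = Counter(doc_terms)
    # Shared terms contribute to the dot product; disjoint terms only
    # enlarge the norms, which lowers the score.
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

score = cosine_similarity(["heap", "sort"], ["heap", "heap", "tree"])
```

Identical term lists score 1.0 and lists with no shared term score 0.0, so a higher value means the document is closer to the query.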
- FIG. 3 illustrates a process for inverting a database.
- a document from the database is selected.
- the document is broken into subdocuments.
- each subdocument generally corresponds to a paragraph of the document. Long paragraphs may consist of multiple subdocuments and several short paragraphs may be included in a single subdocument. The subdocuments all have approximately the same length.
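One simple way to obtain subdocuments of roughly equal length is to give each a fixed word budget, as sketched below. The function name, the blank-line paragraph delimiter and the word-count budget are all assumptions; the text does not say how length is measured.

```python
def split_into_subdocuments(document, target_words=50):
    """Cut a document into subdocuments of about target_words words each."""
    subdocs, buffer = [], []
    for paragraph in document.split("\n\n"):
        words = paragraph.split()
        # A long paragraph is cut into several subdocuments.
        while len(buffer) + len(words) >= target_words:
            take = target_words - len(buffer)
            subdocs.append(" ".join(buffer + words[:take]))
            buffer, words = [], words[take:]
        # Short paragraphs (or a leftover tail) carry over into the next one.
        buffer.extend(words)
    if buffer:
        subdocs.append(" ".join(buffer))
    return subdocs

parts = split_into_subdocuments("a b c\n\nd e\n\nf g h i j k l", target_words=4)
```

In this toy input the two short paragraphs are merged and the long one is split, giving three subdocuments of four words each.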
- a subdocument is selected and parsed.
- the parsing process is a noun phrase parsing process.
- linguistic structure is assigned to sequences of words in a sentence. Those terms, including noun phrases, that have semantic meaning are listed.
- This parsing process can be implemented by a variety of techniques known in the art such as the use of lexicons, morphological analyzers or natural language grammar structures.
- FIG. 4 is an example listing of text parsed for noun phrases. As is evident from the list of FIG. 4, the phrases tagged with a ‘T’ are noun phrases, words tagged with a ‘V’ are verbs, words tagged with an ‘X’ are quantities, words tagged with an ‘A’ are adverbs and so on.
- a term list containing noun phrases and their associated subdocument is generated in step 140. All the subdocuments for each document are processed in this way and the list of terms and subdocuments is updated. Finally, all the documents of a database are processed according to steps 132-140. The result of this inversion process is a term list identifying all the terms (specifically noun phrases in this example) of a database and their associated subdocuments.
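The inversion result can be pictured as a mapping from each term to the subdocuments that contain it. The sketch below builds such a mapping; note that it substitutes plain lowercase word tokens for the noun phrase parser described above, and the subdocument identifiers are hypothetical.

```python
import re
from collections import defaultdict

def invert(subdocuments):
    """Map each term to the set of subdocument ids that contain it.

    subdocuments maps a subdocument id to its text. Word tokens stand in
    for noun phrases here, which is a simplification.
    """
    term_list = defaultdict(set)
    for subdoc_id, text in subdocuments.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            term_list[term].add(subdoc_id)
    return dict(term_list)

index = invert({
    "d1#0": "Heap sort ranks scores",
    "d1#1": "Scores feed the heap",
})
```

Looking up `index["heap"]` returns both subdocument ids, while `index["sort"]` returns only the first.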
- In step 310, the term list of the inverted database is searched to identify all the subdocuments that are associated with each term of the query that was identified in step 10 of FIG. 2.
- step 320 computes a partial similarity score (according to the general formula discussed above) for the query term and the subdocument. The computation process repeats for each query term and subdocument.
- In step 330, the partial scores for each subdocument are added or otherwise combined. As a result, when all the subdocuments have been scored for all the query terms, a subdocument score list is created in which each subdocument has an accumulated score.
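The accumulation in steps 310 to 330 can be sketched as below. The term list here is assumed to record how often each term occurs in each subdocument, and the partial score is simply that raw count; both are illustrative assumptions rather than the scoring described in the application.

```python
from collections import defaultdict

def score_subdocuments(query_terms, term_list):
    """Build a subdocument score list from per-term partial scores.

    term_list maps term -> {subdoc_id: occurrences of the term}.
    """
    scores = defaultdict(float)
    for term in query_terms:
        # Step 310 analogue: find the subdocuments listed under this term.
        for subdoc_id, count in term_list.get(term, {}).items():
            # Step 320 analogue: partial score (raw count, an assumption).
            # Step 330 analogue: add it to the subdocument's running total.
            scores[subdoc_id] += float(count)
    return dict(scores)

score_list = score_subdocuments(
    ["heap", "sort", "missing"],
    {"heap": {"s1": 2, "s2": 1}, "sort": {"s1": 1}},
)
```

Only subdocuments that share at least one term with the query appear in the score list, so the cost depends on the number of matches rather than on the size of the whole collection.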
- the subdocument score list contains a number of subdocument entries that are not sorted relative to their scores.
- the process of step 50 sorts the subdocuments by their score.
- This sort operation is a modified heap sort on the subdocument score list.
- a heap sort process is a process in which a heap is first created and then the documents with the highest scores are selected off the top of the heap to make the final sort order.
- the N subdocument scores are in heap form when the root (highest or lowest score magnitude on the subdocument score list represented by vector a(N)) is stored at a[1], the children of a[i] are a[2i] and a[2i+1] and the magnitude of a[i/2] ≥ a[i] for 1 ≤ i/2 < i ≤ N.
- a[1] = max(a[i]) for 1 ≤ i ≤ N. That is, the highest subdocument score is in the first position (a[1]) of the heap.
- step 50 merely selects this subdocument for further processing by the computer 220.
- the computer 220 displays the document text associated with this highest ranked subdocument.
- the computer 220 can also display the text of the entire document associated with this subdocument.
- the computer 220 is also processing in the background (according to step 50 of FIG. 2) the remaining entries in the subdocument score list to reheapify them (i.e., reorganize them back into a heap form after the highest value subdocument has been removed).
- When the next highest order subdocument is sought by computer 220, it can be merely selected off the top of the heap and displayed.
- the remaining entries in the subdocument list would then be reheapified again.
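The select-then-reheapify loop described above can be approximated with Python's standard `heapq` module, as in the sketch below. This is an ordinary binary heap, not the "modified heap sort" of the application, whose modifications are not detailed here. Since `heapq` is a min-heap, scores are negated so the highest score surfaces first.

```python
import heapq

def ranked_subdocuments(score_list):
    """Yield (subdoc_id, score) from highest to lowest score, one at a time."""
    heap = [(-score, subdoc_id) for subdoc_id, score in score_list.items()]
    heapq.heapify(heap)  # build the heap once, in linear time
    while heap:
        # heappop removes the top entry and restores heap order among the rest.
        neg_score, subdoc_id = heapq.heappop(heap)
        yield subdoc_id, -neg_score

results = ranked_subdocuments({"s1": 3.0, "s2": 1.0, "s3": 5.0})
best = next(results)   # the highest ranked subdocument, available immediately
rest = list(results)   # remaining entries, produced on demand
```

Because the generator is lazy, the top subdocument can be displayed before the rest of the list has been ordered, which mirrors the background reheapifying described in the text.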
- the computer system automatically connects the user to text portions of documents that are specifically related to the query. These text portions are retrieved from databases that do not have any particular structure or coded links in them. Additionally, these links are provided in spite of the fact that the set of returned documents may have been generated by different search techniques from different sources. Moreover, since the returned documents are automatically displayed, the user avoids the necessity of reorganizing the returned documents which may have been retrieved based on a variety of database search techniques.
Abstract
Description
- This is a continuation of patent application Ser. No. 09/295,840 filed Apr. 21, 1999, which is a division of patent application Ser. No. 08/900,639 filed Jul. 25, 1997, now issued as U.S. Pat. No. 5,926,808.
- This invention relates in general to computer databases. In particular, this invention relates to locating and generating connections between concepts identified in a source document and data objects distributed throughout multiple databases in a computer network.
- The volume of documents in databases is rapidly expanding. It has been estimated that in excess of 90% of all desired intelligence information is available in documents residing in accessible databases. Additionally, the number and size of computer databases available to computer users is expanding rapidly. This expansion is due both to the availability of multiple databases within a single network and the availability of multiple networks to a single computer. A major concern facing the user of a computer system that has access to multiple databases both within a network and between networks is the ability to conveniently locate relevant information. This problem is compounded in computer networks because the user is likely to be unaware of a number of databases across a network that contain relevant information.
- Typically, document retrieval from databases involves multiple user-driven searches across many different databases. The problem with this search technique is that it is often cumbersome because it requires significant interaction by the user to access many different databases. To cope with the ever-increasing expansion of databases across networks, recent attempts have been made at automating search processes. These improved systems have employed the generation of hyperlinks. Hyperlinks are ways of connecting the text of two documents together. Hyperlinks operate on a page image shown to a database user. A phrase or text section on the page image will be highlighted. When a user selects this phrase (clicks on it with a mouse), the user is immediately shown related text from another document. These hyperlinks are hardcoded links between a specific term and a specific set of text within a database or text on another network. The hyperlinks are useful because they allow a user to quickly retrieve documents related to the highlighted phrase without manually constructing and executing different searches. An example of conventional hyperlinks is U.S. Pat. No. 5,603,025 to Tabb, et al. In this patent, a hypertext report writing module is created in which hypertext links are automatically embedded in documents from the database.
- Although useful, conventional hypertext links are difficult to implement and use because these hypertext links have to be coded into the database itself. This fact renders conventional hypertext links inadequate for general purpose use in a computer network housing large quantities of distributed data. This is because the volume of potential hyperlinks is extremely large and the manual generation of such hardcoded links is, as a result, time consuming and expensive in large text databases.
- Also, since hyperlinks are pre-determined relationships between specified terms in databases, it is generally not feasible to categorize many large databases to make predetermined relationships for all items of potential interest. Moreover, conventional hypertext links are normally static. That is, even if there were enough resources to hardcode enough hypertext links to make them useful in a database, the process of hardcoding the links would only occur once. Thus, databases with hardcoded hyperlinks would not be linked to new data. These hyperlinks miss updates in the data. They also miss the addition of new databases to networks. The pre-determined and static nature of the hyperlinks as they currently exist makes them inappropriate for dynamically changing databases and difficult to use in distributed databases for information retrieval on wide ranging subjects. Accordingly, conventional search techniques have failed to address the need for a process capable of automatically generating connections between texts in different documents across multiple databases. Additionally, conventional search techniques have failed to provide a connection generating technique that can adapt to databases that are modified on a real time basis.
- It is the object of the present invention to analyze documents in a database system.
- It is a further object of the present invention to analyze documents in a database system by making connections between parts of related text in different documents.
- It is still a further object of the present invention to analyze documents in a database system by automating the process of connecting related text between different documents over multiple databases.
- It is still a further object of the present invention to analyze documents in a database system by automating the process of connecting related text between different documents across multiple computer networks.
- The system of the present invention provides a method of and apparatus for displaying portions of text from multiple documents over multiple databases related to a search query. The initial step in this method is to identify a search query. Based on this identification, a search against multiple databases is initiated. In particular, the computer system identifies auxiliary databases either within a network or between networks that are likely to contain documents relating to terms in the search query. Upon identification of these databases, the databases are then searched to identify those documents relating to the identified query. The various sets of identified documents from multiple databases are then returned and processed to create an ordered ranking for the returned documents. Text portions from the highest ranking documents across the multiple databases are then automatically displayed to the user.
- FIG. 1 is an illustration of a computer system that operates according to the present invention for displaying text portions from multiple databases.
- FIG. 2 is a flowchart that illustrates a process according to an embodiment of the present invention for displaying text portions relating to a query from multiple databases.
- FIG. 3 is a flowchart that illustrates a process according to an embodiment of the present invention for inverting a database.
- FIG. 4 is an illustration of a listing of text that results from a noun phrase parsing process.
- FIG. 5 is a flowchart that illustrates a process according to an embodiment of the present invention for scoring subdocuments.
- FIG. 6 is a flowchart that illustrates a process according to an embodiment of the present invention for sorting.
- FIG. 1 illustrates a computer system for searching databases. The
computer 220 is connected to adisplay 210, an input system 205 (including for example, a keyboard and mouse) amemory system 230 and acommunications link 280. Normally, the communications link is a simple modem. It could also be a higher rate direct connection between computers or another device for interconnecting computer systems. The communications link 280 is in turn connected to a network of M other computers each having their own memory systems. Thememory system 230 associated withcomputer 220 has amemory section 240 that contains a target database and it includes N memory sections that store a series of N auxiliary databases. The target database inmemory section 240 stores information that a user is currently interested in searching. The remaining N memory sections store auxiliary databases related to a variety of topics. The M computers attached to communications link 280 each have similar memory sections that store N auxiliary databases. In addition,memory section 250 ofmemory system 230 stores a list of database addresses and identifiers. - In general, the computer system of FIG. 1 operates to display information from a target file or database to a user. In the course of that general display of information, a user will often recognize a specific idea or concept from the displayed information that may or may not be directly relevant to the general information currently being displayed. The user will desire to access or link to information about this specific concept without losing access to the general information currently being displayed. The computer system of FIG. 1 operates to provide links between identified concepts and information contained in multiple databases. The computer system of FIG. 1 provides these links by causing the
computer 220 to receive a query and identify databases having information relevant to the query. Once the databases are identifiedcomputer 220 causes them to be searched such that they return documents or passages of documents relevant to the query. Thecomputer 220 then organizes the returned documents or passages thereof and displays at least a portion of the text associated with those documents. - Specifically, FIG. 2 illustrates a process for operating the computer system of FIG. 1 according to the present invention. Initially, a query is identified in
Step 10 of FIG. 2. This can be done by highlighting and selecting (through a conventional graphical user interface) a portion of text that the computer is already displaying. The query could also just be an input to thecomputer 220 made through a keyboard. Once the text of the query has been identified, the text is converted into a search request instep 20 of FIG. 2. Converting the identified query text into a search request involves the conventional steps of parsing the query text into terms and then making use of the terms to form a query. The form of the query will depend on the type of search technique that will be used to search the databases. Most search techniques use Boolean combinations of terms as the query. As a result, these techniques ‘AND’ the query terms together to form a query. Other search techniques make use of vector space analysis. In this case, the list of terms forms a query because the vector space algorithm does not use logical operators to form the query. - Once a query has been formed, step30 of FIG. 2 selects the databases that will be searched. The computer system of FIG. 1 includes a
memory space 250 that stores information to identify databases (and the types of information they store) or general database search engines. Since general database search engines, such as the Lycos™ engine on the World Wide Web have their own resources for selecting the particular databases to search for a given query, Step 30 merely transmits a Boolean combination of query terms to these search engines (unless a user opts out of such a selection). For other databases identified inmemory space 250 of FIG. 1, a Boolean combination of query terms is compared against the description of the databases listed inmemory space 250. As a result of this comparison, a set of auxiliary databases is selected that will be searched against the query. - Once the set of auxiliary databases is selected in
Step 30 of FIG. 2, Step 40 begins the search process for the auxiliary databases selected in Step 30. Normally the target database will not be searched because the user is, presumably, already searching that database for the concepts of interest. However, the target database could also be selected in Step 30 and searched as well. Referring to FIG. 1, the search process is started by transmitting a query to each of the selected auxiliary databases that are associated with computer 220. Computer 220 will also transmit instructions and one or more forms of the search query to the M computers through the communications link 280. The instructions sent by computer 220 could, for example, instruct computer 300 to use the Lycos™ search engine to search databases on the World Wide Web for documents having a Boolean combination of the terms in the search query. The instructions sent by computer 220 could also, for example, instruct computer 400 to use a vector space search technique to search its associated auxiliary database N to retrieve documents related to the list of query terms. The documents retrieved in Step 40 from the auxiliary databases associated with the M computers are returned to computer 220 through communication link 280.
- Once the documents retrieved from the auxiliary databases have been returned, computer 220 processes them in Step 50 of FIG. 2 to determine a rank order of the documents for display. The processing of Step 50 is completely independent of the processing used to retrieve the documents. The retrieved documents, in effect, form an independent database that is analyzed by the computer 220. As a result, various search techniques for retrieving documents across computer networks can be utilized, but all the returned documents are analyzed according to an independent process. The processing of Step 50 can be as simple as selecting the documents for display that are returned first. Alternatively, the processing of Step 50 can rank the returned documents according to a hierarchy of the databases in which the documents were located.
- Still another processing alternative for Step 50 is to perform a vector space analysis on the returned documents. This analysis will rank the returned documents based on their relevance to the query. In particular, a vector space analysis computes a similarity score between the query and each of the returned documents by evaluating the shared and disjoint features of the query terms and the document over an orthogonal space of T terms of the document. The score can be computed by the following formula:
- Where Qi refers to terms in the query and Dj refers to terms in the document.
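The formula itself is not reproduced in this text. As a hedged illustration only, the following Python sketch uses the cosine measure, a common vector space similarity that rewards shared terms and penalizes disjoint ones; it may differ from the formula in the original. The function name `cosine_score` and the use of raw term counts as weights are assumptions made for the example.

```python
import math
from collections import Counter


def cosine_score(query_terms, doc_terms):
    """Cosine similarity between term-count vectors of a query and a document.

    Illustrative only: raw term counts stand in for whatever term
    weighting the actual system applies (an assumption).
    """
    q = Counter(query_terms)
    d = Counter(doc_terms)
    # Shared terms contribute to the dot product; disjoint terms contribute zero.
    dot = sum(q[t] * d[t] for t in q.keys() & d.keys())
    # Disjoint terms still enlarge the vector norms, which lowers the score.
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    norm = q_norm * d_norm
    return dot / norm if norm else 0.0
```

Under this measure, a document that shares every query term and adds few unrelated terms scores close to 1, while a document with no query terms scores 0.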
- In order to score the retrieved documents, the set of retrieved documents is treated as a database and this database is inverted. The inversion step is a technique for creating a listing of all the terms of the database and the portions of the documents associated with those terms. FIG. 3 illustrates a process for inverting a database. In step 132, a document from the database is selected. In step 134, the document is broken into subdocuments. In this process, for example, each subdocument generally corresponds to a paragraph of the document. Long paragraphs may consist of multiple subdocuments and several short paragraphs may be included in a single subdocument. The subdocuments all have approximately the same length.
- In steps
- Once the subdocument has been parsed, a term list containing noun phrases and their associated subdocument is generated in step 140. All the subdocuments for each document are processed in this way and the list of terms and subdocuments is updated. Finally, all the documents of a database are processed according to steps 132-140. The result of this inversion process is a term list identifying all the terms (specifically noun phrases in this example) of a database and their associated subdocuments.
- Once the retrieved document database has been inverted, the subdocuments of that database are scored. FIG. 5 is an illustration of the scoring process. In step 310, the term list of the inverted database is searched to identify all the subdocuments that are associated with each term of the query that was identified in step 10 of FIG. 2. For each of the identified subdocuments, step 320 computes a partial similarity score (according to the general formula discussed above) for the query term and the subdocument. The computation process repeats for each query term and subdocument. In step 330, the partial scores for each subdocument are added or otherwise combined. As a result, when all the subdocuments have been scored for all the query terms, a subdocument score list is created in which each subdocument has an accumulated score.
- After step 330 of FIG. 5, the subdocument score list contains a number of subdocument entries that are not sorted relative to their scores. At this point, the process of step 50 sorts the subdocuments by their score. This sort operation is a modified heap sort on the subdocument score list. A heap sort process is a process in which a heap is first created and then the documents with the highest scores are selected off the top of the heap to make the final sort order. FIG. 6 illustrates a general algorithm for a heap sort process. This process is initialized by setting l=(N/2)+1 and r=N, where N is the number of subdocuments in the subdocument score list. Then, the process of FIG. 6 is operated until l=1 or r<N. This process places the N subdocument scores in a heap form. The N subdocument scores are in heap form when the root (highest or lowest score magnitude on the subdocument score list represented by vector a(N)) is stored at a[1], the children of a[i] are a[2i] and a[2i+1], and the magnitude of a[i/2] ≥ a[i] for 1 ≤ i/2 < i ≤ N. When the subdocument score list is in a heap form, a[1] = max(a[i]) for 1 ≤ i ≤ N. That is, the highest subdocument score is in the first position (a[1]) of the heap.
- Since subdocuments are ranked by score to quickly select the most relevant subdocuments and since the most relevant subdocument is at the top of the heap, the process of step 50 (of FIG. 2) merely selects this subdocument for further processing by the computer 220. In step 60 of FIG. 2, the computer 220 then displays the document text associated with this highest ranked subdocument. The computer 220 can also display the text of the entire document associated with this subdocument. While the computer 220 is displaying the text of the highest ranking subdocument, the computer 220 is also processing in the background (according to step 50 of FIG. 2) the remaining entries in the subdocument score list to reheapify them (i.e., reorganize them back into a heap form after the highest value subdocument has been removed). As a result, when the next highest order subdocument is sought by computer 220, it can be merely selected off the top of the heap and displayed. The remaining entries in the subdocument list would then be reheapified again.
- According to the process illustrated in FIG. 2, once a user has selected a query (through highlighting text or otherwise), the computer system automatically connects the user to text portions of documents that are specifically related to the query. These text portions are retrieved from databases that do not have any particular structure or coded links in them. Additionally, these links are provided in spite of the fact that the set of returned documents may have been generated by different search techniques from different sources. Moreover, since the returned documents are automatically displayed, the user avoids the necessity of reorganizing the returned documents which may have been retrieved based on a variety of database search techniques.
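The inversion, scoring, and heap-based selection steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the process of FIGS. 3, 5, and 6: the function names `invert`, `score`, and `ranked` are hypothetical, subdocuments are assumed to be already parsed into terms, a raw occurrence count stands in for the partial similarity score, and the standard-library `heapq` module replaces the modified heap sort.

```python
import heapq
from collections import defaultdict


def invert(subdocs):
    """Build a term list: term -> {subdocument id: occurrence count}.

    `subdocs` maps a subdocument id to its already-parsed terms.
    """
    term_list = defaultdict(dict)
    for sub_id, terms in subdocs.items():
        for term in terms:
            term_list[term][sub_id] = term_list[term].get(sub_id, 0) + 1
    return term_list


def score(term_list, query_terms):
    """Accumulate a partial score for every (query term, subdocument) pair."""
    scores = defaultdict(float)
    for term in query_terms:
        for sub_id, count in term_list.get(term, {}).items():
            # Placeholder partial score (assumption): the raw occurrence count.
            scores[sub_id] += count
    return scores


def ranked(scores):
    """Yield (subdocument id, score) pairs from highest to lowest score."""
    # heapq implements a min-heap, so scores are negated to keep the best on top.
    heap = [(-s, sub_id) for sub_id, s in scores.items()]
    heapq.heapify(heap)
    while heap:
        # heappop removes the top entry and restores heap order for the rest.
        neg, sub_id = heapq.heappop(heap)
        yield sub_id, -neg


# Example with hypothetical subdocument ids and terms.
subdocs = {
    "d1#0": ["heap sort", "score"],
    "d1#1": ["score"],
    "d2#0": ["heap sort", "heap sort", "score"],
}
results = ranked(score(invert(subdocs), ["heap sort", "score"]))
first = next(results)  # ("d2#0", 3.0): the top entry is available immediately
```

Because `ranked` is a generator, the highest scoring subdocument can be displayed after a single pop, and each later request costs one more pop rather than a full sort. The sketch does not model the background reheapification that runs while the text is displayed.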
- While the invention has been particularly described and illustrated with reference to a preferred embodiment, it will be understood by one of skill in the art that changes in the above description or illustrations may be made with respect to formal detail without departing from the spirit and scope of the invention.
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/387,747 US20030225757A1 (en) | 1997-07-25 | 2003-03-13 | Displaying portions of text from multiple documents over multiple database related to a search query in a computer network |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/900,639 US5926808A (en) | 1997-07-25 | 1997-07-25 | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
US29584099A | 1999-04-21 | 1999-04-21 | |
US10/387,747 US20030225757A1 (en) | 1997-07-25 | 2003-03-13 | Displaying portions of text from multiple documents over multiple database related to a search query in a computer network |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US29584099A | Continuation | 1997-07-25 | 1999-04-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
US20030225757A1 (en) | 2003-12-04 |
Family
ID=25412850
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/900,639 Expired - Fee Related US5926808A (en) | 1997-07-25 | 1997-07-25 | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
US10/387,747 Abandoned US20030225757A1 (en) | 1997-07-25 | 2003-03-13 | Displaying portions of text from multiple documents over multiple database related to a search query in a computer network |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/900,639 Expired - Fee Related US5926808A (en) | 1997-07-25 | 1997-07-25 | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
Country Status (2)
Country | Link |
---|---|
US (2) | US5926808A (en) |
JP (1) | JPH11102376A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050210003A1 (en) * | 2004-03-17 | 2005-09-22 | Yih-Kuen Tsay | Sequence based indexing and retrieval method for text documents |
US20060004724A1 (en) * | 2004-06-03 | 2006-01-05 | Oki Electric Industry Co., Ltd. | Information-processing system, information-processing method and information-processing program |
US20070192442A1 (en) * | 2001-07-24 | 2007-08-16 | Brightplanet Corporation | System and method for efficient control and capture of dynamic database content |
US20090064104A1 (en) * | 2007-08-31 | 2009-03-05 | Tom Baeyens | Method and apparatus for supporting multiple business process languages in BPM |
US20090063225A1 (en) * | 2007-08-31 | 2009-03-05 | Tom Baeyens | Tool for automated transformation of a business process definition into a web application package |
US20090070362A1 (en) * | 2007-09-12 | 2009-03-12 | Alejandro Guizar | BPM system portable across databases |
US20090144729A1 (en) * | 2007-11-30 | 2009-06-04 | Alejandro Guizar | Portable business process deployment model across different application servers |
US7908260B1 (en) | 2006-12-29 | 2011-03-15 | BrightPlanet Corporation II, Inc. | Source editing, internationalization, advanced configuration wizard, and summary page selection for information automation systems |
US20120173566A1 (en) * | 2010-12-31 | 2012-07-05 | Quora, Inc. | Multi-functional navigation bar |
US20130097494A1 (en) * | 2011-10-17 | 2013-04-18 | Xerox Corporation | Method and system for visual cues to facilitate navigation through an ordered set of documents |
US8914804B2 (en) | 2007-09-12 | 2014-12-16 | Red Hat, Inc. | Handling queues associated with web services of business processes |
US10816623B2 (en) | 2013-05-22 | 2020-10-27 | General Electric Company | System and method for reducing acoustic noise level in MR imaging |
Families Citing this family (72)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5822720A (en) | 1994-02-16 | 1998-10-13 | Sentius Corporation | System amd method for linking streams of multimedia data for reference material for display |
US6154757A (en) * | 1997-01-29 | 2000-11-28 | Krause; Philip R. | Electronic text reading environment enhancement method and apparatus |
US5926808A (en) * | 1997-07-25 | 1999-07-20 | Claritech Corporation | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
US6278990B1 (en) * | 1997-07-25 | 2001-08-21 | Claritech Corporation | Sort system for text retrieval |
US6353824B1 (en) * | 1997-11-18 | 2002-03-05 | Apple Computer, Inc. | Method for dynamic presentation of the contents topically rich capsule overviews corresponding to the plurality of documents, resolving co-referentiality in document segments |
US6209007B1 (en) * | 1997-11-26 | 2001-03-27 | International Business Machines Corporation | Web internet screen customizing system |
JP3571201B2 (en) * | 1997-12-12 | 2004-09-29 | 富士通株式会社 | Database search device and computer-readable recording medium storing database search program |
JP3571515B2 (en) * | 1997-12-19 | 2004-09-29 | 富士通株式会社 | Computer-readable storage medium storing a knowledge collection / storage / retrieval program |
US20080028292A1 (en) * | 1997-12-22 | 2008-01-31 | Ricoh Company, Ltd. | Techniques to facilitate reading of a document |
JP4286345B2 (en) * | 1998-05-08 | 2009-06-24 | 株式会社リコー | Search support system and computer-readable recording medium |
US7272604B1 (en) * | 1999-09-03 | 2007-09-18 | Atle Hedloy | Method, system and computer readable medium for addressing handling from an operating system |
NO984066L (en) * | 1998-09-03 | 2000-03-06 | Arendi As | Computer function button |
US7496854B2 (en) * | 1998-11-10 | 2009-02-24 | Arendi Holding Limited | Method, system and computer readable medium for addressing handling from a computer program |
US6582475B2 (en) * | 1998-09-09 | 2003-06-24 | Ricoh Company Limited | Automatic adaptive document printing help system |
IL126373A (en) | 1998-09-27 | 2003-06-24 | Haim Zvi Melman | Apparatus and method for search and retrieval of documents |
US7228492B1 (en) * | 1999-07-06 | 2007-06-05 | Ricoh Company, Ltd. | 2D graph displaying document locations of user-specified concept of interest |
US7013300B1 (en) * | 1999-08-03 | 2006-03-14 | Taylor David C | Locating, filtering, matching macro-context from indexed database for searching context where micro-context relevant to textual input by user |
US7219073B1 (en) * | 1999-08-03 | 2007-05-15 | Brandnamestores.Com | Method for extracting information utilizing a user-context-based search engine |
US6775665B1 (en) * | 1999-09-30 | 2004-08-10 | Ricoh Co., Ltd. | System for treating saved queries as searchable documents in a document management system |
US7127500B1 (en) * | 1999-11-10 | 2006-10-24 | Oclc Online Computer Library Center, Inc. | Retrieval of digital objects by redirection of controlled vocabulary searches |
AU7339700A (en) * | 1999-11-16 | 2001-05-30 | Searchcraft Corporation | Method for searching from a plurality of data sources |
KR100362381B1 (en) * | 1999-12-27 | 2002-11-23 | 한국전자통신연구원 | Web filtering system and method using thereof in internet |
US7567958B1 (en) | 2000-04-04 | 2009-07-28 | Aol, Llc | Filtering system for providing personalized information in the absence of negative data |
JP2001318948A (en) * | 2000-05-09 | 2001-11-16 | Hitachi Ltd | Method and device for retrieving document and medium having processing program for the method stored thereon |
JP2002024243A (en) * | 2000-07-07 | 2002-01-25 | Shimadzu Corp | Scientific information browse system and host computer and browsing computer used for the same |
US6691107B1 (en) * | 2000-07-21 | 2004-02-10 | International Business Machines Corporation | Method and system for improving a text search |
KR20020010226A (en) * | 2000-07-28 | 2002-02-04 | 정명수 | Internet Anything Response System |
KR20000063712A (en) * | 2000-07-31 | 2000-11-06 | 강원식 | Real time question & answer method and it's system using by Internet. |
KR20000064067A (en) * | 2000-08-18 | 2000-11-06 | 서영호 | Business Solution on Customer Support Service Response of Web based |
KR20020050401A (en) * | 2000-12-21 | 2002-06-27 | 한문철 | Method of answering questions concerning traffic accidents |
US7013312B2 (en) * | 2001-06-21 | 2006-03-14 | International Business Machines Corporation | Web-based strategic client planning system for end-user creation of queries, reports and database updates |
US7130861B2 (en) | 2001-08-16 | 2006-10-31 | Sentius International Corporation | Automated creation and delivery of database content |
US8799489B2 (en) * | 2002-06-27 | 2014-08-05 | Siebel Systems, Inc. | Multi-user system with dynamic data source selection |
KR20040031990A (en) * | 2002-10-08 | 2004-04-14 | 한국과학기술정보연구원 | System and Method for finding the original, and Storage media having program source thereof |
KR20040042927A (en) * | 2002-11-14 | 2004-05-22 | 주식회사 드리머 | Information searching service method using short message service and thereof |
US8095500B2 (en) * | 2003-06-13 | 2012-01-10 | Brilliant Digital Entertainment, Inc. | Methods and systems for searching content in distributed computing networks |
US7729992B2 (en) | 2003-06-13 | 2010-06-01 | Brilliant Digital Entertainment, Inc. | Monitoring of computer-related resources and associated methods and systems for disbursing compensation |
US20060168012A1 (en) * | 2004-11-24 | 2006-07-27 | Anthony Rose | Method and system for electronic messaging via distributed computing networks |
US20070094308A1 (en) * | 2004-12-30 | 2007-04-26 | Ncr Corporation | Maintaining synchronization among multiple active database systems |
US20070094237A1 (en) * | 2004-12-30 | 2007-04-26 | Ncr Corporation | Multiple active database systems |
US20070208753A1 (en) * | 2004-12-30 | 2007-09-06 | Ncr Corporation | Routing database requests among multiple active database systems |
US20060149707A1 (en) * | 2004-12-30 | 2006-07-06 | Mitchell Mark A | Multiple active database systems |
US7567990B2 (en) * | 2004-12-30 | 2009-07-28 | Teradata Us, Inc. | Transfering database workload among multiple database systems |
US20070174349A1 (en) * | 2004-12-30 | 2007-07-26 | Ncr Corporation | Maintaining consistent state information among multiple active database systems |
US8027876B2 (en) | 2005-08-08 | 2011-09-27 | Yoogli, Inc. | Online advertising valuation apparatus and method |
US8429167B2 (en) | 2005-08-08 | 2013-04-23 | Google Inc. | User-context-based search engine |
WO2007032095A1 (en) * | 2005-09-16 | 2007-03-22 | Bits Co., Ltd. | Document data managing method, managing system, and computer software |
US20070067849A1 (en) * | 2005-09-21 | 2007-03-22 | Jung Edward K | Reviewing electronic communications for possible restricted content |
US8214394B2 (en) | 2006-03-01 | 2012-07-03 | Oracle International Corporation | Propagating user identities in a secure federated search system |
US8027982B2 (en) * | 2006-03-01 | 2011-09-27 | Oracle International Corporation | Self-service sources for secure search |
US7941419B2 (en) * | 2006-03-01 | 2011-05-10 | Oracle International Corporation | Suggested content with attribute parameterization |
US8875249B2 (en) * | 2006-03-01 | 2014-10-28 | Oracle International Corporation | Minimum lifespan credentials for crawling data repositories |
US9177124B2 (en) * | 2006-03-01 | 2015-11-03 | Oracle International Corporation | Flexible authentication framework |
US8433712B2 (en) * | 2006-03-01 | 2013-04-30 | Oracle International Corporation | Link analysis for enterprise environment |
US8707451B2 (en) | 2006-03-01 | 2014-04-22 | Oracle International Corporation | Search hit URL modification for secure application integration |
US8868540B2 (en) * | 2006-03-01 | 2014-10-21 | Oracle International Corporation | Method for suggesting web links and alternate terms for matching search queries |
US8332430B2 (en) * | 2006-03-01 | 2012-12-11 | Oracle International Corporation | Secure search performance improvement |
US20070214129A1 (en) * | 2006-03-01 | 2007-09-13 | Oracle International Corporation | Flexible Authorization Model for Secure Search |
US8005816B2 (en) * | 2006-03-01 | 2011-08-23 | Oracle International Corporation | Auto generation of suggested links in a search system |
US7996392B2 (en) | 2007-06-27 | 2011-08-09 | Oracle International Corporation | Changing ranking algorithms based on customer settings |
US8316007B2 (en) * | 2007-06-28 | 2012-11-20 | Oracle International Corporation | Automatically finding acronyms and synonyms in a corpus |
JP5376625B2 (en) * | 2008-08-05 | 2013-12-25 | 学校法人東京電機大学 | Iterative fusion search method in search system |
US9092517B2 (en) * | 2008-09-23 | 2015-07-28 | Microsoft Technology Licensing, Llc | Generating synonyms based on query log data |
US8719249B2 (en) * | 2009-05-12 | 2014-05-06 | Microsoft Corporation | Query classification |
US9600566B2 (en) | 2010-05-14 | 2017-03-21 | Microsoft Technology Licensing, Llc | Identifying entity synonyms |
US10032131B2 (en) | 2012-06-20 | 2018-07-24 | Microsoft Technology Licensing, Llc | Data services for enterprises leveraging search system data assets |
US9594831B2 (en) | 2012-06-22 | 2017-03-14 | Microsoft Technology Licensing, Llc | Targeted disambiguation of named entities |
US9229924B2 (en) | 2012-08-24 | 2016-01-05 | Microsoft Technology Licensing, Llc | Word detection and domain dictionary recommendation |
US20150142444A1 (en) * | 2013-11-15 | 2015-05-21 | International Business Machines Corporation | Audio rendering order for text sources |
US9740748B2 (en) | 2014-03-19 | 2017-08-22 | International Business Machines Corporation | Similarity and ranking of databases based on database metadata |
US9230028B1 (en) * | 2014-06-18 | 2016-01-05 | Fmr Llc | Dynamic search service |
CN109376174B (en) * | 2018-12-30 | 2021-04-27 | 北京奇艺世纪科技有限公司 | Method and device for selecting database |
Citations (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5050071A (en) * | 1988-11-04 | 1991-09-17 | Harris Edward S | Text retrieval method for texts created by external application programs |
US5263159A (en) * | 1989-09-20 | 1993-11-16 | International Business Machines Corporation | Information retrieval based on rank-ordered cumulative query scores calculated from weights of all keywords in an inverted index file for minimizing access to a main database |
US5265065A (en) * | 1991-10-08 | 1993-11-23 | West Publishing Company | Method and apparatus for information retrieval from a database by replacing domain specific stemmed phases in a natural language to create a search query |
US5280573A (en) * | 1989-03-14 | 1994-01-18 | Sharp Kabushiki Kaisha | Document processing support system using keywords to retrieve explanatory information linked together by correlative arcs |
US5379366A (en) * | 1993-01-29 | 1995-01-03 | Noyes; Dallas B. | Method for representation of knowledge in a computer as a network database system |
US5454105A (en) * | 1989-06-14 | 1995-09-26 | Hitachi, Ltd. | Document information search method and system |
US5465353A (en) * | 1994-04-01 | 1995-11-07 | Ricoh Company, Ltd. | Image matching and retrieval by multi-access redundant hashing |
US5488725A (en) * | 1991-10-08 | 1996-01-30 | West Publishing Company | System of document representation retrieval by successive iterated probability sampling |
US5598557A (en) * | 1992-09-22 | 1997-01-28 | Caere Corporation | Apparatus and method for retrieving and grouping images representing text files based on the relevance of key words extracted from a selected file to the text files |
US5600835A (en) * | 1993-08-20 | 1997-02-04 | Canon Inc. | Adaptive non-literal text string retrieval |
US5603025A (en) * | 1994-07-29 | 1997-02-11 | Borland International, Inc. | Methods for hypertext reporting in a relational database management system |
US5634051A (en) * | 1993-10-28 | 1997-05-27 | Teltech Resource Network Corporation | Information management system |
US5671404A (en) * | 1994-03-31 | 1997-09-23 | Martin Lizee | System for querying databases automatically |
US5675788A (en) * | 1995-09-15 | 1997-10-07 | Infonautics Corp. | Method and apparatus for generating a composite document on a selected topic from a plurality of information sources |
US5717913A (en) * | 1995-01-03 | 1998-02-10 | University Of Central Florida | Method for detecting and extracting text data using database schemas |
US5721906A (en) * | 1994-03-24 | 1998-02-24 | Ncr Corporation | Multiple repositories of computer resources, transparent to user |
US5724571A (en) * | 1995-07-07 | 1998-03-03 | Sun Microsystems, Inc. | Method and apparatus for generating query responses in a computer-based document retrieval system |
US5742816A (en) * | 1995-09-15 | 1998-04-21 | Infonautics Corporation | Method and apparatus for identifying textual documents and multi-mediafiles corresponding to a search topic |
US5748954A (en) * | 1995-06-05 | 1998-05-05 | Carnegie Mellon University | Method for searching a queued and ranked constructed catalog of files stored on a network |
US5761497A (en) * | 1993-11-22 | 1998-06-02 | Reed Elsevier, Inc. | Associative text search and retrieval system that calculates ranking scores and window scores |
US5802515A (en) * | 1996-06-11 | 1998-09-01 | Massachusetts Institute Of Technology | Randomized query generation and document relevance ranking for robust information retrieval from a database |
US5826260A (en) * | 1995-12-11 | 1998-10-20 | International Business Machines Corporation | Information retrieval system and method for displaying and ordering information based on query element contribution |
US5826261A (en) * | 1996-05-10 | 1998-10-20 | Spencer; Graham | System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query |
US5848373A (en) * | 1994-06-24 | 1998-12-08 | Delorme Publishing Company | Computer aided map location system |
US5857184A (en) * | 1996-05-03 | 1999-01-05 | Walden Media, Inc. | Language and method for creating, organizing, and retrieving data from a database |
US5859972A (en) * | 1996-05-10 | 1999-01-12 | The Board Of Trustees Of The University Of Illinois | Multiple server repository and multiple server remote application virtual client computer |
US5864845A (en) * | 1996-06-28 | 1999-01-26 | Siemens Corporate Research, Inc. | Facilitating world wide web searches utilizing a multiple search engine query clustering fusion strategy |
US5873107A (en) * | 1996-03-29 | 1999-02-16 | Apple Computer, Inc. | System for automatically retrieving information relevant to text being authored |
US5903890A (en) * | 1996-03-05 | 1999-05-11 | Sofmap Future Design, Inc. | Database systems having single-association structures |
US5907840A (en) * | 1997-07-25 | 1999-05-25 | Claritech Corporation | Overlapping subdocuments in a vector space search process |
US5920854A (en) * | 1996-08-14 | 1999-07-06 | Infoseek Corporation | Real-time document collection search engine with phrase indexing |
US5920856A (en) * | 1997-06-09 | 1999-07-06 | Xerox Corporation | System for selecting multimedia databases over networks |
US5926808A (en) * | 1997-07-25 | 1999-07-20 | Claritech Corporation | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
US5933822A (en) * | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US5983221A (en) * | 1998-01-13 | 1999-11-09 | Wordstream, Inc. | Method and apparatus for improved document searching |
US6009442A (en) * | 1997-10-08 | 1999-12-28 | Caere Corporation | Computer-based document management system |
US6012053A (en) * | 1997-06-23 | 2000-01-04 | Lycos, Inc. | Computer system with user-controlled relevance ranking of search results |
US6078914A (en) * | 1996-12-09 | 2000-06-20 | Open Text Corporation | Natural language meta-search system and method |
- 1997-07-25: US application US08/900,639, patent US5926808A (en), Expired - Fee Related
- 1998-04-21: JP application JP10110879A, patent JPH11102376A (en), Pending
- 2003-03-13: US application US10/387,747, patent US20030225757A1 (en), Abandoned
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7676555B2 (en) | 2001-07-24 | 2010-03-09 | Brightplanet Corporation | System and method for efficient control and capture of dynamic database content |
US20070192442A1 (en) * | 2001-07-24 | 2007-08-16 | Brightplanet Corporation | System and method for efficient control and capture of dynamic database content |
US8380735B2 (en) | 2001-07-24 | 2013-02-19 | Brightplanet Corporation II, Inc | System and method for efficient control and capture of dynamic database content |
US20050210003A1 (en) * | 2004-03-17 | 2005-09-22 | Yih-Kuen Tsay | Sequence based indexing and retrieval method for text documents |
US20060004724A1 (en) * | 2004-06-03 | 2006-01-05 | Oki Electric Industry Co., Ltd. | Information-processing system, information-processing method and information-processing program |
US7908260B1 (en) | 2006-12-29 | 2011-03-15 | BrightPlanet Corporation II, Inc. | Source editing, internationalization, advanced configuration wizard, and summary page selection for information automation systems |
US20090063225A1 (en) * | 2007-08-31 | 2009-03-05 | Tom Baeyens | Tool for automated transformation of a business process definition into a web application package |
US20090064104A1 (en) * | 2007-08-31 | 2009-03-05 | Tom Baeyens | Method and apparatus for supporting multiple business process languages in BPM |
US8423955B2 (en) | 2007-08-31 | 2013-04-16 | Red Hat, Inc. | Method and apparatus for supporting multiple business process languages in BPM |
US9058571B2 (en) | 2007-08-31 | 2015-06-16 | Red Hat, Inc. | Tool for automated transformation of a business process definition into a web application package |
US20090070362A1 (en) * | 2007-09-12 | 2009-03-12 | Alejandro Guizar | BPM system portable across databases |
US8825713B2 (en) * | 2007-09-12 | 2014-09-02 | Red Hat, Inc. | BPM system portable across databases |
US8914804B2 (en) | 2007-09-12 | 2014-12-16 | Red Hat, Inc. | Handling queues associated with web services of business processes |
US20090144729A1 (en) * | 2007-11-30 | 2009-06-04 | Alejandro Guizar | Portable business process deployment model across different application servers |
US8954952B2 (en) | 2007-11-30 | 2015-02-10 | Red Hat, Inc. | Portable business process deployment model across different application servers |
US20120173566A1 (en) * | 2010-12-31 | 2012-07-05 | Quora, Inc. | Multi-functional navigation bar |
US20130097494A1 (en) * | 2011-10-17 | 2013-04-18 | Xerox Corporation | Method and system for visual cues to facilitate navigation through an ordered set of documents |
US8881007B2 (en) * | 2011-10-17 | 2014-11-04 | Xerox Corporation | Method and system for visual cues to facilitate navigation through an ordered set of documents |
US10816623B2 (en) | 2013-05-22 | 2020-10-27 | General Electric Company | System and method for reducing acoustic noise level in MR imaging |
Also Published As
Publication number | Publication date |
---|---|
US5926808A (en) | 1999-07-20 |
JPH11102376A (en) | 1999-04-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5926808A (en) | | Displaying portions of text from multiple documents over multiple databases related to a search query in a computer network |
US6523030B1 (en) | | Sort system for merging database entries |
US7587387B2 (en) | | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
US6701310B1 (en) | | Information search device and information search method using topic-centric query routing |
US6205443B1 (en) | | Overlapping subdocuments in a vector space search process |
US6286000B1 (en) | | Light weight document matcher |
US6947920B2 (en) | | Method and system for response time optimization of data query rankings and retrieval |
US7398201B2 (en) | | Method and system for enhanced data searching |
US7725424B1 (en) | | Use of generalized term frequency scores in information retrieval systems |
US6725217B2 (en) | | Method and system for knowledge repository exploration and visualization |
US6564210B1 (en) | | System and method for searching databases employing user profiles |
US6385602B1 (en) | | Presentation of search results using dynamic categorization |
US5920859A (en) | | Hypertext document retrieval system and method |
US8285724B2 (en) | | System and program for handling anchor text |
US7283997B1 (en) | | System and method for ranking the relevance of documents retrieved by a query |
US6446066B1 (en) | | Method and apparatus using run length encoding to evaluate a database |
US20080027918A1 (en) | | Method of generating a distributed text index for parallel query processing |
US6505198B2 (en) | | Sort system for text retrieval |
US20040015485A1 (en) | | Method and apparatus for improved internet searching |
US20020040363A1 (en) | | Automatic hierarchy based classification |
WO2008127263A1 (en) | | Methods and systems for formulating and executing concept-structured queries of unorganized data |
Attardi et al. | | Theseus: categorization by context |
US6473755B2 (en) | | Overlapping subdocuments in a vector space search process |
Yang et al. | | Dynamic clustering of web search results |
Ferragina et al. | | The anatomy of a Clustering Engine for Web Snippets |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: CLARITECH CORPORATION, PENNSYLVANIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: EVANS, DAVID A.; MCINERY, MICHAEL J.; REEL/FRAME: 013867/0230; Effective date: 19970723 |
| | AS | Assignment | Owner name: CLAIRVOYANCE CORPORATION, PENNSYLVANIA; Free format text: CHANGE OF NAME; ASSIGNOR: CLARITECH CORPORATION; REEL/FRAME: 018446/0708; Effective date: 20000621 |
| | AS | Assignment | Owner name: CLAIRVOYANCE CORPORATION, PENNSYLVANIA; Free format text: CHANGE OF NAME; ASSIGNOR: CLARITECH CORPORATION; REEL/FRAME: 020493/0569; Effective date: 20000621 |
| | AS | Assignment | Owner name: JUSTSYSTEMS EVANS RESEARCH INC., PENNSYLVANIA; Free format text: CHANGE OF NAME; ASSIGNOR: CLAIRVOYANCE CORPORATION; REEL/FRAME: 020571/0270; Effective date: 20070316 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |