CN100524307C - Method and device for establishing coupled relation between documents - Google Patents

Publication number
CN100524307C
Authority
CN
China
Prior art keywords
document
user
relevant documentation
search condition
relevant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100942198A
Other languages
Chinese (zh)
Other versions
CN101097574A (en)
Inventor
王庆波
陈伟柱
费奔
苏中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to CNB2006100942198A (patent CN100524307C)
Priority to US11/740,431 (patent US7809716B2)
Publication of CN101097574A
Application granted
Publication of CN100524307C
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures

Abstract

The invention discloses a method and device for establishing association relationships between documents according to the user's operations on search results. When a user uses a search engine to search a memory for documents that satisfy a search condition, the search engine returns a group of ranked documents as the search result, and these documents are related to one another through the search condition. If the user clicks a document in the search result and the click satisfies a certain condition, for example the document stays open longer than a defined time, the document can be considered relevant to the search condition, and an association relationship between the clicked documents can be inferred. The invention records the association relationships between documents according to the search history and the user's operations on the search results, and presents these relationships when needed.

Description

A method and apparatus for establishing association relationships between documents
Technical field
The present invention relates to a method and apparatus for establishing association relationships between documents, and in particular to a method and apparatus that establish such relationships based on the user's history of operations on search results.
Background art
A PC or network storage usually holds a large number of documents. In form these documents are independent of one another, but in content many of them are directly or indirectly associated. These associations may be valuable information for the users or visitors of the documents. For example, a technical researcher may want to start from one document and find other related documents in order to follow technical developments in the field. However, the prior art provides no technical solution for querying related documents using a document as the entry point.
The current practice is that the user must personally read each document, determine by judgment the associations between documents, and then store documents with related content in the same directory, thereby establishing associations between them. Typically the user classifies documents by means of a directory tree (tree-based directory). This way of recording relationships has an obvious defect: it cannot record associations between documents at different levels or in different directories. For example, document 1 explains how to make animations and is therefore filed in a directory named "Guide Book", yet it is in fact more closely related to the documents about animation in another directory, "technical news". In this case, unless the user knows this in advance, it is difficult to discover the association between documents in different directories. In addition, this method requires the user to reorganize the directory tree regularly, which is a time-consuming and complicated task for a user with a large amount of data.
The prior art only provides methods for judging the degree of relevance between a search condition and a document. Such methods are used especially in Internet-based web search, for example at www.delphion.com. In such a web search, when the user enters a keyword, such as a patent number, a result list is returned. The list contains a series of hyperlinks to the corresponding search results and is sorted in descending order of relevance to the keyword. The most relevant document is usually marked 100%. When the website judges relevance, the factors usually considered are whether the keyword appears at a specific position in the article (for example, the position of the patent number) and how many times the keyword appears in the whole article. In other words, if the keyword appears at a specific position in a search result, or appears in that result more often than in any other, that result is considered the most relevant to the keyword.
However, the above method cannot solve the problem addressed by the present invention, because it only helps to determine the degree of relevance between a search condition and a document and is not used to determine associations between one document and another. A user who wants to find other documents related to a given document has to read the document, summarize keywords, and enter them into a search engine. This manual approach is error-prone, time-consuming, and laborious. Moreover, the method keeps no memory of the relevance between documents: a user who cannot remember the related documents retrieved with a certain keyword two days earlier has to enter the same keyword again, search again, and read again.
In addition, the relevance reflected by this method is static rather than a dynamic relationship that updates automatically with the user's experience. That is, although an article may contain a keyword many times, it is not necessarily the article the user most needs to see. Especially when the keyword chosen by the user is ambiguous, the search engine often returns results that are not what was asked for. For example, when the user enters the search term "windows", documents containing "window" and documents about windowing software appear together in the search result.
Therefore, a simple and dynamic method is needed to help the user find and determine associations between documents.
Summary of the invention
In order to solve the problems referred to above of prior art, the present invention proposes a kind of method and apparatus that is used for file retrieval result's operation being set up coupled relation between documents according to the user.When the user used search engine to search for the document that meets predetermined search condition (query) in storer, search engine can return one group of document through ordering, as result for retrieval, was relative to each other according to described search condition between these documents.If the user further clicks the document in the result for retrieval, and described click meets some requirements, certain long-time such as surpassing, can think to a certain extent that then the document is relevant really with described search condition, and then can derive, there is incidence relation between the result for retrieval document that the user clicked.The present invention is recorded in the document associations relation that is produced in the user search process, and in needs this incidence relation is presented to the user.
Therefore, the invention provides a kind of method that is used for file retrieval result's operation being set up coupled relation between documents according to the user, wherein said file retrieval result retrieves and the document results that obtains according to search condition, this method comprises: monitor user ' is to file retrieval result's operation, and, obtain the document that the user chooses according to described operation; According to the document that described user chooses, the storage list of relevant documents; According to described list of relevant documents, obtain the incidence relation between document, wherein, the described at least search condition of usefulness is described the incidence relation between described document.
The present invention also provides a kind of method according to the inquiry of the incidence relation between document relevant documentation, and this method comprises: receive the inlet document that the user selects; There are the relevant documentation of incidence relation in inquiry and described inlet document; And the Query Result that will comprise described relevant documentation returns to the user.
The present invention also provides a kind of device that is used for according to the user file retrieval result's operation being set up coupled relation between documents, wherein said file retrieval result retrieves and the document results that obtains according to search condition, this device comprises: monitor user ' is to file retrieval result's operation, and, obtain the parts of the document that the user chooses according to described operation; According to the document that the user chooses, the parts of storage list of relevant documents; According to described list of relevant documents, obtain the parts of the incidence relation between document, wherein, the described at least search condition of usefulness is described the incidence relation between described document.
The present invention also provides a kind of device according to the inquiry of the incidence relation between document relevant documentation, and this device comprises: the parts that receive the inlet document of user's selection; There are the parts of the relevant documentation of incidence relation in inquiry and described inlet document; And the Query Result that will comprise described relevant documentation returns to user's parts.
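The query method summarized above (receive an entry document, look up the documents associated with it, return them) can be illustrated with a minimal sketch. The data layout, a dictionary keyed by document pairs, and the name `find_related` are assumptions made for illustration only; the patent does not prescribe them.

```python
def find_related(relation_table, entry_doc):
    """Return {related_doc: {search_condition: relevance}} for every stored pair that includes entry_doc."""
    related = {}
    for (doc_a, doc_b), per_query in relation_table.items():
        if entry_doc == doc_a:
            related[doc_b] = per_query
        elif entry_doc == doc_b:
            related[doc_a] = per_query
    return related


# Hypothetical association data: each document pair maps to the search
# conditions that linked the pair, with an assumed relevance value.
relation_table = {
    ("doc1", "doc2"): {"Q1": 1.0, "Q2": 0.8},
    ("doc2", "doc3"): {"Q1": 1.0},
}

print(find_related(relation_table, "doc2"))
```

A linear scan keeps the sketch short; a real store would more likely index the pairs by document.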
The invention can bring the following advantages: 1) A certain relevance is guaranteed between any two documents for which a relationship is established. Because the relationship is based on the user's operations on retrieval results, the invention makes clever use of the user's own reading and judgment. In general, only a retrieval result that the user has read for longer than a certain time is regarded as an accurate result and hence as substantively connected to the other such documents, which reduces the instability of retrieval results caused by ambiguous search conditions. 2) The relationships between documents change dynamically. Because they are based on the user's retrieval habits and are continually updated, they are continually corrected to approach the relationships the user actually endorses. 3) The relationships established between documents are personalized to the user. 4) Relationships are established and queried automatically, without extra operations by the user. The invention involves no complex computation, only simple calculation and storage, and so does not impose excessive overhead on the system.
The above description has outlined the advantages of the invention. These and other advantages will become more apparent from the detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Description of drawings
The drawings referred to in this description only illustrate exemplary embodiments of the invention and should not be regarded as limiting its scope.
Fig. 1A is a schematic diagram of a user search interface in the prior art.
Fig. 1B is a schematic diagram of the interface in which a search engine returns search results in the prior art.
Fig. 2 is a structural block diagram of a search engine system in the prior art.
Fig. 3A is a structural block diagram of a WDS engine with a log collector.
Fig. 3B is a structural block diagram of a web search engine that sets up a separate log collector for each user.
Fig. 3C is a structural block diagram of a web search engine that sets up one shared log collector for different users.
Fig. 4A is a schematic diagram of a WDS engine system for establishing association relationships between documents.
Fig. 4B is a schematic diagram of a web search engine system for establishing association relationships between documents.
Fig. 5 is a structural block diagram of a related-document query system.
Fig. 6A is a flowchart of establishing association relationships between documents according to the user's operations on document retrieval results.
Fig. 6B is a flowchart of determining whether the document association table needs to be updated.
Fig. 6C is a flowchart of generating/updating the document association table.
Fig. 7A is a flowchart of querying the related documents associated with an entry document according to the established association relationships between documents.
Fig. 7B is a flowchart in which the sort order of the search conditions is determined first, and then the sort order of the related documents indexed by each search condition.
Fig. 7C is a flowchart in which the sort order of the related documents is determined first, and then the sort order of the search conditions indexed by each related document.
Fig. 7D is a flowchart in which only the sort order of the related documents is determined.
Fig. 8 is a schematic diagram of the interface in which the user selects the arrangement of search conditions.
Figs. 9A-9D are schematic diagrams of the interface in which the user queries related documents.
Figs. 10A-10C are expanded tree views of the results of the user's related-document query.
Detailed description of the embodiments
In the following discussion, numerous specific details are provided to give a thorough understanding of the invention. It will be apparent to those skilled in the art, however, that the invention can be understood even without these details. It should also be appreciated that any specific term below is used merely for convenience of description, and the invention should therefore not be limited to use solely in any specific application identified and/or implied by such a term.
Unless otherwise stated, the functions of the invention may be performed by hardware, software, or a combination of the two. In a preferred embodiment, however, unless otherwise stated, these functions are performed by a processor, such as a computer or electronic data processor, or by an integrated circuit, in accordance with code such as computer program code. In general, the methods carried out to implement embodiments of the invention may be part of an operating system or of a specific application, program, module, object, or sequence of instructions. The software of the invention typically comprises numerous instructions that the local computer translates into a machine-readable format and thus into executable instructions. In addition, programs comprise variables and data structures that either reside locally to the program or are found in memory. Furthermore, the various programs described below may be identified by the application for which they are implemented in a particular embodiment of the invention. Signal-bearing media, when carrying computer-readable instructions that direct the functions of the invention, represent embodiments of the invention.
First, some terms used in the invention are explained as follows:
Document: a document in the sense of the invention is not limited to text formats such as doc, ppt, or pdf; it may be a file of any form, including audio, video, and even executable files.
Search condition: all the conditions that the user enters into the search engine system for retrieval, including the keyword, the keyword field, the type of the documents to be retrieved, their most recent modification time, their author, and any other advanced options that may be used.
Document retrieval result list: the list of all documents returned to the user after a search with the search engine.
Embodiments of the invention are described below with reference to the accompanying drawings:
Fig. 1A is a schematic diagram of a user search interface 100 in the prior art. In general, the user search interface 100 contains several search condition (query) input boxes in which the user can enter the corresponding search expressions. The figure lists some common search items. The keyword column 110 does not mean that the search engine will search strictly by the keyword 111 entered by the user; as those of ordinary skill in the art understand, a search engine can also expand the keyword using fuzzy search techniques. For example, when the keyword 111 entered by the user is "panda", a search engine equipped with a knowledge base (not shown) can also search for documents that contain a synonym of "panda". The keyword field column 112 represents the superordinate concept of the keyword 111 entered by the user; a search engine may offer this function to extend its search capability. For example, when the keyword field 113 entered by the user is "animal", a search engine equipped with a knowledge base (not shown) can perform semantic analysis on this input, retrieve documents covering various animal species, and return them to the user. Search engines of this kind are mostly used in specialized search programs that focus on one or a few fields. The document type column 114 indicates the storage format of the documents to be retrieved, including doc, txt, pdf, ppt, and so on. The search condition may also include the most recent modification time 116 and the author 118 of the documents to be retrieved, as well as other advanced options 120. The search conditions shown in the figure can be combined through AND, OR, and NOT relations. A search condition as referred to below may therefore be either a single condition, such as one keyword, or a search expression, such as several keywords joined by AND, OR, and NOT. The content of a search condition is also not limited to the examples shown in the figure; various special search items may be used as the search engine requires. Many embodiments in this specification are described using keywords as examples, which does not mean that other search conditions cannot be applied to the invention.
Fig. 1B is a schematic diagram of the interface 100 in which a search engine returns search results in the prior art. All search results 134 are displayed in the result display column 132. Although not shown, those of ordinary skill in the art will understand that the result display column 132 can present richer information about each search result 134, including its degree of relevance to the keyword 111 entered by the user, its document type, its most recent modification date, and its author. It can also include a document snippet, which usually contains a passage relevant to the keyword 111 and helps the user grasp the content of the search result 134 quickly. The user can, of course, view the content of a document by clicking the corresponding search result in the result display column 132.
Fig. 2 is a structural block diagram of a search engine system 200 in the prior art. The file system monitoring module (File System Watcher) 210 obtains change information about document data from the operating system (not shown), for example that a new document has been created or an existing document has been modified, and passes all change information to the crawler module (crawler) 212. The crawler module extracts the documents and hands them to the parser (Parser) 214. The parser 214 analyzes the extracted document content for indexing, which includes word segmentation, filtering, and conversion, and stores the analysis results in the crawled data repository (Crawled Data Repository) 216.
Indexing generally means representing documents in a form that is easy to retrieve and storing them in an index database; commonly used approaches include the vector space model, the inverted index, and the probabilistic model. The indexing component (Indexing Component) 218 in Fig. 2 obtains documents from the crawled data repository 216 and first sends them to the natural language processing (NLP) component 220, which performs syntactic analysis (such as sentence segmentation), semantic analysis, pragmatic analysis, and so on, on sentences and passages. The analysis results are saved in the attached lexicon (lexicon) 224, for example as a hash table containing a keyword list and a pointer list that links to the inverted file. The indexer (Indexer) 222 is connected to the natural language processing component 220; it builds the index for the documents and stores it in the inverted index repository (Inverted Index Repository) 226 for use in user searches.
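As a rough illustration of the inverted index mentioned above, the following minimal sketch maps each term to the set of documents that contain it. The tokenizer, the function names, and the sample documents are assumptions made for illustration; they do not describe the components of Fig. 2.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (a simple stand-in for the parser)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return the ids of documents that contain every term of the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents.
docs = {
    "doc1": "How to make an animation",
    "doc2": "Technology news: animation tools",
    "doc3": "Desktop search engines",
}
index = build_inverted_index(docs)
print(sorted(lookup(index, "animation")))
```

The lookup reads from the index with `get`, so querying an unknown term does not add empty entries to it.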
The ranking module (Ranking System) 228 computes the degree of match between the user's search keywords and the target documents and, according to the result, arranges all documents that satisfy the search in descending or ascending order of relevance and returns them to the user. The ranking module 228 in Fig. 2 comprises a query parser (Query Parser) 230, a sorter (Sorter) 234, and a merger (Merger) 232. The query parser 230 analyzes the keywords in the search condition received from the user interface 236 and then searches the lexicon and the inverted index repository; existing search engines can usually provide keyword analysis based on fuzzy search techniques. The sorter 234 sorts the documents found. The merger 232 combines documents stored in different memories.
The user interface (Search UI) 236 is connected to the user's input/output device (not shown). It receives the search request entered by the user and a customized interface 100 for returning search results, formats the results sorted by the ranking module 228 according to the customized interface 100, and finally returns them to the user.
Fig. 3A is a structural block diagram of a WDS engine 301 with a log collector (Log Collector) 350. A WDS engine (desktop search engine) is a tool dedicated to searching the data in local storage devices according to the user's needs. Its main difference from the prior art is that a log collector 350 is added between the ranking module 228 and the user interface 236. The main function of the log collector 350 is to collect the user's operations on the search results 134 and to derive the relevance between documents from these operations. Specifically, the log collector 350 comprises a log watcher (Log Watcher) 352, a log repository (Log Repository) 354, and a log analyzer (Log Analyzer) 356, and is connected to a document relationship repository (Document Relationship Repository) 358.
After the user submits a search condition, the search engine returns a ranked document retrieval result list (Documents List), and the user may then decide, from the title or summary of each result, whether to open and read the document. In general, if the user opens a document and reads it for longer than a certain time (for example, 5 minutes), the document can be judged to be fairly relevant to the keyword the user entered, that is, a related document. The method of judging related documents is described in detail later; the description here is only illustrative. It will be appreciated that this "certain time", that is, the viewing time threshold, may differ by document type. For example, the viewing time threshold for Word documents and PDF files should be relatively long, because the user usually needs more time to read such documents and judge their relevance, whereas for HTML documents or photos the judgment does not take very long. The log watcher 352 monitors the user's search behavior and the user's click operations on the search results 134, and stores the corresponding information in Table 1 of the log repository 354.
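The type-dependent viewing time threshold described above can be sketched as follows. Only the 5-minute figure comes from the text, and only as an example; the other threshold values, the type keys, and the function name are assumptions for illustration.

```python
# Hypothetical viewing-time thresholds per document type, in seconds.
# The text gives 5 minutes only as an example; the shorter values are assumed.
VIEW_THRESHOLDS = {
    "doc": 300,
    "pdf": 300,
    "html": 60,
    "jpg": 20,
}
DEFAULT_THRESHOLD = 300  # assumed fallback for types not listed above


def is_related(doc_type, seconds_open):
    """Treat a clicked result as a related document if it stayed open longer than its type's threshold."""
    threshold = VIEW_THRESHOLDS.get(doc_type.lower(), DEFAULT_THRESHOLD)
    return seconds_open > threshold


print(is_related("pdf", 301))
print(is_related("pdf", 120))
```

Keeping the thresholds in a lookup table makes it easy to tune them per document type without touching the decision logic.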
Table 1: User operation records for search results (table image not reproduced)
In the example of Table 1, the "document retrieval results" column stores the 10 search result 134 documents retrieved for one search condition. The "related documents" column stores the list of documents that have been associated with one another on the basis of the user's operations on the search results, for example the documents the user opened and kept open for a reasonably long time (such as 5 minutes). The "other information" column stores further information about the user's viewing of the related documents, such as how long each document was open, how many times it was opened, and the time or order in which the user clicked it. Note that the row of information indexed by search condition Q1 in Table 1 may represent the user's click operations within a single search, or the click operations on the results of searches the user ran with the same search condition Q1 at different times, for example when the user enters the same keyword twice and opens different articles in each of the two searches.
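A Table 1-style log can be sketched as one row per search condition, with later searches under the same condition merged into the existing row. The class and field names below are hypothetical, and the "other information" column is reduced here to a count of how often each document was opened.

```python
from dataclasses import dataclass, field


@dataclass
class LogRow:
    """One Table 1-style row, indexed by a search condition (all field names are hypothetical)."""
    query: str
    results: list = field(default_factory=list)      # documents returned for the search condition
    related: list = field(default_factory=list)      # documents judged to be related
    open_counts: dict = field(default_factory=dict)  # "other information": times each document was opened


class LogRepository:
    """Keeps one row per search condition and merges repeated searches into it."""

    def __init__(self):
        self.rows = {}

    def record_search(self, query, results):
        row = self.rows.setdefault(query, LogRow(query))
        for doc in results:
            if doc not in row.results:
                row.results.append(doc)

    def record_related(self, query, doc):
        row = self.rows.setdefault(query, LogRow(query))
        if doc not in row.related:
            row.related.append(doc)
        row.open_counts[doc] = row.open_counts.get(doc, 0) + 1


# Two searches with the same search condition Q1 at different times.
repo = LogRepository()
repo.record_search("Q1", ["doc1", "doc2", "doc3"])
repo.record_related("Q1", "doc1")
repo.record_search("Q1", ["doc1", "doc2", "doc3"])
repo.record_related("Q1", "doc2")
repo.record_related("Q1", "doc1")
print(repo.rows["Q1"].related)
```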
It will also be understood that Table 1 need not keep the "document retrieval results" column, because once the "related documents" list has been produced, the information in the search results 134 is of little importance to the subsequent operations of the invention. Nor need Table 1 keep the "other information" column, because its main purpose is to support the calculation of relevance in subsequent steps. If relevance is not to be calculated in detail, and all result documents that the user opened for longer than a reasonable time are simply assigned the same degree of relevance to one another, this column can be omitted for simplicity.
When multiple factors are considered, the "other information" in Table 1 can also be extended considerably. It can include the number of times the user opened a document and the time or order in which the user clicked a document. In general, if a search result 134 document is opened and read by the user many times, it is likely to be closely related to the search condition. Alternatively, the one or several documents the user opens first may be regarded as closely related to the search condition; in other retrieval environments, of course, the one or several documents the user opens last may be regarded as closely related. In short, the "other information" here can contain any information related to the user's operations on the search results that helps to calculate relevance in subsequent steps.
The log analyzer 356 obtains and analyzes the information of Table 1 in the log repository 354, calculates the relevance between documents, and stores it in Table 2 in the document relationship repository 358. For the related documents under each search condition in Table 1, Table 2 records the relationship between each related document and the others, with the search condition and the degree of relevance as reference.
Table 2: Association relationship table between documents
The rows and columns of Table 2 are the related documents extracted from Table 1, and each intersection of a row and a column holds the relationship data for a pair of related documents. The relationship data include the search conditions shared by the pair of related documents and the degree-of-relevance data. The degree of relevance may be either the relevance between documents based on a search condition or the average relevance between documents. The former is the relevance of two documents with respect to a single search condition; the latter is the average relevance of two documents over multiple search conditions. For example, Q1 and Q2 in the table denote two search conditions. 100% is the relevance of document 1 and document 2 calculated from the operations on the search results for keyword Q1, 80% is the relevance of document 1 and document 2 calculated from the operations on the search results for keyword Q2, and 90% is the average relevance of the two documents, combining the relevance values obtained with the different keywords. In this example, 90% = (100% + 80%) / 2.
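A Table 2-style structure can be sketched as a mapping from each pair of related documents to the search conditions they share and the relevance under each condition. In this sketch the per-condition relevance is supplied as an input, because the patent leaves its calculation open; the function names and sample values are assumptions that mirror the 100% / 80% / 90% example.

```python
from itertools import combinations


def build_relation_table(related_by_query, relevance_by_query):
    """
    Build {(doc_a, doc_b): {search_condition: relevance}} from Table 1-style data.

    related_by_query:   {search_condition: [related document ids]}
    relevance_by_query: {search_condition: relevance assumed for pairs found under it}
    """
    table = {}
    for query, docs in related_by_query.items():
        # Sorting gives each pair one canonical key regardless of click order.
        for doc_a, doc_b in combinations(sorted(set(docs)), 2):
            table.setdefault((doc_a, doc_b), {})[query] = relevance_by_query[query]
    return table


def average_relevance(per_query):
    """Plain mean of the per-condition relevance values for one document pair."""
    return sum(per_query.values()) / len(per_query)


related = {"Q1": ["doc1", "doc2", "doc3"], "Q2": ["doc2", "doc1"]}
relevance = {"Q1": 1.0, "Q2": 0.8}
table = build_relation_table(related, relevance)
print(table[("doc1", "doc2")])
print(round(average_relevance(table[("doc1", "doc2")]), 2))
```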
Calculating based on the degree of correlation between the document of same search condition can be according to a kind of like this basic assumption, if promptly all there is confidential relation in certain two pieces of document with the keyword that the user is imported, also there is substantial connection so between these documents, if in opposite two pieces of documents, have only one piece of keyword of being imported with the user to have confidential relation, perhaps the keyword imported of the two and user is all uncorrelated, so just can not assert in view of the above between two pieces of documents to have substantial connection.Therefore, the content of the calculating of said degree of correlation here " out of Memory " that depend in the table 1 to a great extent to be provided.
The average degree of correlation between documents is derived from the degrees of correlation based on the different search conditions described above. The derivation can usually take the form of a weighted average, as follows:
Average degree of correlation = (degree of correlation based on Q1 × weight 1 + degree of correlation based on Q2 × weight 2 + ... + degree of correlation based on QN × weight N) / N
The weight corresponding to each search condition may be obtained by any simple or complex calculation. For example, when a keyword (that is, a search condition) appears many times in the two documents, the keyword can be considered relatively important and its weight increased accordingly.
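By way of non-limiting illustration, the weighted average above can be sketched in Python as follows. The function name `average_correlation` and the dictionary layout are assumptions made for this sketch only; the description does not prescribe any particular data structure. When no weight is supplied for a search condition, a weight of 1 is assumed, which reproduces the 90% = (100% + 80%)/2 example.

```python
def average_correlation(per_condition, weights=None):
    """Weighted average of per-condition degrees of correlation.

    per_condition: dict mapping a search condition (e.g. "Q1") to the degree
    of correlation of one document pair under that condition (0.0 to 1.0).
    weights: optional dict mapping a search condition to its weight;
    a missing weight defaults to 1.0 (an assumption of this sketch).
    """
    if not per_condition:
        return 0.0
    weights = weights or {}
    total = sum(score * weights.get(condition, 1.0)
                for condition, score in per_condition.items())
    # As in the formula above, the sum is divided by the number N of
    # search conditions, not by the sum of the weights.
    return total / len(per_condition)


# Document 1 and document 2: 100% under Q1 and 80% under Q2.
avg = average_correlation({"Q1": 1.0, "Q2": 0.8})
weighted = average_correlation({"Q1": 1.0, "Q2": 0.8}, {"Q1": 1.0, "Q2": 0.5})
```

With unit weights the result matches the unweighted mean; lowering the weight of Q2 lowers the contribution of that search condition to the average.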
For simplicity, the present invention can be varied in several ways. For example, the degree of correlation between documents (whether the degree of correlation based on each search condition or the average degree of correlation) need not be recorded; only the search conditions that associate a pair of documents are recorded, as shown in Table 2A below. In that case, when the document correlations are presented to the user in a subsequent step, they are no longer sorted by degree of correlation but may be displayed in no particular order.
| Relevant document | Document 1 | Document 2 | ... | Document n |
|---|---|---|---|---|
| Document 1 | 0 | Q1, Q2 | ... | |
| Document 2 | Q1, Q2 | 0 | ... | |
| ... | ... | | | |
| Document n | ... | ... | ... | 0 |

Table 2A: Document association table that does not record the degree of correlation
It will be appreciated that the search conditions in Table 2, and the degrees of correlation of relevant documents based on those search conditions, can also be omitted (that is, only the average degree of correlation between documents is recorded), as shown in Table 2B below. In that case, when the document correlations are presented to the user in a subsequent step, only the other relevant documents associated with a given document are shown, and the search conditions that link the two are not presented.
| Relevant document | Document 1 | Document 2 | ... | Document n |
|---|---|---|---|---|
| Document 1 | 0 | 90% | ... | |
| Document 2 | 90% | 0 | ... | |
| ... | ... | | | |
| Document n | ... | ... | ... | 0 |

Table 2B: Document association table that does not record search conditions
In addition, the average degree of correlation in Table 2 can also be omitted; in that case, when the document correlations are presented to the user in a subsequent step, they are no longer sorted by the average degree of correlation between documents.
When more complex situations are considered, the items in Table 2 can be extended further; for example, Table 2 may also record the degree of correlation between a given document and a given search condition. The method for calculating that degree of correlation will be understood by those of ordinary skill in the art and is not described in detail here. The search engine system can also process Table 2 in other ways as required, for example by pruning its stored content to reduce storage, or by sorting its stored content to facilitate subsequent operations.
In short, the basic purpose of recording Table 2 is to convert the "search condition to document" (query to document) representation in Table 1 into a "document to document" representation, thereby establishing the correlation between documents.
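The conversion from a "search condition to document" representation to a "document to document" representation can be illustrated by the following minimal Python sketch. It assumes, for illustration only, that Table 1 is available as a list of (search condition, relevant document list) entries and it produces a structure comparable to Table 2A, recording only the search conditions shared by each pair of documents. The name `build_document_table` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_document_table(query_log):
    """Turn query-to-document entries into a document-to-document table.

    query_log: list of (search_condition, [relevant documents]) tuples,
    an assumed in-memory form of Table 1.
    Returns table[doc_a][doc_b] = set of search conditions under which
    both documents were recorded as relevant.
    """
    table = defaultdict(lambda: defaultdict(set))
    for condition, documents in query_log:
        # Every unordered pair of relevant documents under one search
        # condition is associated through that condition.
        for doc_a, doc_b in combinations(sorted(set(documents)), 2):
            table[doc_a][doc_b].add(condition)
            table[doc_b][doc_a].add(condition)  # the table is symmetric
    return table


log = [
    ("Q1", ["document 1", "document 2"]),
    ("Q2", ["document 2", "document 1", "document 3"]),
]
table_2a = build_document_table(log)
```

Here document 1 and document 2 end up associated through both Q1 and Q2, as in the Table 2A example, while document 3 is associated with the other two only through Q2.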
Fig. 3A depicts a block diagram of the structure of a WDS engine; however, the present invention is not limited to the WDS engine system shown in Fig. 4A and can also be applied to a network search engine as shown in Fig. 4B. The network search engine 422 is a search system in which user 370 is connected to server 422 through network 420 and the search engine is deployed on the server.
Fig. 3B is a block diagram of network search engine 302, in which a separate log collector 350 is set up for each user 370. In this structure the server sets up an individual log collector 350 for each user 370 and configures for each user 370 a document relationship memory 358 (not shown) corresponding to that user's search history. The server can thus provide a service that lets user 370 query the other documents on the server related to a given document. Because the server sets up an individual log collector 350 for each user 370, this embodiment ensures that the query results are fully personalized.
Fig. 3C is a block diagram of network search engine 303, in which the same log collector 350 is set up for different users 370. In this structure the server sets up one common log collector 350 for a plurality of users 370 and configures a shared document relationship memory 358 (not shown) for the different users 370. In this embodiment the different users 370 are not distinguished; from the point of view of the log collector they are treated as the same user 370. The benefit is that an operation by any user 370 on the search results may update the associations between documents, so that the associations reflect the true state of correlation between documents fully and comprehensively. Thus, when any user 370 later visits the server, that user can draw on the associations between documents established from the search history and activity of other users 370.
Fig. 5 is a block diagram of relevant document query system 500. Once the document association table (Table 2) in document relationship memory 358 is ready, relevant document query system 500 can provide a relevant document query service: when the user supplies a document as the query entry, the system determines the other documents related to that entry document and returns them to the user. User interface 502 receives the document selected by the user for querying and sends it to entry document monitor 504. The entry document is the document the user selects as the starting point when querying for other relevant documents associated with it. Entry document monitor 504 monitors the entry document supplied by the user. Using that document as the entry, relevant document getter 506 obtains the other documents related to the entry document from document relationship memory 358. Query result processor 510 processes the relevant documents obtained by relevant document getter 506, that is, it sorts and filters them. Sort order memory 512 stores the default sort order for query results, or a customized sort order obtained from user interface 502 through sort order controller 516. Finally, the filtered and sorted relevant documents are returned to user interface 502 by query result transmitter 514.
The present invention is described below in terms of its method flows.
Fig. 6A is a flowchart of establishing associations between documents according to the user's operations on document search results. First, in step 604, after the search condition entered by the user is detected, the search engine returns the document search results to the user in step 606 and at the same time generates or updates the list of document search results.
Then, in step 608, the system monitors which documents the user selects in the search results, and in step 610 it judges whether a selected document satisfies the condition for becoming a relevant document. Typically the user selects a document by clicking it; those of ordinary skill in the art will understand that the user can also select a document in other ways, including "Save As" or a shortcut key. The search engine applies a criterion to decide whether a document selected by the user becomes a relevant document. Typically, the document is considered to satisfy the condition once the user has kept it open for longer than a certain time. For simplicity, the judgment can also be omitted, so that any document the user opens is treated as a relevant document regardless of how long it stays open. In other cases, the number of times the user opens a document to read it, or the time at which or order in which the user clicks a document, can be used as criteria, alone or in combination.
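The judgment of step 610 can be illustrated by the following minimal Python sketch. The class `OpenEvent`, the function `is_relevant` and the threshold values are hypothetical and chosen only for illustration; the description leaves the concrete criterion to the search engine.

```python
from dataclasses import dataclass


@dataclass
class OpenEvent:
    """One observed user operation on a search result (assumed record layout)."""
    document: str
    seconds_open: float
    open_count: int = 1


def is_relevant(event, min_seconds=30.0, min_opens=None):
    """Decide whether a selected document becomes a relevant document.

    min_seconds: dwell-time threshold; None disables the criterion.
    min_opens: open-count threshold; None disables the criterion.
    If both criteria are disabled, every opened document is relevant,
    which corresponds to the simplified variant without any judgment.
    """
    if min_seconds is None and min_opens is None:
        return True
    if min_seconds is not None and event.seconds_open > min_seconds:
        return True
    if min_opens is not None and event.open_count > min_opens:
        return True
    return False


long_read = is_relevant(OpenEvent("document 1", seconds_open=45.0))
quick_glance = is_relevant(OpenEvent("document 2", seconds_open=5.0))
```

In this sketch the criteria are combined with a logical OR; a stricter engine could equally require all enabled criteria to hold.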
In step 612, when the document selected by the user is judged to satisfy the condition for becoming a relevant document, the search engine system generates or updates the relevant document list and the other information in the user operation record table for search results (Table 1). At this point the user operation record table has been filled in.
In general, however, although it is possible in theory, in practice there is no need to analyze and organize the contents of Table 1 in real time in order to generate or update Table 2. To save system resources, the analysis and organization can be carried out when the system is idle, or periodically. Therefore, in step 614, the system judges whether the update condition for Table 2 set by the user has been reached. Fig. 6B is a flowchart of this process. In step 632, the system checks the update period of Table 2; the period can be set to one day, one week, one month and so on, depending on system configuration and user needs. In step 634, the system checks whether the current time has reached the update period. If so, step 636 further judges whether the system is currently idle; if it is idle and new retrieval entries have been produced in Table 1 (step 638), generation or updating of Table 2 can begin.
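The update check of Fig. 6B can be sketched in Python as follows. The function name `should_update_table2` and its parameters are assumptions for illustration; how idleness is detected and how new entries are counted are left open by the description.

```python
from datetime import datetime, timedelta


def should_update_table2(last_update, now, period, system_idle, new_entries):
    """Return True when Table 2 may be generated or updated.

    last_update, now: datetime values; period: timedelta (e.g. one day).
    system_idle: bool supplied by the caller (idle detection is out of scope).
    new_entries: number of Table 1 entries added since the last update.
    """
    if now - last_update < period:   # step 634: update period not yet reached
        return False
    if not system_idle:              # step 636: system is busy
        return False
    return new_entries > 0           # step 638: something new to analyze


last = datetime(2006, 6, 27, 8, 0)
due = should_update_table2(last, last + timedelta(days=1, hours=1),
                           timedelta(days=1), system_idle=True, new_entries=3)
too_early = should_update_table2(last, last + timedelta(hours=2),
                                 timedelta(days=1), system_idle=True, new_entries=3)
```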
Returning to Fig. 6A, the system records the associations between documents according to the relevant document list and/or the other information in Table 1.
In step 616, the system calculates the degree of correlation between documents and generates or updates Table 2. Fig. 6C is a flowchart of generating or updating the document association table. In step 652, the system identifies each new retrieval entry in Table 1. It takes each document in the relevant document list of that entry as an object of analysis (step 654) and, using it as the row index, creates a row in Table 2 (step 656). Using the relevant documents of the analyzed document in Table 1 as column indexes, it creates one or more columns in Table 2 (step 658). It then stores the search condition corresponding to the row index and column index (step 660), generates or updates, from the other information in Table 1, the degree of correlation of the two documents for that search condition (step 662), and finally generates or updates the average degree of correlation between the two documents (step 664).
Returning to Fig. 6A, in step 618 the system can, preferably, also process Table 2, for example by pruning it so that the correlation between documents is recorded only when the degree of correlation exceeds a predetermined threshold. The processed Table 2 is then stored (step 620). At this point the "document to document" relationship description has been established.
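The pruning of step 618 can be illustrated by the following minimal Python sketch, which assumes a simplified Table 2 holding only the average degree of correlation (as in Table 2B). The function name `prune_table` and the threshold value are hypothetical.

```python
def prune_table(table, threshold):
    """Keep only document pairs whose degree of correlation exceeds threshold.

    table: dict of dicts, table[row_doc][col_doc] = average degree of
    correlation (0.0 to 1.0), an assumed in-memory form of Table 2B.
    Returns a new table; the input is left unchanged.
    """
    return {
        row_doc: {col_doc: score
                  for col_doc, score in columns.items()
                  if score > threshold}
        for row_doc, columns in table.items()
    }


table_2b = {
    "document 1": {"document 2": 0.9, "document 3": 0.2},
    "document 2": {"document 1": 0.9},
    "document 3": {"document 1": 0.2},
}
pruned = prune_table(table_2b, threshold=0.5)
```

Rows whose columns are all removed are kept as empty rows in this sketch; an implementation concerned with storage could drop them as well.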
The query flow for relevant documents is described below.
Fig. 7A is a flowchart of querying the documents related to an entry document according to the associations already established between documents. First, the system prepares to receive the entry document that the user wants to query: in step 704 the system monitors the user's operation interface and determines whether the user has selected an entry document (step 706). Those of ordinary skill in the art will understand that the user's selection can be determined in various ways: in Fig. 9A, by right-clicking the document "Innovation_matters.pdf"; in Fig. 9B, by clicking the document "A General and Flexible Access Control System..."; in Fig. 9C, by entering the file name; or in Fig. 9D, by double-clicking the icon or link of a document, thereby opening the document and viewing its relevant documents.
After receiving the entry document selected by the user, the system queries the relevant documents associated with the entry document. Specifically, in step 708, the system looks up the row index of Table 2 and reads the column data and correlation data corresponding to that row.
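The row lookup of step 708 can be sketched in Python as follows. The nested dictionary used for Table 2 and the name `query_relevant_documents` are assumptions made for illustration; the values reuse the Q1/Q2 example given for Table 2.

```python
# Assumed in-memory form of Table 2: row document -> column document ->
# per-condition degrees of correlation and the average degree of correlation.
TABLE_2 = {
    "document 1": {
        "document 2": {"conditions": {"Q1": 1.0, "Q2": 0.8}, "average": 0.9},
    },
    "document 2": {
        "document 1": {"conditions": {"Q1": 1.0, "Q2": 0.8}, "average": 0.9},
    },
}


def query_relevant_documents(table, entry_document):
    """Read the row indexed by the entry document.

    Returns a list of (relevant document, search conditions, average degree
    of correlation) tuples; an unknown entry document yields an empty list.
    """
    row = table.get(entry_document, {})
    return [(doc, sorted(cell["conditions"]), cell["average"])
            for doc, cell in row.items()]


result = query_relevant_documents(TABLE_2, "document 1")
```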
Before the query results are returned to the user, the system can, preferably, also sort the search conditions and relevant documents. In step 710, the system determines the sort order to be applied. In general there are at least three possible orderings: first, sort the search conditions related to the entry document, then sort the relevant documents under each search condition (see Fig. 10A for the result); second, sort the relevant documents of the entry document, then sort the search conditions under each relevant document (see Fig. 10B for the result); third, do not sort the search conditions, and sort only the relevant documents as a whole (see Fig. 10C for the result). These three sort orders can be applied in any of the query interfaces of Figs. 9A-9D, and they can be used alone or in combination; Fig. 9A, for example, includes both the ordering by search condition first and the ordering by relevant document.
Corresponding to the three possible orderings above, Fig. 7B is the flowchart corresponding to Fig. 10A. In step 722, the system confirms whether sorting is by search condition first. The system then looks up the sort order for search conditions predefined by the user (step 724); if one exists, it is used as the sort order to be applied (step 728), otherwise the system default sort order for search conditions is used (step 726). Next, the system looks up the sort order for the relevant documents indexed by each search condition (step 730); likewise, if one exists, it is used as the sort order to be applied (step 734), otherwise the system default sort order for relevant documents is used (step 732). Finally, the two sort orders so determined are returned (steps 736 and 738).
Fig. 7C corresponds to Fig. 10B. In step 742, the system confirms whether sorting is by relevant document first. The system then looks up the sort order for relevant documents predefined by the user (step 744); if one exists, it is used as the sort order to be applied (step 748), otherwise the system default sort order for relevant documents is used (step 746). Next, the system looks up the sort order for the search conditions indexed by each relevant document (step 750); likewise, if one exists, it is used as the sort order to be applied (step 754), otherwise the system default sort order for search conditions is used (step 752). Finally, the two sort orders so determined are returned (steps 756 and 758).
Fig. 7D corresponds to Fig. 10C. In step 772, the system confirms whether sorting is by relevant document only. The system then looks up the sort order for relevant documents predefined by the user (step 774); if one exists, it is used as the sort order to be applied (step 778), otherwise the system default sort order for relevant documents is used (step 776). Finally, the sort order so determined is returned (steps 780 and 782).
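The three flows of Figs. 7B to 7D share one pattern: use the sort order predefined by the user if it exists, otherwise fall back to the system default. A minimal Python sketch of that pattern follows; the mode names, the sort order labels and the functions `pick_sort_order` and `pending_sort_orders` are hypothetical.

```python
# Assumed system defaults; the labels are placeholders for real sort orders.
SYSTEM_DEFAULTS = {
    "condition": "by_correlation_desc",
    "document": "by_correlation_desc",
}

# Which kinds of item are sorted, and in which sequence, for each mode.
SEQUENCES = {
    "condition_first": ["condition", "document"],  # Fig. 7B / Fig. 10A
    "document_first": ["document", "condition"],   # Fig. 7C / Fig. 10B
    "document_only": ["document"],                 # Fig. 7D / Fig. 10C
}


def pick_sort_order(kind, user_prefs):
    """Use the user's predefined sort order if present, else the default."""
    return user_prefs.get(kind, SYSTEM_DEFAULTS[kind])


def pending_sort_orders(mode, user_prefs):
    """Return the list of (kind, sort order) pairs to be applied."""
    if mode not in SEQUENCES:
        raise ValueError(f"unknown sort mode: {mode}")
    return [(kind, pick_sort_order(kind, user_prefs))
            for kind in SEQUENCES[mode]]


orders = pending_sort_orders("condition_first", {"document": "by_name"})
```

In this example the user has predefined a sort order only for relevant documents, so the search conditions fall back to the assumed system default.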
Note that a sort order denotes not only the order in which the sorted content is arranged but may also include conditions on whether the content is displayed at all. In the example of Fig. 7B and Fig. 10A, where sorting is by search condition first, the query system must first determine the types of search condition to return to the user; as shown in Fig. 8, the system determines "keyword" and "recent modification time" as the search conditions to be displayed. The system then needs to determine the order of "keyword" and "recent modification time", and also how the multiple "keywords" in Table 2 are to be arranged. Typically, search conditions are sorted by their degree of correlation with the entry document (as described above, Table 2 may record the degree of correlation between a search condition and a document), although the present invention does not exclude other sort orders. Preferably, the system can also set thresholds so that only search conditions with a high degree of correlation are sorted. The system then determines, for each search condition, the sort order of its associated relevant documents. Likewise, although it is possible in theory, in practice the system need not display all the relevant documents in Table 2 for a search condition; preferably it sets thresholds so that only relevant documents with a high degree of correlation are sorted and displayed. Typically, relevant documents can be sorted by the degree of correlation based on the search condition (such as the 100% and 80% in Table 2).
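The ordering by search condition first, with a display threshold, can be illustrated by the following minimal Python sketch. It assumes that the row of Table 2 for one entry document is available as a dictionary of per-condition degrees of correlation; the function name `group_by_condition`, the threshold value and the document 3 data are hypothetical. The ordering of the search conditions themselves is not modelled here.

```python
def group_by_condition(row, min_correlation=0.0):
    """Group the relevant documents of one entry document by search condition.

    row: dict mapping a relevant document to {search condition: degree of
    correlation}, an assumed form of one row of Table 2.
    Documents below min_correlation for a condition are not displayed.
    Within each condition, documents are sorted by descending correlation.
    """
    groups = {}
    for document, per_condition in row.items():
        for condition, score in per_condition.items():
            if score >= min_correlation:
                groups.setdefault(condition, []).append((document, score))
    for documents in groups.values():
        documents.sort(key=lambda item: item[1], reverse=True)
    return groups


row_for_document_1 = {
    "document 2": {"Q1": 1.0, "Q2": 0.8},
    "document 3": {"Q1": 0.6},
}
all_groups = group_by_condition(row_for_document_1)
filtered = group_by_condition(row_for_document_1, min_correlation=0.7)
```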
In the examples of Figs. 7C, 10B, 7D and 10C, the relevant documents are sorted first. Typically, relevant documents can be sorted by the average degree of correlation between documents (as described above, Table 2 stores the average degree of correlation), although the present invention does not exclude other possible orderings.
Returning to Fig. 7A, in step 712 the system sorts the relevant documents of the entry document according to the determined sort order for relevant documents and/or search conditions, and returns the result to the user (step 714). The result can be returned in various ways, such as the pop-up window in Fig. 9A, the lists in Figs. 9B and 9D, or the tree structure in Fig. 9C, and of course in other ways not mentioned above.
After the result is returned, the user may click to view the content of a relevant document. Log collector 350 is therefore also used to monitor the user's operations, and the degree of correlation in Table 2 between the entry document and the documents opened by the user is further updated in the manner described above. This ensures that the document correlation data recorded in Table 2 reflect the true state of correlation between documents as promptly and accurately as possible.
In addition, each of the operating processes above can be implemented by an executable program stored in a computer program product. The program product defines the functions of each embodiment and carries a variety of signals, including but not limited to: 1) information permanently stored on non-writable storage media; 2) information stored on writable storage media; or 3) information conveyed to a computer through a communication medium, including wireless communication (for example, through a computer network or a telephone network), in particular information downloaded from the Internet and other networks.
The various embodiments of the present invention can provide many advantages, including those listed in the Summary of the Invention and those that can be derived from the technical solution itself. However, whether or not an embodiment achieves all of the advantages, and whether or not such advantages are regarded as a substantial improvement, should not be construed as limiting the invention. The embodiments described above are for illustration only, and those of ordinary skill in the art can make various modifications and changes to them without departing from the essence of the invention. The scope of the present invention is defined entirely by the appended claims.

Claims (29)

1. A method for establishing associations between documents according to a user's operations on document search results, wherein the document search results are the documents obtained by searching according to a search condition, the method comprising:
monitoring the user's operations on the document search results and, according to the operations, obtaining the documents selected by the user;
storing a relevant document list according to the documents selected by the user, wherein the relevant document list comprises a list of documents that are associated with one another on the basis of the user's operations on the search results;
obtaining the associations between documents according to the relevant document list,
wherein the associations between the documents are described using at least the search condition.
2. The method of claim 1, wherein a document selected by the user is a document that the user opens and browses in the search results.
3. The method of claim 2, further comprising:
judging whether the document selected by the user satisfies the condition for becoming a relevant document; when the condition for becoming a relevant document is satisfied, storing the selected document as a relevant document; and when the condition for becoming a relevant document is not satisfied, not storing the selected document as a relevant document.
4. The method of claim 3, wherein satisfying the condition for becoming a relevant document comprises: the duration for which the user keeps the selected document open exceeding a predetermined duration, or the number of times the user opens the selected document exceeding a predetermined number of openings, or the user opening the selected document within a predetermined time, or the user opening the selected document within a predetermined opening order.
5. The method of claim 2, further comprising:
detecting and storing other information about the relevant documents; and
recording the associations between documents according to the relevant document list and the other information.
6. The method of claim 5, wherein the other information comprises the duration for which the document remains open after the user opens it, or the number of times the user opens the same document, or the time at which the user opens the selected document, or the order in which the user opens the selected documents.
7. The method of claim 5, wherein the step of recording the associations between documents further comprises:
calculating, according to at least the other information, the degree of correlation between documents based on at least one search condition, and describing the associations between the documents by at least the search condition and the degree of correlation between documents.
8. The method of claim 5, 6 or 7, wherein the step of recording the associations between documents further comprises:
calculating, according to at least the other information, the average degree of correlation between documents, and describing the associations between the documents by at least the search condition and the average degree of correlation between documents.
9. The method of any one of claims 5-7, wherein the step of recording the associations between documents is carried out when the system is idle.
10. A method for querying relevant documents according to the associations between documents determined by the method of any one of claims 1-9, the method comprising:
receiving an entry document selected by the user;
querying the relevant documents that are associated with the entry document; and
returning a query result comprising the relevant documents to the user.
11. The method of claim 10, wherein the step of returning the query result to the user further comprises:
returning to the user the search conditions associated with the entry document and the relevant documents associated with the entry document, wherein the relevant documents are indexed by the search conditions.
12. The method of claim 10, wherein the step of returning the query result to the user further comprises:
returning to the user the relevant documents associated with the entry document and the search conditions associated with the entry document, wherein the search conditions are indexed by the relevant documents.
13. The method of claim 11 or 12, further comprising:
determining the sort order of the search conditions returned to the user; and
determining the sort order of the relevant documents returned to the user.
14. The method of claim 10, further comprising:
determining the sort order of the relevant documents returned to the user.
15. The method of claim 10, further comprising:
further recording or updating, according to the relevant documents selected by the user in the query result, the degree of correlation between those relevant documents and the entry document.
16. A device for establishing associations between documents according to a user's operations on document search results, wherein the document search results are the documents obtained by searching according to a search condition, the device comprising:
a component for monitoring the user's operations on the document search results and, according to the operations, obtaining the documents selected by the user;
a component for storing a relevant document list according to the documents selected by the user, wherein the relevant document list comprises a list of documents that are associated with one another on the basis of the user's operations on the search results;
a component for obtaining the associations between documents according to the relevant document list,
wherein the associations between the documents are described using at least the search condition.
17. The device of claim 16, wherein a document selected by the user is a document that the user opens and browses in the search results.
18. The device of claim 17, further comprising:
a component for judging whether the document selected by the user satisfies the condition for becoming a relevant document, wherein when the condition for becoming a relevant document is satisfied, the selected document is stored as a relevant document, and when the condition for becoming a relevant document is not satisfied, the selected document is not stored as a relevant document.
19. The device of claim 18, wherein satisfying the condition for becoming a relevant document comprises: the duration for which the user keeps the selected document open exceeding a predetermined duration, or the number of times the user opens the selected document exceeding a predetermined number of openings, or the user opening the selected document within a predetermined time, or the user opening the selected document within a predetermined opening order.
20. The device of claim 17, further comprising:
a component for detecting and storing other information about the relevant documents; and
a component for recording the associations between documents according to the relevant document list and the other information.
21. The device of claim 20, wherein the other information comprises the duration for which the document remains open after the user opens it, or the number of times the user opens the same document, or the time at which the user opens the selected document, or the order in which the user opens the selected documents.
22. The device of claim 20 or 21, wherein the component for recording the associations between documents further comprises:
a component for calculating, according to at least the other information, the degree of correlation between documents based on at least one search condition, and describing the associations between the documents by at least the search condition and the degree of correlation between documents.
23. The device of claim 20 or 21, wherein the component for recording the associations between documents further comprises:
a component for calculating, according to at least the other information, the average degree of correlation between documents, and describing the associations between the documents by at least the search condition and the average degree of correlation between documents.
24. A device for querying relevant documents according to the associations between documents determined by the device of any one of claims 16-23, the device comprising:
a component for receiving an entry document selected by the user;
a component for querying the relevant documents that are associated with the entry document; and
a component for returning a query result comprising the relevant documents to the user.
25. The device of claim 24, wherein the component for returning the query result to the user further comprises:
a component for returning to the user the search conditions associated with the entry document and the relevant documents associated with the entry document, wherein the relevant documents are indexed by the search conditions.
26. The device of claim 24, wherein the component for returning the query result to the user further comprises:
a component for returning to the user the relevant documents associated with the entry document and the search conditions associated with the entry document, wherein the search conditions are indexed by the relevant documents.
27., also comprise as claim 25 or 26 described devices:
Determine to return to the parts of sortord of user's search condition; And
Determine to return to the parts of sortord of user's relevant documentation.
28. device as claimed in claim 24 also comprises:
Determine to return to the parts of sortord of user's relevant documentation.
29. device as claimed in claim 24 also comprises:
According to the relevant documentation that user in described Query Result chooses, further write down or upgrade the parts of the degree of correlation of described relevant documentation and described inlet document.
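
For illustration only, the query device of claims 24-29 might be sketched in Python as follows. The class name `RelationStore`, its method names, the use of a simple selection count as the degree of correlation, and the ordering rules are all assumptions made for this sketch; the claims do not prescribe any of them.

```python
from collections import defaultdict


class RelationStore:
    """Hypothetical in-memory store:
    entry document -> search condition -> related document -> degree."""

    def __init__(self):
        self._degrees = defaultdict(lambda: defaultdict(dict))

    def record_selection(self, entry, condition, related, weight=1.0):
        """Record or update a degree of correlation after the user picks
        `related` (assumption: the degree is a running selection count)."""
        docs = self._degrees[entry][condition]
        docs[related] = docs.get(related, 0.0) + weight

    def average_degree(self, entry, related):
        """Mean degree over the search conditions linking `related` to `entry`."""
        values = [docs[related]
                  for docs in self._degrees.get(entry, {}).values()
                  if related in docs]
        return sum(values) / len(values) if values else 0.0

    def by_condition(self, entry):
        """Related documents indexed by search condition."""
        result = []
        for condition, docs in self._degrees.get(entry, {}).items():
            # Documents: highest degree first, ties broken by name.
            ranked = sorted(docs.items(), key=lambda kv: (-kv[1], kv[0]))
            result.append((condition, ranked))
        # Assumed ordering: conditions by their strongest document, then by name.
        result.sort(key=lambda item: (-item[1][0][1], item[0]))
        return result

    def by_document(self, entry):
        """Search conditions indexed by related document."""
        conditions = defaultdict(list)
        for condition, docs in self._degrees.get(entry, {}).items():
            for related in docs:
                conditions[related].append(condition)
        # Assumed ordering: documents by average degree, then by name.
        ranked = sorted(conditions,
                        key=lambda d: (-self.average_degree(entry, d), d))
        return [(d, sorted(conditions[d])) for d in ranked]


store = RelationStore()
store.record_selection("docA", "java gc", "docB")
store.record_selection("docA", "java gc", "docB")
store.record_selection("docA", "java gc", "docC")
store.record_selection("docA", "heap tuning", "docC")
print(store.by_condition("docA"))
print(store.by_document("docA"))
```

Here `by_condition` corresponds to the arrangement of claim 25, `by_document` to that of claim 26, the sort keys to the sort orders of claims 27 and 28, and a further call to `record_selection` to the update of claim 29. A real implementation would likely persist the relations and could weight selections differently.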
CNB2006100942198A 2006-06-27 2006-06-27 Method and device for establishing coupled relation between documents Expired - Fee Related CN100524307C (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CNB2006100942198A CN100524307C (en) 2006-06-27 2006-06-27 Method and device for establishing coupled relation between documents
US11/740,431 US7809716B2 (en) 2006-06-27 2007-04-26 Method and apparatus for establishing relationship between documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100942198A CN100524307C (en) 2006-06-27 2006-06-27 Method and device for establishing coupled relation between documents

Publications (2)

Publication Number Publication Date
CN101097574A (en) 2008-01-02
CN100524307C (en) 2009-08-05

Family

ID=38874644

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100942198A Expired - Fee Related CN100524307C (en) 2006-06-27 2006-06-27 Method and device for establishing coupled relation between documents

Country Status (2)

Country Link
US (1) US7809716B2 (en)
CN (1) CN100524307C (en)

Families Citing this family (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7505964B2 (en) 2003-09-12 2009-03-17 Google Inc. Methods and systems for improving a search ranking using related queries
US8661029B1 (en) 2006-11-02 2014-02-25 Google Inc. Modifying search result ranking based on implicit user feedback
US9110975B1 (en) 2006-11-02 2015-08-18 Google Inc. Search result inputs using variant generalized queries
US8938463B1 (en) 2007-03-12 2015-01-20 Google Inc. Modifying search result ranking based on implicit user feedback and a model of presentation bias
US8694374B1 (en) 2007-03-14 2014-04-08 Google Inc. Detecting click spam
US20080228699A1 (en) 2007-03-16 2008-09-18 Expanse Networks, Inc. Creation of Attribute Combination Databases
US9092510B1 (en) 2007-04-30 2015-07-28 Google Inc. Modifying search result ranking based on a temporal element of user feedback
US20090043752A1 (en) 2007-08-08 2009-02-12 Expanse Networks, Inc. Predicting Side Effect Attributes
US8694511B1 (en) 2007-08-20 2014-04-08 Google Inc. Modifying search result ranking based on populations
US8909655B1 (en) 2007-10-11 2014-12-09 Google Inc. Time based ranking
CA2711087C (en) * 2007-12-31 2020-03-10 Thomson Reuters Global Resources Systems, methods, and software for evaluating user queries
US20100005169A1 (en) * 2008-07-03 2010-01-07 Von Hilgers Philipp Method and Device for Tracking Interactions of a User with an Electronic Document
US20100010895A1 (en) * 2008-07-08 2010-01-14 Yahoo! Inc. Prediction of a degree of relevance between query rewrites and a search query
JP5327784B2 (en) * 2008-07-30 2013-10-30 株式会社日立製作所 Computer system, information collection support device, and information collection support method
US7917438B2 (en) 2008-09-10 2011-03-29 Expanse Networks, Inc. System for secure mobile healthcare selection
US20100076950A1 (en) * 2008-09-10 2010-03-25 Expanse Networks, Inc. Masked Data Service Selection
US8200509B2 (en) 2008-09-10 2012-06-12 Expanse Networks, Inc. Masked data record access
JP4633162B2 (en) * 2008-12-01 2011-02-16 株式会社エヌ・ティ・ティ・ドコモ Index generation system, information retrieval system, and index generation method
US8396865B1 (en) 2008-12-10 2013-03-12 Google Inc. Sharing search engine relevance data between corpora
US8255403B2 (en) 2008-12-30 2012-08-28 Expanse Networks, Inc. Pangenetic web satisfaction prediction system
US8386519B2 (en) 2008-12-30 2013-02-26 Expanse Networks, Inc. Pangenetic web item recommendation system
US8108406B2 (en) 2008-12-30 2012-01-31 Expanse Networks, Inc. Pangenetic web user behavior prediction system
US20100169338A1 (en) * 2008-12-30 2010-07-01 Expanse Networks, Inc. Pangenetic Web Search System
US9009146B1 (en) 2009-04-08 2015-04-14 Google Inc. Ranking search results based on similar queries
US8447760B1 (en) 2009-07-20 2013-05-21 Google Inc. Generating a related set of documents for an initial set of documents
US8498974B1 (en) 2009-08-31 2013-07-30 Google Inc. Refining search results
US8972391B1 (en) 2009-10-02 2015-03-03 Google Inc. Recent interest based relevance scoring
US8874555B1 (en) 2009-11-20 2014-10-28 Google Inc. Modifying scoring data based on historical changes
US8615514B1 (en) 2010-02-03 2013-12-24 Google Inc. Evaluating website properties by partitioning user feedback
US8924379B1 (en) 2010-03-05 2014-12-30 Google Inc. Temporal-based score adjustments
US8959093B1 (en) 2010-03-15 2015-02-17 Google Inc. Ranking search results based on anchors
US9623119B1 (en) 2010-06-29 2017-04-18 Google Inc. Accentuating search results
US8832083B1 (en) 2010-07-23 2014-09-09 Google Inc. Combining user feedback
US9002867B1 (en) 2010-12-30 2015-04-07 Google Inc. Modifying ranking data based on document changes
US9015143B1 (en) 2011-08-10 2015-04-21 Google Inc. Refining search results
CN103164444A (en) * 2011-12-14 2013-06-19 联想(北京)有限公司 File processing method, file processing device and file processing electronic equipment
US9183499B1 (en) 2013-04-19 2015-11-10 Google Inc. Evaluating quality based on neighbor features
CN104750762A (en) * 2013-12-31 2015-07-01 华为技术有限公司 Information retrieval method and device
US20150248398A1 (en) * 2014-02-28 2015-09-03 Choosito! Inc. Adaptive reading level assessment for personalized search
CN105022787A (en) * 2015-06-12 2015-11-04 广东小天才科技有限公司 Composition pushing method and apparatus
US10380207B2 (en) * 2015-11-10 2019-08-13 International Business Machines Corporation Ordering search results based on a knowledge level of a user performing the search
CN106528861A (en) * 2016-11-30 2017-03-22 福建中金在线信息科技有限公司 Method and device for adding internal chain
JP7013756B2 (en) * 2017-09-19 2022-02-01 富士フイルムビジネスイノベーション株式会社 Information processing equipment and programs
CN112764617A (en) * 2021-01-22 2021-05-07 维沃移动通信有限公司 File selection method and device, electronic equipment and readable storage medium
CN114611145B (en) * 2022-03-14 2023-01-06 穗保(广州)科技有限公司 Data security sharing platform based on internet online document

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5749081A (en) * 1995-04-06 1998-05-05 Firefly Network, Inc. System and method for recommending items to a user
JP3547069B2 (en) * 1997-05-22 2004-07-28 日本電信電話株式会社 Information associating apparatus and method
US6012053A (en) * 1997-06-23 2000-01-04 Lycos, Inc. Computer system with user-controlled relevance ranking of search results
US7124129B2 (en) * 1998-03-03 2006-10-17 A9.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
US6317722B1 (en) * 1998-09-18 2001-11-13 Amazon.Com, Inc. Use of electronic shopping carts to generate personal recommendations
JP3347088B2 (en) * 1999-02-12 2002-11-20 インターナショナル・ビジネス・マシーンズ・コーポレーション Related information search method and system
US7000194B1 (en) * 1999-09-22 2006-02-14 International Business Machines Corporation Method and system for profiling users based on their relationships with content topics
US6502091B1 (en) * 2000-02-23 2002-12-31 Hewlett-Packard Company Apparatus and method for discovering context groups and document categories by mining usage logs
US6640218B1 (en) * 2000-06-02 2003-10-28 Lycos, Inc. Estimating the usefulness of an item in a collection of information
US6691107B1 (en) 2000-07-21 2004-02-10 International Business Machines Corporation Method and system for improving a text search
US6633868B1 (en) 2000-07-28 2003-10-14 Shermann Loyall Min System and method for context-based document retrieval
US8001118B2 (en) * 2001-03-02 2011-08-16 Google Inc. Methods and apparatus for employing usage statistics in document retrieval
US7231381B2 (en) * 2001-03-13 2007-06-12 Microsoft Corporation Media content search engine incorporating text content and user log mining
US7299270B2 (en) 2001-07-10 2007-11-20 Lycos, Inc. Inferring relations between internet objects that are not connected directly
US7149732B2 (en) * 2001-10-12 2006-12-12 Microsoft Corporation Clustering web queries
US20050021397A1 (en) * 2003-07-22 2005-01-27 Cui Yingwei Claire Content-targeted advertising using collected user behavior data
JP4116329B2 (en) * 2002-05-27 2008-07-09 株式会社日立製作所 Document information display system, document information display method, and document search method
US8086619B2 (en) * 2003-09-05 2011-12-27 Google Inc. System and method for providing search query refinements
US7346839B2 (en) * 2003-09-30 2008-03-18 Google Inc. Information retrieval based on historical data
US7634472B2 (en) * 2003-12-01 2009-12-15 Yahoo! Inc. Click-through re-ranking of images and other data
US7451131B2 (en) * 2003-12-08 2008-11-11 Iac Search & Media, Inc. Methods and systems for providing a response to a query
US7698626B2 (en) 2004-06-30 2010-04-13 Google Inc. Enhanced document browsing with automatically generated links to relevant information
US8572233B2 (en) * 2004-07-15 2013-10-29 Hewlett-Packard Development Company, L.P. Method and system for site path evaluation using web session clustering
US7693818B2 (en) * 2005-11-15 2010-04-06 Microsoft Corporation UserRank: ranking linked nodes leveraging user logs
US7647314B2 (en) * 2006-04-28 2010-01-12 Yahoo! Inc. System and method for indexing web content using click-through features
US8321448B2 (en) * 2007-02-22 2012-11-27 Microsoft Corporation Click-through log mining

Also Published As

Publication number Publication date
US20070299826A1 (en) 2007-12-27
US7809716B2 (en) 2010-10-05
CN101097574A (en) 2008-01-02

Similar Documents

Publication Publication Date Title
CN100524307C (en) Method and device for establishing coupled relation between documents
US9305100B2 (en) Object oriented data and metadata based search
US10394908B1 (en) Systems and methods for modifying search results based on a user's history
JP4721740B2 (en) Program for managing articles or topics
US7783631B2 (en) Systems and methods for managing multiple user accounts
RU2335013C2 (en) Methods and systems for improving search ranging with application of information about article
US9251157B2 (en) Enterprise node rank engine
KR101063364B1 (en) System and method for prioritizing websites during the web crawling process
US7694212B2 (en) Systems and methods for providing a graphical display of search activity
US7747632B2 (en) Systems and methods for providing subscription-based personalization
US8554768B2 (en) Automatically showing additional relevant search results based on user feedback
CN102722498B (en) Search engine and implementation method thereof
US20090187550A1 (en) Specifying relevance ranking preferences utilizing search scopes
CN102722501B (en) Search engine and realization method thereof
US20060224583A1 (en) Systems and methods for analyzing a user's web history
US20060224608A1 (en) Systems and methods for combining sets of favorites
CN102737021B (en) Search engine and realization method thereof
US20080065632A1 (en) Server, method and system for providing information search service by using web page segmented into several inforamtion blocks
CN102722499B (en) Search engine and implementation method thereof
US8380745B1 (en) Natural language search for audience
US20110238653A1 (en) Parsing and indexing dynamic reports
KR100671077B1 (en) Server, Method and System for Providing Information Search Service by Using Sheaf of Pages
JP2002312389A (en) Information retrieving device and information retrieving method
KR100645711B1 (en) Server, Method and System for Providing Information Search Service by Using Web Page Segmented into Several Information Blocks
Suruliandi et al. VALIDATING THE PERFORMANCE OF PERSONALIZATION TECHNIQUES IN SEARCH ENGINE.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090805

Termination date: 20200627
