WO1997046962A1 - Finding an e-mail message to which another e-mail message is a response - Google Patents

Publication number
WO1997046962A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
filtered
messages
match
vector
Prior art date
Application number
PCT/US1997/009161
Other languages
French (fr)
Other versions
WO1997046962A9 (en)
Inventor
Kimberly A. Knowles
David Dolan Lewis
Original Assignee
At & T Corp.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by At & T Corp.
Publication of WO1997046962A1
Publication of WO1997046962A9

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/234 Monitoring or handling of messages for tracking messages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/107 Computer-aided management of electronic mailing [e-mailing]

Definitions

  • This invention relates to electronic messaging and, more particularly, to a way of recognizing and manipulating threads contained in electronic messages
  • a problem with most such approaches is that they process each message individually. Many messages are parts of larger conversations, or threads. A thread is a conversation among two or more participants carried out by exchange of messages.
  • Messaging systems that are explicitly oriented to group discussion, e.g., the Usenet network and other bulletin board systems, provide the most support for threading
  • the reply command in most Usenet news posting programs inserts into a reply or child message two forms of information about the relationship between it and its parent message (the message it is a reply to).
  • the chain of unique message identifiers in the References: field of the parent is copied into the References: field of the child, with the unique identifier of the parent added.
  • the Subject: line of the parent is copied into the Subject: line of the child, typically prefixed by Re:.
  • Usenet news readers providing a threaded display use the structural links from the References: field, while others organize a threaded display around Subject lines which are identical or have identical prefixes
  • Conversations can also be carried out over electronic mail systems.
  • the ability to send to and reply to groups of people, as well as the use of centralized mail "reflectors" and mailing list management software, can informally support multiple large scale discussions.
  • replying to an e-mail message often inserts structural information into the reply
  • the reply command may copy the Message-Id: field or other identifying information from the parent, into the In-Reply-To: field of the child
  • the Subject: line is typically copied to the Subject: field, preceded by Re:.
  • VM mail reader available at ftp.uu.net in networking/mail/vm directory
  • the mail archiving program hypermail (see http://www.eit.com/software/hypermail.html) marks up archives of e-mail with a variety of links, including threading information. It attempts first to find a message id in the In-Reply-To: field and match it to a known message. Failing that, it looks for a matching date string in the In-Reply-To: field, and finally tries for a match on the Subject: line, after removing one Re: tag.
  • In-Reply-To fields are optional and their format and nature is only loosely constrained when they are present.
  • Subject: lines for both Usenet messages and Internet mail are allowed to contain arbitrary text, clients are inconsistent in their use of Re: tags, and manual editing of Subject: lines further confuses the issue.
  • An object of the present invention is to utilize the textual context and characteristics of messages to provide a more reliable and effective way to construct message threads.
  • statistical information retrieval techniques are used in conjunction with textual material obtained by "filtering" of messages to achieve a significant level of accuracy at identifying when one message is a reply to another.
  • FIGURE 2 contains a diagram showing an embodiment of the present invention.
  • Threading of electronic messages should be treated as a language processing task.
  • the present invention utilizes textual context and characteristics of messages in order to provide a more reliable and effective way to construct message threads. Preliminary experiments show that a significant level of threading effectiveness can be
  • the goal in experimentation was to test the ability of various linguistic clues to indicate whether one message was a response to another.
  • Three types of textual material from messages were investigated: (1) the Subject: line; (2) quoted material in the message; and (3) the (unquoted) text of the message itself.
  • the results of the experiments conducted show that statistical information retrieval techniques can achieve a significant level of accuracy at identifying when one message is a reply to another.
  • Text from the Subject line is a good clue that a message belongs to a particular thread, though it may not directly indicate which message in the thread is being replied to. Quoting of material from the parent message, particularly quotes of several lines, is a much stronger form of context. Salton and Buckley in an article entitled “Global Text Matching for Information Retrieval," Science, 253:1012-1015 (August, 1991), showed that text matching on a collection of Usenet messages which included substantial quoted material was highly effective at retrieving related messages, under a definition of relatedness that subsumed the response relationship of interest.
  • Simple message filters were written to extract the three types of textual material (referred to above) from each message: (1) the text of the Subject: field; (2) unquoted text from the message body; and (3) quoted text from the message body. This resulted in three collections of 2435 document representatives, one for each type of textual material. Some messages had empty document representatives in some of the databases (for instance, a message might have no quoted material) and so could not be retrieved from that database.
  • Target messages represented the potential parent messages matched against a given "query” (child) message chosen from the database.
  • the "best" match of the target messages (excluding the query message) for a given query message represents a potential parent message.
  • SMART scores each target message I as b. Processing
  • FIGURE 1 displays the distribution of ranks of the 941 parent documents with respect to each of the five forms of text matching.
  • the value for rank 0 is the number of times a child retrieved its parent as the first document in the ranking, rank 1 indicates how often the parent was second in the ranking, and so on.
  • the child document (which was itself present in the database, though not necessarily in the same form as was used in querying) was removed from the ranking, so that the ranks run from 0 to 2433 instead of 0 to 2434.
  • Salton and Buckley were attempting to find related messages, not just parent messages, and defined all messages with the same Subject: line as being related.
  • the task undertaken by Salton and Buckley is a simpler task than finding the single parent of a message.
  • the curve for quoted queries vs. unquoted messages drops off extremely sharply. In most cases only the single parent messages will have a large block of unquoted text similar to the quoted text of the child.
  • the curve for subject vs. subject (the fifth curve in FIGURE 1) drops sharply at the beginning, after the exhausting of those cases where there are nearly exact matches between the Subject: line of the query and a few documents with the same Subject: line. Later the curve is more gradual reflecting cases where the subject line is common to many messages, or the match is on only a subset of the words.
  • FIGURE 2 shows the flow of message processing in accordance with the present invention.
  • a set of N target messages denoted 1, 2, ..., N
  • any of which may be a parent message to be determined.
  • Each target (potential parent) message at 200 is filtered through a parent message filter A at 210.
  • parent message filter A may extract subject text, unquoted text, or quoted text from each message. The result of the message filtering
  • message filter A at 210 extracts unquoted text from each potential parent, and the set of unquoted text messages for potential parents is at 220.
  • the filtered potential parent messages (1 A , 2 A , ..., N A ) at 220 are then passed along to a
  • Statistical Information Retrieval Function 230 can be the SMART system described above or an equivalent
  • the child, or reply, message CM at 240 is also
  • the child message filter may extract subject text, unquoted text, or quoted text from the child message, producing a filtered child message CM Q at 260.
  • the child message filter at 250 extracts quoted text from the child message CM at 240, producing child quoted text at 260.
  • the filtered child message CM Q is then passed to the Statistical Information Retrieval Function at 230, along with filtered parent messages (1 A , 2 A , ..., N A ).
  • Statistical Information Retrieval Function processes these message components to provide a similarity value table at 270, which represents values (denoted AQ 1 , AQ 2 , ..., AQ N ) each of which is a measure of how likely it is that the corresponding message (1, 2, ..., N) is the parent for the child message CM.
  • the similarity value table at 270 is processed by a maximum value function at 280 from which the maximum value can be determined.
  • the position in the table of the maximum value is a pointer or identifier at 290 that can be used to retrieve the corresponding target message which has been selected as the most likely parent message.
  • This message can now be presented to the user along with the child message in a variety of formats, or simply retained for further processing to produce a thread.
  • a list of potential message pairings -- with or without selecting which one is the actual parent -- may be
  • an alternative step may include establishing a threshold against which the ranking or similarity scores for the child and potential parent messages are measured, and if none of the rankings or similarity scores exceed the threshold, then it would be determined that there is no "match", i.e., no true parent message for that child message.
  • Generating a thread may be accomplished by iteratively applying the method of the present invention as described above. Starting with a perceived child message, a likely parent message is determined using the method. That parent message is then substituted as a new "child" message and its parent (i.e., the grandparent of the original child message) is determined using the same method. Similarly, the grandparent message can then be substituted as yet another "child” message to determine its parent and so forth, so that ultimately a thread of messages having parent-child relationship between successive messages may be obtained.
  • Threads can be determined by linking up successive child-parent pairs. Linking of successive child-parent pairs may be done by, for example, finding a child message (denote as "B") having a parent message (denote as "A") wherein child message "B” is itself a parent message for another child message (denote as "C”); that is, message "A” is the parent of "B” and the
  • An alternative to the embodiment of the present invention described above may be used to obtain a likely child message given a parent message.
  • the basic process using message filters is the same for the alternative embodiment.
  • the differences in the process are the filters used.
  • the best results in determining a parent message given a child message were obtained by using a quoted text filter for the child and an unquoted filter for each of the potential parent messages. Starting with a given parent message, then, the process would involve the use of an unquoted filter on the parent message and a quoted filter for each of the remaining messages (the potential child messages). Once the messages are filtered, the processing essentially takes place as described above.
  • the method of the present invention could be applied to identify a parent message from the messages that have previously arrived.
  • the new message could be checked against the other messages ( in accordance with the method described above for locating a child message from a potential parent) in order to determine a child message for the newly received message.
  • analyzing citation patterns can be used to take this tendency into account.
  • a mail reader might display a revised message while
  • a message without a clear connection to its parent may be similar to another child of the same parent, which does have a clear link.
  • embodiment of the present invention may be obtained as a generalization of the embodiment reflected in FIGURE 2 described above.
  • With reference to the diagram in FIGURE 3, the flow of message processing for the more general embodiment of the present invention will now be described.
  • At 300 is a set of N target messages (denoted 1, 2, ..., N), any of which may be a parent message to be determined.
  • Each target (potential parent) message at 300 is filtered through a parent message filter bank (which may be one or more message filters).
  • the parent message filter bank is shown at 310 in FIGURE 3 as a set of one or more message filters denoted by A, B, ..., K, giving a parent message filter bank of length K.
  • Parent message filters A through K may extract subject text, unquoted text, or quoted text from each message, or they may implement one or more of the "improvements" in message analysis described above (such as, e.g., extracting nested quotations, time information, or cue phrases).
  • the result of the filtering operation is a set of N filtered target (potential parent) message vectors (denoted 1 A , 1 B , ..., 1 K , 2 A , 2 B , ..., 2 K , ..., N A , N B , ..., N K ) at 320, where each filtered parent message is a vector consisting of the K filtered representations of the message, i.e., each element of the vector is the result of one of the K
  • filtered target message 1 is denoted as vector 1 A , 1 B , ... , 1 K , where 1 A represents the result of processing target message 1 through message filter A, etc.).
  • Statistical Information Retrieval Function at 330, which may be the SMART system described above or an equivalent statistically--based retrieval function.
  • the child, or reply, message CM at 340 is also
  • a message filter bank which may be one or more message filters.
  • the child message filter bank is shown at 350 as a set of message filters denoted as Q, R, ..., Z, giving a child message filter bank of length Z-Q+1.
  • the child message filter bank may contain one or more of the same type of potential message filters described above for the parent message filter bank.
  • the child message filter bank produces a filtered child message vector (denoted CM Q , CM R , ..., CM Z ) containing Z-Q+1
  • the filtered child message vector (CM Q , CM R , ..., CM Z ) is then passed to the Statistical Information Retrieval Function at 330, along with the set of filtered parent message vectors (1 A , 1 B , ..., 1 K , 2 A , 2 B , ..., 2 K , ..., N A , N B , ..., N K ).
  • the Statistical Information Retrieval Function processes these message components to provide a similarity value table at 370, with values (denoted AQ 1 , AQ 2 , ..., AQ N , KZ 1 , KZ 2 , ..., KZ N ) representative of the similarity between potential parent and child message components. It may be preferable to combine the columns of values in the similarity value table of 370 using a
  • the combiner function at 372 to provide a single tuple of values at 374, each element of which is a measure of how likely it is that the corresponding message (1, 2, ..., N) is the parent for the child message CM.
  • the combiner function may be a decision procedure based upon machine learning methods.
  • the tuple of values at 374 is processed by a selector function at 380 from which an identifier for the most likely parent message can be determined at 390. For example, if the selector function is the maximum value function described above with
  • the position of the maximum value in the tuple of values is a pointer or identifier at 390 that can be used to retrieve the corresponding target message which has been selected as the most likely parent message.
  • the selected message can now be presented to the user along with the child message in a variety of formats, or simply retained for further processing to produce a thread.
  • each of the parent and child message filter banks may consist of a single message filter or multiple message filters.
  • the present invention may be implemented in any one of a number of known ways. For example, the present invention may be implemented by integrating or combining the techniques of the present invention with an e-mail reader or browser software program. Such a program may be client-based
  • the present invention could be implemented as part of a client-based or server-based message archival software program.
  • the advantages of the present invention do not depend upon the particular mode of operation (i.e., server or client) of a computer or processor through which the techniques herein described are implemented. It will be clear to those skilled in the art that the location of the messages that may be processed in accordance with the invention described herein need not be stored in the same location as the program utilized for carrying out such processing. Indeed, messages may be downloaded to a client station or to a message server from a remote location, such as, e.g., a message database accessible over the Internet or accessible over a corporate intranet.
  • the present invention involves an approach to threading that makes use of a range of individually uncertain, but cumulatively compelling clues as to what is going on in a conversation.
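
The FIGURE 2 flow listed above (filtered child, filtered targets, similarity value table, maximum value function, optional threshold) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `most_likely_parent` and `word_overlap` are hypothetical, and the Jaccard word overlap is only a toy stand-in for a statistical retrieval function such as SMART.

```python
from typing import Callable, Optional, Sequence


def word_overlap(a: str, b: str) -> float:
    """Toy similarity (assumption): Jaccard overlap of lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def most_likely_parent(
    child_text: str,
    target_texts: Sequence[str],
    similarity: Callable[[str, str], float] = word_overlap,
    threshold: float = 0.0,
) -> Optional[int]:
    """Return the index of the best-scoring target, or None if no score exceeds the threshold."""
    if not target_texts:
        return None
    # Similarity value table: one score per potential parent.
    scores = [similarity(child_text, target) for target in target_texts]
    # Maximum value function: the position of the maximum is the identifier.
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > threshold else None


targets = [
    "the proxy should cache get requests",
    "forms need a new input type",
    "lunch on friday",
]
print(most_likely_parent("proxy should cache get", targets))
```

Applying the same function again to the selected parent, with that parent now treated as the child, walks one step further up the thread, as the iterative thread-generation bullet describes.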

Abstract

Current tools for processing e-mail and other messages do not adequately recognize and manipulate threads, i.e., conversations among two or more people carried out by exchange of messages. The present invention utilizes the textual context and characteristics of messages in order to provide a more reliable and effective way to construct message threads. In accordance with the present invention, statistical information retrieval techniques are used in conjunction with textual material obtained by 'filtering' of messages to achieve a significant level of accuracy at identifying when one message is a reply to another.

Description

FINDING AN E-MAIL MESSAGE TO WHICH ANOTHER
E-MAIL MESSAGE IS A RESPONSE
Cross-references to related applications
This application claims the benefit of U.S.
Provisional Application No. 60/019264, filed June 7, 1996, entitled "Finding an E-mail Message to Which Another E-mail Message Is a Response."
Appendix
An attached appendix (pages 25 - 80) has been provided which lists the source code of the programs developed to carry out the experiments described below in connection with the present invention.
Copyright Notice
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
Technical Field
This invention relates to electronic messaging and, more particularly, to a way of recognizing and manipulating threads contained in electronic messages.
Background of the Invention
The volume of electronic messages, such as electronic mail ("e-mail"), is huge and growing. Many users receive more messages than they can handle, which has sparked interest in better message handling software. Almost all e-mail readers now support separating messages into folders, and often allow rules to be defined to do this automatically. Tools for prioritizing and searching messages are also becoming available.
A problem with most such approaches is that they process each message individually. Many messages are parts of larger conversations, or threads. A thread is a conversation among two or more participants carried out by exchange of messages. Treating messages outside of this context may lead to undesirable results. For instance, a system that sorts messages into folders based on their content is unlikely to be 100% accurate. The effectiveness of content-based text categorization systems varies considerably among categories, and accuracies over 95% are rarely reported. This means that threads having as few as 20 component messages will almost always be broken up and distributed into multiple folders by such a system, making it difficult for a reader to follow the conversational structure.
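The arithmetic behind this observation can be sketched as follows. The sketch assumes, purely for illustration, that each message is filed correctly with the same probability and that errors are independent; neither assumption is stated in the source, and the function name is hypothetical.

```python
def prob_thread_intact(per_message_accuracy: float, thread_length: int) -> float:
    """Probability that every message of a thread is filed correctly,
    assuming independent errors with a fixed per-message accuracy."""
    return per_message_accuracy ** thread_length


p_intact = prob_thread_intact(0.95, 20)
print(f"intact: {p_intact:.3f}, split: {1 - p_intact:.3f}")
```

Under these assumptions a 20-message thread survives intact only about 36% of the time even at 95% accuracy, and the figure falls quickly as accuracy drops below that rarely reported level.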
On the other hand, a mail reading interface that understood threads could save users considerable effort. For instance, some programs for reading Usenet news allow users to delete an entire thread at once, greatly reducing the number of messages the user must inspect.
Messaging systems that are explicitly oriented to group discussion, e.g., the Usenet network and other bulletin board systems, provide the most support for threading. For instance, the reply command in most Usenet news posting programs inserts into a reply or child message two forms of information about the relationship between it and its parent message (the message it is a reply to).
First, the chain of unique message identifiers in the References: field of the parent is copied into the References: field of the child, with the unique identifier of the parent added. Second, the Subject: line of the parent is copied into the Subject: line of the child, typically prefixed by Re:. Usenet news readers providing a threaded display use the structural links from the References: field, while others organize a threaded display around Subject: lines which are identical or have identical prefixes.
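The two header conventions just described can be illustrated with a small sketch. The dictionary layout, the function name `build_reply_headers`, and the choice not to stack a second Re: prefix are assumptions made for illustration, not the behavior of any particular news client.

```python
def build_reply_headers(parent: dict) -> dict:
    """Build References and Subject headers for a reply to `parent`.

    `parent` is a hypothetical dict with "Message-ID", and optionally
    "References" and "Subject", as plain strings.
    """
    # Copy the parent's References chain and append the parent's own identifier.
    references = parent.get("References", "").split()
    references.append(parent["Message-ID"])
    # Copy the Subject line, prefixing Re: unless it is already there (assumption).
    subject = parent.get("Subject", "")
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    return {"References": " ".join(references), "Subject": subject}


parent = {"Message-ID": "<b@example>", "References": "<a@example>", "Subject": "Proxy caching"}
print(build_reply_headers(parent))
```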
Conversations, including group discussions, can also be carried out over electronic mail systems. The ability to send to and reply to groups of people, as well as the use of centralized mail "reflectors" and mailing list management software, can informally support multiple large scale discussions. As with bulletin board systems, replying to an e-mail message often inserts structural information into the reply. For Internet-based mail systems, the reply command may copy the Message-Id: field, or other identifying information from the parent, into the In-Reply-To: field of the child. As in Usenet messages, the Subject: line is typically copied to the Subject: field, preceded by Re:.
Some mail clients provide threaded displays, though this is less common than for bulletin board systems. For instance, the VM mail reader (available at ftp.uu.net in networking/mail/vm directory) allows grouping of messages by one of several criteria, including having the same subject line text, the same author, or the same recipient. The mail archiving program hypermail (see http://www.eit.com/software/hypermail.html) marks up archives of e-mail with a variety of links, including threading information. It attempts first to find a message id in the In-Reply-To: field and match it to a known message. Failing that, it looks for a matching date string in the In-Reply-To: field, and finally tries for a match on the Subject: line, after removing one Re: tag.
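The three-step fallback described for hypermail can be sketched as below. This follows only the order of clues given in the text and is not hypermail's actual code; the dict-based message layout and the names `strip_one_re` and `find_parent` are hypothetical.

```python
import re
from typing import Optional


def strip_one_re(subject: str) -> str:
    """Remove a single leading 'Re:' tag, if present."""
    return re.sub(r"^\s*re:\s*", "", subject, count=1, flags=re.IGNORECASE).strip()


def find_parent(child: dict, known: list) -> Optional[dict]:
    """Return the first known message matching the child, trying three clues in order."""
    in_reply_to = child.get("In-Reply-To", "")
    # 1. A message id embedded in the In-Reply-To: field.
    for msg in known:
        if msg.get("Message-Id") and msg["Message-Id"] in in_reply_to:
            return msg
    # 2. A date string embedded in the In-Reply-To: field.
    for msg in known:
        if msg.get("Date") and msg["Date"] in in_reply_to:
            return msg
    # 3. The Subject: line, after removing one Re: tag from the child.
    target = strip_one_re(child.get("Subject", ""))
    for msg in known:
        if target and msg.get("Subject", "").strip() == target:
            return msg
    return None
```

Because only one Re: tag is removed, a child whose subject is "Re: Re: Forms" would not match a parent titled "Forms" in this sketch, which is one of the ways subject-based matching can fail.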
However, the error rate of each of the above approaches is considerable. While the References: field is in theory required for replies to Usenet messages, threading is hampered by clients that delete portions of the References: chain due to limitations on field length. In Internet electronic mail, the Message-Id: and In-Reply-To: fields are optional, and their format and nature are only loosely constrained when they are present. Subject: lines for both Usenet messages and Internet mail are allowed to contain arbitrary text, clients are inconsistent in their use of Re: tags, and manual editing of Subject: lines further confuses the issue. Furthermore, current approaches to threading are to some extent misconceived, as they rely upon rapidly changing conventions in software communication.
While user clients typically insert in messages structural information useful for recovering threads, inconsistencies between clients, loose standards, creative user behavior, and the subjective nature of conversation make current threading systems only partially successful, and the situation is unlikely to change.
One approach to dealing with the above situation is to try to force clients to follow tighter standards for specifying threads. However, such an approach does not appear practical in light of the increasing diversity of clients and the growing interconnection of only partially compatible messaging systems. Tighter standards also do not help in recovering thread structure from archived messages, since deletion of fields such as In-Reply-To: by archiving and digestifying programs is common.
It is also not clear that threads should be identified with trees of reply links. The reply command is often used to avoid retyping a mail address, rather than to continue a conversation. Further, users will disagree about what is on-topic in a thread, and off-topic responses can easily spawn subdiscussions. Conversely, on-topic contributors to a discussion may simply send a message rather than using the reply command.
This suggests that the links desired for display in a threading interface, and which result in structures to be processed as a unit, are actually not objectively defined "pattern-matching" or "structural" links. The link desired to be captured is that of a response in an ongoing discourse. The fact that users are able to participate in online discussions, despite the inadequacies of current threading software, suggests that most messages contain the contextual information needed to understand their place in an ongoing conversation. Thus it is at least possible that an automated system will be able to make use of this information as well to make this conversational structure explicit as a thread.
The role of cohesion or linking between the parts of a dialogue has been recognized. Language provides a variety of mechanisms for achieving this cohesion. One such mechanism is lexical cohesion and, in particular, lexical repetition, that is, the repeating of words in linked parts of a discourse.
The phenomenon of lexical repetition suggests that the similarity of the vocabulary between two messages should be a powerful clue to whether a response relationship exists between them. Measuring the similarity of vocabulary between texts is, of course, a widely used strategy for finding texts with similar topic to a query. Indeed, similarity-based methods have been used to construct hypertexts linking documents or passages of documents on the basis of topic similarity.
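A minimal sketch of vocabulary similarity between two texts is shown below. It uses cosine similarity over raw word counts, which is a simplification chosen for illustration; the tokenizer and the function names are assumptions and no term weighting is applied.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count lowercase alphanumeric tokens (a deliberately simple tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    va, vb = word_counts(a), word_counts(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


print(cosine_similarity("Does the proxy cache POST requests?",
                        "The proxy only caches GET requests."))
```

Texts that repeat each other's words score closer to 1, and texts with no words in common score 0, which is the sense in which shared vocabulary can act as a clue to a response relationship.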
Attempts have also been made to go beyond unlabeled linking to use similarity matching in detecting discourse relations. Hearst's TextTiling algorithm (see M.A. Hearst, "Multi-paragraph Segmentation of Expository Text," 32nd Annual Meeting of the Association for Computational Linguistics, pp. 9-16, Las Cruces, NM, June 27-30, 1994) uses vector space similarity to decompose a text into topically coherent segments. The graph structure of a network of raw similarity links has also been used to infer meta-links corresponding to discourse relations such as comparison and summarization (see J. Allan, "Automatic Hypertext Link Typing," Proceedings of Hypertext '96, 1996). These lines of evidence suggest text similarity could be a clue to the existence of a response relation between messages as well.
What is desired is a way to utilize robust conventions in human communication in place of, or in addition to, software conventions in order to produce an effective message threading system.
Summary of the Invention
An object of the present invention is to utilize the textual context and characteristics of messages to provide a more reliable and effective way to construct message threads. In accordance with the present invention, statistical information retrieval techniques are used in conjunction with textual material obtained by "filtering" of messages to achieve a significant level of accuracy at identifying when one message is a reply to another.
Brief Description of the Drawings
FIGURE 1 shows the results of experimentation for a matching strategy used in an embodiment of the present invention.
FIGURE 2 contains a diagram showing an embodiment of the present invention.
FIGURE 3 contains a diagram showing a more generalized embodiment of the present invention.
Detailed Description
Threading of electronic messages should be treated as a language processing task. The present invention utilizes textual context and characteristics of messages in order to provide a more reliable and effective way to construct message threads. Preliminary experiments show that a significant level of threading effectiveness can be
achieved by applying standard text matching methods from information retrieval techniques to the textual portions of messages. In accordance with the present invention, statistical information retrieval techniques are used in conjunction with textual material obtained by "filtering" of messages to achieve a significant level of accuracy at identifying when one message is a reply to another. A preferred embodiment of the present invention will now be described with reference to the experiments described below. The experiments are meant to be illustrative of the process of the present invention and are not intended to be limiting.
Experiments
The goal in experimentation was to test the ability of various linguistic clues to indicate whether one message was a response to another. Three types of textual material from messages were investigated: (1) the Subject: line; (2) quoted material in the message; and (3) the (unquoted) text of the message itself. The results of the experiments conducted show that statistical information retrieval techniques can achieve a significant level of accuracy at identifying when one message is a reply to another.
Text from the Subject: line is a good clue that a message belongs to a particular thread, though it may not directly indicate which message in the thread is being replied to. Quoting of material from the parent message, particularly quotes of several lines, is a much stronger form of context. Salton and Buckley in an article entitled "Global Text Matching for Information Retrieval," Science, 253:1012-1015 (August, 1991), showed that text matching on a collection of Usenet messages which included substantial quoted material was highly effective at retrieving related messages, under a definition of relatedness that subsumed the response relationship of interest.
Further, the actual text of the reply can be expected, based on the coherence phenomena described earlier, to repeat words from the parent message. Since new material will be present as well, this is expected to be a somewhat weaker clue than the Subject: line and quoted text.
a. Data Set and Preparation
A corpus of 2435 messages posted to the www-talk mailing list during the period February 1994 through July 1994 was obtained from the archives at URL http://www.w3.org/hypertext/WWW/Archive/www-talk.
A total of 941 of these messages had an In-Reply-To: field containing a unique identifier from the Message-Id: field of another message in the corpus. While it is suggested herein that In-Reply-To: links will not always correspond to the discourse response links of interest, they provide a reasonable initial test of the ability of text matching to find connections that are response-like. Therefore, these 941 child-parent pairs were used as ground truth against which methods for finding parent messages were tested.
Simple message filters were written to extract the three types of textual material (referred to above) from each message: (1) the text of the Subject: field; (2) unquoted text from the message body; and (3) quoted text from the message body. This resulted in three collections of 2435 document representatives, one for each type of textual material. Some messages had empty document representatives in some of the databases (for instance, a message might have no quoted material) and so could not be retrieved from that database. The messages in each collection were used as "target" messages for the matching strategies described herein. Target messages represented the potential parent messages matched against a given "query" (child) message chosen from the database. The "best" match of the target messages (excluding the query message) for a given query message represents a potential parent message.
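By way of illustration only, the three message filters described above might be sketched as follows. The function names are hypothetical, and the sketch assumes that a quoted line is one beginning with one or more ">" markers and that headers are separated from the body by a blank line; the filters actually used in the experiments are those of the appendix and may differ.

```python
import re

# Assumption: a body line is "quoted" if it begins with one or more '>' markers.
# Real mailers also use other prefixes, which this sketch does not handle.
QUOTE_PREFIX = re.compile(r"^\s*>")


def split_message(raw):
    """Split a raw message into (headers dict, list of body lines).

    Header continuation lines are ignored for brevity.
    """
    head, _, body = raw.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers, body.splitlines()


def subject_filter(raw):
    """Return the text of the Subject: field, or an empty string."""
    headers, _ = split_message(raw)
    return headers.get("subject", "")


def quoted_filter(raw):
    """Return quoted body lines with their quote markers stripped."""
    _, lines = split_message(raw)
    return "\n".join(
        re.sub(r"^[\s>]+", "", line) for line in lines if QUOTE_PREFIX.match(line)
    )


def unquoted_filter(raw):
    """Return non-empty body lines that carry no quote marker."""
    _, lines = split_message(raw)
    return "\n".join(
        line for line in lines if line.strip() and not QUOTE_PREFIX.match(line)
    )


example = "Subject: Re: caching\nFrom: a@b\n\n> I think proxies help.\nAgreed, they do.\n"
```

Applying each filter to every message in the corpus would yield the three collections of document representatives; a message with no quoted lines simply produces an empty quoted representative.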
Each of the three collections was indexed using Version 11.0 of the SMART experimental text retrieval system, obtained June 13, 1995 from directory pub/smart at ftp.cs.cornell.edu. The SMART text retrieval system uses statistical information retrieval techniques to rank target messages using the cosine similarity formula and a variant of tf x idf weighting. Using the SMART system, target messages were represented as vectors of numeric weights:
<wi1, wi2, ..., wik, ..., wit>
where
Figure imgf000012_0001
and fik is the number of times word k appears in message i. Query messages were similarly represented as vectors
<q1,q2, ...qk, ..., qt> where
Figure imgf000012_0002
Here fk is the number of times word k occurs in the query message, N is the number of messages in the database, and nk is the number of messages containing word k. SMART scores each target message i as
Figure imgf000012_0003
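The vector representation and scoring described above can be illustrated with a minimal sketch. Because the exact weighting formulas appear only in the figures, the sketch assumes a common form consistent with the stated definitions: target weights are cosine-normalized word counts, query weights are cosine-normalized counts multiplied by log(N/nk), and the score is the inner product of the two vectors. The function names and the whitespace tokenizer are hypothetical, and this is not the SMART code.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, with no stemming or stop list.
    return text.lower().split()


def unit(vec):
    """Scale a sparse vector to unit length (cosine normalization)."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}


def rank_targets(query_text, target_texts):
    """Return (target indices ordered best first, raw scores by target index)."""
    docs = [Counter(tokenize(t)) for t in target_texts]
    n = len(docs)
    # nk: number of target messages containing word k
    df = Counter()
    for d in docs:
        df.update(d.keys())
    # Target weights: normalized word counts (assumed form).
    targets = [unit({k: float(f) for k, f in d.items()}) for d in docs]
    # Query weights: counts times log(N / nk), normalized (assumed form).
    q = Counter(tokenize(query_text))
    query = unit({k: f * math.log(n / df[k]) for k, f in q.items() if df[k]})
    scores = [sum(w * t.get(k, 0.0) for k, w in query.items()) for t in targets]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores


targets = [
    "proxy caching improves latency",
    "html forms and tables",
    "image maps in html",
]
order, scores = rank_targets("caching proxy latency", targets)
```

In this sketch a target sharing no words with the query receives a score of zero, which corresponds to a message that is not retrieved.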
b. Processing
Five text matching strategies were tested in the experiments for their ability to retrieve the parent of a message, given text from the child message. For each strategy, all 941 document representatives of identified child messages were run as queries against one of the three databases of 2435 document representatives using the SMART system. This produced a ranking of all 2435 target (that is, potential parent) messages for each query message.
Messages which did not have any words in common with the query were not retrieved. They were assigned random ranks lower than that of any retrieved message. Documents were ranked by the score assigned by the SMART system processing. The code developed for carrying out the processing, message filtering and matching (with the exception of the SMART program which, as noted, was obtained from a publicly-available source) is included in the appendix (pages 25-80), which is filed herewith and expressly incorporated by reference herein.
Each strategy was a choice of what text from a child should be used as a query (i.e., what type of message filter to use for a child message), and what text from target messages (i.e., what type of message filter) should be used to represent them in the database. The five combinations explored were:
Figure imgf000013_0001
c. Experimental Results
FIGURE 1 displays the distribution of ranks of the 941 parent documents with respect to each of the five forms of text matching. The value for rank 0 is the number of times a child retrieved its parent as the first document in the ranking, rank 1 indicates how often the parent was second in the ranking, and so on. In computing the rank of the parent, the child document (which was itself present in the database, though not necessarily in the same form as was used in querying) was removed from the ranking, so that the ranks run from 0 to 2433 instead of 0 to 2434.
Table 1 below shows the number of times the parent was retrieved at rank 0, ranks 0 to 4, and ranks 0 to 9 for each of the search strategies used in the experimentation, over 941 trials. These counts are compared with the values that would be expected if the parent appeared at a random rank between 0 and 2433.
Figure imgf000014_0001
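The random baseline referred to above follows from simple arithmetic, sketched here under the assumption that a randomly placed parent is equally likely to occupy any of the 2434 ranks (0 to 2433). The function names are hypothetical.

```python
def hits_within(parent_ranks, cutoff):
    """Count trials in which the parent appeared at a rank below `cutoff`."""
    return sum(1 for r in parent_ranks if r < cutoff)


def expected_random_hits(trials, candidates, cutoff):
    """Expected count if each parent's rank were uniform over `candidates` ranks."""
    return trials * cutoff / candidates


TRIALS = 941        # child messages with a known parent
CANDIDATES = 2434   # ranks 0..2433, after removing the child itself

# Cutoffs 1, 5 and 10 correspond to rank 0, ranks 0 to 4, and ranks 0 to 9.
baseline = {c: expected_random_hits(TRIALS, CANDIDATES, c) for c in (1, 5, 10)}
```

Under this assumption a random ranking would place the parent at rank 0 in about 0.39 of the 941 trials, within ranks 0 to 4 in about 1.93 trials, and within ranks 0 to 9 in about 3.87 trials.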
Discussion
As expected, using the quoted portion of a message as a query (i.e., child message filter extracts quoted text portion) and matching against the unquoted portions of target messages (i.e., target message filter extracts unquoted text) was the most effective strategy, of the five strategies tried, for finding a parent message. As shown in Table 1, the parent was the highest ranked message in 666 out of 941 trials or 71% of the time (for the quoted query - unquoted target strategy). Put another way, a system that simply assumed the highest ranked message under this matching strategy was the parent would, on average, have 0.71 recall (i.e., retrieval of 71% of the items relevant to the query message) and 0.71 precision (i.e., 71% of the retrieved items are relevant to the query message) at finding parent messages. Of course, these results are for messages that are known to have a parent message. An operational system would need not only to distinguish among potential parents, but also to detect whether or not the message has a parent at all. One way of accomplishing this is to establish a threshold -- which may be preset or specified by a user -- against which the ranking or similarity scores for the child and potential parent messages would be measured. If the highest ranking or similarity score falls below the threshold, then it would be determined that there is no "match", i.e., no true parent message for that child message.
These results can be roughly compared with the 0.90 recall and 0.72 precision in Salton and Buckley's experiments with Usenet messages containing quoted material. However, Salton and Buckley were attempting to find related messages, not just parent messages, and defined all messages with the same Subject: line as being related. The task undertaken by Salton and Buckley is a simpler task than finding the single parent of a message.
Referring again to FIGURE 1, it is apparent that the other strategies tried were not as effective as matching quoted text against unquoted targets, though all were far better than random at finding parent messages. Even matching unquoted text queries against quoted text targets, which preferentially retrieves the children of a message, returns a nontrivial number of parents based on general content similarity. Similarly, matching quoted queries against quoted targets should mostly find siblings of a message, but retrieves some parents due to nested quotations that persist to the child.
How fast the number of parents gained drops off with increasing rank also depends on the matching strategy. As shown in FIGURE 1, the smoothest decay comes from matching unquoted material against unquoted material (the fourth curve in FIGURE 1). This picks up parents based on a general similarity of content rather than repetition of actual text from the parent. The relatively smooth gradation of content similarity which shows up in typical text retrieval systems also shows up here. In contrast, the curve for quoted queries vs. unquoted messages drops off extremely sharply. In most cases only the single parent message will have a large block of unquoted text similar to the quoted text of the child. The curve for subject vs. subject (the fifth curve in FIGURE 1) drops sharply at the beginning, after exhausting those cases where there are nearly exact matches between the Subject: line of the query and a few documents with the same Subject: line. Later the curve is more gradual, reflecting cases where the subject line is common to many messages, or the match is on only a subset of the words.
The diagram in FIGURE 2 shows the flow of message processing in accordance with the present invention. At 200 is a set of N target messages (denoted 1, 2, ..., N), any of which may be a parent message to be determined.
Each target (potential parent) message at 200 is filtered through a parent message filter A at 210. As seen from the experiments described above, parent message filter A may extract subject text, unquoted text, or quoted text from each message. The result of the message filtering operation is a set of filtered target (potential parent) messages (denoted 1A, 2A, ..., NA) at 220. Preferably, based upon the above test results, message filter A at 210 extracts unquoted text from each potential parent, and the set of unquoted text messages for potential parents is at 220.
Continuing, the filtered potential parent messages (1A, 2A, ..., NA) at 220 are then passed along to a Statistical Information Retrieval Function at 230. Statistical Information Retrieval Function 230 can be the SMART system described above or an equivalent statistically-based retrieval function.
The child, or reply, message CM at 240 is also processed using a message filter Q at 250. As discussed above, the child message filter may extract subject text, unquoted text, or quoted text from the child message, producing a filtered child message CMQ at 260. Preferably, based upon the experiments described above, the child message filter at 250 extracts quoted text from the child message CM at 240, producing child quoted text at 260.
The filtered child message CMQ is then passed to the Statistical Information Retrieval Function at 230, along with filtered parent messages (1A, 2A, ..., NA). The Statistical Information Retrieval Function processes these message components to provide a similarity value table at 270, which represents values (denoted AQ1, AQ2, ..., AQN) each of which is a measure of how likely it is that the corresponding message (1, 2, ..., N) is the parent for the child message CM. To determine the most likely parent message, the similarity value table at 270 is processed by a maximum value function at 280 from which the maximum value can be determined. The position in the table of the maximum value is a pointer or identifier at 290 that can be used to retrieve the corresponding target message which has been selected as the most likely parent message. This message can now be presented to the user along with the child message in a variety of formats, or simply retained for further processing to produce a thread. Alternatively, a list of potential message pairings -- with or without selecting which one is the actual parent -- may be presented to the user.
As mentioned above, an alternative step may include establishing a threshold against which the ranking or similarity scores for the child and potential parent messages are measured, and if none of the rankings or similarity scores exceed the threshold, then it would be determined that there is no "match", i.e., no true parent message for that child message.
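The maximum value function and the optional threshold test may be sketched together as follows. This is an illustrative sketch only; the function name and the convention of returning None when there is no "match" are assumptions, not part of the described embodiment.

```python
def select_parent(similarities, threshold=None):
    """Return the index of the best-scoring target message.

    Returns None when the list is empty or when a threshold is given and
    the best score does not exceed it (i.e., no true parent is found).
    """
    if not similarities:
        return None
    # Position of the maximum value in the similarity value table.
    best = max(range(len(similarities)), key=lambda i: similarities[i])
    if threshold is not None and similarities[best] <= threshold:
        return None
    return best


table = [0.10, 0.70, 0.30]   # hypothetical similarity values for targets 0..2
```

With these hypothetical values, target 1 is selected when no threshold is applied, and no parent is reported when the threshold is set to 0.8.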
Generating a thread may be accomplished by iteratively applying the method of the present invention as described above. Starting with a perceived child message, a likely parent message is determined using the method. That parent message is then substituted as a new "child" message and its parent (i.e., the grandparent of the original child message) is determined using the same method. Similarly, the grandparent message can then be substituted as yet another "child" message to determine its parent and so forth, so that ultimately a thread of messages having parent-child relationship between successive messages may be obtained.
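The iterative procedure just described may be sketched as follows. The callable find_parent is hypothetical and stands in for the parent-finding method described above (returning None when no parent is found); the depth limit and the cycle check are added safeguards assumed for the sketch rather than taken from the description.

```python
def trace_ancestors(start, find_parent, max_depth=100):
    """Follow parent links upward from `start`.

    Returns the chain [child, parent, grandparent, ...]. Stops when no
    parent is found, when a message repeats, or after `max_depth` links.
    """
    thread = [start]
    seen = {start}
    current = start
    while len(thread) <= max_depth:
        parent = find_parent(current)
        if parent is None or parent in seen:
            break
        thread.append(parent)
        seen.add(parent)
        current = parent   # the parent becomes the new "child"
    return thread


# Hypothetical parent assignments standing in for the retrieval step.
parents = {"c": "b", "b": "a"}
chain = trace_ancestors("c", parents.get)
```

Reversing the returned chain gives the thread in posting order, from the earliest ancestor to the original child message.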
Another way to generate a thread of messages is to process all messages as child messages against all other messages as potential parent messages (which, in fact, is the technique utilized during experimentation). For each child message, its parent is determined as described above using a statistical information retrieval function and computing similarity values. Threads can be determined by linking up successive child-parent pairs. Linking of successive child-parent pairs may be done by, for example, finding a child message (denote as "B") having a parent message (denote as "A") wherein child message "B" is itself a parent message for another child message (denote as "C"); that is, message "A" is the parent of "B" and the grandparent of "C". Thus, the link of messages would be "A" - "B" - "C", and so on until all messages in the thread are accounted for.
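Linking of child-parent pairs into threads may be sketched as follows, assuming the parent of each child has already been determined and is supplied as a mapping from child identifier to parent identifier. The function name and the representation of a thread as a root-to-leaf list are illustrative assumptions.

```python
from collections import defaultdict


def build_threads(parent_of):
    """Link child-parent pairs into threads.

    `parent_of` maps each child id to its parent id. Each returned thread
    is a list running from a root message (one with no known parent) down
    to a leaf message (one with no known children).
    """
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    # Roots are parents that are not themselves anyone's child.
    roots = [m for m in children if m not in parent_of]
    threads = []

    def walk(node, path):
        path = path + [node]
        kids = children.get(node, [])
        if not kids:
            threads.append(path)
        for kid in kids:
            walk(kid, path)

    for root in roots:
        walk(root, [])
    return threads


linked = build_threads({"B": "A", "C": "B"})
```

Where a message has several children, this sketch emits one list per branch, so that a parent "A" with children "B" and "C" yields the two threads "A" - "B" and "A" - "C".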
An alternative to the embodiment of the present invention described above may be used to obtain a likely child message given a parent message. The basic process using message filters is the same for the alternative embodiment. The differences in the process are the filters used. For example, in the experiments described above, the best results in determining a parent message given a child message were obtained by using a quoted text filter for the child and an unquoted filter for each of the potential parent messages. Starting with a given parent message, then, the process would involve the use of an unquoted filter on the parent message and a quoted filter for each of the remaining messages (the potential child messages). Once the messages are filtered, the processing essentially takes place as described above.
It is readily apparent that one way of utilizing the present invention is with batch processing of messages such as, e.g., would be done in connection with message archiving. Another way of utilizing the method of the present invention, however, is in the processing of incoming messages as they arrive, rather than waiting for a batch to accumulate. For example, when a new message arrives, the method of the present invention could be applied to identify a parent message from the messages that have previously arrived. In addition, in the event that the messages are received out of order, the new message could be checked against the other messages (in accordance with the method described above for locating a child message from a potential parent) in order to determine a child message for the newly received message.
A variety of improvements in the basic processing scheme described above are possible. By improving processing of document text, as well as making use of additional evidence, it is believed that the above results can be greatly improved. The improvements, each of which might be viewed as a message "filter," are as follows.
(1) Better Text Representation. The above-described experiments ignored the order of words when matching query messages against potential parents. This is sensible for detecting similarity of topic, as is the goal in matching unquoted text against unquoted text. A quotation in a child message, however, is likely to repeat a long sequence of words from the parent. Indexing, matching, and term weighting based on multi-word phrases or entire lines should greatly reduce the number and strength of spurious matches. Since header material (From: lines, etc.) can appear in quotes as well, matching should be allowed on this material as well.
(2) Nested Quotation. Multiple levels of quotation are common in electronic messaging, and are indicated by concatenated prefixes. For instance, if textual material is prefixed by ">> >", it would be expected that the parent message has the material prefixed by "> >", or perhaps by ">", but probably not by nothing and certainly not by " |" or "*". Concatenated Re: tags appear in Subject: lines, but should be statistically characterized, since their use by mailers is erratic.
(3) Time. Most replies to a message occur within a window of a few days after the message is posted. A simple statistical model, perhaps similar to those used in analyzing citation patterns, can be used to take this tendency into account.
(4) Recognizing Other Message Relationships. Duplicated, bounced, reposted, continued, and revised messages have strong textual similarity to other messages. The experimental data showed cases where they were falsely construed as replies. If treated simply as nonreplies they are likely to distort statistical models distinguishing replies from nonreplies. A better approach is to model these other message relationships as well, both to distinguish them from response relationships and to provide additional useful links between messages. For instance, a mail reader might display a revised message while backgrounding the original.
(5) Authorship Information. Replies often refer to the author of the parent message, either in an automatically produced fashion (such as):
lewis@research.att.com (David L. Lewis) writes:
>I'd really like a threading email reader.
or via a manually written salutation (e.g., Dear Susan). These may be matched against header information of messages and manually or automatically produced signatures.
(6) Cue Phrases. In responses which do not directly quote the parent message, the author will often use linguistic cues to indicate the parent message, e.g. I really like the suggestion that... or your argument is.... Considerable research has been done on distinguishing what relationship a particular cue phrase is indicating, and that research can be applied here.
(7) Message Categorization. Certain types of messages such as calls for papers and job ads are unlikely to be replies to other messages and/or are unlikely to be replied to publicly. Known text categorization methods can detect these and provide evidence against the presence of response links.
(8) Detection of Siblings. A message without a clear connection to its parent may be similar to another child of the same parent, which does have a clear link. For instance, two people may post similar responses objecting to an error in the parent message, but only one uses the reply command.
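Two of the improvements listed above, nested quotation (2) and authorship information (5), lend themselves to simple feature detectors, sketched here for illustration. The function names, the three plausibility labels, and the regular expressions are assumptions made for the sketch; only ">" quote prefixes and the "address (name) writes:" form shown above are handled.

```python
import re

# Assumption: nesting depth is the number of '>' markers in the line prefix.
_QUOTE_PREFIX = re.compile(r"^(?:\s*>)+")
# Assumption: attribution lines have the form "address (Name) writes:".
_ATTRIBUTION = re.compile(r"^(?P<addr>\S+@\S+)\s+\((?P<name>[^)]*)\)\s+writes:$")


def quote_depth(line):
    """Number of '>' markers in the quotation prefix of a line."""
    m = _QUOTE_PREFIX.match(line)
    return m.group(0).count(">") if m else 0


def prefix_plausibility(child_line, parent_line):
    """Grade how well the quote depths fit a child-parent relationship."""
    child, parent = quote_depth(child_line), quote_depth(parent_line)
    if child - parent == 1:
        return "expected"      # parent carries one less level of quotation
    if child - parent > 1 and parent > 0:
        return "possible"      # levels were skipped, but parent is quoted
    return "unlikely"


def parse_attribution(line):
    """Return (address, name) from an attribution line, or None."""
    m = _ATTRIBUTION.match(line.strip())
    return (m.group("addr"), m.group("name")) if m else None
```

The address and name recovered from an attribution line could then be compared with the From: header of each potential parent, providing a further uncertain clue alongside the textual similarity values.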
All of the above improvements are, in effect, clues that provide evidence toward the presence or absence of response links, but in all cases this evidence is uncertain. A planned strategy is to implement the clues so as to reduce their uncertainty as much as is reasonable, but then to rely on machine learning methods known to those skilled in the art to combine these multiple uncertain clues into a decision procedure. This approach to complex information retrieval problems allows the system implementer to focus on the relatively clean task of building feature detectors, while letting a learning algorithm use training data to balance the uncertain relationship of those features to the property of interest. (Two articles provide good examples of this strategy: B. Croft, J. Callan & J. Broglio, "Trec-2 routing and Ad-hoc Retrieval Evaluation Using the Inquery System," in The Second Text Retrieval Conference (D.K. Harman, ed., Gaithersburg, MD, March 1994, U.S. Dept. of Commerce, National Institute of Standards and Technology (NIST) Special Publication 500-215) pp. 75-83; and E. Spertus, "Smokey: Automatic Flame Recognition," Manuscript, Computer Science Department, Massachusetts Institute of Technology, 1996, submitted to ACM SIGIR-96.) In addition, this approach allows the system to be tailored to user preferences as expressed, for instance, through their overriding of system decisions. This is desirable, since the presence of a response link is to some degree subjective.
Each of the above-referenced improvements may be utilized as message filters alone or in combinations with one another and with the "subject text," "quoted text" and "unquoted text" message filters that were the subject of the experiments described herein. Accordingly, an embodiment of the present invention may be obtained as a generalization of the embodiment reflected in FIGURE 2 described above. With reference to the diagram in FIGURE 3, the flow of message processing for the more general embodiment of the present invention will now be described.
As shown in FIGURE 3, at 300 is a set of N target messages (denoted 1, 2, ..., N), any of which may be a parent message to be determined. Each target (potential parent) message at 300 is filtered through a parent message filter bank (which may be one or more message filters).
The parent message filter bank is shown at 310 in FIGURE 3 as a set of one or more message filters denoted by A, B, ..., K, giving a parent message filter bank of length K. Parent message filters A through K may extract subject text, unquoted text, or quoted text from each message, or they may implement one or more of the "improvements" in message analysis described above (such as, e.g., extracting nested quotations, time information, or cue phrases). The result of the filtering operation is a set of N filtered target (potential parent) message vectors (denoted 1A, 1B, ..., 1K, 2A, 2B, ..., 2K, ..., NA, NB, ..., NK) at 320, where each filtered parent message is a vector consisting of the K filtered representations of the message, i.e., each element of the vector is the result of one of the K filtering operations (e.g., filtered target message 1 is denoted as vector 1A, 1B, ..., 1K, where 1A represents the result of processing target message 1 through message filter A, etc.). These filtered potential parent messages at 320 are then passed along to Statistical Information Retrieval Function at 330, which may be the SMART system described above or an equivalent statistically-based retrieval function.
The child, or reply, message CM at 340 is also processed using a message filter bank (which may be one or more message filters). In FIGURE 3, the child message filter bank is shown at 350 as a set of message filters denoted as Q, R, ..., Z, giving a child message filter bank of length Z-Q+1. The child message filter bank may contain one or more of the same type of potential message filters described above for the parent message filter bank. The child message filter bank produces a filtered child message vector (denoted CMQ, CMR, ..., CMZ) containing Z-Q+1 filtered representations of the message at 360.
The filtered child message vector (CMQ, CMR, ..., CMZ) is then passed to the Statistical Information Retrieval Function at 330, along with the set of filtered parent message vectors (1A, 1B, ..., 1K, 2A, 2B, ..., 2K, ..., NA, NB, ..., NK). The Statistical Information Retrieval Function processes these message components to provide a similarity value table at 370, with values (denoted AQ1, AQ2, ..., AQN, KZ1, KZ2, ..., KZN) representative of the similarity between potential parent and child message components. It may be preferable to combine the columns of values in the similarity value table of 370 using a combiner function at 372 to provide a single tuple of values at 374, each element of which is a measure of how likely it is that the corresponding message (1, 2, ..., N) is the parent for the child message CM. As discussed above, the combiner function may be a decision procedure based upon machine learning methods. To determine the most likely parent message, the tuple of values at 374 is processed by a selector function at 380 from which an identifier for the most likely parent message can be determined at 390. For example, if the selector function is the maximum value function described above with reference to FIGURE 2, the position of the maximum value in the tuple of values is a pointer or identifier at 390 that can be used to retrieve the corresponding target message which has been selected as the most likely parent message. The selected message can now be presented to the user along with the child message in a variety of formats, or simply retained for further processing to produce a thread.
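The combiner and selector functions of FIGURE 3 may be sketched as follows. A fixed weighted sum is used here purely as a stand-in for a combiner obtained by machine learning; the weights, the function names, and the layout of the similarity value table (one row per target message, one entry per filter pairing) are assumptions made for illustration.

```python
def combine(similarity_table, weights):
    """Collapse the similarity values for each target into a single value.

    `similarity_table` holds one row per target message; each row holds one
    similarity value per filter pairing (e.g., AQ, ..., KZ). `weights` holds
    one hypothetical weight per filter pairing.
    """
    return [sum(w * s for w, s in zip(weights, row)) for row in similarity_table]


def select(combined):
    """Selector function: position of the maximum combined value."""
    return max(range(len(combined)), key=lambda i: combined[i])


# Hypothetical values for three targets and two filter pairings.
table = [
    [0.2, 0.9],
    [0.8, 0.1],
    [0.1, 0.1],
]
weights = [0.7, 0.3]
combined = combine(table, weights)
choice = select(combined)
```

In this example the second target (index 1) is selected because the first filter pairing is weighted more heavily; a learned combiner would instead set such weights, or a more general decision procedure, from training data.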
Those skilled in the art will recognize that in the latter-described embodiment of the present invention, each of the parent and child message filter banks may consist of a single message filter or multiple message filters. Those skilled in the art will further appreciate that the present invention may be implemented in any one of a number of known ways. For example, the present invention may be implemented by integrating or combining the techniques of the present invention with an e-mail reader or browser software program. Such a program may be client-based (i.e., found locally within an individual's personal computer) or server-based (i.e., found in a computer or gateway remote from the individual reader). As another example, the present invention could be implemented as part of a client-based or server-based message archival software program. The advantages of the present invention do not depend upon the particular mode of operation (i.e., server or client) of a computer or processor through which the techniques herein described are implemented. It will be clear to those skilled in the art that the messages that may be processed in accordance with the invention described herein need not be stored in the same location as the program utilized for carrying out such processing. Indeed, messages may be downloaded to a client station or to a message server from a remote location, such as, e.g., a message database accessible over the Internet or accessible over a corporate intranet.
In summary, instead of attempting to solve the e-mail threading problem by forcing more consistency in the use of structural links by client software, the present invention involves an approach to threading that makes use of a range of individually uncertain, but cumulatively compelling clues as to what is going on in a conversation.
What has been described is merely illustrative of the application of the principles of the present invention. Other arrangements and methods can be implemented by those skilled in the art without departing from the spirit and scope of the present invention.
[Appendix, pages 25-80: code for the processing, message filtering and matching, reproduced as page images imgf000027_0001 through imgf000082_0001.]

Claims

What is Claimed is:
1. A method of determining from a plurality of messages a second message that is related to a first message, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. determining from each of the degrees of match which one of the plurality of messages is the second message.
2. The method according to claim 1, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
3. The method according to claim 1, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
4. The method according to claim 1, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
5. The method according to claim 1, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
6. The method according to claim 4, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vectors further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
7. The method according to claim 6, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the tuple of values representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
8. The method according to claim 1, further comprising the step of, if the first message is contained in the plurality of messages, removing the first message from the plurality of messages before filtering the plurality of messages using the second message filter bank.
9. The method according to claim 1, further comprising the step of verifying that the second message is related to the first message.
10. The method according to claim 9, wherein the step of verifying that the second message is related to the first message includes determining whether the degree of match between the filtered first message vector and the filtered second message vector corresponding to the determined second message exceeds a threshold value.
11. The method according to claim 1, further comprising the step of presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
12. A method of determining from a plurality of messages whether a second message is related to a first message, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. determining for each of the set of filtered second message vectors whether the degree of match between the filtered first message vector and the filtered second message vector exceeds a threshold value.
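Steps a through d of claim 12 can be sketched as follows. This is a minimal illustration only: the claims do not fix a particular match function, so cosine similarity over bag-of-words counts is assumed here as a stand-in, the filter banks are assumed to pass the whole message through, and the names `term_vector`, `cosine`, `related_messages` and the threshold of 0.5 are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts; an assumed stand-in for a filtered message vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors (assumed match function)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def related_messages(first, candidates, threshold=0.5):
    """Return one boolean per candidate: does its degree of match exceed the threshold?"""
    first_vec = term_vector(first)                            # step a
    candidate_vecs = [term_vector(c) for c in candidates]     # step b
    scores = [cosine(first_vec, v) for v in candidate_vecs]   # step c
    return [score > threshold for score in scores]            # step d
```

A caller would pass the first message and the candidate pool as plain strings and read the result positionally, one flag per candidate.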
13. A method of processing a plurality of messages that may be related to a first message, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
14. A method of determining a thread of related messages from a plurality of messages, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. if the first message is contained in the plurality of messages, removing the first message from the plurality of messages;
c. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
d. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector;
e. determining from each of the degrees of match whether one of the plurality of messages is a second message related to the first message; and
f. if it is determined that one of the plurality of messages is a second message related to the first message, substituting the second message in place of the first message and repeating each of the steps a through f herein.
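The iteration in claim 14 can be sketched as a loop that walks from a message to its likely parent, then to that parent's parent, and so on. The sketch below assumes the child-to-parent direction, quoting marked by a leading ">", cosine similarity as the match function, and a threshold of 0.3; all function names are hypothetical and none of these choices is stated in the claims.

```python
import math
import re
from collections import Counter


def vec(text):
    """Bag-of-words term counts (assumed vector representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(u, v):
    dot = sum(count * v[term] for term, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def quoted(text):
    """Lines that start with '>' (assumed quoting convention), markers removed."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line.lstrip("> ") for line in lines if line.startswith(">"))


def unquoted(text):
    return "\n".join(line for line in text.splitlines() if not line.strip().startswith(">"))


def trace_thread(first, messages, threshold=0.3):
    """Follow likely child-to-parent links starting at `first`; return the chain found."""
    thread = [first]
    pool = list(messages)
    current = first
    while True:
        first_vec = vec(quoted(current))                    # step a
        pool = [m for m in pool if m != current]            # step b
        if not pool or not first_vec:
            break                                           # nothing quoted, or nothing left to match
        scores = [cosine(first_vec, vec(unquoted(m))) for m in pool]  # steps c and d
        best = max(range(len(pool)), key=scores.__getitem__)
        if scores[best] <= threshold:                       # step e: no related message found
            break
        current = pool[best]                                # step f: substitute and repeat
        thread.append(current)
    return thread
```

The loop ends because each pass removes the current message from the pool, so the pool shrinks until no candidate clears the threshold.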
15. The method according to claim 14, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
16. The method according to claim 14, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
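The quoted and unquoted message filters named in claims 15 and 16 might look like the sketch below. It assumes the common plain-text convention that quoted lines begin with ">"; the claims do not specify how quoting is detected, and the function names are hypothetical.

```python
def split_quoted(text, prefix=">"):
    """Return (quoted, unquoted) text, treating lines that start with `prefix` as quoted."""
    quoted, unquoted = [], []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(prefix):
            # Drop the quote markers so the text can be compared with the original message.
            quoted.append(stripped.lstrip(prefix + " "))
        else:
            unquoted.append(line)
    return "\n".join(quoted), "\n".join(unquoted)


def quoted_filter(text):
    """Message filter that extracts the quoted portion."""
    return split_quoted(text)[0]


def unquoted_filter(text):
    """Message filter that extracts the unquoted portion."""
    return split_quoted(text)[1]
```

For the parent-to-child case of claim 15, `quoted_filter` would sit in the first message filter bank and `unquoted_filter` in the second; claim 16 swaps the two.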
17. The method according to claim 14, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
18. The method according to claim 14, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
19. The method according to claim 17, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
20. The method according to claim 19, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the vector representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
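Claims 19 and 20 combine several match values into one and then pick the maximum. A minimal sketch follows, assuming a weighted average as the combining rule; the claims do not say how the values are combined, and `combine` and `best_match` are hypothetical names.

```python
def combine(values, weights=None):
    """Collapse a set of match values into a single value (assumed: weighted average)."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def best_match(score_sets, weights=None):
    """Return (index, combined value) of the candidate with the highest combined value."""
    combined = [combine(scores, weights) for scores in score_sets]
    best = max(range(len(combined)), key=combined.__getitem__)
    return best, combined[best]
```

Each inner list in `score_sets` would hold the values produced for one candidate message, for example one value per pair of filters in the two filter banks.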
21. A method of determining a thread of related messages from a plurality of messages, comprising the steps of:
a. generating a set of filtered first message vectors by filtering each of the plurality of messages using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between each of the filtered first message vectors and the filtered second message vector;
d. determining from each of the degrees of match each one of the plurality of messages that is related to another of the plurality of messages; and
e. determining from each of the plurality of messages that is related to another of the plurality of messages a linked list of messages having successive parent-child relationships.
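The all-pairs approach of claim 21 can be sketched by scoring every message's quoted text against every other message's unquoted text, keeping the best parent per message, and chaining the links. As before, ">" quoting, cosine similarity, the 0.3 threshold, and all names are assumptions made for illustration; chains are returned as lists of message indices, oldest first.

```python
import math
import re
from collections import Counter


def vec(text):
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(u, v):
    dot = sum(count * v[term] for term, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def quoted(text):
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line.lstrip("> ") for line in lines if line.startswith(">"))


def unquoted(text):
    return "\n".join(line for line in text.splitlines() if not line.strip().startswith(">"))


def link_threads(messages, threshold=0.3):
    """Return chains of message indices with successive parent-child relationships."""
    first_vecs = [vec(quoted(m)) for m in messages]       # step a
    second_vecs = [vec(unquoted(m)) for m in messages]    # step b
    parent = {}
    for child, q in enumerate(first_vecs):
        scores = [(cosine(q, u), j) for j, u in enumerate(second_vecs) if j != child]  # step c
        if scores:
            score, j = max(scores)
            if score > threshold:                         # step d
                parent[child] = j
    has_child = set(parent.values())
    chains = []
    for leaf in range(len(messages)):                     # step e: walk up from each leaf
        if leaf in has_child or leaf not in parent:
            continue
        chain = [leaf]
        while chain[-1] in parent and parent[chain[-1]] not in chain:
            chain.append(parent[chain[-1]])
        chains.append(chain[::-1])
    return chains
```

Messages with no parent and no child are left out of the result, since they form no parent-child relationship.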
22. A system for determining from a plurality of messages a second message that is related to a first message, comprising:
a. a processor; and
b. memory;
wherein said processor is programmed to execute the steps of:
1. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
2. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
3. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
4. determining from each of the degrees of match which one of the plurality of messages is the second message.
23. The system according to claim 22, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
24. The system according to claim 22, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
25. The system according to claim 22, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
26. The system according to claim 22, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
27. The system according to claim 25, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
28. The system according to claim 27, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the tuple of values representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
29. The system according to claim 22, further comprising the step of, if the first message is contained in the plurality of messages, removing the first message from the plurality of messages before filtering the plurality of messages using the second message filter bank.
30. The system according to claim 22, further comprising the step of verifying that the second message is related to the first message.
31. The system according to claim 30, wherein the step of verifying that the second message is related to the first message includes determining whether the degree of match between the filtered first message vector and the filtered second message vector corresponding to the determined second message exceeds a threshold value.
32. The system according to claim 22, further comprising the step of presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
33. A system for determining from a plurality of messages whether a second message is related to a first message, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. determining for each of the set of filtered second message vectors whether the degree of match between the filtered first message vector and the filtered second message vector exceeds a threshold value.
34. A system for processing a plurality of messages that may be related to a first message, comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
35. A system for determining a thread of related messages from a plurality of messages, comprising:
a. a processor; and
b. memory;
wherein said processor is programmed to execute the steps of:
1. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
2. if the first message is contained in the plurality of messages, removing the first message from the plurality of messages;
3. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
4. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector;
5. determining from each of the degrees of match whether one of the plurality of messages is a second message related to the first message; and
6. if it is determined that one of the plurality of messages is a second message related to the first message, substituting the second message in place of the first message and repeating each of the steps a through f herein.
36. The system according to claim 35, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
37. The system according to claim 35, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
38. The system according to claim 35, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
39. The system according to claim 35, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
40. The system according to claim 38, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
41. The system according to claim 40, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the vector representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
42. A system for determining a thread of related messages from a plurality of messages, comprising the steps of:
a. generating a set of filtered first message vectors by filtering each of the plurality of messages using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between each of the filtered first message vectors and the filtered second message vector;
d. determining from each of the degrees of match each one of the plurality of messages that is related to another of the plurality of messages; and
e. determining from each of the plurality of messages that is related to another of the plurality of messages a linked list of messages having successive parent-child relationships.
43. An article of manufacture, comprising a computer-readable medium having stored thereon instructions for determining from a plurality of messages a second message that is related to a first message, said instructions which, when performed by a processor, cause the processor to execute the steps comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. determining from each of the degrees of match which one of the plurality of messages is the second message.
44. The article of manufacture according to claim 43, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
45. The article of manufacture according to claim 43, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
46. The article of manufacture according to claim 43, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
47. The article of manufacture according to claim 43, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
48. The article of manufacture according to claim 46, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
49. The article of manufacture according to claim 48, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the tuple of values representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
50. The article of manufacture according to claim 43, further comprising the step of, if the first message is contained in the plurality of messages, removing the first message from the plurality of messages before filtering the plurality of messages using the second message filter bank.
51. The article of manufacture according to claim 43, further comprising the step of verifying that the second message is related to the first message.
52. The article of manufacture according to claim 51, wherein the step of verifying that the second message is related to the first message includes determining whether the degree of match between the filtered first message vector and the filtered second message vector corresponding to the determined second message exceeds a threshold value.
53. The article of manufacture according to claim 43, further comprising the step of presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
54. An article of manufacture comprising a computer-readable medium having stored thereon instructions for determining from a plurality of messages whether a second message is related to a first message, said instructions which, when performed by a processor, cause the processor to execute the steps comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. determining for each of the set of filtered second message vectors whether the degree of match between the filtered first message vector and the filtered second message vector exceeds a threshold value.
55. An article of manufacture comprising a computer-readable medium having stored thereon instructions for processing a plurality of messages that may be related to a first message, said instructions which, when performed by a processor, cause the processor to execute the steps comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector; and
d. presenting a list including the first message, at least one of the plurality of messages, and the degree of match between the filtered first message vector and the filtered second message vector corresponding to the at least one of the plurality of messages.
56. An article of manufacture, comprising a computer-readable medium having stored thereon instructions for determining a thread of related messages from a plurality of messages, said instructions which, when performed by a processor, cause the processor to execute the steps comprising the steps of:
a. generating a filtered first message vector by filtering the first message using a first message filter bank, said first message filter bank comprising at least one message filter;
b. if the first message is contained in the plurality of messages, removing the first message from the plurality of messages;
c. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
d. determining for each of the set of filtered second message vectors the degree of match between the filtered first message vector and the filtered second message vector;
e. determining from each of the degrees of match whether one of the plurality of messages is a second message related to the first message; and
f. if it is determined that one of the plurality of messages is a second message related to the first message, substituting the second message in place of the first message and repeating each of the steps a through f herein.
57. The article of manufacture according to claim 56, wherein the relationship of the second message to the first message is parent to child;
wherein the first message filter bank comprises a message filter that extracts a quoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered.
58. The article of manufacture according to claim 56, wherein the relationship of the second message to the first message is child to parent;
wherein the first message filter bank comprises a message filter that extracts an unquoted portion of the message being filtered; and
wherein the second message filter bank comprises a message filter that extracts a quoted portion of the message being filtered.
59. The article of manufacture according to claim 56, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector comprises use of a statistical information retrieval function.
60. The article of manufacture according to claim 56, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which one of each of the degrees of match is the maximum value and selecting the message corresponding to the determined maximum value.
61. The article of manufacture according to claim 59, wherein the step of determining the degree of match between the filtered first message vector and the filtered second message vector further comprises combining a set of values resulting from the statistical information retrieval function to form a single value representative of the degree of match.
62. The article of manufacture according to claim 61, wherein the step of determining from each of the degrees of match which one of the plurality of messages is the second message comprises determining which element of the vector representative of each of the degrees of match is the maximum value, and selecting the message corresponding to the determined maximum value.
63. An article of manufacture comprising a computer-readable medium having stored thereon instructions for determining a thread of related messages from a plurality of messages, said instructions which, when performed by a processor, cause the processor to execute the steps comprising the steps of:
a. generating a set of filtered first message vectors by filtering each of the plurality of messages using a first message filter bank, said first message filter bank comprising at least one message filter;
b. generating a set of filtered second message vectors by filtering each of the plurality of messages using a second message filter bank, said second message filter bank comprising at least one message filter;
c. determining for each of the set of filtered second message vectors the degree of match between each of the filtered first message vectors and the filtered second message vector;
d. determining from each of the degrees of match each one of the plurality of messages that is related to another of the plurality of messages; and
e. determining from each of the plurality of messages that is related to another of the plurality of messages a linked list of messages having successive parent-child relationships.
PCT/US1997/009161 1996-06-07 1997-05-30 Finding an e-mail message to which another e-mail message is a response WO1997046962A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US1926496P 1996-06-07 1996-06-07
US60/019,264 1996-06-07
US08/866,196 US5905863A (en) 1996-06-07 1997-05-30 Finding an e-mail message to which another e-mail message is a response
US08/866,196 1997-05-30

Publications (2)

Publication Number Publication Date
WO1997046962A1 WO1997046962A1 (en) 1997-12-11
WO1997046962A9 WO1997046962A9 (en) 1998-03-12

Family

ID=26692050

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/009161 WO1997046962A1 (en) 1996-06-07 1997-05-30 Finding an e-mail message to which another e-mail message is a response

Country Status (2)

Country Link
US (1) US5905863A (en)
WO (1) WO1997046962A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6085321A (en) * 1998-08-14 2000-07-04 Omnipoint Corporation Unique digital signature
GB2350013A (en) * 1999-02-08 2000-11-15 Siemens Inf & Comm Networks Handling of threaded messages
US6356935B1 (en) 1998-08-14 2002-03-12 Xircom Wireless, Inc. Apparatus and method for an authenticated electronic userid
US6615348B1 (en) 1999-04-16 2003-09-02 Intel Corporation Method and apparatus for an adapted digital signature

Families Citing this family (179)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6085201A (en) * 1996-06-28 2000-07-04 Intel Corporation Context-sensitive template engine
US7031442B1 (en) 1997-02-10 2006-04-18 Genesys Telecommunications Laboratories, Inc. Methods and apparatus for personal routing in computer-simulated telephony
US6480600B1 (en) 1997-02-10 2002-11-12 Genesys Telecommunications Laboratories, Inc. Call and data correspondence in a call-in center employing virtual restructuring for computer telephony integrated functionality
US6104802A (en) 1997-02-10 2000-08-15 Genesys Telecommunications Laboratories, Inc. In-band signaling for routing
US6865715B2 (en) * 1997-09-08 2005-03-08 Fujitsu Limited Statistical method for extracting, and displaying keywords in forum/message board documents
US6985943B2 (en) 1998-09-11 2006-01-10 Genesys Telecommunications Laboratories, Inc. Method and apparatus for extended management of state and interaction of a remote knowledge worker from a contact center
US6711611B2 (en) 1998-09-11 2004-03-23 Genesis Telecommunications Laboratories, Inc. Method and apparatus for data-linking a mobile knowledge worker to home communication-center infrastructure
USRE46528E1 (en) 1997-11-14 2017-08-29 Genesys Telecommunications Laboratories, Inc. Implementation of call-center outbound dialing capability at a telephony network level
US6330610B1 (en) * 1997-12-04 2001-12-11 Eric E. Docter Multi-stage data filtering system employing multiple filtering criteria
US7907598B2 (en) 1998-02-17 2011-03-15 Genesys Telecommunication Laboratories, Inc. Method for implementing and executing communication center routing strategies represented in extensible markup language
US6332154B2 (en) 1998-09-11 2001-12-18 Genesys Telecommunications Laboratories, Inc. Method and apparatus for providing media-independent self-help modules within a multimedia communication-center customer interface
US6484196B1 (en) * 1998-03-20 2002-11-19 Advanced Web Solutions Internet messaging system and method for use in computer networks
US6330589B1 (en) * 1998-05-26 2001-12-11 Microsoft Corporation System and method for using a client database to manage conversation threads generated from email or news messages
US6438564B1 (en) * 1998-06-17 2002-08-20 Microsoft Corporation Method for associating a discussion with a document
US6785710B2 (en) * 1998-06-22 2004-08-31 Genesys Telecommunications Laboratories, Inc. E-mail client with programmable address attributes
US6665837B1 (en) * 1998-08-10 2003-12-16 Overture Services, Inc. Method for identifying related pages in a hyperlinked database
USRE46153E1 (en) 1998-09-11 2016-09-20 Genesys Telecommunications Laboratories, Inc. Method and apparatus enabling voice-based management of state and interaction of a remote knowledge worker in a contact center environment
AU1122100A (en) * 1998-10-30 2000-05-22 Justsystem Pittsburgh Research Center, Inc. Method for content-based filtering of messages by analyzing term characteristicswithin a message
US6442592B1 (en) * 1998-12-11 2002-08-27 Micro Computer Systems, Inc. Message center system
US7296060B2 (en) * 1998-12-24 2007-11-13 Intel Corporation System and method for automatically identifying and attaching related documents
US6654787B1 (en) * 1998-12-31 2003-11-25 Brightmail, Incorporated Method and apparatus for filtering e-mail
US6701346B1 (en) * 1999-07-12 2004-03-02 Micron Technology, Inc. Managing redundant electronic messages
US6631398B1 (en) 1999-07-12 2003-10-07 Micron Technology, Inc. Managing redundant electronic messages
US7194681B1 (en) 1999-07-30 2007-03-20 Microsoft Corporation Method for automatically assigning priorities to documents and messages
US6622160B1 (en) * 1999-07-30 2003-09-16 Microsoft Corporation Methods for routing items for communications based on a measure of criticality
US6714967B1 (en) * 1999-07-30 2004-03-30 Microsoft Corporation Integration of a computer-based message priority system with mobile electronic devices
US7353246B1 (en) * 1999-07-30 2008-04-01 Miva Direct, Inc. System and method for enabling information associations
US6523063B1 (en) 1999-08-30 2003-02-18 Zaplet, Inc. Method system and program product for accessing a file using values from a redirect message string for each change of the link identifier
US6691153B1 (en) 1999-08-30 2004-02-10 Zaplet, Inc. Method and system for process interaction among a group
US6453337B2 (en) * 1999-10-25 2002-09-17 Zaplet, Inc. Methods and systems to manage and track the states of electronic media
US7249175B1 (en) 1999-11-23 2007-07-24 Escom Corporation Method and system for blocking e-mail having a nonexistent sender address
US6321267B1 (en) 1999-11-23 2001-11-20 Escom Corporation Method and apparatus for filtering junk email
US7929978B2 (en) 1999-12-01 2011-04-19 Genesys Telecommunications Laboratories, Inc. Method and apparatus for providing enhanced communication capability for mobile devices on a virtual private network
US6507847B1 (en) 1999-12-17 2003-01-14 Openwave Systems Inc. History database structure for Usenet
US6564233B1 (en) * 1999-12-17 2003-05-13 Openwave Systems Inc. Server chaining system for usenet
US7567958B1 (en) * 2000-04-04 2009-07-28 Aol, Llc Filtering system for providing personalized information in the absence of negative data
US7430582B1 (en) * 2000-05-11 2008-09-30 International Business Machines Corporation Method article of manufacture and apparatus for assisting the response to an electronic mail message
US6772397B1 (en) * 2000-06-12 2004-08-03 International Business Machines Corporation Method, article of manufacture and apparatus for deleting electronic mail documents
US7028263B2 (en) 2000-07-19 2006-04-11 Research In Motion Limited User interface and method for viewing short messages on a wireless device
WO2002021413A2 (en) * 2000-09-05 2002-03-14 Zaplet, Inc. Methods and apparatus providing electronic messages that are linked and aggregated
US6778941B1 (en) * 2000-11-14 2004-08-17 Qualia Computing, Inc. Message and user attributes in a message filtering method and system
US7103634B1 (en) * 2000-11-16 2006-09-05 International Business Machines Corporation Method and system for e-mail chain group
US6781961B1 (en) * 2000-11-17 2004-08-24 Emware, Inc. Systems and methods for routing messages sent between computer systems
US20040181462A1 (en) * 2000-11-17 2004-09-16 Bauer Robert D. Electronic communication service
US7243125B2 (en) * 2000-12-08 2007-07-10 Xerox Corporation Method and apparatus for presenting e-mail threads as semi-connected text by removing redundant material
US20020073112A1 (en) * 2000-12-08 2002-06-13 Fujitsu Limited Related documents processing device, recording medium for processing related documents and method for processing related documents
US6963904B2 (en) * 2000-12-27 2005-11-08 Gateway Inc. Method for correlating an electronic mail message with related messages
WO2002054279A1 (en) * 2001-01-04 2002-07-11 Agency For Science, Technology And Research Improved method of text similarity measurement
US6820081B1 (en) * 2001-03-19 2004-11-16 Attenex Corporation System and method for evaluating a structured message store for message redundancy
US6745197B2 (en) * 2001-03-19 2004-06-01 Preston Gates Ellis Llp System and method for efficiently processing messages stored in multiple message stores
US6826729B1 (en) * 2001-06-29 2004-11-30 Microsoft Corporation Gallery user interface controls
US7269793B2 (en) * 2001-10-19 2007-09-11 Ebs Group Limited Conversational dealing system
GB2396727B (en) * 2001-10-19 2005-06-29 Ebs Dealing Resources Internat Conversational dealing system
US7192235B2 (en) * 2001-11-01 2007-03-20 Palm, Inc. Temporary messaging address system and method
US20030101065A1 (en) * 2001-11-27 2003-05-29 International Business Machines Corporation Method and apparatus for maintaining conversation threads in electronic mail
US20030144903A1 (en) * 2001-11-29 2003-07-31 Brechner Irvin W. Systems and methods for disseminating information
US20030105824A1 (en) * 2001-11-29 2003-06-05 Brechner Irvin W. Systems and methods for disseminating information
US7191166B2 (en) * 2002-02-27 2007-03-13 Wells Fargo Bank N.A. Method and system for comparing information contents
US7039677B2 (en) * 2002-05-07 2006-05-02 International Business Machines Corporation Threaded text-based chat collaboration
US7237009B1 (en) 2002-06-12 2007-06-26 Novell, Inc. Methods, systems and data structures for assigning categories to electronic mail
CA2392122A1 (en) * 2002-06-28 2003-12-28 Ibm Canada Limited-Ibm Canada Limitee An unrolling transformation on nested loops
AU2003250696A1 (en) 2002-07-29 2004-02-16 Research In Motion Limited System and method of mimetic messaging settings selection
US7010565B2 (en) * 2002-09-30 2006-03-07 Sampson Scott E Communication management using a token action log
US6804687B2 (en) * 2002-09-30 2004-10-12 Scott E. Sampson File system management with user-definable functional attributes stored in a token action log
US20040073688A1 (en) * 2002-09-30 2004-04-15 Sampson Scott E. Electronic payment validation using Transaction Authorization Tokens
US20060168089A1 (en) * 2002-09-30 2006-07-27 Sampson Scott E Controlling incoming communication by issuing tokens
US8051172B2 (en) * 2002-09-30 2011-11-01 Sampson Scott E Methods for managing the exchange of communication tokens
US7539730B2 (en) * 2002-10-18 2009-05-26 Research In Motion Limited System and method for selecting messaging settings on a messaging client
US20040098250A1 (en) * 2002-11-19 2004-05-20 Gur Kimchi Semantic search system and method
JP2004178294A (en) * 2002-11-27 2004-06-24 Nec Access Technica Ltd Mobile terminal displaying related e-mail, method of displaying e-mail, and program
WO2004055632A2 (en) * 2002-12-13 2004-07-01 Wholesecurity, Inc. Method, system, and computer program product for security within a global computer network
US7340674B2 (en) * 2002-12-16 2008-03-04 Xerox Corporation Method and apparatus for normalizing quoting styles in electronic mail messages
US7835504B1 (en) * 2003-03-16 2010-11-16 Palm, Inc. Telephone number parsing and linking
US7231229B1 (en) 2003-03-16 2007-06-12 Palm, Inc. Communication device interface
US20040186750A1 (en) * 2003-03-18 2004-09-23 Gordon Surbey Method and system for automating insurance processes
US8112481B2 (en) * 2003-03-28 2012-02-07 Microsoft Corporation Document message state management engine
US20040199590A1 (en) * 2003-04-03 2004-10-07 International Business Machines Corporation Apparatus, system and method of performing mail message thread searches
US7890603B2 (en) * 2003-04-03 2011-02-15 International Business Machines Corporation Apparatus, system and method of performing mail message searches across multiple mail servers
US7350187B1 (en) * 2003-04-30 2008-03-25 Google Inc. System and methods for automatically creating lists
US20050108340A1 (en) * 2003-05-15 2005-05-19 Matt Gleeson Method and apparatus for filtering email spam based on similarity measures
US6990224B2 (en) * 2003-05-15 2006-01-24 Federal Reserve Bank Of Atlanta Method and system for communicating and matching electronic files for financial transactions
US8145710B2 (en) * 2003-06-18 2012-03-27 Symantec Corporation System and method for filtering spam messages utilizing URL filtering module
US9715678B2 (en) 2003-06-26 2017-07-25 Microsoft Technology Licensing, Llc Side-by-side shared calendars
US7716593B2 (en) * 2003-07-01 2010-05-11 Microsoft Corporation Conversation grouping of electronic mail records
US7707255B2 (en) 2003-07-01 2010-04-27 Microsoft Corporation Automatic grouping of electronic mail
US8799808B2 (en) 2003-07-01 2014-08-05 Microsoft Corporation Adaptive multi-line view user interface
WO2005006614A1 (en) * 2003-07-14 2005-01-20 Sony Corporation Information providing method
US20050027779A1 (en) * 2003-07-29 2005-02-03 Schinner Charles Edward System and method for organizing email messages
US9379910B2 (en) * 2003-07-29 2016-06-28 Blackberry Limited System and method of mimetic messaging settings selection
US10437964B2 (en) 2003-10-24 2019-10-08 Microsoft Technology Licensing, Llc Programming interface for licensing
US7461151B2 (en) * 2003-11-13 2008-12-02 International Business Machines Corporation System and method enabling future messaging directives based on past participation via a history monitor
US8055713B2 (en) * 2003-11-17 2011-11-08 Hewlett-Packard Development Company, L.P. Email application with user voice interface
US7412437B2 (en) * 2003-12-29 2008-08-12 International Business Machines Corporation System and method for searching and retrieving related messages
US7908566B2 (en) * 2003-12-29 2011-03-15 International Business Machines Corporation System and method for scrolling among categories in a list of documents
US7409641B2 (en) * 2003-12-29 2008-08-05 International Business Machines Corporation Method for replying to related messages
US8151214B2 (en) * 2003-12-29 2012-04-03 International Business Machines Corporation System and method for color coding list items
US8171426B2 (en) 2003-12-29 2012-05-01 International Business Machines Corporation Method for secondary selection highlighting
US7421664B2 (en) * 2003-12-29 2008-09-02 International Business Machines Corporation System and method for providing a category separator in a list of documents
US7818680B2 (en) * 2003-12-29 2010-10-19 International Business Machines Corporation Method for deleting related messages
US8805933B2 (en) * 2003-12-29 2014-08-12 Google Inc. System and method for building interest profiles from related messages
US8301702B2 (en) * 2004-01-20 2012-10-30 Cloudmark, Inc. Method and an apparatus to screen electronic communications
US7555707B1 (en) 2004-03-12 2009-06-30 Microsoft Corporation Method and system for data binding in a block structured user interface scripting language
US7269621B2 (en) * 2004-03-31 2007-09-11 Google Inc. Method system and graphical user interface for dynamically updating transmission characteristics in a web mail reply
US7912904B2 (en) * 2004-03-31 2011-03-22 Google Inc. Email system with conversation-centric user interface
US9819624B2 (en) 2004-03-31 2017-11-14 Google Inc. Displaying conversations in a conversation-based email system
US7814155B2 (en) * 2004-03-31 2010-10-12 Google Inc. Email conversation management system
US7941490B1 (en) 2004-05-11 2011-05-10 Symantec Corporation Method and apparatus for detecting spam in email messages and email attachments
US7979501B1 (en) 2004-08-06 2011-07-12 Google Inc. Enhanced message display
US7761519B2 (en) * 2004-08-11 2010-07-20 International Business Machines Corporation Method, system, and computer program product for displaying message genealogy
US7703036B2 (en) 2004-08-16 2010-04-20 Microsoft Corporation User interface for displaying selectable software functionality controls that are relevant to a selected object
US7895531B2 (en) * 2004-08-16 2011-02-22 Microsoft Corporation Floating command object
US8117542B2 (en) 2004-08-16 2012-02-14 Microsoft Corporation User interface for displaying selectable software functionality controls that are contextually relevant to a selected object
US8255828B2 (en) 2004-08-16 2012-08-28 Microsoft Corporation Command user interface for displaying selectable software functionality controls
US8146016B2 (en) 2004-08-16 2012-03-27 Microsoft Corporation User interface for displaying a gallery of formatting options applicable to a selected object
US9015621B2 (en) 2004-08-16 2015-04-21 Microsoft Technology Licensing, Llc Command user interface for displaying multiple sections of software functionality controls
US7747966B2 (en) 2004-09-30 2010-06-29 Microsoft Corporation User interface for providing task management and calendar information
US8396897B2 (en) * 2004-11-22 2013-03-12 International Business Machines Corporation Method, system, and computer program product for threading documents using body text analysis
US9002725B1 (en) 2005-04-20 2015-04-07 Google Inc. System and method for targeting information based on message content
US8135778B1 (en) 2005-04-27 2012-03-13 Symantec Corporation Method and apparatus for certifying mass emailings
US7751533B2 (en) * 2005-05-02 2010-07-06 Nokia Corporation Dynamic message templates and messaging macros
US7886290B2 (en) * 2005-06-16 2011-02-08 Microsoft Corporation Cross version and cross product user interface
US7739337B1 (en) 2005-06-20 2010-06-15 Symantec Corporation Method and apparatus for grouping spam email messages
US8010609B2 (en) 2005-06-20 2011-08-30 Symantec Corporation Method and apparatus for maintaining reputation lists of IP addresses to detect email spam
US7870204B2 (en) * 2005-07-01 2011-01-11 0733660 B.C. Ltd. Electronic mail system with aggregation and integrated display of related messages
US8239882B2 (en) 2005-08-30 2012-08-07 Microsoft Corporation Markup based extensibility for user interfaces
US8689137B2 (en) 2005-09-07 2014-04-01 Microsoft Corporation Command user interface for displaying selectable functionality controls in a database application
US9542667B2 (en) 2005-09-09 2017-01-10 Microsoft Technology Licensing, Llc Navigating messages within a thread
US7739259B2 (en) 2005-09-12 2010-06-15 Microsoft Corporation Integrated search and find user interface
US8627222B2 (en) 2005-09-12 2014-01-07 Microsoft Corporation Expanded search and find user interface
US7949714B1 (en) * 2005-12-05 2011-05-24 Google Inc. System and method for targeting advertisements or other information using user geographical information
US8601004B1 (en) 2005-12-06 2013-12-03 Google Inc. System and method for targeting information items based on popularities of the information items
US9008075B2 (en) 2005-12-22 2015-04-14 Genesys Telecommunications Laboratories, Inc. System and methods for improving interaction routing performance
US9727989B2 (en) 2006-06-01 2017-08-08 Microsoft Technology Licensing, Llc Modifying and formatting a chart using pictorially provided chart elements
US8605090B2 (en) * 2006-06-01 2013-12-10 Microsoft Corporation Modifying and formatting a chart using pictorially provided chart elements
US20080254811A1 (en) * 2007-04-11 2008-10-16 Palm, Inc. System and method for monitoring locations of mobile devices
US9031583B2 (en) 2007-04-11 2015-05-12 Qualcomm Incorporated Notification on mobile device based on location of other mobile device
US9140552B2 (en) 2008-07-02 2015-09-22 Qualcomm Incorporated User defined names for displaying monitored location
US8484578B2 (en) 2007-06-29 2013-07-09 Microsoft Corporation Communication between a document editor in-space user interface and a document editor out-space user interface
US8201103B2 (en) 2007-06-29 2012-06-12 Microsoft Corporation Accessing an out-space user interface for a document editor program
US8762880B2 (en) 2007-06-29 2014-06-24 Microsoft Corporation Exposing non-authoring features through document status information in an out-space user interface
US7720921B2 (en) * 2007-08-27 2010-05-18 International Business Machines Corporation System and method for soliciting and retrieving a complete email thread
US7693940B2 (en) * 2007-10-23 2010-04-06 International Business Machines Corporation Method and system for conversation detection in email systems
US7818385B2 (en) * 2007-11-14 2010-10-19 International Business Machines Corporation Method and apparatus for forwarding emails to previous recipients
US8225219B2 (en) * 2008-02-12 2012-07-17 Microsoft Corporation Identifying unique content in electronic mail messages
JP5286876B2 (en) * 2008-03-28 2013-09-11 富士通株式会社 Pegging support program, pegging support device, pegging support method
US9588781B2 (en) 2008-03-31 2017-03-07 Microsoft Technology Licensing, Llc Associating command surfaces with multiple active components
US20090300517A1 (en) * 2008-05-31 2009-12-03 International Business Machines Corporation Providing user control of historical messages in electronic mail chain to be included in forwarded or replied electronic mail message
US8661082B2 (en) * 2008-06-20 2014-02-25 Microsoft Corporation Extracting previous messages from a later message
US9665850B2 (en) 2008-06-20 2017-05-30 Microsoft Technology Licensing, Llc Synchronized conversation-centric message list and message reading pane
US8402096B2 (en) 2008-06-24 2013-03-19 Microsoft Corporation Automatic conversation techniques
US20100082751A1 (en) 2008-09-29 2010-04-01 Microsoft Corporation User perception of electronic messaging
US9076125B2 (en) * 2009-02-27 2015-07-07 Microsoft Technology Licensing, Llc Visualization of participant relationships and sentiment for electronic messaging
US8799353B2 (en) 2009-03-30 2014-08-05 Josef Larsson Scope-based extensibility for control surfaces
US9046983B2 (en) 2009-05-12 2015-06-02 Microsoft Technology Licensing, Llc Hierarchically-organized control galleries
US8352561B1 (en) 2009-07-24 2013-01-08 Google Inc. Electronic communication reminder technology
US20110029617A1 (en) * 2009-07-30 2011-02-03 International Business Machines Corporation Managing Electronic Delegation Messages
WO2011075825A1 (en) 2009-12-21 2011-06-30 Kik Interactive, Inc. Systems and methods for accessing and controlling media stored remotely
US20120083243A1 (en) * 2010-04-30 2012-04-05 Ari Kahn Communication Network Signaling
US8302014B2 (en) 2010-06-11 2012-10-30 Microsoft Corporation Merging modifications to user interface components while preserving user customizations
US9589254B2 (en) 2010-12-08 2017-03-07 Microsoft Technology Licensing, Llc Using e-mail message characteristics for prioritization
US9152312B1 (en) * 2011-01-26 2015-10-06 Google Inc. Displaying related content in a content stream
US20120304072A1 (en) * 2011-05-23 2012-11-29 Microsoft Corporation Sentiment-based content aggregation and presentation
CA2746065C (en) 2011-07-18 2013-02-19 Research In Motion Limited Electronic device and method for selectively applying message actions
US9037601B2 (en) 2011-07-27 2015-05-19 Google Inc. Conversation system and method for performing both conversation-based queries and message-based queries
US9042266B2 (en) 2011-12-21 2015-05-26 Kik Interactive, Inc. Methods and apparatus for initializing a network connection for an output device
CN103516578B (en) * 2012-06-26 2016-05-18 国际商业机器公司 Method, device, and e-mail system for providing common electronic mail
US9870554B1 (en) 2012-10-23 2018-01-16 Google Inc. Managing documents based on a user's calendar
US10140198B1 (en) 2012-10-30 2018-11-27 Google Llc Networked desktop environment
US8819587B1 (en) 2012-10-30 2014-08-26 Google Inc. Methods of managing items in a shared workspace
US9842113B1 (en) 2013-08-27 2017-12-12 Google Inc. Context-based file selection
US9973462B1 (en) 2013-10-21 2018-05-15 Google Llc Methods for generating message notifications
US10452484B2 (en) * 2014-05-15 2019-10-22 Carbonite, Inc. Systems and methods for time-based folder restore
US9979682B2 (en) 2015-09-01 2018-05-22 Microsoft Technology Licensing, Llc Command propagation optimization
US9977666B2 (en) 2015-09-01 2018-05-22 Microsoft Technology Licensing, Llc Add a new instance to a series
US9882854B2 (en) 2015-09-01 2018-01-30 Microsoft Technology Licensing, Llc Email parking lot
US9929989B2 (en) 2015-09-01 2018-03-27 Microsoft Technology Licensing, Llc Interoperability with legacy clients
US10163076B2 (en) 2015-09-01 2018-12-25 Microsoft Technology Licensing, Llc Consensus scheduling for business calendar
AU2017203723A1 (en) * 2016-06-07 2017-12-21 David Nixon Meeting management system and process
US11861463B2 (en) 2019-09-06 2024-01-02 International Business Machines Corporation Identifying related messages in a natural language interaction

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5404488A (en) * 1990-09-26 1995-04-04 Lotus Development Corporation Realtime data feed engine for updating an application with the most currently received data from multiple data feeds
US5758257A (en) * 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US5799304A (en) * 1995-01-03 1998-08-25 Intel Corporation Information evaluation
US5796633A (en) * 1996-07-12 1998-08-18 Electronic Data Systems Corporation Method and system for performance monitoring in computer networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"INTELLIGENT DOCUMENT ANALYZER FOR SMARTMAIL", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 34, no. 4A, 1 September 1991 (1991-09-01), pages 215, XP000210889 *
FREI H P ET AL: "RETRIEVAL ALGORITHM EFFECTIVENESS IN A WIDE AREA NETWORK INFORMATION FILTER", PROCEEDINGS OF THE ANNUAL INTERNATIONAL ACM/SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, CHICAGO, OCT. 13 - 16, 1991, no. CONF. 14, 13 October 1991 (1991-10-13), BOOKSTEIN A;CHIARAMELLA Y; SALTON G; RAGHAVAN V, pages 114 - 122, XP000239163 *
GOLDBERG D: "USING COLLABORATIVE FILTERING TO WEAVE AN INFORMATION TAPESTRY", COMMUNICATIONS OF THE ASSOCIATION FOR COMPUTING MACHINERY, vol. 35, no. 12, 1 December 1992 (1992-12-01), pages 61 - 70, XP000334368 *
SALTON G ET AL: "AUTOMATIC STRUCTURING AND RETRIEVAL OF LARGE TEXT FILES", COMMUNICATIONS OF THE ASSOCIATION FOR COMPUTING MACHINERY, vol. 37, no. 2, 1 February 1994 (1994-02-01), pages 97 - 108, XP000425939 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6085321A (en) * 1998-08-14 2000-07-04 Omnipoint Corporation Unique digital signature
US6356935B1 (en) 1998-08-14 2002-03-12 Xircom Wireless, Inc. Apparatus and method for an authenticated electronic userid
US6795919B1 (en) 1998-08-14 2004-09-21 Intel Corporation Unique digital signature
GB2350013A (en) * 1999-02-08 2000-11-15 Siemens Inf & Comm Networks Handling of threaded messages
GB2350013B (en) * 1999-02-08 2003-11-05 Siemens Inf & Comm Networks System and method for improved handling of threaded messages
US7110510B1 (en) 1999-02-08 2006-09-19 Siemens Communications, Inc. System and method for handling of threaded messages
US6615348B1 (en) 1999-04-16 2003-09-02 Intel Corporation Method and apparatus for an adapted digital signature

Also Published As

Publication number Publication date
US5905863A (en) 1999-05-18

Similar Documents

Publication Publication Date Title
US5905863A (en) Finding an e-mail message to which another e-mail message is a response
Lewis et al. Threading electronic mail: A preliminary study
US7657603B1 (en) Methods and systems of electronic message derivation
US7743051B1 (en) Methods, systems, and user interface for e-mail search and retrieval
US8392409B1 (en) Methods, systems, and user interface for E-mail analysis and review
US7949660B2 (en) Method and apparatus for searching and resource discovery in a distributed enterprise system
US7672956B2 (en) Method and system for providing a search index for an electronic messaging system based on message threads
Adams et al. Topic detection and extraction in chat
WO1997046962A9 (en) Finding an e-mail message to which another e-mail message is a response
CN100390786C (en) Content information analyzing method and apparatus
US8244720B2 (en) Ranking blog documents
US6332141B2 (en) Apparatus and method of implementing fast internet real-time search technology (FIRST)
US8209339B1 (en) Document similarity detection
US8495049B2 (en) System and method for extracting content for submission to a search engine
JP3810463B2 (en) Information filtering device
JP5536851B2 (en) Method and system for symbolic linking and intelligent classification of information
US20020103867A1 (en) Method and system for matching and exchanging unsorted messages via a communications network
Cselle et al. BuzzTrack: topic detection and tracking in email
WO2008097856A2 (en) Search result delivery engine
WO2007140364A2 (en) Method for scoring changes to a webpage
WO2009082100A9 (en) Method and system for searching information of collective emotion based on comments about contents on internet
US20090319617A1 (en) Extracting previous messages from a later message
Payne Learning Email Filtering Rules with Magi-A Mail Agent Interface
Stewart et al. User profiling techniques: a critical review
Felser et al. Recommendation of query terms for colloquial texts in forensic text analysis

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA JP MX

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

COP Corrected version of pamphlet

Free format text: PAGES 1/3-3/3, DRAWINGS, REPLACED BY NEW PAGES BEARING THE SAME NUMBER; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 2004-01-01)
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 98500683

Format of ref document f/p: F

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: CA