US20040006481A1 - Fast transcription of speech - Google Patents
- Publication number
- US20040006481A1 (application Ser. No. 10/610,532)
- Authority
- US
- United States
- Prior art keywords
- speech
- transcription
- audio
- data
- stream
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/42—Graphical user interfaces
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/60—Medium conversion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2203/00—Aspects of automatic or semi-automatic exchanges
- H04M2203/30—Aspects of automatic or semi-automatic exchanges related to audio recordings in general
- H04M2203/305—Recording playback features, e.g. increased speed
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99941—Database schema or data structure
- Y10S707/99943—Generating database or data structure, e.g. via user interface
Description
- This application claims priority under 35 U.S.C. §119 based on U.S. Provisional Application Nos. 60/394,064 and 60/394,082 filed Jul. 3, 2002 and Provisional Application No. 60/419,214 filed Oct. 17, 2002, the disclosures of which are incorporated herein by reference.
- The U.S. Government may have a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. 1999-S018900-0 (Federal Broadcast Information Service (FBIS)).
- A. Field of the Invention
- The present invention relates generally to speech processing and, more particularly, to the transcription of speech.
- B. Description of Related Art
- Speech has not traditionally been valued as an archival information source. As effective as the spoken word is for communicating, archiving spoken segments in a useful and easily retrievable manner has long been a difficult proposition. Although the act of recording audio is not difficult, automatically transcribing and indexing speech in an intelligent and useful manner can be difficult.
- Automatic transcription systems are generally based on language and acoustic models. The language model is trained on a speech signal and on a corresponding transcription of the speech. The model will “learn” how the speech signal corresponds to the transcription. Typically, the training transcriptions of the speech are derived through a manual transcription process in which a user listens to the training audio, segments the audio, and types in the text corresponding to the audio. While typing in the text, the user may additionally annotate the text so that certain words, such as proper names, are noted as such.
- Manually transcribing speech can be a time-consuming, and thus expensive, task. Conventionally, generating one hour of transcribed training data requires up to 40 hours of a skilled transcriber's time. Accordingly, in situations in which a large amount of training data is required, or in which a number of different languages are to be modeled, the cost of obtaining the training data can be prohibitive.
- Thus, there is a need in the art to be able to cost-effectively transcribe speech.
- Systems and methods consistent with the principles of this invention provide a transcription tool that allows a user to quickly, and with minimal training, transcribe segments of speech.
- One aspect of the invention is directed to a speech transcription tool that includes an audio classification component, control logic, and an input device. The audio classification component receives an audio stream containing speech data and segments the audio stream into speech and non-speech audio segments based on locations of the speech data within the audio stream. The control logic plays the speech segments and skips playing of the non-speech segments. The input device receives user transcription text relating to a transcription of the speech segments played by the control logic.
- A second aspect of the invention is directed to a method that includes receiving an audio stream containing speech data, determining where the speech data is located in the audio stream, and playing select portions of the audio stream to a user. The select portions of the audio stream are based on the location of the speech data. The method additionally includes receiving text corresponding to the played portions of the audio stream and outputting the text.
- A third aspect of the invention is directed to a method that includes analyzing a data stream based on acoustic characteristics of the data stream to generate acoustic classification information for the data stream, and playing portions of the data stream that meet predetermined criteria based on the acoustic classification information. Further, the method includes receiving transcription information relating to the played portions of the data stream.
- Yet another aspect of the invention is directed to a computing device for transcribing an audio file that includes speech. The computing device comprises speakers, a processor, and a computer memory. The computer memory contains program instructions that, when executed by the processor, cause the processor to automatically segment the audio file into speech and non-speech segments based on acoustic characteristics of the audio file. Additionally, the processor plays a current one of the speech segments through the speakers, receives transcription information for the speech segments played through the speakers, and skips the non-speech segments when locating a next current one of the speech segments.
- The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate the invention and, together with the description, explain the invention. In the drawings,
- FIG. 1 is a diagram illustrating an exemplary system in which concepts consistent with the invention may be implemented;
- FIG. 2 is a block diagram of a transcription tool consistent with the present invention;
- FIG. 3 is a diagram illustrating a graphical user interface of the transcription tool consistent with the present invention; and
- FIG. 4 is a flow chart illustrating methods of operation of the transcription tool consistent with the present invention.
- The following detailed description of the invention refers to the accompanying drawings. The same reference numbers may be used in different drawings to identify the same or similar elements. Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents of the claim limitations.
- A speech transcription tool assists a user in transcribing speech. The transcription tool automatically identifies segments in an audio stream appropriate for transcription. Additionally, the transcription tool presents the user with a simple graphical interface for typing in the transcription text.
- Speech transcription, as described herein, may be performed on one or more processing devices or networks of processing devices. FIG. 1 is a diagram illustrating an exemplary system 100 in which concepts consistent with the invention may be implemented. System 100 includes a computing device 101 that has a computer-readable medium 109, such as random access memory, coupled to a processor 108. Computing device 101 may also include a number of additional external or internal devices. An external input device 120 and an external output device 121 are shown in FIG. 1. The input devices 120 may include, without limitation, a mouse, a CD-ROM, or a keyboard. The output devices may include, without limitation, a display or an audio output device, such as a speaker. A keyboard, in particular, may be used by the user of system 100 when transcribing a speech segment that is played back from an output device, such as a speaker.
- In general, computing device 101 may be any type of computing platform, and may be connected to a network 102. Computing device 101 is exemplary only. Concepts consistent with the present invention can be implemented on any computing device, whether or not connected to a network.
- Processor 108 executes program instructions stored in memory 109. Processor 108 can be any of a number of well-known computer processors, such as processors from Intel Corporation, of Santa Clara, Calif.
- Memory 109 contains an application program 115. In particular, application program 115 may implement the transcription tool described below. Transcription tool 115 plays audio segments to a user, who types the words spoken in the audio into transcription tool 115. Transcription tool 115 automates many of the traditional transcription responsibilities of the user.
- FIG. 2 is a block diagram illustrating software elements of transcription tool 115. Input audio information is received by audio classification component 201. Users of transcription tool 115 (i.e., transcribers) interact with transcription tool 115 through user input component 203 and graphical user interface (GUI) 204. Control logic 202 processes the output of audio classification component 201 and coordinates the operation of graphical user interface 204 and user input component 203 to perform transcription in a manner consistent with the present invention.
- Audio classification component 201 receives an input audio stream and performs acoustic classification functions on the audio stream. More particularly, audio classification component 201 may classify segments of the audio as either speech or non-speech audio, or as wideband or narrowband audio. Audio classification component 201 may output a series of classification codes that indicate when a particular segment of the audio changes from one classification state to another. For example, when audio classification component 201 begins to detect speech, it may output an indication that speech is beginning and a time code corresponding to when the speech begins. When the speech segment ends, audio classification component 201 may similarly output an indication that the speech is ending along with a corresponding time code.
- In performing speech/non-speech and wideband/narrowband classifications, audio classification component 201 may analyze the frequency spectrum of the input audio information. For example, a wideband audio signal, such as a studio quality audio signal, will have a wider frequency range than a narrowband signal, such as an audio signal received over a telephone line. Similarly, in classifying audio signals as speech/non-speech information, audio classification component 201 may examine the frequency characteristics of the signal. Because signals that include human speech tend to exhibit certain characteristics, audio classification component 201 can determine whether or not the audio signal includes speech based on these characteristics. An implementation of audio classification component 201 is described in additional detail in application Ser. No. ______ (Attorney Docket No. 02-4022), titled “Systems and Methods for Providing Acoustic Classification,” the contents of which are incorporated by reference herein.
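The wideband/narrowband distinction can be illustrated with a toy spectral-energy check. The sketch below is illustrative only, since the patent defers the actual classifier to the referenced application; the 4 kHz cutoff and the function name are assumptions.

```python
import cmath
import math

def band_energy_ratio(samples, rate, cutoff_hz=4000.0):
    """Fraction of spectral energy above `cutoff_hz`.

    Narrowband (telephone-quality) audio carries almost no energy above
    roughly 4 kHz, so a high ratio suggests a wideband signal.  A direct
    O(n^2) DFT is used for clarity; real code would use an FFT.
    """
    n = len(samples)
    half = n // 2
    cutoff_bin = int(cutoff_hz * n / rate)
    total = high = 0.0
    for k in range(1, half):  # skip the DC bin
        x = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        energy = abs(x) ** 2
        total += energy
        if k >= cutoff_bin:
            high += energy
    return high / total if total else 0.0
```

For example, a 500 Hz tone sampled at 16 kHz yields a ratio near 0 (narrowband-like), while a 6 kHz tone yields a ratio near 1.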
- User input component 203 processes information received from the user. A user may input information through a number of different hardware input devices. A keyboard, for example, is an input device that the user is likely to use in entering the text corresponding to speech. Other devices, such as a foot pedal or a mouse, may be used to control the operation of transcription tool 115.
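A minimal sketch of how such input events might be routed to tool actions. The device names, event representation, and five-second step are illustrative assumptions, not taken from the patent:

```python
def handle_input_event(state, event):
    """Route a hardware input event to a transcription-tool action.

    `state` is a dict holding the current playback position (seconds) and
    the transcription text typed so far; `event` is a (device, value) pair.
    """
    device, value = event
    if device == "keyboard":
        # Typed text is appended to the transcription.
        state["text"] += value
    elif device == "foot_pedal":
        # A pedal tap backs playback up five seconds, clamped at the start.
        state["position"] = max(0.0, state["position"] - 5.0)
    else:
        raise ValueError(f"unknown input device: {device}")
    return state
```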
- Graphical user interface 204 displays the graphical interface through which the user interacts with transcription tool 115. FIG. 3 is an exemplary diagram of an interface 300 that may be presented to the user via graphical user interface 204. Interface 300 includes waveform section 301 and transcription section 302. Additionally, interface 300 may include selectable menu options 303 and window control buttons 304. Through menu options 303, a user may initiate functions of transcription tool 115, such as opening an audio file for transcription, saving a transcription, and setting program options.
- Waveform section 301 graphically illustrates the time-domain waveform of the audio stream that is being processed. The exemplary waveform shown in FIG. 3, waveform 310, includes a number of quiet segments 311 that delineate audible segments 312. Audible segments 312 may include, for example, speech, music, other sounds, or combinations thereof. Audio classification component 201 identifies the start location, the end location, and the classification (i.e., speech or non-speech, wideband or narrowband) of each audible segment 312.
- Concurrently with the display of audio waveform 310, transcription tool 115 plays the audio signal to the user. Transcription tool 115 may visually mark the portion of waveform 310 that is currently being played. For example, as shown in FIG. 3, waveform segment 315 is the portion of the audio signal that is currently being played. An additional, more precise location marker, such as arrow 316, may point to the current playback position in audio waveform 310.
- The user of transcription tool 115 may input (e.g., type) the textual transcription corresponding to the audio stream into transcription section 302. The user may control the playback of the audio signal through actions such as actuating a foot pedal, graphical commands accessed using a mouse, or keyboard commands. For example, if the user misses a particular word, the user may back up five seconds in the audio stream by tapping a foot pedal.
- In alternate implementations, waveform section 301 may be omitted. In this situation, the interface may simply include transcription section 302.
- FIG. 4 is a flow chart illustrating the operation of transcription tool 115 for an input audio stream that contains speech. The audio stream may include, for example, audio from a radio or television broadcast.
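One simple way to find quiet segments that delineate audible segments, as in waveform 310 above, is a short-time energy threshold. This is only a sketch; the patent's classifier is more sophisticated, and the frame size and threshold here are invented values:

```python
def find_audible_segments(samples, rate, frame=0.02, threshold=0.01):
    """Return (start_sec, end_sec) spans whose frame energy exceeds a threshold.

    Frames whose mean energy falls below the threshold are treated as quiet;
    they delineate the audible segments that get returned.
    """
    step = max(1, int(frame * rate))
    segments, start = [], None
    n_frames = (len(samples) + step - 1) // step
    for i in range(n_frames):
        chunk = samples[i * step:(i + 1) * step]
        energy = sum(s * s for s in chunk) / len(chunk)
        if energy > threshold and start is None:
            start = i * step / rate          # audible segment begins
        elif energy <= threshold and start is not None:
            segments.append((start, i * step / rate))
            start = None                     # quiet frame closes the segment
    if start is not None:
        segments.append((start, len(samples) / rate))
    return segments
```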
- Audio classification component 201 receives the input audio stream (act 401) and segments the audio stream into segments such as speech/non-speech segments (act 402). Audio classification component 201 may send indications of the segments to control logic 202, which also receives the audio stream. Control logic 202 displays a graphical waveform representing the audio stream (or a portion of the audio stream), such as waveform 310, in graphical user interface 204 (act 403). The waveform may include graphical indications of, for example, segments that correspond to speech signals, the segment that is active, and an indication of the current playback position within the active segment. Concurrently with the graphical display of waveform 310, control logic 202 plays the audio stream back to the user (act 404).
- As the audio stream is played to the user, the user may type in text corresponding to the audio signal.
Control logic 202 displays the text in transcription section 302 (act 405). Additionally,control logic 202 may receive and process any commands input by the user (act 406). Examples of such commands include commands to move backwards or forwards in time in the audio stream or a command to skip to the next audio segment. The user may additionally enter predefined formatting commands that define additional information for a typed-in word. For example, the user, before typing in a proper name, may instructtranscription tool 115 that the word that the user is about to type is a proper name. The user instruction can be as simple as a key-code, such as a function key.Control logic 202 internally annotates the typed-in word as a proper name. - At the end of an
audio segment 310,control logic 202 automatically skips to thenext audio segment 312 that corresponds to a speech signal (acts 407 and 408). Alternatively,control logic 202 may skip to the next audio segment based on user commands. Control logic 212, may, thus skip over audio segments that are not useful for transcription purposes, such as music segments. In one implementation, the user may configuretranscription component 115 to only playback audio streams that also meet additional criteria, such as audio streams that include wideband speech signals. - When the user is finished transcribing,
transcription tool 115 may output the transcription entered by the user to a file. The file may include the text typed by the user as well as meta-information added by transcription tool 115. The meta-information may include, for example, time codes that correlate the transcription with the original audio and codes that indicate which words the user marked as proper names. - For some applications, it may be acceptable to generate transcriptions that do not include all of the words in the audio stream. In other words, a partial transcription that skips certain words may be acceptable. In these situations, to increase the transcription rate, the user may simply skip words that are not understood or skip sections when the user falls behind.
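The description specifies only that the file holds the typed text plus meta-information such as time codes and proper-name codes, not a concrete format. A hypothetical sketch (the class, field names, and JSON layout are assumptions) of collecting annotated, time-coded words and writing them out:

```python
import json

# Hypothetical sketch of collecting annotated, time-coded words and writing
# the transcription file; the structure and field names are assumptions,
# since the description specifies only "text plus meta-information".

class TranscriptBuffer:
    def __init__(self):
        self.words = []            # (text, start_sec, end_sec, is_proper_name)
        self._proper_name = False  # set by a key-code before the next word

    def mark_proper_name(self):
        self._proper_name = True   # e.g. bound to a function key

    def add_word(self, text, start_sec, end_sec):
        self.words.append((text, start_sec, end_sec, self._proper_name))
        self._proper_name = False  # the annotation applies to one word only

    def write(self, path):
        doc = {"words": [
            {"text": t, "start": s, "end": e, "proper_name": p}
            for (t, s, e, p) in self.words
        ]}
        with open(path, "w") as fh:
            json.dump(doc, fh, indent=2)
```

Here the time codes would come from the playback position at the moment each word is typed, which is what lets the file correlate the transcription with the original audio.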
- As described herein, a transcription tool automates and simplifies the transcription process. By automatically identifying segments in an audio stream that are appropriate for transcription, the transcription tool saves the user from having to listen to the entire stream and manually identify suitable segments. Additionally, the tool is relatively simple to use and does not require the user to memorize or actively use a large number of commands. Accordingly, essentially any user with competent typing skills and literacy in the target language can, with little specialized training, use the transcription tool effectively. This ability to identify speech segments appropriate for transcription increases the efficiency, and thus lowers the cost, of creating transcriptions.
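The automatic advance past non-speech material described above amounts to a simple search over the classified segment list. A minimal sketch, assuming segments are (label, start, end) tuples as produced by the classification step (the representation is an assumption, not specified by the patent):

```python
# Sketch of the auto-skip behavior: given classified segments and the current
# playback position, find where playback should resume. Returns None when no
# further speech segment exists.

def next_speech_start(segments, position):
    for label, start, end in segments:
        if label == "speech" and start > position:
            return start
    return None
```

A user-configurable filter, such as restricting playback to wideband speech, would simply tighten the `label == "speech"` test.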
- The foregoing description of preferred embodiments of the invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. For example, while a series of acts has been presented with respect to FIG. 4, the order of the acts may be different in other implementations consistent with the present invention.
- Certain portions of the invention have been described as software that performs one or more functions. The software may more generally be implemented as any type of logic. This logic may include hardware, such as an application specific integrated circuit or a field programmable gate array, software, or a combination of hardware and software.
- No element, act, or instruction used in the description of the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used.
- The scope of the invention is defined by the claims and their equivalents.
Claims (28)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/610,532 US20040006481A1 (en) | 2002-07-03 | 2003-07-02 | Fast transcription of speech |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US39406402P | 2002-07-03 | 2002-07-03 | |
US39408202P | 2002-07-03 | 2002-07-03 | |
US41921402P | 2002-10-17 | 2002-10-17 | |
US10/610,532 US20040006481A1 (en) | 2002-07-03 | 2003-07-02 | Fast transcription of speech |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040006481A1 true US20040006481A1 (en) | 2004-01-08 |
Family
ID=30003990
Family Applications (11)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/610,574 Abandoned US20040006748A1 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing online event tracking |
US10/610,532 Abandoned US20040006481A1 (en) | 2002-07-03 | 2003-07-02 | Fast transcription of speech |
US10/610,679 Abandoned US20040024598A1 (en) | 2002-07-03 | 2003-07-02 | Thematic segmentation of speech |
US10/611,106 Active 2026-04-11 US7337115B2 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing acoustic classification |
US10/610,684 Abandoned US20040024582A1 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for aiding human translation |
US10/610,699 Abandoned US20040117188A1 (en) | 2002-07-03 | 2003-07-02 | Speech based personal information manager |
US10/610,799 Abandoned US20040199495A1 (en) | 2002-07-03 | 2003-07-02 | Name browsing systems and methods |
US10/610,696 Abandoned US20040024585A1 (en) | 2002-07-03 | 2003-07-02 | Linguistic segmentation of speech |
US10/610,697 Expired - Fee Related US7290207B2 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing multimedia information management |
US10/610,533 Expired - Fee Related US7801838B2 (en) | 2002-07-03 | 2003-07-02 | Multimedia recognition system comprising a plurality of indexers configured to receive and analyze multimedia data based on training data and user augmentation relating to one or more of a plurality of generated documents |
US12/806,465 Expired - Fee Related US8001066B2 (en) | 2002-07-03 | 2010-08-13 | Systems and methods for improving recognition results via user-augmentation of a database |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/610,574 Abandoned US20040006748A1 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing online event tracking |
Family Applications After (9)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/610,679 Abandoned US20040024598A1 (en) | 2002-07-03 | 2003-07-02 | Thematic segmentation of speech |
US10/611,106 Active 2026-04-11 US7337115B2 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing acoustic classification |
US10/610,684 Abandoned US20040024582A1 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for aiding human translation |
US10/610,699 Abandoned US20040117188A1 (en) | 2002-07-03 | 2003-07-02 | Speech based personal information manager |
US10/610,799 Abandoned US20040199495A1 (en) | 2002-07-03 | 2003-07-02 | Name browsing systems and methods |
US10/610,696 Abandoned US20040024585A1 (en) | 2002-07-03 | 2003-07-02 | Linguistic segmentation of speech |
US10/610,697 Expired - Fee Related US7290207B2 (en) | 2002-07-03 | 2003-07-02 | Systems and methods for providing multimedia information management |
US10/610,533 Expired - Fee Related US7801838B2 (en) | 2002-07-03 | 2003-07-02 | Multimedia recognition system comprising a plurality of indexers configured to receive and analyze multimedia data based on training data and user augmentation relating to one or more of a plurality of generated documents |
US12/806,465 Expired - Fee Related US8001066B2 (en) | 2002-07-03 | 2010-08-13 | Systems and methods for improving recognition results via user-augmentation of a database |
Country Status (1)
Country | Link |
---|---|
US (11) | US20040006748A1 (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050210511A1 (en) * | 2004-03-19 | 2005-09-22 | Pettinato Richard F | Real-time media captioning subscription framework for mobile devices |
US20050210516A1 (en) * | 2004-03-19 | 2005-09-22 | Pettinato Richard F | Real-time captioning framework for mobile devices |
US20080195370A1 (en) * | 2005-08-26 | 2008-08-14 | Koninklijke Philips Electronics, N.V. | System and Method For Synchronizing Sound and Manually Transcribed Text |
US20080293443A1 (en) * | 2004-03-19 | 2008-11-27 | Media Captioning Services | Live media subscription framework for mobile devices |
US20080294434A1 (en) * | 2004-03-19 | 2008-11-27 | Media Captioning Services | Live Media Captioning Subscription Framework for Mobile Devices |
US20100121637A1 (en) * | 2008-11-12 | 2010-05-13 | Massachusetts Institute Of Technology | Semi-Automatic Speech Transcription |
US20100159973A1 (en) * | 2008-12-23 | 2010-06-24 | Motoral, Inc. | Distributing a broadband resource locator over a narrowband audio stream |
GB2502944A (en) * | 2012-03-30 | 2013-12-18 | Jpal Ltd | Segmentation and transcription of speech |
US8676590B1 (en) * | 2012-09-26 | 2014-03-18 | Google Inc. | Web-based audio transcription tool |
US8775175B1 (en) * | 2012-06-01 | 2014-07-08 | Google Inc. | Performing dictation correction |
US20150006174A1 (en) * | 2012-02-03 | 2015-01-01 | Sony Corporation | Information processing device, information processing method and program |
US9087024B1 (en) * | 2012-01-26 | 2015-07-21 | Amazon Technologies, Inc. | Narration of network content |
US20150279354A1 (en) * | 2010-05-19 | 2015-10-01 | Google Inc. | Personalization and Latency Reduction for Voice-Activated Commands |
US9195750B2 (en) | 2012-01-26 | 2015-11-24 | Amazon Technologies, Inc. | Remote browsing and searching |
US9330188B1 (en) | 2011-12-22 | 2016-05-03 | Amazon Technologies, Inc. | Shared browsing sessions |
US9336321B1 (en) | 2012-01-26 | 2016-05-10 | Amazon Technologies, Inc. | Remote browsing and searching |
US9772816B1 (en) * | 2014-12-22 | 2017-09-26 | Google Inc. | Transcription and tagging system |
US9773499B2 (en) | 2014-06-18 | 2017-09-26 | Google Inc. | Entity name recognition based on entity type |
EP3432560A1 (en) * | 2017-07-20 | 2019-01-23 | Dialogtech Inc. | System, method, and computer program product for automatically analyzing and categorizing phone calls |
US10354647B2 (en) | 2015-04-28 | 2019-07-16 | Google Llc | Correcting voice recognition using selective re-speak |
Families Citing this family (210)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7349477B2 (en) * | 2002-07-10 | 2008-03-25 | Mitsubishi Electric Research Laboratories, Inc. | Audio-assisted video segmentation and summarization |
US20070225614A1 (en) * | 2004-05-26 | 2007-09-27 | Endothelix, Inc. | Method and apparatus for determining vascular health conditions |
US7574447B2 (en) * | 2003-04-08 | 2009-08-11 | United Parcel Service Of America, Inc. | Inbound package tracking systems and methods |
US20050010231A1 (en) * | 2003-06-20 | 2005-01-13 | Myers Thomas H. | Method and apparatus for strengthening the biomechanical properties of implants |
US7487094B1 (en) * | 2003-06-20 | 2009-02-03 | Utopy, Inc. | System and method of call classification with context modeling based on composite words |
US7231396B2 (en) * | 2003-07-24 | 2007-06-12 | International Business Machines Corporation | Data abstraction layer for a database |
US8229744B2 (en) * | 2003-08-26 | 2012-07-24 | Nuance Communications, Inc. | Class detection scheme and time mediated averaging of class dependent models |
US20060212830A1 (en) * | 2003-09-09 | 2006-09-21 | Fogg Brian J | Graphical messaging system |
US8046814B1 (en) * | 2003-10-22 | 2011-10-25 | The Weather Channel, Inc. | Systems and methods for formulating and delivering video having perishable information |
GB2409087A (en) * | 2003-12-12 | 2005-06-15 | Ibm | Computer generated prompting |
US7496500B2 (en) * | 2004-03-01 | 2009-02-24 | Microsoft Corporation | Systems and methods that determine intent of data and respond to the data based on the intent |
US20050209849A1 (en) * | 2004-03-22 | 2005-09-22 | Sony Corporation And Sony Electronics Inc. | System and method for automatically cataloguing data by utilizing speech recognition procedures |
US7363279B2 (en) | 2004-04-29 | 2008-04-22 | Microsoft Corporation | Method and system for calculating importance of a block within a display page |
US8838452B2 (en) * | 2004-06-09 | 2014-09-16 | Canon Kabushiki Kaisha | Effective audio segmentation and classification |
US8036893B2 (en) * | 2004-07-22 | 2011-10-11 | Nuance Communications, Inc. | Method and system for identifying and correcting accent-induced speech recognition difficulties |
US9344765B2 (en) | 2004-07-30 | 2016-05-17 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US9641902B2 (en) | 2007-06-26 | 2017-05-02 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US7631336B2 (en) | 2004-07-30 | 2009-12-08 | Broadband Itv, Inc. | Method for converting, navigating and displaying video content uploaded from the internet to a digital TV video-on-demand platform |
US11259059B2 (en) | 2004-07-30 | 2022-02-22 | Broadband Itv, Inc. | System for addressing on-demand TV program content on TV services platform of a digital TV services provider |
US9584868B2 (en) | 2004-07-30 | 2017-02-28 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US7590997B2 (en) | 2004-07-30 | 2009-09-15 | Broadband Itv, Inc. | System and method for managing, converting and displaying video content on a video-on-demand platform, including ads used for drill-down navigation and consumer-generated classified ads |
US7529765B2 (en) * | 2004-11-23 | 2009-05-05 | Palo Alto Research Center Incorporated | Methods, apparatus, and program products for performing incremental probabilistic latent semantic analysis |
US7769579B2 (en) | 2005-05-31 | 2010-08-03 | Google Inc. | Learning facts from semi-structured text |
EP1889255A1 (en) * | 2005-05-24 | 2008-02-20 | Loquendo S.p.A. | Automatic text-independent, language-independent speaker voice-print creation and speaker recognition |
TWI270052B (en) * | 2005-08-09 | 2007-01-01 | Delta Electronics Inc | System for selecting audio content by using speech recognition and method therefor |
GB2430073A (en) * | 2005-09-08 | 2007-03-14 | Univ East Anglia | Analysis and transcription of music |
US20070061703A1 (en) * | 2005-09-12 | 2007-03-15 | International Business Machines Corporation | Method and apparatus for annotating a document |
US20070078644A1 (en) * | 2005-09-30 | 2007-04-05 | Microsoft Corporation | Detecting segmentation errors in an annotated corpus |
CA2624339C (en) * | 2005-10-12 | 2014-12-02 | Thomson Licensing | Region of interest h.264 scalable video coding |
US20070094023A1 (en) * | 2005-10-21 | 2007-04-26 | Callminer, Inc. | Method and apparatus for processing heterogeneous units of work |
JP4432877B2 (en) * | 2005-11-08 | 2010-03-17 | ソニー株式会社 | Information processing system, information processing method, information processing apparatus, program, and recording medium |
US8019752B2 (en) * | 2005-11-10 | 2011-09-13 | Endeca Technologies, Inc. | System and method for information retrieval from object collections with complex interrelationships |
US20070150540A1 (en) * | 2005-12-27 | 2007-06-28 | Microsoft Corporation | Presence and peer launch pad |
TW200731113A (en) * | 2006-02-09 | 2007-08-16 | Benq Corp | Method for utilizing a media adapter for controlling a display device to display information of multimedia data corresponding to an authority datum |
US20070225606A1 (en) * | 2006-03-22 | 2007-09-27 | Endothelix, Inc. | Method and apparatus for comprehensive assessment of vascular health |
US20070225973A1 (en) * | 2006-03-23 | 2007-09-27 | Childress Rhonda L | Collective Audio Chunk Processing for Streaming Translated Multi-Speaker Conversations |
US7752031B2 (en) * | 2006-03-23 | 2010-07-06 | International Business Machines Corporation | Cadence management of translated multi-speaker conversations using pause marker relationship models |
US8301448B2 (en) | 2006-03-29 | 2012-10-30 | Nuance Communications, Inc. | System and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy |
US20080201318A1 (en) * | 2006-05-02 | 2008-08-21 | Lit Group, Inc. | Method and system for retrieving network documents |
US20080027330A1 (en) * | 2006-05-15 | 2008-01-31 | Endothelix, Inc. | Risk assessment method for acute cardiovascular events |
US7587407B2 (en) * | 2006-05-26 | 2009-09-08 | International Business Machines Corporation | System and method for creation, representation, and delivery of document corpus entity co-occurrence information |
US7593940B2 (en) * | 2006-05-26 | 2009-09-22 | International Business Machines Corporation | System and method for creation, representation, and delivery of document corpus entity co-occurrence information |
US10339208B2 (en) | 2006-06-12 | 2019-07-02 | Brief-Lynx, Inc. | Electronic documentation |
US8219543B2 (en) * | 2006-06-12 | 2012-07-10 | Etrial Communications, Inc. | Electronic documentation |
US7504969B2 (en) * | 2006-07-11 | 2009-03-17 | Data Domain, Inc. | Locality-based stream segmentation for data deduplication |
US7620551B2 (en) * | 2006-07-20 | 2009-11-17 | Mspot, Inc. | Method and apparatus for providing search capability and targeted advertising for audio, image, and video content over the internet |
US20080081963A1 (en) * | 2006-09-29 | 2008-04-03 | Endothelix, Inc. | Methods and Apparatus for Profiling Cardiovascular Vulnerability to Mental Stress |
US8122026B1 (en) * | 2006-10-20 | 2012-02-21 | Google Inc. | Finding and disambiguating references to entities on web pages |
DE102006057159A1 (en) * | 2006-12-01 | 2008-06-05 | Deutsche Telekom Ag | Method for classifying spoken language in speech dialogue systems |
JP4827721B2 (en) * | 2006-12-26 | 2011-11-30 | ニュアンス コミュニケーションズ,インコーポレイテッド | Utterance division method, apparatus and program |
TW200841189A (en) * | 2006-12-27 | 2008-10-16 | Ibm | Technique for accurately detecting system failure |
US20080172219A1 (en) * | 2007-01-17 | 2008-07-17 | Novell, Inc. | Foreign language translator in a document editor |
US8285697B1 (en) * | 2007-01-23 | 2012-10-09 | Google Inc. | Feedback enhanced attribute extraction |
US20080177536A1 (en) * | 2007-01-24 | 2008-07-24 | Microsoft Corporation | A/v content editing |
US20080215318A1 (en) * | 2007-03-01 | 2008-09-04 | Microsoft Corporation | Event recognition |
US20090030685A1 (en) * | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Using speech recognition results based on an unstructured language model with a navigation system |
US20110054899A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Command and control utilizing content information in a mobile voice-to-speech application |
US8886540B2 (en) | 2007-03-07 | 2014-11-11 | Vlingo Corporation | Using speech recognition results based on an unstructured language model in a mobile communication facility application |
US20080221884A1 (en) * | 2007-03-07 | 2008-09-11 | Cerra Joseph P | Mobile environment speech processing facility |
US8635243B2 (en) | 2007-03-07 | 2014-01-21 | Research In Motion Limited | Sending a communications header with voice recording to send metadata for use in speech recognition, formatting, and search mobile search application |
US20110054895A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Utilizing user transmitted text to improve language model in mobile dictation application |
US8949266B2 (en) | 2007-03-07 | 2015-02-03 | Vlingo Corporation | Multiple web-based content category searching in mobile search application |
US10056077B2 (en) * | 2007-03-07 | 2018-08-21 | Nuance Communications, Inc. | Using speech recognition results based on an unstructured language model with a music system |
US20080221900A1 (en) * | 2007-03-07 | 2008-09-11 | Cerra Joseph P | Mobile local search environment speech processing facility |
US20090030697A1 (en) * | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Using contextual information for delivering results generated from a speech recognition facility using an unstructured language model |
US20110054897A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Transmitting signal quality information in mobile dictation application |
US8838457B2 (en) | 2007-03-07 | 2014-09-16 | Vlingo Corporation | Using results of unstructured language model based speech recognition to control a system-level function of a mobile communications facility |
US20110054896A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Sending a communications header with voice recording to send metadata for use in speech recognition and formatting in mobile dictation application |
US20090030687A1 (en) * | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Adapting an unstructured language model speech recognition system based on usage |
US20110054898A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Multiple web-based content search user interface in mobile search application |
US8886545B2 (en) | 2007-03-07 | 2014-11-11 | Vlingo Corporation | Dealing with switch latency in speech recognition |
US20090030691A1 (en) * | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Using an unstructured language model associated with an application of a mobile communication facility |
US8949130B2 (en) | 2007-03-07 | 2015-02-03 | Vlingo Corporation | Internal and external speech recognition use with a mobile communication facility |
US20090030688A1 (en) * | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Tagging speech recognition results based on an unstructured language model for use in a mobile communication facility application |
JP4466665B2 (en) * | 2007-03-13 | 2010-05-26 | 日本電気株式会社 | Minutes creation method, apparatus and program thereof |
US8347202B1 (en) | 2007-03-14 | 2013-01-01 | Google Inc. | Determining geographic locations for place names in a fact repository |
US20080229914A1 (en) * | 2007-03-19 | 2008-09-25 | Trevor Nathanial | Foot operated transport controller for digital audio workstations |
US8078464B2 (en) * | 2007-03-30 | 2011-12-13 | Mattersight Corporation | Method and system for analyzing separated voice data of a telephonic communication to determine the gender of the communicant |
US8856002B2 (en) * | 2007-04-12 | 2014-10-07 | International Business Machines Corporation | Distance metrics for universal pattern processing tasks |
US20080288239A1 (en) * | 2007-05-15 | 2008-11-20 | Microsoft Corporation | Localization and internationalization of document resources |
US8984133B2 (en) * | 2007-06-19 | 2015-03-17 | The Invention Science Fund I, Llc | Providing treatment-indicative feedback dependent on putative content treatment |
US8682982B2 (en) * | 2007-06-19 | 2014-03-25 | The Invention Science Fund I, Llc | Preliminary destination-dependent evaluation of message content |
US20080320088A1 (en) * | 2007-06-19 | 2008-12-25 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Helping valuable message content pass apparent message filtering |
US9374242B2 (en) * | 2007-11-08 | 2016-06-21 | Invention Science Fund I, Llc | Using evaluations of tentative message content |
US11570521B2 (en) | 2007-06-26 | 2023-01-31 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US20090027485A1 (en) * | 2007-07-26 | 2009-01-29 | Avaya Technology Llc | Automatic Monitoring of a Call Participant's Attentiveness |
JP5088050B2 (en) * | 2007-08-29 | 2012-12-05 | ヤマハ株式会社 | Voice processing apparatus and program |
US8082225B2 (en) * | 2007-08-31 | 2011-12-20 | The Invention Science Fund I, Llc | Using destination-dependent criteria to guide data transmission decisions |
US8065404B2 (en) * | 2007-08-31 | 2011-11-22 | The Invention Science Fund I, Llc | Layering destination-dependent content handling guidance |
US8326833B2 (en) * | 2007-10-04 | 2012-12-04 | International Business Machines Corporation | Implementing metadata extraction of artifacts from associated collaborative discussions |
US20090122157A1 (en) * | 2007-11-14 | 2009-05-14 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and computer-readable storage medium |
US7930389B2 (en) | 2007-11-20 | 2011-04-19 | The Invention Science Fund I, Llc | Adaptive filtering of annotated messages or the like |
DK2081405T3 (en) * | 2008-01-21 | 2012-08-20 | Bernafon Ag | Hearing aid adapted to a particular voice type in an acoustic environment as well as method and application |
US8208245B2 (en) | 2008-03-31 | 2012-06-26 | Over The Sun, Llc | Tablet computer |
US20090259469A1 (en) * | 2008-04-14 | 2009-10-15 | Motorola, Inc. | Method and apparatus for speech recognition |
US8326788B2 (en) * | 2008-04-29 | 2012-12-04 | International Business Machines Corporation | Determining the degree of relevance of alerts in an entity resolution system |
US8250637B2 (en) * | 2008-04-29 | 2012-08-21 | International Business Machines Corporation | Determining the degree of relevance of duplicate alerts in an entity resolution system |
US20090271394A1 (en) * | 2008-04-29 | 2009-10-29 | Allen Thomas B | Determining the degree of relevance of entities and identities in an entity resolution system that maintains alert relevance |
US8015137B2 (en) * | 2008-04-29 | 2011-09-06 | International Business Machines Corporation | Determining the degree of relevance of alerts in an entity resolution system over alert disposition lifecycle |
US7475344B1 (en) * | 2008-05-04 | 2009-01-06 | International Business Machines Corporation | Genders-usage assistant for composition of electronic documents, emails, or letters |
US8818801B2 (en) * | 2008-07-28 | 2014-08-26 | Nec Corporation | Dialogue speech recognition system, dialogue speech recognition method, and recording medium for storing dialogue speech recognition program |
US8655950B2 (en) * | 2008-08-06 | 2014-02-18 | International Business Machines Corporation | Contextual awareness in real time collaborative activity alerts |
US8744532B2 (en) * | 2008-11-10 | 2014-06-03 | Disney Enterprises, Inc. | System and method for customizable playback of communication device alert media |
JP5488475B2 (en) * | 2008-12-15 | 2014-05-14 | 日本電気株式会社 | Topic transition analysis system, topic transition analysis method and program |
US8654963B2 (en) | 2008-12-19 | 2014-02-18 | Genesys Telecommunications Laboratories, Inc. | Method and system for integrating an interaction management system with a business rules management system |
US8301444B2 (en) | 2008-12-29 | 2012-10-30 | At&T Intellectual Property I, L.P. | Automated demographic analysis by analyzing voice activity |
US8498866B2 (en) * | 2009-01-15 | 2013-07-30 | K-Nfb Reading Technology, Inc. | Systems and methods for multiple language document narration |
EP2216775B1 (en) * | 2009-02-05 | 2012-11-21 | Nuance Communications, Inc. | Speaker recognition |
US8458105B2 (en) * | 2009-02-12 | 2013-06-04 | Decisive Analytics Corporation | Method and apparatus for analyzing and interrelating data |
US20100235314A1 (en) * | 2009-02-12 | 2010-09-16 | Decisive Analytics Corporation | Method and apparatus for analyzing and interrelating video data |
US9646603B2 (en) * | 2009-02-27 | 2017-05-09 | Longsand Limited | Various apparatus and methods for a speech recognition system |
CN101847412B (en) * | 2009-03-27 | 2012-02-15 | 华为技术有限公司 | Method and device for classifying audio signals |
CN101901235B (en) | 2009-05-27 | 2013-03-27 | 国际商业机器公司 | Method and system for document processing |
US8463606B2 (en) | 2009-07-13 | 2013-06-11 | Genesys Telecommunications Laboratories, Inc. | System for analyzing interactions and reporting analytic results to human-operated and system interfaces in real time |
US8190420B2 (en) | 2009-08-04 | 2012-05-29 | Autonomy Corporation Ltd. | Automatic spoken language identification based on phoneme sequence patterns |
US8160877B1 (en) * | 2009-08-06 | 2012-04-17 | Narus, Inc. | Hierarchical real-time speaker recognition for biometric VoIP verification and targeting |
US8799408B2 (en) * | 2009-08-10 | 2014-08-05 | Sling Media Pvt Ltd | Localization systems and methods |
US9727842B2 (en) | 2009-08-21 | 2017-08-08 | International Business Machines Corporation | Determining entity relevance by relationships to other relevant entities |
WO2011035210A2 (en) * | 2009-09-18 | 2011-03-24 | Lexxe Pty Ltd | Method and system for scoring texts |
WO2011044153A1 (en) | 2009-10-09 | 2011-04-14 | Dolby Laboratories Licensing Corporation | Automatic generation of metadata for audio dominance effects |
US8954434B2 (en) * | 2010-01-08 | 2015-02-10 | Microsoft Corporation | Enhancing a document with supplemental information from another document |
US8903847B2 (en) * | 2010-03-05 | 2014-12-02 | International Business Machines Corporation | Digital media voice tags in social networks |
US8831942B1 (en) * | 2010-03-19 | 2014-09-09 | Narus, Inc. | System and method for pitch based gender identification with suspicious speaker detection |
US8600750B2 (en) * | 2010-06-08 | 2013-12-03 | Cisco Technology, Inc. | Speaker-cluster dependent speaker recognition (speaker-type automated speech recognition) |
US9465935B2 (en) | 2010-06-11 | 2016-10-11 | D2L Corporation | Systems, methods, and apparatus for securing user documents |
US20130115606A1 (en) * | 2010-07-07 | 2013-05-09 | The University Of British Columbia | System and method for microfluidic cell culture |
TWI403304B (en) * | 2010-08-27 | 2013-08-01 | Ind Tech Res Inst | Method and mobile device for awareness of linguistic ability |
EP2437153A3 (en) * | 2010-10-01 | 2016-10-05 | Samsung Electronics Co., Ltd. | Apparatus and method for turning e-book pages in portable terminal |
KR101743632B1 (en) | 2010-10-01 | 2017-06-07 | 삼성전자주식회사 | Apparatus and method for turning e-book pages in portable terminal |
US9678572B2 (en) | 2010-10-01 | 2017-06-13 | Samsung Electronics Co., Ltd. | Apparatus and method for turning e-book pages in portable terminal |
EP2437151B1 (en) | 2010-10-01 | 2020-07-08 | Samsung Electronics Co., Ltd. | Apparatus and method for turning e-book pages in portable terminal |
US8498998B2 (en) * | 2010-10-11 | 2013-07-30 | International Business Machines Corporation | Grouping identity records to generate candidate lists to use in an entity and relationship resolution process |
US20120197643A1 (en) * | 2011-01-27 | 2012-08-02 | General Motors Llc | Mapping obstruent speech energy to lower frequencies |
US20120246238A1 (en) | 2011-03-21 | 2012-09-27 | International Business Machines Corporation | Asynchronous messaging tags |
US20120244842A1 (en) | 2011-03-21 | 2012-09-27 | International Business Machines Corporation | Data Session Synchronization With Phone Numbers |
US8688090B2 (en) | 2011-03-21 | 2014-04-01 | International Business Machines Corporation | Data session preferences |
US9053750B2 (en) | 2011-06-17 | 2015-06-09 | At&T Intellectual Property I, L.P. | Speaker association with a visual representation of spoken content |
US9160837B2 (en) * | 2011-06-29 | 2015-10-13 | Gracenote, Inc. | Interactive streaming content apparatus, systems and methods |
US20130144414A1 (en) * | 2011-12-06 | 2013-06-06 | Cisco Technology, Inc. | Method and apparatus for discovering and labeling speakers in a large and growing collection of videos with minimal user effort |
US9396277B2 (en) * | 2011-12-09 | 2016-07-19 | Microsoft Technology Licensing, Llc | Access to supplemental data based on identifier derived from corresponding primary application data |
US8886651B1 (en) * | 2011-12-22 | 2014-11-11 | Reputation.Com, Inc. | Thematic clustering |
US9324323B1 (en) | 2012-01-13 | 2016-04-26 | Google Inc. | Speech recognition using topic-specific language models |
US8543398B1 (en) | 2012-02-29 | 2013-09-24 | Google Inc. | Training an automatic speech recognition system using compressed word frequencies |
US8775177B1 (en) | 2012-03-08 | 2014-07-08 | Google Inc. | Speech recognition process |
US8965766B1 (en) * | 2012-03-15 | 2015-02-24 | Google Inc. | Systems and methods for identifying music in a noisy environment |
WO2013149027A1 (en) * | 2012-03-28 | 2013-10-03 | Crawford Terry | Method and system for providing segment-based viewing of recorded sessions |
US9129605B2 (en) * | 2012-03-30 | 2015-09-08 | Src, Inc. | Automated voice and speech labeling |
US8374865B1 (en) | 2012-04-26 | 2013-02-12 | Google Inc. | Sampling training data for an automatic speech recognition system based on a benchmark classification distribution |
US8805684B1 (en) | 2012-05-31 | 2014-08-12 | Google Inc. | Distributed speaker adaptation |
US8571859B1 (en) | 2012-05-31 | 2013-10-29 | Google Inc. | Multi-stage speaker adaptation |
WO2013179275A2 (en) * | 2012-06-01 | 2013-12-05 | Donald, Heather June | Method and system for generating an interactive display |
US9881616B2 (en) * | 2012-06-06 | 2018-01-30 | Qualcomm Incorporated | Method and systems having improved speech recognition |
GB2505072A (en) | 2012-07-06 | 2014-02-19 | Box Inc | Identifying users and collaborators as search results in a cloud-based system |
US8880398B1 (en) | 2012-07-13 | 2014-11-04 | Google Inc. | Localized speech recognition with offload |
US9123333B2 (en) | 2012-09-12 | 2015-09-01 | Google Inc. | Minimum bayesian risk methods for automatic speech recognition |
US10915492B2 (en) * | 2012-09-19 | 2021-02-09 | Box, Inc. | Cloud-based platform enabled with media content indexed for text-based searches and/or metadata extraction |
TW201417093A (en) * | 2012-10-19 | 2014-05-01 | Hon Hai Prec Ind Co Ltd | Electronic device with video/audio files processing function and video/audio files processing method |
EP2736042A1 (en) * | 2012-11-23 | 2014-05-28 | Samsung Electronics Co., Ltd | Apparatus and method for constructing multilingual acoustic model and computer readable recording medium for storing program for performing the method |
US9912816B2 (en) | 2012-11-29 | 2018-03-06 | Genesys Telecommunications Laboratories, Inc. | Workload distribution with resource awareness |
US9542936B2 (en) | 2012-12-29 | 2017-01-10 | Genesys Telecommunications Laboratories, Inc. | Fast out-of-vocabulary search in automatic speech recognition systems |
KR102112742B1 (en) * | 2013-01-22 | 2020-05-19 | 삼성전자주식회사 | Electronic apparatus and voice processing method thereof |
US9208777B2 (en) | 2013-01-25 | 2015-12-08 | Microsoft Technology Licensing, Llc | Feature space transformation for personalization using generalized i-vector clustering |
CN105264518B (en) * | 2013-02-28 | 2017-12-01 | 株式会社东芝 | Data processing equipment and story model building method |
US9190055B1 (en) * | 2013-03-14 | 2015-11-17 | Amazon Technologies, Inc. | Named entity recognition with personalized models |
US9195656B2 (en) | 2013-12-30 | 2015-11-24 | Google Inc. | Multilingual prosody generation |
WO2015105994A1 (en) | 2014-01-08 | 2015-07-16 | Callminer, Inc. | Real-time conversational analytics facility |
US9430186B2 (en) * | 2014-03-17 | 2016-08-30 | Google Inc. | Visual indication of a recognized voice-initiated action |
US9497868B2 (en) | 2014-04-17 | 2016-11-15 | Continental Automotive Systems, Inc. | Electronics enclosure |
JPWO2016027800A1 (en) * | 2014-08-22 | 2017-06-29 | オリンパス株式会社 | Cell culture bag, cell culture device and cell culture container |
US10025773B2 (en) * | 2015-07-24 | 2018-07-17 | International Business Machines Corporation | System and method for natural language processing using synthetic text |
US10381022B1 (en) | 2015-12-23 | 2019-08-13 | Google Llc | Audio classifier |
US10282411B2 (en) * | 2016-03-31 | 2019-05-07 | International Business Machines Corporation | System, method, and recording medium for natural language learning |
CN107305541B (en) * | 2016-04-20 | 2021-05-04 | 科大讯飞股份有限公司 | Method and device for segmenting speech recognition text |
US20180018973A1 (en) | 2016-07-15 | 2018-01-18 | Google Inc. | Speaker verification |
US9978392B2 (en) * | 2016-09-09 | 2018-05-22 | Tata Consultancy Services Limited | Noisy signal identification from non-stationary audio signals |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
US10642889B2 (en) * | 2017-02-20 | 2020-05-05 | Gong I.O Ltd. | Unsupervised automated topic detection, segmentation and labeling of conversations |
CN109102810B (en) * | 2017-06-21 | 2021-10-15 | 北京搜狗科技发展有限公司 | Voiceprint recognition method and device |
GB2578386B (en) | 2017-06-27 | 2021-12-01 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB2563953A (en) | 2017-06-28 | 2019-01-02 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201713697D0 (en) | 2017-06-28 | 2017-10-11 | Cirrus Logic Int Semiconductor Ltd | Magnetic detection of replay attack |
GB201801526D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
GB201801532D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for audio playback |
GB201801527D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801528D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Method, apparatus and systems for biometric processes |
GB201801530D0 (en) | 2017-07-07 | 2018-03-14 | Cirrus Logic Int Semiconductor Ltd | Methods, apparatus and systems for authentication |
GB201801664D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
GB2567503A (en) | 2017-10-13 | 2019-04-17 | Cirrus Logic Int Semiconductor Ltd | Analysing speech signals |
GB201804843D0 (en) | 2017-11-14 | 2018-05-09 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201801661D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic International Uk Ltd | Detection of liveness |
GB201801874D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Improving robustness of speech processing system against ultrasound and dolphin attacks |
GB201801663D0 (en) | 2017-10-13 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of liveness |
GB201803570D0 (en) | 2017-10-13 | 2018-04-18 | Cirrus Logic Int Semiconductor Ltd | Detection of replay attack |
GB201801659D0 (en) | 2017-11-14 | 2018-03-21 | Cirrus Logic Int Semiconductor Ltd | Detection of loudspeaker playback |
US11568231B2 (en) * | 2017-12-08 | 2023-01-31 | Raytheon Bbn Technologies Corp. | Waypoint detection for a contact center analysis system |
US10671251B2 (en) | 2017-12-22 | 2020-06-02 | Arbordale Publishing, LLC | Interactive eReader interface generation based on synchronization of textual and audial descriptors |
US11443646B2 (en) | 2017-12-22 | 2022-09-13 | Fathom Technologies, LLC | E-Reader interface system with audio and highlighting synchronization for digital books |
US11270071B2 (en) * | 2017-12-28 | 2022-03-08 | Comcast Cable Communications, Llc | Language-based content recommendations using closed captions |
US11264037B2 (en) | 2018-01-23 | 2022-03-01 | Cirrus Logic, Inc. | Speaker identification |
US11475899B2 (en) | 2018-01-23 | 2022-10-18 | Cirrus Logic, Inc. | Speaker identification |
US11735189B2 (en) | 2018-01-23 | 2023-08-22 | Cirrus Logic, Inc. | Speaker identification |
US11276407B2 (en) | 2018-04-17 | 2022-03-15 | Gong.Io Ltd. | Metadata-based diarization of teleconferences |
US10692490B2 (en) | 2018-07-31 | 2020-06-23 | Cirrus Logic, Inc. | Detection of replay attack |
US10915614B2 (en) | 2018-08-31 | 2021-02-09 | Cirrus Logic, Inc. | Biometric authentication |
US11037574B2 (en) * | 2018-09-05 | 2021-06-15 | Cirrus Logic, Inc. | Speaker recognition and speaker change detection |
US11183195B2 (en) * | 2018-09-27 | 2021-11-23 | Snackable Inc. | Audio content processing systems and methods |
US11410658B1 (en) * | 2019-10-29 | 2022-08-09 | Dialpad, Inc. | Maintainable and scalable pipeline for automatic speech recognition language modeling |
US11373657B2 (en) * | 2020-05-01 | 2022-06-28 | Raytheon Applied Signal Technology, Inc. | System and method for speaker identification in audio data |
US11315545B2 (en) | 2020-07-09 | 2022-04-26 | Raytheon Applied Signal Technology, Inc. | System and method for language identification in audio data |
CN112289323B (en) * | 2020-12-29 | 2021-05-28 | 深圳追一科技有限公司 | Voice data processing method and device, computer equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5835667A (en) * | 1994-10-14 | 1998-11-10 | Carnegie Mellon University | Method and apparatus for creating a searchable digital video library and a system and method of using such a library |
US5960447A (en) * | 1995-11-13 | 1999-09-28 | Holt; Douglas | Word tagging and editing system for speech recognition |
US6067517A (en) * | 1996-02-02 | 2000-05-23 | International Business Machines Corporation | Transcription of speech data with segments from acoustically dissimilar environments |
US6161087A (en) * | 1998-10-05 | 2000-12-12 | Lernout & Hauspie Speech Products N.V. | Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording |
US6332147B1 (en) * | 1995-11-03 | 2001-12-18 | Xerox Corporation | Computer controlled display system using a graphical replay device to control playback of temporal data representing collaborative activities |
US6360237B1 (en) * | 1998-10-05 | 2002-03-19 | Lernout & Hauspie Speech Products N.V. | Method and system for performing text edits during audio recording playback |
US6708148B2 (en) * | 2001-10-12 | 2004-03-16 | Koninklijke Philips Electronics N.V. | Correction device to mark parts of a recognized text |
US6792409B2 (en) * | 1999-12-20 | 2004-09-14 | Koninklijke Philips Electronics N.V. | Synchronous reproduction in a speech recognition system |
Family Cites Families (173)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AUPQ131399A0 (en) | 1999-06-30 | 1999-07-22 | Silverbrook Research Pty Ltd | A method and apparatus (NPAGE02) |
US4193119A (en) * | 1977-03-25 | 1980-03-11 | Xerox Corporation | Apparatus for assisting in the transposition of foreign language text |
US4317611A (en) * | 1980-05-19 | 1982-03-02 | International Business Machines Corporation | Optical ray deflection apparatus |
US4615595A (en) * | 1984-10-10 | 1986-10-07 | Texas Instruments Incorporated | Frame addressed spatial light modulator |
US4908866A (en) | 1985-02-04 | 1990-03-13 | Eric Goldwasser | Speech transcribing system |
JPH0693221B2 (en) | 1985-06-12 | 1994-11-16 | 株式会社日立製作所 | Voice input device |
JPH0743719B2 (en) * | 1986-05-20 | 1995-05-15 | シャープ株式会社 | Machine translation device |
US4879648A (en) | 1986-09-19 | 1989-11-07 | Nancy P. Cochran | Search system which continuously displays search terms during scrolling and selections of individually displayed data sets |
JPH0833799B2 (en) * | 1988-10-31 | 1996-03-29 | 富士通株式会社 | Data input / output control method |
US5146439A (en) * | 1989-01-04 | 1992-09-08 | Pitney Bowes Inc. | Records management system having dictation/transcription capability |
US6978277B2 (en) * | 1989-10-26 | 2005-12-20 | Encyclopaedia Britannica, Inc. | Multimedia search system |
US5418716A (en) * | 1990-07-26 | 1995-05-23 | Nec Corporation | System for recognizing sentence patterns and a system for recognizing sentence patterns and grammatical cases |
US5404295A (en) | 1990-08-16 | 1995-04-04 | Katz; Boris | Method and apparatus for utilizing annotations to facilitate computer retrieval of database material |
US5408686A (en) * | 1991-02-19 | 1995-04-18 | Mankovitz; Roy J. | Apparatus and methods for music and lyrics broadcasting |
US5317732A (en) | 1991-04-26 | 1994-05-31 | Commodore Electronics Limited | System for relocating a multimedia presentation on a different platform by extracting a resource map in order to remap and relocate resources |
US5477451A (en) * | 1991-07-25 | 1995-12-19 | International Business Machines Corp. | Method and system for natural language translation |
US5875108A (en) * | 1991-12-23 | 1999-02-23 | Hoffberg; Steven M. | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system |
US5544257A (en) | 1992-01-08 | 1996-08-06 | International Business Machines Corporation | Continuous parameter hidden Markov model approach to automatic handwriting recognition |
US5311360A (en) * | 1992-04-28 | 1994-05-10 | The Board Of Trustees Of The Leland Stanford, Junior University | Method and apparatus for modulating a light beam |
JP2524472B2 (en) * | 1992-09-21 | 1996-08-14 | インターナショナル・ビジネス・マシーンズ・コーポレイション | How to train a telephone line based speech recognition system |
CA2108536C (en) | 1992-11-24 | 2000-04-04 | Oscar Ernesto Agazzi | Text recognition using two-dimensional stochastic models |
US5369704A (en) * | 1993-03-24 | 1994-11-29 | Engate Incorporated | Down-line transcription system for manipulating real-time testimony |
US5525047A (en) * | 1993-06-30 | 1996-06-11 | Cooper Cameron Corporation | Sealing system for an unloader |
US5689641A (en) * | 1993-10-01 | 1997-11-18 | Vicor, Inc. | Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal |
JP2986345B2 (en) * | 1993-10-18 | 1999-12-06 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Voice recording indexing apparatus and method |
US5452024A (en) * | 1993-11-01 | 1995-09-19 | Texas Instruments Incorporated | DMD display system |
JP3185505B2 (en) * | 1993-12-24 | 2001-07-11 | 株式会社日立製作所 | Meeting record creation support device |
GB2285895A (en) | 1994-01-19 | 1995-07-26 | Ibm | Audio conferencing system which generates a set of minutes |
US5810599A (en) * | 1994-01-26 | 1998-09-22 | E-Systems, Inc. | Interactive audio-visual foreign language skills maintenance system and method |
FR2718539B1 (en) * | 1994-04-08 | 1996-04-26 | Thomson Csf | Device for amplifying the amplitude modulation rate of an optical beam. |
JPH07319917A (en) * | 1994-05-24 | 1995-12-08 | Fuji Xerox Co Ltd | Document data base managing device and document data base system |
US5715445A (en) * | 1994-09-02 | 1998-02-03 | Wolfe; Mark A. | Document retrieval system employing a preloading procedure |
US5613032A (en) * | 1994-09-02 | 1997-03-18 | Bell Communications Research, Inc. | System and method for recording, playing back and searching multimedia events wherein video, audio and text can be searched and retrieved |
US5768607A (en) | 1994-09-30 | 1998-06-16 | Intel Corporation | Method and apparatus for freehand annotation and drawings incorporating sound and for compressing and synchronizing sound |
AU3625095A (en) * | 1994-09-30 | 1996-04-26 | Motorola, Inc. | Method and system for extracting features from handwritten text |
US5777614A (en) * | 1994-10-14 | 1998-07-07 | Hitachi, Ltd. | Editing support system including an interactive interface |
US5614940A (en) | 1994-10-21 | 1997-03-25 | Intel Corporation | Method and apparatus for providing broadcast information with indexing |
US6029195A (en) | 1994-11-29 | 2000-02-22 | Herz; Frederick S. M. | System for customized electronic identification of desirable objects |
US5729656A (en) | 1994-11-30 | 1998-03-17 | International Business Machines Corporation | Reduction of search space in speech recognition using phone boundaries and phone ranking |
US5638487A (en) * | 1994-12-30 | 1997-06-10 | Purespeech, Inc. | Automatic speech recognition |
US5715367A (en) | 1995-01-23 | 1998-02-03 | Dragon Systems, Inc. | Apparatuses and methods for developing and using models for speech recognition |
US5684924A (en) | 1995-05-19 | 1997-11-04 | Kurzweil Applied Intelligence, Inc. | User adaptable speech recognition system |
EP0834139A4 (en) * | 1995-06-07 | 1998-08-05 | Int Language Engineering Corp | Machine assisted translation tools |
US6046840A (en) * | 1995-06-19 | 2000-04-04 | Reflectivity, Inc. | Double substrate reflective spatial light modulator with self-limiting micro-mechanical elements |
US5559875A (en) | 1995-07-31 | 1996-09-24 | Latitude Communications | Method and apparatus for recording and retrieval of audio conferences |
US6151598A (en) * | 1995-08-14 | 2000-11-21 | Shaw; Venson M. | Digital dictionary with a communication system for the creating, updating, editing, storing, maintaining, referencing, and managing the digital dictionary |
US5963940A (en) | 1995-08-16 | 1999-10-05 | Syracuse University | Natural language information retrieval system and method |
US6026388A (en) | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
US6006221A (en) | 1995-08-16 | 1999-12-21 | Syracuse University | Multilingual document retrieval system and method using semantic vector matching |
US5757536A (en) * | 1995-08-30 | 1998-05-26 | Sandia Corporation | Electrically-programmable diffraction grating |
US5742419A (en) * | 1995-11-07 | 1998-04-21 | The Board Of Trustees Of The Leland Stanford Junior Universtiy | Miniature scanning confocal microscope |
US5999306A (en) * | 1995-12-01 | 1999-12-07 | Seiko Epson Corporation | Method of manufacturing spatial light modulator and electronic device employing it |
JPH09269931A (en) * | 1996-01-30 | 1997-10-14 | Canon Inc | Cooperative work environment constructing system, its method and medium |
EP0823112B1 (en) * | 1996-02-27 | 2002-05-02 | Koninklijke Philips Electronics N.V. | Method and apparatus for automatic speech segmentation into phoneme-like units |
US5862259A (en) | 1996-03-27 | 1999-01-19 | Caere Corporation | Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation |
US6024571A (en) | 1996-04-25 | 2000-02-15 | Renegar; Janet Elaine | Foreign language communication system/device and learning aid |
US5778187A (en) | 1996-05-09 | 1998-07-07 | Netcast Communications Corp. | Multicasting method and apparatus |
US5996022A (en) * | 1996-06-03 | 1999-11-30 | Webtv Networks, Inc. | Transcoding data in a proxy computer prior to transmitting the audio data to a client |
US5806032A (en) * | 1996-06-14 | 1998-09-08 | Lucent Technologies Inc. | Compilation of weighted finite-state transducers from decision trees |
US5835908A (en) * | 1996-11-19 | 1998-11-10 | Microsoft Corporation | Processing multiple database transactions in the same process to reduce process overhead and redundant retrieval from database servers |
US6169789B1 (en) * | 1996-12-16 | 2001-01-02 | Sanjay K. Rao | Intelligent keyboard system |
US5897614A (en) * | 1996-12-20 | 1999-04-27 | International Business Machines Corporation | Method and apparatus for sibilant classification in a speech recognition system |
US6732183B1 (en) | 1996-12-31 | 2004-05-04 | Broadware Technologies, Inc. | Video and audio streaming for multiple users |
US6185531B1 (en) * | 1997-01-09 | 2001-02-06 | Gte Internetworking Incorporated | Topic indexing method |
US6807570B1 (en) * | 1997-01-21 | 2004-10-19 | International Business Machines Corporation | Pre-loading of web pages corresponding to designated links in HTML |
US6088669A (en) * | 1997-01-28 | 2000-07-11 | International Business Machines, Corporation | Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling |
JP2991287B2 (en) | 1997-01-28 | 1999-12-20 | 日本電気株式会社 | Suppression standard pattern selection type speaker recognition device |
US6029124A (en) * | 1997-02-21 | 2000-02-22 | Dragon Systems, Inc. | Sequential, nonparametric speech recognition and speaker identification |
US6024751A (en) * | 1997-04-11 | 2000-02-15 | Coherent Inc. | Method and apparatus for transurethral resection of the prostate |
CA2286935C (en) * | 1997-05-28 | 2001-02-27 | Shinar Linguistic Technologies Inc. | Translation system |
US6567980B1 (en) * | 1997-08-14 | 2003-05-20 | Virage, Inc. | Video cataloger system with hyperlinked output |
US6360234B2 (en) * | 1997-08-14 | 2002-03-19 | Virage, Inc. | Video cataloger system with synchronized encoders |
US6463444B1 (en) * | 1997-08-14 | 2002-10-08 | Virage, Inc. | Video cataloger system with extensibility |
US6052657A (en) * | 1997-09-09 | 2000-04-18 | Dragon Systems, Inc. | Text segmentation and identification of topic using language models |
US6317716B1 (en) | 1997-09-19 | 2001-11-13 | Massachusetts Institute Of Technology | Automatic cueing of speech |
CA2271745A1 (en) | 1997-10-01 | 1999-04-08 | Pierre David Wellner | Method and apparatus for storing and retrieving labeled interval data for multimedia recordings |
US6961954B1 (en) * | 1997-10-27 | 2005-11-01 | The Mitre Corporation | Automated segmentation, information extraction, summarization, and presentation of broadcast news |
US6064963A (en) | 1997-12-17 | 2000-05-16 | Opus Telecom, L.L.C. | Automatic key word or phrase speech recognition for the corrections industry |
JP4183311B2 (en) | 1997-12-22 | 2008-11-19 | 株式会社リコー | Document annotation method, annotation device, and recording medium |
US5970473A (en) | 1997-12-31 | 1999-10-19 | At&T Corp. | Video communication device providing in-home catalog services |
SE511584C2 (en) | 1998-01-15 | 1999-10-25 | Ericsson Telefon Ab L M | information Routing |
US6327343B1 (en) | 1998-01-16 | 2001-12-04 | International Business Machines Corporation | System and methods for automatic call and data transfer processing |
JP3181548B2 (en) | 1998-02-03 | 2001-07-03 | 富士通株式会社 | Information retrieval apparatus and information retrieval method |
US6073096A (en) * | 1998-02-04 | 2000-06-06 | International Business Machines Corporation | Speaker adaptation system and method based on class-specific pre-clustering training speakers |
US7257528B1 (en) * | 1998-02-13 | 2007-08-14 | Zi Corporation Of Canada, Inc. | Method and apparatus for Chinese character text input |
US6361326B1 (en) * | 1998-02-20 | 2002-03-26 | George Mason University | System for instruction thinking skills |
US6381640B1 (en) | 1998-09-11 | 2002-04-30 | Genesys Telecommunications Laboratories, Inc. | Method and apparatus for automated personalization and presentation of workload assignments to agents within a multimedia communication center |
US6112172A (en) | 1998-03-31 | 2000-08-29 | Dragon Systems, Inc. | Interactive searching |
CN1159662C (en) * | 1998-05-13 | 2004-07-28 | 国际商业机器公司 | Automatic punctuating for continuous speech recognition |
US6076053A (en) | 1998-05-21 | 2000-06-13 | Lucent Technologies Inc. | Methods and apparatus for discriminative training and adaptation of pronunciation networks |
US6243680B1 (en) * | 1998-06-15 | 2001-06-05 | Nortel Networks Limited | Method and apparatus for obtaining a transcription of phrases through text and spoken utterances |
US6067514A (en) * | 1998-06-23 | 2000-05-23 | International Business Machines Corporation | Method for automatically punctuating a speech utterance in a continuous speech recognition system |
US6341330B1 (en) * | 1998-07-27 | 2002-01-22 | Oak Technology, Inc. | Method and system for caching a selected viewing angle in a DVD environment |
US6233389B1 (en) * | 1998-07-30 | 2001-05-15 | Tivo, Inc. | Multimedia time warping system |
US6246983B1 (en) * | 1998-08-05 | 2001-06-12 | Matsushita Electric Corporation Of America | Text-to-speech e-mail reader with multi-modal reply processor |
US6373985B1 (en) | 1998-08-12 | 2002-04-16 | Lucent Technologies, Inc. | E-mail signature block analysis |
US6038058A (en) * | 1998-10-15 | 2000-03-14 | Memsolutions, Inc. | Grid-actuated charge controlled mirror and method of addressing the same |
US6347295B1 (en) * | 1998-10-26 | 2002-02-12 | Compaq Computer Corporation | Computer method and apparatus for grapheme-to-phoneme rule-set-generation |
US6332139B1 (en) * | 1998-11-09 | 2001-12-18 | Mega Chips Corporation | Information communication system |
US6292772B1 (en) * | 1998-12-01 | 2001-09-18 | Justsystem Corporation | Method for identifying the language of individual words |
JP3252282B2 (en) * | 1998-12-17 | 2002-02-04 | 松下電器産業株式会社 | Method and apparatus for searching scene |
US6654735B1 (en) | 1999-01-08 | 2003-11-25 | International Business Machines Corporation | Outbound information analysis for generating user interest profiles and improving user productivity |
US6253179B1 (en) | 1999-01-29 | 2001-06-26 | International Business Machines Corporation | Method and apparatus for multi-environment speaker verification |
DE19912405A1 (en) | 1999-03-19 | 2000-09-21 | Philips Corp Intellectual Pty | Determination of a regression class tree structure for speech recognizers |
DE60038674T2 (en) | 1999-03-30 | 2009-06-10 | TiVo, Inc., Alviso | DATA STORAGE MANAGEMENT AND PROGRAM FLOW SYSTEM |
US6345252B1 (en) | 1999-04-09 | 2002-02-05 | International Business Machines Corporation | Methods and apparatus for retrieving audio information using content and speaker information |
US6434520B1 (en) | 1999-04-16 | 2002-08-13 | International Business Machines Corporation | System and method for indexing and querying audio archives |
US6338033B1 (en) * | 1999-04-20 | 2002-01-08 | Alis Technologies, Inc. | System and method for network-based teletranslation from one natural language to another |
US6711585B1 (en) * | 1999-06-15 | 2004-03-23 | Kanisa Inc. | System and method for implementing a knowledge management system |
US6219640B1 (en) * | 1999-08-06 | 2001-04-17 | International Business Machines Corporation | Methods and apparatus for audio-visual speaker recognition and utterance verification |
IE990799A1 (en) | 1999-08-20 | 2001-03-07 | Digitake Software Systems Ltd | "An audio processing system" |
JP3232289B2 (en) * | 1999-08-30 | 2001-11-26 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Symbol insertion device and method |
US6480826B2 (en) | 1999-08-31 | 2002-11-12 | Accenture Llp | System and method for a telephonic emotion detection that provides operator feedback |
US6711541B1 (en) * | 1999-09-07 | 2004-03-23 | Matsushita Electric Industrial Co., Ltd. | Technique for developing discriminative sound units for speech recognition and allophone modeling |
US6624826B1 (en) | 1999-09-28 | 2003-09-23 | Ricoh Co., Ltd. | Method and apparatus for generating visual representations for audio documents |
US6396619B1 (en) * | 2000-01-28 | 2002-05-28 | Reflectivity, Inc. | Deflectable spatial light modulator having stopping mechanisms |
US7412643B1 (en) * | 1999-11-23 | 2008-08-12 | International Business Machines Corporation | Method and apparatus for linking representation and realization data |
US6571208B1 (en) * | 1999-11-29 | 2003-05-27 | Matsushita Electric Industrial Co., Ltd. | Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training |
US20020071169A1 (en) * | 2000-02-01 | 2002-06-13 | Bowers John Edward | Micro-electro-mechanical-system (MEMS) mirror device |
ATE336776T1 (en) * | 2000-02-25 | 2006-09-15 | Koninkl Philips Electronics Nv | DEVICE FOR SPEECH RECOGNITION WITH REFERENCE TRANSFORMATION MEANS |
US7197694B2 (en) * | 2000-03-21 | 2007-03-27 | Oki Electric Industry Co., Ltd. | Image display system, image registration terminal device and image reading terminal device used in the image display system |
US7120575B2 (en) * | 2000-04-08 | 2006-10-10 | International Business Machines Corporation | Method and system for the automatic segmentation of an audio stream into semantic or syntactic units |
EP1148505A3 (en) * | 2000-04-21 | 2002-03-27 | Matsushita Electric Industrial Co., Ltd. | Data playback apparatus |
WO2001082111A2 (en) * | 2000-04-24 | 2001-11-01 | Microsoft Corporation | Computer-aided reading system and method with cross-language reading wizard |
US7107204B1 (en) * | 2000-04-24 | 2006-09-12 | Microsoft Corporation | Computer-aided writing system and method with cross-language writing wizard |
US6388661B1 (en) * | 2000-05-03 | 2002-05-14 | Reflectivity, Inc. | Monochrome and color digital display systems and methods |
US6505153B1 (en) | 2000-05-22 | 2003-01-07 | Compaq Information Technologies Group, L.P. | Efficient method for producing off-line closed captions |
US6748356B1 (en) * | 2000-06-07 | 2004-06-08 | International Business Machines Corporation | Methods and apparatus for identifying unknown speakers using a hierarchical tree structure |
US7047192B2 (en) * | 2000-06-28 | 2006-05-16 | Poirier Darrell A | Simultaneous multi-user real-time speech recognition system |
US6337760B1 (en) * | 2000-07-17 | 2002-01-08 | Reflectivity, Inc. | Encapsulated multi-directional light beam steering device |
US6931376B2 (en) * | 2000-07-20 | 2005-08-16 | Microsoft Corporation | Speech-related event notification system |
WO2002010887A2 (en) | 2000-07-28 | 2002-02-07 | Jan Pathuel | Method and system of securing data and systems |
US20020059204A1 (en) | 2000-07-28 | 2002-05-16 | Harris Larry R. | Distributed search system and method |
US7155061B2 (en) * | 2000-08-22 | 2006-12-26 | Microsoft Corporation | Method and system for searching for words and phrases in active and stored ink word documents |
WO2002019147A1 (en) | 2000-08-28 | 2002-03-07 | Emotion, Inc. | Method and apparatus for digital media management, retrieval, and collaboration |
US6604110B1 (en) * | 2000-08-31 | 2003-08-05 | Ascential Software, Inc. | Automated software code generation from a metadata-based repository |
US6647383B1 (en) | 2000-09-01 | 2003-11-11 | Lucent Technologies Inc. | System and method for providing interactive dialogue and iterative search functions to find information |
US7075671B1 (en) * | 2000-09-14 | 2006-07-11 | International Business Machines Corp. | System and method for providing a printing capability for a transcription service or multimedia presentation |
WO2002029614A1 (en) | 2000-09-30 | 2002-04-11 | Intel Corporation | Method and system to scale down a decision tree-based hidden markov model (hmm) for speech recognition |
WO2002029612A1 (en) | 2000-09-30 | 2002-04-11 | Intel Corporation | Method and system for generating and searching an optimal maximum likelihood decision tree for hidden markov model (hmm) based speech recognition |
US6431714B1 (en) * | 2000-10-10 | 2002-08-13 | Nippon Telegraph And Telephone Corporation | Micro-mirror apparatus and production method therefor |
US6934756B2 (en) | 2000-11-01 | 2005-08-23 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
US20050060162A1 (en) | 2000-11-10 | 2005-03-17 | Farhad Mohit | Systems and methods for automatic identification and hyperlinking of words or other data items and for information retrieval using hyperlinked words or data items |
US6574026B2 (en) * | 2000-12-07 | 2003-06-03 | Agere Systems Inc. | Magnetically-packaged optical MEMs device |
SG98440A1 (en) | 2001-01-16 | 2003-09-19 | Reuters Ltd | Method and apparatus for a financial database structure |
US6944272B1 (en) * | 2001-01-16 | 2005-09-13 | Interactive Intelligence, Inc. | Method and system for administering multiple messages over a public switched telephone network |
US6714911B2 (en) * | 2001-01-25 | 2004-03-30 | Harcourt Assessment, Inc. | Speech transcription and analysis system and method |
US6429033B1 (en) * | 2001-02-20 | 2002-08-06 | Nayna Networks, Inc. | Process for manufacturing mirror devices using semiconductor technology |
US20020133477A1 (en) * | 2001-03-05 | 2002-09-19 | Glenn Abel | Method for profile-based notice and broadcast of multimedia content |
US6732095B1 (en) * | 2001-04-13 | 2004-05-04 | Siebel Systems, Inc. | Method and apparatus for mapping between XML and relational representations |
WO2002086737A1 (en) * | 2001-04-20 | 2002-10-31 | Wordsniffer, Inc. | Method and apparatus for integrated, user-directed web site text translation |
US6820055B2 (en) * | 2001-04-26 | 2004-11-16 | Speche Communications | Systems and methods for automated audio transcription, translation, and transfer with text display software for manipulating the text |
US7035804B2 (en) * | 2001-04-26 | 2006-04-25 | Stenograph, L.L.C. | Systems and methods for automated audio transcription, translation, and transfer |
US6895376B2 (en) * | 2001-05-04 | 2005-05-17 | Matsushita Electric Industrial Co., Ltd. | Eigenvoice re-estimation technique of acoustic models for speech recognition, speaker identification and speaker verification |
JP4369132B2 (en) * | 2001-05-10 | 2009-11-18 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Background learning of speaker voice |
US6973428B2 (en) | 2001-05-24 | 2005-12-06 | International Business Machines Corporation | System and method for searching, analyzing and displaying text transcripts of speech after imperfect speech recognition |
US20030018663A1 (en) * | 2001-05-30 | 2003-01-23 | Cornette Ranjita K. | Method and system for creating a multimedia electronic book |
US7027973B2 (en) * | 2001-07-13 | 2006-04-11 | Hewlett-Packard Development Company, L.P. | System and method for converting a standard generalized markup language in multiple languages |
US6778979B2 (en) | 2001-08-13 | 2004-08-17 | Xerox Corporation | System for automatically generating queries |
US6993473B2 (en) * | 2001-08-31 | 2006-01-31 | Equality Translation Services | Productivity tool for language translators |
US20030078973A1 (en) * | 2001-09-25 | 2003-04-24 | Przekop Michael V. | Web-enabled system and method for on-demand distribution of transcript-synchronized video/audio records of legal proceedings to collaborative workgroups |
US6748350B2 (en) | 2001-09-27 | 2004-06-08 | Intel Corporation | Method to compensate for stress between heat spreader and thermal interface material |
US20030093580A1 (en) | 2001-11-09 | 2003-05-15 | Koninklijke Philips Electronics N.V. | Method and system for information alerts |
WO2003065245A1 (en) * | 2002-01-29 | 2003-08-07 | International Business Machines Corporation | Translating method, translated sentence outputting method, recording medium, program, and computer device |
US7165024B2 (en) * | 2002-02-22 | 2007-01-16 | Nec Laboratories America, Inc. | Inferring hierarchical descriptions of a set of documents |
US7522910B2 (en) | 2002-05-31 | 2009-04-21 | Oracle International Corporation | Method and apparatus for controlling data provided to a mobile device |
US7668816B2 (en) * | 2002-06-11 | 2010-02-23 | Microsoft Corporation | Dynamically updated quick searches and strategies |
US6618702B1 (en) * | 2002-06-14 | 2003-09-09 | Mary Antoinette Kohler | Method of and device for phone-based speaker recognition |
US7131117B2 (en) | 2002-09-04 | 2006-10-31 | Sbc Properties, L.P. | Method and system for automating the analysis of word frequencies |
US6999918B2 (en) | 2002-09-20 | 2006-02-14 | Motorola, Inc. | Method and apparatus to facilitate correlating symbols to sounds |
EP1422692A3 (en) | 2002-11-22 | 2004-07-14 | ScanSoft, Inc. | Automatic insertion of non-verbalized punctuation in speech recognition |
US7627817B2 (en) * | 2003-02-21 | 2009-12-01 | Motionpoint Corporation | Analyzing web site for translation |
US8464150B2 (en) * | 2008-06-07 | 2013-06-11 | Apple Inc. | Automatic language identification for dynamic text processing |
- 2003
- 2003-07-02 US US10/610,574 patent/US20040006748A1/en not_active Abandoned
- 2003-07-02 US US10/610,532 patent/US20040006481A1/en not_active Abandoned
- 2003-07-02 US US10/610,679 patent/US20040024598A1/en not_active Abandoned
- 2003-07-02 US US10/611,106 patent/US7337115B2/en active Active
- 2003-07-02 US US10/610,684 patent/US20040024582A1/en not_active Abandoned
- 2003-07-02 US US10/610,699 patent/US20040117188A1/en not_active Abandoned
- 2003-07-02 US US10/610,799 patent/US20040199495A1/en not_active Abandoned
- 2003-07-02 US US10/610,696 patent/US20040024585A1/en not_active Abandoned
- 2003-07-02 US US10/610,697 patent/US7290207B2/en not_active Expired - Fee Related
- 2003-07-02 US US10/610,533 patent/US7801838B2/en not_active Expired - Fee Related
- 2010
- 2010-08-13 US US12/806,465 patent/US8001066B2/en not_active Expired - Fee Related
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5835667A (en) * | 1994-10-14 | 1998-11-10 | Carnegie Mellon University | Method and apparatus for creating a searchable digital video library and a system and method of using such a library |
US6332147B1 (en) * | 1995-11-03 | 2001-12-18 | Xerox Corporation | Computer controlled display system using a graphical replay device to control playback of temporal data representing collaborative activities |
US5960447A (en) * | 1995-11-13 | 1999-09-28 | Holt; Douglas | Word tagging and editing system for speech recognition |
US6067517A (en) * | 1996-02-02 | 2000-05-23 | International Business Machines Corporation | Transcription of speech data with segments from acoustically dissimilar environments |
US6161087A (en) * | 1998-10-05 | 2000-12-12 | Lernout & Hauspie Speech Products N.V. | Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording |
US6360237B1 (en) * | 1998-10-05 | 2002-03-19 | Lernout & Hauspie Speech Products N.V. | Method and system for performing text edits during audio recording playback |
US6792409B2 (en) * | 1999-12-20 | 2004-09-14 | Koninklijke Philips Electronics N.V. | Synchronous reproduction in a speech recognition system |
US6708148B2 (en) * | 2001-10-12 | 2004-03-16 | Koninklijke Philips Electronics N.V. | Correction device to mark parts of a recognized text |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7844684B2 (en) | 2004-03-19 | 2010-11-30 | Media Captioning Services, Inc. | Live media captioning subscription framework for mobile devices |
US8266313B2 (en) | 2004-03-19 | 2012-09-11 | Media Captioning Services, Inc. | Live media subscription framework for mobile devices |
US20110035218A1 (en) * | 2004-03-19 | 2011-02-10 | Media Captioning Services | Live Media Captioning Subscription Framework for Mobile Devices |
US7421477B2 (en) | 2004-03-19 | 2008-09-02 | Media Captioning Services | Real-time media captioning subscription framework for mobile devices |
US20080293443A1 (en) * | 2004-03-19 | 2008-11-27 | Media Captioning Services | Live media subscription framework for mobile devices |
US20080294434A1 (en) * | 2004-03-19 | 2008-11-27 | Media Captioning Services | Live Media Captioning Subscription Framework for Mobile Devices |
US20050210511A1 (en) * | 2004-03-19 | 2005-09-22 | Pettinato Richard F | Real-time media captioning subscription framework for mobile devices |
US8285819B2 (en) | 2004-03-19 | 2012-10-09 | Media Captioning Services | Live media captioning subscription framework for mobile devices |
US20050210516A1 (en) * | 2004-03-19 | 2005-09-22 | Pettinato Richard F | Real-time captioning framework for mobile devices |
US8014765B2 (en) * | 2004-03-19 | 2011-09-06 | Media Captioning Services | Real-time captioning framework for mobile devices |
US20080195370A1 (en) * | 2005-08-26 | 2008-08-14 | Koninklijke Philips Electronics, N.V. | System and Method For Synchronizing Sound and Manually Transcribed Text |
US8560327B2 (en) | 2005-08-26 | 2013-10-15 | Nuance Communications, Inc. | System and method for synchronizing sound and manually transcribed text |
US8924216B2 (en) | 2005-08-26 | 2014-12-30 | Nuance Communications, Inc. | System and method for synchronizing sound and manually transcribed text |
US8249870B2 (en) * | 2008-11-12 | 2012-08-21 | Massachusetts Institute Of Technology | Semi-automatic speech transcription |
US20100121637A1 (en) * | 2008-11-12 | 2010-05-13 | Massachusetts Institute Of Technology | Semi-Automatic Speech Transcription |
US8135333B2 (en) * | 2008-12-23 | 2012-03-13 | Motorola Solutions, Inc. | Distributing a broadband resource locator over a narrowband audio stream |
US20100159973A1 (en) * | 2008-12-23 | 2010-06-24 | Motorola, Inc. | Distributing a broadband resource locator over a narrowband audio stream |
US20150279354A1 (en) * | 2010-05-19 | 2015-10-01 | Google Inc. | Personalization and Latency Reduction for Voice-Activated Commands |
US9330188B1 (en) | 2011-12-22 | 2016-05-03 | Amazon Technologies, Inc. | Shared browsing sessions |
US20150324377A1 (en) * | 2012-01-26 | 2015-11-12 | Amazon Technologies, Inc. | Narration of network content |
US9087024B1 (en) * | 2012-01-26 | 2015-07-21 | Amazon Technologies, Inc. | Narration of network content |
US9195750B2 (en) | 2012-01-26 | 2015-11-24 | Amazon Technologies, Inc. | Remote browsing and searching |
US9336321B1 (en) | 2012-01-26 | 2016-05-10 | Amazon Technologies, Inc. | Remote browsing and searching |
US9898542B2 (en) * | 2012-01-26 | 2018-02-20 | Amazon Technologies, Inc. | Narration of network content |
US20150006174A1 (en) * | 2012-02-03 | 2015-01-01 | Sony Corporation | Information processing device, information processing method and program |
US10339955B2 (en) * | 2012-02-03 | 2019-07-02 | Sony Corporation | Information processing device and method for displaying subtitle information |
US9786283B2 (en) * | 2012-03-30 | 2017-10-10 | Jpal Limited | Transcription of speech |
US20150066505A1 (en) * | 2012-03-30 | 2015-03-05 | Jpal Limited | Transcription of Speech |
GB2502944A (en) * | 2012-03-30 | 2013-12-18 | Jpal Ltd | Segmentation and transcription of speech |
US8775175B1 (en) * | 2012-06-01 | 2014-07-08 | Google Inc. | Performing dictation correction |
US8676590B1 (en) * | 2012-09-26 | 2014-03-18 | Google Inc. | Web-based audio transcription tool |
US9773499B2 (en) | 2014-06-18 | 2017-09-26 | Google Inc. | Entity name recognition based on entity type |
US9772816B1 (en) * | 2014-12-22 | 2017-09-26 | Google Inc. | Transcription and tagging system |
US10354647B2 (en) | 2015-04-28 | 2019-07-16 | Google Llc | Correcting voice recognition using selective re-speak |
EP3432560A1 (en) * | 2017-07-20 | 2019-01-23 | Dialogtech Inc. | System, method, and computer program product for automatically analyzing and categorizing phone calls |
Also Published As
Publication number | Publication date |
---|---|
US20040006737A1 (en) | 2004-01-08 |
US7337115B2 (en) | 2008-02-26 |
US20040117188A1 (en) | 2004-06-17 |
US20110004576A1 (en) | 2011-01-06 |
US20040030550A1 (en) | 2004-02-12 |
US20040006576A1 (en) | 2004-01-08 |
US20040199495A1 (en) | 2004-10-07 |
US20040006748A1 (en) | 2004-01-08 |
US8001066B2 (en) | 2011-08-16 |
US20040024582A1 (en) | 2004-02-05 |
US7801838B2 (en) | 2010-09-21 |
US20040024585A1 (en) | 2004-02-05 |
US7290207B2 (en) | 2007-10-30 |
US20040024598A1 (en) | 2004-02-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20040006481A1 (en) | Fast transcription of speech | |
US20040138894A1 (en) | Speech transcription tool for efficient speech transcription | |
US5583965A (en) | Methods and apparatus for training and operating voice recognition systems | |
US8150687B2 (en) | Recognizing speech, and processing data | |
US6161087A (en) | Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording | |
US6704709B1 (en) | System and method for improving the accuracy of a speech recognition program | |
US6535848B1 (en) | Method and apparatus for transcribing multiple files into a single document | |
US6332122B1 (en) | Transcription system for multiple speakers, using and establishing identification | |
US6839667B2 (en) | Method of speech recognition by presenting N-best word candidates | |
US7962331B2 (en) | System and method for tuning and testing in a speech recognition system | |
US8812314B2 (en) | Method of and system for improving accuracy in a speech recognition system | |
KR101213835B1 (en) | Verb error recovery in speech recognition | |
US20030046071A1 (en) | Voice recognition apparatus and method | |
US20090234648A1 (en) | Speech Recogniton System, Speech Recognition Method, and Program | |
US6421643B1 (en) | Method and apparatus for directing an audio file to a speech recognition program that does not accept such files | |
GB2362745A (en) | Transcription of text from computer voice mail | |
JP2007519987A (en) | Integrated analysis system and method for internal and external audiovisual data | |
JP2006301223A (en) | System and program for speech recognition | |
US20030072463A1 (en) | Sound-activated song selection broadcasting apparatus | |
US6745165B2 (en) | Method and apparatus for recognizing from here to here voice command structures in a finite grammar speech recognition system | |
US7010485B1 (en) | Method and system of audio file searching | |
US20040098266A1 (en) | Personal speech font | |
JP3279684B2 (en) | Voice interface builder system | |
EP1079313A2 (en) | An audio processing system | |
US20080167879A1 (en) | Speech delimiting processing system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: BBNT SOLUTIONS LLC, TEXAS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KIECZA, DANIEL;KUBALA, FRANCIS G.;REEL/FRAME:014267/0968;SIGNING DATES FROM 20030625 TO 20030626
|
AS | Assignment |
Owner name: FLEET NATIONAL BANK, AS AGENT, MASSACHUSETTS
Free format text: PATENT & TRADEMARK SECURITY AGREEMENT;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:014624/0196
Effective date: 20040326
|
AS | Assignment |
Owner name: BBN TECHNOLOGIES CORP., MASSACHUSETTS
Free format text: MERGER;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:017274/0318
Effective date: 20060103
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO
Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:BANK OF AMERICA, N.A. (SUCCESSOR BY MERGER TO FLEET NATIONAL BANK);REEL/FRAME:023427/0436
Effective date: 20091026