US20130030789A1 - Universal Language Translator - Google Patents

Universal Language Translator

Info

Publication number
US20130030789A1
Authority
US
United States
Prior art keywords
speaker
dialect
voice
translator
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/559,346
Inventor
Reginald Dalce
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US13/559,346
Publication of US20130030789A1
Priority to US14/926,698 (US9864745B2)
Legal status: Abandoned

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/005: Language recognition
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033: Voice editing, e.g. manipulating the voice of the synthesiser

Definitions

  • the problem of different people being unable to understand one another is as old as the legend of the Tower of Babel.
  • a first person who speaks a first language wishes to communicate with a second person who speaks a second language
  • some sort of translator needs to be used.
  • the translator is a human translator who understands both languages very well, and can translate cultural nuances from one language to another.
  • many people do not have access to a human translator, and must rely upon machine translations.
  • U.S. Pat. No. 4,882,681 to Brotz teaches a handheld oral translating device which will listen to a phrase in one language, and will output an oral translation of the phrase through a machine speaker in the device.
  • Brotz's device provides simultaneous translation of a conversation between a user of the device that speaks a first language, and another user of the device that speaks a second language, seamlessly performing both reception of a phrase and transmission of a translated phrase.
  • Brotz's device uses machine-synthesized phonemes to produce the translated phrase, which sounds artificial and stilted to a normal user.
  • Li's device fails to account for a speaker that is moving about a room, and requires a locked-on speaker to remain in one place for the duration of the translation session.
  • Other speech translation devices such as EP1464048 to Palmquist have the same issue.
  • the inventive subject matter provides apparatus, systems and methods in which one can use a translator device to automatically translate a phrase spoken by a user into a specific dialect that mimics properties of the user's voice.
  • computing devices comprise a processor configured to execute software instructions stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.).
  • the software instructions preferably configure the computing device to provide the roles, responsibilities, or other functionality as discussed below with respect to the disclosed apparatus.
  • the various servers, systems, databases, or interfaces exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods.
  • Data exchanges preferably are conducted over a packet-switched network, the Internet, LAN, WAN, VPN, or other type of packet switched network.
  • the device could have a voice recognition module that detects phonetic elements of a speaker's voice in order to help mimic the user's voice.
  • a voice recognition module records and analyzes an effective amount of words or phrases spoken by the speaker in order to synthesize words with the speaker's mannerisms in accordance with the speaker's auditory attributes.
  • the voice recognition module could detect a speaker's pitch, speed of talking, intonation, and/or average audio frequency, and could synthesize speech that emulates one or more of such auditory attributes.
  • a speaker could “train” the voice recognition module by reciting key sentences, words, phrases, or phonemes into a microphone on the device.
  • the voice recognition module could continuously record auditory attributes of the speaker's voice while in use, continuously modifying the synthesized voice over time such that the synthesized voice grows closer to the speaker's voice the longer the speaker uses the device.
  • the device automatically synthesizes the translated phrase using the auditory attributes of the speaker's voice.
  • a user interface could be presented whereupon a user could select one of a plurality of speakers to emulate.
  • the voice recognition module also preferably detects the speaking language and/or the speaking dialect of the speaker.
  • a dialect is a method of pronouncing a language that is specific to a culture or a region.
  • dialects of English include Southern, Bostonian, British, Australian, and South African while dialects of Chinese include Mandarin, Cantonese, and Shanghainese.
  • a dialect may also contain slang words that are specific to a region that are not used in other areas.
  • a user interface to the voice recognition module may be provided to allow a user to manually select a speaker dialect. This is particularly useful when a speaker's dialect may be similar to another dialect of the same language.
  • the device also preferably has a voice filter module that filters out ambient sounds generated by non-speaker objects.
  • a “non-speaker object” is any object that generates sounds that is not the speaker.
  • for example, a second speaker, an engine, wind, music, or a fan.
  • Sounds generated by non-speaker objects could be filtered from an audio recording through any number of methods.
  • the device could detect a region that a speaker is located in and could then filter out ambient sounds generated outside the speaker's region.
  • the region could be a static region that is defined by a user of the device, or the region could move with the speaker as the device tracks the speaker's location.
  • the device could also “lock onto” the speaker by tracking a speaker's voice signature, and could filter out ambient sounds that do not share attributes of the voice signature. In this manner, a recording could be made by the device that only records words and phrases made by the speaker.
  • the device could also have a dialect detector which will automatically select a dialect to translate the speaker's words/phrases into based upon a location of the device, based upon a detected dialect of a listener, or based upon a user's selection of a target dialect.
  • the location of the device could be obtained in a multitude of ways, for example by triangulating the user's position based upon wireless signals such as GPS signals, by communicating with a triangulation device, by a user selection of the location, or by receiving a location signal from a wireless service such as Wi-Fi or Bluetooth®.
  • the device could also select a target dialect by recording a portion of speech spoken by a listener, and by detecting the dialect of the listener.
  • the speaker's words are translated by a translation module that translates a word or phrase spoken by the speaker into a corresponding word or phrase of a different language—typically the dialect selected by the dialect detector.
  • a preferred device contains at least 100, 300, 500, or even 1000 languages, and contains at least 500, 1000, 5000, or even 10000 different dialects to translate phrases to and from.
  • the device first locks onto a speaker's voice and filters out ambient sounds generated by non-speaker objects to obtain a “pure” stream of words spoken by the locked-on speaker.
  • the dialect detector module could then automatically select a dialect based upon a location of the device, translating the speaker's words into a local dialect, and synthesizing a translated phrase that mimics the auditory attributes of the speaker's voice, such that it sounds like the speaker is actually talking in the local dialect. This is particularly useful for travelers who are roaming in remote areas where multiple local dialects are spoken, who need to communicate with local citizens who may not speak the traveler's language.
  • the device is preferably a handheld translator with one or more microphones, speakers, processors, memories, and user interfaces that allow two users to converse with one another while speaking different languages.
  • the device preferably translates each user's word instantaneously as the user speaks, but in a common embodiment, the device translates each user's word or phrase approximately 2-4 seconds after each user speaks.
  • the device may delay until a speaker finishes his/her entire phrase in order to reword the translation with proper grammar and syntax, and may require user interaction to signal that the speaker has finished a phrase.
  • This signal could be tangible, such as a button that a speaker presses or depresses when the speaker has finished uttering a phrase, or could be auditory, such as a verbal command, “translate” or “over.”
  • the universal translator could be a software that is installable onto a computer system.
  • Such software could be installed onto a computer system to translate auditory signals from another program, for example Skype® or iChat™. In this manner, users who speak different languages could communicate through a computer audio or video chat program that otherwise would not provide such translation services.
  • any computer system that the software is installed upon is internet-accessible, or at least is accessible to some wired or wireless network.
  • the software could also be installed as an application on a smartphone, such as an Android™ system or an iPhone®, or could be installed in a radio system to allow a car that travels from one country to another to translate a foreign-speaking radio station into the driver's language.
  • the software could be provided as a standard feature in any rental car for travelers from other countries.
  • the dialect detector is likely disabled and the target dialect or target language is selected by the driver.
  • FIG. 1 is a schematic of an exemplary universal translator translating dialogue between two persons
  • FIGS. 2A-2D are exemplary user interfaces in accordance with one embodiment of the invention.
  • FIG. 3 is an exemplary use of a universal language translator applied to a telephone
  • FIG. 4 is an exemplary use of a universal language translator applied to an audio/visual media.
  • the disclosed techniques provide many advantageous technical effects including automatically translating language spoken by two people by translating each person's spoken words into a local dialect of the other person by emulating each person's speech. Since the device also allows a person to “lock onto” a specific speaker, this allows such conversations to take place in noisy areas with multiple speakers, and might even allow a user to listen in on another person's conversations during possible covert operations.
  • inventive subject matter is considered to include all possible combinations of the disclosed elements.
  • inventive subject matter is also considered to include other remaining combinations of A, B, C, or D, even if not explicitly disclosed.
  • an exemplary device 100 has directional microphones 112 and 114, auditory outputs 122 and 124, voice recognition module 130, voice filter module 140, dialect detector 150, and translation module 160.
  • Device 100 is activated to translate speech between speaker 102 and speaker 106.
  • Speakers 104 and 108 are sitting to either side of speaker 106.
  • directional microphone 112 is aimed towards speaker 102 and directional microphone 114 is aimed at speaker 106 so as to avoid picking up ambient noise from non-speaker objects, such as speakers 104 and 108.
  • microphone 114 is aimed at speaker 106
  • device 100 could record some words/phrases spoken by speaker 106 and lock onto that speaker.
  • voice filter module 140 could then move directional microphone 114 to follow speaker 106 , and/or could also electronically filter out ambient noise from non-speaker objects using noise-cancellation technology or other noise-filtering technology.
  • voice recognition module 130 detects auditory attributes of speaker 102 such that auditory output 124 could synthesize one or more auditory attributes of speaker 102 when the translation is output. As mentioned above, these auditory attributes could take the form of speaker 102's tone, pitch, timbre, phonemes, speed of talking, intonation, and average auditory frequency. Dialect detector 150 also preferably automatically selects a target dialect that speaker 102's words are translated into based upon the location of device 100. It is possible that the auto-selected dialect is not the specific language or dialect that speaker 106 speaks.
  • dialect detector 150 then analyzes speaker 106 's speech to determine whether speaker 106 's dialect is the same as, or different from, the auto-selected dialect based upon the location of device 100 . If speaker 106 's dialect is different, then dialect detector 150 will then auto-select the dialect spoken by speaker 106 for speaker 102 's words to be translated into. All of the translation from speaker 102 to auditory output 124 and from speaker 106 to auditory output 122 is handled by translation module 160 .
  • Exemplary user interfaces for device 100 are shown in FIGS. 2A-2D .
  • languages and dialects are preferably auto-selected by the device
  • a language and dialect could be manually selected by a user of the device. This is particularly useful when there is only one-way communication, such as when a user is translating input from a radio or from a loudspeaker. Such selections could occur for just one speaker, or both speakers who are using the system.
  • a user of the device selects which speaker to listen to, when multiple voices are detected.
  • there are four speakers: Speaker 1, Speaker 2, Speaker 3, and Speaker 4.
  • Speaker 3 has already been recognized by the system as a speaker that has interacted with the system before, and so the name for Speaker 3 has been auto-populated to “Darcy,” as that speaker's auditory attributes and mannerisms have been saved by the system.
  • the user could first listen to each speaker's voice by pressing the “sound” symbol next to each speaker's name, and then when the user has selected the correct speaker, the device could then lock onto the chosen speaker and filter out all non-speaker sounds when translating.
  • In FIGS. 2C and 2D, an exemplary embodiment is shown where the device requires user input to inform the device when each speaker has started speaking, or has stopped speaking. This is particularly useful when translating highly disparate languages whose words need to be re-ordered into a completely different syntax in order to be intelligible in another language.
  • speaker 1 touches the “Start speaking” button, the name of Speaker 1 is then highlighted, and the “Start speaking” button then changes to a “Finished speaking” button.
  • the system translates the phrase spoken by speaker 1 into the language and/or dialect of Speaker 2 , and Speaker 2 can then respond after pressing the “Start speaking” button.
  • a universal language translator 320 is interposed between two audio input/output devices 310 and 330 .
  • Audio input/output device 310 is shown here euphemistically as a telephone wire connected to a third party on the other side, and audio input/output device 330 is shown here euphemistically as a telephone.
  • an audio feed could be received by a network connection to video conferencing software (e.g., Skype™) or an internet social network site, such as Facebook® or Gchat®, and a telephone, or a stationary (e.g., desktop computer, television, gaming console, etc.) or mobile computing device (e.g., laptop, smart phone, etc.) could be receiving the audio feed.
  • the universal language translator 320 is preferably software running on the computing device, which translates the audio feed from one party to another party and modulates that translation to sound like the voice of the speaker.
  • the universal language translator could be embedded as an application in a portable cellphone, allowing for a user to create modulated translated sentences on the fly with any user who is
  • the universal language translator 320 can be utilized with the transportation industry.
  • the universal language translator 320 could be interposed between an aviation control tower and an airplane, such that a pilot can interact with the control tower even without speaking the same language.
  • the universal language translator 320 can be at least partially incorporated into a television set, such that foreign programming can be automatically translated into the viewer's language or other desired language.
  • a party who is speaking into the universal translator can have his/her sentence analyzed, deconstructed, translated, and then reconstructed into a translated sentence that is modulated to sound like the voice of the party speaking on the other side.
  • an exemplary universal language translator 420 is interposed between an audio/visual source 410 and an audio/visual output 430 .
  • Audio/visual source 410 is shown here euphemistically as a satellite receiver, but could be any known source of audio/visual media, for example the Internet, a media player, a media diskette, or an electronic storage medium.
  • Audio/visual output 430 is shown here euphemistically as a television, but could be any known output of audio/visual media, for example a theater, a computer monitor and speakers, or a portable media-viewing device.
  • the universal language translator preferably buffers the audio/visual signal within a memory of the universal language translator by at least 10, 20, 30, 60, 180, or 360 seconds, to ensure that the universal language translator can reconstruct a modulated, translated sentence and output the translated sentence without allowing the audio feed and the video feed to run out of sync with one another.
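The buffering step described in the last item can be illustrated with a minimal Python sketch. The source states only that the audio/visual signal is held in memory for a fixed period (at least 10 to 360 seconds) so that translated audio can be output in sync with the video; the class name `DelayBuffer`, its methods, and the frame layout below are hypothetical and chosen purely for illustration.

```python
from collections import deque


class DelayBuffer:
    """Hypothetical fixed-delay buffer for timestamped audio/video frames."""

    def __init__(self, delay_s):
        self.delay_s = delay_s   # e.g. 10, 20, 30, 60, 180, or 360 seconds
        self._frames = deque()   # items are (timestamp, video_frame, audio_chunk)

    def push(self, timestamp, video_frame, audio_chunk):
        """Store an incoming frame together with its original audio."""
        self._frames.append((timestamp, video_frame, audio_chunk))

    def replace_audio(self, start, end, translated_audio):
        """Swap in translated audio for buffered frames with start <= timestamp < end."""
        self._frames = deque(
            (t, v, translated_audio if start <= t < end else a)
            for (t, v, a) in self._frames
        )

    def pop_ready(self, now):
        """Release frames that have been held for at least delay_s, oldest first."""
        ready = []
        while self._frames and now - self._frames[0][0] >= self.delay_s:
            ready.append(self._frames.popleft())
        return ready
```

Because video and audio leave the buffer together, any time spent translating a sentence is absorbed by the delay rather than appearing as drift between the two feeds, provided the translation finishes before the frame's hold time elapses.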

Abstract

A universal language translator automatically translates a spoken word or phrase between two speakers. The translator can lock onto a speaker and filter out ambient noise so as to be used in noisy environments, and to ensure accuracy of the translation when multiple speakers are present. The translator can also synthesize a speaker's voice into the dialect of the other speaker such that each speaker sounds like they're speaking the language of the other. A dialect detector could automatically select target dialects either by auto-sensing the dialect by listening to aspects of each speaker's phrases, or based upon the location of the device.
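The target-dialect selection described above could be sketched as a simple fallback chain. This is a minimal Python illustration, not the applicant's implementation: the function and parameter names are hypothetical, and the ordering shown (a manual user choice first, then the dialect detected from the listener's speech, then the dialect inferred from the device location) is an assumption about how the sources might be prioritized.

```python
def select_target_dialect(user_choice=None, listener_dialect=None, location_dialect=None):
    """Return the first available dialect in an assumed priority order.

    user_choice      -- dialect picked manually in a user interface (hypothetical input)
    listener_dialect -- dialect detected from the listener's recorded speech
    location_dialect -- dialect inferred from the device location
    """
    for candidate in (user_choice, listener_dialect, location_dialect):
        if candidate is not None:
            return candidate
    return None  # nothing known yet; the caller would have to prompt the user


# Location alone suggests Cantonese, but the listener is heard speaking Mandarin.
print(select_target_dialect(listener_dialect="Mandarin", location_dialect="Cantonese"))
```

Under this ordering, a dialect sensed from the listener overrides the location-based guess whenever the two differ.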

Description

  • This application claims the benefit of priority to U.S. provisional application having Ser. No. 61/513381 filed on Jul. 29, 2011, and U.S. provisional application having Ser. No. 61/610811 filed on Mar. 14, 2012. These and all other extrinsic materials discussed herein are incorporated by reference in their entirety. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
  • Field of the Invention
  • The field of the invention is translation devices.
  • BACKGROUND
  • The problem of different people being unable to understand one another is as old as the legend of the Tower of Babel. When a first person who speaks a first language wishes to communicate with a second person who speaks a second language, some sort of translator needs to be used. Preferably the translator is a human translator who understands both languages very well, and can translate cultural nuances from one language to another. However, many people do not have access to a human translator, and must rely upon machine translations.
  • There are many machine translators available for users with access to the internet. For example, the websites http://translate.google.com and http://babelfish.yahoo.com both provide relatively accurate machine translations of languages when a word or phrase is typed into a web user interface. However, not all users have access to a web user interface when they need a word or phrase translated from one language to another. In addition, some users might hear a word or phrase in a foreign language, but may not know how to accurately spell or type the word or phrase into a keyboard user interface.
  • U.S. Pat. No. 4,882,681 to Brotz teaches a handheld oral translating device which will listen to a phrase in one language, and will output an oral translation of the phrase through a machine speaker in the device. In this way, Brotz's device provides simultaneous translation of a conversation between a user of the device that speaks a first language, and another user of the device that speaks a second language, seamlessly performing both reception of a phrase and transmission of a translated phrase. Brotz's device, however, uses machine-synthesized phonemes to produce the translated phrase, which sounds artificial and stilted to a normal user.
  • US2006/0271370 to Li and WO2010025460 to Kent both teach oral translation devices that produce a translated phrase that mimics a user's voice. Kent's device requires a user to create a user-stored dictionary consisting of stored phones, diphones, triphones, half-syllables, words, and other basic sound units in order to construct any word for a target language. Li's device estimates and saves a speaker's speech characteristics, such as a speaker's pitch and timbre, and then uses the saved pitch and timbre to synthesize speech. Li's device even locks onto a speaker's location so as to only translate words uttered from that speaker location. Li's device, however, fails to account for a speaker that is moving about a room, and requires a locked-on speaker to remain in one place for the duration of the translation session. Other speech translation devices, such as EP1464048 to Palmquist, have the same issue.
  • There is also a need in the art to orally translate a phrase into a local dialect. Many older languages, such as Chinese, have localized to such an extent that people speaking different dialects of the same language frequently cannot even understand one another. Thus, when designing an oral translator, there is a need not only to translate phrases from one language to another, but also to translate phrases into a specific dialect of a language. US20040044517, US20080195375, and WO2010062542 to Gupta each teach devices that will output a translation into a specific dialect. AU2006201926 to Rigas teaches an oral translation device that uses a GPS to determine the dialect of a region before orally translating that dialect. None of those devices, however, output a voice while mimicking a speaker's voice.
  • There has been a long-felt need in the art for an oral translating device that mimics a user's voice into a specific regional dialect to allow for a device that will be most akin to the speaker actually speaking in that other language, yet no such device has ever been contemplated nor created.
  • Thus, there is still a need for oral translation devices which mimic a user's voice into a specific dialect.
  • SUMMARY OF THE INVENTION
  • The inventive subject matter provides apparatus, systems and methods in which one can use a translator device to automatically translate a phrase spoken by a user into a specific dialect that mimics properties of the user's voice.
  • It should be noted that while the following description is drawn to a single handheld computer device that translates speech, various alternative configurations are also deemed suitable and may employ various computing devices including servers, interfaces, systems, databases, agents, peers, engines, controllers, or other types of computing devices operating individually or collectively. One should appreciate the computing devices comprise a processor configured to execute software instructions stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.). The software instructions preferably configure the computing device to provide the roles, responsibilities, or other functionality as discussed below with respect to the disclosed apparatus. In especially preferred embodiments, the various servers, systems, databases, or interfaces exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods. Data exchanges preferably are conducted over a packet-switched network, the Internet, LAN, WAN, VPN, or other type of packet switched network.
  • The device could have a voice recognition module that detects phonetic elements of a speaker's voice in order to help mimic the user's voice. A voice recognition module records and analyzes an effective amount of words or phrases spoken by the speaker in order to synthesize words with the speaker's mannerisms in accordance with the speaker's auditory attributes. For example, the voice recognition module could detect a speaker's pitch, speed of talking, intonation, and/or average audio frequency, and could synthesize speech that emulates one or more of such auditory attributes. In an exemplary embodiment, a speaker could “train” the voice recognition module by reciting key sentences, words, phrases, or phonemes into a microphone on the device. In another embodiment, the voice recognition module could continuously record auditory attributes of the speaker's voice while in use, continuously modifying the synthesized voice over time such that the synthesized voice grows closer to the speaker's voice the longer the speaker uses the device. In a preferred embodiment, the device automatically synthesizes the translated phrase using the auditory attributes of the speaker's voice. However, since a given device might be trained to emulate the voices of many different speakers, a user interface could be presented whereupon a user could select one of a plurality of speakers to emulate.
  • The voice recognition module also preferably detects the speaking language and/or the speaking dialect of the speaker. As used herein, a dialect is a method of pronouncing a language that is specific to a culture or a region. For example, dialects of English include Southern, Bostonian, British, Australian, and South African while dialects of Chinese include Mandarin, Cantonese, and Shanghainese. A dialect may also contain slang words that are specific to a region that are not used in other areas. A user interface to the voice recognition module may be provided to allow a user to manually select a speaker dialect. This is particularly useful when a speaker's dialect may be similar to another dialect of the same language.
  • The device also preferably has a voice filter module that filters out ambient sounds generated by non-speaker objects. As used herein, a “non-speaker object” is any object that generates sounds that is not the speaker. Examples include a second speaker, an engine, wind, music, or a fan. Sounds generated by non-speaker objects could be filtered from an audio recording through any number of methods. For example, the device could detect a region that a speaker is located in and could then filter out ambient sounds generated outside the speaker's region. The region could be a static region that is defined by a user of the device, or the region could move with the speaker as the device tracks the speaker's location. The device could also “lock onto” the speaker by tracking a speaker's voice signature, and could filter out ambient sounds that do not share attributes of the voice signature. In this manner, a recording could be made by the device that only records words and phrases made by the speaker.
  • The device could also have a dialect detector which will automatically select a dialect to translate the speaker's words/phrases into based upon a location of the device, based upon a detected dialect of a listener, or based upon a user's selection of a target dialect. The location of the device could be obtained in a multitude of ways, for example by triangulating the user's position based upon wireless signals such as GPS signals, by communicating with a triangulation device, by a user selection of the location, or by receiving a location signal from a wireless service such as Wi-Fi or Bluetooth®. The device could also select a target dialect by recording a portion of speech spoken by a listener, and by detecting the dialect of the listener.
  • The speaker's words are translated by a translation module that translates a word or phrase spoken by the speaker into a corresponding word or phrase of a different language, typically the dialect selected by the dialect detector. A preferred device contains at least 100, 300, 500, or even 1000 languages, and contains at least 500, 1000, 5000, or even 10000 different dialects to translate phrases to and from. In a preferred embodiment, the device first locks onto a speaker's voice and filters out ambient sounds generated by non-speaker objects to obtain a “pure” stream of words spoken by the locked-on speaker. The dialect detector module could then automatically select a dialect based upon a location of the device, translating the speaker's words into a local dialect, and synthesizing a translated phrase that mimics the auditory attributes of the speaker's voice, such that it sounds like the speaker is actually talking in the local dialect. This is particularly useful for travelers who are roaming in remote areas where multiple local dialects are spoken, and who need to communicate with local citizens who may not speak the traveler's language.
  • The device is preferably a handheld translator with one or more microphones, speakers, processors, memories, and user interfaces that allow two users to converse with one another while speaking different languages. The device preferably translates each user's word instantaneously as the user speaks, but in a common embodiment, the device translates each user's word or phrase approximately 2-4 seconds after each user speaks. In an exemplary embodiment, the device may delay until a speaker finishes his/her entire phrase in order to reword the translation with proper grammar and syntax, and may require user interaction to signal that the speaker has finished a phrase. This signal could be tangible, such as a button that a speaker presses or depresses when the speaker has finished uttering a phrase, or could be auditory, such as a verbal command, “translate” or “over.”
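  • The auditory end-of-phrase signal described above can be sketched as a buffer that is flushed whenever a command word is recognized. The sketch assumes that speech has already been converted to a stream of recognized words. The `split_phrases` function is hypothetical, and only the command words “translate” and “over” come from the text.

```python
END_COMMANDS = {"translate", "over"}  # verbal commands named in the text


def split_phrases(words):
    """Group a stream of recognized words into phrases ended by a command word."""
    phrase = []
    for word in words:
        if word.lower() in END_COMMANDS:
            if phrase:
                # A complete phrase is ready to hand to the translation step.
                yield " ".join(phrase)
            phrase = []
        else:
            phrase.append(word)


stream = ["where", "is", "the", "station", "over", "thank", "you", "translate"]
phrases = list(split_phrases(stream))
```

  • One limitation of this simple approach is that a command word used in ordinary speech would also end the phrase, which is one reason a tangible signal such as a button may be preferable.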
  • In an alternative embodiment, the universal translator could be software that is installable onto a computer system. Such software could be installed onto a computer system to translate auditory signals from another program, for example Skype® or iChat™. In this manner, users who speak different languages could communicate through a computer audio or video chat program that otherwise would not provide such translation services. Preferably any computer system that the software is installed upon is internet-accessible, or at least is accessible to some wired or wireless network. The software could also be installed as an application on a smartphone, such as an Android™ system or an iPhone®, or could be installed in a radio system to allow a car that travels from one country to another to translate a foreign-speaking radio station into the driver's language. Or the software could be provided as a standard feature in any rental car for travelers from other countries. In such an embodiment, the dialect detector is likely disabled and the target dialect or target language is selected by the driver.
  • Unless the context dictates the contrary, all ranges set forth herein should be interpreted as being inclusive of their endpoints, and open-ended ranges should be interpreted to include commercially practical values. Similarly, all lists of values should be considered as inclusive of intermediate values unless the context indicates the contrary. As used herein, and unless the context dictates otherwise, the term “coupled to” is intended to include both direct coupling (in which two elements that are coupled to each other contact each other) and indirect coupling (in which at least one additional element is located between the two elements). Therefore, the terms “coupled to” and “coupled with” are used synonymously.
  • Various objects, features, aspects and advantages of the inventive subject matter will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like components.
  • BRIEF DESCRIPTION OF THE DRAWING
  • FIG. 1 is a schematic of an exemplary universal translator translating dialogue between two persons.
  • FIGS. 2A-2D are exemplary user interfaces in accordance with one embodiment of the invention.
  • FIG. 3 is an exemplary use of a universal language translator applied to a telephone.
  • FIG. 4 is an exemplary use of a universal language translator applied to an audio/visual media.
  • DETAILED DESCRIPTION
  • It should be noted that while the following description is drawn to a computer/server based universal translation system, various alternative configurations are also deemed suitable and may employ various computing devices including servers, interfaces, systems, databases, agents, peers, engines, controllers, or other types of computing devices operating individually or collectively. One should appreciate that the computing devices comprise a processor configured to execute software instructions stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.). The software instructions preferably configure the computing device to provide the roles, responsibilities, or other functionality as discussed below with respect to the disclosed apparatus. In especially preferred embodiments, the various servers, systems, databases, or interfaces exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods. Data exchanges preferably are conducted over a packet-switched network, the Internet, LAN, WAN, VPN, or other type of packet-switched network.
  • One should appreciate that the disclosed techniques provide many advantageous technical effects including automatically translating language spoken by two people by translating each person's spoken words into a local dialect of the other person by emulating each person's speech. Since the device also allows a person to “lock onto” a specific speaker, this allows such conversations to take place in noisy areas with multiple speakers, and might even allow a user to listen in on another person's conversations during possible covert operations.
  • The following discussion provides many example embodiments of the inventive subject matter. Although each embodiment represents a single combination of inventive elements, the inventive subject matter is considered to include all possible combinations of the disclosed elements. Thus if one embodiment comprises elements A, B, and C, and a second embodiment comprises elements B and D, then the inventive subject matter is also considered to include other remaining combinations of A, B, C, or D, even if not explicitly disclosed.
  • In FIG. 1, an exemplary device 100 has directional microphones 112 and 114, auditory outputs 122 and 124, voice recognition module 130, voice filter module 140, dialect detector 150, and translation module 160. Device 100 is activated to translate speech between speaker 102 and speaker 106. Speakers 104 and 108 are sitting to either side of speaker 106. As shown, directional microphone 112 is aimed towards speaker 102 and directional microphone 114 is aimed at speaker 106 so as to avoid picking up ambient noise from non-speaker objects, such as speakers 104 and 108. Once microphone 114 is aimed at speaker 106, device 100 could record some words/phrases spoken by speaker 106 and lock onto that speaker. When speaker 106 moves about a room, voice filter module 140 could then move directional microphone 114 to follow speaker 106, and/or could also electronically filter out ambient noise from non-speaker objects using noise-cancellation technology or other noise-filtering technology.
  • As speaker 102 speaks, voice recognition module 130 detects auditory attributes of speaker 102 such that auditory output 124 could synthesize one or more auditory attributes of speaker 102 when the translation is output. As mentioned above, these auditory attributes could take the form of speaker 102's tone, pitch, timbre, phonemes, speed of talking, intonation, and average auditory frequency. Dialect detector 150 also preferably automatically selects a target dialect that speaker 102's words are translated into based upon the location of device 100. It is possible that the auto-selected dialect is not the specific language or dialect that speaker 106 speaks. In such an occasion, when speaker 106's speech is read by directional microphone 114, dialect detector 150 then analyzes speaker 106's speech to determine whether speaker 106's dialect is the same as, or different from, the auto-selected dialect based upon the location of device 100. If speaker 106's dialect is different, then dialect detector 150 will then auto-select the dialect spoken by speaker 106 for speaker 102's words to be translated into. All of the translation from speaker 102 to auditory output 124 and from speaker 106 to auditory output 122 is handled by translation module 160.
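  • The selection logic described for dialect detector 150 can be sketched as a small decision function. The location table `LOCAL_DIALECTS` and the function `select_target_dialect` are hypothetical. The rule that a manual user selection takes precedence over both automatic sources is also an assumption made for this sketch, since the text states only that a detected listener dialect replaces a differing location-based choice.

```python
from typing import Optional

# Hypothetical location-to-dialect table; the text does not specify one.
LOCAL_DIALECTS = {"Hong Kong": "Cantonese", "Shanghai": "Shanghainese"}


def select_target_dialect(location: str,
                          listener_dialect: Optional[str] = None,
                          user_choice: Optional[str] = None) -> Optional[str]:
    """Pick the dialect that the speaker's words are translated into."""
    if user_choice is not None:
        # Assumption: a manual selection overrides automatic detection.
        return user_choice
    located = LOCAL_DIALECTS.get(location)
    if listener_dialect is not None and listener_dialect != located:
        # The listener's detected dialect replaces the location-based choice.
        return listener_dialect
    return located
```

  • For example, `select_target_dialect("Hong Kong")` yields the location-based choice, while passing `listener_dialect="Mandarin"` switches the target to the dialect the listener was actually heard speaking.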
  • Exemplary user interfaces for device 100 are shown in FIGS. 2A-2D. As shown in FIG. 2A, while languages and dialects are preferably auto-selected by the device, a language and dialect could be manually selected by a user of the device. This is particularly useful when there is only one-way communication, such as when a user is translating input from a radio or from a loudspeaker. Such selections could occur for just one speaker, or both speakers who are using the system.
  • In FIG. 2B, a user of the device selects which speaker to listen to, when multiple voices are detected. As shown, there are four speakers: Speaker 1, Speaker 2, Speaker 3, and Speaker 4. Speaker 3 has already been recognized by the system as a speaker that has interacted with the system before, and so the name for Speaker 3 has been auto-populated to “Darcy,” as that speaker's auditory attributes and mannerisms have been saved by the system. The user could first listen to each speaker's voice by pressing the “sound” symbol next to each speaker's name, and then when the user has selected the correct speaker, the device could then lock onto the chosen speaker and filter out all non-speaker sounds when translating.
  • In FIGS. 2C and 2D, an exemplary embodiment is shown where the device requires user input to inform the device when each speaker has started speaking, or has stopped speaking. This is particularly useful when translating highly disparate languages whose words need to be re-ordered into a completely different syntax in order to be intelligible in another language. As shown, when speaker 1 touches the “Start speaking” button, the name of Speaker 1 is then highlighted, and the “Start speaking” button then changes to a “Finished speaking” button. When the “Finished speaking” button is then pressed, the system translates the phrase spoken by speaker 1 into the language and/or dialect of Speaker 2, and Speaker 2 can then respond after pressing the “Start speaking” button.
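  • The button behavior of FIGS. 2C and 2D can be sketched as a two-state toggle that also tracks whose turn it is. The `TurnButton` class and its method names are hypothetical. Only the button labels and the alternation between Speaker 1 and Speaker 2 come from the text.

```python
class TurnButton:
    """Tracks whose turn it is and what the single button currently says."""

    def __init__(self, speakers=("Speaker 1", "Speaker 2")):
        self.speakers = speakers
        self.current = 0        # index of the speaker whose turn it is
        self.recording = False

    @property
    def label(self):
        return "Finished speaking" if self.recording else "Start speaking"

    def press(self):
        """Return the speaker whose phrase should now be translated, or None."""
        if not self.recording:
            # First press: start recording the current speaker.
            self.recording = True
            return None
        # Second press: stop recording and hand the turn to the other speaker.
        self.recording = False
        finished = self.speakers[self.current]
        self.current = (self.current + 1) % len(self.speakers)
        return finished
```

  • Under this sketch, the first press starts recording and returns nothing, and the second press returns the name of the speaker whose completed phrase is ready for translation.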
  • In FIG. 3, a universal language translator 320 is interposed between two audio input/output devices 310 and 330. Audio input/output device 310 is shown here euphemistically as a telephone wire connected to a third party on the other side, and audio input/output device 330 is shown here euphemistically as a telephone. However, other input/output devices could be used, for example an audio feed could be received by a network connection to video conferencing software (e.g., Skype™) or an internet social network site, such as Facebook® or Gchat®, and a telephone, or a stationary (e.g., desktop computer, television, gaming console, etc.) or mobile computing device (e.g., laptop, smart phone, etc.) could be receiving the audio feed. In such an embodiment, the universal language translator 320 is preferably software running on the computing device, which translates the audio feed from one party to another party and modulates that translation to sound like the voice of the speaker. In a preferred embodiment, the universal language translator could be embedded as an application in a portable cellphone, allowing for a user to create modulated translated sentences on the fly with any user who is accessible via cell.
  • In other contemplated embodiments, the universal language translator 320 can be utilized with the transportation industry. For example, the universal language translator 320 could be interposed between an aviation control tower and an airplane, such that a pilot can interact with the control tower even without speaking the same language. In still other embodiments, the universal language translator 320 can be at least partially incorporated into a television set, such that foreign programming can be automatically translated into the viewer's language or other desired language.
  • By interposing the universal language translator 320 between two audio input/output sources, a party who is speaking into the universal translator can have his/her sentence analyzed, deconstructed, translated, and then reconstructed into a translated sentence that is modulated to sound like the voice of the party speaking on the other side. There might be a small delay while the translating occurs, but such a device would be invaluable in having translated conversations with a person while retaining that person's inflections and tone.
  • In FIG. 4, an exemplary universal language translator 420 is interposed between an audio/visual source 410 and an audio/visual output 430. Audio/visual source 410 is shown here euphemistically as a satellite receiver, but could be any known source of audio/visual media, for example the Internet, a media player, a media diskette, or an electronic storage medium. Audio/visual output 430 is shown here euphemistically as a television, but could be any known output of audio/visual media, for example a theater, a computer monitor and speakers, or a portable media-viewing device. The universal language translator preferably buffers the audio/visual signal within a memory of the universal language translator by at least 10, 20, 30, 60, 180, or 360 seconds, to ensure that the universal language translator can reconstruct a modulated, translated sentence and output the translated sentence without allowing the audio feed and the video feed to run out of sync with one another.
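  • The buffering described above can be sketched as a fixed-length delay line: every incoming unit of media is held for a set number of steps before it is released, which gives the translator time to produce the translated audio for the same moment. The `DelayBuffer` class is hypothetical, and the delay is counted in abstract frames rather than the seconds given in the text.

```python
from collections import deque


class DelayBuffer:
    """Holds items for a fixed number of pushes before releasing them."""

    def __init__(self, delay_frames: int):
        self.delay_frames = delay_frames
        self.frames = deque()

    def push(self, frame):
        """Add a frame; return the frame from `delay_frames` pushes ago, or None while filling."""
        self.frames.append(frame)
        if len(self.frames) > self.delay_frames:
            return self.frames.popleft()
        return None


# Six frames pushed through a three-frame delay.
buffer = DelayBuffer(delay_frames=3)
released = [buffer.push(n) for n in range(6)]
```

  • Delaying the video frames and the translated audio by the same amount is what keeps the two feeds aligned in this sketch.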
  • It should be apparent to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the scope of the appended claims. Moreover, in interpreting both the specification and the claims, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced. Where the specification or claims refer to at least one of something selected from the group consisting of A, B, C . . . and N, the text should be interpreted as requiring only one element from the group, not A plus N, or B plus N, etc.

Claims (10)

1. A translator device, comprising:
a voice recognition module that detects auditory attributes of a speaker's voice;
a voice filter module that filters out ambient sounds generated by a non-speaker object;
a dialect detector module that selects a dialect based upon a location of the device;
a translation module that translates a first phrase spoken by the speaker to a second phrase of the selected dialect; and
an auditory output that synthesizes the phrase using the auditory attributes of the speaker's voice.
2. The translator device of claim 1, wherein at least one of the auditory attributes of the speaker's voice is selected from the group consisting of pitch, speed of talking, intonation, and average frequency.
3. The translator device of claim 1, wherein the voice filter module detects a region that the speaker is located and filters out ambient sounds generated outside the region.
4. The translator device of claim 3, wherein the voice filter module moves the region as the speaker moves.
5. The translator device of claim 1, wherein the voice filter module creates a voice signature for the speaker and filters out ambient sounds that do not share attributes of the voice signature.
6. The translator device of claim 1, wherein the dialect detector module receives the location of the device from a GPS device.
7. The translator device of claim 1, wherein the dialect detector module receives the location of the device from a triangulation module.
8. The translator device of claim 1, wherein the dialect detector module selects the dialect based upon a dialect of a second speaker.
9. The translator device of claim 1, wherein the dialect detector module has a library of at least 10000 dialects.
10. The translator device of claim 1, wherein the translation module has a library of at least 500 languages.
US13/559,346 2011-07-29 2012-07-26 Universal Language Translator Abandoned US20130030789A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/559,346 US20130030789A1 (en) 2011-07-29 2012-07-26 Universal Language Translator
US14/926,698 US9864745B2 (en) 2011-07-29 2015-10-29 Universal language translator

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201161513381P 2011-07-29 2011-07-29
US201261610811P 2012-03-14 2012-03-14
US13/559,346 US20130030789A1 (en) 2011-07-29 2012-07-26 Universal Language Translator

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/926,698 Continuation US9864745B2 (en) 2011-07-29 2015-10-29 Universal language translator

Publications (1)

Publication Number Publication Date
US20130030789A1 true US20130030789A1 (en) 2013-01-31

Family

ID=47597955

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/559,346 Abandoned US20130030789A1 (en) 2011-07-29 2012-07-26 Universal Language Translator
US14/926,698 Active - Reinstated US9864745B2 (en) 2011-07-29 2015-10-29 Universal language translator

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/926,698 Active - Reinstated US9864745B2 (en) 2011-07-29 2015-10-29 Universal language translator

Country Status (1)

Country Link
US (2) US20130030789A1 (en)

US11597519B2 (en) 2017-10-17 2023-03-07 The Boeing Company Artificially intelligent flight crew systems and methods
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11689846B2 (en) 2014-12-05 2023-06-27 Stages Llc Active noise control and customized audio system
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105931643A (en) * 2016-06-30 2016-09-07 北京海尔广科数字技术有限公司 Speech recognition method and apparatus
US11195507B2 (en) * 2018-10-04 2021-12-07 Rovi Guides, Inc. Translating between spoken languages with emotion in audio and video media streams

Citations (59)

Publication number Priority date Publication date Assignee Title
US20020091521A1 (en) * 2000-11-16 2002-07-11 International Business Machines Corporation Unsupervised incremental adaptation using maximum likelihood spectral transformation
US20020184032A1 (en) * 2001-03-09 2002-12-05 Yuji Hisaminato Voice synthesizing apparatus
US20030036903A1 (en) * 2001-08-16 2003-02-20 Sony Corporation Retraining and updating speech models for speech recognition
US20030050783A1 (en) * 2001-09-13 2003-03-13 Shinichi Yoshizawa Terminal device, server device and speech recognition method
US20050065795A1 (en) * 2002-04-02 2005-03-24 Canon Kabushiki Kaisha Text structure for voice synthesis, voice synthesis method, voice synthesis apparatus, and computer program thereof
US20050240406A1 (en) * 2004-04-21 2005-10-27 David Carroll Speech recognition computing device display with highlighted text
US20060020463A1 (en) * 2004-07-22 2006-01-26 International Business Machines Corporation Method and system for identifying and correcting accent-induced speech recognition difficulties
US20060136216A1 (en) * 2004-12-10 2006-06-22 Delta Electronics, Inc. Text-to-speech system and method thereof
US20060149558A1 (en) * 2001-07-17 2006-07-06 Jonathan Kahn Synchronized pattern recognition source data processed by manual or automatic means for creation of shared speaker-dependent speech user profile
US20070005363A1 (en) * 2005-06-29 2007-01-04 Microsoft Corporation Location aware multi-modal multi-lingual device
US20070033005A1 (en) * 2005-08-05 2007-02-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US20070038436A1 (en) * 2005-08-10 2007-02-15 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US20070233489A1 (en) * 2004-05-11 2007-10-04 Yoshifumi Hirose Speech Synthesis Device and Method
US20070244688A1 (en) * 2006-04-14 2007-10-18 At&T Corp. On-Demand Language Translation For Television Programs
US20080052069A1 (en) * 2000-10-24 2008-02-28 Global Translation, Inc. Integrated speech recognition, closed captioning, and translation system and method
US20080195386A1 (en) * 2005-05-31 2008-08-14 Koninklijke Philips Electronics, N.V. Method and a Device For Performing an Automatic Dubbing on a Multimedia Signal
US20080208597A1 (en) * 2007-02-27 2008-08-28 Tetsuro Chino Apparatus, method, and computer program product for processing input speech
US20080225184A1 (en) * 2007-03-13 2008-09-18 Sony Corporation And Sony Electronics Inc. System and method for effectively performing a remote control configuration procedure
US20090037179A1 (en) * 2007-07-30 2009-02-05 International Business Machines Corporation Method and Apparatus for Automatically Converting Voice
US7496498B2 (en) * 2003-03-24 2009-02-24 Microsoft Corporation Front-end architecture for a multi-lingual text-to-speech system
US20090125309A1 (en) * 2001-12-10 2009-05-14 Steve Tischer Methods, Systems, and Products for Synthesizing Speech
US7596499B2 (en) * 2004-02-02 2009-09-29 Panasonic Corporation Multilingual text-to-speech system with limited resources
US20090243929A1 (en) * 2008-03-31 2009-10-01 Uttam Sengupta Method and apparatus for faster global positioning system (gps) location using a pre-computed spatial location for tracking gps satellites
US20090306985A1 (en) * 2008-06-06 2009-12-10 At&T Labs System and method for synthetically generated speech describing media content
US20100049497A1 (en) * 2009-09-19 2010-02-25 Manuel-Devadoss Smith Johnson Phonetic natural language translation system
US20100057435A1 (en) * 2008-08-29 2010-03-04 Kent Justin R System and method for speech-to-speech translation
US20100082329A1 (en) * 2008-09-29 2010-04-01 Apple Inc. Systems and methods of detecting language and natural language strings for text to speech synthesis
US20100100907A1 (en) * 2008-10-16 2010-04-22 At&T Intellectual Property I, L.P. Presentation of an adaptive avatar
US20100122288A1 (en) * 2008-11-07 2010-05-13 Minter David D Methods and systems for selecting content for an internet television stream using mobile device location
US20100198577A1 (en) * 2009-02-03 2010-08-05 Microsoft Corporation State mapping for cross-language speaker adaptation
US7778632B2 (en) * 2005-10-28 2010-08-17 Microsoft Corporation Multi-modal device capable of automated actions
US20100250231A1 (en) * 2009-03-07 2010-09-30 Voice Muffler Corporation Mouthpiece with sound reducer to enhance language translation
US7809549B1 (en) * 2006-06-15 2010-10-05 At&T Intellectual Property Ii, L.P. On-demand language translation for television programs
US20100293230A1 (en) * 2009-05-12 2010-11-18 International Business Machines Corporation Multilingual Support for an Improved Messaging System
US20110044438A1 (en) * 2009-08-20 2011-02-24 T-Mobile Usa, Inc. Shareable Applications On Telecommunications Devices
US20110161076A1 (en) * 2009-12-31 2011-06-30 Davis Bruce L Intuitive Computing Methods and Systems
US20110166938A1 (en) * 2010-01-05 2011-07-07 Bionic Click Llc Methods For Advertising
US20110178803A1 (en) * 1999-08-31 2011-07-21 Accenture Global Services Limited Detecting emotion in voice signals in a call center
US20110238407A1 (en) * 2009-08-31 2011-09-29 O3 Technologies, Llc Systems and methods for speech-to-speech translation
US20110246172A1 (en) * 2010-03-30 2011-10-06 Polycom, Inc. Method and System for Adding Translation in a Videoconference
US20110270601A1 (en) * 2010-04-28 2011-11-03 Vahe Nick Karapetian, Jr. Universal translator
US20110313775A1 (en) * 2010-05-20 2011-12-22 Google Inc. Television Remote Control Data Transfer
US20120004899A1 (en) * 2010-07-04 2012-01-05 Taymoor Arshi Dynamic ad selection for ad delivery systems
US20120035906A1 (en) * 2010-08-05 2012-02-09 David Lynton Jephcott Translation Station
US20120069131A1 (en) * 2010-05-28 2012-03-22 Abelow Daniel H Reality alternate
US8165879B2 (en) * 2007-01-11 2012-04-24 Casio Computer Co., Ltd. Voice output device and voice output program
US20120120218A1 (en) * 2010-11-15 2012-05-17 Flaks Jason S Semi-private communication in open environments
US20120203557A1 (en) * 2001-03-29 2012-08-09 Gilad Odinak Comprehensive multiple feature telematics system
US8244534B2 (en) * 2007-08-20 2012-08-14 Microsoft Corporation HMM-based bilingual (Mandarin-English) TTS techniques
US8275621B2 (en) * 2008-03-31 2012-09-25 Nuance Communications, Inc. Determining text to speech pronunciation based on an utterance from a user
US20120253781A1 (en) * 2011-04-04 2012-10-04 Microsoft Corporation Frame mapping approach for cross-lingual voice transformation
US20120278081A1 (en) * 2009-06-10 2012-11-01 Kabushiki Kaisha Toshiba Text to speech method and system
US20130016760A1 (en) * 2011-01-14 2013-01-17 Qualcomm Incorporated Methods and apparatuses for low-rate television white space (tvws) enablement
US20130144595A1 (en) * 2011-12-01 2013-06-06 Richard T. Lord Language translation based on speaker-related information
US20130144625A1 (en) * 2009-01-15 2013-06-06 K-Nfb Reading Technology, Inc. Systems and methods document narration
US20130166278A1 (en) * 2009-03-09 2013-06-27 Apple Inc. Systems and Methods for Determining the Language to Use for Speech Generated by a Text to Speech Engine
US20130185052A1 (en) * 2007-03-29 2013-07-18 Microsoft Corporation Language translation of visual and audio input
US8645140B2 (en) * 2009-02-25 2014-02-04 Blackberry Limited Electronic device and method of associating a voice font with a contact for text-to-speech conversion at the electronic device
US8887199B2 (en) * 2005-12-19 2014-11-11 Koninklijke Philips N.V. System, apparatus, and method for templates offering default settings for typical virtual channels

Family Cites Families (44)

Publication number Priority date Publication date Assignee Title
US4882681A (en) 1987-09-02 1989-11-21 Brotz Gregory R Remote language translating device
US5636325A (en) * 1992-11-13 1997-06-03 International Business Machines Corporation Speech synthesis and analysis of dialects
CA2168848C (en) 1994-07-01 2003-10-14 Makoto Hyuga Communication method and system for same
US6233545B1 (en) 1997-05-01 2001-05-15 William E. Datig Universal machine translator of arbitrary languages utilizing epistemic moments
DE69712485T2 (en) 1997-10-23 2002-12-12 Sony Int Europe Gmbh Voice interface for a home network
JP3576840B2 (en) * 1997-11-28 2004-10-13 松下電器産業株式会社 Basic frequency pattern generation method, basic frequency pattern generation device, and program recording medium
US20030065504A1 (en) 2001-10-02 2003-04-03 Jessica Kraemer Instant verbal translator
US20030125959A1 (en) 2001-12-31 2003-07-03 Palmquist Robert D. Translation device with planar microphone array
EP1495625B1 (en) 2002-04-02 2011-09-28 Verizon Business Global LLC Providing of presence information to a telephony services system
US20040044517A1 (en) 2002-08-30 2004-03-04 Robert Palmquist Translation system
US7702792B2 (en) 2004-01-08 2010-04-20 Cisco Technology, Inc. Method and system for managing communication sessions between a text-based and a voice-based client
US20060004730A1 (en) * 2004-07-02 2006-01-05 Ning-Ping Chan Variant standardization engine
US9167195B2 (en) * 2005-10-31 2015-10-20 Invention Science Fund I, Llc Preservation/degradation of video/audio aspects of a data stream
JP4125362B2 (en) * 2005-05-18 2008-07-30 松下電器産業株式会社 Speech synthesizer
US20060271370A1 (en) 2005-05-24 2006-11-30 Li Qi P Mobile two-way spoken language translator and noise reduction using multi-directional microphone arrays
AU2006201926A1 (en) 2005-05-31 2006-12-14 Rigas, Nikolaos A system and method for translating a multi-lingual dialogue
US20070050188A1 (en) * 2005-08-26 2007-03-01 Avaya Technology Corp. Tone contour transformation of speech
US8165882B2 (en) * 2005-09-06 2012-04-24 Nec Corporation Method, apparatus and program for speech synthesis
WO2007091475A1 (en) * 2006-02-08 2007-08-16 Nec Corporation Speech synthesizing device, speech synthesizing method, and program
US20090319273A1 (en) * 2006-06-30 2009-12-24 Nec Corporation Audio content generation system, information exchanging system, program, audio content generating method, and information exchanging method
US20080195375A1 (en) 2007-02-09 2008-08-14 Gideon Farre Clifton Echo translator
WO2008102594A1 (en) * 2007-02-19 2008-08-28 Panasonic Corporation Tenseness converting device, speech converting device, speech synthesizing device, speech converting method, speech synthesizing method, and program
US8078698B2 (en) * 2007-06-26 2011-12-13 At&T Intellectual Property I, L.P. Methods, systems, and products for producing persona-based hosts
US8041555B2 (en) 2007-08-15 2011-10-18 International Business Machines Corporation Language translation based on a location of a wireless device
US20090074216A1 (en) * 2007-09-13 2009-03-19 Bionica Corporation Assistive listening system with programmable hearing aid and wireless handheld programmable digital signal processing device
US20090074203A1 (en) * 2007-09-13 2009-03-19 Bionica Corporation Method of enhancing sound for hearing impaired individuals
WO2009042861A1 (en) 2007-09-26 2009-04-02 The Trustees Of Columbia University In The City Of New York Methods, systems, and media for partially diacritizing text
US8024179B2 (en) * 2007-10-30 2011-09-20 At&T Intellectual Property Ii, L.P. System and method for improving interaction with a user through a dynamically alterable spoken dialog system
KR101181785B1 (en) * 2008-04-08 2012-09-11 가부시키가이샤 엔.티.티.도코모 Media process server apparatus and media process method therefor
US8347247B2 (en) * 2008-10-17 2013-01-01 International Business Machines Corporation Visualization interface of continuous waveform multi-speaker identification
WO2010062542A1 (en) 2008-10-27 2010-06-03 Research Triangle Institute Method for translation of a communication between languages, and associated system and computer program product
FR2941314B1 (en) 2009-01-20 2011-03-04 Airbus France Method for controlling an aircraft using a voting system
US8600731B2 (en) 2009-02-04 2013-12-03 Microsoft Corporation Universal translator
US20110257978A1 (en) * 2009-10-23 2011-10-20 Brainlike, Inc. Time Series Filtering, Data Reduction and Voice Recognition in Communication Device
US20100075281A1 (en) 2009-11-13 2010-03-25 Manuel-Devadoss Johnson Smith In-Flight Entertainment Phonetic Language Translation System using Brain Interface
USH2269H1 (en) 2009-11-20 2012-06-05 Manuel-Devadoss Johnson Smith Johnson Automated speech translation system using human brain language areas comprehension capabilities
US20110150270A1 (en) * 2009-12-22 2011-06-23 Carpenter Michael D Postal processing including voice training
US8442827B2 (en) * 2010-06-18 2013-05-14 At&T Intellectual Property I, L.P. System and method for customized voice response
US8407736B2 (en) * 2010-08-04 2013-03-26 At&T Intellectual Property I, L.P. Apparatus and method for providing emergency communications
US9274744B2 (en) * 2010-09-10 2016-03-01 Amazon Technologies, Inc. Relative position-inclusive device interfaces
US20120089400A1 (en) * 2010-10-06 2012-04-12 Caroline Gilles Henton Systems and methods for using homophone lexicons in english text-to-speech
US8738355B2 (en) * 2011-01-06 2014-05-27 Qualcomm Incorporated Methods and apparatuses for providing predictive translation information services to mobile stations
US8781836B2 (en) * 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US9620128B2 (en) * 2012-05-31 2017-04-11 Elwha Llc Speech recognition adaptation systems based on adaptation data

Non-Patent Citations (1)

Title
CJ Leggetter, PC Woodland, "Speaker Adaptation of Continuous Density HMMs Using Multivariate Regression", ICSLP, 1994. *

Cited By (232)

Publication number Priority date Publication date Assignee Title
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US10643611B2 (en) 2008-10-02 2020-05-05 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11900936B2 (en) 2008-10-02 2024-02-13 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10741185B2 (en) 2010-01-18 2020-08-11 Apple Inc. Intelligent automated assistant
US10692504B2 (en) 2010-02-25 2020-06-23 Apple Inc. User profiling for voice input processing
US9798653B1 (en) * 2010-05-05 2017-10-24 Nuance Communications, Inc. Methods, apparatus and data structure for cross-language speech adaptation
US10417405B2 (en) 2011-03-21 2019-09-17 Apple Inc. Device access using voice authentication
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11350253B2 (en) 2011-06-03 2022-05-31 Apple Inc. Active transport based notifications
US11069336B2 (en) 2012-03-02 2021-07-20 Apple Inc. Systems and methods for name pronunciation
US9460082B2 (en) 2012-05-14 2016-10-04 International Business Machines Corporation Management of language usage to facilitate effective communication
US9442916B2 (en) * 2012-05-14 2016-09-13 International Business Machines Corporation Management of language usage to facilitate effective communication
US20130304455A1 (en) * 2012-05-14 2013-11-14 International Business Machines Corporation Management of language usage to facilitate effective communication
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US9160967B2 (en) * 2012-11-13 2015-10-13 Cisco Technology, Inc. Simultaneous language interpretation during ongoing video conferencing
US20140163948A1 (en) * 2012-12-10 2014-06-12 At&T Intellectual Property I, L.P. Message language conversion
US9411801B2 (en) * 2012-12-21 2016-08-09 Abbyy Development Llc General dictionary for all languages
US20140180670A1 (en) * 2012-12-21 2014-06-26 Maria Osipova General Dictionary for All Languages
US9471567B2 (en) * 2013-01-31 2016-10-18 Ncr Corporation Automatic language recognition
US11557310B2 (en) 2013-02-07 2023-01-17 Apple Inc. Voice trigger for a digital assistant
US10714117B2 (en) 2013-02-07 2020-07-14 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US11862186B2 (en) 2013-02-07 2024-01-02 Apple Inc. Voice trigger for a digital assistant
US11636869B2 (en) 2013-02-07 2023-04-25 Apple Inc. Voice trigger for a digital assistant
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11151899B2 (en) 2013-03-15 2021-10-19 Apple Inc. User training by intelligent digital assistant
US9275046B2 (en) 2013-03-15 2016-03-01 Translate Abroad, Inc. Systems and methods for displaying foreign character sets and their translations in real time on resource-constrained mobile devices
US8761513B1 (en) 2013-03-15 2014-06-24 Translate Abroad, Inc. Systems and methods for displaying foreign character sets and their translations in real time on resource-constrained mobile devices
AU2017221864C1 (en) * 2013-03-15 2020-01-16 Apple Inc. User training by intelligent digital assistant
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US8965129B2 (en) 2013-03-15 2015-02-24 Translate Abroad, Inc. Systems and methods for determining and displaying multi-line foreign language translations in real time on mobile devices
AU2017221864B2 (en) * 2013-03-15 2019-06-20 Apple Inc. User training by intelligent digital assistant
CN105190607A (en) * 2013-03-15 2015-12-23 苹果公司 User training by intelligent digital assistant
EP2973002B1 (en) * 2013-03-15 2019-06-26 Apple Inc. User training by intelligent digital assistant
US9430465B2 (en) * 2013-05-13 2016-08-30 Facebook, Inc. Hybrid, offline/online speech translation system
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10769385B2 (en) 2013-06-09 2020-09-08 Apple Inc. System and method for inferring user intent from speech inputs
US11048473B2 (en) 2013-06-09 2021-06-29 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US9372672B1 (en) * 2013-09-04 2016-06-21 Tg, Llc Translation in visual context
US10388269B2 (en) * 2013-09-10 2019-08-20 At&T Intellectual Property I, L.P. System and method for intelligent language switching in automated text-to-speech systems
US20150073770A1 (en) * 2013-09-10 2015-03-12 At&T Intellectual Property I, L.P. System and method for intelligent language switching in automated text-to-speech systems
US9640173B2 (en) * 2013-09-10 2017-05-02 At&T Intellectual Property I, L.P. System and method for intelligent language switching in automated text-to-speech systems
US11195510B2 (en) * 2013-09-10 2021-12-07 At&T Intellectual Property I, L.P. System and method for intelligent language switching in automated text-to-speech systems
US20170236509A1 (en) * 2013-09-10 2017-08-17 At&T Intellectual Property I, L.P. System and method for intelligent language switching in automated text-to-speech systems
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US10714095B2 (en) 2014-05-30 2020-07-14 Apple Inc. Intelligent assistant for home automation
US11810562B2 (en) 2014-05-30 2023-11-07 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10878809B2 (en) 2014-05-30 2020-12-29 Apple Inc. Multi-command single utterance input method
US10417344B2 (en) 2014-05-30 2019-09-17 Apple Inc. Exemplar-based natural language processing
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US10657966B2 (en) 2014-05-30 2020-05-19 Apple Inc. Better resolution when referencing to concepts
US10699717B2 (en) 2014-05-30 2020-06-30 Apple Inc. Intelligent assistant for home automation
US11670289B2 (en) 2014-05-30 2023-06-06 Apple Inc. Multi-command single utterance input method
US11699448B2 (en) 2014-05-30 2023-07-11 Apple Inc. Intelligent assistant for home automation
US9854139B2 (en) 2014-06-24 2017-12-26 Sony Mobile Communications Inc. Lifelog camera and method of controlling same using voice triggers
WO2015198165A1 (en) 2014-06-24 2015-12-30 Sony Corporation Lifelog camera and method of controlling same using voice triggers
US11838579B2 (en) 2014-06-30 2023-12-05 Apple Inc. Intelligent automated assistant for TV user interactions
US11516537B2 (en) 2014-06-30 2022-11-29 Apple Inc. Intelligent automated assistant for TV user interactions
US20160019882A1 (en) * 2014-07-15 2016-01-21 Avaya Inc. Systems and methods for speech analytics and phrase spotting using phoneme sequences
US11289077B2 (en) * 2014-07-15 2022-03-29 Avaya Inc. Systems and methods for speech analytics and phrase spotting using phoneme sequences
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10453443B2 (en) 2014-09-30 2019-10-22 Apple Inc. Providing an indication of the suitability of speech recognition
US10438595B2 (en) 2014-09-30 2019-10-08 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10390213B2 (en) 2014-09-30 2019-08-20 Apple Inc. Social reminders
US10366689B2 (en) * 2014-10-29 2019-07-30 Kyocera Corporation Communication robot
US11689846B2 (en) 2014-12-05 2023-06-27 Stages Llc Active noise control and customized audio system
USD749115S1 (en) 2015-02-20 2016-02-09 Translate Abroad, Inc. Mobile device with graphical user interface
US10121488B1 (en) * 2015-02-23 2018-11-06 Sprint Communications Company L.P. Optimizing call quality using vocal frequency fingerprints to filter voice calls
US10825462B1 (en) 2015-02-23 2020-11-03 Sprint Communications Company L.P. Optimizing call quality using vocal frequency fingerprints to filter voice calls
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US11842734B2 (en) 2015-03-08 2023-12-12 Apple Inc. Virtual assistant activation
US10529332B2 (en) 2015-03-08 2020-01-07 Apple Inc. Virtual assistant activation
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US10930282B2 (en) 2015-03-08 2021-02-23 Apple Inc. Competing devices responding to voice triggers
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11070949B2 (en) 2015-05-27 2021-07-20 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display
US11127397B2 (en) 2015-05-27 2021-09-21 Apple Inc. Device voice control
US10681212B2 (en) 2015-06-05 2020-06-09 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US20170060850A1 (en) * 2015-08-24 2017-03-02 Microsoft Technology Licensing, Llc Personal translator
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11954405B2 (en) 2015-09-08 2024-04-09 Apple Inc. Zero latency digital assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11550542B2 (en) 2015-09-08 2023-01-10 Apple Inc. Zero latency digital assistant
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
WO2017049766A1 (en) * 2015-09-25 2017-03-30 百度在线网络技术(北京)有限公司 Method and device for outputting voice information
EP3242201A4 (en) * 2015-09-25 2018-04-04 Baidu Online Network Technology (Beijing) Co., Ltd Method and device for outputting voice information
US10403264B2 (en) 2015-09-25 2019-09-03 Baidu Online Network Technology (Beijing) Co., Ltd. Method and device for outputting voice information based on a geographical location having a maximum number of historical records
JP2018508816A (en) * 2015-09-25 2018-03-29 百度在線網絡技術(北京)有限公司 Method and apparatus for outputting audio information
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11809886B2 (en) 2015-11-06 2023-11-07 Apple Inc. Intelligent automated assistant in a messaging environment
US11886805B2 (en) 2015-11-09 2024-01-30 Apple Inc. Unconventional virtual assistant interactions
US10956666B2 (en) 2015-11-09 2021-03-23 Apple Inc. Unconventional virtual assistant interactions
US10354652B2 (en) 2015-12-02 2019-07-16 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
US11853647B2 (en) 2015-12-23 2023-12-26 Apple Inc. Proactive assistance based on dialog communication between devices
US9916828B2 (en) 2015-12-30 2018-03-13 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
US9697824B1 (en) * 2015-12-30 2017-07-04 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
US10672386B2 (en) 2015-12-30 2020-06-02 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
US9805030B2 (en) * 2016-01-21 2017-10-31 Language Line Services, Inc. Configuration for dynamically displaying language interpretation/translation modalities
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11657820B2 (en) 2016-06-10 2023-05-23 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11749275B2 (en) 2016-06-11 2023-09-05 Apple Inc. Application integration with a digital assistant
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10580409B2 (en) 2016-06-11 2020-03-03 Apple Inc. Application integration with a digital assistant
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
US10942702B2 (en) 2016-06-11 2021-03-09 Apple Inc. Intelligent device arbitration and control
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
GR20160100543A (en) * 2016-10-20 2018-06-27 Ευτυχια Ιωαννη Ψωμα Portable translator with memory-equipped sound recorder - translation from native into foreign languages and vice versa
US11601764B2 (en) 2016-11-18 2023-03-07 Stages Llc Audio analysis and processing system
US10945080B2 (en) 2016-11-18 2021-03-09 Stages Llc Audio analysis and processing system
US9980042B1 (en) 2016-11-18 2018-05-22 Stages Llc Beamformer direction of arrival and orientation analysis system
US11330388B2 (en) 2016-11-18 2022-05-10 Stages Llc Audio source spatialization relative to orientation sensor and output
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US11656884B2 (en) 2017-01-09 2023-05-23 Apple Inc. Application integration with a digital assistant
US10332518B2 (en) 2017-05-09 2019-06-25 Apple Inc. User interface for correcting recognition errors
US10417266B2 (en) 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
US10741181B2 (en) 2017-05-09 2020-08-11 Apple Inc. User interface for correcting recognition errors
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US10847142B2 (en) 2017-05-11 2020-11-24 Apple Inc. Maintaining privacy of personal information
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
US11467802B2 (en) 2017-05-11 2022-10-11 Apple Inc. Maintaining privacy of personal information
US11837237B2 (en) 2017-05-12 2023-12-05 Apple Inc. User-specific acoustic models
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US11538469B2 (en) 2017-05-12 2022-12-27 Apple Inc. Low-latency intelligent automated assistant
US11862151B2 (en) 2017-05-12 2024-01-02 Apple Inc. Low-latency intelligent automated assistant
US10403278B2 (en) 2017-05-16 2019-09-03 Apple Inc. Methods and systems for phonetic matching in digital assistant services
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US11675829B2 (en) 2017-05-16 2023-06-13 Apple Inc. Intelligent automated assistant for media exploration
US10909171B2 (en) 2017-05-16 2021-02-02 Apple Inc. Intelligent automated assistant for media exploration
US10943601B2 (en) * 2017-05-31 2021-03-09 Lenovo (Singapore) Pte. Ltd. Provide output associated with a dialect
US20180350343A1 (en) * 2017-05-31 2018-12-06 Lenovo (Singapore) Pte. Ltd. Provide output associated with a dialect
US10657328B2 (en) 2017-06-02 2020-05-19 Apple Inc. Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling
US10664667B2 (en) * 2017-08-25 2020-05-26 Panasonic Intellectual Property Corporation Of America Information processing method, information processing device, and recording medium having program recorded thereon
US10445429B2 (en) 2017-09-21 2019-10-15 Apple Inc. Natural language understanding using vocabularies with compressed serialized tries
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US11597519B2 (en) 2017-10-17 2023-03-07 The Boeing Company Artificially intelligent flight crew systems and methods
US10636424B2 (en) 2017-11-30 2020-04-28 Apple Inc. Multi-turn canned dialog
US10733982B2 (en) 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10733375B2 (en) 2018-01-31 2020-08-04 Apple Inc. Knowledge-based framework for improving natural language understanding
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US11907436B2 (en) 2018-05-07 2024-02-20 Apple Inc. Raise to speak
US11487364B2 (en) 2018-05-07 2022-11-01 Apple Inc. Raise to speak
US11900923B2 (en) 2018-05-07 2024-02-13 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US11630525B2 (en) 2018-06-01 2023-04-18 Apple Inc. Attention aware virtual assistant dismissal
US10684703B2 (en) 2018-06-01 2020-06-16 Apple Inc. Attention aware virtual assistant dismissal
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US11360577B2 (en) 2018-06-01 2022-06-14 Apple Inc. Attention aware virtual assistant dismissal
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US10720160B2 (en) 2018-06-01 2020-07-21 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US10403283B1 (en) 2018-06-01 2019-09-03 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10944859B2 (en) 2018-06-03 2021-03-09 Apple Inc. Accelerated task performance
US10504518B1 (en) 2018-06-03 2019-12-10 Apple Inc. Accelerated task performance
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11893992B2 (en) 2018-09-28 2024-02-06 Apple Inc. Multi-modal inputs for voice commands
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11783815B2 (en) 2019-03-18 2023-10-10 Apple Inc. Multimodality in digital assistant systems
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11675491B2 (en) 2019-05-06 2023-06-13 Apple Inc. User configurable task triggers
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11360739B2 (en) 2019-05-31 2022-06-14 Apple Inc. User activity shortcut suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11120219B2 (en) 2019-10-28 2021-09-14 International Business Machines Corporation User-customized computer-automated translation
CN110767233A (en) * 2019-10-30 2020-02-07 合肥名阳信息技术有限公司 Voice conversion system and method
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11924254B2 (en) 2020-05-11 2024-03-05 Apple Inc. Digital assistant hardware abstraction
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11750962B2 (en) 2020-07-21 2023-09-05 Apple Inc. User identification using headphones
US20220293098A1 (en) * 2021-03-15 2022-09-15 Lenovo (Singapore) Pte. Ltd. Dialect correction and training

Also Published As

Publication number Publication date
US20160048508A1 (en) 2016-02-18
US9864745B2 (en) 2018-01-09

Similar Documents

Publication Publication Date Title
US9864745B2 (en) Universal language translator
JP7114660B2 (en) Hot word trigger suppression for recording media
US10817673B2 (en) Translating languages
US9251142B2 (en) Mobile speech-to-speech interpretation system
US9293134B1 (en) Source-specific speech interactions
KR102108500B1 (en) Supporting Method And System For communication Service, and Electronic Device supporting the same
US20170286407A1 (en) Device and method for voice translation
KR20190100334A (en) Contextual Hotwords
KR20200023456A (en) Speech sorter
CN114566161A (en) Cooperative voice control device
KR102097710B1 (en) Apparatus and method for separating of dialogue
US20140365200A1 (en) System and method for automatic speech translation
JP2013510341A (en) System and method for hybrid processing in a natural language speech service environment
JPWO2019111346A1 (en) Two-way speech translation system, two-way speech translation method and program
US9940926B2 (en) Rapid speech recognition adaptation using acoustic input
US10049658B2 (en) Method for training an automatic speech recognition system
WO2020210050A1 (en) Automated control of noise reduction or noise masking
KR20200013774A (en) Pair a Voice-Enabled Device with a Display Device
JP7400364B2 (en) Speech recognition system and information processing method
US11776563B2 (en) Textual echo cancellation
JP2010128766A (en) Information processor, information processing method, program and recording medium
EP2541544A1 (en) Voice sample tagging
Takeda et al. Construction and evaluation of a large in-car speech corpus
Tsiakoulis et al. Statistical methods for building robust spoken dialogue systems in an automobile
JP2019153160A (en) Digital signage device and program

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION