US20160011853A1 - Methods and systems for managing speech recognition in a multi-speech system environment - Google Patents
- Publication number
- US20160011853A1 (application US 14/325,916)
- Authority
- US
- United States
- Prior art keywords
- speech
- user
- data
- action
- gesture
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
- G06F3/038—Control and interface arrangements for pointing devices, e.g. drivers or device-embedded control circuitry
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/24—Speech recognition using non-acoustical features
- G06F2203/0381—Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
- G10L2015/223—Execution procedure of a spoken command
- G10L2015/226—Procedures used during a speech recognition process using non-speech characteristics
- G10L2015/227—Procedures used during a speech recognition process using non-speech characteristics of the speaker; human-factor methodology
Definitions
- the present disclosure generally relates to methods and systems for managing speech recognition, and more particularly relates to methods and systems for managing speech recognition in an environment having multiple speech systems.
- Speech recognition may be used for interaction with multiple systems in an aircraft.
- the speech recognition capability of each system may be distinct. When operating an aircraft, it can be difficult for a flight crew member to designate, and keep track of, which of multiple distinct speech systems he or she is interacting with.
- independent speech systems could have separate but overlapping vocabularies, which could result in unintended inputs or control actions if a flight crew member believes they are commanding one system through speech but are, in fact, commanding another. If each system were instead developed so that there is little overlap between vocabularies, reduced recognition rates or non-responsive systems may result, which may be frustrating or confusing to crew members. These and other problems may exist for other environments having multiple independent speech systems. Hence, there is a need for systems and methods for managing speech inputs in a multi-speech system environment.
- a method includes: recording first user data that indicates an action of a user; determining, by a processor, a selection of a first speech enabled system based on the recorded user data; and generating, by the processor, a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.
- in another embodiment, a system includes: an input device that records first user data that indicates an action of a user; and a processor.
- the processor determines a selection of a speech enabled system based on the recorded user data, and generates a signal to at least one of activate and deactivate speech processing based on the speech enabled system.
- FIG. 1 is a functional block diagram illustrating a speech management system for an aircraft in accordance with exemplary embodiments.
- FIG. 2 is a dataflow diagram illustrating modules of the speech management system in accordance with exemplary embodiments.
- FIG. 3 is a flowchart illustrating a speech management method that may be performed by the speech management system in accordance with exemplary embodiments.
- in accordance with various embodiments, speech management systems are disclosed for managing speech input from a user in an environment having multiple speech enabled systems.
- the speech management system generally allows for the selection of a particular speech enabled system through one or more input modalities (e.g., speech, gesture, gaze, etc.). For example, a user may simply say the name of, point to, or look at the speech enabled system that he or she wants to control through speech, and the speech management system recognizes the intent of the user and activates speech recognition for that system. Since the suggested input modalities do not require the user to touch the intended system or physically activate it, the user is free to perform other tasks.
- the input modalities allow a user to interact with speech enabled systems that may be outside his or her reach-envelope. Based upon the selection, the speech management system generates signals to the selected speech enabled system and/or the non-selected speech enabled systems. The signals activate the speech recognition by the selected speech enabled system and/or deactivate speech recognition by the non-selected speech enabled systems.
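The activate-the-selected / deactivate-the-rest behavior described above can be sketched as follows. The system names and the `Signal` structure are illustrative assumptions for this sketch, not part of the patent.

```python
from dataclasses import dataclass

# Hypothetical registry of speech enabled systems (names are illustrative).
SPEECH_ENABLED_SYSTEMS = ["radio", "navigation", "checklist"]

@dataclass
class Signal:
    target: str     # which speech enabled system the signal is for
    activate: bool  # True -> enable its speech recognition, False -> disable

def route_selection(selected):
    """Activate recognition for the selected system; deactivate the rest."""
    if selected not in SPEECH_ENABLED_SYSTEMS:
        raise ValueError(f"unknown speech enabled system: {selected}")
    return [Signal(target=name, activate=(name == selected))
            for name in SPEECH_ENABLED_SYSTEMS]

signals = route_selection("radio")
```

In a centralized-processor variant, the same routing decision would select a vocabulary rather than send per-system signals.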
- exemplary embodiments of the present disclosure are directed to a speech management system shown generally at 10 that is associated with an aircraft 12 .
- the speech management system 10 described herein can be implemented in any aircraft 12 (vehicle or other environment) having onboard a computing device 14 that is associated with two or more speech enabled systems 16 a - 16 n.
- the speech enabled systems 16 a - 16 n each include a speech system that is configured to receive and process speech input from a crew member or other user. In various other embodiments, the speech enabled systems 16 a - 16 n receive inputs from a central speech processor (not shown) that performs speech processing for each of the speech enabled systems.
- the computing device 14 may be implemented as a part of one of the speech enabled systems 16 a and may communicate with the other speech enabled systems 16 b - 16 n, may be a stand-alone system that communicates with each of the speech enabled systems 16 a - 16 n (as shown), or may be partially part of one or more of the speech enabled systems 16 a - 16 n, and partially part of a stand-alone system.
- the computing device 14 may be associated with a display device 18 and one or more input devices 20 a - 20 n and may generally include a memory 22 , one or more processors 24 , and one or more input/output controllers 26 that are communicatively coupled to the display device 18 and the one or more input devices 20 a - 20 n.
- the input devices 20 a - 20 n include, for example, an activation switch 20 a, an audio recording device 20 b, and/or one or more video recording devices 20 n.
- the memory 22 stores instructions that can be performed by the processor 24 .
- the instructions stored in memory 22 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions.
- the instructions stored in the memory include an operating system (OS) 28 and a speech management system 30 .
- the operating system 28 controls the performance of other computer programs and provides scheduling, input-output control, file and data management, memory management, and communication control and related services.
- the processor 24 is configured to execute the instructions stored within the memory 22 , to communicate data to and from the memory 22 , and to generally control operations of the computing device 14 pursuant to the instructions.
- the processor 24 can be any custom made or commercially available processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computing device 14 , a semiconductor based microprocessor (in the form of a microchip or chip set), a macroprocessor, or generally any device for executing instructions.
- the processor 24 executes the instructions of the speech management system 30 of the present disclosure.
- the speech management system 30 generally allows for the selection of a particular speech enabled system 16 a - 16 n by a user through one or more input modalities (e.g., speech, gesture, gaze, etc.).
- the speech management system 30 recognizes the selection and activates the corresponding speech enabled system 16 a - 16 n based on the selection.
- the speech management system 30 continuously monitors data of one or more of the input modalities for the user initiated selection, and/or the speech management system 30 monitors data of one or more of the input modalities only after being activated via the activation switch 20 a or other input device.
- the speech management system 30 receives an activation signal from the activation switch 20 a and, in response, activates the audio recording device 20 b for recording a command spoken by a user.
- the command may include a first name or other name designating a selected speech enabled system 16 a - 16 n .
- the speech management system 30 processes the recorded audio data to determine the selected speech enabled system 16 a.
- the speech management system 30 activates speech recognition for the selected speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a - 16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all the speech enabled systems 16 a - 16 n ). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b - 16 n.
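A minimal sketch of mapping a spoken command transcript to a selected speech enabled system; the system names and aliases below are hypothetical stand-ins for the designating names described above.

```python
# Hypothetical name/alias vocabulary for each speech enabled system; a real
# avionics implementation would use its certified system designators.
SYSTEM_ALIASES = {
    "radio": {"radio", "comm"},
    "navigation": {"navigation", "nav"},
    "checklist": {"checklist"},
}

def select_from_command(transcript):
    """Return the speech enabled system whose name or alias appears in the
    transcribed command, or None so the caller can report 'not recognized'."""
    words = set(transcript.lower().split())
    for system, aliases in SYSTEM_ALIASES.items():
        if words & aliases:
            return system
    return None
```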
- the speech management system 30 receives an activation signal from the activation switch 20 a and, in response, activates the video recording device 20 n (or other device) for recording a gesture performed by a user.
- the gesture may include any gesture made by a finger, hand, or arm, such as pointing for a minimum amount of time, or using a finger movement (e.g., a twirl) to indicate a direction of a selected speech enabled system 16 a.
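One plausible way to resolve a pointing gesture to a system is to compare the pointing bearing against known system bearings; the bearings and tolerance below are assumptions for this sketch, not values from the patent.

```python
# Illustrative bearings (degrees, clockwise from straight ahead) of each
# speech enabled system relative to the user's seat.
SYSTEM_BEARINGS = {"radio": 0.0, "navigation": 45.0, "checklist": 90.0}
TOLERANCE_DEG = 15.0  # maximum angular error to accept a pointing gesture

def angular_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def select_by_direction(bearing_deg):
    """Map a pointing bearing to the nearest system, or None when no
    system lies within the acceptance tolerance."""
    best = min(SYSTEM_BEARINGS,
               key=lambda s: angular_diff(bearing_deg, SYSTEM_BEARINGS[s]))
    if angular_diff(bearing_deg, SYSTEM_BEARINGS[best]) <= TOLERANCE_DEG:
        return best
    return None
```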
- the speech management system 30 processes the recorded video data to determine the selected speech enabled system 16 a.
- the speech management system 30 activates speech recognition for the speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a - 16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16 a - 16 n ). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b - 16 n.
- the speech management system 30 receives an activation signal from the activation switch and, in response, activates the video recording device 20 n (or other device) for recording a gaze of the user.
- the gaze of the user's eyes may indicate a direction of a selected speech enabled system 16 a.
- the speech management system 30 processes the recorded video data to determine the selected speech enabled system 16 a.
- the speech management system 30 activates speech recognition for the speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a - 16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16 a - 16 n ). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b - 16 n.
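Gaze data arrives as a stream of samples rather than a single event, so one reasonable (assumed, not patent-specified) policy is to accept a selection only after the gaze has dwelt on the same system for a minimum number of consecutive samples:

```python
def select_by_dwell(samples, min_dwell):
    """samples: chronological list of the system each gaze sample resolved
    to (None when no system was in view). Return a system once it has been
    fixated for min_dwell consecutive samples, else None."""
    run_target, run_len = None, 0
    for target in samples:
        if target is not None and target == run_target:
            run_len += 1
        else:
            run_target = target
            run_len = 1 if target is not None else 0
        if run_target is not None and run_len >= min_dwell:
            return run_target
    return None
```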
- a dataflow diagram illustrates various embodiments of the speech management system 30 .
- Various embodiments of speech management systems 30 may include any number of sub-modules embedded within the speech management system 30 .
- the sub-modules shown in FIG. 2 may be combined and/or further partitioned to manage speech input to the speech management system 30 .
- the inputs to the speech management system 30 may be received from other modules (not shown), determined/modeled by other sub-modules (not shown) within the speech management system 30 , and/or may be user input that is based on a user interacting with a user interface via an input device 20 a - 20 n.
- the speech management system 30 includes a system activation module 31 , at least one of a speech processing module 32 , a gaze processing module 34 , and a gesture processing module 36 (or any other processing modules depending on the number of input modalities), and a speech system activation/deactivation module 38 .
- the system activation module 31 receives as input user input data 40 .
- the user input data 40 may be received based on a user interacting with an input device, such as, for example, the activation switch 20 a or other device.
- the system activation module 31 processes the user input data 40 to determine if the user input data indicates a user's request to activate the selection of a speech enabled system 16 a - 16 n. If the user input data 40 does not indicate to activate speech system selection, optionally, the system activation module 31 may generate display data 42 that includes a message that may be displayed in an interface that indicates that the input is not recognized. If the user input data 40 indicates to activate the speech system selection, the system activation module 31 sets an activation flag 44 to TRUE (or other value indicating to activate the speech system selection).
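The system activation module's flag-setting logic can be sketched as follows; the set of accepted activation inputs is an illustrative assumption standing in for the activation switch 20 a.

```python
def process_activation_input(user_input):
    """Return (activation_flag, display_message) for the system activation
    module 31: TRUE with no message when the input requests speech system
    selection, else FALSE with an 'input not recognized' message."""
    if user_input in {"activation_switch", "activate_selection"}:
        return True, None                 # activation flag 44 set to TRUE
    return False, "input not recognized"  # optional display data 42
```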
- the speech processing module 32 receives as input the activation flag 44 .
- when the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the speech processing module 32 sends a signal 46 to the recording device 20 b to activate audio recording.
- the speech processing module 32 receives the recorded speech data 48 .
- the speech processing module 32 processes the recorded speech data 48 to determine a spoken command. The processing can be performed based on a set of recognized commands that identify speech enabled systems 16 a - 16 n of the aircraft 12 and speech processing techniques known in the art.
- if the speech processing module 32 is unable to recognize a spoken command from the recorded speech data 48 , optionally, the speech processing module 32 generates display data 50 that includes a message that, when displayed, indicates that the command was not recognized. If a spoken command was recognized, the speech processing module 32 determines a particular speech enabled system 16 a of the speech enabled systems 16 a - 16 n on the aircraft 12 and sets a selected speech system 52 to the particular speech enabled system.
- the gaze processing module 34 receives as input the activation flag 44 .
- when the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gaze processing module 34 sends a signal 54 to the recording device 20 n to activate video recording.
- the gaze processing module 34 receives recorded gaze data 56 .
- the gaze processing module 34 processes the recorded gaze data 56 to determine a gaze direction. The processing can be performed based on gaze recognition techniques known in the art. If the gaze processing module 34 is unable to recognize a gaze direction from the recorded gaze data 56 , optionally, the gaze processing module 34 generates display data 58 that includes a message that, when displayed, indicates that the gaze direction was not identified. If a gaze direction was recognized, the gaze processing module 34 determines a particular speech enabled system 16 a of the speech enabled systems 16 a - 16 n on the aircraft 12 and sets a selected speech enabled system 60 to the particular speech enabled system.
- the gesture processing module 36 receives as input the activation flag 44 .
- when the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gesture processing module 36 sends a signal 62 to the recording device 20 n to activate video recording.
- the gesture processing module 36 receives recorded gesture data 64 .
- the gesture processing module 36 processes the recorded gesture data 64 to determine a gesture direction. The processing can be performed based on gesture recognition techniques known in the art. If the gesture processing module 36 is unable to recognize a gesture direction from the recorded gesture data 64 , optionally, the gesture processing module 36 generates display data 66 that includes a message that, when displayed, indicates that the gesture direction was not identified. If a gesture direction was recognized, the gesture processing module 36 determines a particular speech enabled system 16 a of the speech enabled systems 16 a - 16 n on the aircraft 12 and sets a selected speech system 68 to the particular speech enabled system.
- the speech system activation/deactivation module 38 receives as input the selected speech system 68 from the gesture processing module 36 , the selected speech system 60 from the gaze processing module 34 , and/or the selected speech system 52 from the speech processing module 32 .
- the speech system activation/deactivation module 38 generates an activation/deactivation signal 70 based on the received selected speech system 52 , 60 , 68 .
- the activation/deactivation signals 70 are received by the speech enabled systems 16 a - 16 n to activate and/or deactivate speech processing, or alternatively, the activation/deactivation signals 70 are used to activate and/or deactivate speech processing by a centralized speech processor using a particular vocabulary and/or speech processing techniques.
- the speech system activation/deactivation module 38 determines the appropriate speech enabled system 16 a - 16 n to generate the activation/deactivation signals based on an arbitration method. For example, if two or more of the selected speech systems 52 , 60 , 68 are the same, then the activation/deactivation signals 70 are generated based on the same selected speech systems. If two or more of the selected speech systems 52 , 60 , 68 are different, then the selected speech system associated with a processing technique having the highest priority is selected.
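The arbitration just described can be sketched as follows; the priority ordering of the modalities is an illustrative assumption, since the patent does not fix one.

```python
# Assumed priority ordering of the input modalities, highest first.
MODALITY_PRIORITY = ["speech", "gesture", "gaze"]

def arbitrate(selections):
    """selections maps modality name -> selected system (or None).
    Identical selections win outright; on disagreement, the selection
    from the highest-priority modality wins."""
    votes = [s for s in selections.values() if s is not None]
    if not votes:
        return None
    if len(set(votes)) == 1:            # all modalities agree
        return votes[0]
    for modality in MODALITY_PRIORITY:  # disagreement: priority decides
        if selections.get(modality):
            return selections[modality]
    return None
```

A fallback when arbitration cannot decide would be to display the competing selections and let the user pick, as described next.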
- the speech system activation/deactivation module 38 may generate display data 72 that includes a message that indicates the different selected speech systems 52 , 60 , 68 and a request to pick one of them.
- user input data 74 may be received indicating a selected one of the different selected speech systems 52 , 60 , 68 , and the speech system activation/deactivation module 38 generates the activation/deactivation signals 70 based on the selected one.
- referring now to FIG. 3 , a flowchart illustrates a method that may be performed by the speech management system 30 in accordance with the present disclosure.
- the order of operation within the method is not limited to the sequential execution as illustrated in FIG. 3 , but may be performed in one or more varying orders as applicable and in accordance with the present disclosure.
- the method can be scheduled to run based on predetermined events, and/or can run continually during operation of the computing device 14 of the aircraft 12 .
- the method may begin at 100 . It is determined whether user input data 40 is received at 110 .
- the user input data 40 may be received based on a user selecting an input device, such as, for example, the activation switch 20 a or other device. If user input data 40 is not received at 110 , the method continues with monitoring for user input data 40 at 110 . If, however, the user input data 40 is received at 110 , the user input data 40 is processed at 120 and evaluated at 130 . If the user input data 40 does not indicate to activate speech recognition at 130 , optionally, a message may be displayed that indicates that the input is not recognized at 140 and the method continues with monitoring for user input data 40 at 110 .
- otherwise, the input device 20 b , 20 n is activated at 150 to start recording of the speech, the gesture, and/or the gaze of the user.
- if the recorded input is speech input at 160 , the recorded speech data 48 is processed at 170 based on speech recognition methods to determine the speech command, and the selected speech system 52 is determined from the speech command at 180 .
- the activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a - 16 n at 190 based on the selected speech system 52 . Thereafter, the method may end at 200 .
- if the recorded data is not speech data at 160 but rather is gaze data 56 at 210 , the recorded gaze data 56 is processed at 220 based on gaze recognition methods to determine the direction of gaze of the user.
- the selected speech system 60 is determined from the direction of gaze of the user at 230 .
- the activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a - 16 n at 190 based on the selected speech system 60 . Thereafter, the method may end at 200 .
- otherwise, the recorded gesture data 64 is processed at 250 based on gesture recognition methods to determine the direction of the gesture of the user.
- the selected speech system 68 is determined from the direction of gesture of the user at 260 .
- the activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a - 16 n at 190 based on the selected speech system 68 . Thereafter, the method may end at 200 .
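The three processing branches of FIG. 3 can be sketched as a single dispatch; the recognizer callables below are stand-ins for the speech, gaze, and gesture recognition methods referenced above.

```python
def process_recording(kind, data, recognizers):
    """Route recorded data (speech at 160, gaze at 210, gesture otherwise)
    to the matching recognizer; return the selected speech system, or None
    if the input could not be interpreted."""
    if kind not in recognizers:
        return None
    return recognizers[kind](data)

# Stand-in recognizers; real ones would run actual recognition methods.
demo_recognizers = {
    "speech": lambda d: d.get("command_target"),
    "gaze": lambda d: d.get("gaze_target"),
    "gesture": lambda d: d.get("gesture_target"),
}
```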
- the method shown in FIG. 3 illustrates processing one of speech data, gaze data, and gesture data to determine the selected speech system.
- in various embodiments, two or more of the speech data, the gaze data, and the gesture data can be processed to determine the selected speech system. For example, if two or more inputs indicate the same speech system, then that speech system is the selected speech system. In another example, if one input indicates a first speech system and another input indicates a second speech system, then a message may be displayed indicating the discrepancy.
- Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
- an embodiment of a system or a component may employ various integrated circuit components, e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices.
- the methods and modules described herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
- a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
- a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
- a software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
- An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
- the storage medium may be integral to the processor.
- the processor and the storage medium may reside in an ASIC.
- the ASIC may reside in a user terminal.
- the processor and the storage medium may reside as discrete components in a user terminal.
Abstract
Description
- The present disclosure generally relates to methods and systems for managing speech recognition, and more particularly relates to methods and systems for managing speech recognition in an environment having multiple speech systems.
- Speech recognition may be used for interaction with multiple systems in an aircraft. In some cases, the speech recognition capability of each system may be distinct. Interacting with multiple distinct speech systems when operating an aircraft can be difficult for a flight crew member to designate and keep track of which system they are interacting with.
- In addition, the independent speech systems could have separate but overlapping vocabularies, which could result in unintended inputs or control actions if the flight crew believes they are commanding one system through speech, but in fact, are commanding another. If each system were developed such that there is little overlap in vocabularies, a risk of reduced recognition rates or non-responsive systems may result which may be frustrating or confusing to the crew members. These and other problems may exist for other environments having multiple independent speech systems.
- Hence, there is a need for systems and methods for managing speech inputs in a multi-speech system environment. Other desirable features and characteristics will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the foregoing technical field and background.
- Methods and system are provided for managing speech processing in an environment having at least two speech enabled systems. In one embodiment, a method includes: recording first user data that indicates an action of a user; determining, by a processor, a selection of a first speech enabled system based on the recorded user data; and generating, by the processor, a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.
- In another embodiment, a system includes: an input device that records first user data that indicates an action of a user; and a processor. The processor determines a selection of a speech enabled system based on the recorded user data, and generates a signal to at least one of activate and deactivate speech processing based on the speech enabled system.
- Furthermore, other desirable features and characteristics of the method and system will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the preceding background.
- The present invention will hereinafter be described in conjunction with the following figures, wherein like numerals denote like elements, and wherein:
-
FIG. 1 is a functional block diagram illustrating a speech management system for an aircraft in accordance with exemplary embodiments; -
FIG. 2 is a dataflow diagram illustrating modules of the speech management system in accordance with exemplary embodiments; and -
FIG. 3 is a flowchart illustrating a speech management method that may be performed by the speech management system in accordance with exemplary embodiments. - The following detailed description is merely exemplary in nature and is not intended to limit the disclosure or the application and uses of the disclosure. As used herein, the word “exemplary” means “serving as an example, instance, or illustration.” Thus, any embodiment described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments. All of the embodiments described herein are exemplary embodiments provided to enable persons skilled in the art to make or use the invention and not to limit the scope of the invention which is defined by the claims. Furthermore, there is no intention to be bound by any expressed or implied theory presented in the preceding technical field, background, brief summary, or the following detailed description.
- In accordance with various embodiments, speech management systems are disclosed for managing speech input from a user in an environment having multiple speech enabled systems. The speech management system generally allows for the selection of a particular speech enabled system through one or more input modalities (e.g., speech, gesture, gaze, etc.). For example, a user may simply say, point to, or look at the speech enabled system that he or she wants to control through speech, and the speech management system recognizes the intent of the user and activates speech recognition for that system. Since the suggested input modalities do not require the user to touch the intended system or physically activate the system, the user is free to perform other tasks. In addition, the input modalities allow a user to interact with speech enabled systems that may be outside his or her reach-envelope. Based upon the selection, the speech management system generates signals to the selected speech enabled system and/or the non-selected speech enabled systems. The signals activate speech recognition by the selected speech enabled system and/or deactivate speech recognition by the non-selected speech enabled systems.
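For the gaze and gesture modalities described above, determining the selected system amounts to mapping a recognized direction onto the nearest speech enabled system. A minimal sketch, assuming invented bearings and an invented tolerance threshold:

```python
# Map a recognized gaze/gesture bearing (degrees) to the nearest speech
# enabled system. The bearings and the tolerance are invented for this sketch.
SYSTEM_BEARINGS = {
    "left_display": 315.0,
    "center_console": 0.0,
    "right_display": 45.0,
}

def angular_distance(a, b):
    """Smallest angle between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def select_by_direction(bearing, tolerance=20.0):
    """Return the system closest to the recognized direction, or None
    when nothing lies within the tolerance (direction not identified)."""
    best = min(SYSTEM_BEARINGS,
               key=lambda s: angular_distance(bearing, SYSTEM_BEARINGS[s]))
    if angular_distance(bearing, SYSTEM_BEARINGS[best]) > tolerance:
        return None
    return best
```

A gaze or gesture recognized at roughly 40 degrees would select the right display, while a direction pointing at nothing within tolerance yields the "not identified" case described below.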
- Referring now to
FIG. 1, exemplary embodiments of the present disclosure are directed to a speech management system shown generally at 10 that is associated with an aircraft 12. As can be appreciated, the speech management system 10 described herein can be implemented in any aircraft 12 (vehicle or other environment) having onboard a computing device 14 that is associated with two or more speech enabled systems 16a-16n. - In various embodiments, the speech enabled systems 16a-16n each include a speech system that is configured to receive and process speech input from a crew member or other user. In various other embodiments, the speech enabled systems 16a-16n receive inputs from a central speech processor (not shown) that performs speech processing for each of the speech enabled systems. As can be appreciated, the
computing device 14 may be implemented as a part of one of the speech enabled systems 16a and may communicate with the other speech enabled systems 16b-16n, may be a stand-alone system that communicates with each of the speech enabled systems 16a-16n (as shown), or may be partially part of one or more of the speech enabled systems 16a-16n and partially part of a stand-alone system. - The
computing device 14 may be associated with a display device 18 and one or more input devices 20a-20n and may generally include a memory 22, one or more processors 24, and one or more input/output controllers 26 that are communicatively coupled to the display device 18 and the one or more input devices 20a-20n. The input devices 20a-20n include, for example, an activation switch 20a, an audio recording device 20b, and/or one or more video recording devices 20n. - In various embodiments, the
memory 22 stores instructions that can be performed by the processor 24. The instructions stored in memory 22 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions. In the example of FIG. 1, the instructions stored in the memory include an operating system (OS) 28 and a speech management system 30. - The
operating system 28 controls the performance of other computer programs and provides scheduling, input-output control, file and data management, memory management, and communication control and related services. When the computing device 14 is in operation, the processor 24 is configured to execute the instructions stored within the memory 22, to communicate data to and from the memory 22, and to generally control operations of the computing device 14 pursuant to the instructions. The processor 24 can be any custom made or commercially available processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computing device 14, a semiconductor based microprocessor (in the form of a microchip or chip set), a macroprocessor, or generally any device for executing instructions. - The
processor 24 executes the instructions of the speech management system 30 of the present disclosure. The speech management system 30 generally allows for the selection of a particular speech enabled system 16a-16n by a user through one or more input modalities (e.g., speech, gesture, gaze, etc.). The speech management system 30 recognizes the selection and activates the corresponding speech enabled system 16a-16n based on the selection. - In various embodiments, the
speech management system 30 continuously monitors data of one or more of the input modalities for the user initiated selection, and/or the speech management system 30 monitors data of one or more of the input modalities only after being activated via the activation switch 20a or other input device. For example, in the case of using speech for the identification of the selection, the speech management system 30 receives an activation signal from the activation switch 20a and, in response, activates the audio recording device 20b for recording a command spoken by a user. The command may include a first name or other name designating a selected speech enabled system 16a-16n. The speech management system 30 processes the recorded audio data to determine the selected speech enabled system 16a. Once the speech enabled system 16a has been selected, the speech management system 30 activates speech recognition for the selected speech enabled system 16a by sending an activation signal to the speech enabled system 16a (e.g., when each speech enabled system 16a-16n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16a (e.g., when a centralized speech processor performs the speech processing for all the speech enabled systems 16a-16n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16b-16n. - In another example, in the case of using gesture for the identification of the selection, the
speech management system 30 receives an activation signal from the activation switch 20a and, in response, activates the video recording device 20n (or other device) for recording a gesture performed by a user. The gesture may include any gesture made by a finger, hand, or arm, such as pointing for a minimum amount of time, or using a finger movement (e.g., a twirl) to indicate a direction of a selected speech enabled system 16a. The speech management system 30 processes the recorded video data to determine the selected speech enabled system 16a. Once the speech enabled system 16a has been selected, the speech management system 30 activates speech recognition for the speech enabled system 16a by sending an activation signal to the speech enabled system 16a (e.g., when each speech enabled system 16a-16n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16a-16n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16b-16n. - In still another example, in the case of using gaze for the identification of the selection, the
speech management system 30 receives an activation signal from the activation switch and, in response, activates the video recording device 20n (or other device) for recording a gaze of the user. The gaze of the user's eyes may indicate a direction of a selected speech enabled system 16a. The speech management system 30 processes the recorded video data to determine the selected speech enabled system 16a. Once a speech enabled system 16a has been selected, the speech management system 30 activates speech recognition for the speech enabled system 16a by sending an activation signal to the speech enabled system 16a (e.g., when each speech enabled system 16a-16n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16a-16n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16b-16n. - Referring now to
FIG. 2 and with continued reference to FIG. 1, a dataflow diagram illustrates various embodiments of the speech management system 30. Various embodiments of speech management systems 30 according to the present disclosure may include any number of sub-modules embedded within the speech management system 30. As can be appreciated, the sub-modules shown in FIG. 2 may be combined and/or further partitioned to manage speech input to the speech management system 30. The inputs to the speech management system 30 may be received from other modules (not shown), determined/modeled by other sub-modules (not shown) within the speech management system 30, and/or may be user input that is based on a user interacting with a user interface via an input device 20a-20n. In various embodiments, the speech management system 30 includes a system activation module 31, at least one of a speech processing module 32, a gaze processing module 34, and a gesture processing module 36 (or any other processing modules depending on the number of input modalities), and a speech system activation/deactivation module 38. - The
system activation module 31 receives as input user input data 40. The user input data 40 may be received based on a user interacting with an input device, such as, for example, the activation switch 20a or other device. The system activation module 31 processes the user input data 40 to determine if the user input data indicates a user's request to activate the selection of a speech enabled system 16a-16n. If the user input data 40 does not indicate to activate speech system selection, optionally, the system activation module 31 may generate display data 42 that includes a message that may be displayed in an interface that indicates that the input is not recognized. If the user input data 40 indicates to activate the speech system selection, the system activation module 31 sets an activation flag 44 to TRUE (or other value indicating to activate the speech system selection). - The
speech processing module 32, for example, receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the speech processing module 32 sends a signal 46 to the recording device 20b to activate audio recording. In return, the speech processing module 32 receives the recorded speech data 48. The speech processing module 32 processes the recorded speech data 48 to determine a spoken command. The processing can be performed based on a set of recognized commands that identify speech enabled systems 16a-16n of the aircraft 12 and speech processing techniques known in the art. If the speech processing module 32 is unable to recognize a spoken command from the recorded speech data 48, optionally, the speech processing module 32 generates display data 50 that includes a message that, when displayed, indicates that the command was not recognized. If a spoken command was recognized, the speech processing module 32 determines a particular speech enabled system 16a of the speech enabled systems 16a-16n on the aircraft 12 and sets a selected speech system 52 to the particular speech enabled system. - The
gaze processing module 34 receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gaze processing module 34 sends a signal 54 to the recording device 20n to activate video recording. In return, the gaze processing module 34 receives recorded gaze data 56. The gaze processing module 34 processes the recorded gaze data 56 to determine a gaze direction. The processing can be performed based on gaze recognition techniques known in the art. If the gaze processing module 34 is unable to recognize a gaze direction from the recorded gaze data 56, optionally, the gaze processing module 34 generates display data 58 that includes a message that, when displayed, indicates that the gaze direction was not identified. If a gaze direction was recognized, the gaze processing module 34 determines a particular speech enabled system 16a of the speech enabled systems 16a-16n on the aircraft 12 and sets a selected speech system 60 to the particular speech enabled system. - The
gesture processing module 36 receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gesture processing module 36 sends a signal 62 to the recording device 20n to activate video recording. In return, the gesture processing module 36 receives recorded gesture data 64. The gesture processing module 36 processes the recorded gesture data 64 to determine a gesture direction. The processing can be performed based on gesture recognition techniques known in the art. If the gesture processing module 36 is unable to recognize a gesture direction from the recorded gesture data 64, optionally, the gesture processing module 36 generates display data 66 that includes a message that, when displayed, indicates that the gesture direction was not identified. If a gesture direction was recognized, the gesture processing module 36 determines a particular speech enabled system 16a of the speech enabled systems 16a-16n on the aircraft 12 and sets a selected speech system 68 to the particular speech enabled system. - The speech system activation/
deactivation module 38 receives as input the selected speech system 68 from the gesture processing module 36, the selected speech system 60 from the gaze processing module 34, and/or the selected speech system 52 from the speech processing module 32. The speech system activation/deactivation module 38 generates an activation/deactivation signal 70 based on the received selected speech system 52, 60, or 68.
speech systems gesture processing module 36, thegaze processing module 34, and thespeech processing module 32, the speech system activation/deactivation module 38 determines the appropriate speech enabled system 16 a-16 n to generate the activation/deactivation signals based on an arbitration method. For example, if two or more of the selectedspeech systems speech systems deactivation module 38 may generatedisplay data 72 that includes a message that indicates the different selected speech enabledsystems speech systems user input data 74 may be received indicating a selected one of the different selectedspeech systems deactivation module 38 generates the activation/deactivation signals 70 based on the selected one. - Referring now to
FIG. 3 and with continued reference to FIGS. 1 and 2, a flowchart illustrates a method that may be performed by the speech management system 30 in accordance with the present disclosure. As can be appreciated in light of the disclosure, the order of operation within the method is not limited to the sequential execution as illustrated in FIG. 3, but may be performed in one or more varying orders as applicable and in accordance with the present disclosure.
computing device 14 of theaircraft 12. - The method may begin at 100. It is determined whether
user input data 40 is received at 110. Theuser input data 40 may be received based on a user selecting an input device, such as, for example, theactivation switch 20 a or other device. Ifuser input data 40 is not received at 110, the method continues with monitoring foruser input data 40 at 110. If however, theuser input data 40 is received at 110, theuser input data 40 is processed at 120 and evaluated at 130. If theuser input data 40 does not indicate to activate speech recognition at 130, optionally, a message may be displayed that indicates that the input is not recognized at 140 and the method continues with monitoring foruser input data 40 at 110. - If, however, the
user input data 40 does indicate to activate speech recognition at 130, the input device is activated at 150 to record user data. If the recorded data is speech data 48 at 160, the recorded speech data 48 is processed at 170 based on speech recognition methods to determine the speech command. The selected speech system 52 is determined from the speech command at 180. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16a-16n at 190 based on the selected speech system 52. Thereafter, the method may end at 200. - If, however, the recorded data is not speech data at 160, but rather the recorded data is
gaze data 56 at 210, the recorded gaze data 56 is processed at 220 based on gaze recognition methods to determine the direction of gaze of the user. The selected speech system 60 is determined from the direction of gaze of the user at 230. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16a-16n at 190 based on the selected speech system 60. Thereafter, the method may end at 200. - If, however, the recorded data is not speech data at 160, and the recorded data is not gaze data at 210, rather the recorded data is recorded
gesture data 64 at 240, the recorded gesture data 64 is processed at 250 based on gesture recognition methods to determine the direction of the gesture of the user. The selected speech system 68 is determined from the direction of the gesture of the user at 260. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16a-16n at 190 based on the selected speech system 68. Thereafter, the method may end at 200. - The method shown in
FIG. 3 illustrates processing one of speech data, gaze data, and gesture data to determine the selected speech system. As can be appreciated, two or more of the speech data, the gaze data, and the gesture data can be processed to determine the selected speech system. For example, if two or more inputs indicate the same speech system, then that speech system is the selected speech system. In another example, if one input indicates a first speech system and another input indicates a second speech system, then a message may be displayed indicating the discrepancy. - Those of skill in the art will appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. Some of the embodiments and implementations are described above in terms of functional and/or logical block components (or modules) and various processing steps. However, it should be appreciated that such block components (or modules) may be realized by any number of hardware, software, and/or firmware components configured to perform the specified functions. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. 
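The multi-input agreement check described above (two or more inputs naming the same system select it, while conflicting inputs surface a discrepancy message) might be sketched as follows; the result encoding and modality names are assumptions for the sketch:

```python
# Sketch of arbitrating selections from multiple modality modules:
# agreement yields an activation, disagreement yields a discrepancy
# so the user can confirm a choice. The tuple encoding is invented.

def arbitrate(selections):
    """selections: dict mapping modality name -> selected system (or None).
    Returns ('activate', system), ('discrepancy', candidates), or ('none', None)."""
    reported = [s for s in selections.values() if s is not None]
    if not reported:
        return ("none", None)
    distinct = sorted(set(reported))
    if len(distinct) == 1:
        # All modalities that reported anything agree on one system.
        return ("activate", distinct[0])
    # Conflicting selections: surface the candidates for user confirmation.
    return ("discrepancy", distinct)
```

On a discrepancy result, a caller following the description above would display the candidate systems and wait for the user to indicate a selected one before generating the activation/deactivation signals.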
For example, an embodiment of a system or a component may employ various integrated circuit components, e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices. In addition, those skilled in the art will appreciate that embodiments described herein are merely exemplary implementations.
- The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
- The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
- In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Numerical ordinals such as “first,” “second,” “third,” etc. simply denote different singles of a plurality and do not imply any order or sequence unless specifically defined by the claim language. The sequence of the text in any of the claims does not imply that process steps must be performed in a temporal or logical order according to such sequence unless it is specifically defined by the language of the claim. The process steps may be interchanged in any order without departing from the scope of the invention as long as such an interchange does not contradict the claim language and is not logically nonsensical.
- While at least one exemplary embodiment has been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention. It being understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.
Claims (20)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/325,916 US9569174B2 (en) | 2014-07-08 | 2014-07-08 | Methods and systems for managing speech recognition in a multi-speech system environment |
EP15171540.6A EP2966644A3 (en) | 2014-07-08 | 2015-06-10 | Methods and systems for managing speech recognition in a multi-speech system environment |
CN201510392770.XA CN105261361A (en) | 2014-07-08 | 2015-07-07 | Methods and systems for managing speech recognition in a multi-speech system environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/325,916 US9569174B2 (en) | 2014-07-08 | 2014-07-08 | Methods and systems for managing speech recognition in a multi-speech system environment |
Publications (2)
Publication Number | Publication Date |
---|---|
US20160011853A1 true US20160011853A1 (en) | 2016-01-14 |
US9569174B2 US9569174B2 (en) | 2017-02-14 |
Family
ID=53373350
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/325,916 Expired - Fee Related US9569174B2 (en) | 2014-07-08 | 2014-07-08 | Methods and systems for managing speech recognition in a multi-speech system environment |
Country Status (3)
Country | Link |
---|---|
US (1) | US9569174B2 (en) |
EP (1) | EP2966644A3 (en) |
CN (1) | CN105261361A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9653075B1 (en) * | 2015-11-06 | 2017-05-16 | Google Inc. | Voice commands across devices |
US9824689B1 (en) * | 2015-12-07 | 2017-11-21 | Rockwell Collins Inc. | Speech recognition for avionic systems |
US10890969B2 (en) * | 2018-05-04 | 2021-01-12 | Google Llc | Invoking automated assistant function(s) based on detected gesture and gaze |
WO2021086600A1 (en) * | 2019-11-01 | 2021-05-06 | Microsoft Technology Licensing, Llc | Selective response rendering for virtual assistants |
US11289081B2 (en) * | 2018-11-08 | 2022-03-29 | Sharp Kabushiki Kaisha | Refrigerator |
US11614794B2 (en) | 2018-05-04 | 2023-03-28 | Google Llc | Adapting automated assistant based on detected mouth movement and/or gaze |
US11688417B2 (en) * | 2018-05-04 | 2023-06-27 | Google Llc | Hot-word free adaptation of automated assistant function(s) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0702355A2 (en) * | 1994-09-14 | 1996-03-20 | Canon Kabushiki Kaisha | Speech recognition method and apparatus |
US6157403A (en) * | 1996-08-05 | 2000-12-05 | Kabushiki Kaisha Toshiba | Apparatus for detecting position of object capable of simultaneously detecting plural objects and detection method therefor |
US6868383B1 (en) * | 2001-07-12 | 2005-03-15 | At&T Corp. | Systems and methods for extracting meaning from multimodal inputs using finite-state devices |
US20050175218A1 (en) * | 2003-11-14 | 2005-08-11 | Roel Vertegaal | Method and apparatus for calibration-free eye tracking using multiple glints or surface reflections |
US20060110008A1 (en) * | 2003-11-14 | 2006-05-25 | Roel Vertegaal | Method and apparatus for calibration-free eye tracking |
US20070016426A1 (en) * | 2005-06-28 | 2007-01-18 | Microsoft Corporation | Audio-visual control system |
US20070081090A1 (en) * | 2005-09-27 | 2007-04-12 | Mona Singh | Method and system for associating user comments to a scene captured by a digital imaging device |
US20070288242A1 (en) * | 2006-06-12 | 2007-12-13 | Lockheed Martin Corporation | Speech recognition and control system, program product, and related methods |
US20110125503A1 (en) * | 2009-11-24 | 2011-05-26 | Honeywell International Inc. | Methods and systems for utilizing voice commands onboard an aircraft |
US20130210406A1 (en) * | 2012-02-12 | 2013-08-15 | Joel Vidal | Phone that prevents texting while driving |
US20130281079A1 (en) * | 2012-02-12 | 2013-10-24 | Joel Vidal | Phone that prevents concurrent texting and driving |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2696258B1 (en) | 1992-09-25 | 1994-10-28 | Sextant Avionique | Device for managing a human-machine interaction system. |
US6154723A (en) | 1996-12-06 | 2000-11-28 | The Board Of Trustees Of The University Of Illinois | Virtual reality 3D interface system for data creation, viewing and editing |
US6219645B1 (en) * | 1999-12-02 | 2001-04-17 | Lucent Technologies, Inc. | Enhanced automatic speech recognition using multiple directional microphones |
EP1215658A3 (en) | 2000-12-05 | 2002-08-14 | Hewlett-Packard Company | Visual activation of voice controlled apparatus |
WO2007017796A2 (en) * | 2005-08-11 | 2007-02-15 | Philips Intellectual Property & Standards Gmbh | Method for introducing interaction pattern and application functionalities |
US8964298B2 (en) | 2010-02-28 | 2015-02-24 | Microsoft Corporation | Video display modification based on sensor input for a see-through near-to-eye display |
CN103092557A (en) * | 2011-11-01 | 2013-05-08 | 上海博泰悦臻网络技术服务有限公司 | Vehicular speech input device and method |
CN102614057B (en) * | 2012-04-11 | 2013-08-28 | 合肥工业大学 | Multifunctional electric nursing sickbed with intelligent residential environment |
US9423870B2 (en) | 2012-05-08 | 2016-08-23 | Google Inc. | Input determination method |
CN102945672B (en) * | 2012-09-29 | 2013-10-16 | 深圳市国华识别科技开发有限公司 | Voice control system for multimedia equipment, and voice control method |
- 2014-07-08 US US14/325,916 patent/US9569174B2/en not_active Expired - Fee Related
- 2015-06-10 EP EP15171540.6A patent/EP2966644A3/en not_active Ceased
- 2015-07-07 CN CN201510392770.XA patent/CN105261361A/en active Pending
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0702355A2 (en) * | 1994-09-14 | 1996-03-20 | Canon Kabushiki Kaisha | Speech recognition method and apparatus |
US6157403A (en) * | 1996-08-05 | 2000-12-05 | Kabushiki Kaisha Toshiba | Apparatus for detecting position of object capable of simultaneously detecting plural objects and detection method therefor |
US6868383B1 (en) * | 2001-07-12 | 2005-03-15 | At&T Corp. | Systems and methods for extracting meaning from multimodal inputs using finite-state devices |
US7295975B1 (en) * | 2001-07-12 | 2007-11-13 | At&T Corp. | Systems and methods for extracting meaning from multimodal inputs using finite-state devices |
US20050175218A1 (en) * | 2003-11-14 | 2005-08-11 | Roel Vertegaal | Method and apparatus for calibration-free eye tracking using multiple glints or surface reflections |
US20060110008A1 (en) * | 2003-11-14 | 2006-05-25 | Roel Vertegaal | Method and apparatus for calibration-free eye tracking |
US20070016426A1 (en) * | 2005-06-28 | 2007-01-18 | Microsoft Corporation | Audio-visual control system |
US20070081090A1 (en) * | 2005-09-27 | 2007-04-12 | Mona Singh | Method and system for associating user comments to a scene captured by a digital imaging device |
US20070288242A1 (en) * | 2006-06-12 | 2007-12-13 | Lockheed Martin Corporation | Speech recognition and control system, program product, and related methods |
US7774202B2 (en) * | 2006-06-12 | 2010-08-10 | Lockheed Martin Corporation | Speech activated control system and related methods |
US20110125503A1 (en) * | 2009-11-24 | 2011-05-26 | Honeywell International Inc. | Methods and systems for utilizing voice commands onboard an aircraft |
US8515763B2 (en) * | 2009-11-24 | 2013-08-20 | Honeywell International Inc. | Methods and systems for utilizing voice commands onboard an aircraft |
US20130210406A1 (en) * | 2012-02-12 | 2013-08-15 | Joel Vidal | Phone that prevents texting while driving |
US8538402B2 (en) * | 2012-02-12 | 2013-09-17 | Joel Vidal | Phone that prevents texting while driving |
US20130281079A1 (en) * | 2012-02-12 | 2013-10-24 | Joel Vidal | Phone that prevents concurrent texting and driving |
US8914014B2 (en) * | 2012-02-12 | 2014-12-16 | Joel Vidal | Phone that prevents concurrent texting and driving |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9653075B1 (en) * | 2015-11-06 | 2017-05-16 | Google Inc. | Voice commands across devices |
US20170249940A1 (en) * | 2015-11-06 | 2017-08-31 | Google Inc. | Voice commands across devices |
US10714083B2 (en) * | 2015-11-06 | 2020-07-14 | Google Llc | Voice commands across devices |
US11749266B2 (en) | 2015-11-06 | 2023-09-05 | Google Llc | Voice commands across devices |
US9824689B1 (en) * | 2015-12-07 | 2017-11-21 | Rockwell Collins Inc. | Speech recognition for avionic systems |
US10890969B2 (en) * | 2018-05-04 | 2021-01-12 | Google Llc | Invoking automated assistant function(s) based on detected gesture and gaze |
US11493992B2 (en) | 2018-05-04 | 2022-11-08 | Google Llc | Invoking automated assistant function(s) based on detected gesture and gaze |
US11614794B2 (en) | 2018-05-04 | 2023-03-28 | Google Llc | Adapting automated assistant based on detected mouth movement and/or gaze |
US11688417B2 (en) * | 2018-05-04 | 2023-06-27 | Google Llc | Hot-word free adaptation of automated assistant function(s) |
US11289081B2 (en) * | 2018-11-08 | 2022-03-29 | Sharp Kabushiki Kaisha | Refrigerator |
WO2021086600A1 (en) * | 2019-11-01 | 2021-05-06 | Microsoft Technology Licensing, Llc | Selective response rendering for virtual assistants |
US11289086B2 (en) | 2019-11-01 | 2022-03-29 | Microsoft Technology Licensing, Llc | Selective response rendering for virtual assistants |
Also Published As
Publication number | Publication date |
---|---|
EP2966644A2 (en) | 2016-01-13 |
US9569174B2 (en) | 2017-02-14 |
CN105261361A (en) | 2016-01-20 |
EP2966644A3 (en) | 2016-02-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9569174B2 (en) | Methods and systems for managing speech recognition in a multi-speech system environment |
CN105691406B (en) | System and method for the built-in negotiation automation movement of the vehicles | |
MX2015002413A (en) | Disambiguation of dynamic commands. | |
US10065750B2 (en) | Aircraft maintenance systems and methods using wearable device | |
JP2016530660A (en) | Context-sensitive gesture classification | |
EP3451129B1 (en) | System and method of providing clipboard cut and paste operations in an avionics touchscreen system | |
US20150054743A1 (en) | Display method through a head mounted device | |
CN111737670B (en) | Method, system and vehicle-mounted multimedia device for multi-mode data collaborative man-machine interaction | |
US20170287476A1 (en) | Vehicle aware speech recognition systems and methods | |
EP3040917A1 (en) | Speech recognition systems and methods for maintenance repair and overhaul | |
US20180095477A1 (en) | Method for accessing a vehicle-specific electronic device | |
US9588611B2 (en) | System and method for guarding emergency and critical touch targets | |
EP3021265A1 (en) | Context based content display in a wearable device | |
US9594610B2 (en) | Method and system for processing multimodal input signals | |
CN109857326A (en) | A kind of vehicular touch screen and its control method | |
EP2903240A1 (en) | Configurable communication systems and methods for communication | |
CN113835570A (en) | Method, device, apparatus, storage medium, and program for controlling display screen in vehicle | |
US20160086389A1 (en) | Methods and systems for processing speech to assist maintenance operations | |
US20210193133A1 (en) | Information processing device, information processing method, and program | |
US10061388B2 (en) | Method and apparatus for processing user input | |
US10646998B2 (en) | System and method for optimizing resource usage of a robot | |
CN113791841A (en) | Execution instruction determining method, device, equipment and storage medium | |
US20150317973A1 (en) | Systems and methods for coordinating speech recognition | |
US9721356B2 (en) | Methods and systems for programatically identifying shapes in graphical artifacts | |
US9858918B2 (en) | Root cause analysis and recovery systems and methods |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: HONEYWELL INTERNATIONAL INC., NEW JERSEY; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ROGERS, WILLIAM;LETSU-DAKE, EMMANUEL;WHITLOW, STEPHEN;SIGNING DATES FROM 20140624 TO 20140707;REEL/FRAME:033262/0932 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
20210214 | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20210214 |