US20160011853A1 - Methods and systems for managing speech recognition in a multi-speech system environment - Google Patents

Methods and systems for managing speech recognition in a multi-speech system environment

Info

Publication number
US20160011853A1
Authority
US
United States
Prior art keywords
speech
user
data
action
gesture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/325,916
Other versions
US9569174B2
Inventor
William Rogers
Emmanuel Letsu-Dake
Stephen Whitlow
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honeywell International Inc
Original Assignee
Honeywell International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honeywell International Inc
Priority to US 14/325,916 (granted as US9569174B2)
Assigned to Honeywell International Inc.; assignors: Stephen Whitlow, Emmanuel Letsu-Dake, William Rogers
Priority to EP15171540.6A (published as EP2966644A3)
Priority to CN201510392770.XA (published as CN105261361A)
Publication of US20160011853A1
Application granted; publication of US9569174B2
Legal status: Expired - Fee Related (adjusted expiration)

Classifications

    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F3/038 Control and interface arrangements for pointing devices, e.g. drivers or device-embedded control circuitry
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/24 Speech recognition using non-acoustical features
    • G06F2203/0381 Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
    • G10L2015/223 Execution procedure of a spoken command
    • G10L2015/226 Procedures used during a speech recognition process using non-speech characteristics
    • G10L2015/227 Procedures used during a speech recognition process using non-speech characteristics of the speaker; human-factor methodology

Abstract

Methods and systems are provided for managing speech processing in an environment having at least two speech enabled systems. In one embodiment, a method includes: recording first user data that indicates an action of a user; determining, by a processor, a selection of a first speech enabled system based on the recorded user data; and generating, by the processor, a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.

Description

    TECHNICAL FIELD
  • The present disclosure generally relates to methods and systems for managing speech recognition, and more particularly relates to methods and systems for managing speech recognition in an environment having multiple speech systems.
  • BACKGROUND
  • Speech recognition may be used for interaction with multiple systems in an aircraft. In some cases, the speech recognition capability of each system may be distinct. When operating an aircraft with multiple distinct speech systems, it can be difficult for a flight crew member to designate, and keep track of, which system he or she is interacting with.
  • In addition, the independent speech systems could have separate but overlapping vocabularies, which could result in unintended inputs or control actions if the flight crew believes they are commanding one system through speech but are in fact commanding another. Conversely, if each system were developed such that there is little overlap in vocabularies, reduced recognition rates or non-responsive systems may result, which may be frustrating or confusing to crew members. These and other problems may exist in other environments having multiple independent speech systems.
  • Hence, there is a need for systems and methods for managing speech inputs in a multi-speech system environment. Other desirable features and characteristics will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the foregoing technical field and background.
  • BRIEF SUMMARY
  • Methods and systems are provided for managing speech processing in an environment having at least two speech enabled systems. In one embodiment, a method includes: recording first user data that indicates an action of a user; determining, by a processor, a selection of a first speech enabled system based on the recorded user data; and generating, by the processor, a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.
  • In another embodiment, a system includes: an input device that records first user data that indicates an action of a user; and a processor. The processor determines a selection of a speech enabled system based on the recorded user data, and generates a signal to at least one of activate and deactivate speech processing based on the speech enabled system.
  • Furthermore, other desirable features and characteristics of the method and system will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the preceding background.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention will hereinafter be described in conjunction with the following figures, wherein like numerals denote like elements, and wherein:
  • FIG. 1 is a functional block diagram illustrating a speech management system for an aircraft in accordance with exemplary embodiments;
  • FIG. 2 is dataflow diagram illustrating modules of the speech management system in accordance with exemplary embodiments; and
  • FIG. 3 is a flowchart illustrating a speech management method that may be performed by the speech management system in accordance with exemplary embodiments.
  • DETAILED DESCRIPTION
  • The following detailed description is merely exemplary in nature and is not intended to limit the disclosure or the application and uses of the disclosure. As used herein, the word “exemplary” means “serving as an example, instance, or illustration.” Thus, any embodiment described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments. All of the embodiments described herein are exemplary embodiments provided to enable persons skilled in the art to make or use the invention and not to limit the scope of the invention which is defined by the claims. Furthermore, there is no intention to be bound by any expressed or implied theory presented in the preceding technical field, background, brief summary, or the following detailed description.
  • In accordance with various embodiments, speech management systems are disclosed for managing speech input from a user in an environment having multiple speech enabled systems. The speech management system generally allows for the selection of a particular speech enabled system through one or more input modalities (e.g., speech, gesture, gaze, etc.). For example, a user may simply say, point to, or look at the speech enabled system that he or she wants to control through speech, and the speech management system recognizes the intent of the user and activates speech recognition for that system. Since the suggested input modalities do not require the user to touch the intended system or physically activate the system, the user is free to perform other tasks. In addition, the input modalities allow a user to interact with speech enabled systems that may be outside his or her reach-envelope. Based upon the selection, the speech management system generates signals to the selected speech enabled system and/or the non-selected speech enabled systems. The signals activate speech recognition by the selected speech enabled system and/or deactivate speech recognition by the non-selected speech enabled systems.
  • Referring now to FIG. 1, exemplary embodiments of the present disclosure are directed to a speech management system shown generally at 10 that is associated with an aircraft 12. As can be appreciated, the speech management system 10 described herein can be implemented in any aircraft 12 (vehicle or other environment) having onboard a computing device 14 that is associated with two or more speech enabled systems 16 a-16 n.
  • In various embodiments, the speech enabled systems 16 a-16 n each include a speech system that is configured to receive and process speech input from a crew member or other user. In various other embodiments, the speech enabled systems 16 a-16 n receive inputs from a central speech processor (not shown) that performs speech processing for each of the speech enabled systems. As can be appreciated, the computing device 14 may be implemented as a part of one of the speech enabled systems 16 a and may communicate with the other speech enabled systems 16 b-16 n, may be a stand-alone system that communicates with each of the speech enabled systems 16 a-16 n (as shown), or may be partially part of one or more of the speech enabled systems 16 a-16 n, and partially part of a stand-alone system.
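  • The two processing arrangements described above (a recognizer in each speech enabled system versus a central speech processor serving all of them) can be captured in a small registry of the systems visible to the speech manager. The following Python sketch is illustrative only; the class and field names (SpeechEnabledSystem, has_local_recognizer, bearing_deg, and so on) are assumptions introduced here and do not appear in the patent.

```python
from dataclasses import dataclass, field

@dataclass
class SpeechEnabledSystem:
    """One speech enabled system 16a-16n as seen by the speech management system 30."""
    system_id: str                     # e.g. "radio" or "fms" (hypothetical identifiers)
    spoken_names: list[str]            # names/aliases a crew member might speak
    bearing_deg: float                 # direction of the system from the crew seat, for gaze/gesture selection
    has_local_recognizer: bool = True  # False when a central speech processor serves this system
    vocabulary: str | None = None      # vocabulary/profile used by a central speech processor

@dataclass
class SpeechSystemRegistry:
    """Registry of the speech enabled systems onboard."""
    systems: dict[str, SpeechEnabledSystem] = field(default_factory=dict)

    def add(self, system: SpeechEnabledSystem) -> None:
        self.systems[system.system_id] = system

    def others(self, selected_id: str) -> list[SpeechEnabledSystem]:
        """Return the non-selected systems, which may need deactivation signals."""
        return [s for s in self.systems.values() if s.system_id != selected_id]
```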
  • The computing device 14 may be associated with a display device 18 and one or more input devices 20 a-20 n and may generally include a memory 22, one or more processors 24, and one or more input/output controllers 26 that are communicatively coupled to the display device 18 and the one or more input devices 20 a-20 n. The input devices 20 a-20 n include, for example, an activation switch 20 a, an audio recording device 20 b, and/or one or more video recording devices 20 n.
  • In various embodiments, the memory 22 stores instructions that can be performed by the processor 24. The instructions stored in memory 22 may include one or more separate programs, each of which comprises an ordered listing of executable instructions for implementing logical functions. In the example of FIG. 1, the instructions stored in the memory include an operating system (OS) 28 and a speech management system 30.
  • The operating system 28 controls the performance of other computer programs and provides scheduling, input-output control, file and data management, memory management, and communication control and related services. When the computing device 14 is in operation, the processor 24 is configured to execute the instructions stored within the memory 22, to communicate data to and from the memory 22, and to generally control operations of the computing device 14 pursuant to the instructions. The processor 24 can be any custom made or commercially available processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computing device 14, a semiconductor based microprocessor (in the form of a microchip or chip set), a macroprocessor, or generally any device for executing instructions.
  • The processor 24 executes the instructions of the speech management system 30 of the present disclosure. The speech management system 30 generally allows for the selection of a particular speech enabled system 16 a-16 n by a user through one or more input modalities (e.g., speech, gesture, gaze, etc.). The speech management system 30 recognizes the selection and activates the corresponding speech enabled system 16 a-16 n based on the selection.
  • In various embodiments, the speech management system 30 continuously monitors data of one or more of the input modalities for the user-initiated selection, and/or the speech management system 30 monitors data of one or more of the input modalities only after being activated via the activation switch 20 a or other input device. For example, in the case of using speech for the identification of the selection, the speech management system 30 receives an activation signal from the activation switch 20 a and, in response, activates the audio recording device 20 b for recording a command spoken by a user. The command may include a first name or other name designating a selected speech enabled system 16 a-16 n. The speech management system 30 processes the recorded audio data to determine the selected speech enabled system 16 a. Once the speech enabled system 16 a has been selected, the speech management system 30 activates speech recognition for the selected speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a-16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all the speech enabled systems 16 a-16 n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b-16 n.
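  • As a rough illustration of the speech-based selection described above, the sketch below matches a recognized utterance against the spoken names registered for each system; the simple substring test and the helper name are assumptions for illustration, not the patent's recognizer.

```python
def select_system_by_speech(utterance: str,
                            registry: SpeechSystemRegistry) -> str | None:
    """Return the system_id named in the recognized utterance, or None if no name matches."""
    text = utterance.strip().lower()
    for system in registry.systems.values():
        if any(name.lower() in text for name in system.spoken_names):
            return system.system_id
    return None  # caller may display a "command not recognized" message
```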
  • In another example, in the case of using gesture for the identification of the selection, the speech management system 30 receives an activation signal from the activation switch 20 a and, in response, activates the video recording device 20 n (or other device) for recording a gesture performed by a user. The gesture may include any gesture made by a finger, hand, or arm, such as pointing for a minimum amount of time, or using a finger movement (e.g., a twirl) to indicate a direction of a selected speech enabled system 16 a. The speech management system 30 processes the recorded video data to determine the selected speech enabled system 16 a. Once the speech enabled system 16 a has been selected, the speech management system 30 activates speech recognition for the speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a-16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16 a-16 n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b-16 n.
  • In still another example, in the case of using gaze for the identification of the selection, the speech management system 30 receives an activation signal from the activation switch and, in response, activates the video recording device 20 n (or other device) for recording a gaze of the user. The gaze of the user's eyes may indicate a direction of a selected speech enabled system 16 a. The speech management system 30 processes the recorded video data to determine the selected speech enabled system 16 a. Once a speech enabled system 16 a has been selected, the speech management system 30 activates speech recognition for the speech enabled system 16 a by sending an activation signal to the speech enabled system 16 a (e.g., when each speech enabled system 16 a-16 n performs speech processing) or by selecting a vocabulary and/or speech processing methods associated with the speech enabled system 16 a (e.g., when a centralized speech processor performs the speech processing for all speech enabled systems 16 a-16 n). Additionally or alternatively, the speech management system 30 deactivates speech recognition for the non-selected speech enabled systems by sending deactivation signals to the speech enabled systems 16 b-16 n.
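  • For both the gesture and gaze examples above, the recorded video data ultimately reduces to an indicated direction that must be mapped to the nearest speech enabled system. A minimal sketch, assuming each system's bearing from the crew seat is registered and that a tolerance bounds how far off the indicated direction may be:

```python
def select_system_by_direction(direction_deg: float,
                               registry: SpeechSystemRegistry,
                               tolerance_deg: float = 15.0) -> str | None:
    """Map a gaze or pointing direction (in degrees) to the closest registered system."""
    def angular_distance(a: float, b: float) -> float:
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    best_id, best_err = None, tolerance_deg
    for system in registry.systems.values():
        err = angular_distance(direction_deg, system.bearing_deg)
        if err <= best_err:
            best_id, best_err = system.system_id, err
    return best_id  # None when no system lies within the tolerance
```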
  • Referring now to FIG. 2 and with continued reference to FIG. 1, a dataflow diagram illustrates various embodiments of the speech management system 30. Various embodiments of speech management systems 30 according to the present disclosure may include any number of sub-modules embedded within the speech management system 30. As can be appreciated, the sub-modules shown in FIG. 2 may be combined and/or further partitioned to manage speech input to the speech management system 30. The inputs to the speech management system 30 may be received from other modules (not shown), determined/modeled by other sub-modules (not shown) within the speech management system 30, and/or may be user input that is based on a user interacting with a user interface via an input device 20 a-20 n. In various embodiments, the speech management system 30 includes a system activation module 31, at least one of a speech processing module 32, a gaze processing module 34, and a gesture processing module 36 (or any other processing modules depending on the number of input modalities), and a speech system activation/deactivation module 38.
  • The system activation module 31 receives as input user input data 40. The user input data 40 may be received based on a user interacting with an input device, such as, for example, the activation switch 20 a or other device. The system activation module 31 processes the user input data 40 to determine if the user input data indicates a user's request to activate the selection of a speech enabled system 16 a-16 n. If the user input data 40 does not indicate to activate speech system selection, the system activation module 31 may optionally generate display data 42 that includes a message, displayable in an interface, indicating that the input is not recognized. If the user input data 40 indicates to activate the speech system selection, the system activation module 31 sets an activation flag 44 to TRUE (or other value indicating to activate the speech system selection).
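  • The system activation module 31 can thus be thought of as a simple gate on the user input data 40: it either raises the activation flag 44 or returns a message for display data 42. The input encoding below (a token delivered by the activation switch) is an assumption made only for illustration.

```python
def process_activation_input(user_input: str) -> tuple[bool, str | None]:
    """Return (activation flag, optional display message).

    Hypothetical encoding: the activation switch delivers the token
    "activate_selection"; any other input is treated as unrecognized.
    """
    if user_input == "activate_selection":
        return True, None
    return False, "Input not recognized"
```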
  • The speech processing module 32, for example, receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the speech processing module 32 sends a signal 46 to the recording device 20 b to activate audio recording. In return, the speech processing module 32 receives the recorded speech data 48. The speech processing module 32 processes the recorded speech data 48 to determine a spoken command. The processing can be performed based on a set of recognized commands that identify speech enabled systems 16 a-16 n of the aircraft 12 and speech processing techniques known in the art. If the speech processing module 32 is unable to recognize a spoken command from the recorded speech data 48, optionally, the speech processing module 32 generates display data 50 that includes a message that, when displayed, indicates that the command was not recognized. If a spoken command was recognized, the speech processing module 32 determines a particular speech enabled system 16 a of the speech enabled systems 16 a-16 n on the aircraft 12 and sets a selected speech system 52 to the particular speech enabled system.
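  • The dataflow of the speech processing module 32 (activation flag in, recorder signal out, recorded speech data in, selected system or display message out) can be sketched as follows. The recorder and recognizer parameters are placeholders standing in for the audio recording device 20 b and whatever speech processing technique is used; they are not real APIs.

```python
def run_speech_processing_module(activation_flag: bool,
                                 recorder,      # placeholder for audio recording device 20b
                                 recognizer,    # placeholder speech-to-text engine
                                 registry: SpeechSystemRegistry) -> tuple[str | None, str | None]:
    """Return (selected system_id or None, optional display message)."""
    if not activation_flag:
        return None, None
    audio = recorder.record()                 # signal 46 -> recorded speech data 48
    utterance = recognizer.transcribe(audio)  # spoken-command recognition
    selected = select_system_by_speech(utterance, registry)
    if selected is None:
        return None, "Command not recognized"  # display data 50
    return selected, None
```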
  • The gaze processing module 34 receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gaze processing module 34 sends a signal 54 to the recording device 20 n to activate video recording. In return, the gaze processing module 34 receives recorded gaze data 56. The gaze processing module 34 processes the recorded gaze data 56 to determine a gaze direction. The processing can be performed based on gaze recognition techniques known in the art. If the gaze processing module 34 is unable to recognize a gaze direction from the recorded gaze data 56, optionally, the gaze processing module 34 generates display data 58 that includes a message that, when displayed, indicates that the gaze direction was not identified. If a gaze direction was recognized, the gaze processing module 34 determines a particular speech enabled system 16 a of the speech enabled systems 16 a-16 n on the aircraft 12 and sets a selected speech enabled system 60 to the particular speech enabled system.
  • The gesture processing module 36 receives as input the activation flag 44. When the activation flag 44 is equal to TRUE (or other value indicating to activate the speech system selection), the gesture processing module 36 sends a signal 62 to the recording device 20 n to activate video recording. In return, the gesture processing module 36 receives recorded gesture data 64. The gesture processing module 36 processes the recorded gesture data 64 to determine a gesture direction. The processing can be performed based on gesture recognition techniques known in the art. If the gesture processing module 36 is unable to recognize a gesture direction from the recorded gesture data 64, optionally, the gesture processing module 36 generates display data 66 that includes a message that, when displayed, indicates that the gesture direction was not identified. If a gesture direction was recognized, the gesture processing module 36 determines a particular speech enabled system 16 a of the speech enabled systems 16 a-16 n on the aircraft 12 and sets a selected speech system 68 to the particular speech enabled system.
  • The speech system activation/deactivation module 38 receives as input the selected speech system 68 from the gesture processing module 36, the selected speech system 60 from the gaze processing module 34, and/or the selected speech system 52 from the speech processing module 32. The speech system activation/deactivation module 38 generates an activation/deactivation signal 70 based on the received selected speech system 52, 60, 68. The activation/deactivation signals 70 are received by the speech enabled systems to activate and/or deactivate speech processing by those systems, or alternatively, the activation/deactivation signals 70 are used to activate and/or deactivate speech processing by a centralized speech processor using a particular vocabulary and/or particular speech processing techniques.
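  • The two concrete forms of the activation/deactivation signal 70 described above (per-system signals versus vocabulary selection at a centralized speech processor) might be combined as in the sketch below; send_signal and central_processor are assumed interfaces standing in for the avionics connections and are not part of the patent.

```python
def apply_selection(selected_id: str,
                    registry: SpeechSystemRegistry,
                    send_signal,               # callable(system_id, activate: bool); assumed interface
                    central_processor=None) -> None:
    """Activate speech recognition for the selected system and deactivate the rest."""
    selected = registry.systems[selected_id]
    if central_processor is not None and not selected.has_local_recognizer:
        # Centralized case: select the vocabulary / processing profile of the chosen system,
        # which implicitly stops recognition on behalf of the other systems.
        central_processor.load_vocabulary(selected.vocabulary)
        return
    send_signal(selected_id, True)             # activation signal to the selected system
    for other in registry.others(selected_id):
        send_signal(other.system_id, False)    # deactivation signals to the non-selected systems
```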
When selected speech systems 52, 60, 68 are received from two or more of the gesture processing module 36, the gaze processing module 34, and the speech processing module 32, the speech system activation/deactivation module 38 determines the appropriate speech enabled system 16 a-16 n for which to generate the activation/deactivation signals based on an arbitration method. For example, if two or more of the selected speech systems 52, 60, 68 are the same, then the activation/deactivation signals 70 are generated based on that same selected speech system. If two or more of the selected speech systems 52, 60, 68 are different, then the selected speech system associated with the processing technique having the highest priority is selected. Alternatively, the speech system activation/deactivation module 38 may generate display data 72 that includes a message that indicates the different selected speech systems 52, 60, 68 and a request to pick one of the different selected speech systems 52, 60, 68. In return, user input data 74 may be received indicating a selected one of the different selected speech systems 52, 60, 68, and the speech system activation/deactivation module 38 generates the activation/deactivation signals 70 based on the selected one.
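  • One way to realize the arbitration described above is sketched below: the function agrees when two or more modalities name the same system, otherwise falls back to a fixed modality priority and flags the result so the crew can be asked to confirm. The priority order and the confirmation flag blend the two alternatives in the paragraph above and are assumptions, not the patent's required behavior.

```python
from collections import Counter

# Hypothetical priority: speech outranks gaze, which outranks gesture.
MODALITY_PRIORITY = ["speech", "gaze", "gesture"]

def arbitrate(candidates: dict[str, str | None]) -> tuple[str | None, bool]:
    """candidates maps modality -> selected system_id (or None).

    Returns (system_id or None, needs_user_confirmation).
    """
    named = {m: s for m, s in candidates.items() if s is not None}
    if not named:
        return None, False
    counts = Counter(named.values())
    system_id, votes = counts.most_common(1)[0]
    if votes >= 2 or len(named) == 1:
        return system_id, False            # agreement, or only one modality produced a selection
    # Modalities disagree: prefer the highest-priority modality and request confirmation.
    for modality in MODALITY_PRIORITY:
        if modality in named:
            return named[modality], True
    return None, True
```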
  • Referring now to FIG. 3 and with continued reference to FIGS. 1 and 2, a flowchart illustrates a method that may be performed by the speech management system 30 in accordance with the present disclosure. As can be appreciated in light of the disclosure, the order of operation within the method is not limited to the sequential execution as illustrated in FIG. 3, but may be performed in one or more varying orders as applicable and in accordance with the present disclosure.
  • In various embodiments, the method can be scheduled to run based on predetermined events, and/or can run continually during operation of the computing device 14 of the aircraft 12.
The method may begin at 100. It is determined whether user input data 40 is received at 110. The user input data 40 may be received based on a user selecting an input device, such as, for example, the activation switch 20 a or other device. If user input data 40 is not received at 110, the method continues with monitoring for user input data 40 at 110. If, however, the user input data 40 is received at 110, the user input data 40 is processed at 120 and evaluated at 130. If the user input data 40 does not indicate to activate speech recognition at 130, optionally, a message indicating that the input is not recognized may be displayed at 140, and the method continues with monitoring for user input data 40 at 110.
If, however, the user input data 40 does indicate to activate speech recognition at 130, the input device 20 b, 20 n is activated at 150 to start recording of the speech, the gesture, and/or the gaze of the user. If the recorded input is speech input at 160, the recorded speech data 48 is processed at 170 based on speech recognition methods to determine the speech command. The selected speech system 52 is determined from the speech command at 180. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a-16 n at 190 based on the selected speech system 52. Thereafter, the method may end at 200.
If, however, the recorded data is not speech data at 160 but rather is gaze data 56 at 210, the recorded gaze data 56 is processed at 220 based on gaze recognition methods to determine the direction of gaze of the user. The selected speech system 60 is determined from the direction of gaze of the user at 230. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a-16 n at 190 based on the selected speech system 60. Thereafter, the method may end at 200.
If, however, the recorded data is not speech data at 160 and is not gaze data at 210, but rather is gesture data 64 at 240, the recorded gesture data 64 is processed at 250 based on gesture recognition methods to determine the direction of the gesture of the user. The selected speech system 68 is determined from the direction of the gesture of the user at 260. The activation/deactivation signals 70 are generated and communicated to the appropriate speech system 16 a-16 n at 190 based on the selected speech system 68. Thereafter, the method may end at 200.
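  • Putting the pieces together, one pass through the flowchart of FIG. 3 might look like the sketch below, which reuses the hypothetical helpers introduced earlier (process_activation_input, select_system_by_speech, select_system_by_direction, apply_selection) and assumes the modality and its recorded value have already been captured by the recording devices; it illustrates the flow and is not the patent's implementation.

```python
def handle_selection_event(user_input: str,
                           modality: str,        # "speech", "gaze", or "gesture"
                           recorded_value,       # utterance text, or an indicated direction in degrees
                           registry: SpeechSystemRegistry,
                           send_signal) -> str | None:
    """Return an optional display message; None means a system was selected and signaled."""
    activation_flag, message = process_activation_input(user_input)      # steps 110-130
    if not activation_flag:
        return message                                                   # step 140
    if modality == "speech":                                             # steps 160-180
        selected = select_system_by_speech(recorded_value, registry)
    else:                                                                # steps 210-260 (gaze or gesture)
        selected = select_system_by_direction(float(recorded_value), registry)
    if selected is None:
        return "Selection not recognized"
    apply_selection(selected, registry, send_signal)                     # step 190
    return None
```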
  • The method shown in FIG. 3 illustrates processing one of speech data, gaze data, and gesture data to determine the selected speech system. As can be appreciated, two or more of the speech data, the gaze data, and the gesture data can be processed to determine the selected speech system. For example, if two or more inputs indicate the same speech system, then that speech system is the selected speech system. In another example, if one input indicates a first speech system and another input indicates a second speech system, then a message may be displayed indicating the discrepancy.
Those of skill in the art will appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. Some of the embodiments and implementations are described above in terms of functional and/or logical block components (or modules) and various processing steps. However, it should be appreciated that such block components (or modules) may be realized by any number of hardware, software, and/or firmware components configured to perform the specified functions. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. For example, an embodiment of a system or a component may employ various integrated circuit components, e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices. In addition, those skilled in the art will appreciate that embodiments described herein are merely exemplary implementations.
  • The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
  • In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Numerical ordinals such as “first,” “second,” “third,” etc. simply denote different singles of a plurality and do not imply any order or sequence unless specifically defined by the claim language. The sequence of the text in any of the claims does not imply that process steps must be performed in a temporal or logical order according to such sequence unless it is specifically defined by the language of the claim. The process steps may be interchanged in any order without departing from the scope of the invention as long as such an interchange does not contradict the claim language and is not logically nonsensical.
  • While at least one exemplary embodiment has been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention, it being understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.
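  • The selection flow described above can be summarized in the following illustrative Python sketch. The patent publishes no source code, so every identifier below (SpeechSystemSelector, recognize_command, resolve_gaze_target, resolve_gesture_target, and the example system names) is a hypothetical stand-in for the speech, gaze, and gesture recognition methods and the activation/deactivation signals 70 described in the embodiments; it is a minimal sketch of one way the described logic could be arranged, not the claimed implementation.

# Illustrative sketch only -- not taken from the patent. All names are hypothetical.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List, Optional


class Modality(Enum):
    SPEECH = auto()
    GAZE = auto()
    GESTURE = auto()


@dataclass
class RecordedInput:
    modality: Modality
    data: bytes  # raw recording captured after recording is activated


class SpeechSystemSelector:
    """Maps recorded speech, gaze, or gesture data to a selected speech
    system and builds the corresponding activation/deactivation signals."""

    def __init__(
        self,
        recognize_command: Callable[[bytes], str],       # speech recognition method
        resolve_gaze_target: Callable[[bytes], str],     # gaze recognition method
        resolve_gesture_target: Callable[[bytes], str],  # gesture recognition method
        command_to_system: Dict[str, str],               # spoken command -> system id
    ) -> None:
        self._recognize_command = recognize_command
        self._resolve_gaze_target = resolve_gaze_target
        self._resolve_gesture_target = resolve_gesture_target
        self._command_to_system = command_to_system

    def select_system(self, recording: RecordedInput) -> str:
        """Process a single recording and return the selected speech system id."""
        if recording.modality is Modality.SPEECH:
            # Assumes the recognized command maps to a known speech system.
            command = self._recognize_command(recording.data)
            return self._command_to_system[command]
        if recording.modality is Modality.GAZE:
            return self._resolve_gaze_target(recording.data)
        return self._resolve_gesture_target(recording.data)

    def select_from_inputs(self, recordings: List[RecordedInput]) -> Optional[str]:
        """Combine one or more modalities: agreement yields the selected
        system; disagreement is reported as a discrepancy and yields None."""
        candidates = {self.select_system(r) for r in recordings}
        if len(candidates) == 1:
            return candidates.pop()
        print("Discrepancy between inputs:", sorted(candidates))
        return None

    def build_signals(self, selected: str, all_systems: List[str]) -> Dict[str, str]:
        """Activate speech processing for the selected system, deactivate the others."""
        return {
            system: "activate" if system == selected else "deactivate"
            for system in all_systems
        }

  • For example, wiring recognize_command to an onboard recognizer with command_to_system={"radio": "radio_system", "display": "display_system"} (hypothetical names) would cause agreeing speech and gaze inputs that both resolve to "radio_system" to activate that system's speech processing and deactivate the others, while conflicting inputs would only surface the discrepancy message rather than select a system.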

Claims (20)

What is claimed is:
1. A method of managing speech processing in an environment having at least two speech enabled systems, comprising:
recording first user data that indicates an action of a user;
determining, by a processor, a selection of a first speech enabled system based on the recorded user data; and
generating, by the processor, a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.
2. The method of claim 1, wherein the action of the user includes a gesture of the user.
3. The method of claim 1, wherein the action of the user includes a gaze of the user.
4. The method of claim 1, wherein the action of the user includes a spoken command from the user.
5. The method of claim 1, wherein the signal activates speech processing by the first speech enabled system.
6. The method of claim 1, wherein the signal activates speech processing by a centralized speech processor using at least one of a vocabulary and a speech processing technique associated with the first speech enabled system.
7. The method of claim 1, further comprising recording second user data that indicates a second action of the user, and wherein the determining the selection of the first speech enabled system is based on the first recorded user data and the second recorded user data.
8. The method of claim 7, wherein the action of the user indicates at least one of a gesture of the user, a gaze of the user, and a spoken command from the user, and wherein the second action of the user indicates at least one of a gesture of the user, a gaze of the user, and a spoken command from the user.
9. The method of claim 1, further comprising receiving user input data indicating to activate recording, and wherein the recording is performed based on the user input data.
10. The method of claim 1, wherein the recording is performed continuously.
11. A system managing speech processing in an environment having at least two speech enabled systems, comprising:
an input device that records first user data that indicates an action of a user; and
a processor that determines a selection of a first speech enabled system based on the recorded user data, and that generates a signal to at least one of activate and deactivate speech processing based on the first speech enabled system.
12. The system of claim 11, wherein the action of the user includes a gesture of the user.
13. The system of claim 11, wherein the action of the user includes a gaze of the user.
14. The system of claim 11, wherein the action of the user includes a spoken command from the user.
15. The system of claim 11, wherein the signal activates speech processing by the speech enabled system.
16. The system of claim 11, wherein the signal activates speech processing by a centralized speech processor using at least one of a vocabulary and a speech processing technique associated with the speech enabled system.
17. The system of claim 11, further comprising a second input device that records second user data that indicates a second action of the user, and wherein the processor determines the selection of the speech enabled system based on the recorded user data and the recorded second user data.
18. The system of claim 17, wherein the action of the user indicates at least one of a gesture of the user, a gaze of the user, and a spoken command from the user, and wherein the second action of the user indicates at least one of a gesture of the user, a gaze of the user, and a spoken command from the user.
19. The system of claim 11, further comprising a third input device that generates user input data indicating to activate recording, and wherein the first input device records the user data based on the user input data.
20. The system of claim 11, wherein the input device records the user data continuously.
US14/325,916 2014-07-08 2014-07-08 Methods and systems for managing speech recognition in a multi-speech system environment Expired - Fee Related US9569174B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US14/325,916 US9569174B2 (en) 2014-07-08 2014-07-08 Methods and systems for managing speech recognition in a multi-speech system environment
EP15171540.6A EP2966644A3 (en) 2014-07-08 2015-06-10 Methods and systems for managing speech recognition in a multi-speech system environment
CN201510392770.XA CN105261361A (en) 2014-07-08 2015-07-07 Methods and systems for managing speech recognition in a multi-speech system environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US14/325,916 US9569174B2 (en) 2014-07-08 2014-07-08 Methods and systems for managing speech recognition in a multi-speech system environment

Publications (2)

Publication Number Publication Date
US20160011853A1 (en) 2016-01-14
US9569174B2 US9569174B2 (en) 2017-02-14

Family

ID=53373350

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/325,916 Expired - Fee Related US9569174B2 (en) 2014-07-08 2014-07-08 Methods and systems for managing speech recognition in a multi-speech system environment

Country Status (3)

Country Link
US (1) US9569174B2 (en)
EP (1) EP2966644A3 (en)
CN (1) CN105261361A (en)

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2696258B1 (en) 1992-09-25 1994-10-28 Sextant Avionique Device for managing a human-machine interaction system.
US6154723A (en) 1996-12-06 2000-11-28 The Board Of Trustees Of The University Of Illinois Virtual reality 3D interface system for data creation, viewing and editing
US6219645B1 (en) * 1999-12-02 2001-04-17 Lucent Technologies, Inc. Enhanced automatic speech recognition using multiple directional microphones
EP1215658A3 (en) 2000-12-05 2002-08-14 Hewlett-Packard Company Visual activation of voice controlled apparatus
WO2007017796A2 (en) * 2005-08-11 2007-02-15 Philips Intellectual Property & Standards Gmbh Method for introducing interaction pattern and application functionalities
US8964298B2 (en) 2010-02-28 2015-02-24 Microsoft Corporation Video display modification based on sensor input for a see-through near-to-eye display
CN103092557A (en) * 2011-11-01 2013-05-08 上海博泰悦臻网络技术服务有限公司 Vehicular speech input device and method
CN102614057B (en) * 2012-04-11 2013-08-28 合肥工业大学 Multifunctional electric nursing sickbed with intelligent residential environment
US9423870B2 (en) 2012-05-08 2016-08-23 Google Inc. Input determination method
CN102945672B (en) * 2012-09-29 2013-10-16 深圳市国华识别科技开发有限公司 Voice control system for multimedia equipment, and voice control method

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0702355A2 (en) * 1994-09-14 1996-03-20 Canon Kabushiki Kaisha Speech recognition method and apparatus
US6157403A (en) * 1996-08-05 2000-12-05 Kabushiki Kaisha Toshiba Apparatus for detecting position of object capable of simultaneously detecting plural objects and detection method therefor
US6868383B1 (en) * 2001-07-12 2005-03-15 At&T Corp. Systems and methods for extracting meaning from multimodal inputs using finite-state devices
US7295975B1 (en) * 2001-07-12 2007-11-13 At&T Corp. Systems and methods for extracting meaning from multimodal inputs using finite-state devices
US20050175218A1 (en) * 2003-11-14 2005-08-11 Roel Vertegaal Method and apparatus for calibration-free eye tracking using multiple glints or surface reflections
US20060110008A1 (en) * 2003-11-14 2006-05-25 Roel Vertegaal Method and apparatus for calibration-free eye tracking
US20070016426A1 (en) * 2005-06-28 2007-01-18 Microsoft Corporation Audio-visual control system
US20070081090A1 (en) * 2005-09-27 2007-04-12 Mona Singh Method and system for associating user comments to a scene captured by a digital imaging device
US20070288242A1 (en) * 2006-06-12 2007-12-13 Lockheed Martin Corporation Speech recognition and control system, program product, and related methods
US7774202B2 (en) * 2006-06-12 2010-08-10 Lockheed Martin Corporation Speech activated control system and related methods
US20110125503A1 (en) * 2009-11-24 2011-05-26 Honeywell International Inc. Methods and systems for utilizing voice commands onboard an aircraft
US8515763B2 (en) * 2009-11-24 2013-08-20 Honeywell International Inc. Methods and systems for utilizing voice commands onboard an aircraft
US20130210406A1 (en) * 2012-02-12 2013-08-15 Joel Vidal Phone that prevents texting while driving
US8538402B2 (en) * 2012-02-12 2013-09-17 Joel Vidal Phone that prevents texting while driving
US20130281079A1 (en) * 2012-02-12 2013-10-24 Joel Vidal Phone that prevents concurrent texting and driving
US8914014B2 (en) * 2012-02-12 2014-12-16 Joel Vidal Phone that prevents concurrent texting and driving

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9653075B1 (en) * 2015-11-06 2017-05-16 Google Inc. Voice commands across devices
US20170249940A1 (en) * 2015-11-06 2017-08-31 Google Inc. Voice commands across devices
US10714083B2 (en) * 2015-11-06 2020-07-14 Google Llc Voice commands across devices
US11749266B2 (en) 2015-11-06 2023-09-05 Google Llc Voice commands across devices
US9824689B1 (en) * 2015-12-07 2017-11-21 Rockwell Collins Inc. Speech recognition for avionic systems
US10890969B2 (en) * 2018-05-04 2021-01-12 Google Llc Invoking automated assistant function(s) based on detected gesture and gaze
US11493992B2 (en) 2018-05-04 2022-11-08 Google Llc Invoking automated assistant function(s) based on detected gesture and gaze
US11614794B2 (en) 2018-05-04 2023-03-28 Google Llc Adapting automated assistant based on detected mouth movement and/or gaze
US11688417B2 (en) * 2018-05-04 2023-06-27 Google Llc Hot-word free adaptation of automated assistant function(s)
US11289081B2 (en) * 2018-11-08 2022-03-29 Sharp Kabushiki Kaisha Refrigerator
WO2021086600A1 (en) * 2019-11-01 2021-05-06 Microsoft Technology Licensing, Llc Selective response rendering for virtual assistants
US11289086B2 (en) 2019-11-01 2022-03-29 Microsoft Technology Licensing, Llc Selective response rendering for virtual assistants

Also Published As

Publication number Publication date
EP2966644A2 (en) 2016-01-13
US9569174B2 (en) 2017-02-14
CN105261361A (en) 2016-01-20
EP2966644A3 (en) 2016-02-17

Similar Documents

Publication Publication Date Title
US9569174B2 (en) Methods and systems for managing speech recognition in a multi-speech system environment
CN105691406B (en) System and method for the built-in negotiation automation movement of the vehicles
MX2015002413A (en) Disambiguation of dynamic commands.
US10065750B2 (en) Aircraft maintenance systems and methods using wearable device
JP2016530660A (en) Context-sensitive gesture classification
EP3451129B1 (en) System and method of providing clipboard cut and paste operations in an avionics touchscreen system
US20150054743A1 (en) Display method through a head mounted device
CN111737670B (en) Method, system and vehicle-mounted multimedia device for multi-mode data collaborative man-machine interaction
US20170287476A1 (en) Vehicle aware speech recognition systems and methods
EP3040917A1 (en) Speech recognition systems and methods for maintenance repair and overhaul
US20180095477A1 (en) Method for accessing a vehicle-specific electronic device
US9588611B2 (en) System and method for guarding emergency and critical touch targets
EP3021265A1 (en) Context based content display in a wearable device
US9594610B2 (en) Method and system for processing multimodal input signals
CN109857326A (en) A kind of vehicular touch screen and its control method
EP2903240A1 (en) Configurable communication systems and methods for communication
CN113835570A (en) Method, device, apparatus, storage medium, and program for controlling display screen in vehicle
US20160086389A1 (en) Methods and systems for processing speech to assist maintenance operations
US20210193133A1 (en) Information processing device, information processing method, and program
US10061388B2 (en) Method and apparatus for processing user input
US10646998B2 (en) System and method for optimizing resource usage of a robot
CN113791841A (en) Execution instruction determining method, device, equipment and storage medium
US20150317973A1 (en) Systems and methods for coordinating speech recognition
US9721356B2 (en) Methods and systems for programatically identifying shapes in graphical artifacts
US9858918B2 (en) Root cause analysis and recovery systems and methods

Legal Events

Date Code Title Description
AS Assignment

Owner name: HONEYWELL INTERNATIONAL INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ROGERS, WILLIAM;LETSU-DAKE, EMMANUEL;WHITLOW, STEPHEN;SIGNING DATES FROM 20140624 TO 20140707;REEL/FRAME:033262/0932

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20210214