WO1995025326A1 - Voice/pointer operated system - Google Patents

Voice/pointer operated system

Info

Publication number
WO1995025326A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
pen
accordance
signal
voice signal
Application number
PCT/US1995/002921
Other languages
French (fr)
Inventor
Mark P. Fortunato
Neil E. Hickox
Original Assignee
Voice Powered Technology International, Inc.
Application filed by Voice Powered Technology International, Inc. filed Critical Voice Powered Technology International, Inc.
Publication of WO1995025326A1 publication Critical patent/WO1995025326A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/02Digital computers in general; Data processing equipment in general manually operated with input through keyboard and computation using a built-in program, e.g. pocket calculators
    • G06F15/0225User interface arrangements, e.g. keyboard, display; Interfaces to other computer systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0354Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
    • G06F3/03545Pens or stylus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer

Definitions

  • the present invention relates to systems for entering data into a device using voice recognition of spoken words, and more particularly to systems capable of combining voice recognition with other forms of data entry such as by touching.
  • the pen or stick is used to activate buttons and to highlight fields displayed on touch-screen displays.
  • the pen or stick may be used to perform other functions, such as handwriting recognition.
  • the pen or stick provides a means of human interfacing whereby the operator may point to fields, frames or buttons to make selections or otherwise enter data into an electronic device, such as a personal computer, a personal organizer or a personal digital assistant (PDA). Additional data, such as that entered for purposes of recording or storage, is often entered by other interfacing means such as a keyboard or keypad.
  • Still other arrangements for inputting data into an electronic device are known.
  • Such electronic devices include systems in which remote controls are provided with sophisticated voice recognition electronics.
  • the remote controls recognize spoken commands, translate the commands into the traditional remote digital control signals, and transmit the control signals to a controlled device. Examples of such systems are provided by co-pending application Serial No. 07/915,112 of Bissonnette et al., entitled Voice Operated Remote Control Device, by co-pending application Serial No. 07/915,938 of Bissonnette et al., entitled Voice Recognition Apparatus and Method, and by co-pending application Serial No. 07/915,114 of Fischer, entitled Remote
  • a further example of an electronic device in which data is inputted using voice recognition is provided by co-pending application Serial No. 08/113,394 of Fischer et al., entitled Voice Operated Remote Control System, which application was filed August 27, 1993 and is commonly assigned with the present application.
  • the Fischer et al. application describes a voice operated remote control system in which a remote control device responsive to the voice commands of a user transmits representations of the voice commands to a controlled device which then produces voice signals in response to the transmitted representations.
  • the controlled device includes voice recognition circuitry for recognizing the transmitted voice commands and executing action routines denoted thereby.
  • the remote control device receives the voice commands via a microphone and produces a corresponding analog audio signal which is modulated and then transmitted by an infrared transmitter.
  • An audio receiver at the controlled device includes an infrared sensor for receiving the transmitted signal, and additional circuitry for demodulating and processing the transmitted signal into a corresponding voice signal.
  • the voice signal is used to generate an incoming digital voice template for comparison with a plurality of digital reference voice templates. If a substantial equivalent is found, a corresponding action routine is executed to achieve a desired action within the controlled device.
  • Another electronic device utilizing voice recognition for data inputting is described by co-pending application Serial No. 08/134,327 of Bissonnette et al., entitled Voice Activated Personal Organizer, which application was filed October 12, 1993 and is commonly assigned with the present application.
  • The Bissonnette et al. application describes a small, portable, hand-held electronic personal organizer which performs voice recognition on words spoken by a user to input data into the organizer, and records voice messages from the user.
  • the spoken words and the voice messages are input via a microphone, and the voice messages are compressed and then converted into digital signals for storage.
  • the stored digital voice messages are reconverted into analog signals and then expanded for reproduction using a speaker.
  • the organizer is capable of a number of different functions, including voice training, memo record, reminder, manual reminder, time setting, message review, waiting message, calendar, phone group select, number retrieval, add phone number, security, and "no" logic.
  • data is principally entered by voice and occasionally through use of a limited keypad, and voice recordings are made and played back as appropriate.
  • a visual display provides feedback to the user.
  • the user can edit different data within the organizer by eliminating or
  • the present invention provides a voice and pointer operated system in which a pen or stick responsive to voice commands is operative to input data into a device using voice recognition, in addition to data inputted into the device using a pointer or tip region of the pen or stick.
  • the device has a pointer-responsive area thereon as well as voice recognition means for recognizing a voice signal from the pen and producing an action routine denoted thereby.
  • the pointer-responsive area may comprise a touch screen presenting frames for interaction with the pen to select functions for the device, with voice commands entered via a microphone in the pen being used to provide input data to the device.
  • the pen is electrically coupled to the device to provide the voice signal directly to the device.
  • a wire extending between the pen and the device, and which is long enough to permit movement of the tip region of the pen to different areas of the touch screen, couples the microphone directly to the device to provide the voice signals thereto.
  • the wire connection also enables the pen to utilize the power supply of the device.
  • the pen is wireless, thereby eliminating the need for a wire or other direct connection between the pen and the device.
  • Voice signals generated by the microphone of the pen in response to voice commands from the user are modulated before being transmitted by the pen to an audio receiver within the device for conversion into the voice signal.
  • the modulated voice signal may be transmitted by an infrared transmitter, within the pen, to an infrared sensor at the audio receiver within the device.
  • the voice recognition apparatus within the device may assume any appropriate form.
  • such apparatus includes a reference memory for storing a plurality of reference voice templates, a program memory for storing a control program, and a processor for generating an incoming voice template in response to each voice signal which is received.
  • the control program is executed to determine whether the incoming voice template is substantially equivalent to one of the reference voice templates.
  • One of a plurality of action routines is then selected, based on the reference voice template which the incoming voice template matches.
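The matching step described here — comparing an incoming template against stored reference templates and selecting an action routine — can be sketched in a few lines. This is a minimal illustration only: the fixed-length feature vectors, the Euclidean distance metric and the threshold are assumptions made for the sketch, since the patent does not specify the comparison method.

```python
import math

def euclidean_distance(a, b):
    """Distance between two fixed-length feature templates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_template(incoming, references, threshold=0.5):
    """Return the action routine whose reference template is closest
    to the incoming template, provided it is substantially equivalent
    (within the threshold); otherwise return None."""
    best_action, best_dist = None, threshold
    for action, reference in references.items():
        d = euclidean_distance(incoming, reference)
        if d < best_dist:
            best_action, best_dist = action, d
    return best_action

# Hypothetical reference templates, keyed by action routine name.
references = {
    "add_entry": [0.1, 0.9, 0.4],
    "search":    [0.8, 0.2, 0.7],
}

print(match_template([0.12, 0.88, 0.41], references))  # prints: add_entry
```

An incoming template far from every reference yields None, corresponding to no action routine being executed.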
  • the device, which is controlled by the voice and pointer inputs of the pen or stick, may comprise such electronic devices as a computer, a personal organizer or a personal digital assistant (PDA).
  • the device may, for example, comprise a personal digital assistant which is organized to operate as a phone directory for storing names, addresses, phone numbers and other information. In that event, various fields within a display on the phone directory are touched by the pen to select basic functions such as entry of an additional name or entry of digital information for a name already displayed, with the user then speaking into the pen to enter the specific data within such fields.
  • Fig. 1 is a perspective view of a pen and a personal digital assistant, providing an example of a voice and pointer operated system in accordance with the invention;
  • Fig. 2 is a block diagram of a first embodiment of the system of Fig. 1 in which the pen is directly coupled by wire to the personal digital assistant;
  • Fig. 3 is an alternative embodiment of the system of Fig. 1 in which the pen is wireless;
  • Fig. 4 is a block diagram of the electronic portion of the pen of Fig. 3;
  • Fig. 5 is a block diagram of the audio receiver forming a part of the personal digital assistant of Fig. 3;
  • Fig. 6 is a basic flow chart illustrating the operation of the system of Fig. 1; and Fig. 7 is a detailed flow chart illustrating a detailed example of the operation of the system of Fig. 1.
  • FIG. 1 shows a voice and pointer operated system 10 in accordance with the invention.
  • the system 10 of Fig. 1 includes a pen 12 and an electronic device 14 with which the pen 12 is used.
  • the device 14 comprises a personal digital assistant (PDA).
  • the electronic device 14 includes a pointer-responsive area thereon in the form of a touch-screen 16.
  • the pen 12 includes a pointer or tip region 18 thereof. Placement of the pen 12 to engage the touch-screen 16 by the tip region 18 results in the inputting of data into the touch-screen 16 in conventional fashion.
  • the pen 12 is provided with a microphone 20 mounted at an end of the pen 12 opposite the tip region 18.
  • voice signals are provided to the device 14.
  • the user presses a push-to-talk switch 21 on the pen 12 to signal the device 14 that voice commands are being transmitted thereto.
  • the device 14 utilizes voice recognition to input further data represented by the voice signals.
  • the pen 12 may be directly coupled to the device 14 by wire, as described hereafter in connection with Fig. 2.
  • the pen 12 may be wireless and may transmit the voice commands to the device 14, such as by use of infrared techniques, as described hereafter in connection with Fig. 3.
  • the manner in which the voice signals received by the microphone 20 of the pen 12 are conveyed to the device 14 is unimportant. What is important is that the system 10 of Fig. 1, as is true of systems in accordance with the invention, be capable of inputting data into the device 14 using both voice commands and pointing. Moreover, the voice commands and the pointing can be utilized to input data into the device 14 simultaneously.
  • Systems in accordance with the invention, such as the system 10 shown in Fig. 1, provide a greatly improved, and indeed a natural, human interface of the user with the device 14.
  • the pen 12 can be used to point to a desired field or other area of the device 14, and thereby input data into the device 14 while simultaneously responding to the user's voice commands to input additional data into the device 14 in conjunction therewith.
  • This enables the user to point to a desired field or frame of the touch-screen 16 within the device 14 while at the same time making choices and otherwise inputting data in connection with such field or frame using voice commands.
  • Fig. 2 shows an embodiment of the system 10 of Fig. 1 in which the pen 12 is directly coupled by wire to the device 14.
  • the microphone 20 of the pen 12 is coupled by a wire 22 to the device 14.
  • the wire 22, which is long enough to provide the user with flexibility in pointing to different portions of the touch-screen 16, provides the voice signals which are generated by the microphone 20 in response to the voice commands of the user directly to a microcontroller 24 within the device 14.
  • the microcontroller 24 forms a part of a voice recognition circuit 26, and includes an analog-to-digital converter (A/D) 28.
  • the microphone 20 of the pen 12 converts words spoken by the user into analog voice signals, and the A/D converter 28 converts such signals into corresponding digital signals.
  • the microcontroller 24 also includes a microprocessor (MP) 30 coupled to the A/D converter 28, and having access to both a read only memory (ROM) 32 and a random access memory (RAM) 34.
  • the microcontroller 24 is coupled to an external DRAM 36, which forms a part of the voice recognition circuit 26, and which is utilized for voice recognition storage requirements, including primarily the storage of voice templates.
  • a digital signal processor can be used to perform the voice recognition.
  • the microcontroller 24 comprising the voice recognition circuit 26 operates in the same manner as described in the previously referred to co-pending applications, Serial Nos. 07/915,112, 07/915,938 and 07/915,114.
  • the A/D converter 28 of Fig. 2 may comprise an 8-bit converter which samples incoming data at a preassigned frequency such as 9.6 kHz.
  • the A/D converter 28 outputs a digital signal representing the input analog voice signal from the microphone 20.
  • the microprocessor 30 processes the digital voice signal together with a voice recognition software routine forming part of a control program stored in the ROM 32.
  • the digital voice signal is converted into an incoming voice template that is compared against previously stored voice templates of the user's voice stored in the external DRAM 36.
  • the program decodes the voice templates.
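As a rough illustration of how the 8-bit digital voice signal might be reduced to an incoming voice template, the sketch below frames the sample stream and keeps one average-magnitude value per frame. The frame size (96 samples, i.e. 10 ms at 9.6 kHz), the frame count and the feature itself are illustrative assumptions; the patent does not specify the feature extraction.

```python
def make_template(samples, frame_len=96, num_frames=32):
    """Build a crude voice template from 8-bit unsigned A/D samples
    (0-255): the average magnitude of each frame, measured from the
    8-bit midpoint. 96 samples correspond to 10 ms at 9.6 kHz."""
    template = []
    for i in range(num_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if not frame:            # ran out of samples: pad with silence
            template.append(0.0)
            continue
        template.append(sum(abs(s - 128) for s in frame) / len(frame))
    return template

# A silent input (all samples at the 8-bit midpoint) yields a
# template of zeros.
print(make_template([128] * 3072)[:4])  # prints: [0.0, 0.0, 0.0, 0.0]
```

A fixed-length template like this is what would then be compared against the reference templates stored in the external DRAM 36.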
  • the RAM 34 comprises a reference memory for temporary storage of data.
  • the analog voice signal from the microphone 20 of the pen 12 is applied to the A/D converter 28 for conversion into an incoming digital voice signal.
  • the reference memory comprised of the external DRAM 36 in conjunction with the RAM 34, stores a plurality of reference digital voice templates.
  • the ROM 32 stores the control program.
  • the microprocessor 30, which is coupled to the A/D converter 28, the ROM 32 and the RAM 34, generates an incoming digital voice template from the incoming digital voice signal at the output of the A/D converter 28.
  • the microprocessor 30 then executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates stored in the reference memory comprised of the RAM 34 and the external DRAM 36.
  • the microprocessor 30 determines what action to take corresponding to a reference digital voice template, if the incoming digital voice template is found to have substantial similarity to the reference digital voice template. Again, however, it should be understood that this description is by way of example only, and that other voice recognition techniques can be used in accordance with the invention.
  • the electronic device 14 includes a keyboard input 38.
  • the keyboard input 38 is not essential but may be used, as in the present example, for manual entry of additional data into the microcontroller 24.
  • the electronic device 14 also includes an audio circuit 40 coupled through a speaker or speakers 42 to provide audible messages to the user of the device 14.
  • the electronic device 14 of Fig. 2 includes a liquid crystal display (LCD) 44 coupled to the microcontroller 24 through an LCD display circuit 46.
  • the LCD display 44, which provides a visual display to the user, forms part of the touch-screen 16 of the electronic device 14.
  • the touch-screen 16 is responsive to the tip region 18 of the pen 12 to input data, with such data being provided to the microcontroller 24 by the LCD display circuit 46.
  • the physical or touching relationship between the tip region 18 of the pen 12 and the touch-screen of the LCD display 44 is represented by a dotted line 48 in Fig. 2.
  • the touch-screen 16 of the device 14 responds to the tip region 18 of the pen 12 to input data into the device 14 in conventional pen or stick fashion.
  • data is also input into the device 14 using the microphone 20 of the pen 12 and the voice recognition capabilities of the device 14.
  • the microphone 20 responds to vocal commands of the user by providing the analog voice signals to the microcontroller 24 comprising a portion of the voice recognition circuit 26.
  • the microcontroller 24 attempts to recognize the voice signals, and when such recognition occurs additional data is input into the system. In this fashion, data inputted into the device 14 by voice command is combined with data input by touching the touch-screen 16.
  • Fig. 3 shows an alternative embodiment in which the pen 12 is not coupled by the wire 22 to the device 14, as in the case of Fig. 2. Instead, the pen 12 is wireless and transmits representations of the voice commands of the user to an audio receiver 50 within the device 14. The audio receiver 50 in turn converts the transmitted signals into the voice signals for application to the microcontroller 24. The microcontroller 24 performs voice recognition on the voice signals in the same manner as described in connection with Fig. 2.
  • the device 14 of Fig. 3 is like that of Fig. 2, except for the presence of the audio receiver 50.
  • the pen 12 of Fig. 3 is like the pen of Fig. 2, except that it transmits the vocal commands to the audio receiver 50 in wireless fashion. Because the pen 12 of Fig. 3 is not directly coupled to the device 14 of Fig. 3 by wire, the pen 12 must have its own power supply. To conserve power, the push-to-talk switch 21 which is mounted on the side of the pen 12 is used to apply power to the microphone 20 whenever the user wishes to speak into the microphone 20.
  • As in the case of Fig. 2, the tip region 18 of the pen 12 of Fig. 3 physically interacts with the touch-screen 16 formed in conjunction with the LCD display 44, as represented by the dotted line 48.
  • the electronic portion of the pen 12 of Fig. 3 is shown in Fig. 4.
  • the microphone 20 responds to words spoken by the user to produce the analog voice signals.
  • Such analog voice signals are amplified in an audio amplifier 54.
  • the audio amplifier 54 conditions the signal for proper application to a voltage controlled oscillator (VCO) 56 to frequency modulate (FM) a signal at the output of the VCO 56 in accordance with the analog audio signal from the amplifier 54.
  • the output of the VCO 56 is conditioned by a driver circuit 58 prior to driving an infrared (IR) transmitter 60.
  • the pen 12 is powered by batteries 62. Because there is no memory or microprocessor in the pen 12, there is no need to keep the pen 12 powered when not in use. Accordingly, the push-to-talk switch 21 is used to provide the power from the batteries 62 only when the user wishes to speak into the microphone 20. Pressing the push-to-talk switch 21 at the outer surface of the pen 12 couples the batteries 62 to a voltage regulator 64.
  • the voltage regulator 64 which is coupled between a common ground line 66 and the push-to-talk switch 21, has an output terminal 68 thereof coupled to provide a power supply voltage (+V) at a terminal 70.
  • This regulated supply voltage is applied to the microphone 20, the audio amplifier 54 and the VCO 56.
  • the batteries 62 are directly coupled to the driver 58 by the push-to-talk switch 21.
  • the regulated supply voltage +V optimizes the operating conditions of the VCO 56.
  • Whenever the user wishes to transmit a voice command to the electronic device 14, the user pushes the push-to-talk switch 21 and speaks the voice command into the microphone 20 of the pen 12.
  • the resulting analog audio signal is amplified by the amplifier 54 and applied to the VCO 56 to frequency modulate the output of the VCO 56.
  • This FM signal is applied via the driver 58 to the infrared transmitter 60, for transmission as an infrared signal to the electronic device 14.
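The modulation chain just described (audio in, frequency-modulated carrier out of the VCO) can be imitated numerically: the instantaneous frequency of the output is the carrier frequency plus a deviation proportional to the audio amplitude. The carrier, deviation and sample rate below are arbitrary illustrative values, not figures from the patent.

```python
import math

def fm_modulate(audio, carrier_hz=38000.0, deviation_hz=3000.0,
                sample_rate=192000.0):
    """Frequency-modulate an audio signal (values in -1..1), as the
    VCO 56 does before the driver 58 and IR transmitter 60: integrate
    the instantaneous frequency into a phase and take its sine."""
    out, phase = [], 0.0
    for a in audio:
        freq = carrier_hz + deviation_hz * a   # instantaneous frequency
        phase += 2.0 * math.pi * freq / sample_rate
        out.append(math.sin(phase))
    return out
```

With silent audio (all zeros) the output is simply the unmodulated carrier sinusoid.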
  • When the user is not transmitting voice commands, the push-to-talk switch 21 is open, and the batteries 62 are not coupled to the voltage regulator 64 or to the driver 58.
  • the pen 12 transmits a frequency modulated infrared signal from the infrared transmitter 60.
  • the infrared transmitter 60 may comprise an infrared diode or other appropriate infrared transmitting device.
  • Such infrared signals are sensed by an infrared (IR) sensor 72 forming a part of the audio receiver 50, as shown in Fig. 3.
  • the infrared sensor 72, which may comprise any appropriate form of infrared sensor such as those typically used in remote control devices, provides the received signal for amplification by an amplifier 74 and filtering by a filter 76.
  • the output of the filter 76 is applied to a phase-locked loop receiver 78 which demodulates the frequency modulated signal by converting it to a voice signal.
  • the phase-locked loop receiver 78 is of conventional phase-locked loop (PLL) configuration, and includes a phase detector 80 coupled to the filter 76 and having an output and a second input coupled in a loop which includes a loop filter 82 and a voltage controlled oscillator (VCO) 84.
  • the output of the phase-locked loop receiver 78 is filtered by an audio filter 86 and amplified by an audio amplifier 88 to provide the demodulated voice signal.
  • the phase-locked loop receiver 78 also provides a lock detect signal, which indicates when a transmitted signal is present in the audio receiver 50.
  • the phase-locked loop receiver 78 is described herein by way of example only, and it should be understood that other demodulation receivers can be used as desired.
  • a super-heterodyne receiver can be used in applications posing more stringent requirements.
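A discrete-time caricature of the receiver's loop — phase detector 80, loop filter 82 and VCO 84 — is sketched below. When such a loop is locked, the filtered control signal tracks the frequency deviation of the input, which is the demodulated voice signal. The loop constants are illustrative and untuned; a real design would be dimensioned around the actual carrier and deviation, and the patent itself describes analog hardware, not software.

```python
import math

def pll_demodulate(fm_signal, carrier_hz=38000.0, sample_rate=192000.0,
                   loop_gain=0.05, vco_gain=6000.0):
    """Toy software PLL: a multiplying phase detector, a one-pole
    loop filter and a VCO, with the loop-filter output taken as the
    demodulated signal."""
    phase, control = 0.0, 0.0
    demodulated = []
    for s in fm_signal:
        error = s * math.cos(phase)                # phase detector 80
        control += loop_gain * (error - control)   # loop filter 82
        freq = carrier_hz + vco_gain * control     # VCO 84
        phase += 2.0 * math.pi * freq / sample_rate
        demodulated.append(control)
    return demodulated
```

With no input signal the error is zero and the control output stays at rest, mirroring the lock-detect idea: a transmitted signal must be present before meaningful output appears.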
  • Voice control of the electronic device 14 within the voice and pointer operated system 10 is made possible by first voice training the collection of reference digital voice templates in the voice recognition circuit 26 in accordance with the user's voice. Such templates are collected in the same manner as described in co-pending application Serial No. 07/915,112.
  • the LCD display 44 of the device 14 is used to provide displays which prompt the user by requesting the needed words. The user responds, with the push-to-talk switch 21 being pushed, by speaking the prompted word into the microphone 20 of the pen 12.
  • the electronic device 14 in the example of Fig. 1 comprises a personal digital assistant (PDA) capable of input and retrieval of a phone directory.
  • the LCD display 44 of the device 14 is employed to display a list of names and phone numbers.
  • When the user touches a button labeled “add”, displayed on the LCD display 44, with the tip region 18 of the pen 12, a blank entry with boxes for the name, voice training, country code, area code, phone numbers, address and notes is placed at the end of the list.
  • the user is also given the option of adding additional phone numbers and alternate addresses per entry, if desired.
  • the user may randomly select fields to input data, by touching such fields with the pen 12.
  • When the user touches a “name” box on the LCD display 44 with the pen 12, the name can be input using the keyboard input 38, or by voice, spelling each word.
  • the user can then touch a "voice” button and speak the person's name into the microphone 20 of the pen 12 in order to create a recording of the name.
  • the user can then touch an "area code” box and speak the digits for the area code, or the user can touch the "phone number” box and speak the digits of the phone number, to complete the entry.
  • the various fields are optional and can be selected in any order. If the user decides not to input data into a field, the field is left blank. A detailed example of this operation is provided hereafter in connection with Figs. 6 and 7.
  • the user touches a "search" button on the LCD display 44 using the pen 12, and speaks the name of the person the user wishes to call.
  • the person's name, phone number, and any other available information is immediately displayed on the LCD display 44, and the recording of the name is spoken back to the user for confirmation, using the audio circuit 40 and the speaker 42 shown in the arrangements of Figs. 2 and 3.
  • the device 14 can be programmed to speak the phone number and other information or to dial the number if connected to a telephone.
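The phone-directory behaviour described above might be modelled with a simple record structure; the field names and the exact-match search below are assumptions for illustration (the real device matches a spoken name by voice recognition, not by string comparison).

```python
# Hypothetical in-memory phone directory, mirroring the fields named
# in the text: name, country code, area code, phone number, address
# and notes.
directory = []

def add_entry(**fields):
    """Create a new entry at the end of the list; fields the user
    skips are simply left blank."""
    entry = {f: fields.get(f, "") for f in
             ("name", "country", "area_code", "phone", "address", "notes")}
    directory.append(entry)
    return entry

def search(name):
    """Return the first entry matching the (recognized) spoken name,
    or None if there is no such entry."""
    for entry in directory:
        if entry["name"].lower() == name.lower():
            return entry
    return None

add_entry(name="Jane Doe", area_code="818", phone="555-0100")
print(search("jane doe")["phone"])  # prints: 555-0100
```

Leaving fields blank, as the text permits, simply leaves empty strings in the entry.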
  • the flow chart of Fig. 6 illustrates the manner in which the user inputs all of the information normally found in an address book. For simplicity of illustration, the flow chart of Fig. 6 shows only one path through the various fields. However, it should be understood that the user can input such information in any order desired.
  • In a first step 90, the user requests a new entry in the directory, and does so by touching a button labeled “add” with the pen 12.
  • the user then chooses to input the name first, by touching the "name” field with the pen 12, in a step 92.
  • the "name” field is highlighted, and the user then types the person's name using the keyboard input 38, in a step 94. This highlights the "record” field.
  • the user then holds the push-to-talk switch 21 on the pen 12 and speaks the person's name, in a step 96.
  • Upon speaking the person’s name in the step 96, the “address” field is highlighted upon release of the push-to-talk switch 21. In a step 98, the user types in the address using the keyboard input 38.
  • the user may record the address by speaking the address into the microphone 20 of the pen 12 in a step 100.
  • the user can speak the digits of the street number into the pen 12, in a step 102, and then type the street name using the keyboard input 38, in a following step 104.
  • the "phone number” field is highlighted.
  • the user may type the phone number using the keyboard input 38, in a step 106.
  • the user may speak the digits of the phone number into the pen 12, in a step 108.
  • the address entry is automatically saved when an “enter” key on the keyboard input 38 is pressed, or the push-to-talk switch 21 on the pen 12 is released, or one of the on-screen icons such as “save” or “new entry” is touched.
  • the flow chart of Fig. 7 provides a more detailed example of the manner in which the user can randomly select fields and input information in any order.
  • the flow chart of Fig. 7 is confined to entry of a phone number. Nevertheless, the flow chart of Fig. 7 demonstrates the ease and flexibility with which the user may enter information.
  • As shown in an elongated box 120 at the top of Fig. 7, the user has chosen to start a new phone number entry.
  • the user may touch a “country” field, as illustrated in a step 122. If so, and as illustrated in a following step 124, the country code field is highlighted and the device 14 waits for the user to press the push-to-talk switch 21 on the pen 12 (again assuming the wireless example of Fig. 3).
  • By pushing the push-to-talk switch 21 on the pen 12, in a step 126, the device 14 is made ready to interpret each spoken word and to append the result to the number in the country code field, until the push-to-talk switch 21 is released or the field is full, as illustrated in a step 128. Release of the push-to-talk switch 21 results in the device 14 highlighting the country code field (step 124). If the field is full, the box 120 at the top of Fig. 7 is returned to. From the starting box 120, the user may also touch an “area code” field, in a step 130, and this highlights the area code field and waits for the user to press the push-to-talk switch 21 in a step 132.
  • the device 14 interprets each spoken word and appends the result to the number in the area code field, until either the user releases the button or the field is full, as shown in a step 136. Release of the push-to-talk switch 21 results in the highlighting of the area code field (step 136) . If the field is filled, the starting box 120 is returned to.
  • the user may also touch a "phone number" field in a step 138, and this highlights the phone number field and causes the device 14 to wait for the user to press the push-to-talk switch 21, in a step 140. If the user then presses the push-to-talk switch 21, in a step 142, the device 14 interprets each spoken word and appends the result to the number in the phone number field, until the user releases the push-to-talk switch 21 or the field is full, as illustrated in a step 144. If the user releases the push-to-talk switch 21, the phone number field is highlighted (step 140) . If the field is filled, the starting box 120 is returned to.
  • the user may move to the step 130 or the step 138 for highlighting of the area code field or the phone number field respectively.
  • the user can move to either the step 138, or to the step 122 in which the country code is highlighted.
  • the step 140 in which the phone number field is highlighted can also result in the user moving to either the step 122 or the step 130. This illustrates the flexibility of the system and the fact that fields can be selected and information input thereto randomly and in any order.
An alternative approach to accommodating multiple users of the electronic device 14 involves the addition of a multiple-position switch to the device 14. Positioning such switch to a number corresponding to a particular one of multiple users results in the device 14 using that person's voice templates. While various forms and modifications have been suggested, it will be appreciated that the invention is not limited thereto but encompasses all expedients and variations falling within the scope of the appended claims.

Abstract

A voice and pointer operated system includes a pen (12) provided with a microphone (20) responsive to a voice command for providing a voice signal, and an electronic device (14) having a pointer-responsive area in the form of a touch screen (16) for interfacing with the pen (12), as well as apparatus for recognizing the voice signal and producing an action routine denoted thereby. The pointer-responsive area (16) may comprise a touch screen presenting frames for interaction with the pen (12) to select functions for the electronic device (14), with voice commands entered via the microphone (20) in the pen being used to provide input data to the electronic device (14). The pen (12) may be wireless or electrically coupled to the electronic device (14).

Description

VOICE/POINTER OPERATED SYSTEM

Background of the Invention
1. Field of the Invention.
The present invention relates to systems for entering data into a device using voice recognition of spoken words, and more particularly to systems capable of combining voice recognition with other forms of data entry such as by touching.
2. History of the Prior Art.

It is well known to provide an electronic device with a pointer member such as a pen or stick for purposes of inputting data into the device. Typically, the pen or stick is used to activate buttons and to highlight fields displayed on touch-screen displays. In addition, the pen or stick may be used to perform other functions, such as handwriting recognition. In such arrangements, the pen or stick provides a means of human interfacing whereby the operator may point to fields, frames or buttons to make selections or otherwise enter data into an electronic device, such as a personal computer, a personal organizer or a personal digital assistant (PDA). Additional data, such as that entered for purposes of recording or storage, is often entered by other interfacing means such as a keyboard or keypad.
Still other arrangements for inputting data into an electronic device are known. Among such arrangements are those which use voice recognition techniques to recognize voice commands entered by the user. Such electronic devices include systems in which remote controls are provided with sophisticated voice recognition electronics. The remote controls recognize spoken commands, translate the commands into the traditional remote digital control signals, and transmit the control signals to a controlled device. Examples of such systems are provided by co-pending application Serial No. 07/915,112 of Bissonnette et al., entitled Voice Operated Remote Control Device, by co-pending application Serial No. 07/915,938 of Bissonnette et al., entitled Voice Recognition Apparatus and Method, and by co-pending application Serial No. 07/915,114 of Fischer, entitled Remote Control Device. All three applications were filed on July 17, 1992 and are commonly assigned with the present application.
A further example of an electronic device in which data is inputted using voice recognition is provided by co-pending application Serial No. 08/113,394 of Fischer et al., entitled Voice Operated Remote Control System, which application was filed August 27, 1993 and is commonly assigned with the present application. The Fischer et al. application describes a voice operated remote control system in which a remote control device responsive to the voice commands of a user transmits representations of the voice commands to a controlled device which then produces voice signals in response to the transmitted representations. The controlled device includes voice recognition circuitry for recognizing the transmitted voice commands and executing action routines denoted thereby. The remote control device receives the voice commands via a microphone and produces a corresponding analog audio signal which is modulated and then transmitted by an infrared transmitter. An audio receiver at the controlled device includes an infrared sensor for receiving the transmitted signal, and additional circuitry for demodulating and processing the transmitted signal into a corresponding voice signal. The voice signal is used to generate an incoming digital voice template for comparison with a plurality of digital reference voice templates. If a substantial equivalent is found, a corresponding action routine is executed to achieve a desired action within the controlled device.

Another electronic device utilizing voice recognition for data inputting is described by co-pending application Serial No. 08/134,327 of Bissonnette et al., entitled Voice Activated Personal Organizer, which application was filed October 12, 1993 and is commonly assigned with the present application. The Bissonnette et al. application describes a small, portable, hand-held electronic personal organizer which performs voice recognition on words spoken by a user to input data into the organizer, and records voice messages from the user. The spoken words and the voice messages are input via a microphone, and the voice messages are compressed and then converted into digital signals for storage. The stored digital voice messages are reconverted into analog signals and then expanded for reproduction using a speaker. The organizer is capable of a number of different functions, including voice training, memo record, reminder, manual reminder, time setting, message review, waiting message, calendar, phone group select, number retrieval, add phone number, security, and "no" logic. During such functions, data is principally entered by voice and occasionally through use of a limited keypad, and voice recordings are made and played back as appropriate. A visual display provides feedback to the user. During the various functions, the user can edit different data within the organizer by eliminating or correcting such data or entering new data.
While the electronic devices described in the above-noted co-pending applications incorporate a number of different advantageous features for human interfacing with such electronic devices so that data can be efficiently and effectively transferred, it may be desired for certain applications to provide further improvements in human interfacing with such devices.
For example, it would be desirable to be able to select functions or otherwise input data via manual means while simultaneously entering data specific to those functions by non-manual entries such as by vocal commands. Such an arrangement would provide an easy, intuitive human interface, allowing for both simultaneous and independent speech and touch inputting.
Brief Summary of the Invention

Briefly stated, the present invention provides a voice and pointer operated system in which a pen or stick responsive to voice commands is operative to input data into a device using voice recognition, in addition to data inputted into the device using a pointer or tip region of the pen or stick. The device has a pointer-responsive area thereon as well as voice recognition means for recognizing a voice signal from the pen and producing an action routine denoted thereby. The pointer-responsive area may comprise a touch screen presenting frames for interaction with the pen to select functions for the device, with voice commands entered via a microphone in the pen being used to provide input data to the device.
In a first embodiment of a voice and pointer operated system in accordance with the invention, the pen is electrically coupled to the device to provide the voice signal directly to the device. A wire extending between the pen and the device, and which is long enough to permit movement of the tip region of the pen to different areas of the touch screen, couples the microphone directly to the device to provide the voice signals thereto. The wire connection also enables the pen to utilize the power supply of the device.
In a second embodiment of a voice and pointer operated system, the pen is wireless, thereby eliminating the need for a wire or other direct connection between the pen and the device. Voice signals generated by the microphone of the pen in response to voice commands from the user are modulated before being transmitted by the pen to an audio receiver within the device for conversion into the voice signal. The modulated voice signal may be transmitted by an infrared transmitter, within the pen, to an infrared sensor at the audio receiver within the device.
The voice recognition apparatus within the device may assume any appropriate form. In one example, such apparatus includes a reference memory for storing a plurality of reference voice templates, a program memory for storing a control program, and a processor for generating an incoming voice template in response to each voice signal which is received. The control program is executed to determine whether the incoming voice template is substantially equivalent to one of the reference voice templates. One of a plurality of action routines is then selected, based on the reference voice template which the incoming voice template matches.
The device, which is controlled by the voice and pointer inputs of the pen or stick, may comprise such electronic devices as a computer, a personal organizer or a personal digital assistant (PDA). The device may, for example, comprise a personal digital assistant which is organized to operate as a phone directory for storing names, addresses, phone numbers and other information. In that event, various fields within a display on the phone directory are touched by the pen to select basic functions such as entry of an additional name or entry of digital information for a name already displayed, with the user then speaking into the pen to enter the specific data within such fields.

Brief Description Of The Drawings
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings, in which:
Fig. 1 is a perspective view of a pen and a personal digital assistant, providing an example of a voice and pointer operated system in accordance with the invention;
Fig. 2 is a block diagram of a first embodiment of the system of Fig. 1 in which the pen is directly coupled by wire to the personal digital assistant;
Fig. 3 is an alternative embodiment of the system of Fig. 1 in which the pen is wireless;
Fig. 4 is a block diagram of the electronic portion of the pen of Fig. 3;

Fig. 5 is a block diagram of the audio receiver forming a part of the personal digital assistant of Fig. 3;
Fig. 6 is a basic flow chart illustrating the operation of the system of Fig. 1; and

Fig. 7 is a detailed flow chart illustrating a detailed example of the operation of the system of Fig. 1.
Detailed Description

Fig. 1 shows a voice and pointer operated system 10 in accordance with the invention. The system 10 of Fig. 1 includes a pen 12 and an electronic device 14 with which the pen 12 is used. In the example of Fig. 1, the device 14 comprises a personal digital assistant (PDA). It should be appreciated, however, that systems in accordance with the invention may utilize other electronic devices such as computers and personal organizers which are adapted for voice and pointer operation, using the pen 12, in accordance with the invention. The electronic device 14 includes a pointer-responsive area thereon in the form of a touch-screen 16. The pen 12 includes a pointer or tip region 18 thereof. Placement of the pen 12 to engage the touch-screen 16 by the tip region 18 results in the inputting of data into the touch-screen 16 in conventional fashion. In this connection, it should be understood that arrangements using other than physical touching can be used in accordance with the invention, including those in which the pointer-responsive area responds to infrared signals blocked by the pen 12. Any appropriate arrangement can be used in which the tip region 18 of the pen 12 interacts with the touch-screen 16 or other pointer-responsive area of the device 14 to input data therein, when placed in close proximity thereto.
In accordance with the invention, the pen 12 is provided with a microphone 20 mounted at an end of the pen 12 opposite the tip region 18. As the user speaks voice commands into the microphone 20, corresponding voice signals are provided to the device 14. The user presses a push-to-talk switch 21 on the pen 12 to signal the device 14 that voice commands are being transmitted thereto. As described hereafter, the device 14 utilizes voice recognition to input further data represented by the voice signals.
The pen 12 may be directly coupled to the device 14 by wire, as described hereafter in connection with Fig. 2. Alternatively, the pen 12 may be wireless and may transmit the voice commands to the device 14, such as by use of infrared techniques, as described hereafter in connection with Fig. 3. The manner in which the voice signals received by the microphone 20 of the pen 12 are conveyed to the device 14 is unimportant. What is important is that the system 10 of Fig. 1, as is true of systems in accordance with the invention, be capable of inputting data into the device 14 using both voice commands and pointing. Moreover, the voice commands and the pointing can be utilized to input data into the device 14 simultaneously. It will be appreciated by those skilled in the art, particularly from the detailed description to follow, that systems in accordance with the invention such as the system 10 shown in Fig. 1 provide a greatly improved, and indeed a natural, human interface of the user with the device 14. The pen 12 can be used to point to a desired field or other area of the device 14, and thereby input data into the device 14 while simultaneously responding to the user's voice commands to input additional data into the device 14 in conjunction therewith. This enables the user to point to a desired field or frame of the touch-screen 16 within the device 14 while at the same time making choices and otherwise inputting data in connection with such field or frame using voice commands.
Fig. 2 shows an embodiment of the system 10 of Fig. 1 in which the pen 12 is directly coupled by wire to the device 14. As shown in Fig. 2, the microphone 20 of the pen 12 is coupled by a wire 22 to the device 14. The wire 22, which is long enough to provide the user with flexibility in pointing to different portions of the touch-screen 16, provides the voice signals which are generated by the microphone 20 in response to the voice commands of the user directly to a microcontroller 24 within the device 14. The user presses the push-to-talk switch 21 before speaking to signal the device 14 that voice signals are being transmitted thereto.
The microcontroller 24 forms a part of a voice recognition circuit 26, and includes an analog-to-digital converter (A/D) 28. The microphone 20 of the pen 12 converts words spoken by the user into analog voice signals, and the A/D converter 28 converts such signals into corresponding digital signals. The microcontroller 24 also includes a microprocessor (MP) 30 coupled to the A/D converter 28, and having access to both a read only memory (ROM) 32 and a random access memory (RAM) 34. The microcontroller 24 is coupled to an external DRAM 36, which forms a part of the voice recognition circuit 26, and which is utilized for voice recognition storage requirements, including primarily the storage of voice templates.
The voice recognition apparatus shown and described in Fig. 2 is by way of example, and it should be understood that other apparatus capable of voice recognition can be used in accordance with the invention. For example, a digital signal processor (DSP) can be used to perform the voice recognition.
The microcontroller 24 comprising the voice recognition circuit 26 operates in the same manner as described in the previously referred to co-pending applications, Serial Nos. 07/915,112, 07/915,938 and 07/915,114. Such applications are incorporated herein by reference. As described in detail in co-pending application Serial No. 07/915,112, for example, the A/D converter 28 of Fig. 2 may comprise an 8-bit converter which samples incoming data at a preassigned frequency such as 9.6 kHz. The A/D converter 28 outputs a digital signal representing the input analog voice signal from the microphone 20. The microprocessor 30 processes the digital voice signal together with a voice recognition software routine forming part of a control program stored in the ROM 32. The digital voice signal is converted into an incoming voice template that is compared against previously stored voice templates of the user's voice stored in the external DRAM 36. The program decodes the voice templates. Together with the external DRAM 36, the RAM 34 comprises a reference memory for temporary storage of data.
Thus, the analog voice signal from the microphone 20 of the pen 12 is applied to the A/D converter 28 for conversion into an incoming digital voice signal. The reference memory, comprised of the external DRAM 36 in conjunction with the RAM 34, stores a plurality of reference digital voice templates. The ROM 32 stores the control program. The microprocessor 30, which is coupled to the A/D converter 28, the ROM 32 and the RAM 34, generates an incoming digital voice template from the incoming digital voice signal at the output of the A/D converter 28. The microprocessor 30 then executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates stored in the reference memory comprised of the RAM 34 and the external DRAM 36. The microprocessor 30 determines what action to take corresponding to a reference digital voice template, if the incoming digital voice template is found to have substantial similarity to the reference digital voice template. Again, however, it should be understood that this description is by way of example only, and that other voice recognition techniques can be used in accordance with the invention.
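The template comparison described above can be sketched in Python. This is a minimal stand-in, not the patent's actual algorithm: the feature extraction in `make_template` (a frame-energy envelope), the Euclidean distance, and the acceptance threshold are all assumptions, since the co-pending applications incorporated by reference are not reproduced here.

```python
import numpy as np

FRAMES = 16  # frames per template (assumed)

def make_template(samples: np.ndarray) -> np.ndarray:
    """Reduce a voice signal to a fixed-length energy envelope.

    A stand-in for the unspecified feature extraction: split the
    signal into FRAMES chunks and take the RMS energy of each chunk.
    """
    chunks = np.array_split(samples.astype(float), FRAMES)
    return np.array([np.sqrt(np.mean(c ** 2)) for c in chunks])

def best_match(incoming: np.ndarray, references: dict, threshold: float):
    """Compare an incoming signal against the stored reference templates.

    Returns the name of the closest reference, or None when nothing is
    "substantially equivalent" (here: within a Euclidean threshold).
    """
    template = make_template(incoming)
    scores = {name: float(np.linalg.norm(template - ref))
              for name, ref in references.items()}
    name = min(scores, key=scores.get)
    return name if scores[name] <= threshold else None
```

A matched name would then index into the table of action routines; an unmatched signal would typically cause the device to re-prompt the user.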
The electronic device 14 includes a keyboard input 38. The keyboard input 38 is not essential but may be used, as in the present example, for manual entry of additional data into the microcontroller 24. The electronic device 14 also includes an audio circuit 40 coupled through a speaker or speakers 42 to provide audible messages to the user of the device 14.
The electronic device 14 of Fig. 2 includes a liquid crystal display (LCD) 44 coupled to the microcontroller 24 through an LCD display circuit 46. The LCD display 44, which provides a visual display to the user, forms part of the touch-screen 16 of the electronic device 14. The touch-screen 16 is responsive to the tip region 18 of the pen 12 to input data, with such data being provided to the microcontroller 24 by the LCD display circuit 46. The physical or touching relationship between the tip region 18 of the pen 12 and the touch-screen of the LCD display 44 is represented by a dotted line 48 in Fig. 2.
The touch-screen 16 of the device 14 responds to the tip region 18 of the pen 12 to input data into the device 14 in conventional pen or stick fashion. In addition, however, and in accordance with the invention, data is also input into the device 14 using the microphone 20 of the pen 12 and the voice recognition capabilities of the device 14. As previously described, the microphone 20 responds to vocal commands of the user by providing the analog voice signals to the microcontroller 24 comprising a portion of the voice recognition circuit 26. The microcontroller 24 attempts to recognize the voice signals, and when such recognition occurs additional data is input into the system. In this fashion, data inputted into the device 14 by voice command is combined with data input by touching the touch-screen 16.
Fig. 3 shows an alternative embodiment in which the pen 12 is not coupled by the wire 22 to the device 14, as in the case of Fig. 2. Instead, the pen 12 is wireless and transmits representations of the voice commands of the user to an audio receiver 50 within the device 14. The audio receiver 50 in turn converts the transmitted signals into the voice signals for application to the microcontroller 24. The microcontroller 24 performs voice recognition on the voice signals in the same manner as described in connection with Fig. 2.
Thus, the device 14 of Fig. 3 is like that of Fig. 2, except for the presence of the audio receiver 50. The pen 12 of Fig. 3 is like the pen of Fig. 2, except that it transmits the vocal commands to the audio receiver 50 in wireless fashion. Because the pen 12 of Fig. 3 is not directly coupled to the device 14 of Fig. 3 by wire, the pen 12 must have its own power supply. To conserve power, the push-to-talk switch 21, which is mounted on the side of the pen 12, is used to apply power to the microphone 20 whenever the user wishes to speak into the microphone 20. As in the case of Fig. 2, the tip region 18 of the pen 12 of Fig. 3 physically interacts with the touch-screen 16 formed in conjunction with the LCD display 44, as represented by the dotted line 48.

The electronic portion of the pen 12 of Fig. 3 is shown in Fig. 4. As previously described, the microphone 20 responds to words spoken by the user to produce the analog voice signals. Such analog voice signals are amplified in an audio amplifier 54. The audio amplifier 54 conditions the signal for proper application to a voltage controlled oscillator (VCO) 56 to frequency modulate (FM) a signal at the output of the VCO 56 in accordance with the analog audio signal from the amplifier 54. The output of the VCO 56 is conditioned by a driver circuit 58 prior to driving an infrared (IR) transmitter 60.
The pen 12 is powered by batteries 62. Because there is no memory or microprocessor in the pen 12, there is no need to keep the pen 12 powered when not in use. Accordingly, the push-to-talk switch 21 is used to provide the power from the batteries 62 only when the user wishes to speak into the microphone 20. Pressing the push-to-talk switch 21 at the outer surface of the pen 12 couples the batteries 62 to a voltage regulator 64. The voltage regulator 64, which is coupled between a common ground line 66 and the push-to-talk switch 21, has an output terminal 68 thereof coupled to provide a power supply voltage (+V) at a terminal 70. This regulated supply voltage is applied to the microphone 20, the audio amplifier 54 and the VCO 56. The batteries 62 are directly coupled to the driver 58 by the push-to-talk switch 21. The regulated supply voltage +V optimizes the operating conditions of the VCO 56.
Whenever the user wishes to transmit a voice command to the electronic device 14, the user pushes the push-to-talk switch 21 and speaks the voice command into the microphone 20 of the pen 12. The resulting analog audio signal is amplified by the amplifier 54 and applied to the VCO 56 to frequency modulate the output of the VCO 56. This FM signal is applied via the driver 58 to the infrared transmitter 60, for transmission as an infrared signal to the electronic device 14. When the user is not transmitting voice commands, the push-to-talk switch 21 is open, and the batteries 62 are not coupled to the voltage regulator 64 or to the driver 58.
When the user pushes the push-to-talk switch 21 and speaks a command word into the microphone 20, the pen 12 transmits a frequency modulated infrared signal from the infrared transmitter 60. The infrared transmitter 60 may comprise an infrared diode or other appropriate infrared transmitting device. Such infrared signals are sensed by an infrared (IR) sensor 72 forming a part of the audio receiver 50, as shown in Fig. 3. A detailed example of the audio receiver 50 is shown in Fig. 5.
Referring to Fig. 5, the infrared sensor 72, which may comprise any appropriate form of infrared sensor such as those typically used in remote control devices, provides the received signal for amplification by an amplifier 74 and filtering by a filter 76. The output of the filter 76 is applied to a phase-locked loop receiver 78 which demodulates the frequency modulated signal by converting it to a voice signal. The phase-locked loop receiver 78 is of conventional phase-locked loop (PLL) configuration, and includes a phase detector 80 coupled to the filter 76 and having an output and a second input coupled in a loop which includes a loop filter 82 and a voltage controlled oscillator (VCO) 84. The output of the phase-locked loop receiver 78 is filtered by an audio filter 86 and amplified by an audio amplifier 88 to provide the demodulated voice signal. The phase-locked loop receiver 78 also provides a lock detect signal, which indicates when a transmitted signal is present in the audio receiver 50.
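The transmit and receive chain of Figs. 4 and 5 (VCO frequency modulation in the pen, demodulation back to the voice signal in the receiver) can be modelled numerically. This is a sketch under assumed figures: the sample rate, carrier frequency and deviation are invented for illustration, and an FFT-based analytic signal stands in for the hardware phase-locked loop.

```python
import numpy as np

fs = 48_000  # sample rate, Hz (assumed)
fc = 5_000   # VCO centre (carrier) frequency, Hz (assumed)
kf = 1_000   # frequency deviation per unit amplitude, Hz (assumed)

t = np.arange(0, 0.1, 1 / fs)
voice = np.sin(2 * np.pi * 300 * t)  # stand-in for the microphone signal

# Pen side (Fig. 4): the VCO's instantaneous frequency is fc + kf*voice(t),
# so the transmitted phase is the running integral of that frequency.
phase = 2 * np.pi * np.cumsum(fc + kf * voice) / fs
tx = np.cos(phase)  # the waveform carried by the infrared transmitter 60

# Receiver side (Fig. 5): recover the instantaneous frequency, which is
# what the PLL tracks in hardware. An FFT-based analytic signal is used
# here in place of the phase detector / loop filter / VCO loop.
def analytic(x: np.ndarray) -> np.ndarray:
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:n // 2] = 2.0
    h[n // 2] = 1.0  # n is even here
    return np.fft.ifft(np.fft.fft(x) * h)

inst_phase = np.unwrap(np.angle(analytic(tx)))
inst_freq = np.diff(inst_phase) * fs / (2 * np.pi)
recovered = (inst_freq - fc) / kf  # demodulated voice signal
```

Away from the block edges, `recovered` tracks `voice` closely; the audio filter 86 in the actual receiver would then remove any residual carrier ripple.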
The phase-locked loop receiver 78 is described herein by way of example only, and it should be understood that other demodulation receivers can be used as desired. For example, a super-heterodyne receiver can be used in applications posing more stringent requirements.

Voice control of the electronic device 14 within the voice and pointer operated system 10 is made possible by first voice training the collection of reference digital voice templates in the voice recognition circuit 26 in accordance with the user's voice. Such templates are collected in the same manner as described in co-pending application Serial No. 07/915,112. The LCD display 44 of the device 14 is used to provide displays which prompt the user by requesting the needed words. The user responds, with the push-to-talk switch 21 being pushed, by speaking the prompted word into the microphone 20 of the pen 12.
As previously noted, the electronic device 14 in the example of Fig. 1 comprises a personal digital assistant (PDA) capable of input and retrieval of a phone directory. The LCD display 44 of the device 14 is employed to display a list of names and phone numbers. By touching a button labeled "add", displayed on the LCD display 44, with the tip region 18 of the pen 12, a blank entry with boxes for the name, voice training, country code, area code, phone numbers, address and notes is placed at the end of the list. The user is also given the option of adding additional phone numbers and alternate addresses per entry, if desired.

The user may randomly select fields to input data, by touching such fields with the pen 12. For example, if the user touches a "name" box on the LCD display 44 with the pen 12, the name is input using the keyboard input 38, or by voice, spelling each word. The user can then touch a "voice" button and speak the person's name into the microphone 20 of the pen 12 in order to create a recording of the name. The user can then touch an "area code" box and speak the digits for the area code, or the user can touch the "phone number" box and speak the digits of the phone number, to complete the entry. The various fields are optional and can be selected in any order. If the user decides not to input data into a field, the field is left blank. A detailed example of this operation is provided hereafter in connection with Figs. 6 and 7.

To retrieve a phone number, the user touches a "search" button on the LCD display 44 using the pen 12, and speaks the name of the person the user wishes to call. The person's name, phone number, and any other available information are immediately displayed on the LCD display 44, and the recording of the name is spoken back to the user for confirmation, using the audio circuit 40 and the speaker 42 shown in the arrangements of Figs. 2 and 3.
Alternatively, the device 14 can be programmed to speak the phone number and other information or to dial the number if connected to a telephone.
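The add and search behaviour described above can be sketched as a small data model. The `Entry` and `Directory` names are hypothetical; the patent specifies only the optional fields and the two on-screen buttons, and the spoken name is assumed to have already passed through voice recognition.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    """One directory record; every field is optional, as in the text."""
    name: str = ""
    country_code: str = ""
    area_code: str = ""
    phone_number: str = ""
    address: str = ""
    notes: str = ""

class Directory:
    def __init__(self):
        self.entries = []

    def add(self) -> Entry:
        """The "add" button: place a blank entry at the end of the list."""
        entry = Entry()
        self.entries.append(entry)
        return entry

    def search(self, spoken_name: str):
        """The "search" button: return the first entry whose name matches
        the (already recognized) spoken name, or None."""
        for entry in self.entries:
            if entry.name.lower() == spoken_name.lower():
                return entry
        return None
```

The device would display the returned entry on the LCD display 44 and play back the recorded name for confirmation.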
The flow chart of Fig. 6 illustrates the manner in which the user inputs all of the information normally found in an address book. For simplicity of illustration, the flow chart of Fig. 6 shows only one path through the various fields. However, it should be understood that the user can input such information in any order desired. Referring to the flow chart example of Fig. 6, in a first step 90 the user requests a new entry in the directory, and does so by touching a button labeled "add" with the pen 12. The user then chooses to input the name first, by touching the "name" field with the pen 12, in a step 92. The "name" field is highlighted, and the user then types the person's name using the keyboard input 38, in a step 94. This highlights the "record" field. The user then holds the push-to-talk switch 21 on the pen 12 and speaks the person's name, in a step 96.
Upon speaking the person's name, in the step 96, the "address" field is highlighted upon release of the push-to-talk switch 21. In a step 98, the user types in the address using the keyboard input 38.
Alternatively, the user may record the address by speaking the address into the microphone 20 of the pen 12 in a step 100. As a further alternative, the user can speak the digits of the street number into the pen 12, in a step 102, and then type the street name using the keyboard input 38, in a following step 104.
Following entry of the address, the "phone number" field is highlighted. The user may type the phone number using the keyboard input 38, in a step 106. Alternatively, the user may speak the digits of the phone number into the pen 12, in a step 108. As shown by a subsequent step 110, the address entry is automatically saved when an "enter" key on the keyboard input 38 is pressed, or the push-to-talk switch 21 on the pen 12 is released, or one of the on-screen icons such as "save" or "new entry" is touched.
The flow chart of Fig. 7 provides a more detailed example of the manner in which the user can randomly select fields and input information in any order. For simplicity of illustration, the flow chart of Fig. 7 is confined to entry of a phone number. Nevertheless, the flow chart of Fig. 7 demonstrates the ease and flexibility with which the user may enter information.

As illustrated by an elongated box 120 at the top of Fig. 7, the user has chosen to start a new phone number entry. The user may touch a "country" field, as illustrated in a step 122. If so, and as illustrated in a following step 124, the country code field is highlighted and the device 14 waits for the user to press the push-to-talk switch 21 on the pen 12 (again assuming the wireless example of Fig. 3). When the user pushes the push-to-talk switch 21 on the pen 12, in a step 126, the device 14 is ready to interpret each spoken word and to append the result to the number in the country code field, until the push-to-talk switch 21 is released or the field is full, as illustrated in a step 128. Release of the push-to-talk switch 21 results in the device 14 highlighting the country code field (step 124). If the field is full, the box 120 at the top of Fig. 7 is returned to.

From the starting box 120, the user may also touch an "area code" field, in a step 130, and this highlights the area code field and waits for the user to press the push-to-talk switch 21 in a step 132. If the user presses the push-to-talk switch 21 in a step 134, the device 14 interprets each spoken word and appends the result to the number in the area code field, until either the user releases the button or the field is full, as shown in a step 136. Release of the push-to-talk switch 21 results in the highlighting of the area code field (step 132). If the field is filled, the starting box 120 is returned to.
From the starting box 120 at the top of Fig. 7, the user may also touch a "phone number" field, in a step 138, and this highlights the phone number field and causes the device 14 to wait for the user to press the push-to-talk switch 21, in a step 140. If the user then presses the push-to-talk switch 21, in a step 142, the device 14 interprets each spoken word and appends the result to the number in the phone number field, until the user releases the push-to-talk switch 21 or the field is full, as illustrated in a step 144. If the user releases the push-to-talk switch 21, the phone number field is highlighted (step 140). If the field is full, control returns to the starting box 120.
As shown by the various lines and arrows in Fig. 7, from the step 124 in which the country code field is highlighted, the user may move to the step 130 or the step 138 for highlighting of the area code field or the phone number field respectively. Similarly, from the step 132 in which the area code field is highlighted, the user can move to either the step 138, or to the step 122 in which the country code is highlighted. The step 140 in which the phone number field is highlighted can also result in the user moving to either the step 122 or the step 130. This illustrates the flexibility of the system and the fact that fields can be selected and information input thereto randomly and in any order.
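The per-field entry loop of Fig. 7 can be sketched as follows. This is a hypothetical illustration, not the patent's implementation; the field names and digit capacities are assumptions:

```python
# Assumed digit capacities for the three fields of Fig. 7.
FIELD_CAPACITY = {"country": 3, "area_code": 3, "phone_number": 7}

def enter_digits(field, spoken_digits, fields):
    """Append each recognized spoken digit to the highlighted field until
    the field is full (steps 128, 136, 144). The spoken_digits iterable
    ends when the push-to-talk switch 21 is released."""
    for digit in spoken_digits:
        if len(fields[field]) >= FIELD_CAPACITY[field]:
            break  # field full: control returns to the starting box 120
        fields[field] += digit
    return fields[field]
```

Because each call operates on whichever field was last touched, the fields can be filled in any order, mirroring the random field selection the flow chart illustrates.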
It is possible for two or more users to use the same electronic device 14. Each such user trains his or her voice, with the resulting templates being stored separately from the templates of the other user or users. Thereafter, and in accordance with one technique, a particular user initiates use of the device 14 by speaking his or her name, at an appropriate word group, and this results in the device 14 using that person's voice templates. Once the name is recognized, the LCD display 44 shows the user's name and uses that person's voice templates, until the name is changed. Even if the device 14 is turned off, the device continues to identify that person and his or her voice templates when again turned on. This technique for use of the electronic device 14 by multiple users has the advantage that the electronic device 14 need not have components added thereto to support the multiple voice capability. Also, the users do not have to remember a number on a switch or other device corresponding to their voice. However, the word groups must be increased by addition of more words, and it is difficult to constantly determine which user's voice templates are currently being used.
Accordingly, an alternative approach to accommodating multiple users of the electronic device 14 involves the addition of a multiple position switch to the device 14. Positioning such switch to a number corresponding to a particular one of multiple users results in the device 14 using that person's voice templates. While various forms and modifications have been suggested, it will be appreciated that the invention is not limited thereto but encompasses all expedients and variations falling within the scope of the appended claims.
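Both multi-user techniques, spoken-name selection and the multiple-position switch, amount to keeping each user's voice templates in a store keyed by user. The following is an illustrative sketch under that reading; the class and method names are assumptions, not the patent's own:

```python
class TemplateStore:
    """Holds each user's voice templates separately and remembers the
    active user, as the device 14 does across power cycles."""

    def __init__(self):
        self.templates = {}      # user name -> that user's voice templates
        self.active_user = None  # persists until the name is changed

    def train(self, user, template):
        """Store a template produced during that user's voice training."""
        self.templates.setdefault(user, []).append(template)

    def select_by_name(self, spoken_name):
        """First technique: the user speaks his or her name."""
        if spoken_name in self.templates:
            self.active_user = spoken_name
        return self.active_user

    def select_by_switch(self, position, user_order):
        """Alternative technique: a multiple-position switch, where each
        position corresponds to one user."""
        self.active_user = user_order[position]
        return self.active_user

    def active_templates(self):
        return self.templates.get(self.active_user, [])
```

The trade-off noted above is visible here: name selection needs extra recognizable words (the names) in the word groups, while switch selection needs only an index into a fixed user order.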

Claims

WHAT IS CLAIMED IS:
1. A voice and pointer operated system comprising the combination of:
a pen responsive to a voice command for providing a voice signal; and
a device having a pointer-responsive area thereon and including voice recognition means for recognizing the voice signal and producing an action routine denoted thereby.
2. A voice and pointer operated system in accordance with claim 1, wherein the pen has a tip region for interacting with the pointer-responsive area of the device and a microphone responsive to the voice command.
3. A voice and pointer operated system in accordance with claim 1, wherein the pointer-responsive area of the device comprises a touch screen.
4. A voice and pointer operated system in accordance with claim 1, wherein the pen is electrically coupled to the device to provide the voice signal thereto.
5. A voice and pointer operated system in accordance with claim 1, wherein the pen includes means responsive to the voice command for transmitting a representation of the voice command, and the device includes means responsive to the transmitted representation of the voice command for providing the voice signal.
6. A voice and pointer operated system in accordance with claim 5, wherein the means for transmitting includes an infrared transmitter and the means responsive to the transmitted representation includes an infrared sensor.
7. A voice and pointer operated system in accordance with claim 5, wherein the pen includes means for converting the voice command into an analog signal and means for modulating the analog signal to produce a transmission signal, and the device includes means for demodulating the transmission signal.
8. A voice and pointer operated system in accordance with claim 1, wherein the voice recognition means includes a reference memory for storing a plurality of reference voice templates, a program memory for storing a control program, and a processor coupled to the reference memory and the program memory for generating an incoming voice template in response to each voice signal produced by the means for producing a voice signal corresponding to the transmitted representation, and for executing the control program to determine whether the incoming voice template is substantially equivalent to one of the reference voice templates, and for selecting one of a plurality of action routines based on the incoming voice template.
9. A method of controlling, by voice command and by pointing, a control device, comprising the steps of:
pointing to a specific area of the control device to interact with and input data into the control device; and
responding to a voice command by providing a corresponding voice signal to the control device.
10. A method of controlling in accordance with claim 9, wherein the steps of pointing and responding are carried out simultaneously.
11. A method of controlling in accordance with claim 9, comprising the further steps of responding to the voice signal at the control device by performing voice recognition of the voice signal and providing an action routine to the control device in accordance with the voice recognition performed on the voice signal.
12. A method of controlling in accordance with claim 9, wherein the step of responding to a voice command by providing a corresponding voice signal to the control device comprises the steps of converting the voice command, transmitting the converted voice command to the control device and converting the transmitted converted voice command into a voice signal at the control device.
13. A pen for interacting with a touch screen, comprising the combination of:
an elongated pen having a tip region for interacting with a touch screen; and
means contained within the pen and responsive to a voice command for providing a voice signal.
14. A pen in accordance with claim 13, wherein the means for providing a voice signal comprises a microphone.
15. A pen in accordance with claim 14, wherein the microphone is mounted at an end of the elongated pen opposite the tip region.
16. A pen in accordance with claim 13, wherein the means for providing a voice signal includes means responsive to the voice command for transmitting a corresponding signal.
17. A pen in accordance with claim 13, wherein the means for providing a voice signal includes means responsive to the voice command for producing a corresponding analog voice signal, means for modulating the analog voice signal and means for transmitting the modulated analog voice signal.
18. A device for operating in response to a voice signal and to interaction with a pen, comprising the combination of:
means responsive to the voice signal for performing voice recognition thereon;
means responsive to the voice recognition for producing an action routine within the device; and
means responsive to interaction with a pen for performing a function with the device.
19. A device in accordance with claim 18, wherein the means responsive to interaction with a pen includes a touch screen mounted on the device.
20. A device in accordance with claim 18, further comprising means for sensing a transmitted modulated audio signal and means for demodulating a sensed signal to produce the voice signal.
21. A method of making entries in a directory stored in an electronic device, the electronic device including a display having a plurality of touch-responsive fields and means for recognizing words spoken by a user, comprising the steps of:
touching a name field on the display;
manually entering a name into the device;
speaking the name to enter it into the device;
entering an address for the name into the device; and
entering a phone number for the name into the device.
22. A method in accordance with claim 21, wherein the step of touching a name field is carried out using a pen, and the step of speaking the name comprises speaking the name into the pen so that the pen transmits a corresponding voice signal to the device.
23. A method in accordance with claim 22, wherein the step of entering an address for the name comprises speaking the address into the pen so that the pen transmits a corresponding voice signal to the device.
24. A method in accordance with claim 22, wherein the step of entering a phone number for the name comprises speaking the phone number into the pen so that the pen transmits a corresponding voice signal to the device.
PCT/US1995/002921 1994-03-17 1995-03-08 Voice/pointer operated system WO1995025326A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US21022194A 1994-03-17 1994-03-17
US08/210,221 1994-03-17

Publications (1)

Publication Number Publication Date
WO1995025326A1 (en) 1995-09-21

Family

ID=22782048

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1995/002921 WO1995025326A1 (en) 1994-03-17 1995-03-08 Voice/pointer operated system

Country Status (1)

Country Link
WO (1) WO1995025326A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4495646A (en) * 1982-04-20 1985-01-22 Nader Gharachorloo On-line character recognition using closed-loop detector
US4811243A (en) * 1984-04-06 1989-03-07 Racine Marsh V Computer aided coordinate digitizing system
US4914704A (en) * 1984-10-30 1990-04-03 International Business Machines Corporation Text editor for speech input
US5063600A (en) * 1990-05-14 1991-11-05 Norwood Donald D Hybrid information management system for handwriting and text
US5148155A (en) * 1990-11-13 1992-09-15 Wang Laboratories, Inc. Computer with tablet input to standard programs
US5309359A (en) * 1990-08-16 1994-05-03 Boris Katz Method and apparatus for generating and utilizing annotations to facilitate computer text retrieval
US5347477A (en) * 1992-01-28 1994-09-13 Jack Lee Pen-based form computer
US5365434A (en) * 1993-06-04 1994-11-15 Carolyn E. Carlson Book enhancer

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5884249A (en) * 1995-03-23 1999-03-16 Hitachi, Ltd. Input device, inputting method, information processing system, and input information managing method
WO1999008175A3 (en) * 1997-08-05 1999-05-20 Assistive Technology Inc Universally accessible computing system
US6128010A (en) * 1997-08-05 2000-10-03 Assistive Technology, Inc. Action bins for computer user interface
WO1999008175A2 (en) * 1997-08-05 1999-02-18 Assistive Technology, Inc. Universally accessible computing system
EP1009102A2 (en) * 1998-12-09 2000-06-14 Robert Bosch Gmbh Communication device
EP1009102A3 (en) * 1998-12-09 2003-04-23 Siemens Aktiengesellschaft Communication device
GB2345558A (en) * 1999-01-05 2000-07-12 Assaf Ahmed Abdel Rahman Portable electronic book reader
EP1037135A2 (en) * 1999-03-18 2000-09-20 Samsung Electronics Co., Ltd. Method of processing user information inputted through touch screen panel of a digital mobile station
US7031756B1 (en) 1999-03-18 2006-04-18 Samsung Electronics Co., Ltd Method of processing user information inputted through touch screen panel of digital mobile station
EP1037135A3 (en) * 1999-03-18 2002-12-04 Samsung Electronics Co., Ltd. Method of processing user information inputted through touch screen panel of a digital mobile station
WO2002079970A3 (en) * 2001-03-30 2003-08-21 Siemens Ag Computer and control method therefor
WO2002079970A2 (en) * 2001-03-30 2002-10-10 Siemens Aktiengesellschaft Computer and control method therefor
KR100454688B1 (en) * 2001-04-25 2004-11-05 지버노트 코포레이션 Wireless pen input device
WO2002093342A2 (en) * 2001-05-16 2002-11-21 Kanitech International A/S A computer control device with optical detection means and such device with a microphone and use thereof
WO2002093342A3 (en) * 2001-05-16 2004-02-26 Kanitech Internat A S A computer control device with optical detection means and such device with a microphone and use thereof
KR100444985B1 (en) * 2001-09-03 2004-08-21 삼성전자주식회사 Combined stylus and method for driving thereof
EP1320025A3 (en) * 2001-12-17 2004-11-03 Ewig Industries Co., LTD. Voice memo reminder system and associated methodology
EP1320025A2 (en) * 2001-12-17 2003-06-18 Ewig Industries Co., LTD. Voice memo reminder system and associated methodology
WO2003085616A1 (en) * 2002-04-08 2003-10-16 France Telecom Mobile terminal and method for remote control of a home gateway using same
FR2838278A1 (en) * 2002-04-08 2003-10-10 France Telecom MULTIMEDIA MOBILE TERMINAL AND METHOD FOR REMOTE CONTROL OF A DOMESTIC GATEWAY USING SUCH A TERMINAL
GB2387747A (en) * 2002-04-19 2003-10-22 Motorola Inc Inputting speech to a hand held device with multi modal user interfaces via a stylus including a directional microphone
GB2387747B (en) * 2002-04-19 2004-07-14 Motorola Inc Microphone arrangement
EP1494209A1 (en) * 2003-06-30 2005-01-05 Harman Becker Automotive Systems GmbH Acoustically and haptically operated apparatus and method thereof
WO2005004112A1 (en) * 2003-06-30 2005-01-13 Harman Becker Automotive Systems Gmbh Acoustically and haptically operated apparatus and method thereof
EP1617409A1 (en) * 2004-07-13 2006-01-18 Microsoft Corporation Multimodal method to provide input to a computing device
US10748530B2 (en) 2004-11-16 2020-08-18 Microsoft Technology Licensing, Llc Centralized method and system for determining voice commands
US9972317B2 (en) 2004-11-16 2018-05-15 Microsoft Technology Licensing, Llc Centralized method and system for clarifying voice commands
US8942985B2 (en) 2004-11-16 2015-01-27 Microsoft Corporation Centralized method and system for clarifying voice commands
US7778821B2 (en) 2004-11-24 2010-08-17 Microsoft Corporation Controlled manipulation of characters
US8082145B2 (en) 2004-11-24 2011-12-20 Microsoft Corporation Character manipulation
GB2428125A (en) * 2005-07-07 2007-01-17 Hewlett Packard Development Co Digital pen with speech input
US9632650B2 (en) 2006-03-10 2017-04-25 Microsoft Technology Licensing, Llc Command searching enhancements
US11816329B2 (en) 2007-01-07 2023-11-14 Apple Inc. Multitouch data fusion
US11481109B2 (en) * 2007-01-07 2022-10-25 Apple Inc. Multitouch data fusion
WO2009105652A3 (en) * 2008-02-22 2009-10-22 Vocollect, Inc. Voice-activated emergency medical services communication and documentation system
WO2009105652A2 (en) * 2008-02-22 2009-08-27 Vocollect, Inc. Voice-activated emergency medical services communication and documentation system
US9171543B2 (en) 2008-08-07 2015-10-27 Vocollect Healthcare Systems, Inc. Voice assistant system
US8521538B2 (en) 2008-08-07 2013-08-27 Vocollect Healthcare Systems, Inc. Voice assistant system for determining activity information
US10431220B2 (en) 2008-08-07 2019-10-01 Vocollect, Inc. Voice assistant system
US8255225B2 (en) 2008-08-07 2012-08-28 Vocollect Healthcare Systems, Inc. Voice assistant system
US8451101B2 (en) 2008-08-28 2013-05-28 Vocollect, Inc. Speech-driven patient care system with wearable devices


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase