WO2001026350A1 - Vocal interface system and method - Google Patents
- Publication number
- WO2001026350A1 (PCT/US2000/026935)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- caller
- service
- information
- business
- database
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4936—Speech interaction details
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2242/00—Special services or facilities
- H04M2242/22—Automatic class or number identification arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/38—Graded-service arrangements, i.e. some subscribers prevented from establishing certain connections
- H04M3/382—Graded-service arrangements, i.e. some subscribers prevented from establishing certain connections using authorisation codes or passwords
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/42025—Calling or Called party identification service
- H04M3/42034—Calling party identification service
- H04M3/42059—Making use of the calling party identifier
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04Q—SELECTING
- H04Q3/00—Selecting arrangements
- H04Q3/72—Finding out and indicating number of calling subscriber
Definitions
- Web pages have, for example, experienced explosive growth. Moreover, the
- VUI: vocal user interface
- VUI implementations tend to be cumbersome
- the present invention comprises a VUI Speech Object Application
- Fig. A1 Depicts a conversational state diagram of an embodiment of a Main Menu Speech Object.
- Fig. A2 Depicts a conversational state diagram of an embodiment of a Login Speech Object.
- Fig. A3 Depicts a conversational state diagram of an embodiment of a Passcode Speech Object.
- Fig. A4 Depicts a conversational state diagram of an embodiment of a New Account Speech Object.
- Fig. B1 Depicts a conversational state diagram of an embodiment of a Traffic Condition Speech Object.
- Fig. B2 Depicts a conversational state diagram of an alternate embodiment of a Traffic Condition Speech Object.
- Fig. C1 Depicts a conversational state diagram of an embodiment of a Business Finder Speech Object.
- Fig. C2 Depicts a conversational state diagram of an embodiment of extended functionality of the Business Finder Speech Object.
- Fig. D1 Depicts a conversational state diagram of an embodiment of a Stock Information Speech Object.
- Fig. D2 Depicts a conversational state diagram of an embodiment of extended functionality of a Stock Information Speech Object.
- Fig. D3 Depicts a conversational state diagram of an embodiment of extended functionality of a Stock Information Speech Object.
- Fig. E1 Depicts a conversational state diagram of an embodiment of a Weather Speech Object.
- Fig. E2 Depicts a conversational state diagram of an embodiment of a List Speech Object for conveying weather information to the caller.
- Fig. F1 Depicts a conversational state diagram of an embodiment of an Address Locating Speech Object.
- Fig. F2 Depicts a conversational state diagram of an embodiment of an Address Disambiguation Speech Object.
- Fig. G1 Depicts a conversational state diagram of an embodiment of a Flight Finder Speech Object.
- Fig. G2 Depicts a conversational state diagram of an embodiment of a Flight Information Speech Object.
- Fig. G3 Depicts a conversational state diagram of an embodiment of an Itinerary Speech Object.
- Fig. H Depicts a conversational state diagram of an embodiment of a Driving Directions Speech Object.
- scalable system architecture that includes at least one each of the following:
- a Vocal User Interface Application Server
- a Telephony Server
- a Speech Recognition Server
- a Text-to-Speech Server
- a Media Server
- API: Application Program Interface
- the presently preferred backbone network comprises a TCP/IP network.
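The listing names a set of cooperating servers on a TCP/IP backbone. A minimal Python sketch of how those pieces might fit together; the class names, methods, and the grammar-to-module table are illustrative assumptions, not interfaces disclosed in the patent:

```python
# Illustrative sketch only: the patent names the servers but does not
# disclose their interfaces; everything below is assumed for clarity.

class SpeechRecognitionServer:
    def recognize(self, audio: str) -> str:
        # Stand-in for real decoding of caller audio against a grammar.
        return audio.lower().strip()

class VUIApplicationServer:
    """Routes a recognized utterance to a Service-Database program module."""

    def __init__(self, recognizer: SpeechRecognitionServer):
        self.recognizer = recognizer
        # Hypothetical grammar-to-module table.
        self.modules = {
            "traffic": "Traffic Condition Module",
            "stocks": "Stock Information Module",
        }

    def handle_utterance(self, audio: str) -> str:
        text = self.recognizer.recognize(audio)
        # Unrecognized utterances fall back to the Main Menu.
        return self.modules.get(text, "Main Menu")

vui = VUIApplicationServer(SpeechRecognitionServer())
```

In the architecture described, the Telephony Server and Media Server would sit in front of and beside this routing layer, bridging the PSTN and conveying prompts back to the caller.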
- a caller connects to the Telephony Server by dialing a telephone
- the Telephony Server includes a Telephone Network Interface to the Public Switched Telephone Network (PSTN).
- FIG. 1 depicts the Telephone Network Interface coupled
- PSTN: Public Switched Telephone Network
- Telephone Network Interface further comprises speech signal processing
- the VUI Application Server comprises hardware under control of a VUI
- the VUI Application implements a vocally navigable Speech Object interface between the caller and the API of the independent Service-Database.
- VUI Application further comprises distinct program
- speech grammars that are particularly germane to the
- a Traffic Condition Module
- a Business Finder Module
- a Stock Finder Module
- the Media Server comprises hardware under program control to store
- the Media Server conveys speech objects to the voice
- Speech Recognition Server comprises hardware under program control to
- VUI Application translates the caller's uttered Service-Database content
- Speech Objects that further comprise reused
- the caller may vocalize a primary specific navigable point or a
- Service-Database (e.g. "Traffic Conditions Database", "Home Menu", or "Stock Information Database")
- the List Speech Object comprises a preamble that will convey
- Further, selection of an item in the list or getting more
- disambiguation in accordance with the present invention comprises a method
- the first step is to convey the ambiguous
- caller is accomplished with appropriate utterance and speech grammar (e.g.
- Disambiguating Speech Object further creates dynamic speech grammars
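The disambiguation method described above — convey the ambiguous matches to the caller, then accept an utterance drawn from a speech grammar built on the fly from those matches — can be sketched as follows. The ordinal "number N" phrases are an assumption about what such a dynamic grammar might contain, not a grammar from the patent:

```python
def build_dynamic_grammar(matches):
    """Build a per-prompt grammar: each ambiguous match becomes an
    acceptable utterance, alongside hypothetical ordinal choices."""
    grammar = {}
    for i, item in enumerate(matches, start=1):
        grammar[item.lower()] = item       # caller repeats the item itself
        grammar[f"number {i}"] = item      # or picks it by position
    return grammar

def disambiguate(utterance, grammar):
    # Return the matched item, or None so the dialogue can re-prompt.
    return grammar.get(utterance.lower())
```

On a miss the Speech Object would re-prompt rather than guess, which is what makes the dynamic grammar safe.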
- the Main Menu Speech Object comprises
- Diagram A depicts several possible transitions depending upon the
- the caller may utter a grammar associated with a
- Service-Database program module, e.g. "Traffic", or with one of
- caller administrative program modules, e.g. "Login", "New Account",
- the Main Menu document confirms the caller's choice (e.g.
- Figure A2 depicts a Login Speech Object (SOLogin A4) that permits
- PIN: personal identification number
- Login Speech Object associates each caller's PIN with their telephone
- Figure A3 depicts a Passcode Speech Object to
- Figure A4 depicts a New Account Speech Object to
- the caller may at any time return to the Main Menu document by
- the caller having to retrace the same navigated path.
- the caller may utter several of the same navigational choices previously
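The global navigation described above — the caller can return to the Main Menu from any point without retracing the navigated path — is essentially a state machine with a set of globally recognized utterances. A hypothetical sketch; the command set and state naming are assumptions:

```python
# Assumed global grammar tokens; the patent only says the caller can
# return to the Main Menu at any time.
GLOBAL_COMMANDS = {"main menu", "home"}

def next_state(current: str, path: list, utterance: str) -> str:
    """Transition that honors global utterances from any state, so the
    caller never retraces the navigated path to reach the Main Menu."""
    if utterance.lower() in GLOBAL_COMMANDS:
        path.clear()
        return "MainMenu"
    path.append(current)
    return f"{current}/{utterance}"
```

Clearing the path on a global escape is what lets a later navigation reuse the same choices from a clean slate.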
- Figure B1 depicts a Traffic Conditions Program Module conversational
- The caller can select the Traffic Speech Object from the above Main Menu Speech Object module by uttering
- the Traffic Speech Object conveys prompts that direct the caller to utterances indicating a region of interest for traffic condition information.
- the Traffic Speech Object first processes the caller's area code
- the Traffic Speech Object prompts the caller to
- Traffic Module confirms a metro area associated with the caller's selected
- the Traffic Module confirms that it will search the traffic
- Service-Database (e.g. SO_GetMetroTraffic B2)
- the Traffic Module will prompt the caller for a new metro area if the caller at this time cancels the pending search by uttering a "cancel"
- the caller can request additional information by uttering "that one.”
- the Traffic Module prompts the caller to optionally perform
- the Traffic Speech Object engages the
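The Traffic Speech Object's first step — processing the caller's area code to infer a default metro area — might look like the sketch below. The area-code table is a made-up example, not data from the patent:

```python
# Hypothetical area-code-to-metro table; the listing states only that
# the Traffic Speech Object first processes the caller's area code.
AREA_CODE_METRO = {
    "415": "San Francisco Bay Area",
    "212": "New York Metro",
    "312": "Chicago Metro",
}

def infer_metro(caller_ani: str):
    """Infer a default metro area from the caller's number; return None
    so the dialogue can prompt the caller for a metro area explicitly."""
    if not caller_ani:
        return None
    area_code = caller_ani.lstrip("+1")[:3]  # drop country-code prefix
    return AREA_CODE_METRO.get(area_code)
```

A miss falls through to the explicit prompt, matching the confirm-then-search flow described for the Traffic Module.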
- Figure C depicts a Business Finder Service Program Module
- the presumption is conveyed to the caller (e.g.
- the caller is prompted (e.g. SO_Brand/Category
- the Business Finder Speech Object automatically first filters out matches that are more than a specified distance away (e.g. more than 50
- the caller is prompted whether a new search is desired.
- the Business Finder Speech Object further includes the ability for the
- Figure C2 depicts a conversational state diagram of this
- the caller may also select a business
- the Business Finder Speech Object audibly confirms the
- the Telephony Server initiates a telephone call to the selected business
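The distance filter described for the Business Finder — discard matches beyond a cutoff (the listing cites 50 miles) before reading the survivors to the caller — is straightforward; the (name, miles) tuple shape is an assumption for illustration:

```python
def filter_matches(matches, max_miles=50.0):
    """Drop matches farther than the cutoff and order the rest
    nearest-first, ready for a List Speech Object to read aloud.

    `matches` is assumed to be (business_name, distance_in_miles) pairs.
    """
    nearby = [m for m in matches if m[1] <= max_miles]
    return sorted(nearby, key=lambda m: m[1])
```

Nearest-first ordering keeps the most likely choice at the top of the audible list, where the caller hears it soonest.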
- Figure D1 depicts a Stock Information Program Module conversational
- a speech object audibly alerts the caller that the Stock Information
- the caller may opt out of the assumption by uttering a speech
- a stock information indicator e.g. company name, ticker symbol,
- the Stock Information Speech Object performs a search of the stock
- the caller may wish to receive detailed information about stocks or abbreviated information.
- Object recognizes both contextually global, non-temporal utterances (e.g.
- Figure D2 depicts a conversational state diagram reflecting additional
- the Stock Sub-Module performs a search of the stock information
- Figure D3 depicts an example of a conversational state diagram for
- Figure E1 depicts the Weather Conditions Speech Object ("Weather
- the Weather Speech Object infers a city for the caller based upon
- the Weather Speech Object retrieves the weather information for
- Figure E2 depicts a List Speech Object for conveying the weather
- Figure F1 depicts the conversation state diagram of the Address Locating Program Module ("Address Speech Object").
- the Address Speech Object is ordinarily transitioned to from another Speech Object that needs to
- Landmarks are preassigned speech grammars that can be both global
- the Airport Finder Speech Object searches the caller's private profile
- the Address Module will access the address associated with
- the Address Module prompts the caller
- Speech Object engages the caller in speech objects that enable the caller to
- Speech Object or alternatively, to begin searching from scratch. For example,
- the caller may change the street name, or the cross street name.
- Figure G1 depicts the conversation state diagram of the Flight Finder Program Module ("Flight Finder Speech Object").
- the caller is greeted and prompted to
- Flight Information Program Module (“Flight Information Speech Object").
- Figure G2 depicts the conversation state diagram of the Flight
- Flight Information Program Module (“Flight Information Speech Object"). Upon a transition to the Flight Information Speech Object, the caller is prompted to
- Flight Information Speech Object transitions to speech objects that prompt the
- Flight Information Speech Object will transition alternate speech
- Flight Information Speech Object will convey the flight status information
- the caller may also pick a flight without any specific information about
- Figure G3 depicts the Itinerary Speech Object
- Object includes speech objects that allow the caller to choose a flight
- Object will engage the caller in a speech object to determine the airline if it is
- the Itinerary Speech Object engages the caller in speech objects to
- Service-Database 60 conveys it to the caller (SOReadRoutes G17).
- the Driving Directions Speech Object determines point-to-point driving
- the Speech Object can be invoked both as a stand-alone program module or from another program module, such as the Business Finder Speech Object.
- If the Speech Object is invoked from another program module such as
- Speech Object contains speech objects to determine either
- the Driving Directions Speech Object interfaces with the API of an
- the caller may also receive the driving directions by
- the caller's driving directions include a particularly long stretch of road
- the Driving Directions Speech Object has the
- a caller may use the Driving Directions Speech Object to determine directions to a particular location, save the driving
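Saving driving directions for later retrieval, as the last item describes, amounts to a small per-caller store keyed by a label. A hypothetical sketch; the patent does not specify the profile's structure:

```python
class CallerProfile:
    """Assumed per-caller store: the listing says saved directions can be
    retrieved later without re-querying the Service-Database."""

    def __init__(self):
        self.saved_directions = {}

    def save(self, label: str, steps: list):
        # Copy defensively so later edits to the caller's list don't
        # mutate the stored itinerary.
        self.saved_directions[label] = list(steps)

    def retrieve(self, label: str):
        # An unknown label returns an empty route so the dialogue can
        # prompt the caller instead of failing.
        return self.saved_directions.get(label, [])
```

In the described system the Login Speech Object's PIN would select which caller's profile to load before any save or retrieve.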
Abstract
Description
Claims
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00965546A EP1224797A1 (en) | 1999-10-01 | 2000-09-28 | Vocal interface system and method |
AU76248/00A AU7624800A (en) | 1999-10-01 | 2000-09-28 | Vocal interface system and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15696899P | 1999-10-01 | 1999-10-01 | |
US60/156,968 | 1999-10-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001026350A1 (en) | 2001-04-12 |
Family
ID=22561826
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/026935 WO2001026350A1 (en) | 1999-10-01 | 2000-09-28 | Vocal interface system and method |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1224797A1 (en) |
AU (1) | AU7624800A (en) |
WO (1) | WO2001026350A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1253569A2 (en) * | 2001-04-23 | 2002-10-30 | Hitachi, Ltd. | Audio interactive navigation system, mobile terminal device, and audio interactive server |
FR2827695A1 (en) * | 2001-07-23 | 2003-01-24 | France Telecom | Telecommunication services portal with server using speech recognition and associated navigation services, uses telephone link and GPS data as input to server which delivers navigation information taking account of traffic information |
WO2017132660A1 (en) * | 2016-01-29 | 2017-08-03 | Liquid Analytics, Inc. | Systems and methods for dynamic prediction of workflows |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0175503A1 (en) * | 1984-09-06 | 1986-03-26 | BRITISH TELECOMMUNICATIONS public limited company | Method and apparatus for use in interactive dialogue |
WO1998024225A1 (en) * | 1996-11-28 | 1998-06-04 | British Telecommunications Public Limited Company | Interactive apparatus |
US5774860A (en) * | 1994-06-27 | 1998-06-30 | U S West Technologies, Inc. | Adaptive knowledge base of complex information through interactive voice dialogue |
EP0895396A2 (en) * | 1997-07-03 | 1999-02-03 | Texas Instruments Incorporated | Spoken dialogue system for information access |
EP0922279A2 (en) * | 1997-01-09 | 1999-06-16 | Scansoft, Inc. | Method and apparatus for executing a human-machine dialogue in the form of two-sided speech as based on a modular dialogue structure |
WO2000064137A1 (en) * | 1999-04-21 | 2000-10-26 | Ranjeet Nabha | Method and system for the provision of internet-based information in audible form |
- 2000
- 2000-09-28 AU AU76248/00A patent/AU7624800A/en not_active Abandoned
- 2000-09-28 EP EP00965546A patent/EP1224797A1/en not_active Withdrawn
- 2000-09-28 WO PCT/US2000/026935 patent/WO2001026350A1/en not_active Application Discontinuation
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0175503A1 (en) * | 1984-09-06 | 1986-03-26 | BRITISH TELECOMMUNICATIONS public limited company | Method and apparatus for use in interactive dialogue |
US5774860A (en) * | 1994-06-27 | 1998-06-30 | U S West Technologies, Inc. | Adaptive knowledge base of complex information through interactive voice dialogue |
WO1998024225A1 (en) * | 1996-11-28 | 1998-06-04 | British Telecommunications Public Limited Company | Interactive apparatus |
EP0922279A2 (en) * | 1997-01-09 | 1999-06-16 | Scansoft, Inc. | Method and apparatus for executing a human-machine dialogue in the form of two-sided speech as based on a modular dialogue structure |
US6035275A (en) * | 1997-01-09 | 2000-03-07 | U.S. Philips Corporation | Method and apparatus for executing a human-machine dialogue in the form of two-sided speech as based on a modular dialogue structure |
EP0895396A2 (en) * | 1997-07-03 | 1999-02-03 | Texas Instruments Incorporated | Spoken dialogue system for information access |
WO2000064137A1 (en) * | 1999-04-21 | 2000-10-26 | Ranjeet Nabha | Method and system for the provision of internet-based information in audible form |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1253569A2 (en) * | 2001-04-23 | 2002-10-30 | Hitachi, Ltd. | Audio interactive navigation system, mobile terminal device, and audio interactive server |
EP1253569A3 (en) * | 2001-04-23 | 2004-03-31 | Hitachi, Ltd. | Audio interactive navigation system, mobile terminal device, and audio interactive server |
US7076362B2 (en) | 2001-04-23 | 2006-07-11 | Hitachi, Ltd. | Audio interactive navigation system, moving terminal device, and audio interactive server |
FR2827695A1 (en) * | 2001-07-23 | 2003-01-24 | France Telecom | Telecommunication services portal with server using speech recognition and associated navigation services, uses telephone link and GPS data as input to server which delivers navigation information taking account of traffic information |
WO2003010494A1 (en) * | 2001-07-23 | 2003-02-06 | France Telecom | Telecommunication service portal comprising a voice recognition server and navigation and guidance equipment using said portal |
WO2017132660A1 (en) * | 2016-01-29 | 2017-08-03 | Liquid Analytics, Inc. | Systems and methods for dynamic prediction of workflows |
US10339481B2 (en) | 2016-01-29 | 2019-07-02 | Liquid Analytics, Inc. | Systems and methods for generating user interface-based service workflows utilizing voice data |
Also Published As
Publication number | Publication date |
---|---|
AU7624800A (en) | 2001-05-10 |
EP1224797A1 (en) | 2002-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210201932A1 (en) | Method of and system for real time feedback in an incremental speech input interface | |
US7627096B2 (en) | System and method for independently recognizing and selecting actions and objects in a speech recognition system | |
US7450698B2 (en) | System and method of utilizing a hybrid semantic model for speech recognition | |
US6708150B1 (en) | Speech recognition apparatus and speech recognition navigation apparatus | |
US9202247B2 (en) | System and method utilizing voice search to locate a product in stores from a phone | |
US7376640B1 (en) | Method and system for searching an information retrieval system according to user-specified location information | |
KR100383352B1 (en) | Voice-operated service | |
US6246986B1 (en) | User barge-in enablement in large vocabulary speech recognition systems | |
US6944594B2 (en) | Multi-context conversational environment system and method | |
JP5315289B2 (en) | Operating system and operating method | |
US20030191639A1 (en) | Dynamic and adaptive selection of vocabulary and acoustic models based on a call context for speech recognition | |
US7783475B2 (en) | Menu-based, speech actuated system with speak-ahead capability | |
US20030149566A1 (en) | System and method for a spoken language interface to a large database of changing records | |
US7689425B2 (en) | Quality of service call routing system using counselor and speech recognition engine and method thereof | |
US20020143548A1 (en) | Automated database assistance via telephone | |
EP2289231A1 (en) | A system and method utilizing voice search to locate a procuct in stores from a phone | |
US8428241B2 (en) | Semi-supervised training of destination map for call handling applications | |
TWI698756B (en) | System for inquiry service and method thereof | |
US20200193984A1 (en) | Conversation guidance method of speech recognition system | |
WO2001026350A1 (en) | Vocal interface system and method | |
JP2003016087A (en) | Operating method for automatic sector information system and the system | |
KR20020077422A (en) | Distributed speech recognition for internet access | |
JP2008216461A (en) | Speech recognition, keyword extraction, and knowledge base retrieval coordinating device | |
JP2003223187A (en) | Method of operating speech dialogue system | |
JP2001134285A (en) | Speech recognition device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 10089753 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2000965546 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 2000965546 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 2000965546 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: JP |