WO1999032203A1 - A standalone interactive toy - Google Patents

A standalone interactive toy

Info

Publication number
WO1999032203A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
toy
interactive
response
doll
Prior art date
Application number
PCT/IL1998/000617
Other languages
French (fr)
Inventor
Amir Schorr
Shie Manor
Sharon Fridman
Original Assignee
Smartoy Ltd.
Priority date
Filing date
Publication date
Application filed by Smartoy Ltd.
Priority to AU15754/99A
Publication of WO1999032203A1

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63H TOYS, e.g. TOPS, DOLLS, HOOPS OR BUILDING BLOCKS
    • A63H3/00 Dolls
    • A63H3/28 Arrangements of sound-producing means in dolls; Means in dolls for producing sounds
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63H TOYS, e.g. TOPS, DOLLS, HOOPS OR BUILDING BLOCKS
    • A63H2200/00 Computerized interactive toys, e.g. dolls

Definitions

  • following the question mechanism of Fig. 5: if the correct answer is given, the relevant databases, including the characteristics database, are updated (step 106). If the wrong answer is given, a decision is made whether to ask the question again (query box 108). The decision is based on the maximal number of questions allowed, on the user's intelligence and understanding, and on the doll's characteristics (such as patience).
  • if the question is to be asked again, the kind of question to ask (Open, Enumerate or Yes/No) (step 104) is determined according to the user's intelligence and understanding (step 110), and the loop of steps 104-110 continues.
  • in step 106, other parameters, including the number of consecutive questions left unanswered and the number of Yes/No questions answered incorrectly, are also updated.
  • a further feature of the interactive toy is a software driven mechanism called the "What Next" mechanism, which allows the doll 11 to decide what to do after a specific activity has ended and whether and how to interrupt an ongoing activity.
  • the "What Next” mechanism uses a technique called “RuleBase”, which is based on prior gained knowledge of the user and which checks any of a plurality of possible conditions have occurred and responds accordingly.
  • A. Non-User Response - If the user did not react or respond within a pre-determined period, for example, if the user neither switched the doll off nor called or verbally addressed the doll to suggest a new activity, the mood of the doll 11 can be changed.
  • B. Mood Changes - A check is made for periodic or random changes in mood. Preferably, the mood is not changed constantly, but only after a pre-determined time period.
  • C. Action Determination - An action decision is made.
  • the action decision may include making a funny remark, asking the user to decide what to do, or suggesting an activity to the user. i) If the user is asked to decide for himself, the question asked by the doll is weighted towards the user's preferred activities. The doll decides whether to accept the user's suggestion according to the mood of the doll and/or the user's recent behavior. ii) If the doll suggests a new activity to the user, activities that were not used recently are suggested first. Alternatively, the doll may insist on playing a certain activity if it is in a stubborn mood, for example.
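None of this logic is disclosed as code in the patent; purely as an illustration, such a rule base could be arranged as an ordered condition/action table. All names, thresholds and state fields below are invented:

```python
# Hypothetical sketch of a "RuleBase": an ordered list of (condition, action)
# pairs, checked in turn once an activity ends.

def no_user_response(state):
    # A. Non-User Response: nothing heard within the pre-determined period.
    return state["idle_seconds"] > state["response_timeout"]

def mood_change_due(state):
    # B. Mood Changes: only re-roll the mood after a pre-determined interval.
    return state["seconds_since_mood_change"] > state["mood_period"]

RULES = [
    (no_user_response, "change_mood"),
    (mood_change_due, "update_mood"),
    (lambda state: True, "decide_action"),  # C. default: remark, ask, or suggest
]

def what_next(state):
    for condition, action in RULES:
        if condition(state):
            return action

# Example run with invented state values.
print(what_next({"idle_seconds": 45, "response_timeout": 30,
                 "seconds_since_mood_change": 10, "mood_period": 120}))
# -> "change_mood"
```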
  • Figs. 6A - 6C are flow chart illustrations of an embodiment of the functional operation of the doll 11.
  • once the doll 11 is switched on (step 200), initially by an on/off switch connected to the power control 36, the doll 11 is activated for play (step 202).
  • Vocal activation may be initiated either by the doll itself (step 208) or by the user (child) (step 210).
  • Non-vocal operation is initiated by the user activating one of the sensors 30 or buttons 48. After the doll 11 initiates a request (step 208), a sub-routine 211 (Fig. 6B) is performed.
  • the doll 11 will wait for the user to respond (query box 212). If there is a hiatus in the play between the doll and the child, after a pre-selected period of time the doll will make the selection itself (step 214). Depending on the routine logic, the routine will wait for the user's response (query box 213) or return to central control (step 215).
  • the routine then returns to the central control (step 202) to await a further response (vocal or non-vocal).
  • the type and length of question is determined by the programming and can, for example, be a single word such as "hello", or a longer phrase such as "How are you?", "Who are you?", "What game do you want to play today?" or "Let me give you a choice".
  • the doll can be configured so that, by default, it initially waits for a pre-selected time period for the user to start a conversation. For example, if the child wishes to play with the doll, the child can activate the doll vocally (step 210) by calling it by name. If the child uses a longer phrase or a couple of words strung together, the doll will react according to how the child makes the request. In other words, the doll interacts with the child. If, for example, the child says "play with me", the doll may well inquire "What do you want to play? A game? Or do you want me to sing a song?".
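As a loose, non-authoritative sketch of this central-control flow (the doll initiating a request, waiting a pre-selected period, then choosing for itself on a hiatus), one might write:

```python
import time

def central_control(speak, listen, select_default, wait_period=10.0):
    # Hypothetical sketch of the Fig. 6A/6B flow: the doll initiates a request
    # (step 208), waits for the user to respond (query box 212), and on a
    # hiatus makes the selection itself (step 214). All names are invented.
    speak("What game do you want to play today?")
    deadline = time.monotonic() + wait_period
    while time.monotonic() < deadline:
        utterance = listen()          # vocal activation by the child (step 210)
        if utterance:
            return utterance          # hand the request to speech recognition
        time.sleep(0.1)
    return select_default()           # the doll chooses, e.g. "sing a song"

# Example with stub input/output functions and a shortened wait.
choice = central_control(print, lambda: None, lambda: "sing a song",
                         wait_period=0.3)
print(choice)  # -> "sing a song" after the timeout
```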
  • the response the doll makes can also be time and date dependent.
  • the doll can greet the user on her birthday by singing "happy birthday” or suggest playing a specific game on a specific date such as Christmas.
  • a sub-routine 217 (Fig. 6C) is performed.
  • the child's vocal request is analyzed and processed by the speech recognition module 50 (step 218). If the word or phrase is 'understood', that is, recognized by the speech recognition program 64 in conjunction with the word library 62, the doll 11 will decide on a response (step 220).
  • the doll's response will depend on the application program, as described in detail hereinabove (Fig. 5). The response may be in the form of a question to the user or the doll may begin 'playing the game' or 'singing the song' requested, for example.
  • the main processing blocks used for the doll to speak include the speech synthesis 52, music synthesis 54 and sound effects generation 56 blocks, together with the synthesizer library 66 and synthesizer software 68. If the speech recognition module 50 does not match the user's speech with its word library 62, the speech recognition program 64 can "recognize" the new words spoken and store them in the non-volatile memory unit 28 (step 222).
  • the speech recognition module 50 is basically a combination of recognition engines, language and environment dependent databases and application programming interfaces.
  • the recognition engine has four processing blocks: data acquisition and pre-processing, feature extraction, acoustic match and dynamic programming.
  • Data acquisition converts the spoken analog signal into digital form.
  • Pre-processing such as automatic gain control, echo cancellation and voice activity detection, for example, may be applied to the signal.
  • the relevant acoustic phonetic information is captured in a feature vector every few milliseconds. The algorithms are adapted to the recording channel and to the suppression of background noise.
  • the acoustic match block computes a probabilistic matching score between each feature vector and the smallest acoustic phonetic units known to the recognizer, such as context independent and context dependent phonemes and word models.
  • the dynamic programming block finds the best match and alignment for the speech input on a word and sentence level by accumulating the acoustic scores over time within the restrictions imposed by lexical and syntactic constraints.
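The four blocks can be pictured as a pipeline. The toy sketch below is not the Lernout and Hauspie engine; each block is reduced to a trivial stand-in just to show the data flow, and all models and numbers are invented:

```python
def recognize(samples, word_models, frame=10):
    # Data acquisition/pre-processing: normalize the digitized signal
    # (stands in for gain control, echo cancellation and voice activity
    # detection).
    peak = max(abs(s) for s in samples) or 1.0
    normalized = [s / peak for s in samples]
    # Feature extraction: one crude "feature" per frame (a real engine
    # extracts an acoustic-phonetic feature vector every few milliseconds).
    features = [sum(normalized[i:i + frame]) / frame
                for i in range(0, len(normalized) - frame + 1, frame)]
    # Acoustic match: score between the features and each word model
    # (here a plain negative squared distance, not a real probability).
    def score(model):
        return -sum((f - m) ** 2 for f, m in zip(features, model))
    # Dynamic programming: reduced here to picking the best-scoring word;
    # a real engine aligns over time under lexical/syntactic constraints.
    return max(word_models, key=lambda w: score(word_models[w]))

# Example with invented 3-frame "models" and a 30-sample "signal".
models = {"song": [0.9, 0.1, 0.5], "game": [0.2, 0.8, 0.3]}
print(recognize([0.9] * 10 + [0.1] * 10 + [0.5] * 10, models))  # -> "song"
```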
  • the text to speech (TTS) technology converts computer readable text stored in the TTS database 72 into synthetic speech in stages, first converting the text into a phonetic transcription, calculating the speech parameters and finally using these parameters to generate synthetic speech signals.
  • the text to speech (TTS) technology is also capable of adding vocal and tonal enhancements, referred to as 'improved TTS'.
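A minimal sketch of those stages, assuming a toy letter-to-phoneme table in place of real pronunciation rules (nothing here reflects the actual TTS3000 interface):

```python
# Stage 1: text -> phonetic transcription (toy rule table, invented).
LETTER_TO_PHONEME = {"h": "hh", "e": "eh", "l": "l", "o": "ow"}

def to_phonemes(text):
    return [LETTER_TO_PHONEME.get(ch, ch) for ch in text.lower() if ch.isalpha()]

# Stage 2: phonetic transcription -> speech parameters. Rate and pitch are
# exposed so the processor could adjust them, as the 'improved TTS' passage
# notes.
def to_parameters(phonemes, rate=1.0, pitch=1.0):
    return [(ph, 0.1 / rate, 100.0 * pitch) for ph in phonemes]

# Stage 3: parameters -> synthetic speech signal. A real synthesizer emits
# waveform samples; this stand-in just summarizes what would be generated.
def synthesize(parameters):
    duration = sum(d for _, d, _ in parameters)
    return f"{len(parameters)} phonemes, {duration:.2f} s of synthetic speech"

print(synthesize(to_parameters(to_phonemes("hello"))))
# -> "5 phonemes, 0.50 s of synthetic speech"
```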
  • the doll may be powered up initially by the on/off switch connected to the power control 36 (step 200) and activated non-vocally by pressing one of the plurality of operationally defined buttons 48 (steps 230, 232).
  • a plurality of buttons 48, each of which has a different function, can be placed on the doll 11 in any suitable position, such as on a belt 80 around the doll's waist (Fig. 1A).
  • buttons 48 can be configured with operational functions, such as "stop", "continue", "on" and "off".
  • buttons can be configured for specific reaction functions, for example "sing a song", or "play a game”.
  • the doll 11 is configured to switch itself off after a pre-determined period of inactivity, that is, after not receiving any reaction from the child/user.
  • the non-vocal routine, for example, can be used to terminate operations. Referring to Fig. 6A, the non-vocal routine awaits a user response (steps 230, 232). If a button is not pressed within a pre-determined time period (loop steps 234, 230), the program terminates (step 236).
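An entirely hypothetical rendering of this non-vocal loop and its inactivity timeout:

```python
import time

def non_vocal_routine(poll_button, handle, idle_limit=60.0):
    # Hypothetical sketch of steps 230-236: await button presses and switch
    # the doll off after a pre-determined period of inactivity.
    last_activity = time.monotonic()
    while time.monotonic() - last_activity < idle_limit:   # loop steps 234, 230
        button = poll_button()                             # steps 230, 232
        if button == "off":
            break
        if button is not None:
            handle(button)                                 # e.g. "sing a song"
            last_activity = time.monotonic()
        time.sleep(0.05)
    print("program terminates (step 236): doll switches off")

# Example: no button is ever pressed, so the routine times out quickly.
non_vocal_routine(lambda: None, print, idle_limit=0.2)
```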

Abstract

A standalone interactive toy (11) having an adaptive personality and utilizing artificial intelligence which responds to the user and which has learning capabilities. The toy is capable of initiating communication with the user, adapting to the user and developing an individual 'personality' through interaction with the user. The standalone interactive toy includes a plurality of characteristics which include, inter alia, a plurality of interaction characterizations between the user and the toy; means for receiving and characterizing vocal input from the user; means for changing the adaptive personality, thereby creating a current personality characteristic for the toy; memory storage means; means for generating a response which is responsive to the characterized vocal input and the current personality characteristic of the toy; and audio output means for outputting the response.

Description

A STANDALONE INTERACTIVE TOY
FIELD OF THE INVENTION
The present invention relates to talking toys generally and more particularly to an improved interactive toy with artificial intelligence capable of learning from its interaction with a child.
BACKGROUND OF THE INVENTION
It has long been recognized that for a toy doll or figure to appear more lifelike to a child, there should be interaction between the child and the figure. It is known to utilize simulated speech which produces sounds in response to movement of the figure or to select pre-recorded messages in response to activation by the user.
US Patent No. 4,696,653 to McKeefery describes a toy doll which responds with spoken words or phrases to the touching of selected areas of the doll by the user or in response to the user's voice. The doll responds to being touched on specific parts of its body with one of two phrases related to the particular part of the doll's body being touched. When the user speaks to the doll, the doll replies with a randomly selected sentence.
US Patent No. 4,923,428 to Curran describes an interactive talking toy which simulates interaction with the child by appearing to respond to input from the child. The speech and movement of body parts of the toy are controlled by multiple audio tracks. The particular audio track reproduced at any time is directed by a processor on the basis of a response from a human.
The existing prior art is restricted to passively responding to physical or vocal activation by the user. Further, existing interactive toys are limited in their possible responses, which reduces the child's interest after a short period.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a standalone interactive toy such as a doll configured to talk to the user and respond to vocal input from the user. A further object of the present invention is to provide an interactive toy utilizing artificial intelligence which responds to the user and which has learning capabilities. The toy is capable of initiating communication with the user, adapting to the user and developing an individual "personality" and other characteristics through interaction with the user. There is thus provided, in accordance with a preferred embodiment of the invention, a standalone interactive toy having an adaptive personality which includes a plurality of characteristics, the plurality of characteristics representing at least one of a group including a plurality of personality characteristics of the toy, a plurality of personality characteristics of an interactive user and a plurality of interaction characterizations between the user and the toy; means for receiving and characterizing vocal input from the user; means for changing the adaptive personality thereby creating a current personality characteristic for the toy; memory storage means for storing the current personality characteristic of the toy and the plurality of characteristics; means for generating a response, the generated response being responsive to the characterized vocal input and the current personality characteristic of the toy and audio output means for outputting the response.
Furthermore, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes means for amending the plurality of characteristic means according to at least one of a group of environment conditions including random changes, time based changes, direct and indirect feedback from the interacting user, periodic changes and programmed changes.
In addition, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes decision making means wherein the generated response is responsive to the decision making means. The decision making means is adjustable in accordance with the vocal input from the user and/or is adjustable in accordance with the usage of each of the plurality of characteristics. Also, the decision making means are determined by one of a group of conditions including randomization, weighted randomization, the usage of each of the plurality of characteristics and on the basis of which of the plurality of characteristics has been recently selected.
Furthermore, in accordance with a preferred embodiment of the invention, the toy further includes a plurality of question types and means for determining which of the plurality of question types is to be asked by the toy.
Furthermore, in accordance with a preferred embodiment of the invention, the toy further includes means for assessing the interactive user's response to each of the plurality of question types, for determining whether a subsequent question should be asked and means for formulating the subsequent question to ask. The interactive user's response includes one of a group of responses including non-response or non-reaction within a pre-determined period.
Furthermore, in accordance with a preferred embodiment of the invention, the toy further includes means for determining the subsequent step to take after a specific activity has ended and for deciding whether and how to interrupt an ongoing activity.
Furthermore, in accordance with a preferred embodiment of the invention, the initial vocal response is generated on the toy being powered up.
Furthermore, in accordance with a preferred embodiment of the invention, the processor is any one of a group of processors including a central processing unit (CPU), a digital signal processor (DSP), microprocessor or micro-controller. Additionally, in accordance with a preferred embodiment of the invention, the memory storage means includes at least one of a group of memory storage units including a ROM (read only memory) unit, a ROM paging unit, and a RAM (random access memory) unit. The memory storage means includes interactive computer games. Furthermore, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes a non-volatile memory unit. The non-volatile memory unit includes at least one of a group including flash memory and EEPROM memory.
Furthermore, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes a replaceable cartridge unit and/or a PC computer connectable to the processor. The cartridge unit and/or PC computer includes interactive computer games. Also, the cartridge unit and/or PC computer includes at least one of a group of memory storage units including flash memory, EEPROM memory and ROM memory. The interactive computer games are time and date dependent. The standalone interactive toy may further include a communication link coupled to the processor and an external computer coupled to the communication link.
Furthermore, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes at least one of a group of actuators responsive to activation by the user including a plurality of sensors, indicators and a communication link connected to the processor. The sensors comprise touch sensitive contacts having specific operating functions.
Additionally, in accordance with a preferred embodiment of the invention, the ROM (read only memory) unit includes at least one of a group including: a library of recognition words; software for activating the recognition words library, a synthesizer library; software for activating the synthesizer library, application software for controlling and activating the doll, and a text database storing data for the use with the vocal input converting means.
Furthermore, in accordance with a preferred embodiment of the invention, the vocal input converting means includes a text-to-speech (TTS) converter.
Additionally, in accordance with a preferred embodiment of the invention, the processor includes at least one of a group including: modules for speech recognition, artificial intelligence, music synthesis, speech synthesis and digital sound processing. Furthermore, in accordance with a preferred embodiment of the invention, the standalone interactive toy further includes a clock and a scheduler for scheduling games and activities dependent on time and date.
In addition, there is thus provided, in accordance with a preferred embodiment of the invention, a method of interactive communication between an adaptive standalone interactive toy and a user, the method comprising the steps of: creating a plurality of characteristics, the plurality of characteristics representing at least one of a group including a plurality of personality characteristics of the toy, a plurality of personality characteristics of an interactive user and a plurality of interaction characterizations between the user and the toy; creating a current personality characteristic for the toy; receiving and characterizing the vocal input from the user; storing the current personality characteristic of the toy and the plurality of characteristics; generating a response, the generated response being responsive to the characterized vocal input and the current personality characteristic of the toy; and outputting the response.
Furthermore, in accordance with a preferred embodiment of the invention, the method of interactive communication further includes the step of amending the plurality of characteristic means according to at least one of a group of environment conditions including random changes, time based changes, direct and indirect feedback from said interacting user, periodic changes and programmed changes. Furthermore, in accordance with a preferred embodiment of the invention, the method of interactive communication further includes the step of decision making, the step of generating a response being responsive to the step of decision making.
Additionally, in accordance with a preferred embodiment of the invention, the method of interactive communication further includes the step of determining whether a subsequent question should be asked; and if so, determining which of a plurality of question types is to be asked by said toy.
Furthermore, in accordance with a preferred embodiment of the invention, the method further includes the step of assessing said interactive user's response to each of said plurality of question types and the step of formulating the subsequent question to ask.
Furthermore, in accordance with a preferred embodiment of the invention, the method further includes the step of determining what step to take after a specific activity has ended and/or determining whether and how to interrupt an ongoing activity.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the appended drawings in which: Fig. 1A illustrates an interactive speaking doll, in accordance with a preferred embodiment of the invention;
Fig. 1 B is a high level block diagram illustration of the main elements of the interactive speaking doll of Fig. 1A;
Fig. 2 is a block diagram illustration of the electronic components of the doll of Fig. 1A;
Fig. 3 is a schematic representation of the software application components which are processed by the electronic components of the doll of Fig. 1;
Fig. 4 is a schematic representation of the contents of the ROM and RAM units of the electronic components of the doll of Fig. 1;
Fig. 5 is a high level flow chart illustration of the question asking mechanism; and
Figs. 6A, 6B and 6C are flow chart diagram illustrations of the functional operation of the doll of Fig. 1.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
Reference is now made to Figs. 1A, 1B and 2, which illustrate the interactive speaking doll, generally referenced 11, in accordance with a preferred embodiment of the invention. The mechanism, generally designated 10, for operating the interactive speaking doll in accordance with a preferred embodiment of the invention is embodied within the body 12 of doll 11, illustrated in Fig. 1A. Fig. 1B is a high level block diagram illustration of the main elements of the interactive speaking doll 11. Fig. 2 is a block diagram illustration of the electronic components of the mechanism 10. The mechanism 10 is controlled by a processor 20. For the purposes of example only, processor 20 is hereinbelow referred to as a central processing unit (CPU). The processor 20 may be any processing unit, including a central processing unit (CPU), a digital signal processor (DSP), a microprocessor or a micro-controller. The interactive speaking doll 11 is configured to talk to the user and to respond to the vocal commands of the user via text-to-speech (TTS) and voice recognition software, respectively, executed by the processor 20. The interactive speaking doll 11 is a standalone doll and does not need to be connected to another unit, such as a computer or video game, to operate. A feature of the interactive speaking doll 11 of the present invention is its use of artificial intelligence (AI) and its machine learning capabilities. The doll 11 is capable of initiating communication with the user, adapting itself and developing an individual "personality" through interaction with the user. The various features of the doll will be described in detail hereinbelow. The mechanism 10 comprises a microphone 14 (for audio input) and a speaker 16 (for audio output) linked via a codec 18 to processor 20.
Processor 20 comprises a vocal input converter 90, a response generator 92 and a text to speech (TTS) converter 94.
The electronic components of the mechanism 10 further comprise memory storage means, generally designated 19 (Fig. 1A), comprising a ROM (read only memory) unit 22, a ROM paging unit 24, a RAM (random access memory) unit 26 and a non-volatile memory unit 28 (best seen in Fig. 2). In addition, the doll 11 comprises a plurality of sensors 30, indicators 32 and a communication link 34, all of which are connected to the central processing unit (CPU) 20.
The mechanism 10 is controlled by a power control unit 36 and powered by a battery power supply 37. The mechanism 10 further comprises clock units 42 and 43 for controlling the timing of the processor 20 and codec 18, respectively. Optionally, a cartridge unit 38 and/or a computer (PC) 40 may also be connected to processor 20. Optionally, communication link 34 can be coupled to an external computer (PC) 35. A microphone amplifier 44 is connected between microphone 14 and codec 18 for amplifying the analog signals prior to being decoded. A second amplifier 45 is used to amplify the doll's voice signals being broadcast via the speaker 16.
Codec 18 is a standard coder/decoder device, known in the art, for converting analog signals to digital form for transmission and, vice versa, for converting digital signals to analog form.
The sensors 30 comprise touch sensitive contacts, generally designated 46, which are disposed beneath various parts of the doll 11, such as the eyes, ears and mouth, for example. The sensors 30 may also optionally comprise buttons, generally designated 48, having specific operating functions, which may be activated by the user.
The indicators 32 may comprise light emitting diodes (LEDs) to attract the user's attention to the doll 11.
Reference is now also made to Figs. 3 and 4. Fig. 3 is a schematic representation of the various software application components which are processed by processor 20. Fig. 4 is a schematic representation of the contents of the ROM unit 22 and RAM unit 26.
Processor 20 is configured to process software applications and comprises modules for speech recognition 50, speech synthesis 52, music synthesis 54, the generation of sound effects 56, artificial intelligence 58 and digital sound processing 60. The vocal input converter 90 (Fig. 1 B) of the processor 20 analyzes and processes the speech of the user using the recognition module 50 and speech recognition program 64 in conjunction with the word library 62.
Response generator 92 utilizes software driven algorithms to decide on the appropriate response. The "decision making" algorithms, which are described in greater detail below, utilize techniques from control theory and machine learning to render an appropriate response.
The text to speech (TTS) converter 94 utilizes a combination of speech synthesis 52, music synthesis 54, sound effect blocks 56 together with the synthesizer library 66 and synthesizer software 68 to output the response via speaker 16.
The memory space available in ROM unit 22 is suitably partitioned and is sufficiently large to store the required libraries and programs. ROM unit 22 comprises at least the following elements: a) a library of recognition words 62 and software 64 for activating the recognition words library 62, b) a synthesizer library 66 and software 68 for activating the synthesizer library, c) application software 70 for controlling and activating the doll, and d) a text database 72 for use with the Text-to-Speech (TTS) software. The memory space available in RAM unit 26 is suitably partitioned into a main application work area 74, a synthesis work area 76 and a recognition work area 78 for use by the response generator 92, the text to speech (TTS) converter 94 and the vocal input converter 90.
The recognition words library 62 comprises a database containing strings of words or phrases. The extent of the database and the capability of the speech recognition software 50 are dependent upon the processing speed and memory capacity available. Speech recognition software 50, such as the automatic speech recognition (ASR) products from Lernout and Hauspie of Massachusetts, USA, is commercially available. Briefly, automatic speech recognition (ASR) is a process by which machines understand spoken input; it is generally capable of recognizing naturally and fluently spoken words, continuous digits, isolated words, key words and the alphabet.
Automatic speech recognition is speaker independent; that is, it is capable of being used by different speakers without prior training. ASR can adapt to changes in background noise level and in the user's manner of speech, pronunciation, tone of speech and pitch. In addition to recognizing individual words, the ASR also has the capability of processing and recognizing a string of naturally and continuously spoken words. A further feature of ASR is its capability to recognize isolated words and to identify key words from a group of surrounding words, other sounds and continuous speech. For example, the ASR could recognize the word "song" in a phrase such as "I want a happy song".
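Once the recognizer returns a phrase as text, key-word spotting can be reduced to a simple lookup. A minimal sketch, with an invented key-word list:

```python
KEY_WORDS = {"song", "game", "story"}  # invented examples

def spot_keywords(recognized_phrase):
    # Pick known key words out of a continuously spoken phrase.
    words = (w.strip(".,?!").lower() for w in recognized_phrase.split())
    return [w for w in words if w in KEY_WORDS]

print(spot_keywords("I want a happy song"))  # -> ['song']
```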
The text to speech (TTS) software converts computer readable text stored in the TTS database 72 into computer generated speech. Preferably, the text to speech is phoneme based, capable of reading any word whose pronunciation is consistent with the general rules of pronunciation for the particular language. The available vocabulary is limited only by the capabilities of the processor and the size of the memory storage. TTS attempts to mimic natural sounding speech and, in an alternative embodiment, the speech output (speaker volume, speech rate and speech pitch) is adjustable by the processor 20.
Text to speech (TTS) modules are commercially available, for example the TTS3000 module from Lernout and Hauspie, Massachusetts, USA.
The use of a processor 20 together with text-to-speech (TTS) and voice recognition software as integral components of the doll 11 allows the doll to interact with the user. Specifically, the response generator 92 of the interactive speaking doll 11 of the invention utilizes algorithms and functions which include the following elements:
1. Personality Characteristic development;
2. Decision Making algorithms;
3. Question and Answer mechanisms; and
4. What Next (WN) mechanism.
These specific features will now be described.
Personality Characteristic Development
Using the principles of Vector Quantization, that is, using a short vector of numbers to represent a complex system, each of the personality 'characteristics' of the doll and of the user can be represented by a vector of numbers. For example, a personality characteristic, such as the sense of humor or the intelligence of the doll, can be represented by a number from 0 to 100, where 0 means "not at all" and 100 means "the maximum".
Similarly, personality characteristics which describe the user of the doll, such as an estimate of the user's intelligence or how much the user likes jokes or songs, can also be represented by a number from 0 to 100.
The development of the doll's personality is adaptive; that is, its personality changes and adapts to the responses made by the user and to 'environment' changes. Every response of the doll can be related to one or more characteristics of the doll and the user. For example, if the doll is deciding whether or not to yawn, it will take into account the 'tiredness' characteristic of the doll and how much the user likes non-regular responses (that is, interruptions in the regular flow of events).
Non-limiting examples of "characteristics" include:
A. Patience - When the doll is impatient it will accept only one wrong answer in response to a question, and when the doll is patient it will allow the user several tries to find the correct answer.
B. Tiredness - When the doll is tired it will be less active; that is, it will initiate less and will yawn every now and then.
C. Doll's Amusement - When the doll is amused it will laugh more often, try to initiate funnier activities, act in a more pleasant manner and be "nicer" to the user.
D. User's Intelligence - If the user's intelligence is considered to be high, he will be asked tougher questions and the doll will assume that the user knows well what he wants. For example, if the doll has to decide whether to accept the user's proposal for a game, it will give higher consideration to a suggestion made by an "intelligent" user than to one made by an "unintelligent" one.
The various characteristics are reflected in the behavior of the doll, the way the doll treats the user and the agenda of the doll. Every characteristic can be reflected in any or all of the above.
All characteristics are stored in the non-volatile memory unit 28 such as an EEPROM, so the doll's actions and characteristic behavior continue uninterrupted, even after being turned off.
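The patent discloses no source code; as a hedged illustration of this vector-of-numbers representation, the characteristics might be modeled as follows (the trait names come from the examples above; the structure itself is invented):

```python
from dataclasses import dataclass, field

@dataclass
class Characteristic:
    value: float = 50.0          # 0 means "not at all", 100 means "the maximum"
    rate_of_change: float = 1.0  # how strongly one measurement moves the value
    confidence: float = 0.5      # how much the stored value can be trusted

@dataclass
class Personality:
    # Characteristics of the doll, of the user, and of their interaction.
    traits: dict = field(default_factory=lambda: {
        "patience": Characteristic(),
        "tiredness": Characteristic(),
        "amusement": Characteristic(),
        "user_intelligence": Characteristic(),
        "user_likes_songs": Characteristic(),
    })

    def as_vector(self):
        # The short "vector of numbers" that represents the complex system.
        return [c.value for c in self.traits.values()]

# In the doll, this state would be saved to the EEPROM/flash so the
# personality survives power-off; here we only build it in memory.
print(Personality().as_vector())  # -> [50.0, 50.0, 50.0, 50.0, 50.0]
```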
To emulate a 'personality', the characteristics are constantly subject to change according to 'environment' measurements, which may be listed as follows:
1. Random changes - which represent "moods". For example, the doll can become sick or angry for no reason. If the doll becomes sick, several characteristics are changed. The decision on becoming sick is random; after the doll decides to be sick, the "tiredness" characteristic will be increased and the "amusement" characteristic will decrease, for example.
2. Time based (or periodic) changes - the doll can become more tired later in the day, for instance;
3. Programmed changes - the doll becomes impatient as directed by software programmed commands. The doll can be programmed to have certain "feelings". For example, in a game which requires foolish behavior, the doll's 'amusement' characteristic is increased so that the overall behavior of the doll will become more amusing and funny.
4. Direct Feedback - from the user. For example, if the user answers difficult questions correctly, the user's intelligence characteristic will increase; and
5. Indirect Feedback - from the user, which can indicate whether decisions that were made in accordance with certain characteristics in the past are liked by the user. For example, if the doll plays a song and the user doesn't respond after a time, it is deduced that the user doesn't like songs and therefore the user's "likes songs" characteristic is reduced.
Both the Direct and Indirect Feedback models also affect characteristics which relate to the interaction between the user and the doll. In the case of the Indirect Feedback model, the user's response is generally delayed. The various characteristics may vary in their rate of change, that is, how much a measurement affects the value of the characteristic. Each characteristic also has various properties, such as a 'confidence level' which represents how much the value can be trusted, and other information.
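Continuing the previous sketch (and reusing its Personality class), the five kinds of 'environment' measurement might adjust a trait in proportion to its rate of change. All deltas and probabilities are invented:

```python
import random

def clamp(v, lo=0.0, hi=100.0):
    return max(lo, min(hi, v))

def measure(trait, delta):
    # One measurement moves the value in proportion to the trait's own
    # rate of change, as described above.
    trait.value = clamp(trait.value + trait.rate_of_change * delta)

def environment_update(p, hour, answered_hard_question, song_ignored):
    t = p.traits
    if random.random() < 0.01:              # 1. random "mood": doll gets sick
        measure(t["tiredness"], +20)
        measure(t["amusement"], -20)
    if hour >= 19:                          # 2. time-based: tired late in the day
        measure(t["tiredness"], +5)
    # 3. programmed changes would be driven by the current game's script.
    if answered_hard_question:              # 4. direct feedback
        measure(t["user_intelligence"], +3)
    if song_ignored:                        # 5. indirect (delayed) feedback
        measure(t["user_likes_songs"], -5)

p = Personality()                           # from the previous sketch
environment_update(p, hour=21, answered_hard_question=True, song_ignored=True)
print(p.as_vector())
```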
Decision Making
A further feature of the interactive toy is the response generator 92, which activates a software driven component that performs "decision making". The "decision making" algorithms are weighted according to feedback and the actual usage of the doll's characteristics, so as to customize decisions to the specific doll and user. Techniques from control theory and machine learning are utilized to adapt the "decision making" algorithms. The weights of decisions and the decisions taken are stored in the non-volatile memory 28.
The "decision making" weighting mechanism allows for different options to be chosen. For example, the decision to which of three stories, "A", "B" or "C", should be read, can be linked to the "humor" characteristic of the doll. The probability of choosing a particular story such as "A", which is rated as "funny", will be increased (or reduced) depending on the mood of the doll. Thus, if story A is classed as 'funny', when the doll is in a humorous mood, the weight of story "A" will be increased by the "humor" characteristic There are several possible "decision making" strategies, non-limiting examples of which include:
1. Choosing at random (all options have the same probability);
2. Choosing at random where the probability of choosing an option is proportional to its weight;
3. Choosing the highest/lowest weight option; and
4. Choosing options that were not chosen lately (a buffered n-length FIFO).
After a decision has been made and feedback measured (either directly or indirectly), the basic weight of the options can be updated. The "weighting" can be stored in the non-volatile memory 28.
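By way of a non-limiting illustration, the four strategies and the subsequent weight update might be sketched as follows; the option names, weights and FIFO length are assumed values:

import random
from collections import deque

options = {"story_A": 3.0, "story_B": 1.0, "story_C": 1.0}  # basic weights
recent = deque(maxlen=2)  # buffered n-length FIFO of recent choices

def choose(strategy):
    names = list(options)
    if strategy == "uniform":      # 1. all options equally likely
        return random.choice(names)
    if strategy == "weighted":     # 2. probability proportional to weight
        return random.choices(names, weights=[options[n] for n in names])[0]
    if strategy == "highest":      # 3. highest-weight option
        return max(names, key=options.get)
    if strategy == "not_recent":   # 4. avoid recently chosen options
        fresh = [n for n in names if n not in recent] or names
        return random.choice(fresh)
    raise ValueError(strategy)

pick = choose("weighted")
recent.append(pick)
options[pick] += 0.5   # feedback was positive, so raise the basic weight
print(pick)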
Question and Answer Mechanisms
The type of questions which the doll asks the user is defined by known Artificial Intelligence (AI) mechanisms which enable the automatic handling of questions. Additionally, characteristics which represent the user's ability to understand questions are taken into account when determining the type of question to ask.
Non-limiting examples of the types of questions which the doll can ask the user are as follows:
1. "Open" question where any answer is acceptable. 2. "Yes/No" question - Only a yes or no is an acceptable answer.
3. "Open Menu" questions - There are several known answers that are either correct or relevant. An "Open Menu" question is asked without naming any of the possible answers; for example: "what is the color of my hat".
4. "Enumerate Menu" Questions - There are several known answers as in "Open Menu" question. The question asked also gives some of the possible answers ; for example: "what is the color of my hat, red blue or green ?".
5. "Yes/No Menu" questions - There are several known answers as in "Open Menu" questions and in this case, the question asked is followed by asking a 'Yes /No' question; for example: "what is the color of my hat, is it red ?" For "Yes/No" questions, the questions are asked directly. The- Artificial
Intelligence mechanism learns whether the user understands how to answer Yes/No type question. If the user does not respond as expected, then a short explanation (there are several explanation according to the user's behavior and past) is added to the question. For "Menu" type Questions (Open, Enumerate or Yes No), the algorithm makes a decision based on the question asked, the list of relevant and correct answers (a "menu") and the maximal number of times that the question can be asked. Reference is now made to Fig. 5, which is a high level flow chart illustration of the question asking mechanism. The initial type of question to be asked (step 102) is determined, based upon the user's intelligence, local and global understanding and the last question asked. The type of question is selected from the group of either "Open", "Enumerate" or "Yes/No" questions (step 104) The operational flow then continues depending on the answer given (query box 105).
If the correct answer is given, the relevant databases, including the characteristics database, are updated (step 106). If a wrong answer is given, a decision is made whether to ask the question again (query box 108). The decision is made according to the maximal number of questions allowed, the user's intelligence and understanding, and the doll's characteristics (such as patience).
If a follow-up question is to be asked, the kind of question to ask (Open, Enumerate or Yes/No) (step 104) is determined according to the user's intelligence and understanding (step 110), and the above loop (steps 104-110) continues.
If no follow-up question is asked, the relevant databases are updated (step 106).
During updating (step 106), other parameters, including the number of consecutive questions receiving no response and the number of Yes/No questions answered incorrectly, are also updated.
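By way of a non-limiting illustration, the question asking loop of Fig. 5 may be sketched as follows; the concrete question types, the escalation order, the threshold on the doll's patience and the function names are assumptions made for this sketch rather than part of the described mechanism:

QUESTION_TYPES = ["open", "enumerate", "yes_no"]  # least to most guided

def initial_type(user_intelligence):
    # Step 102 (assumption): an intelligent user gets the least
    # guided form of the question first.
    return "open" if user_intelligence > 0.5 else "enumerate"

def next_type(current):
    # Step 110 (assumption): fall back to a more guided form
    # after a wrong answer, e.g. Open -> Enumerate -> Yes/No.
    i = QUESTION_TYPES.index(current)
    return QUESTION_TYPES[min(i + 1, len(QUESTION_TYPES) - 1)]

def ask_question(get_answer, answers, max_tries, user_intelligence, patience):
    q_type = initial_type(user_intelligence)   # step 102
    for attempt in range(max_tries):           # bounded by the maximal number allowed
        answer = get_answer(q_type)            # step 104: ask and obtain a reply
        if answer in answers:                  # query box 105
            return True                        # step 106: update the databases
        if patience < 0.3:                     # query box 108: an impatient doll stops
            break
        q_type = next_type(q_type)             # step 110
    return False                               # step 106, with failure counters

# Usage: simulate a user who only manages the Yes/No form.
result = ask_question(lambda t: "red" if t == "yes_no" else "?",
                      answers={"red"}, max_tries=3,
                      user_intelligence=0.8, patience=0.6)
print(result)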
What Next (WN) Mechanism
A further feature of the interactive toy is a software driven mechanism called the "What Next" mechanism, which allows the doll 11 to decide what to do after a specific activity has ended and to decide whether and how to interrupt an ongoing activity. Briefly, the "What Next" mechanism uses a technique called "RuleBase", which is based on prior gained knowledge of the user and which checks whether any of a plurality of possible conditions have occurred and responds accordingly.
For example, in deciding what action to take whenever a specific activity has ended, the various operational steps of the "What Next" mechanism can be described as follows:
A. Non-user response: if the user did not react or respond within a pre-determined period - for example, if the user did not switch the doll off, call the doll or verbally address the doll to suggest a new activity - the mood of the doll 11 can be changed.
B. Mood changes: a check is made to determine periodic or random changes in mood. Preferably, moods are not changed constantly, but rather only after a pre-determined time period.
C. Action determination: an action decision is made. For example, the action decision may include whether to make a funny remark, to ask the user to decide what to do, or to suggest an activity to the user.
i) If the user is asked to decide for himself, the question asked by the doll is weighted towards the user's preferred activities. The doll then decides whether to accept the user's suggestion, according to the mood of the doll and/or the user's recent behavior.
ii) If the doll suggests a new activity to the user, activities that were not used recently are suggested first. Alternatively, the doll may insist on playing a certain activity if it is in a stubborn mood, for example.
A rule-based sketch of these steps follows.
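By way of a non-limiting illustration, the "RuleBase" technique may be pictured as an ordered list of condition/action pairs evaluated against the current state; the specific rules, state fields and thresholds below are assumptions made for this sketch:

state = {
    "seconds_idle": 45,   # time since the last user reaction
    "mood_age": 600,      # seconds since the mood last changed
    "stubborn": False,
}

def change_mood(s):   print("mood changes")
def suggest_game(s):  print("doll suggests an activity not used recently")
def ask_user(s):      print("doll asks the user what to do next")

# "RuleBase": the first rule whose condition holds decides what to do next.
RULES = [
    (lambda s: s["seconds_idle"] > 30, change_mood),   # A. non-user response
    (lambda s: s["mood_age"] > 300, change_mood),      # B. periodic mood change
    (lambda s: s["stubborn"], suggest_game),           # C(ii). the doll insists
    (lambda s: True, ask_user),                        # C(i). default: ask the user
]

def what_next(s):
    for condition, action in RULES:
        if condition(s):
            action(s)
            break

what_next(state)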
Reference is now made to Figs. 6A - 6C, which are flow chart illustrations of an embodiment of the functional operation of the doll 11.
Once the doll 11 is switched on (step 200), initially by an on/off switch connected to the power control 36, the doll 11 is activated for play (step 202).
There are two principal modes of operation, vocal (step 204) and non-vocal (step 206). Vocal activation may be initiated either by the doll itself (step 208) or by the user (child) (step 210). Non-vocal operation is initiated by the user activating one of the sensors 30 or buttons 48.
After the doll 11 initiates a request (step 208), a sub-routine 211 (Fig. 6B) is performed. The doll 11 will wait for the user to respond (query box 212). If there is a hiatus in the play between the doll and the child, then after a pre-selected period of time the doll will make the selection itself (step 214). Depending on the routine logic, the routine will then either wait for the user's response (query box 213) or return to central control (step 215).
If the user responds to the doll's question, the response is analyzed and the appropriate action taken (step 216). Having completed the particular sub-routine, the routine returns to the central control (step 202) to await a further response (vocal or non-vocal).
How the child answers determines the development of the game or operation. The type and length of question is determined by the programming and can, for example, be a single word such as "hello", or a longer phrase such as "How are you?", "Who are you?", "What game do you want to play today?" or "Let me give you a choice".
The doll can be configured so that, by default, it initially waits for a pre-selected time period for the user to start a conversation. For example, if the child wishes to play with the doll, the child can activate the doll vocally (step 210) by calling it by name. If the child uses a longer phrase or a couple of words strung together, the doll will react according to how the child itself makes the request; in other words, the doll interacts with the child. If, for example, the child says "play with me", the doll may well inquire "What do you want to play - a game? Or do you want me to sing a song?".
The response the doll makes can also be time and date dependent. For instance, the doll can greet the user on her birthday by singing "happy birthday" or suggest playing a specific game on a specific date such as Christmas.
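By way of a non-limiting illustration, such a time and date dependent response might be sketched as follows (the dates, the response strings and the stored birthday value are assumptions):

import datetime

def greeting(today, birthday):
    """Pick a time- and date-dependent response (illustrative only)."""
    if (today.month, today.day) == (birthday.month, birthday.day):
        return "sing: Happy Birthday!"
    if (today.month, today.day) == (12, 25):
        return "suggest: let's play a Christmas game"
    return "say: hello"

print(greeting(datetime.date.today(), birthday=datetime.date(1992, 6, 1)))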
If the child (user) talks to the doll 11, a sub-routine 217 (Fig. 6C) is performed. The child's vocal request is analyzed and processed by the speech recognition module 50 (step 218). If the word or phrase is 'understood', that is, recognized by the speech recognition program 64 in conjunction with the word library 62, the doll 11 will decide on a response (step 220). The doll's response will depend on the application program, as described in detail hereinabove (Fig. 5). The response may be in the form of a question to the user, or the doll may begin 'playing the game' or 'singing the song' requested, for example. The main processing blocks used for the doll to speak include the speech synthesis 52, music synthesis 54 and sound effects generation 56 blocks, together with the synthesizer library 66 and synthesizer software 68. If the speech recognition module 50 does not match the user's speech with its word library 62, the speech recognition program 64 can "recognize" the new words spoken and store them in the non-volatile memory unit 28 (step 222).
The speech recognition module 50 is basically a combination of recognition engines, language and environment dependent databases, and application programming interfaces. The recognition engine has four processing blocks: data acquisition and pre-processing, feature extraction, acoustic match and dynamic programming.
Data acquisition converts the spoken analog signal into digital form. Pre-processing such as automatic gain control, echo cancellation and voice activity detection, for example, may be applied to the signal. After frequency analysis of the signal, the relevant acoustic phonetic information is captured in a feature vector every few milliseconds. The algorithms are adapted to the recording channel and to the suppression of background noise.
In the acoustic match block, a probabilistic matching score is computed between each feature vector and the smallest acoustic phonetic units known to the recognizer, such as context independent and context dependent phonemes and word models. The dynamic programming block finds the best match and alignment for the speech input on a word and sentence level by accumulating the acoustic scores over time within the restrictions imposed by lexical and syntactic constraints.
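By way of a non-limiting illustration, the four processing blocks may be pictured as a pipeline; the sketch below uses trivial placeholder arithmetic (a per-frame energy value, a simple distance score) in place of the actual signal processing, and all names and numbers are assumptions:

def acquire(analog_samples):
    # Data acquisition + pre-processing: digitise, automatic gain
    # control, echo cancellation, voice activity detection (stubbed).
    return [s for s in analog_samples if abs(s) > 0.01]

def extract_features(signal, frame=160):
    # Feature extraction: one feature vector every few milliseconds
    # (here a trivial per-frame energy value stands in for the real
    # acoustic-phonetic features).
    return [sum(x * x for x in signal[i:i + frame])
            for i in range(0, len(signal), frame)]

def acoustic_match(features, phoneme_models):
    # Acoustic match: score each feature vector against each phoneme
    # model (a placeholder distance, not a real probabilistic score).
    return [[-abs(f - m) for m in phoneme_models] for f in features]

def dynamic_programming(scores, lexicon):
    # Dynamic programming: pick the word whose model accumulates the
    # best score over time, within lexical constraints (simplified).
    best = {w: sum(frame[i] for frame in scores) for i, w in enumerate(lexicon)}
    return max(best, key=best.get)

signal = acquire([0.0, 0.3, -0.4, 0.2] * 200)
word = dynamic_programming(acoustic_match(extract_features(signal), [10.0, 20.0]),
                           ["hello", "play"])
print(word)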
Briefly, the text to speech (TTS) technology converts computer readable text stored in the TTS database 72 into synthetic speech in stages: first converting the text into a phonetic transcription, then calculating the speech parameters and finally using these parameters to generate synthetic speech signals. In addition, the text to speech (TTS) technology is also capable of adding vocal and tonal enhancements, referred to as 'improved TTS'.
The doll may be powered up initially by the on/off switch connected to the power control 36 (step 200) and activated non-vocally by pressing one of the plurality of operationally defined buttons 48 (steps 230, 232). A plurality of buttons 48, each of which has a different function, can be placed on the doll 11 in any suitable position, such as on a belt 80 around the doll's waist (Fig. 1). Buttons 48 can be configured with operational functions, such as "stop", "continue", "on" and "off". In addition, buttons can be configured for specific reaction functions, for example "sing a song" or "play a game".
Preferably, the doll 11 is configured to switch itself off after a pre-determined period of inactivity, that is, after not receiving any reaction from the child/user. Thus, after the doll has completed a specific routine (as described hereinabove with reference to Figs. 6B and 6C), the non-vocal routine, for example, can be used to terminate operations. Referring to Fig. 6A, the non-vocal routine awaits a user response (steps 230, 232). If a button is not pressed within a pre-determined time period (loop steps 234, 230), the program terminates (step 236).
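By way of a non-limiting illustration, the auto-off behavior of the non-vocal routine may be sketched as a timeout loop; the idle limit, the polling interval and the button polling function are assumptions:

import time

IDLE_LIMIT = 120.0  # assumed pre-determined inactivity period, in seconds

def run_non_vocal(button_pressed, idle_limit=IDLE_LIMIT):
    """Await a button press (steps 230, 232); loop until the
    pre-determined period elapses (loop steps 234, 230), then
    terminate, i.e. switch the doll off (step 236)."""
    start = time.monotonic()
    while time.monotonic() - start < idle_limit:
        button = button_pressed()   # poll the sensors/buttons 48
        if button is not None:
            return button           # e.g. "sing a song", "play a game"
        time.sleep(0.1)
    return "power_off"

# Usage with a stub that never presses a button:
print(run_non_vocal(lambda: None, idle_limit=0.3))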
It will be appreciated by persons knowledgeable in the art that the above description of an interactive toy is not restricted to a 'talking' doll but is applicable to any toy or game. Furthermore, the functions and routines described are given by way of example only, and the invention is not restricted thereto. By means of the PC 40 and the cartridge unit 38, it is possible to download additional units containing speech and music modules or other games, for example. It will be further appreciated that the present invention is not limited by what has been described hereinabove and that numerous modifications, all of which fall within the scope of the present invention, exist. Rather, the scope of the invention is defined by the claims which follow:

Claims

1. A standalone interactive toy having an adaptive personality comprising: a plurality of characteristics, said plurality of characteristics representing at least one of a group including a plurality of personality characteristics of the toy, a plurality of personality characteristics of an interactive user and a plurality of interaction characterizations between said user and the toy; means for receiving and characterizing vocal input from said user; means for changing said adaptive personality thereby creating a current personality characteristic for said toy; memory storage means for storing said current personality characteristic of said toy and said plurality of characteristics; means for generating a response, said generated response being responsive to said characterized vocal input and said current personality characteristic of said toy; and audio output means for outputting said response.
2. A standalone interactive toy according to claim 1 and further comprising means for amending said plurality of characteristic means according to at least one of a group of environment conditions including random changes, time based changes, direct and indirect feedback from said interacting user, periodic changes and programmed changes.
3. A standalone interactive toy according to claim 1 further comprising decision making means wherein said generated response is responsive to said decision making means.
4. A standalone interactive toy according to claim 3 wherein said decision making means is adjustable in accordance with the vocal input from said user.
5. A standalone interactive toy according to claim 3 wherein said decision making means is adjustable in accordance with said usage of each of said plurality of characteristics.
6. A standalone interactive toy according to claim 3 wherein said decision making means are determined by one of a group of conditions including randomization, weighted randomization, said usage of each of said plurality of characteristics and on the basis of which said plurality of characteristics has been recently selected.
7. A standalone interactive toy according to claim 1 and further comprising a plurality of question types and means for determining which of said plurality of question types is to be asked by said toy.
8. A standalone interactive toy according to claim 7 and further comprising means for assessing said interactive user's response to each of said plurality of question types, for determining whether a subsequent question should be asked and means for formulating the subsequent question to ask.
9. A standalone interactive toy according to claim 8 wherein said interactive user's response includes one of a group of responses including non-response or non-reaction within a pre-determined period.
10. A standalone interactive toy according to claim 1 and further comprising means for determining the subsequent step to take after a specific activity has ended and to decide whether and how to interrupt an ongoing activity.
11. A method of interactive communication between an adaptive standalone interactive toy and a user, said method comprising the steps of: creating a plurality of characteristics, said plurality of characteristics representing at least one of a group including a plurality of personality characteristics of said toy, a plurality of personality characteristics of an interactive user and a plurality of interaction characterizations between said user and said toy; creating a current personality characteristic for said toy; receiving and characterizing the vocal input from said user; storing said current personality characteristic of said toy and said plurality of characteristics; generating a response, said generated response being responsive to said characterized vocal input and said current personality characteristic of said toy; and outputting said response.
12. A method of interactive communication according to claim 11 and further comprising the step of amending said plurality of characteristic means according to at least one of a group of environment conditions including random changes, time based changes, direct and indirect feedback from said interacting user, periodic changes and programmed changes.
13. A method of interactive communication according to claim 11 and further comprising the step of decision making and wherein the step of generating a response is responsive to the step of decision making.
14. A method of interactive communication according to claim 13 wherein the decision making step is adjustable in accordance with the vocal input from said user.
15. A method of interactive communication according to claim 13 wherein said decision making step is adjustable in accordance with said usage of each of said plurality of characteristics.
16. A method of interactive communication according to claim 13 wherein said decision making step is determined by one of a group of conditions including randomization, weighted randomization, the usage of each of said plurality of characteristics and on the basis of which said plurality of characteristics has been recently selected.
17. A method of interactive communication according to claim 11 and further comprising the steps of: determining whether a subsequent question should be asked; and if so, determining which of a plurality of question types is to be asked by said toy.
18. A method of interactive communication according to claim 17 and further comprising the step of assessing said interactive user's response to each of said plurality of question types and the step of formulating the subsequent question to ask.
19. A method of interactive communication according to claim 18 wherein said interactive user's response includes one of a group of responses including non-response or non-reaction within a pre-determined period.
20. A method according to claim 11 and further comprising the step of determining what step to take after a specific activity has ended.
21. A method according to claim 11 and further comprising the step of determining whether and how to interrupt an ongoing activity.