WO2003058518A2 - Method and apparatus for an avatar user interface system - Google Patents
- Publication number
- WO2003058518A2 (PCT/GB2003/000031)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- avatar
- user
- accordance
- computing appliance
- person
- Prior art date
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/70—Game security or game management aspects
- A63F13/79—Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
-
- A63F13/12—
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/30—Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/30—Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
- A63F13/33—Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers using wide area network [WAN] connections
- A63F13/335—Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers using wide area network [WAN] connections using Internet
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/131—Protocols for games, networked simulations or virtual reality
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/75—Indicating network or usage conditions on the user display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/147—Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/40—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterised by details of platform network
- A63F2300/407—Data transfer via internet
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/50—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/50—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers
- A63F2300/57—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers details of game services offered to the player
- A63F2300/572—Communication between players during game play of non game information, e.g. e-mail, chat, file transfer, streaming of audio and streaming of video
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/80—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game specially adapted for executing a specific type of game
- A63F2300/8082—Virtual reality
Definitions
- The present invention concerns methods and apparatus for an avatar user interface system connecting users to people, information, media and agents with photo-realistic avatars.
- An alternative method of communicating is in a virtual world.
- Several companies have provided 3D worlds with avatars including Blaxxun (Germany) with its consumer world Cybertown. In these worlds, the user navigates his avatar into proximity with one or more avatars and chat then commences involving the owners of the avatars. User-driven gestures are incorporated.
- The avatars used in these virtual worlds are not photo-realistic representations of the people they represent.
- Photo-realistic avatars of people can be generated in Avatar Booths as disclosed in UK Patent GB 2336981.
- An ad hoc standards group called H-Anim has drafted a version, H-Anim 2001, for avatars that can be found on the world wide web at www.h-anim.org.
- These photo-realistic avatars are also becoming anima-realistic: they can be animated realistically.
- Harold Sun and Dimitri Metaxas published a solution to automatically generating life-like walking animation for an avatar following a path in the proceedings of SIGGRAPH 2001, pp. 261-269.
- The present invention aims to provide avatar user interface system means by which a user has a high sense of presence, overcoming some of the disadvantages of other communication methods.
- Embodiments of the present invention use photo-realistic avatars of the participants in the communication session to create a virtual communication room with high photo-realism and high anima-realism.
- Embodiments of the present invention provide an avatar user interface system in which a synchronous communication session can take place without the user needing to control the user interface manually and thus allowing the user to concentrate on communicating.
- Embodiments of the present invention provide an avatar user interface system in which multi-tasking can take place between multiple communication and information processing tasks.
- Embodiments of the present invention provide an avatar user interface system in which people and agents may communicate with each other.
- An apparatus for an avatar user interface system comprising: server means for serving the communication session; one or more computing appliance means; network means for joining said server means and said computing appliance means; avatar means for representing each user visually; and avatar user interface application means resident on each computing appliance means; operable by one or more users.
- A method of communication between a plurality of users via an avatar user interface system comprising the steps of: joining a plurality of computing appliance means and a server means for serving the communications session to start a communication session by means of a network; viewing the avatars of the users involved in the communication session on the said plurality of computing appliance means; a user first communicating into a computing appliance; one or more users receiving the first communication on one or more other computing appliances; avatars enacting the first communication on said computing appliances; a user responding to the first communication in a second communication; one or more users receiving the second communication on one or more other computing appliances; avatars enacting the second communication on said computing appliances; continuing the exchange of communications until the session is finished; and terminating the joining of the computing appliance means and the server means for serving the communications session to terminate the communication session.
- A method of communicating between at least one user and at least one avatar agent via an avatar user interface system comprising the steps of: joining one or more computing appliance means, an avatar agent hosting server means hosting one or more intelligent agent software units and a server means for serving the communications session to start a communication session by means of a network; viewing the avatars of the said avatar agents and said users involved in the communication session on the said computing appliance means; a user or an avatar agent first communicating; if there are one or more users who did not first communicate, then the one or more users who did not first communicate receive the first communication on one or more other computing appliances; avatars enacting the first communication on said computing appliances; if there are one or more avatar agents who did not first communicate, then the one or more avatar agents who did not first communicate receive the first communication; a user or an avatar agent responding to the first communication in a second communication; one or more users or one or more avatar agents receiving the second communication; if there are one or more avatars receiving the second communication, then
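The claimed communication loop (join a session, exchange communications that avatars enact, then terminate) can be sketched as follows. This is an illustrative sketch only: the class and method names are hypothetical and are not drawn from the patent text.

```python
# Illustrative sketch of the claimed communication loop; all class and
# method names are hypothetical, not taken from the patent text.

class Session:
    """A communication session served by the session server."""

    def __init__(self):
        self.appliances = []        # joined computing appliances
        self.transcript = []        # communications exchanged so far

    def join(self, appliance):
        self.appliances.append(appliance)

    def communicate(self, sender, message):
        # Every other appliance receives the communication; its local
        # avatars then enact it (lip sync, gestures) on the display.
        self.transcript.append((sender, message))
        return [a for a in self.appliances if a != sender]

    def terminate(self):
        self.appliances.clear()


session = Session()
session.join("alice-pc")
session.join("bob-pc")
receivers = session.communicate("alice-pc", "hello")   # received by bob-pc
reply = session.communicate("bob-pc", "hi there")      # received by alice-pc
session.terminate()
```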
- The present invention aims to provide an integrated multi-media communication system for use in a broad range of applications based around photo-realistic avatars for communication with people and intelligent agents in both synchronous and asynchronous ways that is supportive of multiple concurrent communication sessions and of switching between communication sessions.
- The present invention aims to provide a user interface system in which avatar means may be photo-realistic avatar means or parameter avatar means or animatable image avatar means.
- Figure 1 is a block diagram of apparatus for an avatar user interface system in accordance with a first embodiment of the present invention
- Figure 2 is a schematic diagram of an avatar
- Figure 3 is a block diagram of avatar visual types
- Figure 4 is a block diagram for the reconstruction of a parameter avatar
- Figure 5 is an example table of avatar parameters
- Figure 6 is a block diagram of apparatus for generating and editing a parameter avatar
- Figure 7 is a list of action impersonation parameters stored in the memory of a personal computer
- Figure 7a is a flow diagram illustrating the process for defining action impersonation parameters and action impersonation rules for an activity
- Figure 8 is a block diagram of apparatus for generating and editing action impersonation parameters
- Figure 9 is a schematic diagram of an avatar hosting server system
- Figure 10 is a schematic diagram of an avatar number
- Figure 11 is a block diagram of a personal computer with an avatar user interface
- Figure 12 is a diagrammatic representation of avatar user interface functionality in an avatar conference application
- Figure 13 is a block diagram of a presentation media window
- Figure 14 is a block diagram of a whiteboard media window
- Figure 15 is a representation of an example of a meeting room media window
- Figures 16a, 16b, 16c and 16d are schematic diagrams to illustrate possible virtual camera positions in a virtual video conference
- Figures 17a, 17b and 17c are schematics of three possible layouts in the meeting room media window
- Figure 18 is a plan view of the virtual meeting room illustrating possible virtual camera positions
- Figure 19 is a set of four timelines of the camera shots during an avatar user interface session in four modes
- Figure 20 is a block diagram of a software director and avatar engine player
- Figure 21 is a block diagram of events on a personal computer and a session server
- Figures 22a, 22b, 22c, 22d and 22e are schematics of the five seating plans viewed by the five participants;
- Figure 23 is a schematic of the audio mixer
- Figure 24 is a schematic of the audio mixer for multiple conversations;
- Figure 25 is a block diagram of a lip sync generator
- Figure 26 is a timeline of a lip sync generator
- Figures 27a, 27b, 27c and 27d are diagrammatic representations of four lip sync animation types that can be used to animate a talking head
- Figure 28 is a flow diagram illustrating the steps involved in a lip sync generator
- Figure 29 is a flow diagram illustrating the steps in the passage of sound through an avatar user interface system
- Figure 30a is a spectrogram
- Figure 30b is a graphical diagram of a spectrum
- Figure 31 is a block diagram of the session server system
- Figure 32 is a block diagram of an apparatus for holding an avatar user interface session using voice and data networks in accordance with a second embodiment of the present invention.
- Figure 33 is a schematic diagram of an animatable image in accordance with a third embodiment of the present invention.
- Figure 34 is a schematic diagram of an animatable image avatar
- Figure 35 is a schematic diagram of a set of four state images for the jaw and mouth segment
- Figure 36 is a tree diagram of the hierarchy of animatable avatar image components
- Figure 37 is a schematic diagram of an animatable image generator
- Figure 38 is a schematic diagram of an apparatus for animatable image generation
- Figure 39 is a block diagram of an avatar user interface system with multiple formats of avatar
- Figure 40 is a schematic layout of an avatar user interface with attendee functionality
- Figure 41 is a schematic layout of an apparatus for a multi-party location in an avatar user interface system in accordance with a fourth embodiment of the present invention.
- Figure 41a is a schematic of the 3D sound processing
- Figure 42 is a representation of an example of the displayed avatar user interface with switchboard functionality in accordance with a fifth embodiment of the present invention.
- Figure 43 is a block diagram of a multi-session server system
- Figure 44 is a block diagram of a stand-alone avatar user interface system in accordance with a sixth embodiment of the present invention.
- Figure 45 is a representation of an example of the avatar user interface system with extended exhibition functionality in accordance with a seventh embodiment of the present invention.
- Figure 46 is a block diagram of an avatar agent hosting system and intelligent agent software in accordance with an eighth embodiment of the present invention.
- Figure 47 is a block diagram of an apparatus for generating impersonation parameters
- Figure 48 is a block diagram of the avatar user interface system with extended security functionality in accordance with a ninth embodiment of the present invention
- Figure 49 is a block diagram of an avatar user interface system for interactive computer gaming in accordance with a tenth embodiment of the present invention
- Figure 50 is a schematic of an avatar user interface system for a six-sided cave in accordance with an eleventh embodiment of the present invention.
- Figure 51 is a schematic of an avatar user interface system for two caves connected by a network
- Figure 52 is a schematic of an avatar user interface system comprising two exercise stations connected together by a network in accordance with a twelfth embodiment of the present invention
- Figure 53 is a schematic of the display of an avatar user interface system with an avatar virtual environment as the background in accordance with a fourteenth embodiment of the present invention.
- Figure 54 is a schematic of a terminal of an avatar user interface system including motion-tracking cameras in accordance with a fifteenth embodiment of the present invention.
- Figure 55 is a block diagram of apparatus for an avatar user interface system with multiple user devices
- Figure 56 is a schematic of a display device consisting of a display screen, an AVE projector and a Presentation projector in accordance with a sixteenth embodiment of the present invention
- Figure 57 is a schematic of a display device in which the AVE and Presentation projection means are combined into one physical unit;
- Figure 58 is a schematic of a multi-density display device comprising an area of low density pixels and an embedded area of high density pixels;
- Figure 59 is a schematic of an avatar user interface system with a mixed audience of avatars of virtual users at various locations and physical users;
- Figure 60 is a block diagram of an apparatus for presentation preparation.
- Figure 1 is a block diagram of an apparatus for an avatar user interface system 261 in accordance with a first embodiment of the present invention.
- The avatar user interface system 261 of the invention can be embodied in many applications.
- In this first embodiment, the avatar user interface system 261 is disclosed embodied as an avatar conference application.
- An avatar conference is an example of a communication session on an avatar user interface system 261.
- Further embodiments disclose the avatar user interface system 261 invention embodied in different applications.
- the apparatus comprises two or more personal computers 3 with memory 345, display devices 264 and displayed avatar user interfaces 260 that are connected by a network 2 to a session server 1 with memory 346 using a standard avatar interface protocol 300 and an avatar hosting server 4 containing a plurality of avatars 5 and memory 344.
- avatars 5 representing the parties taking part in the avatar user interface session are stored on the avatar hosting server 4.
- the avatars 5 are transferred to the personal computers 3 across the network 2.
- the session server 1 mixes the voice streams from the personal computers 3 and returns them to the personal computers 3.
- the avatars 5 are displayed in the displayed avatar user interfaces 260 of the display devices 264 of the personal computers 3.
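The session server's voice mixing described above (mixing the voice streams from the personal computers 3 and returning them) can be sketched minimally: each participant hears the sum of every stream except his own. The function and sample representation below are illustrative assumptions; a real mixer would also clip or normalise the summed samples.

```python
# Minimal sketch of N-1 voice mixing on a session server: each
# participant receives the sum of every stream except his own.
# Names and the integer-sample representation are illustrative.

def mix_for(listener, streams):
    """streams: dict mapping participant -> list of audio samples."""
    length = max(len(s) for s in streams.values())
    out = [0] * length
    for who, samples in streams.items():
        if who == listener:
            continue                      # never echo a voice back
        for i, v in enumerate(samples):
            out[i] += v
    return out


streams = {"anna": [1, 2, 3], "ben": [10, 10, 10], "carl": [0, 5, 0]}
mixed_for_anna = mix_for("anna", streams)   # ben + carl only
```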
- Avatars
- FIG. 2 is a schematic diagram of an avatar 5.
- the avatar 5 has an avatar identity 275 comprising an avatar number 8, a password 9 and a display permission flag 259.
- Associated with the avatar 5 are one or more types of data which may include: photo-realistic visual avatar data 340, animatable image avatar segment data 395, other visual image data 396, avatar parameters 230, impersonation parameters 325, biometric data 317, intelligent agent software unit 320, billing data 342 and personal data 341.
- the impersonation parameters 325 are of two types: voice impersonation parameters 331 and action impersonation parameters 332.
- Each set of data associated with the avatar 5 may be resident on different servers on the network 2 or servers on other networks that may be accessible via the network 2.
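The avatar record described above (an avatar identity 275 plus optional associated data sets) can be sketched as a data structure. The field names echo the reference numerals, but this layout is a hypothetical illustration, not the patent's specification of storage.

```python
# Hypothetical sketch of the avatar 5 record; field names mirror the
# patent's reference numerals but the layout itself is illustrative.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AvatarIdentity:                      # avatar identity 275
    avatar_number: str                     # avatar number 8
    password: str                          # password 9
    display_permitted: bool = True         # display permission flag 259


@dataclass
class Avatar:                              # avatar 5
    identity: AvatarIdentity
    visual_data: Optional[bytes] = None    # photo-realistic visual avatar data 340
    segment_data: Optional[bytes] = None   # animatable image avatar segment data 395
    avatar_parameters: dict = field(default_factory=dict)  # avatar parameters 230
    impersonation: dict = field(default_factory=dict)      # voice 331 / action 332


avatar = Avatar(AvatarIdentity("44-123-456", "secret"))
```

Each optional field can live on a different server on the network 2, so a record like this would in practice hold references rather than the data itself.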
- Figure 3 is a block diagram of avatar visual types.
- the visual component of an avatar 5 may be a 3D avatar 39 or an animatable avatar image 382 or another avatar type 239.
- An avatar 5 includes at least one of the photo-realistic visual avatar data 340 or the avatar parameters 230 or the animatable image avatar segment data 395 or the other visual image data 396 and any other or all of the other types of data.
- An avatar 5 comprising at least photo-realistic visual avatar data 340 is referred to as a photo-realistic avatar 238.
- An avatar 5 comprising at least avatar parameters 230 is referred to as a parameter avatar 232.
- An avatar 5 comprising at least animatable image avatar segment data 395 is referred to as an animatable image avatar 382.
- An avatar 5 comprising at least either photo-realistic visual avatar data 340 or avatar parameters 230 is referred to as a 3D avatar 39.
- An avatar 5 comprising at least other visual image data 396 is referred to as another avatar type 239.
- Photo-realistic visual avatar data 340 is a computer model that represents an individual taking part in the avatar conference. It is photo-realistic: when viewed by a person who knows the individual it represents, that photo-realistic visual avatar data 340 will be recognisable as a photo-realistic avatar 238 of the individual, in the same way that a photograph of an individual is recognisable as such by a person who knows the individual.
- the photo-realistic visual avatar data 340 is a three dimensional (3D) computer model.
- the structure of the photorealistic visual avatar data 340 is similar in terms of its components to the draft H-Anim 2001 standard.
- the external shape of the photo-realistic visual avatar data 340 is represented by polygonal meshes totalling approximately 6,000 polygons.
- a generic avatar topology is used in which every photo-realistic visual avatar data 340 of every person has the same number of polygons, whether the person is tall or short, fat or thin, male or female. Texture mapping is used to position images of the avatar over the polygons so that the avatar can be rendered to appear like the individual it represents.
- the compressed size of the photo-realistic visual avatar data's computer model is typically between 200 and 900 Kbytes.
- Photo-realistic visual avatar data 340 can be quite large and, on lower bandwidth connections, it can take a long time to download. For the avatar conference to feel right to the user, a person's avatar should be seen when he is speaking, rather than just heard as a disembodied voice. Ideally, the avatar should appear in the avatar conference at the same time as a person joins the conference. If it is known who will be in the conference when the conference is organised, then photo-realistic visual avatar data 340 can be sent out in advance of the start of the conference. However, if someone joins the conference without any notice, then it is a purpose of this invention to use parameter avatars 232 that are very small and that will appear shortly after the person joins.
- Figure 4 is a block diagram for the reconstruction of a parameter avatar 232.
- a set of avatar parameters 230 is sent to a personal computer 3 that enable a parameter avatar 232 to be constructed from a general database of avatar information 231.
- Avatar parameter 230 download assumes that there is a general database of avatar information 231 already downloaded at the personal computer 3 from which a parameter avatar 232 can be quickly generated from a small set of avatar parameters 230.
- the general database of avatar information 231 is downloaded the first time that an avatar conference is accessed on a personal computer 3 and remains for later avatar conferences unless it is deleted.
- Figure 5 is an example table of avatar parameters 230 that can be used to define a parameter avatar 232 from a general database of avatar information 231.
- This set of avatar parameters 230 is typically in the range of 100 to 1,000 bytes in size, but may be smaller or larger, and can thereby be sent over the network 2 from the avatar hosting server 4 to the personal computer 3 very quickly.
- the parameter avatar 232 can also be assembled very quickly from the database 231 and the avatar parameters 230. In this way, an avatar of the new participant can be constructed quickly that would look like that person from a distance.
- This parameter avatar 232 may be displayed until such time as the photo-realistic avatar 238 has been downloaded from the avatar hosting server 4 to the personal computer 3 at which point the parameter avatar 232 is automatically replaced with the photo-realistic avatar 238.
- the photo-realistic avatar 238 can be downloaded progressively, such that rather than a sudden change from a parameter avatar 232 to a photo-realistic avatar 238, the user sees a slow morphing from one to the other over a period of time.
- Progressive download can be implemented in many ways. One implementation might be to first download the geometry, then the joint positions, then the textures. A second implementation might download low-resolution textures followed by high-resolution textures.
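The reconstruction described above — a small parameter set selecting entries from the locally cached general database of avatar information 231 — can be sketched as follows. The database keys, parameter names and mesh identifiers are invented for illustration.

```python
# Sketch of parameter-avatar reconstruction: a few hundred bytes of
# avatar parameters 230 select entries from a locally cached general
# database of avatar information 231. All keys are illustrative.

GENERAL_DATABASE = {                      # cached once per machine
    "hair": {"short": "mesh_hair_a", "long": "mesh_hair_b"},
    "build": {"slim": "body_slim", "heavy": "body_heavy"},
}


def build_parameter_avatar(params):
    """Assemble a rough avatar from a small (~100-1,000 byte) parameter set."""
    return {
        "hair_mesh": GENERAL_DATABASE["hair"][params["hair"]],
        "body_mesh": GENERAL_DATABASE["build"][params["build"]],
        "height_m": params["height_m"],   # numeric parameters used directly
    }


params = {"hair": "short", "build": "slim", "height_m": 1.82}
rough = build_parameter_avatar(params)    # displayed immediately
# Later, the downloaded photo-realistic avatar 238 replaces it, either
# abruptly or by morphing over a period of time (progressive download).
```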
- Avatars and parameter avatars may be generated in several ways:
- a photo-realistic avatar 238 may be generated from photos of the user;
- a parameter avatar 232 may be built up manually by the user without using photos of the user;
- a parameter avatar 232 may be automatically generated from a photo-realistic avatar 238 of the user.
- Figure 6 is a block diagram of apparatus for generating and editing a parameter avatar 232.
- the parameter avatar 232 may be generated automatically or manually.
- A set of avatar parameters 230 is automatically created from a photo-realistic avatar 238 of the person by a parameter avatar generator 233 with avatar editing software 234. There is enough information in a photo-realistic avatar 238 for the parameter avatar generator 233 to be relatively simple to create for those skilled in the art.
- the parameter avatar generator 233 is shown resident on a personal computer 3 but may be resident on an avatar hosting server 4 or any other server or computer on the network 2.
- If a user 17 has not yet had a photo-realistic avatar 238 made of himself, then he can quickly create a set of avatar parameters 230 for a parameter avatar 232 that is roughly similar to him by providing input into the parameter avatar generator 233.
- Parameter avatar creation in the parameter avatar generator 233 is by selection by the user 17 of a number of graphical alternatives such as hairstyles and by entry by the user 17 of data such as height.
- When a new user without an avatar needs to join his first avatar conference as quickly as possible, it is imperative that a 'rough' parameter avatar can be created as quickly as possible. In these situations, users are very impatient and the interaction in which the parameter avatar is created must be very efficient and fast.
- the user may be prepared to spend 30-60 seconds on this interaction.
- the interaction would normally be one of selection of options with a mouse click rather than typing in data. Later on, the user may go back and spend more time refining his parameter avatar. It is a purpose of this embodiment that there are two or more ways of generating a parameter avatar depending on the amount of time that the user has available.
- Action impersonation parameters may be used to characterise how a person moves.
- One of the objectives of a successful avatar user interface system invention is anima-realism. It is a first objective for an avatar to move anima-realistically such that a user who does not know the person whose avatar it is, thinks that the animation is realistic. It is a second objective for an avatar to move anima- realistically whilst impersonating the actions of the person whose avatar it is, such that a user who knows the person whose avatar it is, thinks that the animation is both realistic and typical of that person.
- Figure 7 is a list of action impersonation parameters 332 stored in the memory 345 of a personal computer 3.
- Action impersonation parameters 332 include: walking 400, running 401, ambient motion whilst standing 402, ambient motion whilst sitting 403, gestures whilst talking 404, facial expressions whilst talking 405 and lip synchronisation whilst talking 406.
- Gestures for gestures whilst talking 404 are gestural animation actions. These could include: waving hands excitedly in a beat mode whilst talking, and moving hands in time with the end of a sentence.
- Action impersonation parameters 332 are not limited to the above characteristics, but may be extended to include any characteristics required in an application of this avatar user interface system invention.
- A reference to action impersonation parameters 332 will mean reference to either or both of: types of action impersonation parameter and action impersonation parameter values.
- Values for action impersonation parameters 332 depend on the type of action and its definition. Values are set for action impersonation parameters 332 of a particular person in their avatar 5. Alternatively, values may be assigned as a set of action impersonation parameters 332 for a generic person in or with a context. Examples of sets of generic values might include:
- an Italian person
- a hyperactive person
- a person in a meeting
- a hyperactive Italian in a meeting
- a context for generic impersonation parameters might be a communication context.
- Examples of communication contexts include: meetings, product presentations, virtual exhibitions, receptions, major conferences, security situations, interactive game playing, exercise and practicing.
- Values may also be assigned for individual action impersonation parameters 332 that are characteristic of a style.
- An example is walking, where styles of walk can be defined such as a rolling gait, a mincing step etc.
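The layering just described — generic sets of values for a culture or context, then per-person values fine-tuned on top — can be sketched as a simple merge. All preset names, parameter names and values below are hypothetical illustrations.

```python
# Illustrative sketch of layering action impersonation parameter values:
# generic cultural/context presets are applied first, then per-person
# overrides. All parameter names and values are hypothetical.

GENERIC_PRESETS = {
    "italian": {"gesture_rate": 0.9, "gesture_amplitude": 0.8},
    "meeting": {"ambient_motion": 0.3},
}


def resolve_parameters(presets, personal):
    """Merge generic presets in order, then personal fine-tuned values."""
    values = {}
    for name in presets:
        values.update(GENERIC_PRESETS[name])
    values.update(personal)               # low-level personal edits win
    return values


# e.g. "a hyperactive Italian in a meeting": italian + meeting presets,
# with a personal override raising the gesture rate.
resolved = resolve_parameters(["italian", "meeting"],
                              {"gesture_rate": 1.0})
```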
- the activity used by way of example is a meeting, but this embodiment is not limited to the activity of meetings and is applicable to most types of human activity.
- Figure 7a is a flow diagram illustrating the process for defining action impersonation parameters 332 and action impersonation rules 333 for an activity 337.
- a significant corpus of videos 336 of meetings is recorded.
- Each meeting will typically require several video cameras 29 to synchronously record different participants at a sufficient resolution.
- Using a plurality of cameras 29 overcomes the problem of one camera not being able to image, at a high enough resolution, participants sitting all the way around a table. Meetings with different numbers of participants are recorded.
- a video corpus 336 of 20-50 hours is a typical size for an activity 337.
- the corpus is processed by a trained person along a timeline.
- the actions of each participant may be related to a number of parameters such as status, activity type (speaking, listening, observing), speech content and emotion.
- the result is an annotated timeline 334 with actions of each participant related to the parameters.
- the annotated timeline 334 is analysed to produce: (i) a type definition of each possible action impersonation parameter 332, (ii) a set of rules that can be incorporated in a finite state machine 333.
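The analysis of the annotated timeline 334 into (i) parameter type definitions and (ii) rules for a finite state machine might be sketched as follows. The record layout and the frequency-based rule extraction are assumptions made for illustration; the actual analysis may be far richer.

```python
from collections import Counter

# Hypothetical annotated-timeline records: (time_s, participant, activity, action).
timeline = [
    (0.0, "A", "speaking",  "gesture_open_palm"),
    (2.5, "B", "listening", "nod"),
    (4.0, "A", "speaking",  "gesture_open_palm"),
    (5.5, "B", "listening", "gaze_at_speaker"),
    (7.0, "C", "observing", "lean_back"),
]

def parameter_types(timeline):
    """(i) Type definitions: the distinct actions observed per activity type."""
    types = {}
    for _, _, activity, action in timeline:
        types.setdefault(activity, set()).add(action)
    return types

def default_rules(timeline):
    """(ii) Simple rules usable by a finite state machine: for each activity,
    the action most frequently annotated in the corpus."""
    counts = {}
    for _, _, activity, action in timeline:
        counts.setdefault(activity, Counter())[action] += 1
    return {act: c.most_common(1)[0][0] for act, c in counts.items()}

types = parameter_types(timeline)
rules = default_rules(timeline)
```

A real corpus of 20-50 hours would of course yield many more activity types and probabilistic rather than single-valued rules.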
- Figure 8 is a block diagram of apparatus for generating and editing action impersonation parameters for an avatar 5 of a particular person.
- Action impersonation parameters 332 may be set manually by providing input from the user 17 into the action impersonation generator/editor 335.
- the user 17 may be the particular person whose avatar it is or someone else such as a friend, a family member or an expert providing a service.
- Action impersonation parameters 332 may be edited manually by providing input from the user 17 into the action impersonation generator/editor 335.
- Individual action impersonation parameter setting in the action impersonation generator/editor 335 may be by manual selection by the user 17 of a number of high-level visual alternatives for each individual action impersonation parameter such as walking style and by entry by the user 17 of data such as whether a particular gesture is typically used.
- An alternative high-level way of setting action impersonation parameters quickly is to choose between pre-set action impersonation parameter sets according to culture.
- the user may choose between cultural characteristics such as: Anglo-Saxon; Japanese; Hispanic; Italian.
- After personal action impersonation parameters 332 have been set in a high-level, generic way, they may be edited at a low level, where they can be fine-tuned to the way a person moves. For instance, a person may be hyperactive and use one characteristic gesture a lot but never use another gesture. By editing at a low level, the action impersonation parameters 332 may be refined such that a user who knows the person whose avatar it is thinks that the animation is both realistic and typical of that person.
- the user 17 makes selections from a number of choices at a high level.
- the user edits those selections at a lower level.
- a video camera 29 may make video recordings 336 of a person carrying out a number of pre-defined actions.
- the action impersonation generator/editor 335 may automatically set the action impersonation parameters by automatic processing of the video recording. In this process, the emphasis is on replicating the particular person's style in actions that have different styles.
- the camera 29 may be mounted in a booth 18.
- video recordings 336 are made of a person carrying out a number of defined actions.
- the action impersonation generator/editor 335 automatically analyses the video recordings 336 to generate a set of action impersonation parameters 332.
- Action impersonation parameters may be set by a number of means in addition to those disclosed.
- videos can be made of a person carrying out a number of tasks and an expert may study the video and set the action impersonation parameters.
- the processes disclosed above for manually and automatically generating, setting and editing action impersonation parameters 332 define a number of methods by example. This aspect of the invention is not limited to the processes disclosed, but covers all processes for manually and automatically generating, setting and editing action impersonation parameters 332.
- Each avatar 5 has a unique avatar number 8.
- An avatar 5 may contain multiple visual avatar data including a photo-realistic avatar 238, a parameter avatar 232 and an animatable image avatar 382. When an avatar 5 is first created, it is allocated a unique avatar number 8. At any point thereafter, visual avatar data of different types may be added, deleted or edited.
- the password 9 when used together with the avatar number 8 gives the user 17 access to change the avatar 5 including other types of data such as personal data 341.
- the display permission flag 259 if set by a user 17 with a password 9 and avatar number 8 gives permission to all other users 17 to use the avatar 5 for viewing purposes such as in a displayed avatar user interface 260 without need of the password 9.
- Access permissions are not limited in this invention to the password 9 and the display permission flag 259. A range of access permissions may be created for access to different types of data by different users.
- Avatar Hosting Server
- Figure 9 is a schematic diagram of an avatar hosting server system.
- the avatar hosting server 4 contains a database 6, avatar hosting management software 229, and avatars 5.
- each avatar 5 has a unique avatar number 8 and a password 9.
- the avatar hosting server 4 may also contain one or both of billing software 237 and avatar generation software 222.
- When the avatar hosting management software 229 on the avatar hosting server 4 receives a request 7 over the network 2 from a personal computer 3 for an avatar 5, the avatar hosting management software 229 will check with the database 6 to see if the request is accompanied by a valid avatar number 8 and password 9. If the request 7 is valid, then the avatar hosting management software 229 will send the requisite avatar 5 to the personal computer 3 in such a form that it can be changed. If the request 7 is not accompanied by a valid password 9, then the avatar hosting management software 229 will check to see if the display permission flag 259 is set for the avatar 5 with avatar number 8.
- the avatar hosting management software 229 will send the requisite avatar 5 to the personal computer 3 in such a form that the avatar 5 can only be displayed and cannot be changed. If the request 7 is not accompanied by a valid password 9 and the display permission flag 259 is not set for the avatar 5 with avatar number 8, then the avatar hosting management software 229 will not send the requisite avatar 5.
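The password and display-permission rules described above can be sketched as a single access-check function. The database is modelled as a dict keyed by avatar number, and the field names are assumptions for illustration only.

```python
# Stand-in for database 6 on the avatar hosting server; keys are avatar
# numbers 8, and field names are illustrative assumptions.
AVATAR_DB = {
    "0001-0042": {"password": "secret", "display_permission": True,  "avatar": "Ted"},
    "0001-0043": {"password": "other",  "display_permission": False, "avatar": "Jill"},
}

def handle_request(avatar_number, password=None):
    """Return (avatar, editable) or None, applying the rules of the avatar
    hosting management software: a valid password grants changeable access;
    otherwise the display permission flag grants view-only access."""
    record = AVATAR_DB.get(avatar_number)
    if record is None:
        return None
    if password == record["password"]:
        return record["avatar"], True          # full access: avatar may be changed
    if record["display_permission"]:
        return record["avatar"], False         # view-only: display, not change
    return None                                # no valid password, flag not set

full = handle_request("0001-0042", "secret")
view = handle_request("0001-0042")
denied = handle_request("0001-0043")
```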
- Photo-realistic avatars 238 of people are generated and edited from digital images 19 of people, usually taken from several sides of the person using a camera 221 using generation software 222 and avatar editing software 234 in an Avatar Generator Editor (AGE) 235.
- the generation management software 236 usually takes the images 19 of the person, taken with a camera 221, and generates a photo-realistic avatar 238.
- the special avatar generation apparatus 18 usually contains means for regulating the quality of the images 19 that reduces or eliminates the need for skilled processing of the images 19 before they enter the AGE software.
- Such regulation means usually include fixed camera settings, controlled lighting levels, and a background and floor of uniform colour and shape such as a chroma green sheet, but the apparatus neither has to include these regulation means nor is limited to them.
- any camera 221 can be used to take images 19 of the person in a largely unregulated way. These images can be transferred to a personal computer 3 on which AGE software 235 is resident. Alternatively, the images 19 can be sent over the network 2 to the avatar hosting server 4 on which there is also generation software 222 that automatically generates a photo-realistic avatar 238 without any user intervention. Alternatively, the images 19 can be sent over the network 2 to an avatar generation service 223 that uses an AGE 235.
- the automatic generation of an avatar or a parameter avatar generates an imprecise avatar.
- the avatar generated may not at first be pleasing to the user, in the same way that photographic images of a person are often not pleasing to the person. The user may think that the avatar does not represent himself or even his self-image.
- low-level: changing the avatar by manually touching up the 3D shape, textures, texture coordinates and joint positions
- high-level: changing the avatar by interactively adjusting avatar parameters from which the avatar is regenerated
- a peer to peer avatar serving system can be used.
- an avatar hosting server 4 is not required and the user's avatar 5 that is resident in local storage 274 on his personal computer 3 can be sent to all other participants' personal computers 3 directly over the network 2.
- FIG 10 is a schematic diagram of an avatar number 8.
- the avatar number 8 comprises two parts: an avatar hosting service identity number AHS-ID 224 and an avatar identity number A-ID 225. If there are multiple avatar hosting servers 4 on the network 2, then each avatar hosting server 4 has an avatar hosting service identity AHS-ID 224. There is an avatar hosting registry server AHR 226 on the network 2 run by AHR management software 227 stored in memory 347. When a personal computer 3 needs an avatar 5, it takes the avatar hosting service identity AHS-ID 224 and sends it to the AHR management software 227 to request the location of the avatar hosting server 4 corresponding to the AHS-ID 224 on which the avatar 5 is stored.
- Each avatar identity number 225 for a particular AHS-ID 224 is unique.
- the personal computer 3 contacts the avatar hosting management software 229 on the correct avatar hosting server 4 with the location provided by the AHR management software 227 and retrieves the avatar 5 using the AHS-ID 224.
- It is a purpose of this first embodiment to disclose a process for retrieving an avatar comprising the following steps: the user provides an avatar number and password; a computing appliance sends the avatar number and password to the network location of an avatar hosting service; avatar hosting server management software on the avatar hosting service checks a database to verify that the avatar number and password are valid; if the avatar number and password are valid, then the avatar hosting server management software on the avatar hosting service sends the avatar to the computing appliance.
- It is a further purpose of this first embodiment to disclose a process for retrieving an avatar using an avatar hosting registry server comprising the following steps: the user provides an avatar number and password; a computing appliance sends an avatar hosting service identity number to an avatar hosting registry server; the avatar hosting registry server sends to the computing appliance the network location of the avatar hosting service corresponding to the avatar hosting service identity number; the computing appliance sends the avatar number and password to the network location of the avatar hosting service; avatar hosting server management software on the avatar hosting service checks a database to verify that the avatar number and password are valid; if the avatar number and password are valid, then the avatar hosting server management software on the avatar hosting service sends the avatar to the computing appliance.
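The registry-based retrieval steps above can be sketched as follows. The "AHSID-AID" string format for the avatar number is an assumption made for illustration, and plain dicts stand in for the networked registry and hosting servers.

```python
# Stand-in for the avatar hosting registry server AHR 226:
# maps an AHS-ID 224 to the network location of a hosting server 4.
REGISTRY = {"0001": "ahs1.example.net"}

# Stand-in for the avatar hosting servers, keyed by network location;
# each holds avatars keyed by (A-ID 225, password 9).
HOSTING_SERVERS = {
    "ahs1.example.net": {("0042", "secret"): "Ted's avatar"},
}

def retrieve_avatar(avatar_number, password):
    """Follow the disclosed steps: split the avatar number into AHS-ID and
    A-ID, ask the registry for the hosting server's location, then request
    the avatar with the A-ID and password."""
    ahs_id, a_id = avatar_number.split("-")     # AHS-ID 224, A-ID 225
    location = REGISTRY[ahs_id]                 # step: query the AHR server
    server = HOSTING_SERVERS[location]          # step: contact hosting service
    return server.get((a_id, password))         # valid -> avatar, else None

avatar = retrieve_avatar("0001-0042", "secret")
rejected = retrieve_avatar("0001-0042", "wrong")
```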
- This invention is not limited to this one way of designing an avatar number 8 but includes all other ways of designing an avatar number 8 such that the avatar 5 with avatar number 8 may be located on one or more avatar servers.
- FIG 11 is a block diagram of a personal computer 3 with an avatar user interface 260 in an environmental location 273.
- the personal computer 3 includes a display device 264, a webcam 29, a headset 11 comprising a microphone 12 and headphones 13, a keyboard 14, a mouse 15 and a cabinet 16 running an operating system 20, which in this embodiment is the Microsoft Windows XP operating system. An avatar user interface software application 262 runs as a plug-in to the browser 263, in which the displayed avatar user interface 260 is seen by the user 17 in the browser window 21 on the desktop 423.
- the headset 11 is normally worn by the user 17 of the personal computer 3 during an avatar conference in such a way that the user 17 can hear through the headphones 13 and speak into the microphone 12.
- Each PC peripheral may be connected to the PC cabinet 16 by a wired or a wireless method; if it is a wireless method, the peripheral may contain a battery or be connected to a power source.
- During an avatar user interface session (an avatar conference call), those participating in the session will communicate via information flowing between the personal computers 3 and the session server 1.
- This information can be in different media formats including: voice, music, video, avatar animation, 3D models, presentation images, text, office application sharing, spreadsheets, word processor documents and whiteboard annotation.
- Session server arrangement
- the session server 1 may be resident on the network 2 in a server- client network design.
- the session server functionality may be resident on a personal computer 3 in a peer to peer network design.
- the personal computers 3 of the users 17, with session server functionality resident on at least one personal computer 3, are sufficient to use the avatar user interface system 261 over the network 2 without a separate session server 1.
- Figure 12 is a diagrammatic representation of avatar user interface functionality in a conference application.
- the personal computer 3 is running a personal computer operating system user interface 20 which is visible in the display device 264 as a desktop 423.
- the personal computer 3 is also running a network browser which is visible in the display device 264 as a browser window 21 and which in this embodiment is the Microsoft Internet Explorer browser Version 6.
- the personal computer 3 is connected over the network to the session server 1 via the browser window 21.
- the Uniform Resource Locator (URL) 22 active in the browser window 21 points to the session server 1.
- the avatar session user interface 10 comprises a large conference window 23, two smaller conference windows 24, 25 and one or more interaction windows 26.
- the large conference window 23 has control buttons 27; these buttons change depending on which media is being shown in the large conference window 23.
- An interaction window 26 has mode buttons 28.
- the user interface may be 'always on' for the user 17 to speak.
- a button 272 is depressed by the user 17 when speaking and is acknowledged with the button 272 changing colour to show that the microphone is live.
- the button may also be activated by pushing a key on the keyboard 14.
- the large conference window 23 is used to show whichever media is in use and requires the maximum resolution.
- the two smaller conference windows 24, 25 are for two other media formats.
- the interaction windows 26 have several functions including: text chat, attendance list, address list, agenda and audio settings.
- the number of interaction windows 26 can be reduced by means of a window having several modes. In this embodiment there are two windows 26.
- the first window 26 is permanently dedicated to text chat.
- the second window 26 is controlled by mode buttons 28 for swapping between functions: attendance list, address list, agenda and audio settings.
- the three conference windows 23, 24, 25 may have the same aspect ratio or may have different aspect ratios depending on the system design.
- the three conference windows 23, 24, 25 show the three avatar conference media windows: the presentation, the whiteboard and the meeting room. The user may select the media window in one of the two small conference windows 24, 25 to go into the large conference window 23 and the media window currently in the large window swaps back into the small window vacated by the selected media window.
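The window-swap behaviour described above can be sketched in a few lines: selecting the media in a small conference window moves it to the large conference window 23, and the media currently in the large window takes the vacated slot.

```python
# Current assignment of the three avatar conference media to the three
# conference windows (slot names are illustrative).
windows = {"large": "meeting_room", "small_1": "presentation", "small_2": "whiteboard"}

def select(slot):
    """Swap the media in the selected small window with the media in the
    large window, as when the user clicks a small conference window."""
    windows[slot], windows["large"] = windows["large"], windows[slot]

# User selects the presentation: it goes into the large window and the
# meeting room swaps back into the vacated small window.
select("small_1")
```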
- Figure 13 is a block diagram of a presentation media window 30 during an avatar user interface session.
- the presentation media window 30 can show images, slides, video clips and other visual media such as Flash from Macromedia Inc (USA) or applications such as computer games.
- the presentation media window 30 is controlled by the user using the control buttons 31 - 35, when it is in the large window, but cannot be operated when it is in a smaller window.
- Button 31 returns the presentation to the first slide.
- Button 32 moves back one slide.
- Button 33 moves forward one slide.
- Button 34 goes to the last slide in the presentation.
- Button 35 toggles between local control of the presentation and presenter control of the presentation.
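The navigation state driven by control buttons 31-35 can be sketched as a small class; the class and method names are illustrative assumptions.

```python
class Presentation:
    """Minimal sketch of presentation media window navigation state."""

    def __init__(self, n_slides):
        self.n = n_slides
        self.current = 0              # zero-based slide index
        self.local_control = True     # toggled by button 35

    def first(self):                  # button 31: return to the first slide
        self.current = 0

    def back(self):                   # button 32: move back one slide
        self.current = max(0, self.current - 1)

    def forward(self):                # button 33: move forward one slide
        self.current = min(self.n - 1, self.current + 1)

    def last(self):                   # button 34: go to the last slide
        self.current = self.n - 1

    def toggle_control(self):         # button 35: local vs presenter control
        self.local_control = not self.local_control

p = Presentation(10)
p.forward(); p.forward(); p.back()    # ends on the second slide (index 1)
```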
- Figure 14 is a block diagram of a whiteboard media window 40 during an avatar user interface session.
- the whiteboard 40 is controlled by sets of control buttons 41 - 43 when it is in the large window but cannot be operated when it is in a smaller window.
- the session server 1 maintains the whiteboard content as being identical on all client personal computers 3.
- the whiteboard consists of multiple pages on which content can be created or pasted.
- the analogy is that of a flip-chart which has multiple pages.
- the set of control buttons 41 are similar in function to buttons 31 to 35 in the presentation window. They control which of the whiteboard pages is displayed. There can be local control of the whiteboard pages or control can be handed to the presenter by means of a mode toggle key.
- the set of control buttons 42 presents a palette of colours for the person creating content to choose from. This is similar to the Microsoft Paint application.
- the set of control buttons 43 presents a collection of tools for creating content. Examples include text mode, line drawing mode and rubout mode. These tools are similar to the Microsoft Paint application.
- Figure 15 is a representation of an example of a meeting room media window 50 during an avatar user interface session.
- Each participant in the avatar user interface session is represented by their avatar 5 sitting around a meeting table 51.
- In the background is a screen 53 on which presentation slides are displayed, a whiteboard 54 which can be written on by the participants and the room comprising walls 55, ceiling 56, floor 57, door 58 with a door handle 59 and a windowpane 60.
- the avatars 5 shown in the meeting room media window 50 are labelled Ted, Jill, Andy and Pam.
- the avatar 5 labelled Pam is using a mobile phone 79.
- the avatar of Bert is not shown in Figure 15.
- Bert is viewing the meeting room media window 50 and is the fifth participant on the session. Bert does not see an avatar of himself.
- There may be other items in the room such as plants, sky visible through the windowpane 60, birds flying in the sky and trees visible through the windowpane 60.
- the user 17 arranging the conference may select from several designs of meeting room 50 offered by an avatar conference service provider.
- a selected meeting room 50 may be informal or formal. It may be large or small. It may be designed to suit a particular culture eg Japanese.
- buttons 45-48 control the mode 84 in which the meeting room media window operates.
- Button 45 selects mode M1.
- Button 46 selects mode M2.
- Button 47 selects mode M3.
- Button 48 selects mode M4.
- the layout button 85 controls the layout for modes in which the layout is an option.
- the meeting room media window in an avatar user interface session is useful to different people at different times: if you have never physically met a person who is on the session, it is usually interesting to see their avatar and discover what they look like.
- each window is unrelated to the others
- the patterns of gaze of the participants as seen in the monitors are disjointed; each participant tends to look in a different direction. This is at its worst in desktop video conferencing, where webcams sit on top of personal computer monitors and each participant looks at the monitor rather than at the webcam. This is unlike a real meeting, in which the gaze of each participant has a function and there is a cohesive whole.
- Figures 16a, 16b, 16c and 16d are schematic diagrams to illustrate the virtual camera positions in the virtual video conference. It is a plan view. Cameras 61, 62, 63 and 64 view avatars 5 labelled Ted, Jill, Andy and Pam respectively. Behind avatars 5 are four backgrounds 65, 66, 67 and 68.
- Figures 17a, 17b and 17c are schematics of three possible layouts in the meeting room media window 50.
- Layout 1 shows the avatars 5 in a virtual room 69 sitting around a virtual table 51.
- Layout 2 shows the avatars 5 in a straight line arrangement.
- Layout 3 shows the avatars 5 in a split screen arrangement.
- the backgrounds 65, 66, 67 and 68 may be identical, similar or completely different depending on what works best for the selected Layout 1, 2 or 3.
- the layout is selected using layout button 85.
- the Meeting Room media window 50 of the avatar conference is a metaphor for an actual meeting that is being video-cast live.
- An example might be a group discussion broadcast from a television studio.
- With photo-realistic 3D avatars, a photo-realistic 3D meeting room, anima-realistic animations of the avatars and good camera direction, it is possible to suspend the disbelief of the viewer of the session such that he thinks it is an actual meeting where he is the only person who is not in the room.
- the objective is for the enactment to be so realistic that the viewer finds it hard to tell the difference between the avatar conference and a live video of the actual meeting room.
- Figure 18 is a plan view of the virtual meeting room illustrating possible virtual camera positions.
- Camera 71 is the overview camera and will show the view illustrated in Figure 15.
- Camera 71 is positioned at the eye position of the Avatar called Bert who is seeing the Meeting room media window 50 in Figure 15 on his personal computer 3.
- Cameras 72, 73, 74 and 75 view avatars 5 labelled Ted, Jill, Andy and Pam respectively.
- Camera 76 shows the presentation screen 53.
- Camera 77 shows the whiteboard 54.
- Other cameras may be positioned at any location and oriented at any orientation.
- In each mode, the view presented is from a virtual camera position.
- a virtual camera can have camera controls such as zoom and pan in addition to spatial movement.
- Figure 19 is a set of four timelines of the camera shots during the avatar conference for each Mode.
- In Mode M1, by way of example, there is only one shot S1, which lasts for the duration of the avatar conference and is shot from Camera 71.
- In Mode M2, by way of example, the first shot S10 is from Camera 71 and is an overview similar to that in Figure 15. This is followed by shot S11 from Camera 72, which shows Ted. This is followed by shot S12 from Camera 76, which shows the presentation screen.
- the avatar conference timeline continues until the last shot S17 from Camera 71.
- In Mode M3, by way of example, there is only one shot S20, which lasts for the duration of the avatar conference and is in Layout 1 using Cameras 61, 62, 63 and 64.
- In Mode M4, by way of example, the first shot S30 is in Layout 1. This is followed by shot S31 from Camera 61, which shows Ted against background 65. This is followed by shot S32 from Camera 76, which shows the presentation screen. The avatar conference timeline continues until the last shot S37 in Layout 1.
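A mode's timeline of camera shots, such as the Mode M2 example above, can be modelled as an ordered list of (shot, camera) pairs. This is an illustrative sketch only; the intermediate shots between S12 and S17 are omitted as in the description.

```python
# Mode M2 timeline sketch: shot identifiers and cameras follow the figure.
MODE_M2 = [
    ("S10", "Camera 71 (overview)"),
    ("S11", "Camera 72 (Ted)"),
    ("S12", "Camera 76 (presentation screen)"),
    ("S17", "Camera 71 (overview)"),
]

def camera_for_shot(timeline, shot_id):
    """Look up which virtual camera a given shot is taken from."""
    return dict(timeline)[shot_id]

cam = camera_for_shot(MODE_M2, "S12")
```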
- This mode M1 uses the Meeting Room metaphor. It shows an overview from a single virtual camera of the table 51, all the avatars 5 around it, the whiteboard 54 and the presentation screen 53. There are no other cameras.
- the viewer's avatar is not present. If the viewer's avatar were present, the viewer would see his own avatar animating, and in particular lip syncing whilst he talks; the effect would be like a mirror that reflects actions you do not make. Seeing your own avatar breaks the metaphor and reduces the copresence felt by the viewer.
- the camera viewpoint can be from where the viewer could be sitting at the table or any other viewpoint that 'misses' the viewer's avatar.
- This mode M2 uses the Meeting Room metaphor. Multiple cameras are used but there is only one window. The result is like a televised chat show with cuts from one camera to another as the chat develops.
- This mode M3 uses the Video Conference metaphor but improves on it to partially overcome the drawbacks of a real video conference.
- the Meeting Room media window 50 is laid out in sections and shows one participant's avatar in each section of the window.
- one of the three layouts in Figures 17a, 17b and 17c can be chosen by the user by toggling button 85.
- the gaze direction of the avatars can be controlled to overcome the drawback of a video conference or desktop conference in which the gaze direction of the avatars is disconcerting to the user.
- Layout 1 helps to give a sense of cohesive space for the video conference in that the layout is enhanced with a virtual room 69 which can include all items shown in Figure 15 including a virtual table 51.
- Layout 2 goes half way to providing a sense of cohesive 3D space by putting the avatars in a line but does not include a virtual room and virtual table.
- Layout 3 is a split screen layout that maximises the display resolution per participant and is useful where there are a large number of participants in the avatar conference.
- This Mode M4 uses the Video Conference metaphor. Multiple cameras are used but there is only one window. The result is like a televised multi-location show with each participant in a different location with cuts from one camera to another as the chat develops.
- the avatar user interface system is used in a variety of ways. The following is a list of collaborative meeting activities and the percentages are an indication of the % of meeting time devoted to each activity type when averaged over a wide variety of meeting types.
- Multi-tasking with non-meeting activity eg reading, doing e-mail
- Designers may discuss a 3D object. Advertising people may listen to radio adverts or view prototype packaging images, video clips of TV adverts. Businessmen may view a slide presentation. Salesmen may present new products. Students may take part in an e-learning course led by a tutor or they may work collaboratively together.
- Figure 12 is just one example of an avatar conference display. This invention is not limited to the one example shown in Figure 12.
- the avatar conference is a series of events.
- the events are largely un-scripted, although there is often an agenda and a Chairman whose objective is to ensure that the meeting follows the agenda.
- the following events are listed by way of example only and do not form a comprehensive list of all events that can take place in an avatar conference:
- Figure 20 is a block diagram of a software director 80 and an avatar player engine 210.
- the flow of events 81 into a software director finite state machine 80 is shown with the resulting flow of camera shots 82, light settings 214 and actions 83 such as avatar animations into an avatar player engine 210.
- the avatar player engine 210 also uses at least one avatar 5, the scene 211, props 215 and the lighting model 212 to combine with the shots 82, light settings 214 and actions 83 to generate and display the avatar conference on the avatar session user interface 10.
- a 3D graphics processor chip 213 is often used in the personal computer 3.
- the avatar conference can be enacted with each event being acted out by an avatar.
- the enactment of the avatar conference can be shown from multiple camera viewpoints and camera movements such as translation, zoom and pan.
- a software director 80, which is a finite state machine, directs the enactment and visualisation of the meeting in the avatar conference media window by reacting to the events 81 as they occur during the meeting.
- the software director 80 takes into account the mode 84 and layout 85.
- a library of actions 87 is available.
- An action generator 88 is available. These actions are animations for avatars. Action impersonation parameters 332 from at least one avatar 5 are available.
- timers 86 are started after some actions and new events are triggered by timers 86 expiring.
- the software director finite state machine 80 is effectively a software agent that initiates actions triggered by events according to rules.
- In a constrained activity such as an avatar conference, it is quite feasible to completely define all the events, all the actions and the set of rules for actions being generated by events.
- When generating actions 83 for an avatar 5, the software director 80 takes into account the action impersonation parameters 332 of that avatar 5. In this way, the actions 83 generated for that avatar 5 can be more characteristic of the user 17 that the avatar 5 represents.
- the software director 80 will generate actions 83 involving a lot of arm movement.
- if the action impersonation parameter 332 for lip synchronisation whilst talking 406 indicates very little lip movement whilst talking,
- then the software director 80 will generate actions 83 for lip synchronisation involving very little lip movement.
- Rules for the five other disclosed action impersonation parameters 332 [400, 401, 402, 403 and 405] may be drawn up in a similar way and for any other action impersonation parameters 332 that are defined and used.
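How action generation may be conditioned on an avatar's action impersonation parameters, as in the arm-movement and lip-sync examples above, can be sketched as follows. The parameter names, thresholds and output labels are assumptions made for illustration.

```python
def generate_talk_action(params):
    """Generate a talking action 83 whose style reflects the avatar's action
    impersonation parameters 332 (names and scales are illustrative)."""
    return {
        # a high gesturing value yields actions with a lot of arm movement
        "arm_movement": "large" if params.get("gesturing", 0.5) > 0.7 else "small",
        # a low lip-sync value yields very little lip movement whilst talking
        "lip_movement": "slight" if params.get("lip_sync", 0.5) < 0.3 else "full",
    }

# An expressive gesturer who barely moves their lips when speaking.
expressive = generate_talk_action({"gesturing": 0.9, "lip_sync": 0.1})
reserved = generate_talk_action({"gesturing": 0.2, "lip_sync": 0.8})
```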
- the scene 211 is typically that of a room as illustrated in Figure 15. Each item in the scene is modelled in 3D. To achieve a close to video experience that encourages a sense of presence, each item is made of photo-realistic textures as well as a 3D topology. Props 215 are 3D items in the scene that can be moved by the avatars or under self-power. Props 215 are modelled in a similar way to the scene.
- a lighting model 212 is used.
- the light levels 214 of the lights in the lighting model 212 can be changed by the software director 80 in reaction to events during the avatar conference.
- the visual aspect of the avatar conference is a collection of 3D content including multiple avatars, a scene, props and a lighting model. If rendering effects such as shadows are required, the complexity increases. This can provide a large load on the personal computer 3. More and more often, a powerful 3D graphics processing chip 213 is built into the personal computer 3. In this way, it is possible for the avatar conference to achieve an acceptable frame rate such as 15-25 frames per second.
- Figure 21 is a block diagram of events on a personal computer 3 and a session server 1. It illustrates the event accumulator 89 on the session server 1 that gathers events 81 and sends the accumulated events 81 to the software director 80 on each personal computer 3 via a network 2. A software director 80 can also generate events 81 and send them to the event accumulator 89.
- the event accumulator 89 on the session server 1 receives events 81 from a variety of sources including:
- the session management software 228 manages one or more user interface sessions on the session server 1.
- Session and Hosting Payment
- billing software 237 on the avatar hosting server 4 monitors aspects of the use of avatars such as the number of avatars hosted for a customer and arranges billing according to the revenue model agreed with the customer.
- the billing software 237 is not limited to the functionality described above.
- the billing software 237 could monitor other aspects of the sessions, it could apply different revenue models to different customers, it could use micro-payments for immediate debiting during use, it could combine billing for sessions, billing for avatar hosting, billing for other services and it could be resident on any computer or server.
- Figures 22a, 22b, 22c, 22d and 22e are schematics of the five seating plans viewed by the five participants in the avatar conference in their five meeting room media windows 50.
- the table 51 and the presentation screen 53 are the same in each view.
- Each participant's avatar is abbreviated to the first letter of its name: B for Bert, T for Ted, J for Jill, A for Andy and P for Pam.
- In each of the five figures, one of the labelled seats is that of the avatar of the viewer.
- the seating plan is rotated with reference to the presentation screen 53 for each of the five views.
- Each meeting room arrangement is therefore different. Other seating arrangement rules can be drawn up.
- each meeting room window 50 displayed on each Personal Computer 3 can show a different representation of the virtual meeting room and a different enactment of the meeting.
- Figure 23 is a schematic of the audio mixer 90. It illustrates the audio mixer 90 that is part of the session server 1 and includes a balance system 204 and a filter system 205.
- N audio input streams 91 arrive at the session server 1 from the personal computers 3 over the network 2.
- One audio input stream 91 arrives from each personal computer 3.
- one or more audio input streams 92 might be available; audio input streams 92 can be generated from playing a media object during the avatar conference such as an audio or video clip or as streaming media channels coming in over the network 2; an audio input stream 92 might be voice, music, radio, TV or any other audio stream.
- N audio output streams 93 are generated by the audio mixer 90 and sent to the N personal computers 3 over the network.
- the audio mixer 90 is a finite state machine that follows one main rule in the case of a conference where there is a single conversation common to all participants: the audio output stream 93 going to a personal computer 3 is a mix of one media object audio stream 92 and all the input audio streams 91 from the other personal computers except for the one coming from that personal computer. Audio mixing in the audio mixer 90 is digital and, as will be clear to those skilled in the art, is carried out by combining synchronised time segments such that the real time of each input segment from each participant is the same.
- the audio mixer is also able to carry out an amplitude balancing function using the balance system 204, reducing the amplitude of loud audio streams and increasing the amplitude of quiet audio streams before mixing. In this way participants do not need to concentrate hard to hear quieter participants and are not shocked by louder participants.
- the audio mixer is also able to carry out a filtering function using the filter system 205, filtering the input audio streams 91 to reduce annoying sound artefacts generated by the mixing process or by lags in the network 2. In this way participants enjoy a cleaner and higher quality audio experience during the conference.
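The single-conversation mixing rule and the amplitude balancing described above can be sketched as follows (a minimal illustration: the function names, the target RMS level and the list-of-samples representation are assumptions of this sketch, not part of the disclosure):

```python
# Sketch of the per-participant mixing rule and balance system 204.
# Streams are lists of PCM samples covering one synchronised time segment.

def balance(stream, target_rms=0.2):
    """Boost quiet segments and attenuate loud ones towards a common
    RMS level before mixing (assumed target level)."""
    rms = (sum(s * s for s in stream) / len(stream)) ** 0.5
    if rms == 0.0:
        return list(stream)
    gain = target_rms / rms
    return [s * gain for s in stream]

def mix_for_participant(i, participant_streams, media_stream=None):
    """Output stream 93 for participant i: one media object stream 92
    (if any) plus every other participant's balanced input stream 91,
    excluding the participant's own voice."""
    length = len(participant_streams[i])
    mixed = list(media_stream) if media_stream else [0.0] * length
    for j, stream in enumerate(participant_streams):
        if j == i:
            continue  # never feed a participant's own voice back to him
        for k, sample in enumerate(balance(stream)):
            mixed[k] += sample
    return mixed
```

The exclusion of the participant's own stream avoids the distracting echo of hearing one's own voice delayed by the network round trip.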
- Visual feedback in the meeting room media window 50 can be provided to participants showing who is whispering and who has split into a smaller group.
- a simple way is for the avatars 5 of those whispering to automatically get up and move to the back of the room where they can be seen chatting together by others (but not heard).
- the same approach of forming a standing group can be used for small groups.
- another meeting room can be used. So as not to lose visual continuity, the additional meeting room can be situated behind the wall 55 which can be made of glass like a large window 60 and the avatars in the additional meeting room can be visible through the glass wall 55.
- this can be represented by his avatar 5 using a mobile phone 79. This conveys to the other participants that a user 17 whose avatar 5 is holding a mobile phone 79 does not have his full attention on the conference.
- Figure 24 is a schematic of the audio mixer 90 for multiple conversations. It illustrates the audio mixer 90 that is part of the session server 1 when more than one conversation is taking place simultaneously during the conference.
- Conversation1 201 in the CONV1 mixer 94 uses the input and output streams 1, 2 and 3.
- Conversation2 202 in the CONV2 mixer 95 uses the input and output streams 4 and 5.
- Conversation3 203 in the CONV3 mixer 96 uses the input and output streams 6, 7 and 8.
- the mixed output 97 of mixer CONV1 94 is also fed into the CONV2 mixer 95.
- the CONV2 mixer 95 is set up to combine conversation1 and conversation2 such that the output streams 4 and 5 include both conversation1 201 and conversation2 202 but the output streams 1, 2 and 3 do not include any element of conversation2 202.
- the audio mixer 90 can be configured to support two or more conversations simultaneously. In addition, it is possible to combine the main conference conversation with whispering such that two conversations can be heard simultaneously.
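The multi-conversation routing of Figure 24 can be sketched as follows (a hypothetical illustration; the data structures and function names are assumptions): each conversation mixes its own members' streams, and a one-way feed such as CONV1 into CONV2 lets conversation2's members also hear conversation1 without the reverse.

```python
# Per-participant levels are simplified to single numbers for clarity.

def build_outputs(inputs, conversations, feeds):
    """inputs: {participant: level}; conversations: {mixer name: set of
    member participants}; feeds: {target mixer: [source mixers fed in]}.
    Returns the level each participant hears, own voice excluded."""
    conv_mix = {name: sum(inputs[p] for p in members)
                for name, members in conversations.items()}
    outputs = {}
    for name, members in conversations.items():
        fed_in = sum(conv_mix[src] for src in feeds.get(name, []))
        for p in members:
            # own conversation minus own voice, plus any fed-in conversations
            outputs[p] = conv_mix[name] - inputs[p] + fed_in
    return outputs
```

With feeds = {'CONV2': ['CONV1']}, participants 4 and 5 hear both conversations while participants 1 to 3 hear only their own, matching the asymmetry described above.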
- Figure 25 is a block diagram of a Lip Sync Generator (LSG) 100 in which the microphone 12 receives voice 270 from a user 17 and background noise 271 from an environmental location 273.
- the resulting analogue audio stream 103 generated by the microphone 12 is processed by a standard sound card 102 such as a Sound Blaster from Creative Technologies Inc (USA) that is in the personal computer 3.
- the digital output from the sound card 104 is input into the LSG 100 which first reduces the background noise 271 with a filter 205 and then outputs a stream of geometric positions 101.
- a digital audio transform stream 105 is output from the LSG 100.
- the digital audio transform stream 105 can also be the same as the input audio stream 91 to the audio mixer 90.
- a stream of events 81 is also output by the LSG 100 which travels over a network 2 to the event accumulator 89 on the session server 1.
- Figure 26 is a timeline of a lip sync generator. It illustrates that the processing in the LSG takes time and the output 101 lags the input 104 by time T milliseconds.
- Figures 27a, 27b, 27c and 27d are diagrammatic representations of four geometric values for four lip sync animation types that can be used to animate a talking head 111 with a mouth 112.
- the jaw rotation angle B is the angle between the jaw 107 and the upper teeth 106 and is the first geometric value that can be output from the LSG.
- the mouth length L is the distance between the two corners of the mouth 109.
- the lip rotation angle A is the angle between the angle of the teeth 106 and the angle of the lip 108.
- the tongue protrusion length P is the length of protrusion of the tongue 110 from its rearmost position.
- the microphone records sound from the person 17 speaking in the conference. Human voice is typically audible in the range 20 Hz to 20 kHz.
- the analogue signal 103 from the microphone 12 is processed to produce a digital audio stream 104 sampled at 16 kHz and 16 bits resolution by the sound card 102 in the personal computer. Sampling at 16 kHz and 8 bits was tried but the data was too sparse to allow the LSG to perform well in this particular avatar conference configuration.
- the output 101 from the LSG 100 is four real numbers, one for each geometric value of a lip sync animation type at a sample rate of 30 per second.
- Figure 28 is a flow diagram of the process followed by the LSG 100.
- the digital audio stream data 104 flows into a buffer 120.
- a discrete Fourier transform 121 is performed on the audio data accumulated in the buffer 120 and a spectrum 146 is output.
- the spectrum 146 comprises a finite number of bins representing frequency ranges with the degree to which each bin is filled defining the amplitude of that frequency range.
- a jaw rotation analyser 123 outputs a value representing the jaw angle 124.
- a mouth length analyser 125 outputs a value representing the mouth length 126.
- a lip rotation analyser 127 outputs a value representing the lip angle 128.
- a tongue protrusion analyser 129 outputs a value representing the tongue protrusion 130.
- One or more emotion analysers 135 output strengths of emotion 136.
- the stream of spectrums 146 generated is the audio transform stream 105.
- the combination of real numbers 124, 126, 128 and 130 in a stream is the geometric position stream 101 in which one or more strengths of emotion 136 are included.
- the audio transform stream is compressed 131 to produce a compressed audio stream 132 and this is combined 133 with the geometric positions 101 to form a stream of packets 134 for transfer over the network 2 to the session server 1.
- the geometric values 124, 126, 128 and 130 for the four lip sync animation types are each normalised and output by the respective analysers 123, 125, 127 and 129.
- the LSG 100 operates at 62.5 Hz in that a discrete Fourier transform is performed on the digital audio stream data 104 accumulated during the previous 0.016 sec.
- the frequency spectrum is divided into 128 bins representing frequency ranges.
- the packets sent over the network are sent at a frequency of 30 Hz. Operation at rates in excess of 100 Hz was tried, but the LSG quality deteriorated due to a reduction in signal.
- These values are settings at which the LSG 100 works, but this invention is not limited to these precise settings and includes all settings that work for this process. Audio compression
- the audio compression 131 in which the stream of spectrums 105 is further compressed can be carried out by any of the compression-decompression routines known to experts in the field.
- Figure 29 is a flow diagram illustrating the steps involved in the passage of a sound from the microphone 12 on one personal computer 3, through the sound card 102, processed by the LSG 100, sent to the session server 1 over the network 2, buffered and mixed in the audio mixer 90, resent over the network 2 to another personal computer 3, buffered and decompressed 140 and played on the headphones 13 via the sound card 102.
- the geometric and audio information in the packets is for the same period of time; in other words there is no lag within the packet between the geometric and audio information.
- This has the advantage of perfect timing on the lip synchronisation on replay.
- Figure 30a is a spectrogram 145 and Figure 30b is a graphical diagram of a spectrum 146 from time t on the spectrogram 145.
- the spectrum 146 comprises a finite number of bins representing frequency ranges with the degree to which each bin is filled defining the amplitude of that frequency range.
- the spectrum 146 is segmented into just 7 bins corresponding to rows fl to f7. As already disclosed, the number of bins in a spectrum 146 is likely to be much higher.
- the row fl corresponds to the lowest frequencies collected and the row f7 corresponds to the highest frequencies collected, with f2-f6 covering frequency ranges in between.
- the amplitude of each bin is split into ranges al to a6.
- the range a6 is the largest range of maximum amplitudes.
- the spectrogram 145 can be depth encoded in discrete colours such that the square in row fl is coloured with a colour signifying amplitude al, the square in row f2 is coloured with a colour signifying amplitude a4 and so on for rows f3 to f7 of the spectrum 146.
- the amplitude is likely to be stored as a floating point number and only split into amplitude ranges for the purposes of visualisation on the colour spectrogram 145.
- the LSG 100 has a direct route between voice spectrum and the geometry output without attempting to go through intermediate concepts such as phonemes, visemes, diphones and co-articulation.
- LSG One requirement for the LSG is for it to scale to non-speech utterances such as singing and laughing. The need is for the avatar to visually represent those utterances in an acceptable manner. Another requirement for the LSG is for it to work with different people's voices and all languages. A further requirement is for the software code to be small enough to download from the session server 1 over a network 2 to the client personal computer 3 without too long a delay.
- the approach involved creating spectrograms of simple sounds and recording the corresponding facial geometry made whilst speaking those sounds.
- the spectrums in the spectrograms were then studied to look for patterns that could be transferred into heuristic algorithms. These algorithms were then installed in the jaw, mouth, lip and tongue analysers 123, 125, 127 and 129. Once the system was working for simple sounds with algorithms in place in the analysers, it was tested with more complex words and different voices. Whenever the facial animation was found to be unacceptable the algorithms were adjusted or new algorithms developed to improve the facial animation.
- the algorithm in the jaw rotation analyser 123 relates the output jaw angle to the energy in the high frequency bins. In general, whilst talking, the mouth opens further when making high frequency sounds than low frequency sounds. In the jaw rotation analyser 123, the higher the amplitude in the high frequency bins, the larger the jaw rotation and the more the mouth is open.
- the algorithm in the jaw rotation analyser 123 calculates a normalised average value 124 of the sum of the normalised amplitudes in the high frequency ranges f5, f6 and f7. This algorithm in the jaw rotation analyser 123 can be improved by setting a minimum level of mean normalised amplitude in the high frequency ranges f5, f6 and f7. If the actual mean normalised amplitude is not above this minimum level then the output value 124 is set to zero. This stops the mouth opening in response to low levels of background noise rather than speech.
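A minimal sketch of the jaw rotation algorithm just described, assuming the 7-bin illustration with f5 to f7 as indices 4 to 6 and an illustrative noise-floor value:

```python
# Jaw rotation analyser 123: the normalised output 124 is the mean
# normalised amplitude of the high-frequency bins, forced to zero below
# a minimum level so background noise does not open the mouth.

def jaw_rotation(spectrum, high_bins=(4, 5, 6), noise_floor=0.05):
    mean = sum(spectrum[i] for i in high_bins) / len(high_bins)
    if mean < noise_floor:
        return 0.0   # low-level background noise must not open the mouth
    return min(1.0, mean)
```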
- the algorithm in the mouth length analyser 125 works on frequency range. The wider the range of frequencies, the larger the length between the mouth corners 126.
- the standard deviation of the spectrum is calculated from the amplitudes in each bin in the spectrum.
- the mouth length 126 output by the mouth length analyser 125 is proportional to this standard deviation.
- the mouth length 126 is a normalised value from 0 to 1. Whistling is an extreme example in which the mouth length 126 is very short to make a small hole through which air is expelled at a focused frequency.
- the mouth length analyser 125 can handle whistling because the standard deviation of a whistling sound is very small and the output mouth length 126 is correspondingly small.
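One plausible reading of the mouth length algorithm is an amplitude-weighted standard deviation over the frequency bins, sketched below; both this interpretation and the normalisation by half the bin count are assumptions of the sketch:

```python
# Mouth length analyser 125: a focused sound such as whistling has a
# very small spread and so yields a small mouth length 126, while a
# broad-band sound yields a large one.

def mouth_length(spectrum):
    total = sum(spectrum)
    if total == 0:
        return 0.0
    mean = sum(i * a for i, a in enumerate(spectrum)) / total
    variance = sum(a * (i - mean) ** 2 for i, a in enumerate(spectrum)) / total
    # normalise the standard deviation into the range 0 to 1
    return min(1.0, (variance ** 0.5) / (len(spectrum) / 2))
```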
- the lip rotation analyser 127 looks for high amplitudes at particular frequencies.
- Lip rotations are associated with plosive sounds such as 's' or 't' that are in effect sudden bursts of energy at characteristic frequency range.
- Each plosive sound has a characteristic frequency bin or set of neighbouring bins. The higher the relative amplitude at one characteristic frequency, the larger the lip rotation.
- the lip rotation analyser 127 checks for high amplitude at one of these known sets of frequency bins relative to all the other frequency bins.
- the lip rotation 128 output by the lip rotation analyser 127 is proportional to the ratio between the average amplitude of the set of characteristic bins and the average amplitude of all the other frequency bins.
- the lip rotation 128 is a normalised value from 0 to 1.
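The ratio rule for lip rotation might be sketched as follows (the scaling divisor used to normalise into the 0 to 1 range, and the function names, are assumptions):

```python
# Lip rotation analyser 127: output 128 is proportional to the ratio of
# the average amplitude in the bins characteristic of a given sound to
# the average amplitude of all the other bins.

def lip_rotation(spectrum, characteristic_bins, scale=10.0):
    inside = [spectrum[i] for i in characteristic_bins]
    chosen = set(characteristic_bins)
    outside = [a for i, a in enumerate(spectrum) if i not in chosen]
    avg_out = sum(outside) / len(outside)
    if avg_out == 0:
        return 1.0 if sum(inside) > 0 else 0.0
    ratio = (sum(inside) / len(inside)) / avg_out
    return min(1.0, ratio / scale)
```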
- the tongue protrusion analyser 129 looks for characteristic sounds such as 'th' in which the tongue protrudes. The higher the amplitude of the characteristic sound, the more the tongue protrudes.
- Emotion detection It is useful to detect the emotion of a person from the person's voice. Once detected, the emotion can be used to modify the avatar's actions such that the avatar's visual behaviour matches the emotion conveyed by the audio. Some emotions engender large changes in body language and other emotions engender barely noticeable changes in body language. For a good avatar metaphor it is useful to detect emotions that engender large changes in body language.
- the simplest emotion to detect is the absence of speech over time. This can be detected by a special emotion analyser 135 designed to detect absence of speech that outputs a strength of speech 136. If the strength of speech 136 is zero, then there is no speech at that time t. If the strength of speech 136 is 1 then there is speech.
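A minimal sketch of such an absence-of-speech analyser; the energy threshold is an assumed tuning value, and a fuller version would also track how long the silence has lasted:

```python
# Special emotion analyser 135 for absence of speech: outputs the
# strength of speech 136 for one spectrum, 1 if there is speech energy
# at time t and 0 otherwise.

def strength_of_speech(spectrum, threshold=0.05):
    energy = sum(a * a for a in spectrum)
    return 1 if energy > threshold else 0
```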
- Laughing has a characteristic pattern that can be detected from speech. There is a regular pattern of sounds at a frequency of around 3-4 Hz along the time axis in the spectrogram 145 and characteristic high amplitude such as levels a4-a6 and a low frequency such as f5-f7 in the spectrum 146. Laughing can be detected by a special emotion analyser 135 designed to detect laughing. The strength of laughing 136 output is a normalised value in the range 0 to 1.
- Anger can be detected by an increase in amplitude. This is not always reliable, because, for example, moving the microphone 12 closer to the mouth of the user 17 may result in a significant increase in amplitude.
- the voice channels 91 can just be mixed and the users 17 will sort out the situation if several people speak at once; if necessary a chairman will be appointed to determine the next speaker.
- Microphones 12 often pick up background noise 271, particularly if the user 17 is in an open plan office. In an ideal world, all microphones would only pick up the voice of the user 270 and automatically filter out background noise 271. In many user environmental locations 273, background noise 271 can be at the same amplitude as voice 270 or even higher.
- the filter 205 in the LSG 100 plays an important role in reducing this background noise 271 before it reaches the LSG 100.
- it is difficult for the LSG 100 to know whether the audio stream is noise 271 or voice 270, even after filtering. In many cases, the LSG 100 generates a stream of geometric positions 101 from the digital audio stream 104 that is in fact just background noise 271.
- One simple way of eliminating the problem of identifying whether the audio stream 104 is voice 270 or background noise 271, is to request users 17 to turn off their microphones 12 when they are not speaking. This effect can also be implemented in a different way by requesting that the user 17 presses a 'Push to Speak' button 272 on the avatar session user interface 10 whilst he speaks. If several users 17 have their buttons 272 depressed at the same time, then the audio mixer 90 mixes all the active channels.
- the LSG 100 uses filtering and switching techniques as described above to more reliably generate events 81 that indicate whether a user 17 is speaking or not.
- the avatar conference is a client-server architecture.
- the LSG 100 runs on each personal computer client 3.
- the alternative was to run the LSG 100 on the session server 1. It is better in most instances to run the LSG 100 on the personal computer client 3 rather than the session server 1 because (a) this uses up less network bandwidth in that the data rate for the combined compressed audio and geometric values 134 is much less than that for the digital audio stream 104 and (b) the network architecture is more scalable for large conferences in that massive session server processing demands are avoided.
- the software code size for the LSG is around 20 kBytes. This has the advantage of being small compared to other approaches, which often require large dictionaries to be on the client personal computer 3, usually downloaded over the network 2 from the session server 1. Such a small software code size makes the LSG suitable for applications on small network devices such as mobile phones.
- a microphone means records sound from a user of a computing appliance means as the user speaks;
- a lip synchronisation generator means on the computing appliance means processes the sound to provide a combined audio and geometric position stream;
- the computing appliance means streams the combined audio and geometric position stream over the network to an audio mixer;
- the audio mixer mixes the combined audio and geometric position stream with any other combined audio and geometric position streams to produce a specific mixed audio and geometric position stream for each computing appliance;
- the audio mixer sends each computing appliance its specific mixed audio and geometric position stream;
- the computing appliance plays the specific mixed audio and geometric position stream to its user via a loudspeaker means.
- a lip synchronisation generator process comprising a process performed at regular intervals on a digital audio stream flowing into a buffer of the following steps: the contents of the buffer are copied and then the buffer is emptied; a discrete Fourier transform is performed on the copied contents of the buffer and a spectrum is output; one or more analysers analyse the output spectrum and each analyser outputs a value representing a geometric position of a part of a talking head.
- One role of the software director 80 is to decide and activate the cameras to form a sequence of shots.
- the camera shot shown in the meeting room media window depends on:
- the mode chosen 84
- the layout chosen 85
- the flow of events (historical and actual) 81
- the flow of shots (historical and actual) 82
- the flow of actions (historical and actual) 83
- timers 86
- random choice
- the cameras programmed (61-64, 71-77 etc)
- the rules for the shots can be very simple for some modes such as Mode Ml and fairly complex for modes such as M2.
- the person programming these rules has a large degree of freedom and is in effect building an expert system of an expert film director.
- the rules are improved with feedback from users during trials.
- In an avatar conference it is normal for different people to speak at different times. Since each person has his own microphone 12 and personal computer 3, it is known which avatar is associated with a voice stream. Events include a person stopping speaking and another person starting speaking. The camera shot is usually on the main speaker; if several people are speaking at once then a wide shot of all the participants can be shown.
- Another role of the software director 80 is to decide on the ambient and event animations of the avatars. This is equivalent to the director of a stage play defining every aspect of an actor's facial and body movement.
- the animation shown in the meeting room media window depends on at least some of:
- the mode chosen 84
- the layout chosen 85
- the flow of events (historical and actual) 81
- the flow of actions (historical and actual) 83
- timers 86
- random choice
- actions 83 available in a library 87
- action generator capabilities 88 as defined by action parameters 243
- Animation actions can be classified into four types:
- ambient animations (generated by the software director)
- event animations (generated by the software director)
- head/facial animated gestures (triggered by the user)
- hand/arm/body animated gestures (triggered by the user)
- Ambient animations An actual person is almost never still. Breathing, swaying, changing gaze, small head movements and many others are termed ambient animations. In a meeting, ambient animations depend on the role of the person and his culture. A speaker will usually move his hands and arms a lot. A listener will be less dynamic. Ambient animations are designed to be encouraging towards a good meeting atmosphere; listeners' faces can be seen to smile and look positive; heads can nod regularly as if in agreement or in understanding; body posture can be upright rather than slouched. Ambient animations are generated automatically by the software director.
- Event animations are the actions associated with an event.
- Examples of event animations include:
- a person entering the meeting room, walking to his chair, pulling the chair out, sitting on it and moving the chair nearer to the table
- the detection of emotion from the audio stream; for example, if a laugh is detected, the avatar can be animated as laughing
- a participant having been silent for longer than a certain period: actions associated with the participant not being involved in the meeting are adopted; a method might be a certain slouching in the chair that will convey visually to the other participants that this person is not much involved
- a participant being unable to see the meeting room media window 50 because he is viewing another document: his avatar can be seen reading a document
- a participant taking another session (call): his avatar can be seen using a mobile phone
- Event animations are generated automatically by the software director in response to an event .
- the software director automatically generates ambient and event animations.
- the software director 80 uses rules for controlling the gaze of the avatars based on observations of people in meetings.
- Gestures In a meeting, a participant often wishes to convey information by body language gestures.
- The gesture is sometimes purposeful, based on an active decision by the participant. Examples include:
- raising his hand to show he wants to ask a question
- clapping in applause
- waving to say hello
- Body language can also be passive, often without the participant being aware of the body language he is sending out. Examples include:
- shaking his head in disagreement with what is being said
- nodding in agreement with what is being said
- slumping in a chair, bored
- a participant could select a button in the user interface corresponding to the body language gesture he wishes to convey. Other participants looking at the meeting room media window will see the gesture. Both active and passive gestures could be used. Gestures can be particularly useful to the chairman of a meeting who can respond to a gesture in choosing the next person to speak.
- the software director generates animated gestures in response to an active user trigger.
- the software director 80 generates a flow of animations 83 for each avatar 5.
- the animations are retrieved from an action library 87 or are generated in real time from an action generator 88.
- Actions 83 in the action library 87 are fixed actions with a fixed duration and fixed movement. They are usually created by motion capture or by key frame animation. An example is raising a hand to wave.
- Actions 83 generated by the action generator 88 are variable actions that are generated in real-time to action parameters 243 specified by the software director 80.
- An example is asking the action generator 88 to generate a walking animation action 83 that follows a specified path across the meeting room floor.
- a possible set of action parameters 243 for this example are: the avatar number 8, the path specification, the walking style, the speed, starting conditions and end conditions.
- an action library 87 of all possible actions 83 can be compiled from motion capture of an actor or key-frame animation. In this case, an action generator 88 is not used.
- any action 83 for an avatar 5 during the conference can be chosen by the software director 80 either from an action library 87 or an action generator 88.
- the movement of an avatar can be defined as a set of joint positions at each time point or frame in the animation.
- the positions of each vertex on the skin or clothes of the avatar are determined from the joint positions and any weightings associating a vertex with each neighbouring joint.
- the main advantage of defining an animation as a series of sets of joint positions is that it is smaller than a series of sets of vertex positions.
- An avatar typically has 20-50 joints but thousands of vertices.
- a file with a set of joint positions stored for every 1/25 second is many times smaller than a similar file with vertex positions.
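The size advantage can be checked with simple arithmetic, assuming 3 floats of 4 bytes per joint or vertex position and 25 frames per second; the 35-joint and 5000-vertex figures are illustrative values within the ranges stated above:

```python
# Compare the stored size of a joint-based animation file against a
# vertex-based file for the same clip duration.

def animation_bytes(elements, seconds, fps=25, floats=3, float_bytes=4):
    return elements * floats * float_bytes * fps * seconds

joint_size = animation_bytes(35, seconds=10)     # 35 joints, 10 s clip
vertex_size = animation_bytes(5000, seconds=10)  # 5000 vertices, same clip
# the joint-based file is roughly 140 times smaller
```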
- Blending works well for joining two similar positions: this is known as a subtle blend. However, when the positions are radically different, the result can be completely incorrect. It is quite possible for arms and legs to pass through each other during a radical blend; this effect can be most annoying for the user.
- the software designer designing an avatar conference system must carefully define each action 83 in the library of actions 87 such that all possible actions 83 that the software director 80 selects to follow any given action 83 require only a subtle blend and not a radical blend.
- the main method used to achieve this is the adoption of a limited number of neutral positions, with each action 83 edited until it starts in one neutral position and stops in another neutral position.
- Animation merging More than one action 83 can be merged and played simultaneously to form a single merged action.
- Actions are one of two types: Dominant action Modifying action
- a dominant action is an action involving major displacements such as walking.
- a modifying action is an action involving minor displacements such as ambient actions and smiling.
- Each action 83 in the library has a defined action type: either a dominant action or a modifying action.
- the most common modifying action is facial animation. It is possible to merge three or more actions. But only one action in a merged action can be a dominant action. For instance, the walking dominant action can be defined with smiling and lip synchronisation.
- Modifying actions are applied to the dominant action one frame at a time.
- the modifying action is defined as a relative movement of joints.
- a modifying action is 'added' on top of a dominant action during animation.
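Frame-by-frame merging of a modifying action onto a dominant action can be sketched as follows; representing a pose as a dictionary of joint name to (x, y, z) values is an assumption of this illustration:

```python
# The modifying action is stored as relative joint movements and is
# added on top of the dominant action's absolute joint positions,
# one frame at a time.

def merge_frame(dominant, modifying):
    merged = dict(dominant)
    for joint, (dx, dy, dz) in modifying.items():
        x, y, z = merged[joint]
        merged[joint] = (x + dx, y + dy, z + dz)
    return merged

def merge_action(dominant_frames, modifying_frames):
    # only one dominant action is allowed in a merged action; further
    # modifying actions could be layered by repeating the merge
    return [merge_frame(d, m) for d, m in zip(dominant_frames, modifying_frames)]
```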
- Each avatar 5 of a particular person is a unique size. Some avatars may be short and fat, others may be tall and thin.
- an action 83 is created for the action library 87 it is created on an avatar of a particular size. If the creation means is motion capture, then the action 83 will play back best on an avatar with the same size and shape as the person whose motion is captured. Similarly, if a skilled animator creates an action 83 for an avatar of a particular size, it will play back best on an avatar with similar size and shape.
- the use of joint positions to define animations makes it possible for animations created on an avatar of a particular size, to be played back on avatars of different sizes. It is a further purpose of this embodiment that any action 83 can be played on an avatar 5 of a different size and shape from the avatar 5 for which the action 83 was created.
- This problem may be overcome with a commercially reasonable amount of effort by the simplifications of:
- morphing all avatars 5 to the same standard size and shape
- preparing all actions 83 for avatars of that standard size and shape in a defined virtual environment
- crafting the software director 80 state machine to generate series of actions that work without exhibiting poor motion artefacts
- the photo-realism of the avatars will be severely degraded if the avatars are all the same size and shape.
- Avatar size range Avatars are scaled across a design size range between a minimum and a maximum. Very small avatars are scaled up to a minimum size, very large avatars are scaled down to a maximum size and the rest are spread between. In this way, taller people have taller avatars than shorter people's avatars.
- the environment is designed to cope with avatars in the design size range.
- the software director 80 state machine is crafted to generate series of actions that work without exhibiting poor motion artefacts for avatars within the design size range.
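The scaling of people's heights into the avatar design size range can be sketched as follows; all boundary values (in metres) are illustrative assumptions:

```python
# Clamp a person's height into an assumed human range, then spread it
# linearly across the design size range so that taller people still get
# taller avatars than shorter people.

def design_height(person_height, person_min=1.4, person_max=2.1,
                  design_min=1.6, design_max=1.9):
    clamped = max(person_min, min(person_max, person_height))
    fraction = (clamped - person_min) / (person_max - person_min)
    return design_min + fraction * (design_max - design_min)
```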
- Adaptive action control Actions 83 such as sitting in a chair may be animated adaptively to avoid particular motion artefacts.
- Avatars of different heights and different posterior sizes might either float above the chair seat or break through it.
- Adapting the sitting down action 83 to the avatar size by raising or lowering the whole avatar during the sitting process solves this problem.
- the action 83 is probably generated by the action generator 88 based on action parameters
- Figure 31 is a block diagram of the session server 1 containing: an audio recording 185, an event accumulator 89, a speech recognition engine 182, voice profiles of participants 184, a text transcript 183, a translation engine 186, translated text 187, a text to speech engine 188, a voice profile 184 of the voice used in the text to speech engine 188, a text chat engine 189 and an e-mail engine 190.
- the software engines are running in memory 346.
- the session server 1 is connected over a network 2 to a speech recognition service 192, a text to speech service 193 and a translation service 191.
- the conference can be stored as a linear audio file 185 and a time- stamped event accumulator 89.
- Events include: a person enters the conference; a new speaker starts; a new agenda item is started.
- the audio recording 185 can be compressed in length to reduce the amount of time a person needs to spend listening to the audio recording. For example, periods of time in which there was no speech can be removed. Also, the time axis can be compressed such that playback takes less time than the original conference took.
- the playback speed, eg 125% of normal speed, can be controlled by the person listening.
- the person playing back the conference can also use the event accumulator 89 as key points at which to start listening to the recording. For example, if he is only interested in agenda item number 3 then he can skip to the point at which the chairman has noted that agenda item number three started.
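The event-indexed seeking and sped-up playback described above can be sketched as follows. The (timestamp, description) event format, the function names and the 125% figure are illustrative assumptions about how the event accumulator 89 might be used:

```python
def seek_offset(events, label):
    """Return the recording offset in seconds of the first event in the
    accumulator whose description contains the given label, or None."""
    for timestamp, description in events:
        if label in description:
            return timestamp
    return None

def playback_duration(recorded_seconds, speed=1.25):
    """Listening time for a recording played back faster than real time."""
    return recorded_seconds / speed

# Hypothetical accumulator contents for a short conference.
events = [(0.0, "conference start"),
          (310.0, "agenda item 2 started"),
          (925.0, "agenda item 3 started")]
```

A listener interested only in agenda item 3 would skip straight to 925 seconds, and an hour of recording played at 125% speed takes 48 minutes.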
- a text transcript 183 of the meeting, of high enough quality to be acceptable to users, can be automatically produced by a speech recognition engine 182 from the audio recording 185 and the event accumulator 89, using voice profiles 184 to improve the speech recognition.
- the text transcript 183 can be generated by the speech recognition engine 182 after the conference or in near real-time during the conference.
- a speech recognition service 192 may be used instead of having a speech recognition engine 182 on the session server 1.
- the text transcript 183 can also be translated to translated text 187 in another language using a translation engine 186 present on the session server 1 or by a network translation service 191 over a network 2.
- the translated text 187 can be generated from the text transcript 183 after the conference or in near real-time during the conference.
- Text to speech audio translation
- translated speech can be generated from the audio 104 using a speech recognition engine 182, a translation engine 186 and a text to speech engine 188.
- a text to speech conversion service 193 may be used instead of having a text to speech engine 188 on the session server 1.
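The transcription, translation and text to speech chain described above can be sketched as a simple pipeline. The three engine functions below are stand-in stubs (the real engines 182, 186 and 188, or their network services 192, 191 and 193, would replace them); all names are hypothetical:

```python
def transcribe(audio):
    # Stand-in for the speech recognition engine 182 / service 192;
    # a voice profile 184 would improve recognition in practice.
    return f"transcript of {audio}"

def translate(text, target_lang):
    # Stand-in for the translation engine 186 / service 191.
    return f"[{target_lang}] {text}"

def synthesise(text):
    # Stand-in for the text to speech engine 188 / service 193, which
    # would use a voice profile 184 for the synthesised voice.
    return f"audio({text})"

def translated_audio(audio, target_lang):
    """Chain the engines: audio 104 -> text transcript 183 ->
    translated text 187 -> synthesised speech."""
    return synthesise(translate(transcribe(audio), target_lang))
```

The same chain works after the conference or, stage by stage, in near real time during it.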
- a participant can see text chat in a dedicated window 26 driven by a text chat engine 189 on the session server 1.
- a participant can input and send text messages to all participants or just to selected participants.
- the text chat window 26 can be used to show any or all of: text sent by a text chat engine 189, events 89 described in a textual format, a text transcript 183 and translated text 187.
- the text chat window can be set to the preferred language of the user such that all text is translated and displayed in the text chat window 26 in the preferred language. Text can be shown twice: in the language in which it was generated and in translation.
- the e-mail engine 190 can send copies of some or all of the text generated during the conference in e-mail form to the e-mail addresses of participants and also to those who could not attend.
- the e-mail engine 190 can also be used as an e-mail reflector for participants in which e-mails concerning the conference whether before, during or after the conference, are sent to the e-mail engine which will then immediately forward copies to all participants.
- a user 17 in a Chairman role can be provided with functionality to enable him to:
- a user 17 in a Secretary role can type minutes.
- a user 17 in a Teacher role can control the display seen on all personal computers 3 in a presentation.
- Participant performance During an avatar conference, the activity of users 17 can be recorded and fed back to participants. If a user 17 has not spoken for a period of time, an event animation can be used such that his avatar 5 is animated in a way that shows his lack of recent participation. The avatar might sink down in the chair and appear to withdraw from the conference. If this visual withdrawal is noticed by other participants, then they have the opportunity to try to involve the quiet user in the conference. Alternatively, statistics of the percentage of conference time for which each person has spoken might be shown. This will show up users who might be hogging the conversation and others who might be lurking without saying anything. Real-time performance feedback can provide the participants as a team with a tool for making their conferences more effective. In applications such as education or training, participant performance data such as attendance records can be made available to teachers. Storage of and access to information on participants' performance is liable to be regulated by laws in different countries.
- Performance data available includes attendance records and statistics on speaking time.
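The speaking-time statistic mentioned above can be sketched from time-stamped speaker events. This is a hypothetical Python sketch; the (speaker, start, end) segment format is an assumption about how the event accumulator 89 might expose its data:

```python
def speaking_shares(segments, conference_length):
    """Percentage of total conference time each participant spoke,
    computed from (speaker, start, end) segments in seconds."""
    totals = {}
    for speaker, start, end in segments:
        totals[speaker] = totals.get(speaker, 0.0) + (end - start)
    return {s: 100.0 * t / conference_length for s, t in totals.items()}

shares = speaking_shares(
    [("Albert", 0, 30), ("Bruce", 30, 40), ("Albert", 40, 50)],
    conference_length=100)
```

A high percentage flags a user hogging the conversation; a near-zero one flags a lurker, and either can be fed back to the participants in real time.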
- the video (or streaming webcast) 336 can come from a webcam 29 situated on the display device 264 of a user 17.
- the video 336 can come from any other type of video camera 29 connected to a personal computer 3 on the network 2.
- the quality of the streaming video 336 seen by each participant will vary with the bandwidth available to the participant. It can vary from one frame every few seconds for one participant with a low bandwidth connection to full frame rate for a participant with a high bandwidth network connection.
- the resolution in pixels of the webcam broadcast 336 is usually small and the software director 80 shows the webcam in a correspondingly small window.
- the avatar of the user from whose webcam 29 the broadcast 336 is streaming must leave the scene.
- the avatar walks out of the room before the webcast 336 starts and walks back in when it finishes.
- the streaming video webcast 336 from the webcam 29 is shown on the screen 53.
- Figure 32 is a block diagram of an apparatus for holding an avatar user interface session in accordance with a second embodiment of the present invention.
- the apparatus comprises a plurality of personal computers 3 that are connected by a network 2 to a session server 1, an avatar hosting server 4 containing avatars 5 and a telephone network 155 with telephones 150 and a telephone server 154.
- voice is carried over either the telephone network 155 or the network 2 and data is carried over either the telephone network 155 or the network 2.
- VoIP: voice over the internet protocol
- IP v6 Two main protocols exist for transmitting over an IP network: HTTP and UDP.
- HTTP, which runs over the TCP transport, checks that each packet is received. This checking and retransmission is the main cause of lag.
- UDP does not check and typically has much less lag.
- UDP is considered a security risk by companies, which typically configure their firewalls to prevent UDP traffic from getting through.
- a UDP system that does not work for most companies will not be purchased.
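The UDP behaviour described above can be illustrated with a minimal loopback sketch in Python. The payload and the use of an OS-assigned port are illustrative; note that the per-packet checking attributed to HTTP is performed by its underlying TCP transport, which UDP omits:

```python
import socket

# Fire-and-forget datagram delivery: no acknowledgement, no
# retransmission, hence much less lag than an acknowledged TCP stream.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))            # OS picks a free port
port = recv.getsockname()[1]

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(b"voice packet", ("127.0.0.1", port))

data, _ = recv.recvfrom(1024)          # delivery is not guaranteed
send.close()
recv.close()
```

The lack of a delivery guarantee is exactly why firewalls are often configured to block such traffic.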
- new versions of IP, such as IP v6, may improve the quality and access of VoIP such that it rivals that of telephone networks.
- the main method for remote conferencing today is telephone conference calls using the PSTN, mobile networks and a conference server for mixing the calls.
- Telephone conference calls are expensive, not only for the calls but also for the service of the session server.
- access is limited to those who have a microphone and headphone on their computers and who are situated by a networked computer.
- Someone who does not have a networked computer with microphone and headphone cannot participate in an avatar user interface session using VoIP as disclosed in the first embodiment.
- IP for audio, as disclosed in the first embodiment, avoids the cost of the telephone calls.
- a telephone server 154 is connected to the IP network 2.
- Party#1 151 can use his telephone 150 over a telephone line 155 to a telephone server 154 and his personal computer 3 on network 2.
- Party#2 152 can use his headset 11 and his personal computer 3 over network 2.
- Party#3 153 can use his telephone
- Party#4 158 can use his mobile telephone 157 over a mobile telephone network 159 to a mobile telephone server 156.
- Party#4 can transfer both voice and data over the mobile network 159.
- Party#4 could wear a hands-free headset for audio and look at the screen of his mobile handset to see the avatar user interface session.
- the audio mixer 90 can be resident on either the session server 1 or on a telephone server 154 or 156 on a separate computer.
- the Lip Sync Generator (LSG) 100 is normally present on the personal computer 3 through which it is connected to the sound card 102.
- the LSG functionality 100 can be present on a server, either the session server 1 or the telephone server 154.
- the geometric positions stream 101 and the audio transform stream 105 can then be routed to the personal computers 3 over the network 2 or to a mobile device over the mobile network 159.
- voice or data transfers in the avatar user interface session can be over a plurality of networks of any types connected by devices such as network switches or network routers. Examples of networks include: the internet, intranets, extranets, Virtual Private Networks (VPNs), GSM mobile networks, GPRS mobile networks, 3G mobile networks, satellite networks.
- VPNs: Virtual Private Networks
- communication appliances in the avatar user interface session can be any sorts of devices including but not limited to: personal computers, mobile telephones, networked personal digital assistants, networked computer games consoles, interactive digital televisions, laptop computers.
- the system architecture can be of any type including client server and peer to peer and that any item of system functionality disclosed in this embodiment can be resident on any device.
- Any communication appliance might also act as a server as well as a client.
- the session server 1 does not need to be an independent unit and that a computing appliance 3 could run both the functionality of the session server 1 and the avatar user interface 160.
- the software functionalities and hardware capabilities of many servers could be combined into a single computing appliance 3.
- the session server 1, the avatar hosting server 4, the avatar hosting registry 226 and the avatar agent hosting server 321 could be combined in one computing appliance 3.
- the format of the avatars in each communication appliance is appropriate to the computing power, graphics processing power and display size of the computing appliance such that real-time visualisation in the avatar user interface system can be achieved.
- Animations of avatars 5 at much less than 12 frames per second look jerky and reduce the sense of presence felt during a session on an avatar user interface system.
- Avatar computer models can be in different mathematical 3D representations. Possible representations include but are not limited to: triangles, quadrangles, other n-sided polygons, B-spline surfaces, NURBS and subdivision surfaces. It is a further purpose of this third embodiment that the format of a 3D avatar 39 can be any 3D mathematical representation.
- Some representations can be progressive 3D representations in which an actual format displayed can be an instantiation of a representation of arbitrary size on a continuum from low size to high size. In this way, an instantiation can be chosen that is optimal for the power of the computing appliance.
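Choosing an instantiation from a progressive representation can be sketched as mapping a device power score onto a point on the low-to-high size continuum. The power score, the polygon limits and the linear mapping are all illustrative assumptions:

```python
def choose_instantiation(device_power, min_polys=500, max_polys=20000):
    """Pick a polygon budget for a progressive avatar representation.

    device_power is a normalised score in [0, 1]: 0 for the weakest
    target appliance (e.g. a mobile phone), 1 for a powerful PC with
    3D graphics hardware. The budget moves linearly along the
    continuum between the low-size and high-size instantiations.
    """
    p = min(max(device_power, 0.0), 1.0)
    return int(min_polys + p * (max_polys - min_polys))
```

In this way an instantiation can be chosen that is optimal for the power of the computing appliance, rather than shipping one fixed format to all devices.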
- Animatable image representation In addition to 3D representations, avatars may be represented in other ways.
- One way includes an animated image representation.
- Figure 33 is a schematic diagram of an animatable image 380. There are a minimum of two parts to the image: an animatable image avatar 382 in the foreground and a background image 381.
- an animatable image 380 may be described as a talking post card in which a talking avatar 382 and optionally a prop image 383 are superimposed in front of a fixed background image 381.
- the background image 381 is usually photo-realistic.
- the animatable image avatar 382 is usually photo-realistic.
- Figure 34 is a schematic diagram of an animatable image avatar 382.
- the animatable image avatar 382 is considered to be split into five animatable avatar segments 395: (i) upper body segment 390 (ii) jaw and mouth segment 391 (iii) eyes and eyebrows segment 392 (iv) head segment 393 (v) face segment 394
- Each animatable avatar segment 395 has a set of one or more different images representing that segment.
- the upper body segment 390 normally has only one image in its set.
- Figure 35 is a schematic diagram of a set of four state images 425 for the jaw and mouth segment 391 showing the jaw and mouth in four states: neutral 470, happy 471, sad 472 and laughing 473. It is usual, for a high fidelity animation to be possible, that the jaw and mouth segment 391 has several more state images in its set 425.
- the eyes and eyebrows segment 392 has at least two state images 425: eyes closed and eyes open.
- the head segment 393 normally has several state images 425 in its set with the head at slightly different orientations.
- the face segment 394 normally has several state images 425 for different facial expressions in which wrinkles play an important role .
- Figure 36 is a tree diagram of the hierarchy of animatable avatar image components.
- a complete set of images 424 for playing an animatable image 380 comprises the background 381, prop 383 and for each avatar segment 395 the set of state images 425.
- the animatable image avatar segments 395 in this embodiment are not limited to the five disclosed animatable image avatar segments 395.
- the animatable image avatar 382 might be split into more or less segments .
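The hierarchy of Figure 36 can be sketched as a nested data structure. The file names, dictionary keys and per-segment image counts below are illustrative assumptions, not taken from the specification:

```python
# Hypothetical complete set of images 424 for one animatable image 380:
# background 381, prop 383, and a set of state images 425 per segment 395.
complete_set = {
    "background": "background.png",                             # 381
    "prop": "prop.png",                                         # 383
    "segments": {
        "upper_body": ["body.png"],                             # 390
        "jaw_mouth": ["neutral.png", "happy.png",
                      "sad.png", "laughing.png"],               # 391
        "eyes_eyebrows": ["eyes_open.png", "eyes_closed.png"],  # 392
        "head": ["head_front.png", "head_left.png",
                 "head_right.png"],                             # 393
        "face": ["relaxed.png", "wrinkled.png"],                # 394
    },
}

def image_count(image_set):
    """Total images needed to play an animatable image 380."""
    return 2 + sum(len(v) for v in image_set["segments"].values())
```

Splitting the avatar into more or fewer segments simply changes the keys in the "segments" dictionary; the player iterates whatever sets are present.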
- FIG 37 is a schematic diagram of an animatable image generator 397 resident on an avatar hosting server 4.
- the animatable image generator 397 is based on an avatar player engine 210.
- An animatable image 380 comprising a complete set of images 424 may be generated by the avatar player engine 210 from a photo-realistic avatar 238 and a virtual background scene 65 using a virtual camera 61.
- the photo-realistic avatar 238 is posed in front of camera 61 to form a neutral pose, defined as an action 83 in which the photo-realistic avatar 238 looks forward with eyes open, a neutral expression and mouth closed. This pose, when viewed from camera 61, generates a base animatable image avatar 382.
- the photo-realistic avatar 238 is removed, the image of the scene 65 viewed from camera 61 is the background image 381.
- the set of state images 425 for an animatable avatar segment 395 are generated by applying a predefined set of poses as actions 83 in the animatable image generator 397.
- Figure 38 is a schematic diagram of an apparatus for animatable image generation 398. If a photo-realistic avatar 238 of a subject person 428 is not available, an animatable image 380 may be generated from a single photo-realistic image 399 of a person in front of a background. A skilled person 427 will use image processing software 426 running in memory 345 on a personal computer 3 to process the image 399 to define the complete set of images 424.
- This invention is not limited to using animatable image generators 397 and 398.
- a complete set of images 424 could be generated from video 336 or a set of still images taken of the subject person.
- the animatable image generator 397 could be resident on a personal computer 3.
- an animatable image 380 is generated and played in a similar way to that of a 3D avatar 39.
- a software director 80 generates the actions 83 that are played by the player 210.
- the main difference is that the software director 80, the player 210 and the other components in Figure 20 are designed to work with animatable images 380 instead of 3D avatars 39 and scenes.
- the animatable image avatar 382 is animated in the player 210 from actions 83 by a combination of methods that are now disclosed.
- the animatable image avatar 382 is normally based on a front view of the avatar covering at least the face, but rarely descending below the shoulders. This focus on the face removes the need to attempt to animate upper body movements such as arm gestures and lower body movements such as walking, or even turning the head by more than a few degrees.
- the body movement action A is limited to a combination of horizontal translation, vertical translation and rotation relative to the background image 381.
- a body movement action A affects all five animatable avatar segments 390-394.
- the five animatable segments 390- 394 are moved according to a body movement action A as if they were locked together.
- the head movement action B is limited to two rotational components about the middle of the neck.
- a first rotation component, left-right, equivalent to shaking one's head, and a second rotation component, up-down, equivalent to nodding one's head.
- a head movement action B affects the four animatable avatar segments 391-394.
- the four animatable segments 391-394 are moved according to a head movement action B as if they were locked together.
- the two actions A and B are added together to give a combined head and body movement.
- the lip synchronisation movement action C affects only the jaw and mouth segment 391.
- the eye movement action D affects only the eye and eyebrow segment 392.
- the facial expression action E affects only the facial segment 394.
- the three actions C, D and E are applied locally to their respective segments after the actions A and B have been applied.
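The ordering of actions A to E described above can be sketched as follows. Modelling each action as a simple (dx, dy) pixel offset is an illustrative simplification of the translations and rotations actually involved:

```python
def apply_actions(segments, A, B, C, D, E):
    """Compose the five action types on the animatable avatar segments.

    Body movement A moves all five segments 390-394 as if locked
    together; head movement B then moves segments 391-394 as if locked
    together; finally the local actions C (jaw and mouth 391),
    D (eyes and eyebrows 392) and E (face 394) are applied to their
    respective segments. Transforms are modelled as (dx, dy) offsets.
    """
    def add(p, q):
        return (p[0] + q[0], p[1] + q[1])

    pose = {seg: A for seg in segments}            # A: all five segments
    for seg in ("jaw_mouth", "eyes", "head", "face"):
        pose[seg] = add(pose[seg], B)              # B: segments 391-394
    pose["jaw_mouth"] = add(pose["jaw_mouth"], C)  # local actions last
    pose["eyes"] = add(pose["eyes"], D)
    pose["face"] = add(pose["face"], E)
    return pose

segments = ("upper_body", "jaw_mouth", "eyes", "head", "face")
pose = apply_actions(segments, A=(1, 0), B=(0, 2),
                     C=(1, 1), D=(0, 1), E=(2, 0))
```

The upper body receives only A, while the jaw and mouth segment accumulates A, B and C, matching the locked-together behaviour described above.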
- Two forms of morphing are used: at segment boundaries and between images in a set. At segment boundaries, such as the neck, which lies between the body segment 390 and the head segment 393, image morphing is used to stretch the image on one or both sides of the boundary. Between-image morphing is used where there is a gradual progression from one image in the set to another for a particular segment.
- Animation of image avatars 382 is not limited to the apparatus and methods disclosed above, but may be extended to any image based method.
- This embodiment is not limited to one animatable image avatar 382 superimposed in front of the background image 381 but may contain two or more animatable image avatars 382. Referring again to Figures 16b, 16c and 16d, it can be seen that several animatable image avatars 382 may be generated from several avatars 5.
- three layouts Layout 1, Layout 2 and Layout 3 may be used for displaying multiple animatable image avatars 382 with one or more background images 381 on a single display device 264.
- This invention is not limited to displaying Layouts 1-3 but may cover any layout that fits the application that this avatar user interface system invention is used for.
- Props 215 may be converted into prop images 383 which are animated in front of the background image 381.
- the animated prop image 383 may appear as part of a background image 381; an example is a tree bending in the wind. Or the animated prop image 383 may appear separate from the background image 381; an example is a bird flying across the background image .
- an avatar 5 can be any animatable non-3D mathematical representation including animated image representations.
- a computing appliance may be very powerful with a processor running at speeds in excess of 2 GHz, more than 512 MB of memory 345, a display device 264 with more than 1 million pixels and a specialist 3D graphics chip such as an Nvidia GeForce 3 from Nvidia Inc (USA).
- Such a computing appliance can easily render real-time animation at 20 frames per second of 10-20 avatars 5 in the full generic format as disclosed in the first embodiment.
- computing appliances are less powerful and do not have specialist 3D processing hardware. Processing power is usually constrained so as not to use up battery life on lightweight portable devices with small batteries. Examples include mobile phones and wireless personal digital assistant appliances. Less powerful computing appliances usually have less memory than more powerful computing appliances. Less powerful computing appliances such as mobile phones may have very small display device 264 sizes with fewer than 5,000 pixels.
- 3D avatars with lower levels of detail may be used on intermediate power computing appliances to achieve the desired animation performance.
- Avatars with lower levels of detail typically have fewer polygons and smaller texture maps. This is good for achieving higher frame rates and uses less memory but the downside is that the visual quality of the 3D avatar is less good.
- a combination of low and high levels of detail avatars may be used to achieve a good frame rate.
- the closest avatars to the camera might be high level of detail and those furthest away might be low level of detail avatars. This can be achieved by having two or more level of detail avatars available and switching between them. Alternatively a progressive avatar approach might be used.
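Switching between two stored level-of-detail avatars by distance from the virtual camera can be sketched as below; the threshold value, function names and two-level scheme are illustrative assumptions (a progressive avatar approach would vary detail continuously instead):

```python
def pick_lod(distance, threshold=10.0):
    """Choose which stored avatar to render: the high level of detail
    version near the camera, the low level of detail version further
    away, trading visual quality for frame rate and memory."""
    return "high" if distance < threshold else "low"

def assign_lods(distances, threshold=10.0):
    """Assign a level of detail to every avatar in the scene."""
    return [pick_lod(d, threshold) for d in distances]
```

With 10-20 avatars in a scene, only the few closest to the camera carry the polygon and texture cost of the high detail format.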
- Animated image representations may be used on low power computing appliances to achieve the desired animation performance. These use less computing power and memory than 3D representations.
- Figure 39 is a block diagram of an apparatus for holding an avatar user interface session in accordance with a third embodiment of the present invention.
- the apparatus comprises computing appliance 160 with a specific avatar 5 in format A1, computing appliance 161 with a specific avatar 5 in format A2, computing appliance 167 with a specific avatar 5 in format A3 and avatar converter software 164 of type C3 stored in memory 345.
- the computing appliances 162, 163 and 167 are connected by a network 2 to an avatar hosting server 4 containing a substantial number of avatars 5, a database 6, avatar converter software 164 of types C1 and C2 stored in memory 344 and specific avatars 5 of formats A1 and A2.
- the avatar hosting server 4 has avatar converter software 164 such as C1 that can convert an avatar 5 into a specific avatar 5 with a format such as A1 at a different level of detail.
- the specific avatar 5 in format A1 is then transmitted over the network 2 to a computing appliance 160 for which the specific avatar 5 of format A1 is suitable.
- An alternative approach is to have avatar converter software 164 C3 in memory 345 on a computing appliance 167 such that an avatar 5 can be converted to a specific avatar 5 format A3 locally on the computing appliance 167.
- Software techniques such as progressive meshes or variable levels of detail employed in the avatar converter software 164 C3 known to those skilled in the art might convert the avatar 5 to several different formats during the conference depending on the graphics load on the computing appliance.
- a computing appliance 167 can contain avatar converter software 164 C3 for which the specific avatar 5 of format A3 is suitable at any one instant.
- This invention is not limited to one type of avatar converter software 164 running on a computing appliance 167 but may allow any number of avatar converter software 164 on a computing appliance 167.
- Different avatars 5 will contain different visual data depending on how they were originally generated; for instance, photo-realistic avatars 238, parameter avatars 232 and animatable image avatars 382 will be based on different raw data.
- the avatar hosting server 4 usually stores the raw data from which the avatar 5 was generated, including manual input used to generate the avatar 5.
- the raw data for photo-realistic avatars is usually in the form of digital images 19. In this way, an avatar 5 can be regenerated automatically from the images 19. Moreover, if any technological improvements are made to the avatar 5, then a newer version of the avatar 5 can be generated automatically from the images 19 and replace the older version.
- the suite of avatar converter software 164 should ideally be capable of converting any avatar 5 to any requested format. These conversions will not always be of the highest quality due to missing information. For instance, an animatable image avatar 382 cannot easily be converted into a photo-realistic avatar 238 because there is no information on the body shape.
- the avatar hosting server 4 usually stores all formats of the avatar 5 that have been previously requested. The reason is to maximise response time for requests for that avatar in a particular format. If an avatar must be first converted into a particular format then it will take the avatar hosting server 4 longer to service a request. It is a benefit to the user that his request for an avatar is serviced as quickly as possible. However, storing several formats of each avatar uses up a lot of server space. To conserve server storage space, it may be pragmatic data management to delete formats that have not been used for a considerable time and formats that have been superseded by new versions .
- the communication session on the avatar user interface system invention may involve any combination of 3D or animatable image representations on the computing appliances.
- all the computing appliances may be high power personal computers 3 and use photo- realistic avatar 238 representations.
- all the computing appliances may be mobile phones and use animatable image 380 representations.
- one computing appliance might use 3D avatars 39 with high numbers of polygons, a second computing appliance might use NURBS based 3D avatars 39 and a third computing appliance might use animatable images 380.
- Major conference In a major conference, with around 50-1,000 participants, a number of techniques are used to run the avatar user interface session on any computing appliance 167. It is likely to be impossible for some time that a personal computer 3 would have enough power to fully animate and render 1,000 avatars 5 at the same time. In large conferences, it is still useful for the complete audience to be seen to provide participants with an ambience matching the scale of the event.
- the software director 80 can: pan quickly across a single image taken at a real conference of the appropriate size (the real conference room must match the avatar user interface session room); store short video clips in the personal computer 3 taken at a real conference of the appropriate size and replay them from time to time when there is a question from the audience; and zoom in quickly from the real audience image to a virtual close up of the avatar surrounded by other avatars.
- the chairman of a large avatar user interface session needs to handle questions from a lot of people.
- Figure 40 is a schematic layout of a major conference user interface functionality 291 for the Chairman consisting of a list of attendees with names 244 and organisations 293 wishing to ask questions, a button 294 for the Chairman to permit an attendee to speak. Attendees have buttons 290 to indicate a desire to ask a question and buttons 295 for testing their microphones before asking a question.
- the software director 80 will lower the hand of the avatar 5 and connect the user 17's audio input channel 91 to the audio mixer 90.
- a user 17's microphone 12 will be connected over the network 2 to the audio mixer 90.
- a short dialogue will take place using pre-recorded sound files on the audio mixer in which the user 17 is able to verify that his microphone 12 works and is connected to the audio mixer 90.
- This test procedure should reduce the frequency of occurrence of a user 17 trying to speak but not being heard by the conference attendees because of a microphone problem.
- the whispering capability of this invention will permit a large number of whispered conversations of 2 or more people during these breaks.
- a remote presenting user presents a presentation remotely comprising the following steps: the remote presenting user starts a prepared presentation; remote audience users watch the avatar of the remote presenting user perform the prepared presentation; present audience users, physically together in a theatre, watch a projection of the avatar of the remote presenting user perform the prepared presentation; the prepared presentation ends; a remote audience user asks a question; the remote presenting user views the avatar of the remote audience user asking the question from amongst a single virtual audience, and the avatar of the remote audience user gazes at the remote presenting user; the present audience users view the avatar of the remote audience user asking the question from amongst a single virtual audience around the avatar of the remote presenting user, and the avatar of the remote audience user gazes at the avatar of the remote presenting user.
- the apparatus of the invention supports two or more participants at one location.
- Speaker phone It is common in audio conferences for several people to congregate around a speaker phone in a single room for a conference call which includes at least one other location. Often the speaker phone has several microphones attached to it that are placed near different people around the meeting table. In this way, the people in the room can communicate directly via physical document exchange, body language, whispering and facial expressions in parallel to the formal audio exchanges .
- Shared display device Figure 41 is a schematic layout of an apparatus for holding an avatar user interface session in accordance with a fourth embodiment of the present invention.
- a personal computer 3 with a computer cabinet 16 contains a wireless transmitter/receiver 170.
- Participants 17 'Albert', 'Bruce' and 'Charles' sit around a table 172 at an environmental location 273 with each participant 17 wearing a wireless headset 171 including microphone 12 and earphone 13.
- Each wireless headset 171 has an identified owner eg Albert.
- a large display device 264 shows the avatars 5 of all participants on the avatar user interface session other than those participants 17 around the table 172 at this location.
- Means for controlling the computer such as a keyboard 14 or a mouse 15 are available for use by the participants 17.
- the environmental location 273 is usually a room such as a meeting room.
- a participant 17 eg Albert can see all other participants either physically 17 ie Bruce and Charles, or as avatars 5.
- the wireless headset 171 is identified as being owned by a specified person eg Albert, it is possible for the lipsync to be applied to the correct avatar 5 of Albert.
- the wireless headset may be identified by means of an identification chip inside the headset.
- one or more loudspeakers 173 are used for broadcasting sound to the participants 17 and each user has a wireless microphone 12 linked to the identity of the user. Signals from the wireless microphone 12 are transmitted to the receiver 170. To prevent audio feedback between the loudspeakers 173 and the microphones 12, the audio mixer 90 does not mix in the audio streams from the microphones 12 of all the participants at that location.
- 3D sound can be used to increase the sense of co-presence of the participants. If an avatar 5 on the far left of the display device 264 is talking, then the 3D sound can be mixed locally to appear as if it is coming from the mouth of that avatar. In this case, the sound volume from a loudspeaker on the left would be louder than that from a loudspeaker on the right.
- the audio mixer 90 is not involved in generating the 3D sound.
- Figure 41a is a schematic of the 3D sound processing.
- the mixer 90 generates an audio output stream 93 which travels over the network 2 to the PC 3.
- a splitter 141 splits off the geometric positions 101 to the player 210.
- the splitter 141 sends the remaining audio transform 105 to the decompressor 140.
- the decompressor 140 generates digital voice 104 and streams it to the 3D sound generator 143.
- the player 210 calculates the pixel coordinates 142 on the display 264 of the mouth of the avatar 5 that is speaking and streams them to the 3D sound generator 143.
- the 3D sound generator 143 uses the known positions of the loudspeakers 173 relative to the display 264, to generate digital voice signals 104 to the sound card 102 which streams analogue voice 103 to the loudspeakers 173.
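The 3D sound generator described above is not specified in detail; a minimal sketch of one plausible approach, assuming a two-loudspeaker setup and a constant-power pan law, is as follows. The function names and the pan law are illustrative assumptions, not the patent's method:

```python
import math

def pan_gains(mouth_x: float, display_width: float):
    """Constant-power stereo pan: an avatar speaking on the far left
    of the display is reproduced mostly through the left loudspeaker,
    matching the behaviour described for the 3D sound generator 143."""
    # Normalise the mouth pixel coordinate to 0.0 (left edge) .. 1.0 (right edge).
    p = min(max(mouth_x / display_width, 0.0), 1.0)
    # Constant-power pan law keeps perceived loudness steady across the display.
    left = math.cos(p * math.pi / 2)
    right = math.sin(p * math.pi / 2)
    return left, right

def apply_pan(samples, left_gain, right_gain):
    """Split a mono digital-voice stream into a stereo pair for the sound card."""
    return [(s * left_gain, s * right_gain) for s in samples]
```

For an avatar whose mouth is at the far left (`mouth_x = 0`), the left gain is 1.0 and the right gain is 0.0, so the voice appears to come from that side of the display.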
- This fourth embodiment has the advantage of allowing an avatar user interface session to take place with more than one person at a single location. It is also scalable for the case where there are two or more locations, with more than one participant at each location. Furthermore, it has the advantage of greatly increasing the sense of presence by showing all the non-present people on the call as avatars.
- the avatar user interface system comprises an integrated multi-media communication system based around photo-realistic avatars for communication with people and intelligent agents in both synchronous and asynchronous ways that is supportive of multi-tasking.
- Modes 3 and 5 in the table above are perhaps the most common modes of multi-tasking on a personal computer between different task types.
- An integrated avatar user interface system should support reading and typing tasks whilst listening.
- Another type of multi-tasking is time efficiency of verbal communication. Whilst in an avatar user interface session, the participant should be able to carry out other voice tasks in the periods when the conversation of the session is not important to him. The following voice tasks are possible whilst in an avatar user interface session:
  - listen to voice-mail
  - speak voice-mail
  - make a voice call
  - receive a voice call
  - interact with conversational intelligent avatar agents
  - interact with user interfaces
- Some functional considerations are important for voice tasks:
  - mixing of conference audio with the incoming voice signal so that participants can be passively aware of what is happening in the conference. An example is someone asking you a question whilst you are on another task, in which case you are likely to hear "What do you think YourName?" and react appropriately
  - an easy to use switchboard between voice tasks. Multi-tasking requires speed and efficiency in switching between synchronous voice tasks such as putting a party on hold
  - rapid directory look-up of a person, indication of whether a person is logged on / active on his personal computer and automatic dialling of a voice call
  - visual status of voice tasks that are active or on hold
  - mixing of voice functionality and text functionality
  - ability to switch off direct voice access to yourself, giving you time to think or just have a break; voice calls can be diverted to voice mail for listening to later
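The switchboard behaviour described above can be sketched as a small state machine in which exactly one synchronous voice task is live at a time and the rest are on hold. This is a minimal illustration, not the patent's implementation; the class and method names are assumptions:

```python
class Switchboard:
    """Minimal sketch of the voice-task switchboard: one task is live,
    the rest are on hold, and the status of every task remains visible
    so the user can switch quickly between synchronous voice tasks."""

    def __init__(self):
        self.tasks = {}    # task id -> status ("live" or "hold")
        self.active = None

    def start(self, task_id):
        # Putting the current task on hold before switching keeps
        # exactly one synchronous voice task live at a time.
        if self.active is not None:
            self.tasks[self.active] = "hold"
        self.tasks[task_id] = "live"
        self.active = task_id

    def hold(self, task_id):
        self.tasks[task_id] = "hold"
        if self.active == task_id:
            self.active = None

    def statuses(self):
        """Visual status of voice tasks that are active or on hold."""
        return dict(self.tasks)
```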
- IM Instant Messaging
- IM can be switched to e-mail and voice calls to voice-mail.
- Communications may use any type of media or any combination of multiple media.
- multi-media is used to cover all types of media such as but not limited to text, voice, video, image, animation and avatar.
- Synchronous communication is when a communication is usually received in real-time and often responded to in real-time.
- Asynchronous communication is when a communication is usually received after a delay.
- An avatar call is when the avatar 5 of the user 17 appears in the Meeting Room Media window 50 whilst the synchronous avatar call is taking place.
- An Avatar voice-mail is when the avatar 5 of the user 17 appears in the Meeting Room Media window 50 whilst the asynchronous avatar voice-mail is being played back.
- this avatar user interface system invention discloses a new form of user interface for interacting with people, information, entertainment and avatar agents.
- Figure 42 is a representation of an example of the displayed avatar user interface 260 in this fifth embodiment.
- the new switchboard avatar user interface functionality 268 is added to the avatar session user interface 10 shown in Figure 12 such that both sets of functions are integrated and easily accessible through the same user interface hardware 3.
- the switchboard avatar user interface functionality includes:
  - a buddy list 240 with data for each buddy such as name 244 and facial icon 243
  - buddy list buttons such as add buddy 247, edit buddy 248 and delete buddy 249
  - a switchboard 241 with numbered events 252 including: live sessions with data for each session party such as name 244 and facial icon 243; live conferences with conference name 253 and with data for each conference attendee such as name 244 and facial icon 243; streaming media channels 254 such as music, radio and TV; and voice mails 255
  - a status bar 250 with a message 251 and session control buttons such as start session to new party 242, end session 246 and whisper to a party on a session 245
- Figure 43 is a block diagram of a multi-session server system. It shows a personal computer 3 with an avatar user interface 260 connected via a network 2 to two or more session servers 1. Different live sessions 252 may take place on different session servers 1. Protocol converters 301 are resident at different places on the system.
- an avatar user interface 260 on a personal computer 3 can simultaneously be connected to multiple sessions on a plurality of session servers 1 using a standard avatar interface protocol 300.
- protocol converters 301 can convert between the protocols in real time.
- the protocol converters 301 can be situated on the network 2 or within the session servers 1 or within the avatar user interface 260 or any other suitable place.
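The real-time conversion performed by a protocol converter 301 can be illustrated as a field-level translation between two message formats. The field names and the "proprietary" protocol here are purely hypothetical assumptions for illustration; the patent does not define the wire formats:

```python
def convert_message(msg: dict, target: str) -> dict:
    """Illustrative protocol converter 301: translates a session event
    between the standard avatar interface protocol 300 and a
    hypothetical proprietary protocol in real time."""
    if target == "proprietary":
        # Rename the assumed standard-protocol fields to the proprietary ones.
        return {"evt": msg["event"], "who": msg["avatar_id"], "t": msg["timestamp"]}
    if target == "standard":
        # Reverse mapping back to the standard avatar interface protocol 300.
        return {"event": msg["evt"], "avatar_id": msg["who"], "timestamp": msg["t"]}
    raise ValueError(f"unknown target protocol: {target}")
```

Because the mapping is invertible, a message converted to the proprietary format and back is unchanged, which is the property a converter sitting on the network 2 or inside a session server 1 would need.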
- FIG. 44 is a block diagram of the displayed avatar user interface 260 on the display device 264 being driven by the avatar user interface software application 262 running stand-alone in memory 345 on the personal computer 3.
- the displayed avatar user interface 260 might be for a digital exhibition in which there are many virtual stands representing different organisations on which information about their products and services is accessible.
- Figure 45 is a representation of an example of the displayed avatar user interface 260 containing the switchboard avatar user interface functionality 268, the avatar session user interface 10 and the exhibition user interface functionality 280.
- the exhibition user interface functionality 280 includes an exhibitor list 281 with different organisations 282 that can be selected. Pressing a browse button 283 enables the user 17 to enter a 3D meeting room media window 50 of the selected organisation in which information media about the organisation's products and services is available for browsing by the user 17. Pressing a contact button 284 enables the user 17 to call a representative of that organisation into the organisation's 3D meeting room media window 50.
- the representative can be an actual person with his own avatar or an intelligent agent avatar. As on a physical exhibition stand, multiple users 17 can be present with one or more representatives of the organisation in the same exhibition 3D meeting room media window 50.
- Whilst browsing in the 3D meeting room media window 50, a user 17 may:
  - see objects representing products 286
  - navigate by pressing on the object 286 or pressing buttons on the navigation bar 287
  - view a product by pressing a button 288; the user's avatar can pick the product 286 up and rotate it if the product is of a suitable size
  - buy a product by pressing a button 285
  - be taken on a tour around the company's products 286 by an intelligent agent avatar 5 if the user 17 presses the button 289
- This embodiment is not limited to the functions disclosed here but covers any function from an actual exhibition that can be implemented virtually.
- this seventh embodiment discloses a process wherein users communicate in virtual exhibition means comprising the following steps:
  - a user navigates in a virtual exhibition stand of a company;
  - the user views and interacts with virtual objects representing products;
  - optionally the user communicates remotely with a real sales representative;
  - optionally the user communicates with an intelligent agent avatar;
  - optionally the user views presentations;
  - optionally the user buys the product.
- an avatar agent 5 is driven by an intelligent agent and not by the user 17.
- Avatar agents
- Avatar agents are photo-realistic avatars driven by intelligent software agents rather than people.
- An avatar user interface system that will provide the benefits of avatar agents to people does not exist.
- FIG. 46 is a block diagram of an avatar agent hosting system and intelligent agent software in accordance with an eighth embodiment of the present invention. It shows intelligent agent software unit 320 on an avatar agent hosting server (AAHS) 321 running with AAHS management software 322 stored in memory 348 driving an avatar agent 5 in an avatar user interface window 260 on the display device 264 of a personal computer 3.
- the AAHS management software 322 manages one or more intelligent agent software units 320 running concurrently on the AAHS 321.
- the intelligent agent software unit 320 may be running in memory 344 on the avatar hosting server (AHS) 4.
- the intelligent agent software unit 320 may be running in memory 345 on a personal computer 3.
- the identity 275 of the avatar agent 5 is usually the same as for the intelligent agent software unit 320.
- the identities of the avatar agent 5 and the intelligent agent software unit 320 could also be different, in which case the avatar agent identity number would have to indicate on which avatar agent hosting service the avatar agent is resident.
- the intelligent agent software unit 320 can perform synchronously or asynchronously. It can communicate by outputting marked-up text 327 or audio voice 185. It contains artificial intelligence software 323 and a database of knowledge 324. It may also have access to further databases of knowledge 324 via the network 2. For voice communication it includes a speech recognition engine 182 and an agent text to speech engine 326.
- the intelligent agent software unit 320 can generate events 81 that are incorporated as mark-ups in the marked-up text 327 or output from the agent text to speech system 326.
- the events 81 that go to the software director 80 cover such aspects as emotions and gestures.
- the actions 83 of an avatar agent 5 usually exhibit better anima-realism than the actions 83 of an avatar 5 driven from voice 185 because the intelligent agent software unit 320 has more knowledge for generating events 81 than can be extracted from analysis of the live voice stream 185 of a user 17.
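The separation of marked-up text 327 into plain text for the text to speech engine and events 81 for the software director 80 can be sketched as follows. The mark-up syntax (`<emotion:…>`, `<gesture:…>`) is a hypothetical assumption, since the patent does not specify the mark-up language:

```python
import re

# Assumed mark-up syntax: the intelligent agent software unit 320 embeds
# events 81 such as emotions and gestures inside the marked-up text 327.
MARKUP = re.compile(r"<(?P<kind>emotion|gesture):(?P<name>\w+)>")

def split_markup(marked_up_text: str):
    """Separate plain text (for the text to speech engine 326) from
    event mark-ups (for the software director 80)."""
    events = [m.groupdict() for m in MARKUP.finditer(marked_up_text)]
    plain = MARKUP.sub("", marked_up_text)
    # Collapse the double spaces left behind by removed mark-ups.
    plain = re.sub(r"\s{2,}", " ", plain).strip()
    return plain, events
```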
- the intelligent agent software unit 320 can represent itself visually with an avatar 5 that does not have the identity of a real person. Each avatar agent 5 is driven by one intelligent agent software unit 320.
- the avatar agent 5 of the intelligent agent software unit 320 may be a parameter avatar 232, or it may be edited to look like a photo-realistic avatar 238 of a real person or it may be based on images taken of a real person with whom that person's identity is not associated.
- the intelligent agent software unit 320 speaks through an agent text to speech engine 326 using impersonation parameters 325 that give the voice 185 a characteristic profile.
- An example of a characteristic voice profile 184 is a middle-aged Scottish woman.
- the avatar agent 5 is impersonating a middle-aged Scottish woman.
- the impersonation parameters 325 are of two types: voice impersonation parameters 331 and action impersonation parameters 332.
- the agent avatar 5 can represent a real person 17 and use the photorealistic avatar 238 of that real person 17.
- the impersonation parameters 325 can be the personalised voice profile of that particular person 17. In this way the avatar agent 5 can represent the real person 17 by looking like that person and sounding like that person whilst that real person is unavailable.
- FIG 47 is a block diagram of an apparatus for generating impersonation parameters.
- a person's 17 voice and movements may be recorded in a room 330 insulated from a noisy environmental location 273.
- Video 336 is recorded from at least one camera 29 and audio 185 is recorded using a microphone 12 of a person 17 reading known text 189 on a screen 264 of a personal computer 3.
- the impersonation parameter generation software 331 running in memory 345 on the personal computer 3 processes the video 336 and audio 185 to generate a set of impersonation parameters 325.
- the impersonation parameters 325 are of two types: voice impersonation parameters 331 and action impersonation parameters 332.
- the voice impersonation parameters 331 are generated by processing the audio of the known text 189.
- the action impersonation parameters 332 are generated by processing the facial movements as the words in the known text 189 are spoken and as emotions are used.
- the intelligent agent software unit 320 generates marked up text 327.
- the marked-up text 327 is processed by the agent text to speech engine 326 using the voice impersonation parameters 331 of the person 17 to modify an existing speech database 328 by speech synthesis.
- the voice 185 emitted by the agent text to speech engine 326 sounds like that of the person 17.
- the marked up events in the marked-up text 327 are modified by the action impersonation parameters 332 to produce gesture action events 81 that use characteristic gestures that the person 17 normally uses when speaking.
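As a toy illustration of what the impersonation parameter generation software 331 might compute from the recording of the known text 189, the sketch below reduces a pitch track and word timing to a few statistics. Real voice impersonation uses far richer acoustic models; every field name here is an illustrative assumption:

```python
from statistics import mean, stdev

def voice_impersonation_parameters(pitch_track_hz, words, duration_s):
    """Toy sketch of impersonation parameter generation: reduce a
    recording of the person 17 reading known text 189 to statistics
    that a text to speech engine could use to shift its existing
    speech database 328 towards that person's voice."""
    return {
        "mean_pitch_hz": mean(pitch_track_hz),          # overall voice height
        "pitch_variability_hz": stdev(pitch_track_hz),  # how expressive the voice is
        "speaking_rate_wpm": 60.0 * len(words) / duration_s,
    }
```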
- avatar agent impersonation of a real person would be in a personalised answer-phone application.
- the intelligent agent software unit 320 may know who is calling and their access level to the real person's information. It may also know what activity the real person is currently involved in and when it is due to finish. It answers the avatar call with an appropriately personalised message. For example: "Hi John, I'm in an avatar session until 11.00, please leave an avatar-mail." The caller, John, will recognise the voice and see the avatar as if it were the real person he was calling.
- more advanced bi-directional communications can take place in which, for example, arrangements can be 'pencilled in' involving diaries .
- Avatar agents in this embodiment of the avatar user interface system invention will provide benefits to people in a wide range of applications, including as:
  - call centre personnel: users can interact with an organisation via virtual agents instead of expensive call centre personnel; fields include: account payments, technical support, changing service levels
  - sales representatives: users can discuss potential purchases with virtual agent sales representatives
  - real estate: home purchasers can be shown round virtual 3D replicas of homes on the market by a virtual real estate agent
  - entertainers: avatar agents will become performers in shows customised to the user's desires; what was a child's television programme becomes an interactive, personalised entertainment led by an avatar agent
  - advisers: people can consult an avatar agent specialist for advice; fields include: independent financial advice, style of clothing, selection of make-up, dieting, fitness, sports, psychology, psychiatry, cooking
  - newscasters: virtual newscasters will be able to read the news that you want when you want
  - housekeepers: virtual agents will provide: management of the home network, automatic call out of home service personnel such as heating system technicians, automatic reordering of home consumables such as
- a software director uses voice impersonation parameters defined for an avatar to generate speech from text using text to speech engine means for an avatar such that the avatar speaks recognisably like the person it represents, comprising the following steps:
  - intelligent agent software unit means generates the text;
  - text to speech engine means converts the text to speech;
  - the speech is played on the computing appliance.
- voice impersonation parameters are defined for an avatar of a particular person comprising the following steps:
  - recording the person speaking predefined text;
  - processing the recording using impersonation parameter generation software;
  - the impersonation parameter generation software outputting the voice impersonation parameters for that person;
  - storing the voice impersonation parameters in the avatar.
- a speech recognition engine means processes the voice communication comprising the following steps:
  - a user generates a voice communication by speaking;
  - a speech recognition means processes the voice communication and outputs text;
  - the text is sent to any intelligent agent software units involved in the session.
- this eighth embodiment discloses a process wherein a user speaks in a first language and an intelligent agent software unit operates in a second language such that text is translated by translation engine means comprising the following steps:
  - a user generates a voice communication by speaking in a first language;
  - a speech recognition means that operates in the first language processes the voice communication in the first language and outputs text in the first language;
  - the text in the first language is translated by translation engine means into text in a second language;
  - text in the second language is sent to any intelligent agent software units involved in the session capable of processing text in the second language.
- this eighth embodiment discloses a process wherein a user understands a first language and an intelligent agent software unit operates in a second language such that text is translated by translation engine means comprising the following steps:
  - an intelligent agent software unit generates text in a first language;
  - the text in the first language is translated by translation engine means into text in a second language;
  - text to speech engine means converts the text in the second language to speech in the second language;
  - the speech in the second language is played to the user using loudspeaker means.
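The two translation processes above form one round-trip pipeline. The sketch below wires the stages together with the engines passed in as plain callables; `stt`, `translate`, `agent` and `tts` are placeholders standing in for the speech recognition, translation, intelligent agent and text to speech engine means, not real APIs:

```python
def agent_reply_in_user_language(user_audio, stt, translate, agent, tts,
                                 user_lang="en", agent_lang="de"):
    """Sketch of the translation pipeline: the user speaks in a first
    language; the intelligent agent software unit operates in a second;
    translation engine means bridges the two in both directions."""
    text_first = stt(user_audio)                                  # speech recognition
    text_second = translate(text_first, user_lang, agent_lang)    # into agent's language
    reply_second = agent(text_second)                             # agent answers
    reply_first = translate(reply_second, agent_lang, user_lang)  # back to user's language
    return tts(reply_first)                                       # played on loudspeakers
```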
- the avatar user interface system 261 may be used for biometric security applications at locations such as airports or military installations. There is an increasing need for security systems based on biometric identification at airports to combat terrorism and in many other security applications. Currently, a biometric security system based on photo-realistic avatars of people does not exist.
- FIG. 48 is a block diagram of the avatar user interface system with extended security functionality in accordance with a ninth embodiment of the present invention. It shows a person 313 passing a security checkpoint 314.
- the person's identity 275 is contained on an identity source 310 such as a smart card carried by the person 313 or an implant in the person 313.
- the person's identity 275 is read from the identity source 310 by an identity source reader 311 attached to a personal computer 3.
- Identity processing software 312 in memory 345 on the personal computer 3 calls up the avatar 5 corresponding to the identity 275 from the avatar hosting service 4 over the network 2.
- the avatar 5 corresponding to the identity 275 is displayed in the avatar user interface window 260 on the display device 264 of the personal computer 3.
- a security user 17, who is usually a security guard, can visually compare the person 313 to the avatar 5 corresponding to the identity 275 presented by the person 313. If the person 313 and the avatar 5 are not similar then the security user 17 can stop the person 313 for questioning.
- the network 2 and avatar hosting service 4 may be private to the organisation conducting the security check.
- a camera 29 attached to the personal computer 3 takes images 19 of the person 313.
- Image processing and comparison software 315 in memory 345 on the personal computer 3 can automatically compare the images 19 to the avatar 5 corresponding to the identity 275 presented by the person 313. If the image processing and comparison software 315 finds a significant discrepancy between the images 19 and the avatar 5 corresponding to the identity 275 presented by the person 313, then the security user 17 is alerted.
- Image processing and comparison software 315 is well known to those skilled in the art; however, increasing the accuracy of such software is still a research area.
- the images 19 may be sent to remote image processing and comparison software 315 in memory 344 on the avatar hosting server 4, which will compare the images 19 with a database 318 of images of people known because they are a security risk, because they are employees, or for any other reason. If the remote image processing and comparison software 315 on the avatar hosting server 4 makes one or more possible matches then these possible matches are communicated to the security user 17 via the avatar user interface window 260 on the display device 264.
- the remote image processing and comparison software 315 and database 318 are not limited to being resident on the avatar hosting server 4 but may be resident on any server accessible via at least the network 2.
- the intelligent agent software unit 320 may generate communications to the security user 17 relating to the advisable actions to be taken depending on the results of any comparisons made by the image processing and comparison software 315.
- the intelligent agent software unit 320 may respond to communications from the security user 17 such as requests for further comparisons.
- Certainty in verifying identity can be increased by combining the results of two or more biometric devices.
- a biometric device 316 connected to the personal computer 3 could measure another part of the person and compare it to reference biometric data 317 linked with the avatar 5 via the identity 275.
- Typical biometric devices include fingerprint scanning, iris scanning, hand scanning, face recognition and voice pattern recognition.
- It is a purpose of this ninth embodiment to disclose a security process comprising the following steps:
  - a person providing an identity source that is read by an identity source reader;
  - retrieving the avatar whose identity matches the identity in the identity source;
  - displaying the avatar;
  - a security user visually comparing the avatar with the person.
- this ninth embodiment discloses a largely automated security process comprising the following steps:
  - the person providing an identity source that is read by an identity source reader;
  - retrieving the avatar whose identity matches the identity in the identity source;
  - extracting the avatar biometric data from the avatar;
  - a biometric device scanning part of the person to provide scanned biometric data;
  - comparing the scanned biometric data with the avatar biometric data;
  - if the scanned biometric data does not match the avatar biometric data then alerting the security user;
  - displaying the avatar to the alerted security user;
  - the alerted security user visually comparing the avatar with the person.
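The comparison and alerting steps of the automated process can be sketched as below, including the combination of two or more biometric devices to increase certainty. The similarity measure and the threshold are illustrative assumptions; real devices use device-specific matchers such as fingerprint minutiae or iris-code comparison:

```python
def similarity(a, b):
    """Placeholder matcher: fraction of matching elements between a
    scanned feature vector and the reference biometric data 317."""
    matching = sum(1 for x, y in zip(a, b) if x == y)
    return matching / max(len(a), len(b))

def security_check(scanned_biometrics, avatar_biometrics, threshold=0.8):
    """Sketch of the largely automated security process: compare each
    scanned biometric against the reference data stored with the
    avatar 5 and alert the security user 17 on any mismatch."""
    mismatches = []
    for device, scanned in scanned_biometrics.items():
        reference = avatar_biometrics.get(device)
        score = similarity(scanned, reference) if reference is not None else 0.0
        if score < threshold:
            mismatches.append(device)
    return {"alert": bool(mismatches), "mismatched_devices": mismatches}
```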
- the intelligent agent software unit 320 might be resident on an avatar agent hosting server (AAHS) on the network 2 instead of on the personal computer 3.
- AAHS avatar agent hosting server
- Two or more security checkpoints 314 at one location may be connected to a single personal computer 3.
- multiple security checkpoints at multiple locations may be wired to multiple personal computers 3 in one or more security rooms monitored by multiple security users 17.
- This embodiment of the avatar user interface system invention has significant utility. It can support a security guard in making a quick visual verification that the person showing an identity is actually the person to whom the identity belongs. In a more automated form, it can alert a security guard when a discrepancy between the person going through a security checkpoint and his avatar is detected.
- the avatar user interface system 261 is used for interactive computer games.
- On-line interactive computer games do not exist where the user is represented by a photo-realistic avatar 238 of himself.
- On-line interactive computer games do not exist where avatars 5 can exhibit lip synchronised animation in real-time to voice transmitted over a network 2.
- FIG 49 is a block diagram of an avatar user interface system 261 for interactive computer gaming in accordance with a tenth embodiment of the present invention.
- Users 17 interact with their personal computers 3.
- a session server 1 handles the voice mixing between users 17.
- An avatar hosting server 4 hosts the avatars 5 of the users 17 which are sent to the personal computers 3.
- a Game Hosting Server 370 hosts the game software 371, the state 372 of the game and billing software 237 in memory 374.
- a network 2 connects the servers and personal computers. If the game involves avatar agents, one or more avatar agent hosting servers 321 may serve the avatar agents 5.
- Special game interface equipment 373 may be attached to the personal computer 3 containing sensors to detect the movements of the user and feedback devices to stimulate the user's senses.
- the computer games industry is clearly structured with a number of game genres such as role playing games (RPG), sports including football and wrestling, car racing, God simulations, strategy games, board games and first-person fighting.
- RPG role playing games
- Some of these genres have found a place in on-line gaming in which users play the game with each other over a network.
- the game is hosted on a game server that is also on the network.
- with this avatar user interface system invention, a new genre of communicative avatar on-line game, not possible before, may be built using the avatar user interface system 261.
- Users 17 see each other in the virtual environment of the game software 371 as photo-realistic avatars 5. During the game, users 17 can communicate with each other by voice as if they were in the same room.
- the user 17 may navigate his avatar 5 through the virtual environment of the game using normal personal computer input devices such as mouse and keyboard. Examples of this are environments where users navigate towards other people's avatars to meet them.
- the actions of the user's avatar are generated by the software director 80 in reaction to events.
- the user 17 may move one of his chess pieces from one square to another and the software director 80 will show the user's avatar 5 picking up a piece and moving it from one square to another .
- In an on-line game 371 with a shared virtual environment, the state 372 of the game must be maintained on the game hosting server 370. In this way, the shared virtual environment of the game 371 is the same for all users 17 at all times because there is only one state. The only time that there are differences is if there are delays or lags on the network 2. However, state discrepancies caused by the software director 80 playing actions in anticipation of what will happen in the game 371, whilst waiting for the new state 372 to be synchronised over the network 2, can be quickly corrected by the software director 80 when the new state 372 arrives at the personal computer 3.
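The anticipate-then-correct scheme described above is essentially optimistic client-side prediction against an authoritative server state. A minimal sketch, with hypothetical class and method names, is:

```python
class GameClient:
    """Sketch of the state-synchronisation scheme: the software
    director 80 plays anticipated actions locally while waiting for
    the authoritative state 372 from the game hosting server 370,
    then corrects to the server state when it arrives."""

    def __init__(self, state):
        self.confirmed = dict(state)   # last state 372 received from the server
        self.predicted = dict(state)   # local anticipation shown to the user

    def anticipate(self, key, value):
        # Play an action immediately so the user sees no network lag.
        self.predicted[key] = value

    def on_server_state(self, state):
        # The server is authoritative: discard any wrong predictions.
        self.confirmed = dict(state)
        self.predicted = dict(state)
```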
- the avatar user interface system 261 is used in immersive virtual reality (VR) environments .
- There is a variety of immersive VR systems. These include but are not limited to VR headsets and caves.
- a person can wear a VR headset in which his view of the physical environmental location 273 is replaced by viewing a display apparatus on which the virtual environment is displayed.
- a person can enter into a cave, which can be generally defined as a partially or fully enclosed physical space in which the display area is large. The person sees a virtual environment that has been projected onto the walls, floor and ceiling of the room.
- An immersive VR system does not exist in which people are represented by photo-realistic avatars of themselves. Nor does an immersive VR system exist where people's movements can be motion tracked and used to drive photo-realistic avatars of themselves in other locations.
- Figure 50 is a schematic of an avatar user interface system 261 for a six-sided cave 350 in accordance with an eleventh embodiment of the present invention.
- the six faces of the cave 350 are illuminated by six back projectors 352.
- the six back projectors 352 are connected by one or more cables 354 to a computer 355.
- the computer 355 contains an avatar player engine 210 and a cave display system 357 in memory 345 and is connected to a network 2.
- the avatar player engine 210 generates a 3D virtual environment 356 containing avatars 5 that usually changes over time.
- the avatar player engine 210 transmits the 3D virtual environment 356 to the cave display system 357.
- the cave display system 357 generates the digital projector images 353 and transmits them to the back projectors 352.
- the six faces of the cave 350 are fabricated from a material that permits back projection such that the six projected images 353 are visible to the user 17 from inside the cave 350.
- Each projected image 353 is a sequential stereo pair from which a 3D effect can be experienced.
- a user 17 wearing shutter glasses 351 is inside the cave 350.
- the shutter glasses 351 combine the stereo pair image 353 displayed onto each wall to form a 3D virtual environment 356.
- the 3D virtual environment 356 appears to stretch from right next to the user 17 to many hundreds of metres away.
- the experience is vivid and a strong sense of presence in the virtual world is experienced by the user 17.
- the user 17 sees a 3D virtual environment 356 with an avatar 5.
- the user 17 can see an avatar 5 in 3D when the user 17 is facing in the direction of the avatar 5. If the avatar 5 is central in the cave 350, the user 17 can walk through or around the avatar 5, turn and see the avatar 5 from a different viewpoint.
- the avatar 5 can move in the virtual environment 356 relative to the user 17.
- a cave 350 the user 17 can see parts of himself 17 such as his legs 358 and arms 359, the people with him in the cave, the virtual environment 356 and the avatars 5.
- This eleventh embodiment is not limited to a cave with six sides; a physical space with a display of one or more sides can be used. Conventional displays such as a monitor or plasma screen can be used. Shutter glasses are one method of converting images into a 3D environment, but this invention can incorporate a wide range of 3D display technologies.
- One or more users 17 in a cave 350 at a first location can be connected via a network 2 to one or more other users 17 in another cave 350 at a second location such that all the users 17 appear as avatars 5 immersed in the same 3D virtual environment 356. It is advantageous for users at one cave location to see the movements of the users at the other cave location. For this, a motion capture system is required at each location.
- Figure 51 is a schematic of an avatar user interface system 261 for two caves 350 connected by a network 2.
- a motion capture system 368 is integrated into the cave 350 at location 1.
- the motion capture system 368 comprises four cameras 362 viewing the internal area of the cave 350 connected by a cable network 365 to a computer 363 running motion capture software 364 in memory 345.
- the user 17 wears a suit 360 to which infra-red emitters 361 are attached.
- the motion capture software 364 on the computer 363 calculates the motion 369 of the user 17.
- the motion 369 is sent to the cave 350 at location 2 and the motion 369 is played on the photo-realistic avatar 5 of the user 17.
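Sending the motion 369 from location 1 to location 2 amounts to serialising tracked joint data frame by frame. The JSON wire format below is an illustrative assumption only; a real system would likely use a compact binary protocol to keep lag acceptable:

```python
import json

def encode_motion_frame(timestamp, joints):
    """Serialise one frame of the motion 369 captured at location 1:
    a timestamp plus tracked joint positions, ready to send over the
    network 2 to the cave 350 at location 2."""
    return json.dumps({"t": timestamp, "joints": joints}).encode("utf-8")

def decode_motion_frame(payload):
    """At location 2, decode the frame so it can be played on the
    photo-realistic avatar 5 of the user 17."""
    frame = json.loads(payload.decode("utf-8"))
    return frame["t"], frame["joints"]
```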
- motion capture system there are many types of motion capture system and this invention is not limited to the type disclosed.
- the motion capture system might be passive and not require the user to wear a suit with active emitters.
- a user 367 may wear a VR headset 366 whilst moving inside the motion capture system 368.
- wireless networks could be used.
- avatars 5 might be avatar agents 5 driven by intelligent agent software unit 320 rather than users 17. In this way, agents and users can mingle and interact in a 3D virtual environment 356 without it being immediately obvious which avatar 5 is driven by an agent or a user.
- This invention is not limited to participants in just two locations being immersed in the same 3D virtual environment 356. Three or more locations may be used. At each location, there could be a cave or the user could use a VR headset.
- each user 17 can wear a headset 11 for audio communication with the other participants.
- this embodiment discloses means by which the most realistic immersive VR experience can be achieved, thereby giving the user a high sense of presence in the session.
- the motion capture system means records movements of a first user in a first Cave means; the recorded movements are sent with acceptable lag from the first Cave means to a second Cave means; an avatar of the first user is displayed in the second Cave means such that the movements of the avatar duplicate the movements of the user in space; a second user wearing shutter glasses or similar immersive 3D viewing means in the second Cave means views the movements of the avatar of the first user as if the first user were physically in the second Cave with the second user.
- the avatar user interface system 261 may be connected to exercise equipment.
Health and training
- Figure 52 is a schematic of an avatar user interface system 261 comprising two exercise stations 414 connected together by a network 2.
- An exercise station 414 comprises a piece of exercise equipment
- Many items of exercise equipment come with a built-in processor and a connection to a personal computer such that the personal computer can monitor and/or control the exercise equipment.
- parameters monitored from the exercise equipment might be speed, strength setting, energy dissipation rate, user pulse rate and cumulative energy dissipated.
- An example of a parameter that might be controlled is the strength setting of the exercise equipment.
- the two users can share their exercise as a social experience.
- a first user 17 can see the avatar 5 of the second user 17 in his avatar user interface 260, in a scene showing the avatar 5 of the second user using virtual exercise equipment 410.
- the exercise equipment interface software 412 monitors the movements of the exercise equipment 410 and sends them over the network 2 to the avatar user interface software 262 on the personal computer of the first user 17.
- the first user sees the avatar 5 of the second user moving on the virtual exercise equipment in the avatar user interface 260 in substantially real time compared to the actual movements of the second user. If the second user stops using the exercise equipment, then the first user will see almost immediately that the avatar of the second user has stopped using the exercise equipment.
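The monitoring behaviour described above can be sketched in a few lines. The sample fields follow the parameters listed earlier (speed, strength setting, pulse rate, cumulative energy); the stop threshold and function names are assumptions for illustration:

```python
STOP_THRESHOLD_SPEED = 0.1  # assumed cut-off below which the user is "stopped"

def make_sample(speed, strength, pulse, energy_total):
    """One telemetry sample read from the exercise equipment 410."""
    return {"speed": speed, "strength": strength,
            "pulse": pulse, "energy_total": energy_total}

def user_has_stopped(sample):
    """True if the remote avatar 5 should be shown as no longer exercising."""
    return sample["speed"] < STOP_THRESHOLD_SPEED

def set_strength(sample, new_strength):
    """Example of a controlled parameter: the strength setting."""
    sample["strength"] = new_strength
    return sample
```

Samples like these would be sent over the network 2 by the exercise equipment interface software 412, so the first user's interface can detect almost immediately that the second user has stopped.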
- the two users may talk to each other using the headsets 11.
- the connection between the exercise machine 410 and the personal computer could be wireless rather than a cable 413.
- the display device 264 and the personal computer 3 might be built into the exercise machine 410 which connects to the network 2.
- the headset 11 may be connected to the personal computer 3 by wireless rather than a cable. Loudspeakers may be used instead of headphones.
- Other biometric devices may be worn by the user 17 in addition to the pulse rate monitor.
- the exercise equipment interface software 412 may correlate performance of each user over a number of sessions and generate statistical data to track increases in fitness.
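One plausible realisation of the statistics mentioned above (the specification does not name a method) is to fit a least-squares trend line to a per-session performance metric; a positive slope of, say, average speed suggests increasing fitness:

```python
def fitness_trend(session_values):
    """Least-squares slope of a performance metric over sessions.

    session_values holds one number per session, in session order.
    A positive slope (e.g. of average speed) suggests rising fitness.
    """
    n = len(session_values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(session_values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, session_values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```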
- the wearing of a pulse rate gauge 415 is optional .
- This twelfth embodiment is not limited to two users; three or more users may be connected simultaneously.
- One user 17 may be a personal trainer for another user 17 and use the avatar user interface system to both monitor and encourage the first user.
- a personal trainer could train several users simultaneously. Users may compete against each other on certain parameters such as speed, strength and endurance. International virtual competitions may be held with their appearance in the avatar user interface system being similar to that of a televised sports event.
- a user 17 may be a medical doctor who can monitor remotely the health of a user 17 who is a patient.
- An avatar intelligent agent software unit 320 may take the role of a user or personal trainer or doctor or any other professional such as a sports therapist.
- this twelfth embodiment may be combined with features from the fourth embodiment such that two or more people at one location can exercise together whilst being in contact with one or more other people at one or more other locations.
- This twelfth embodiment of the avatar user interface system invention enables a person who is in one location to carry out a physical activity whilst in virtual contact with one or more people in other locations.
- Advantages of this embodiment include: time and cost saved travelling by each user to an agreed location where they can exercise together, increased motivation by exercising whilst in virtual contact and not needing to dress up to be seen in public.
- this twelfth embodiment discloses a process wherein users communicate whilst exercising on exercise station means, comprising the following steps: - a first user using a first exercise station means; a second user using a second exercise station means; the first user viewing the avatar of the second user using a virtual exercise station; the second user viewing the avatar of the first user using a virtual exercise station; the first and second users communicating by voice; optionally the first and second users viewing performance data generated by the first and second exercise station means; optionally any user being able to see if the other user has stopped exercising.
- the avatar user interface system 261 may be used for practicing and planning.
- Practicing might cover exercises for learning a new skill, preparing for delivery of an event or planning an event.
- applications that require practicing include: language learning, learning touch typing, delivering a presentation, public speaking, playing music, rehearsing a play, overcoming a fear such as that of public speaking by practicing in a virtual environment, planning the choreography of a ballet and planning the direction of an event.
- Practicing using the avatar user interface system 261 involves the person practicing generating input into the avatar user interface system 261 by means of voice, camera, keyboard, mouse or other specialised peripheral. This input may be fed to another person or an agent, where it is processed and feedback is given to the person practicing. Feedback may be verbal or visual. Emote keys may be used by the person feeding back such that a person's avatar can visually show pleasure, displeasure, comprehension, confusion and other emotions.
- a person planning will create a plan. This can be done collaboratively with others in synchronous or asynchronous ways. Synchronous planning will involve real-time interactions between users. Asynchronous planning might involve one person creating a plan such as a choreography for a ballet and others feeding back at a later time. In this case, a set of tools and props will usually be required for the application being planned.
- the avatar user interface system 261 has an Avatar Virtual Environment (AVE) as the background on the display device 264, and the desktop 423 is present and usable on a virtual computing appliance 421 within the AVE.
- AVE Avatar Virtual Environment
- Figure 53 is a schematic of the display 264 of an avatar user interface system 261 with an avatar virtual environment (AVE) 420 as the background in accordance with this fourteenth embodiment.
- a virtual computing appliance 421 with a virtual computing appliance display 422 is present in the AVE 420.
- the desktop 423 of the PC 3 is shown on the virtual computing appliance display 422.
- the virtual computing appliance 421 is not always visible in the AVE 420 because visibility depends on whether it falls within the field of view of the virtual camera being used at that instant .
- the windows user interface usually occupies the whole of the display area of the display device 264.
- the windows user interface usually consists of a desktop 423 background covering the whole display area and may have one or more windows open on top of the desktop 423. Any one window may be opened fully to cover the whole desktop 423.
- This avatar user interface system invention includes one or more windows containing an Avatar Virtual Environment (AVE) 420.
- An AVE is a photo-realistic virtual environment with photo-realistic avatars in it.
- the avatar user interface system of the First Embodiment uses avatar conference windows 23, 24 and 25, which are AVE windows, open in the context of the windows user interface. Controls such as control buttons 27 are situated outside the avatar conference window.
- the user 17 can move the virtual camera 71 such that the desktop 423 on the virtual computing appliance 421 is larger or smaller in the display device 264.
- the user 17 may also operate the desktop 423 on the virtual computing appliance 421 using input devices such as a keyboard 14 or a mouse 15.
- This fourteenth embodiment of the avatar user interface system invention enables a person to shift between frames with low cognitive jolt. Advantages of this embodiment include: improved communication, better task efficiency, a more suitable interface for multi-tasking between verbal tasks and information tasks and higher usability.
Avatar agent sharing virtual computing appliance in AVE
- It is a further purpose of this fourteenth embodiment that an avatar agent and a user may communicate in an AVE with a virtual computing appliance in it; the virtual computing appliance may be used by the avatar agent to communicate information to the user and by the user to communicate information to the avatar agent.
- a sample script is provided that might have been enacted between an avatar agent called Johan and a user using an AVE with a virtual laptop in it.
- the domain is the avatar agent giving professional advice to the user on risk management .
- 'Johan is the Advisor. He is a slightly old-fashioned 'Mad Professor' character, dressed in an old-style suit and bow tie. His half-glasses are at the end of his nose. He is seen seated at a desktop with a laptop facing towards the SME user.'
- the opening shot sets the scene: camera at first person point of view of SME user; Johan's head, upper body visible plus data centre in background; virtual laptop partially visible on table orientated partly towards SME user
- This embodiment is not limited to a single avatar agent; there may be a plurality of avatar agents interacting with the user.
- the virtual laptop is one example of a virtual computing appliance and other virtual computing appliances might be used in its stead, such as virtual plasma screens.
- the user 17 may use input means to the AVE other than voice. Such means might include a keyboard 14 or mouse 15; input created with them would appear directly on the virtual computing appliance display 422 visible on the display device 264.
- the avatar user interface system 261 comprises motion capture means and software director means to improve the sense of co-presence during a communication session.
- Figure 54 is a schematic of a motion-tracking terminal 265 of an avatar user interface system 261 including motion-tracking cameras 29 for a communication session.
- Three users 17 sit on chairs 174 around a table 172.
- At the end of the table 172 is a display device 264 with an AVE 420 displayed.
- the AVE 420 is displayed in such a way that the virtual table 51 in the AVE 420 appears to be a continuation of the physical table 172.
- Behind the virtual table 51 sit avatars 5 representing users 17 at other environmental locations 273.
- the AVE background behind the avatars 5 includes a virtual meeting room with windows 60, door 58, walls 55 and ceiling 56.
- Each user wears a microphone 12.
- There are loudspeakers 173 for outputting the voices of the participants that are not at that location. As disclosed in the Fourth Embodiment, sound is mixed.
- tracked avatar animation provides additional visual cues for facial expressions, head movements and hand gestures which contribute to natural face-to-face communication.
- Cameras 29 around the display device 264 capture the movements of participants. Images from the cameras 29 are processed in real-time to track facial animation, eye gaze, upper body movement and gestures.
- the 2D tracked movements are mapped onto the 3D virtual environment and avatars of each person.
- parameterised animation is generated. Video-based motion capture is used for non-invasive capture of face and body movements using a small number of cameras 29 surrounding the display screen 264. This motion capture augments the body and face movements of a participant's avatar animation where no motion capture input is available.
- a key innovation is the mapping of the captured movement to parameterised avatar motion models based on real movement to achieve realistic avatar animation that is robust to errors in the visual tracking.
- Parameters control motion characteristics such as movement speed and size.
- the emotional content of the original movement is conveyed whilst avoiding artefacts due to errors in tracking.
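The mapping above can be sketched as projecting noisy captured movement onto a small set of motion-model parameters and clamping each to the range observed in real movement, so a tracking error cannot drive the avatar into an unrealistic pose. The parameter names and ranges below are illustrative assumptions:

```python
REALISTIC_RANGES = {
    "speed": (0.2, 2.0),  # relative playback speed of the gesture model
    "size":  (0.5, 1.5),  # relative spatial extent of the gesture
}

def clamp(value, low, high):
    return max(low, min(high, value))

def to_motion_parameters(raw):
    """Map raw tracked estimates onto the parameterised motion model.

    Values outside the range learnt from real movement are clamped;
    missing estimates default to the neutral value 1.0.
    """
    return {name: clamp(raw.get(name, 1.0), lo, hi)
            for name, (lo, hi) in REALISTIC_RANGES.items()}
```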
- Adaptive background subtraction is used to separate foreground objects (people) from the background scene and avoid the requirement for highly structured backgrounds (blue-screen) or constant scene illumination.
- Eye-contact is an essential visual cue in face-to-face communication. To establish eye-contact between a virtual avatar and real participant, eye gaze direction of all participants is reconstructed. In a virtual meeting it is critical to establish which participant each person is looking at in near real-time. To achieve this, key facial features are tracked for each participant using a statistical template of facial appearance for each individual based on their avatar model. This is used to robustly identify the location of the eyes at each time instant. The use of a model-based vision approach allows the three dimensional location of these facial features to be reconstructed. A dynamic eye-template which models the appearance of the eye with changes in viewing direction according to the iris location is then used to reconstruct the approximate viewing direction of the subject.
- Estimated gaze direction is used to identify if a participant is looking at the facial region of another real participant or avatar. Eye-contact is then established with the corresponding avatar. Avatar gaze-direction is animated to ensure correct eye-contact together with smooth transition of eye-contact between participants and with the background scene (i.e. the participant is not paying attention or looking at other documents).
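The gaze-to-target step can be sketched geometrically: the reconstructed viewing direction is compared against the direction to each participant's facial region, and eye-contact is established with the nearest one within a tolerance, otherwise the gaze is treated as directed at the background scene. The tolerance value and function names are assumptions for illustration:

```python
import math

GAZE_TOLERANCE_DEG = 10.0  # assumed angular tolerance for "looking at"

def angle_between(v1, v2):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def gaze_target(eye_pos, gaze_dir, faces):
    """faces: avatar number -> 3D face position. Returns avatar number or None."""
    best, best_angle = None, GAZE_TOLERANCE_DEG
    for avatar_no, face_pos in faces.items():
        to_face = [f - e for f, e in zip(face_pos, eye_pos)]
        a = angle_between(gaze_dir, to_face)
        if a < best_angle:
            best, best_angle = avatar_no, a
    return best  # None means the participant is looking at the background
```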
- Established motion capture algorithms are used to reconstruct a subject's hand and head movement from the video streams.
- This approach utilises a real-time inverse kinematics engine to recover the approximate movement as estimates of joint angles.
- the reconstructed movement is mapped directly to the animated avatar using a dynamic filter to constrain the movement, impose joint angle limits and provide smooth animation.
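One simple realisation of that dynamic filtering (an assumption, not the specification's exact filter): each recovered joint angle is clamped to its anatomical limits and exponentially smoothed, so the avatar animation stays stable despite noisy inverse-kinematics estimates. The limit table and smoothing weight are invented examples:

```python
JOINT_LIMITS_DEG = {"elbow": (0.0, 150.0), "neck_yaw": (-80.0, 80.0)}
SMOOTHING = 0.3  # weight given to the newest estimate (0..1)

def filter_joint(previous_deg, estimate_deg, joint):
    """Clamp a joint-angle estimate to its limits, then smooth it."""
    low, high = JOINT_LIMITS_DEG[joint]
    clamped = max(low, min(high, estimate_deg))
    # Exponential smoothing constrains frame-to-frame movement.
    return (1.0 - SMOOTHING) * previous_deg + SMOOTHING * clamped
```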
- techniques for mapping the captured noisy movement into parameterised gestures are used.
- a database of parameterised realistic gestures is established using conventional marker-based motion capture to construct models of common gestures and explicitly parameterise the intra-gesture variation.
- Statistical models based on learning from visual data identify the gesture class and map the gesture to the appropriate set of parameters. This model-based approach to gesture animation enables smooth and realistic gesture animation from noisy input data .
- a key visual cue in face-to-face visual communication is the secondary facial expression in conjunction with speech.
- a model-based methodology is adopted based on a highly sophisticated facial animation model.
- the facial animation model encodes parametric models of facial expression that express both the extent of movement and the temporal duration of the movement.
- Video analysis of facial expression using particle filters identifies key facial features corresponding to different facial expressions.
- Statistical models of facial expression are learnt from labelled video sequence of multiple individuals. The learnt statistical models are used to identify the class of facial expression or combination of expressions.
- Finally detailed analysis of facial features is applied to identify the spatial and temporal parameters for a particular expression. The captured facial expression parameters are then used to augment the avatar facial movement synchronised with speech.
- this invention is operable for one or more users at each motion-tracking terminal 265.
- One limitation comes when there are so many users close together that the motion tracking system cannot resolve which movements belong to which person.
- a second limitation is that of the computing power of the motion tracking system to follow a maximum number of users 17 simultaneously.
- a third limitation is from the number of chairs that can be fitted around the table 172. For large sessions, this motion-tracking terminal permits two or more rows of chairs and for people to stand behind those sitting in the chairs. However, in this case most participants will not be motion tracked.
- the input of users to the meeting consists of speech and motion. It is important that the captured speech and motion are attributed to the correct avatar on other user devices .
- Speech is identified automatically by means of linking the identity of each microphone 12 with the avatar number 8 of the user 17.
- a person working in an organisation could have his identity card and his wireless microphone linked together. His microphone could be used for all voice input applications in the organisation such as fixed telephone, mobile telephone, paging, PC interaction and avatar user interface sessions.
- the organisation's database would link the person's identity, the microphone identity and the person's avatar number 8. This would be made available to the radio transceiver 170.
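The identity linkage described above amounts to a lookup table tying a person, his wireless microphone and his avatar number 8 together, so live speech from a given microphone can be attributed to the correct avatar. The table contents and names below are invented examples:

```python
# One row per person in the organisation's database (illustrative data).
DIRECTORY = [
    {"person": "A. Smith", "mic_id": "MIC-0042", "avatar_no": 8},
    {"person": "B. Jones", "mic_id": "MIC-0077", "avatar_no": 12},
]

def avatar_for_microphone(mic_id):
    """Resolve incoming speech to an avatar number, or None if unknown."""
    for row in DIRECTORY:
        if row["mic_id"] == mic_id:
            return row["avatar_no"]
    return None
```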
- a low-technology way is a manual process using a seating plan.
- Chairs 174 are always in known positions and numbered: Chair 1, Chair 2 etc.
- a user 17 at each location identifies the avatar number of the person in each chair by means of direct input into the avatar user interface system 261, normally using keyboard 14 or mouse 15. This manual setup process works but relies on fixed chair positions.
- a more flexible manual process is the interactive identification of each user 17.
- the motion-tracking terminal 265 knows who is present but not where they are located.
- the software director 80 asks each user in turn to wave both arms until the motion tracking system has located him. This enables people to move chairs around to suit the number of people present.
- One drawback of this method is that if people move around, the system might lose them, e.g. if they leave the room to get something and then return.
- a drawback with manual processes is that identification can take some time if there are a lot of people present and this wasted time costs money.
- a first method is that wireless microphones are automatically tracked by triangulation of the signal between two or more receivers to estimate the location of the person in the room. These estimated locations are automatically mapped onto the motion-tracking system output to identify each moving person automatically.
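A minimal sketch of such triangulation, simplified to two receivers with known positions that each estimate a bearing to the microphone (a real system would use more receivers and signal-strength or time-of-arrival data):

```python
import math

def triangulate(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect two bearing rays to locate a wireless microphone.

    p1, p2 are (x, y) receiver positions; bearings are measured in
    degrees from the +x axis. Returns the estimated (x, y) position,
    or None if the bearings are parallel.
    """
    d1 = (math.cos(math.radians(bearing1_deg)), math.sin(math.radians(bearing1_deg)))
    d2 = (math.cos(math.radians(bearing2_deg)), math.sin(math.radians(bearing2_deg)))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # parallel rays: no unique intersection
    # Solve p1 + t*d1 = p2 + s*d2 for t, then step along ray 1.
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])
```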
- each microphone on the system has a visible, signal emitting light that is tracked by the cameras.
- the code of the signal emitting light is unique and associated with the identity of the person.
- the cameras map the light to the movement of the person to automatically identify the person.
- the motion tracking terminal 265 might be designed as a range of different sizes and to different price points.
- a large motion tracking terminal 265 might use the whole wall of a room as the display device 264. This might be achieved by the wall being a special opaque screen for rear transmission and the projector in an adjacent room projecting the AVE 420 onto the screen such that it is visible to the users 17 in the meeting room.
- the width of the table 172 could be more than 5 metres; the shape could be elliptical on one side and straight on the display side.
- Two rows of chairs 174 might be provided.
- a large number of cameras 29 could be situated to track a large number of participants 17 sitting in the chairs 174. Each participant in the room could see each other participant. It might have a maximum capacity of more than 20 motion tracked people.
- a medium-size motion tracking terminal 265 might use two plasma screens situated on the end of the table 172. It might have a maximum capacity of 7 motion tracked people.
- a smaller motion tracking terminal 265 might use one monitor on the end of the table 172. It might have a maximum capacity of 3 motion tracked people.
- a motion tracking terminal 265 could be installed at each of the offices of an international organisation.
- a motion-tracking terminal 265 may be the optimal user device.
- a PC 3 may be the best device; this PC 3 may or may not have a webcam 29 to track the movements of the user 17. Whilst on the move, a user may use a mobile device such as a wireless Personal Digital Assistant (PDA) with telephone to participate in a communication session.
- PDA Personal Digital Assistant
- Caves 350, exercise stations 414 and VR Headsets 366 are other types of user device that may be used in an avatar user interface system 261.
- Figure 55 is a block diagram of apparatus for an avatar user interface system 261 with multiple user devices.
- a session server 1, an avatar hosting server 4, an avatar agent hosting server 321, a motion-tracking terminal 265, a CAVE 350 and a PC 3 are connected together by a network 2.
- an avatar user interface system 261 may be operable with a minimum of one user device and one user 17. In the case of one user 17, the user is probably communicating with an avatar intelligent agent software unit 320.
- the highest quality usage for the best sense of co-presence is when all the users 17 are using motion tracking terminals 265.
- This invention provides for the reality that users 17 may not all be at locations where there are motion tracking terminals 265 available and provides for users being connected via a variety of different user devices to one session.
- the display device 264 of the avatar user interface system 261 includes two or more projection means.
- a presentation slide containing words, when projected onto the virtual presentation screen 53, will be unreadable.
- a typical computer screen will have 1024 pixels across and this might also be the width of a large meeting room media window 50 showing an AVE 420. If the virtual presentation screen 53 is in proportion with the whole virtual meeting room, then it may only have 200 pixels width. This is not enough pixels for resolving the words on a presentation slide.
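The arithmetic behind this can be made explicit. The readability threshold below (roughly 8 pixels per character, 40 characters per slide line) is an assumption used for illustration, not a figure from this specification:

```python
def virtual_screen_pixels(window_width_px, screen_fraction):
    """Pixels across the virtual presentation screen 53 in the window."""
    return int(window_width_px * screen_fraction)

def is_readable(screen_px, chars_per_line=40, px_per_char=8):
    """Rough legibility check for slide text at this pixel width."""
    return screen_px >= chars_per_line * px_per_char
```

With a 1024-pixel window and a proportionally sized virtual screen of about a fifth of the width, only around 200 pixels remain, well below the roughly 320 needed for a typical line of slide text.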
- the human eye has great resolving power and a person may read a poster on a wall, even if the poster is quite small and the person is not close to it. From the same position, the person can also take in the whole wall by 'zooming out'.
- a novel display apparatus in an avatar user interface system 261 is disclosed, which takes advantage of the capabilities of the human eye to view the AVE 420 and the presentation screen 53 simultaneously, at full resolution, as if they were one environment.
- Figure 56 is a schematic of a display device 264 consisting of a display screen 430, an AVE projector 431 and a Presentation projector 432.
- the meeting room media window 50 is projected by the AVE projector 431.
- the virtual presentation screen 53 is projected by the Presentation projector 432.
- the same area in the AVE is projected black, so that minimal light leaves the AVE projector 431 to fall on the area of the presentation screen 53.
- the presentation benefits from the full contrast of the Presentation projector 432.
- the presentation appears brighter than the AVE, which is a strong parallel to a real presentation in a darkened real room, in which the presentation screen is usually the brightest element. Projection may be from the tabletop, from a ceiling attachment or in reverse from behind an opaque screen.
- the software director 80 on the PC 3 will generate two full-size displays: the AVE and the presentation; 3D graphics cards already on the market can drive two full-size displays.
- the display screen 430 may be any aspect ratio or it may be curved.
- Figure 57 is a schematic of a display device 264 in which the AVE and Presentation projection means are combined into one physical unit 433.
- the AVE projection optics 434 has the normal controls available on a desktop projector such as focus and perhaps zoom.
- the axis 439 of the Presentation projection optics 435 may be altered such that it points anywhere within the AVE area 440 projected by the AVE projection lens 434.
- a slider control 436 can be moved by a user 17 to move the axis 439 from left to right.
- a slider control 437 can be moved by a user 17 to move the axis 439 up and down.
- a slider control 438 can be moved by a user 17 to zoom the Presentation area 441 in and out.
- the controls 436-438 may directly move the presentation projection optics 435, or they may drive motors that move the optics. In this way, the presentation area 441 can quickly be aligned to the right place in the AVE area 440 at the start of the session.
- the software director 80 does not move the pixel position of the virtual presentation screen 53 in the AVE 420.
- Manual control of the position and size of the axis may be achieved by a number of other means such as the use of a remote control.
- a camera 221 built into the projector 443 that images the AVE area 440 could be used to locate the projected size/position of the Presentation area 441.
- a control loop could be constructed to set the presentation projector axis orientation/zoom automatically using software-driven motors driving the presentation projection optics 435.
- the control loop could be driven by the software director 80 from the PC 3 which could project reference images from both projectors alternately that are imaged by the camera 221. It is a further purpose of this sixteenth embodiment that the projection means is provided with alignment means that can be either manual or automatic or both.
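A schematic of such a control loop, with invented function names standing in for the camera 221 measurement and the motor drive: the loop measures the pixel offset between the projected Presentation area 441 and its target position in the AVE area 440, then steps the motors until the error falls within tolerance:

```python
TOLERANCE_PX = 2.0  # assumed acceptable residual misalignment
GAIN = 0.5          # fraction of the measured error corrected per iteration

def align(measure_offset, move_motors, max_iterations=50):
    """Drive the projector axis until the measured offset is small.

    measure_offset() -> (dx, dy) pixel error from the camera image;
    move_motors(dx, dy) applies a corrective step to the optics 435.
    Returns True once aligned, False if it fails to converge.
    """
    for _ in range(max_iterations):
        dx, dy = measure_offset()
        if abs(dx) <= TOLERANCE_PX and abs(dy) <= TOLERANCE_PX:
            return True
        move_motors(GAIN * dx, GAIN * dy)
    return False
```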
- Each presentation screen might be driven by a different projector.
- a plurality of virtual presentation screens might be arranged in the AVE such that they can be driven by one presentation projector 432. In this case, the resolution of each virtual presentation screen is half or less.
- PCs are able to generate real-time 3D with more pixels than display projectors can project.
- Two or more AVE projectors 431 could be used in a tile formation to project a high-resolution AVE.
- Alignment means permit the projectors to be aligned to each other so that there are no gaps and no overlaps.
- the display screen 430 may be planar-rectangular, or it may be curved, or it may comprise a number of planes abutting at any angle. Different projectors might be located to project onto different planes or curves.
- any number of AVE projectors 431 and any number of Presentation projectors 432, whether integrated in units 433 of two or more projectors or not, may be used to display any number of virtual presentation screens within an AVE on a continuous display screen of any shape or combination of shapes .
- Display devices available today usually have a single screen that is either illuminated within its unit (such as CRT monitors, LCD displays, plasma screens, opto-polymer displays) or comprises a separate screen illuminated by projection from another unit (front projector, rear projector) .
- the scope of this sixteenth embodiment is not limited to projection devices, but includes single unit devices with two or more areas of display of different pixel densities as measured by pixel row and column spacings in units of length.
- Figure 58 is a schematic of a multi-density display device 451 comprising an area of low-density pixels 452 and an embedded area of high-density pixels 453.
- the multi-density display device 451 may be packaged in a single unit, which has the advantages of lower complexity, lower weight and lower manufacturing and installation costs, for example. Or, it may be packaged as two or more units.
- the embedded high-density area 453 may insert into the low-density area 452 such that the join cannot be seen when the multi-density device is in use, or the join may be visible, but not in such a way that it impairs the usability of the device.
- the high-density area 453 could be situated anywhere in the low-density area 452.
- the high-density area 453 could be central, surrounded on all sides by low-density pixels 452, or it could be in an edge or at a corner or as a flap along a whole edge.
- the main advantage of a multi-density display device 451 over a uniformly high-density device is that it will be lower cost to manufacture and require less electronics to drive. Most multi-density display devices 451 will only have double the number of pixels of a conventional display device, instead of possibly nine times for a typical application.
- the multi-density display device 451 is operable such that a single image, e.g. a photograph, can be displayed at uniformly low resolution across the entire device.
- the row and column density of the high-density area 453 is an integer factor of the low-density area 452. If the integer factor is 2 then there will be two rows of pixels in the high- density area for each row in the low-density area. The same applies for columns. This is shown in the magnified part of Figure 58. In this configuration, four high-density pixels 455 may be imaged to be equivalent to a single low-density pixel 454. A similar correspondence applies for other integer factors such as 3 or 4.
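The integer-factor correspondence can be sketched directly: to show a single image at uniform resolution, each low-density pixel is replicated into a factor-by-factor block of high-density pixels (four pixels 455 per low-density pixel 454 when the factor is 2):

```python
def replicate(image_rows, factor):
    """Upscale a pixel grid by an integer factor in rows and columns.

    image_rows is a list of lists of pixel values; each source pixel
    becomes a factor x factor block in the returned grid.
    """
    out = []
    for row in image_rows:
        expanded = [value for value in row for _ in range(factor)]
        out.extend([list(expanded) for _ in range(factor)])
    return out
```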
- the multi-density display device 451 may display an Avatar Virtual Environment 420 onto the low-density area 452 and a virtual presentation screen 53 onto some or all of the high-density area 453.
- the display illumination intensity of the low- density area 452 may be different from the display illumination intensity of the high-density area 453. In the case of displaying small text on the high-density area 453, it will be easier to read if it has a higher display illumination intensity.
- Multi-density display devices 451 may be manufactured in a variety of ways using a variety of technologies such as liquid crystal, plasma and opto-polymers. Manufacturing processes will need to be developed for the production of multi-density display devices and this is not expected to be difficult for those skilled in the art.
- any number of low-density areas 452 and any number of high-density areas 453 may be combined in any way in a multi-density display device 451.
- Dual-projection devices 433 and multi-density display devices 451 are useful in communication sessions involving both AVEs and detailed information displays.
- a key advantage is the combination of a sense of presence with the ability to view detailed information, such that the user has a feeling of being there.
- a range of devices 431, 432, 433, 451 may cover needs from one user in a small room to several thousand users in a large conference room. It is a further purpose of this sixteenth embodiment to disclose a process wherein a computing appliance means uses a display device comprising two projector means, comprising the following steps: a first projector projects an avatar virtual environment; a second projector projects a presentation; such that both projections respond to changes independently and at the frame rate being used.
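The two-projector process above can be modelled as two independent render loops, each advancing its own frame counter so that one projection can change without disturbing the other. A hypothetical sketch, where the `Projection` class and the render callbacks are assumptions made for illustration:

```python
class Projection:
    """One projector's render state; its frame counter advances
    independently of the other projector's."""
    def __init__(self, name, render):
        self.name = name
        self.render = render   # callback: frame index -> image
        self.frame = 0

    def tick(self):
        image = self.render(self.frame)
        self.frame += 1
        return image

def run_dual_projection(ave_render, presentation_render, frames):
    """Drive both projectors for a number of frames: the first
    projects the avatar virtual environment, the second the
    presentation, and each responds only to its own content."""
    ave = Projection("avatar-environment", ave_render)
    pres = Projection("presentation", presentation_render)
    return [(ave.tick(), pres.tick()) for _ in range(frames)]
```

For example, an avatar environment rendered every frame can be paired with a presentation whose slide only changes every second frame; neither loop blocks the other.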
- the avatar user interface system 261 includes a directional microphone device 460.
- live presentations can be delivered by a remote presenter to a room with an audience using an avatar user interface system 261. Furthermore, live presentations can be delivered to a mixed audience consisting of an audience physically present in a room and a virtual audience simultaneously present at one or more other locations, connected by a network. During a presentation, the presenter's avatar can use media such as slide images projected onto a virtual screen.
- the first problem is that of gaze. It is normal for a lecturer to address the person in the audience who asked the question. But where is that person?
- the second problem is that of mixed audiences: if the questioner is not in the same room as a viewer, then it will be beneficial for the viewer to see a virtual audience.
- a third problem is that, in a large audience, it is unlikely that every member will have an identifiable avatar and a personal microphone.
- Figure 59 is a schematic of an avatar user interface system 261 with a mixed audience of avatars 5 of virtual users at various locations and physical users 17 in an environmental location 273, which is a room containing the physical audience and a directional microphone device 460 that records not only sound but also the direction from which the sound is coming.
- the directional microphone device 460 is connected to a 'Room PC' 3 that is also connected to the room's display device 264 and a network 2.
- An avatar 5 labelled 'Virtual Presenter' represents a remote user 17 labelled 'Remote Presenter'.
- the remote presenter 17 is using a 'Presenter PC' 3 on the network 2.
- a physical user 17 labelled 'Questioner' asks a question.
- the voice 270 and its direction are picked up by the directional microphone device 460 that feeds the information to the PC 3.
- the software director 80 controls the gaze direction of the virtual presenter 5 to face the questioner 17 as the presenter 17 replies.
- the accuracy of the gaze direction of the virtual presenter 5 towards the questioner 17 can be improved by building a virtual model of the environmental location 273 including the positions of the display device 264 and the directional microphone device 460.
- the accuracy of the gaze direction can be further improved by (a) using the directional microphone device to identify which fixed microphone is being used and (b) using the known location of the fixed microphone in the virtual model of the environmental location to determine the gaze direction.
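The gaze-direction calculation described in this and the preceding point reduces to simple 2D geometry in the virtual model of the room: locate the questioner from the microphone position and the recorded sound bearing, then compute the bearing the displayed avatar should face. The positions, units and function names below are illustrative assumptions, not the patent's implementation:

```python
import math

def questioner_position(mic_pos, bearing_deg, distance):
    """Estimate the questioner's (x, y) position in the room's
    virtual model from the microphone position, the sound bearing
    reported by the directional microphone device, and an assumed
    distance (e.g. from a known seating layout)."""
    theta = math.radians(bearing_deg)
    return (mic_pos[0] + distance * math.cos(theta),
            mic_pos[1] + distance * math.sin(theta))

def gaze_direction(avatar_pos, target_pos):
    """Bearing, in degrees, that the virtual presenter avatar
    should face to look towards the target position."""
    dx = target_pos[0] - avatar_pos[0]
    dy = target_pos[1] - avatar_pos[1]
    return math.degrees(math.atan2(dy, dx))
```

The same `gaze_direction` helper would serve the later case where both endpoints are avatars, since the software director then knows both positions directly.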
- a 'Remote Questioner' 17 is visualised at the environmental location 273 as a 'Virtual Remote Questioner' 5 displayed on the display device 264.
- the software director knows the positions of both the virtual presenter avatar 5 and the virtual remote questioner avatar 5 and can calculate the gaze direction.
- the physical members of the audience at the environmental location 273 see the remote presenter answering the remote questioner.
- This embodiment is applicable to multiple remote presenters such as a presenter and a chairman or a panel of presenters.
- One or more of the presenters may be at the same environmental location 273.
- Any number of environmental locations 273 with two or more users 17 and any number of environmental locations 273 with one user 17 may be connected by a network 2 during a presentation.
- This embodiment is also applicable to the simple case of one remote presenter presenting to one physical audience, in which case there is no virtual remote audience.
- the software director 80 has to determine movements for the virtual presenter avatar 5 in real-time. Many body and facial gestures are normally timed by skilled presenters to fit in with the beginning and end of sentences. This is not possible in real-time for the software director 80 because it does not know when a sentence is due to begin or end.
- a remote presenter may pre-record his presentation using a microphone to record the words as he speaks them.
- the software director 80 can then be used to prepare a better visual avatar presentation than the live presentation. This preparation can be done automatically by the software director 80 or interactively with the presenter 17.
- Figure 60 is a block diagram of an apparatus for presentation preparation.
- a presentation preparer 461 may be operated either automatically or interactively by a user 17 to output a prepared presentation 466. At any time later, the prepared presentation 466 may be played on a player 210.
- the presentation preparer 461 has a set of voice recordings 464 and any associated media elements 465 as the main input. Media elements might be slide images, animations, audio-video clips, 3D objects, avatar player scenes or any other type of media.
- a prepared presentation 466 is an example of an avatar player scene; it may be executed in a linear fashion by a player 210.
- a presentation may be prepared without media elements 465.
- a presentation might also be mimed without voice recordings 464.
- the software director 80 takes a series of voice recordings 464 that have been associated with presentation media elements 465 such as slide changes and automatically generates the complete presentation including but not limited to: movement, gestures, gaze and lipsync for avatars; lighting, prop and camera animations.
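One plausible reading of this preparation step is a timeline generator: recordings are laid end to end, and any media element associated with a recording (such as a slide change) fires at that recording's start. The dictionary shapes and field names below are illustrative assumptions, not the patent's data model:

```python
def prepare_presentation(voice_recordings, media_elements):
    """Build a linear timeline from voice recordings and their
    associated media elements. Each recording starts when the
    previous one ends; linked media events fire at that instant."""
    timeline, t = [], 0.0
    for rec in voice_recordings:
        # Fire any media associated with this recording first,
        # e.g. a slide change just before the speech resumes.
        for media in media_elements.get(rec["id"], []):
            timeline.append({"time": t, "event": "media", "item": media})
        timeline.append({"time": t, "event": "speech", "item": rec["id"]})
        t += rec["duration"]
    return timeline
```

The software director 80 would then decorate such a timeline with the generated movement, gestures, gaze and lipsync for the avatar, plus lighting, prop and camera animations.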
- a library of presentation actions 462 and a presentation action generator 463 is used for preparing the avatar animation.
- a set of automatic presentation rules is built into the presentation preparer 461 which is a finite state machine.
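The patent does not enumerate the states of this finite state machine; a hypothetical transition table for a presenting avatar might look like the following, where the state and event names are invented for illustration:

```python
# Hypothetical states and transitions for the presentation preparer's
# finite state machine; none of these names appear in the patent.
TRANSITIONS = {
    ("idle", "recording_starts"): "speaking",
    ("speaking", "media_change"): "turn_to_screen",
    ("turn_to_screen", "media_shown"): "speaking",
    ("speaking", "recording_ends"): "idle",
}

def step(state, event):
    """Advance the finite state machine by one event; events with
    no defined transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Automatic preparation would then walk the voice recordings and media events through this machine, emitting the appropriate avatar animations at each transition.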
- for manual presentation preparation using the presentation preparer 461, the user 17 may select which animations should be used and when.
- Manual preparation is based on manually editing event positions on a timeline.
- either a user 17 may control the mode of the software director 80 using mode selection buttons in the avatar user interface window 260, or the software director 80 may make a best guess at the mode.
- the rules applied to controlling the movement of the avatar of the presenter vary with mode. Modes include: playing a prepared presentation; live presentation; question and answer.
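The mode-dependent behaviour above could be represented as a lookup from mode to rule set; the specific rule names and values below are illustrative assumptions rather than rules stated in the patent:

```python
# Illustrative per-mode rule sets for the software director; the
# actual rules are not specified in the source.
MODE_RULES = {
    "prepared": {"gesture_timing": "pre-scripted", "gaze": "scripted"},
    "live": {"gesture_timing": "generic", "gaze": "audience"},
    "question_answer": {"gesture_timing": "reactive", "gaze": "questioner"},
}

def movement_rules(mode):
    """Return the movement rules the software director applies in
    the given mode, falling back to live-presentation behaviour
    when the mode cannot be determined (a 'best guess')."""
    return MODE_RULES.get(mode, MODE_RULES["live"])
```

The fallback mirrors the point above that the software director may make a best guess at the mode when the user has not selected one.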
- It is a further purpose of this embodiment to disclose a process wherein a computing appliance means uses directional microphone means and seating plan means, comprising the following steps: a person speaks; a directional microphone means records the person's speech and the direction the speech is coming from; a software director uses the seating plan and the direction that the speech is coming from to generate avatar enactments such that displayed avatars can gaze in the direction of the speaker.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2003201032A AU2003201032A1 (en) | 2002-01-07 | 2003-01-07 | Method and apparatus for an avatar user interface system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0200255A GB0200255D0 (en) | 2002-01-07 | 2002-01-07 | Avatar user interface system |
GB0200255.8 | 2002-01-07 | ||
GB0208146A GB0208146D0 (en) | 2002-01-07 | 2002-04-09 | Avatar user interface system |
GB0208146.1 | 2002-04-09 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2003058518A2 true WO2003058518A2 (en) | 2003-07-17 |
WO2003058518A3 WO2003058518A3 (en) | 2004-05-27 |
Family
ID=26246918
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2003/000031 WO2003058518A2 (en) | 2002-01-07 | 2003-01-07 | Method and apparatus for an avatar user interface system |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2003201032A1 (en) |
WO (1) | WO2003058518A2 (en) |
Cited By (103)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2395822A (en) * | 2002-11-26 | 2004-06-02 | Rockwell Electronic Commerce | Distributed transaction processing system using virtual reality interface |
EP1437880A1 (en) * | 2003-01-13 | 2004-07-14 | AT&T Corp. | Enhanced audio communications in an interactive environment |
DE10329244A1 (en) * | 2003-06-24 | 2005-01-13 | Deutsche Telekom Ag | Communication method e.g. between conference participants, involves conference participants to communicate information with communication over conference management unit |
EP1526489A1 (en) * | 2003-10-20 | 2005-04-27 | Ncr International Inc. | Self-service terminal with avatar user interface |
EP1740279A2 (en) * | 2004-02-17 | 2007-01-10 | International Business Machines Corporation | Sip based voip multiplayer network games |
EP1784020A1 (en) * | 2005-11-08 | 2007-05-09 | TCL & Alcatel Mobile Phones Limited | Method and communication apparatus for reproducing a moving picture, and use in a videoconference system |
EP1813330A1 (en) * | 2006-01-27 | 2007-08-01 | DotCity Inc. | System of developing urban landscape by using electronic data |
WO2007130691A2 (en) | 2006-05-07 | 2007-11-15 | Sony Computer Entertainment Inc. | Method for providing affective characteristics to computer generated avatar during gameplay |
WO2008049237A1 (en) * | 2006-10-26 | 2008-05-02 | Pixman Corporation | Interactive system and method |
EP1976291A1 (en) | 2007-03-02 | 2008-10-01 | Deutsche Telekom AG | Method and video communication system for gesture-based real-time control of an avatar |
WO2009003536A1 (en) | 2007-06-29 | 2009-01-08 | Sony Ericsson Mobile Communications Ab | Methods and terminals that control avatars during videoconferencing and other communications |
EP2019530A1 (en) * | 2007-07-25 | 2009-01-28 | British Telecommunications Public Limited Company | Message delivery |
US7668515B2 (en) | 2004-10-06 | 2010-02-23 | Comverse Ltd. | Portable telephone for conveying real time walkie-talkie streaming audio-video |
US7675519B2 (en) * | 2004-08-05 | 2010-03-09 | Elite Avatars, Inc. | Persistent, immersible and extractable avatars |
US7769806B2 (en) | 2007-10-24 | 2010-08-03 | Social Communications Company | Automated real-time data stream switching in a shared virtual area communication environment |
US7844724B2 (en) | 2007-10-24 | 2010-11-30 | Social Communications Company | Automated real-time data stream switching in a shared virtual area communication environment |
US7844424B2 (en) | 2007-03-01 | 2010-11-30 | The Boeing Company | Human behavioral modeling and simulation framework |
EP2269389A1 (en) * | 2008-03-18 | 2011-01-05 | Avaya Inc. | Realistic audio communications in a three dimensional computer-generated virtual environment |
WO2011010034A1 (en) * | 2009-07-24 | 2011-01-27 | Alcatel Lucent | Method for communication between at least one transmitter of a media stream and at least one receiver of said media stream in an electronic telecommunication service |
US7983996B2 (en) | 2007-03-01 | 2011-07-19 | The Boeing Company | Method and apparatus for human behavior modeling in adaptive training |
WO2011123192A1 (en) * | 2010-03-30 | 2011-10-06 | Sony Computer Entertainment Inc. | Method for an augmented reality character to maintain and exhibit awareness of an observer |
US8149241B2 (en) | 2007-12-10 | 2012-04-03 | International Business Machines Corporation | Arrangements for controlling activities of an avatar |
WO2012066104A3 (en) * | 2010-11-17 | 2012-07-12 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US8228170B2 (en) | 2008-01-10 | 2012-07-24 | International Business Machines Corporation | Using sensors to identify objects placed on a surface |
FR2975198A1 (en) * | 2011-05-10 | 2012-11-16 | Peugeot Citroen Automobiles Sa | Virtual reality equipment i.e. immersive virtual reality environment, has light emitting device covering whole or part of real object or individual intended to be integrated into virtual scene for display of images related to virtual scene |
US8316310B2 (en) | 2008-08-05 | 2012-11-20 | International Business Machines Corporation | System and method for human identification proof for use in virtual environments |
US8365075B2 (en) | 2009-11-19 | 2013-01-29 | International Business Machines Corporation | Recording events in a virtual world |
US8379968B2 (en) | 2007-12-10 | 2013-02-19 | International Business Machines Corporation | Conversion of two dimensional image data into three dimensional spatial data for use in a virtual universe |
US8386918B2 (en) | 2007-12-06 | 2013-02-26 | International Business Machines Corporation | Rendering of real world objects and interactions into a virtual universe |
US8465408B2 (en) | 2009-08-06 | 2013-06-18 | Neosync, Inc. | Systems and methods for modulating the electrical activity of a brain using neuro-EEG synchronization therapy |
US8475354B2 (en) | 2007-09-25 | 2013-07-02 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US8547380B2 (en) | 2004-08-05 | 2013-10-01 | Elite Avatars, Llc | Persistent, immersible and extractable avatars |
US8585568B2 (en) | 2009-11-12 | 2013-11-19 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
WO2014181045A1 (en) * | 2013-05-07 | 2014-11-13 | Glowbl | Communication interface and method, computer programme and corresponding recording medium |
WO2014181064A1 (en) * | 2013-05-07 | 2014-11-13 | Glowbl | Communication interface and method, computer programme and corresponding recording medium |
US8926490B2 (en) | 2008-09-24 | 2015-01-06 | Neosync, Inc. | Systems and methods for depression treatment using neuro-EEG synchronization therapy |
US20160110044A1 (en) * | 2014-10-20 | 2016-04-21 | Microsoft Corporation | Profile-driven avatar sessions |
US9483157B2 (en) | 2007-10-24 | 2016-11-01 | Sococo, Inc. | Interfacing with a spatial virtual communication environment |
US9649502B2 (en) | 2011-11-14 | 2017-05-16 | Neosync, Inc. | Devices and methods of low frequency magnetic stimulation therapy |
WO2017205226A1 (en) * | 2016-05-27 | 2017-11-30 | Microsoft Technology Licensing, Llc | Communication visualisation |
US9853922B2 (en) | 2012-02-24 | 2017-12-26 | Sococo, Inc. | Virtual area communications |
WO2018022392A1 (en) * | 2016-07-29 | 2018-02-01 | Microsoft Technology Licensing, Llc | Private communication by gazing at avatar |
CN107705341A (en) * | 2016-08-08 | 2018-02-16 | 创奇思科研有限公司 | The method and its device of user's expression head portrait generation |
CN107944907A (en) * | 2017-11-16 | 2018-04-20 | 琦境科技(北京)有限公司 | A kind of method and system of virtual reality exhibition room interaction |
US9962555B1 (en) | 2017-01-17 | 2018-05-08 | Neosync, Inc. | Head-mountable adjustable devices for generating magnetic fields |
US10099140B2 (en) | 2015-10-08 | 2018-10-16 | Activision Publishing, Inc. | System and method for generating personalized messaging campaigns for video game players |
US10118099B2 (en) | 2014-12-16 | 2018-11-06 | Activision Publishing, Inc. | System and method for transparently styling non-player characters in a multiplayer video game |
US10137376B2 (en) | 2012-12-31 | 2018-11-27 | Activision Publishing, Inc. | System and method for creating and streaming augmented game sessions |
EP2163066B1 (en) * | 2007-06-26 | 2019-01-09 | Orange | Method and system for determining a geographical location for meeting between people via a telecommunication environment |
US10179289B2 (en) | 2016-06-21 | 2019-01-15 | Activision Publishing, Inc. | System and method for reading graphically-encoded identifiers from physical trading cards through image-based template matching |
US10213682B2 (en) | 2015-06-15 | 2019-02-26 | Activision Publishing, Inc. | System and method for uniquely identifying physical trading cards and incorporating trading card game items in a video game |
US10226703B2 (en) | 2016-04-01 | 2019-03-12 | Activision Publishing, Inc. | System and method of generating and providing interactive annotation items based on triggering events in a video game |
US10228760B1 (en) | 2017-05-23 | 2019-03-12 | Visionary Vr, Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US10232272B2 (en) | 2015-10-21 | 2019-03-19 | Activision Publishing, Inc. | System and method for replaying video game streams |
EP2720766B1 (en) * | 2011-06-15 | 2019-03-27 | Sony Interactive Entertainment America LLC | System and method for managing audio and video channels for video game players and spectators |
US10245509B2 (en) | 2015-10-21 | 2019-04-02 | Activision Publishing, Inc. | System and method of inferring user interest in different aspects of video game streams |
US10284454B2 (en) | 2007-11-30 | 2019-05-07 | Activision Publishing, Inc. | Automatic increasing of capacity of a virtual space in a virtual world |
US10286326B2 (en) | 2014-07-03 | 2019-05-14 | Activision Publishing, Inc. | Soft reservation system and method for multiplayer video games |
US10286314B2 (en) | 2015-05-14 | 2019-05-14 | Activision Publishing, Inc. | System and method for providing continuous gameplay in a multiplayer video game through an unbounded gameplay session |
US10315113B2 (en) | 2015-05-14 | 2019-06-11 | Activision Publishing, Inc. | System and method for simulating gameplay of nonplayer characters distributed across networked end user devices |
EP3371778A4 (en) * | 2015-11-06 | 2019-06-26 | Mursion, Inc. | Control system for virtual characters |
WO2019143780A1 (en) * | 2018-01-19 | 2019-07-25 | ESB Labs, Inc. | Virtual interactive audience interface |
US10376781B2 (en) | 2015-10-21 | 2019-08-13 | Activision Publishing, Inc. | System and method of generating and distributing video game streams |
US10376793B2 (en) | 2010-02-18 | 2019-08-13 | Activision Publishing, Inc. | Videogame system and method that enables characters to earn virtual fans by completing secondary objectives |
US10421019B2 (en) | 2010-05-12 | 2019-09-24 | Activision Publishing, Inc. | System and method for enabling players to participate in asynchronous, competitive challenges |
US10471271B1 (en) | 2005-01-21 | 2019-11-12 | Michael Sasha John | Systems and methods of individualized magnetic stimulation therapy |
US10471348B2 (en) | 2015-07-24 | 2019-11-12 | Activision Publishing, Inc. | System and method for creating and sharing customized video game weapon configurations in multiplayer video games via one or more social networks |
US10500498B2 (en) | 2016-11-29 | 2019-12-10 | Activision Publishing, Inc. | System and method for optimizing virtual games |
CN110635969A (en) * | 2019-09-30 | 2019-12-31 | 浪潮软件集团有限公司 | High concurrency test method for streaming media direct memory system |
US10561945B2 (en) | 2017-09-27 | 2020-02-18 | Activision Publishing, Inc. | Methods and systems for incentivizing team cooperation in multiplayer gaming environments |
US10573065B2 (en) | 2016-07-29 | 2020-02-25 | Activision Publishing, Inc. | Systems and methods for automating the personalization of blendshape rigs based on performance capture data |
US10588576B2 (en) | 2014-08-15 | 2020-03-17 | Neosync, Inc. | Methods and device for determining a valid intrinsic frequency |
US10627983B2 (en) | 2007-12-24 | 2020-04-21 | Activision Publishing, Inc. | Generating data for managing encounters in a virtual world environment |
US10650539B2 (en) | 2016-12-06 | 2020-05-12 | Activision Publishing, Inc. | Methods and systems to modify a two dimensional facial image to increase dimensional depth and generate a facial image that appears three dimensional |
US10765948B2 (en) | 2017-12-22 | 2020-09-08 | Activision Publishing, Inc. | Video game content aggregation, normalization, and publication systems and methods |
EP3734966A1 (en) * | 2019-05-03 | 2020-11-04 | Nokia Technologies Oy | An apparatus and associated methods for presentation of audio |
US10974150B2 (en) | 2017-09-27 | 2021-04-13 | Activision Publishing, Inc. | Methods and systems for improved content customization in multiplayer gaming environments |
US10981069B2 (en) | 2008-03-07 | 2021-04-20 | Activision Publishing, Inc. | Methods and systems for determining the authenticity of copied objects in a virtual environment |
US11040286B2 (en) | 2017-09-27 | 2021-06-22 | Activision Publishing, Inc. | Methods and systems for improved content generation in multiplayer gaming environments |
CN113189612A (en) * | 2021-05-17 | 2021-07-30 | 长安大学 | Gravel seal quality detection device based on depth camera |
US11097193B2 (en) | 2019-09-11 | 2021-08-24 | Activision Publishing, Inc. | Methods and systems for increasing player engagement in multiplayer gaming environments |
US11134308B2 (en) | 2018-08-06 | 2021-09-28 | Sony Corporation | Adapting interactions with a television user |
US11151999B2 (en) | 2019-08-01 | 2021-10-19 | International Business Machines Corporation | Controlling external behavior of cognitive systems |
CN113661691A (en) * | 2019-09-27 | 2021-11-16 | 苹果公司 | Environment for remote communication |
US20210358188A1 (en) * | 2020-05-13 | 2021-11-18 | Nvidia Corporation | Conversational ai platform with rendered graphical output |
US11185784B2 (en) | 2015-10-08 | 2021-11-30 | Activision Publishing, Inc. | System and method for generating personalized messaging campaigns for video game players |
EP3951604A4 (en) * | 2019-04-01 | 2022-06-01 | Sumitomo Electric Industries, Ltd. | Communication assistance system, communication assistance method, communication assistance program, and image control program |
US11351466B2 (en) | 2014-12-05 | 2022-06-07 | Activision Publishing, Ing. | System and method for customizing a replay of one or more game events in a video game |
US11351459B2 (en) | 2020-08-18 | 2022-06-07 | Activision Publishing, Inc. | Multiplayer video games with virtual characters having dynamically generated attribute profiles unconstrained by predefined discrete values |
US11527032B1 (en) | 2022-04-11 | 2022-12-13 | Mindshow Inc. | Systems and methods to generate and utilize content styles for animation |
US11524234B2 (en) | 2020-08-18 | 2022-12-13 | Activision Publishing, Inc. | Multiplayer video games with virtual characters having dynamically modified fields of view |
US11532179B1 (en) | 2022-06-03 | 2022-12-20 | Prof Jim Inc. | Systems for and methods of creating a library of facial expressions |
DE102021120330A1 (en) | 2021-08-04 | 2023-02-09 | MyArtist& Me GmbH | Real time audio video streaming system |
US20230073828A1 (en) * | 2021-09-07 | 2023-03-09 | Ringcentral, Inc | System and method for identifying active communicator |
US11679330B2 (en) | 2018-12-18 | 2023-06-20 | Activision Publishing, Inc. | Systems and methods for generating improved non-player characters |
US11712627B2 (en) | 2019-11-08 | 2023-08-01 | Activision Publishing, Inc. | System and method for providing conditional access to virtual gaming items |
WO2023156984A1 (en) * | 2022-02-21 | 2023-08-24 | TMRW Foundation IP SARL | Movable virtual camera for improved meeting views in 3d virtual |
EP4254943A1 (en) * | 2022-03-30 | 2023-10-04 | TMRW Foundation IP SARL | Head-tracking based media selection for video communications in virtual environments |
EP4286995A1 (en) * | 2022-05-31 | 2023-12-06 | TMRW Foundation IP SARL | Method, system and computer program product for providing navigation assistance in three-dimensional virtual environments |
US11893964B2 (en) | 2019-09-26 | 2024-02-06 | Apple Inc. | Controlling displays |
US11956571B2 (en) * | 2022-07-28 | 2024-04-09 | Katmai Tech Inc. | Scene freezing and unfreezing |
US11960641B2 (en) | 2018-09-28 | 2024-04-16 | Apple Inc. | Application placement based on head position |
US11972086B2 (en) | 2019-03-18 | 2024-04-30 | Activision Publishing, Inc. | Automatic increasing of capacity of a virtual space in a virtual world |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000010099A1 (en) * | 1998-08-17 | 2000-02-24 | Net Talk, Inc. | Computer architecture and process for audio conferencing over local and global networks including internets and intranets |
US6119147A (en) * | 1998-07-28 | 2000-09-12 | Fuji Xerox Co., Ltd. | Method and system for computer-mediated, multi-modal, asynchronous meetings in a virtual space |
GB2351216A (en) * | 1999-01-20 | 2000-12-20 | Canon Kk | Computer conferencing apparatus |
WO2001001354A1 (en) * | 1999-06-24 | 2001-01-04 | Stephen James Crampton | Method and apparatus for the generation of computer graphic representations of individuals |
EP1094657A1 (en) * | 1999-10-18 | 2001-04-25 | BRITISH TELECOMMUNICATIONS public limited company | Mobile conferencing system and method |
2003
- 2003-01-07 WO PCT/GB2003/000031 patent/WO2003058518A2/en not_active Application Discontinuation
- 2003-01-07 AU AU2003201032A patent/AU2003201032A1/en not_active Abandoned
Cited By (191)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2395822A (en) * | 2002-11-26 | 2004-06-02 | Rockwell Electronic Commerce | Distributed transaction processing system using virtual reality interface |
US10335691B2 (en) | 2002-12-10 | 2019-07-02 | Sony Interactive Entertainment America Llc | System and method for managing audio and video channels for video game players and spectators |
US7371175B2 (en) | 2003-01-13 | 2008-05-13 | At&T Corp. | Method and system for enhanced audio communications in an interactive environment |
EP1437880A1 (en) * | 2003-01-13 | 2004-07-14 | AT&T Corp. | Enhanced audio communications in an interactive environment |
DE10329244A1 (en) * | 2003-06-24 | 2005-01-13 | Deutsche Telekom Ag | Communication method e.g. between conference participants, involves conference participants to communicate information with communication over conference management unit |
EP1526489A1 (en) * | 2003-10-20 | 2005-04-27 | Ncr International Inc. | Self-service terminal with avatar user interface |
US8070601B2 (en) | 2004-02-17 | 2011-12-06 | International Business Machines Corporation | SIP based VoIP multiplayer network games |
US7985138B2 (en) | 2004-02-17 | 2011-07-26 | International Business Machines Corporation | SIP based VoIP multiplayer network games |
EP1740279A2 (en) * | 2004-02-17 | 2007-01-10 | International Business Machines Corporation | Sip based voip multiplayer network games |
EP1740279A4 (en) * | 2004-02-17 | 2010-08-18 | Ibm | Sip based voip multiplayer network games |
US7675519B2 (en) * | 2004-08-05 | 2010-03-09 | Elite Avatars, Inc. | Persistent, immersible and extractable avatars |
US8547380B2 (en) | 2004-08-05 | 2013-10-01 | Elite Avatars, Llc | Persistent, immersible and extractable avatars |
US7668515B2 (en) | 2004-10-06 | 2010-02-23 | Comverse Ltd. | Portable telephone for conveying real time walkie-talkie streaming audio-video |
US10471271B1 (en) | 2005-01-21 | 2019-11-12 | Michael Sasha John | Systems and methods of individualized magnetic stimulation therapy |
US8064754B2 (en) | 2005-11-08 | 2011-11-22 | Imerj, Ltd. | Method and communication apparatus for reproducing a moving picture, and use in a videoconference system |
EP1784020A1 (en) * | 2005-11-08 | 2007-05-09 | TCL & Alcatel Mobile Phones Limited | Method and communication apparatus for reproducing a moving picture, and use in a videoconference system |
EP1813330A1 (en) * | 2006-01-27 | 2007-08-01 | DotCity Inc. | System of developing urban landscape by using electronic data |
EP2016562A4 (en) * | 2006-05-07 | 2010-01-06 | Sony Computer Entertainment Inc | Method for providing affective characteristics to computer generated avatar during gameplay |
US8766983B2 (en) | 2006-05-07 | 2014-07-01 | Sony Computer Entertainment Inc. | Methods and systems for processing an interchange of real time effects during video communication |
WO2007130691A2 (en) | 2006-05-07 | 2007-11-15 | Sony Computer Entertainment Inc. | Method for providing affective characteristics to computer generated avatar during gameplay |
EP2016562A2 (en) * | 2006-05-07 | 2009-01-21 | Sony Computer Entertainment Inc. | Method for providing affective characteristics to computer generated avatar during gameplay |
WO2008049237A1 (en) * | 2006-10-26 | 2008-05-02 | Pixman Corporation | Interactive system and method |
US7844424B2 (en) | 2007-03-01 | 2010-11-30 | The Boeing Company | Human behavioral modeling and simulation framework |
US7983996B2 (en) | 2007-03-01 | 2011-07-19 | The Boeing Company | Method and apparatus for human behavior modeling in adaptive training |
EP1976291A1 (en) | 2007-03-02 | 2008-10-01 | Deutsche Telekom AG | Method and video communication system for gesture-based real-time control of an avatar |
EP2163066B1 (en) * | 2007-06-26 | 2019-01-09 | Orange | Method and system for determining a geographical location for meeting between people via a telecommunication environment |
WO2009003536A1 (en) | 2007-06-29 | 2009-01-08 | Sony Ericsson Mobile Communications Ab | Methods and terminals that control avatars during videoconferencing and other communications |
EP2160880A1 (en) * | 2007-06-29 | 2010-03-10 | Sony Ericsson Mobile Communications AB | Methods and terminals that control avatars during videoconferencing and other communications |
WO2009013515A1 (en) * | 2007-07-25 | 2009-01-29 | British Telecommunications Public Limited Company | Message delivery |
EP2019530A1 (en) * | 2007-07-25 | 2009-01-28 | British Telecommunications Public Limited Company | Message delivery |
US9308387B2 (en) | 2007-09-25 | 2016-04-12 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US9272159B2 (en) | 2007-09-25 | 2016-03-01 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US9015057B2 (en) | 2007-09-25 | 2015-04-21 | Neosync, Inc. | Systems and methods for controlling and billing neuro-EEG synchronization therapy |
US8961386B2 (en) | 2007-09-25 | 2015-02-24 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US8888672B2 (en) | 2007-09-25 | 2014-11-18 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US8888673B2 (en) | 2007-09-25 | 2014-11-18 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US11938336B2 (en) | 2007-09-25 | 2024-03-26 | Wave Neuroscience, Inc. | Methods for treating anxiety using neuro-EEG synchronization therapy |
US8475354B2 (en) | 2007-09-25 | 2013-07-02 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US11311741B2 (en) | 2007-09-25 | 2022-04-26 | Wave Neuroscience, Inc. | Systems and methods for anxiety treatment using neuro-EEG synchronization therapy |
US8480554B2 (en) | 2007-09-25 | 2013-07-09 | Neosync, Inc. | Systems and methods for depression treatment using neuro-EEG synchronization therapy |
US7844724B2 (en) | 2007-10-24 | 2010-11-30 | Social Communications Company | Automated real-time data stream switching in a shared virtual area communication environment |
US8621079B2 (en) | 2007-10-24 | 2013-12-31 | Social Communications Company | Automated real-time data stream switching in a shared virtual area communication environment |
US9762641B2 (en) | 2007-10-24 | 2017-09-12 | Sococo, Inc. | Automated real-time data stream switching in a shared virtual area communication environment |
US9483157B2 (en) | 2007-10-24 | 2016-11-01 | Sococo, Inc. | Interfacing with a spatial virtual communication environment |
US7769806B2 (en) | 2007-10-24 | 2010-08-03 | Social Communications Company | Automated real-time data stream switching in a shared virtual area communication environment |
US10284454B2 (en) | 2007-11-30 | 2019-05-07 | Activision Publishing, Inc. | Automatic increasing of capacity of a virtual space in a virtual world |
US8386918B2 (en) | 2007-12-06 | 2013-02-26 | International Business Machines Corporation | Rendering of real world objects and interactions into a virtual universe |
US8379968B2 (en) | 2007-12-10 | 2013-02-19 | International Business Machines Corporation | Conversion of two dimensional image data into three dimensional spatial data for use in a virtual universe |
US8149241B2 (en) | 2007-12-10 | 2012-04-03 | International Business Machines Corporation | Arrangements for controlling activities of an avatar |
US10627983B2 (en) | 2007-12-24 | 2020-04-21 | Activision Publishing, Inc. | Generating data for managing encounters in a virtual world environment |
US8228170B2 (en) | 2008-01-10 | 2012-07-24 | International Business Machines Corporation | Using sensors to identify objects placed on a surface |
US10981069B2 (en) | 2008-03-07 | 2021-04-20 | Activision Publishing, Inc. | Methods and systems for determining the authenticity of copied objects in a virtual environment |
US11957984B2 (en) | 2008-03-07 | 2024-04-16 | Activision Publishing, Inc. | Methods and systems for determining the authenticity of modified objects in a virtual environment |
EP2269389A4 (en) * | 2008-03-18 | 2013-07-24 | Avaya Inc | Realistic audio communications in a three dimensional computer-generated virtual environment |
EP2269389A1 (en) * | 2008-03-18 | 2011-01-05 | Avaya Inc. | Realistic audio communications in a three dimensional computer-generated virtual environment |
US8543930B2 (en) | 2008-08-05 | 2013-09-24 | International Business Machines Corporation | System and method for human identification proof for use in virtual environments |
US8316310B2 (en) | 2008-08-05 | 2012-11-20 | International Business Machines Corporation | System and method for human identification proof for use in virtual environments |
US8926490B2 (en) | 2008-09-24 | 2015-01-06 | Neosync, Inc. | Systems and methods for depression treatment using neuro-EEG synchronization therapy |
US8870737B2 (en) | 2008-09-24 | 2014-10-28 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
WO2011010034A1 (en) * | 2009-07-24 | 2011-01-27 | Alcatel Lucent | Method for communication between at least one transmitter of a media stream and at least one receiver of said media stream in an electronic telecommunication service |
FR2948525A1 (en) * | 2009-07-24 | 2011-01-28 | Alcatel Lucent | METHOD OF COMMUNICATING BETWEEN AT LEAST ONE TRANSMITTER OF A MEDIA STREAM AND AT LEAST ONE RECEIVER OF SAID STREAM IN AN ELECTRONIC TELECOMMUNICATION SERVICE |
US10357660B2 (en) | 2009-08-06 | 2019-07-23 | Neosync, Inc. | Systems and methods for modulating the electrical activity of a brain using neuro-EEG synchronization therapy |
US9713729B2 (en) | 2009-08-06 | 2017-07-25 | Neosync, Inc. | Systems and methods for modulating the electrical activity of a brain using neuro-EEG synchronization therapy |
US8465408B2 (en) | 2009-08-06 | 2013-06-18 | Neosync, Inc. | Systems and methods for modulating the electrical activity of a brain using neuro-EEG synchronization therapy |
US8585568B2 (en) | 2009-11-12 | 2013-11-19 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US9446259B2 (en) | 2009-11-12 | 2016-09-20 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US10065048B2 (en) | 2009-11-12 | 2018-09-04 | Neosync, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US10821293B2 (en) | 2009-11-12 | 2020-11-03 | Wave Neuroscience, Inc. | Systems and methods for neuro-EEG synchronization therapy |
US8365075B2 (en) | 2009-11-19 | 2013-01-29 | International Business Machines Corporation | Recording events in a virtual world |
US10091454B2 (en) | 2009-11-19 | 2018-10-02 | International Business Machines Corporation | Recording events in a virtual world |
US9171286B2 (en) | 2009-11-19 | 2015-10-27 | International Business Machines Corporation | Recording events in a virtual world |
US10376793B2 (en) | 2010-02-18 | 2019-08-13 | Activision Publishing, Inc. | Videogame system and method that enables characters to earn virtual fans by completing secondary objectives |
WO2011123192A1 (en) * | 2010-03-30 | 2011-10-06 | Sony Computer Entertainment Inc. | Method for an augmented reality character to maintain and exhibit awareness of an observer |
US9901828B2 (en) | 2010-03-30 | 2018-02-27 | Sony Interactive Entertainment America Llc | Method for an augmented reality character to maintain and exhibit awareness of an observer |
CN103079661A (en) * | 2010-03-30 | 2013-05-01 | 索尼电脑娱乐美国公司 | Method for an augmented reality character to maintain and exhibit awareness of an observer |
US10421019B2 (en) | 2010-05-12 | 2019-09-24 | Activision Publishing, Inc. | System and method for enabling players to participate in asynchronous, competitive challenges |
CN103298529B (en) * | 2010-11-17 | 2015-12-16 | 斯蒂尔塞瑞斯有限责任公司 | Apparatus and method for managing user inputs in video games |
US10821359B2 (en) | 2010-11-17 | 2020-11-03 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US9744451B2 (en) | 2010-11-17 | 2017-08-29 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US11235236B2 (en) | 2010-11-17 | 2022-02-01 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US9199174B2 (en) | 2010-11-17 | 2015-12-01 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
WO2012066104A3 (en) * | 2010-11-17 | 2012-07-12 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US8939837B2 (en) | 2010-11-17 | 2015-01-27 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US11850506B2 (en) | 2010-11-17 | 2023-12-26 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US8337305B2 (en) | 2010-11-17 | 2012-12-25 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
US10220312B2 (en) | 2010-11-17 | 2019-03-05 | Steelseries Aps | Apparatus and method for managing user inputs in video games |
CN103298529A (en) * | 2010-11-17 | 2013-09-11 | 斯蒂尔塞瑞斯有限责任公司 | Apparatus and method for managing user inputs in video games |
FR2975198A1 (en) * | 2011-05-10 | 2012-11-16 | Peugeot Citroen Automobiles Sa | Virtual reality equipment i.e. immersive virtual reality environment, has light emitting device covering whole or part of real object or individual intended to be integrated into virtual scene for display of images related to virtual scene |
EP2720766B1 (en) * | 2011-06-15 | 2019-03-27 | Sony Interactive Entertainment America LLC | System and method for managing audio and video channels for video game players and spectators |
US9649502B2 (en) | 2011-11-14 | 2017-05-16 | Neosync, Inc. | Devices and methods of low frequency magnetic stimulation therapy |
US9853922B2 (en) | 2012-02-24 | 2017-12-26 | Sococo, Inc. | Virtual area communications |
US10137376B2 (en) | 2012-12-31 | 2018-11-27 | Activision Publishing, Inc. | System and method for creating and streaming augmented game sessions |
US11446582B2 (en) | 2012-12-31 | 2022-09-20 | Activision Publishing, Inc. | System and method for streaming game sessions to third party gaming consoles |
US10905963B2 (en) | 2012-12-31 | 2021-02-02 | Activision Publishing, Inc. | System and method for creating and streaming augmented game sessions |
WO2014181064A1 (en) * | 2013-05-07 | 2014-11-13 | Glowbl | Communication interface and method, computer programme and corresponding recording medium |
FR3005518A1 (en) * | 2013-05-07 | 2014-11-14 | Glowbl | COMMUNICATION INTERFACE AND METHOD, COMPUTER PROGRAM, AND CORRESPONDING RECORDING MEDIUM |
WO2014181045A1 (en) * | 2013-05-07 | 2014-11-13 | Glowbl | Communication interface and method, computer programme and corresponding recording medium |
US10286326B2 (en) | 2014-07-03 | 2019-05-14 | Activision Publishing, Inc. | Soft reservation system and method for multiplayer video games |
US10376792B2 (en) | 2014-07-03 | 2019-08-13 | Activision Publishing, Inc. | Group composition matchmaking system and method for multiplayer video games |
US10857468B2 (en) | 2014-07-03 | 2020-12-08 | Activision Publishing, Inc. | Systems and methods for dynamically weighing match variables to better tune player matches |
US10322351B2 (en) | 2014-07-03 | 2019-06-18 | Activision Publishing, Inc. | Matchmaking system and method for multiplayer video games |
US10588576B2 (en) | 2014-08-15 | 2020-03-17 | Neosync, Inc. | Methods and device for determining a valid intrinsic frequency |
US20160110044A1 (en) * | 2014-10-20 | 2016-04-21 | Microsoft Corporation | Profile-driven avatar sessions |
WO2016064564A1 (en) * | 2014-10-20 | 2016-04-28 | Microsoft Technology Licensing, Llc | Profile-driven avatar sessions |
US11351466B2 (en) | 2014-12-05 | 2022-06-07 | Activision Publishing, Inc. | System and method for customizing a replay of one or more game events in a video game |
US10118099B2 (en) | 2014-12-16 | 2018-11-06 | Activision Publishing, Inc. | System and method for transparently styling non-player characters in a multiplayer video game |
US10668381B2 (en) | 2014-12-16 | 2020-06-02 | Activision Publishing, Inc. | System and method for transparently styling non-player characters in a multiplayer video game |
US11896905B2 (en) | 2015-05-14 | 2024-02-13 | Activision Publishing, Inc. | Methods and systems for continuing to execute a simulation after processing resources go offline |
US11524237B2 (en) | 2015-05-14 | 2022-12-13 | Activision Publishing, Inc. | Systems and methods for distributing the generation of nonplayer characters across networked end user devices for use in simulated NPC gameplay sessions |
US10286314B2 (en) | 2015-05-14 | 2019-05-14 | Activision Publishing, Inc. | System and method for providing continuous gameplay in a multiplayer video game through an unbounded gameplay session |
US11420119B2 (en) | 2015-05-14 | 2022-08-23 | Activision Publishing, Inc. | Systems and methods for initiating conversion between bounded gameplay sessions and unbounded gameplay sessions |
US10315113B2 (en) | 2015-05-14 | 2019-06-11 | Activision Publishing, Inc. | System and method for simulating gameplay of nonplayer characters distributed across networked end user devices |
US10213682B2 (en) | 2015-06-15 | 2019-02-26 | Activision Publishing, Inc. | System and method for uniquely identifying physical trading cards and incorporating trading card game items in a video game |
US10668367B2 (en) | 2015-06-15 | 2020-06-02 | Activision Publishing, Inc. | System and method for uniquely identifying physical trading cards and incorporating trading card game items in a video game |
US10835818B2 (en) | 2015-07-24 | 2020-11-17 | Activision Publishing, Inc. | Systems and methods for customizing weapons and sharing customized weapons via social networks |
US10471348B2 (en) | 2015-07-24 | 2019-11-12 | Activision Publishing, Inc. | System and method for creating and sharing customized video game weapon configurations in multiplayer video games via one or more social networks |
US10099140B2 (en) | 2015-10-08 | 2018-10-16 | Activision Publishing, Inc. | System and method for generating personalized messaging campaigns for video game players |
US11185784B2 (en) | 2015-10-08 | 2021-11-30 | Activision Publishing, Inc. | System and method for generating personalized messaging campaigns for video game players |
US10898813B2 (en) | 2015-10-21 | 2021-01-26 | Activision Publishing, Inc. | Methods and systems for generating and providing virtual objects and/or playable recreations of gameplay |
US11679333B2 (en) | 2015-10-21 | 2023-06-20 | Activision Publishing, Inc. | Methods and systems for generating a video game stream based on an obtained game log |
US10245509B2 (en) | 2015-10-21 | 2019-04-02 | Activision Publishing, Inc. | System and method of inferring user interest in different aspects of video game streams |
US10232272B2 (en) | 2015-10-21 | 2019-03-19 | Activision Publishing, Inc. | System and method for replaying video game streams |
US10376781B2 (en) | 2015-10-21 | 2019-08-13 | Activision Publishing, Inc. | System and method of generating and distributing video game streams |
US11310346B2 (en) | 2015-10-21 | 2022-04-19 | Activision Publishing, Inc. | System and method of generating and distributing video game streams |
US10930044B2 (en) | 2015-11-06 | 2021-02-23 | Mursion, Inc. | Control system for virtual characters |
US10489957B2 (en) | 2015-11-06 | 2019-11-26 | Mursion, Inc. | Control system for virtual characters |
EP3371778A4 (en) * | 2015-11-06 | 2019-06-26 | Mursion, Inc. | Control system for virtual characters |
US10226703B2 (en) | 2016-04-01 | 2019-03-12 | Activision Publishing, Inc. | System and method of generating and providing interactive annotation items based on triggering events in a video game |
US10300390B2 (en) | 2016-04-01 | 2019-05-28 | Activision Publishing, Inc. | System and method of automatically annotating gameplay of a video game based on triggering events |
US11439909B2 (en) | 2016-04-01 | 2022-09-13 | Activision Publishing, Inc. | Systems and methods of generating and sharing social messages based on triggering events in a video game |
WO2017205226A1 (en) * | 2016-05-27 | 2017-11-30 | Microsoft Technology Licensing, Llc | Communication visualisation |
US10179289B2 (en) | 2016-06-21 | 2019-01-15 | Activision Publishing, Inc. | System and method for reading graphically-encoded identifiers from physical trading cards through image-based template matching |
US10586380B2 (en) | 2016-07-29 | 2020-03-10 | Activision Publishing, Inc. | Systems and methods for automating the animation of blendshape rigs |
US11189084B2 (en) | 2016-07-29 | 2021-11-30 | Activision Publishing, Inc. | Systems and methods for executing improved iterative optimization processes to personify blendshape rigs |
US10572005B2 (en) | 2016-07-29 | 2020-02-25 | Microsoft Technology Licensing, Llc | Private communication with gazing |
US10573065B2 (en) | 2016-07-29 | 2020-02-25 | Activision Publishing, Inc. | Systems and methods for automating the personalization of blendshape rigs based on performance capture data |
WO2018022392A1 (en) * | 2016-07-29 | 2018-02-01 | Microsoft Technology Licensing, Llc | Private communication by gazing at avatar |
CN107705341A (en) * | 2016-08-08 | 2018-02-16 | 创奇思科研有限公司 | Method and device for generating user expression avatars |
US10500498B2 (en) | 2016-11-29 | 2019-12-10 | Activision Publishing, Inc. | System and method for optimizing virtual games |
US10987588B2 (en) | 2016-11-29 | 2021-04-27 | Activision Publishing, Inc. | System and method for optimizing virtual games |
US11423556B2 (en) | 2016-12-06 | 2022-08-23 | Activision Publishing, Inc. | Methods and systems to modify two dimensional facial images in a video to generate, in real-time, facial images that appear three dimensional |
US10650539B2 (en) | 2016-12-06 | 2020-05-12 | Activision Publishing, Inc. | Methods and systems to modify a two dimensional facial image to increase dimensional depth and generate a facial image that appears three dimensional |
US10991110B2 (en) | 2016-12-06 | 2021-04-27 | Activision Publishing, Inc. | Methods and systems to modify a two dimensional facial image to increase dimensional depth and generate a facial image that appears three dimensional |
US9962555B1 (en) | 2017-01-17 | 2018-05-08 | Neosync, Inc. | Head-mountable adjustable devices for generating magnetic fields |
US10835754B2 (en) | 2017-01-17 | 2020-11-17 | Wave Neuroscience, Inc. | Head-mountable adjustable devices for generating magnetic fields |
US11861059B2 (en) | 2017-05-23 | 2024-01-02 | Mindshow Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US11599188B2 (en) | 2017-05-23 | 2023-03-07 | Mindshow Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US10228760B1 (en) | 2017-05-23 | 2019-03-12 | Visionary Vr, Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US10969860B2 (en) | 2017-05-23 | 2021-04-06 | Mindshow Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US10664045B2 (en) | 2017-05-23 | 2020-05-26 | Visionary Vr, Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US11231773B2 (en) | 2017-05-23 | 2022-01-25 | Mindshow Inc. | System and method for generating a virtual reality scene based on individual asynchronous motion capture recordings |
US10561945B2 (en) | 2017-09-27 | 2020-02-18 | Activision Publishing, Inc. | Methods and systems for incentivizing team cooperation in multiplayer gaming environments |
US10974150B2 (en) | 2017-09-27 | 2021-04-13 | Activision Publishing, Inc. | Methods and systems for improved content customization in multiplayer gaming environments |
US11040286B2 (en) | 2017-09-27 | 2021-06-22 | Activision Publishing, Inc. | Methods and systems for improved content generation in multiplayer gaming environments |
CN107944907A (en) * | 2017-11-16 | 2018-04-20 | 琦境科技(北京)有限公司 | Method and system for virtual reality exhibition room interaction |
US10864443B2 (en) | 2017-12-22 | 2020-12-15 | Activision Publishing, Inc. | Video game content aggregation, normalization, and publication systems and methods |
US10765948B2 (en) | 2017-12-22 | 2020-09-08 | Activision Publishing, Inc. | Video game content aggregation, normalization, and publication systems and methods |
US11413536B2 (en) | 2017-12-22 | 2022-08-16 | Activision Publishing, Inc. | Systems and methods for managing virtual items across multiple video game environments |
WO2019143780A1 (en) * | 2018-01-19 | 2019-07-25 | ESB Labs, Inc. | Virtual interactive audience interface |
US11218783B2 (en) | 2018-01-19 | 2022-01-04 | ESB Labs, Inc. | Virtual interactive audience interface |
US11134308B2 (en) | 2018-08-06 | 2021-09-28 | Sony Corporation | Adapting interactions with a television user |
US11960641B2 (en) | 2018-09-28 | 2024-04-16 | Apple Inc. | Application placement based on head position |
US11679330B2 (en) | 2018-12-18 | 2023-06-20 | Activision Publishing, Inc. | Systems and methods for generating improved non-player characters |
US11972086B2 (en) | 2019-03-18 | 2024-04-30 | Activision Publishing, Inc. | Automatic increasing of capacity of a virtual space in a virtual world |
EP3951604A4 (en) * | 2019-04-01 | 2022-06-01 | Sumitomo Electric Industries, Ltd. | Communication assistance system, communication assistance method, communication assistance program, and image control program |
EP3734966A1 (en) * | 2019-05-03 | 2020-11-04 | Nokia Technologies Oy | An apparatus and associated methods for presentation of audio |
US11151999B2 (en) | 2019-08-01 | 2021-10-19 | International Business Machines Corporation | Controlling external behavior of cognitive systems |
US11097193B2 (en) | 2019-09-11 | 2021-08-24 | Activision Publishing, Inc. | Methods and systems for increasing player engagement in multiplayer gaming environments |
US11893964B2 (en) | 2019-09-26 | 2024-02-06 | Apple Inc. | Controlling displays |
CN113661691A (en) * | 2019-09-27 | 2021-11-16 | 苹果公司 | Environment for remote communication |
US11800059B2 (en) | 2019-09-27 | 2023-10-24 | Apple Inc. | Environment for remote communication |
CN113661691B (en) * | 2019-09-27 | 2023-08-08 | 苹果公司 | Electronic device, storage medium, and method for providing an augmented reality environment |
CN110635969B (en) * | 2019-09-30 | 2022-09-13 | 浪潮软件股份有限公司 | High concurrency test method for streaming media direct memory system |
CN110635969A (en) * | 2019-09-30 | 2019-12-31 | 浪潮软件集团有限公司 | High concurrency test method for streaming media direct memory system |
US11712627B2 (en) | 2019-11-08 | 2023-08-01 | Activision Publishing, Inc. | System and method for providing conditional access to virtual gaming items |
US20210358188A1 (en) * | 2020-05-13 | 2021-11-18 | Nvidia Corporation | Conversational ai platform with rendered graphical output |
US11351459B2 (en) | 2020-08-18 | 2022-06-07 | Activision Publishing, Inc. | Multiplayer video games with virtual characters having dynamically generated attribute profiles unconstrained by predefined discrete values |
US11524234B2 (en) | 2020-08-18 | 2022-12-13 | Activision Publishing, Inc. | Multiplayer video games with virtual characters having dynamically modified fields of view |
CN113189612A (en) * | 2021-05-17 | 2021-07-30 | 长安大学 | Gravel seal quality detection device based on depth camera |
DE102021120330A1 (en) | 2021-08-04 | 2023-02-09 | MyArtist& Me GmbH | Real time audio video streaming system |
US20230073828A1 (en) * | 2021-09-07 | 2023-03-09 | Ringcentral, Inc. | System and method for identifying active communicator |
US11876842B2 (en) * | 2021-09-07 | 2024-01-16 | Ringcentral, Inc. | System and method for identifying active communicator |
WO2023156984A1 (en) * | 2022-02-21 | 2023-08-24 | TMRW Foundation IP SARL | Movable virtual camera for improved meeting views in 3D virtual environment |
EP4254943A1 (en) * | 2022-03-30 | 2023-10-04 | TMRW Foundation IP SARL | Head-tracking based media selection for video communications in virtual environments |
US11783526B1 (en) | 2022-04-11 | 2023-10-10 | Mindshow Inc. | Systems and methods to generate and utilize content styles for animation |
US11527032B1 (en) | 2022-04-11 | 2022-12-13 | Mindshow Inc. | Systems and methods to generate and utilize content styles for animation |
EP4286995A1 (en) * | 2022-05-31 | 2023-12-06 | TMRW Foundation IP SARL | Method, system and computer program product for providing navigation assistance in three-dimensional virtual environments |
US11532179B1 (en) | 2022-06-03 | 2022-12-20 | Prof Jim Inc. | Systems for and methods of creating a library of facial expressions |
US11922726B2 (en) | 2022-06-03 | 2024-03-05 | Prof Jim Inc. | Systems for and methods of creating a library of facial expressions |
US11790697B1 (en) | 2022-06-03 | 2023-10-17 | Prof Jim Inc. | Systems for and methods of creating a library of facial expressions |
US11956571B2 (en) * | 2022-07-28 | 2024-04-09 | Katmai Tech Inc. | Scene freezing and unfreezing |
Also Published As
Publication number | Publication date |
---|---|
AU2003201032A1 (en) | 2003-07-24 |
WO2003058518A3 (en) | 2004-05-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2003058518A2 (en) | Method and apparatus for an avatar user interface system | |
CA2529603C (en) | Intelligent collaborative media | |
JPH07255044A (en) | Animated electronic conference room and video conference system and method | |
KR20030039019A (en) | Medium storing a Computer Program with a Function of Lip-sync and Emotional Expression on 3D Scanned Real Facial Image during Realtime Text to Speech Conversion, and Online Game, Email, Chatting, Broadcasting and Foreign Language Learning Method using the Same | |
Nakanishi | FreeWalk: a social interaction platform for group behaviour in a virtual space | |
Behrendt | Mobile sound: media art in hybrid spaces | |
Chen | Conveying conversational cues through video | |
JP2012042503A (en) | Interactive video system | |
Ursu et al. | Orchestration: Tv-like mixing grammars applied to video-communication for social groups | |
Agamanolis et al. | Reflection of Presence: Toward more natural and responsive telecollaboration | |
JP2005055846A (en) | Remote educational communication system | |
Farouk et al. | Using HoloLens for remote collaboration in extended data visualization | |
Dean et al. | Refining personal and social presence in virtual meetings | |
Barlow et al. | Smart videoconferencing: new habits for virtual meetings | |
Lindström et al. | Affect, attitude and evaluation of multisensory performances | |
Harris | Liveness: An interactional account | |
Hauber | Understanding remote collaboration in video collaborative virtual environments | |
Vasilakos et al. | Interactive theatre via mixed reality and Ambient Intelligence | |
Lewis et al. | Whither video?—pictorial culture and telepresence | |
Lertrusdachakul et al. | Transparent eye contact and gesture videoconference |
Kies et al. | Desktop video conferencing: A systems approach | |
US10469803B2 (en) | System and method for producing three-dimensional images from a live video production that appear to project forward of or vertically above an electronic display | |
Andersen et al. | Viewing, listening and reading along: Linguistic and multimodal constructions of viewer participation in the net series SKAM | |
Tseng et al. | Immersive Whiteboards In a Networked Collaborative Environment. | |
El-Shimy | Exploring user-driven techniques for the design of new musical interfaces through the responsive environment for distributed performance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 165802 Country of ref document: IL |
|
ENP | Entry into the national phase |
Ref document number: 2006058572 Country of ref document: US Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10522033 Country of ref document: US |
|
122 | Ep: pct application non-entry in european phase | ||
WWP | Wipo information: published in national office |
Ref document number: 10522033 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: JP |
|
WWW | Wipo information: withdrawn in national office |
Country of ref document: JP |