WO1999019788A1 - Method and apparatus for real-time gesture recognition - Google Patents

Method and apparatus for real-time gesture recognition

Info

Publication number
WO1999019788A1
Authority
WO
WIPO (PCT)
Prior art keywords
gesture
subject
data
frame
recited
Prior art date
Application number
PCT/US1998/021718
Other languages
French (fr)
Inventor
Katerina H. Nguyen
Original Assignee
Electric Planet, Inc.
Priority date
Filing date
Publication date
Application filed by Electric Planet, Inc.
Priority to AU10867/99A
Publication of WO1999019788A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304 Detection arrangements using opto-electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Definitions

  • The present invention relates generally to methods and apparatus for computer-implemented real-time gesture recognition. More particularly, the present invention relates to capturing a sequence of images of a moving subject performing a particular movement or gesture, extracting relevant data points from these images, and comparing the resulting sequence of data points to patterns of data points for known gestures to determine if there is a match.
  • An emerging and increasingly important procedure in the field of computer science is gesture recognition. In order to make gesture recognition systems commercially useful and widespread, they must recognize known gestures in real time and must do so with minimal or reduced use of the CPU. From a process perspective, a gesture is defined as a time-dependent trajectory following a prescribed pattern through a feature space, e.g., a bodily movement or handwriting.
  • Prior art methods for gesture recognition typically use neural networks or Hidden Markov Models (HMMs), with HMMs being the most prevalent choice.
  • a Hidden Markov Model is a model made up of interconnected nodes or states. Each state contains information concerning itself and its relation to other states in the model. More specifically, each state contains (1) the probability of producing a particular observable output and (2) the probabilities of going from that state to any other state in the model. Since only the output is observed a system based on HMM's does not know which state it is in at any given time; it only knows what the probabilities are that a particular model produces the outputs seen thus far. Knowledge of the state is hidden from the system or application. Examples of gesture recognition systems based on Hidden Markov Models include a tennis stroke recognition system, an American sign language recognition system, a system for recognizing lip movements, and systems for recognizing handwriting.
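  • As an illustration of the statement that an HMM-based system only knows the probability that a particular model produced the outputs seen so far, the following minimal sketch computes that probability with the standard forward algorithm. The two-state model, its probability values, and the function name are illustrative assumptions and are not taken from the present disclosure.

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations | model) using the forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i produces output symbol o
    """
    n = len(start)
    # alpha[i] = probability of the outputs seen so far, ending in state i
    alpha = [start[i] * emit[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs]
            for j in range(n)
        ]
    # The state itself stays hidden: only the summed probability is reported.
    return sum(alpha)


# Hypothetical two-state model with two output symbols (0 and 1).
start = [1.0, 0.0]
trans = [[0.5, 0.5],
         [0.0, 1.0]]
emit = [[0.9, 0.1],
        [0.2, 0.8]]
likelihood = forward_likelihood(start, trans, emit, [0, 1])
```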
  • HMMs can capture the variance in the way different people perform gestures at different times. However, the same statistical nature makes an HMM a "black box." For example, one state in the model may represent one particular point in a bodily gesture. An HMM-based application may know many things about this point, such as the probabilities that the gesturer will change position or move in other directions. However, the application will not be able to determine precisely when it has reached that point. Thus, the application is not able to determine whether the person has completed 25% or 50% of a known gesture.
  • the present invention provides a system for recognizing gestures made by a subject within a sequence of images and performing an operation based on the semantic meaning of the gesture.
  • a subject such as a human being, enters the viewing field of a camera connected to a computer and performs a gesture.
  • the gesture is then examined by the system one image frame at a time.
  • Positional data is derived from the input frame and compared to previously derived data representing gestures known to the system. The comparisons are done in real time and the system can be trained to better recognize known gestures or to recognize new gestures.
  • a computer-implemented gesture recognition system is described.
  • a background image model is created by examining frames of an average background image before the subject that will perform the gesture enters the image.
  • a frame of the input image containing the subject is obtained after the background image model has been created.
  • the frame captures the person in the action of performing the gesture at one moment in time.
  • the input frame is used to derive a frame data set that contains particular coordinates of the subject at that given moment.
  • A sequence of frame data sets taken over a period of time is compared to sequences of positional data making up one or more recognizable gestures, i.e., gestures already known to the system. If the gesture performed by the subject is recognizable to the system, an operation based on the semantic meaning of the gesture may be performed by the system.
  • The gesture recognition procedure includes a routine setting its confidence level according to the degree of mismatch between the input gesture data and the patterns of positional data making up the system's recognizable gestures. If the confidence passes a threshold, a match is considered found.
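  • The relationship between mismatch, confidence, and threshold can be sketched as follows. The disclosure does not state how mismatch is measured or how confidence is derived from it, so the mean absolute difference, the formula 1 / (1 + mismatch), and the threshold of 0.8 are all assumptions chosen only for illustration.

```python
def mismatch(observed, expected):
    """Mean absolute difference between two equal-length coordinate sequences."""
    if len(observed) != len(expected):
        raise ValueError("sequences must have the same length")
    return sum(abs(o - e) for o, e in zip(observed, expected)) / len(expected)


def confidence(observed, expected):
    """Map mismatch into (0, 1]; 1.0 means a perfect match (assumed formula)."""
    return 1.0 / (1.0 + mismatch(observed, expected))


def is_match(observed, expected, threshold=0.8):
    """Declare a match once the confidence reaches the (assumed) threshold."""
    return confidence(observed, expected) >= threshold


# Hypothetical stored pattern and two inputs, one close and one far off.
pattern = [0.0, 1.0, 0.0]
close_input = [0.0, 0.9, 0.1]
far_input = [0.0, 0.0, 0.0]
```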
  • the gesture recognition procedure includes a partial completion query routine that updates a status report which provides information on how many of the requirements of the known gestures have been met by the input gesture. This allows queries of how much or what percentage of a known gesture is completed by probing the status report. This is done by determining how many key points of a recognizable gesture have been met.
  • the gesture recognition procedure includes a routine for training the system to recognize new gestures or to recognize certain gestures performed by an individual more efficiently.
  • Several samples of the subject, i.e., individual, performing the new gesture are used by the system to extract the number of key points, the dimensions, and other relevant characteristics of the gesture.
  • a probability distribution for each key point indicating the likelihood of producing a particular observable output at that key point is also derived.
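  • One simple way to picture the per-key-point distribution derived from training samples is a mean and standard deviation for each key point. The sketch below assumes the key points have already been located in each sample and that a Gaussian-style summary is adequate; neither assumption comes from the disclosure, and the sample values are invented.

```python
from statistics import mean, pstdev


def key_point_statistics(samples):
    """Estimate a (mean, standard deviation) pair for each key point.

    samples: list of training samples; each sample lists the value observed
             at every key point, in order (all samples the same length).
    """
    if not samples:
        raise ValueError("at least one training sample is required")
    n_points = len(samples[0])
    if any(len(s) != n_points for s in samples):
        raise ValueError("all samples must have the same number of key points")
    # zip(*samples) groups the values seen at key point 0, key point 1, ...
    return [(mean(vals), pstdev(vals)) for vals in zip(*samples)]


# Three hypothetical samples of a normalized arm height at three key points.
samples = [[0.0, 1.0, 0.0],
           [0.2, 0.8, 0.2],
           [0.1, 0.9, 0.1]]
stats = key_point_statistics(samples)
```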
  • Once a characteristic data pattern is obtained for the new gesture it can be compared to patterns of previously stored known gestures to produce a confusion matrix.
  • the confusion matrix describes possible similarities between the new gesture and known gestures as well as the likelihood that the system will confuse these similar gestures.
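  • A confusion matrix of this kind can be approximated by scoring every pair of stored gesture patterns for similarity. In the sketch below the similarity score stands in for the likelihood of confusion; the distance measure, the score formula, and the gesture patterns are assumptions, and the patterns are assumed to have equal length.

```python
def pattern_distance(a, b):
    """Mean absolute difference between two equal-length key-point patterns."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def confusion_matrix(patterns):
    """Return {name: {other: similarity}} with similarity in (0, 1]."""
    return {
        name: {
            other: 1.0 / (1.0 + pattern_distance(p, q))
            for other, q in patterns.items()
        }
        for name, p in patterns.items()
    }


# Hypothetical one-dimensional patterns (normalized height at each key point).
patterns = {
    "jump":  [0.0, 1.0, 0.0],
    "hop":   [0.0, 0.8, 0.0],
    "squat": [0.0, -1.0, 0.0],
}
matrix = confusion_matrix(patterns)
# The known gesture most likely to be confused with "jump".
most_confusable = max(
    (other for other in patterns if other != "jump"),
    key=lambda other: matrix["jump"][other],
)
```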
  • the gesture recognition procedure visually displays the subject performing the gesture and any resulting transformations or augmentations to the subject on a computer monitor through model-based compositing.
  • a compositing method includes shadow reduction and hole and gap filling routines for isolating the subject being composited.
  • a computer-based system for extracting data to be used to recognize gestures made by a subject.
  • an image modeler for creating a background model that does not contain the subject is used to create an initial background model.
  • the system includes a frame capturer for obtaining an image frame and a frame analyzer for analyzing the image thereby determining particular coordinates of the subject at a particular time.
  • a data set creator for creating a frame data set from the particular coordinates and a data set analyzer for examining the coordinates in the frame data set and comparing them to positional data representing a known gesture.
  • Gestures are recognized and processed immediately in a computer system that can also be trained to recognize new gestures or to recognize certain known gestures more efficiently.
  • the subject is composited onto a destination image without distorting effects from shadows cast by the subject or from color uniformity between the subject and the background. This provides for a clean, well-defined composited subject on a display monitor which can be processed by the computer system according to the semantic meaning of the recognized or known gesture.
  • Figure 1 is a schematic illustration of a general purpose computer system suitable for implementing the present invention.
  • Figure 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and with the image composited onto a computer monitor through the use of a camera.
  • Figure 3 shows a series of screen shots showing a human figure performing a gesture, an arm flap, and the resulting function performed by the system of transforming the human figure to an image of a flying bird.
  • Figure 4 shows another series of screen shots showing a human figure performing another recognizable gesture, jumping, and the system augmenting the human figure once the gesture is recognized.
  • Figure 5a is a flowchart showing a process for a preferred embodiment for gesture recognition of the present invention.
  • Figure 5b shows data stored in a frame data set as derived from a data or image frame containing a subject performing a gesture as described in block 502 of Figure 5a.
  • Figure 6 is a flowchart showing in greater detail block 504 of Figure 5a in which the system runs the gesture recognition process.
  • Figures 7a and 7b are flowcharts showing in greater detail block 600 of Figure 6 in which the system processes the frame data to determine whether it matches a recognized gesture.
  • Figures 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture.
  • the present invention employs various processes involving data stored in computer systems. These processes are those requiring physical manipulation of physical quantities.
  • these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It is sometimes convenient, principally for reasons of common usage, to refer to these signals as bits, values, elements, variables, characters, data structures, or the like. It should be remembered, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Further, the manipulations performed are often referred to in terms such as identifying, running, comparing, or detecting. In any of the operations described herein that form part of the present invention, these operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or other similar devices.
  • the present invention relates to method blocks for operating a computer in processing electrical or other physical signals to generate other desired physical signals.
  • the present invention also relates to a computer system for performing these operations.
  • This computer system may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • the processes presented herein are not inherently related to any particular computer or other computing apparatus.
  • various general purpose computing machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized computer apparatus to perform the required method blocks.
  • FIG. 1 is a schematic illustration of a general purpose computer system suitable for implementing the process of the present invention.
  • the computer system includes a central processing unit (CPU) 102, which CPU is coupled bi-directionally with random access memory (RAM) 104 and unidirectionally with read only memory (ROM) 106.
  • RAM 104 includes programming instructions and data, including text objects as described herein in addition to other data and instructions for processes currently operating on CPU 102.
  • ROM 106 typically includes basic operating instructions, data and objects used by the computer to perform its functions.
  • a mass storage device 108 such as a hard disk, CD ROM, magneto-optical (floptical) drive, tape drive or the like, is coupled bi-directionally with CPU 102.
  • Mass storage device 108 generally includes additional programming instructions, data and text objects that typically are not in active use by the CPU, although the address space may be accessed by the CPU, e.g., for virtual memory or the like.
  • Each of the above described computers further includes an input/output source 110 that typically includes input media such as a keyboard, pointer devices (e.g., a mouse or stylus) and the like.
  • Each computer can also include a network connection 112 over which data, including, e.g., text objects, and instructions can be transferred.
  • Additional mass storage devices may also be connected to CPU 102 through network connection 112. It will be appreciated by those skilled in the art that the above described hardware and software elements are of standard design and construction.
  • Hidden Markov Models are typically used in current gesture recognition systems to account for variance in possible movements in a gesture.
  • the present invention uses the HMM construct and removes the hidden nature of the model by allowing the application to determine which state in the model it is in.
  • the present invention also forces the application to move in a certain direction by removing all the connections from a particular state to the other states except for one. For example, at state one in a Hidden Markov Model, an application may be able to go to states two, three, or four. State one would have the probabilities that from it, the gesture would go to any one of those states. In a preferred embodiment of the present invention, the connections to states three and four are removed, thus forcing the application or system to go to state two or to stay in state one.
  • HMM construct also allows for this case, which is generally known as the left-to-right HMM.
  • state one will have two probabilities: one indicating the probability that it will stay in state one and another that it will go to state two.
  • the application will stay in state one until it meets the criteria, such as reaching a local extremum, for moving to state two.
  • a timing constraint built into the application. This timing constraint applies to individual states in the model. For example, a state may have a timing constraint such that the person cannot stay in a particular pose or position in the gesture for more than a predetermined length of time.
  • the system can determine at any time how much of a particular gesture has been completed since the system knows what state the gesture is in.
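  • The left-to-right model with an observable state and a per-state timing constraint can be sketched as a small state machine. The class name, the advance criteria, the frame-count time limit, and the choice to reset to the first state on a timing violation are assumptions made for illustration only.

```python
class LeftToRightGesture:
    """Observable left-to-right state machine with a per-state time limit."""

    def __init__(self, advance_tests, max_frames_per_state):
        # advance_tests[k](value) is True when the input satisfies the
        # criterion for moving from state k to state k + 1.
        self.advance_tests = advance_tests
        self.max_frames = max_frames_per_state
        self.reset()

    def reset(self):
        self.state = 0            # index of the current state (not hidden)
        self.frames_in_state = 0

    def step(self, value):
        """Feed one observation; return True once the final state is reached."""
        if self.state == len(self.advance_tests):
            return True  # gesture already complete
        if self.advance_tests[self.state](value):
            self.state += 1       # the only allowed transition is forward
            self.frames_in_state = 0
        else:
            self.frames_in_state += 1
            if self.frames_in_state > self.max_frames:
                self.reset()      # timing constraint violated
        return self.state == len(self.advance_tests)

    def fraction_complete(self):
        """How much of the gesture has been completed so far."""
        return self.state / len(self.advance_tests)


# Hypothetical arm flap: arm high, then low, then high again.
flap = LeftToRightGesture(
    advance_tests=[lambda y: y > 0.8, lambda y: y < 0.2, lambda y: y > 0.8],
    max_frames_per_state=5,
)
results = [flap.step(y) for y in [0.5, 0.9, 0.5, 0.1, 0.5, 0.9]]
```

  • Because the current state is an ordinary attribute, a caller can read fraction_complete() at any frame, which is the property a conventional HMM does not offer.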
  • a training interface is included which requires a small degree of human intervention.
  • a person can "teach" the system new gestures for it to recognize by performing samples of the new gesture in front of a camera. The user can then enter certain information about the new gesture allowing the system to create a model of the new gesture to store in its library.
  • Figure 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and having the image composited on a computer monitor through the use of a camera. It shows a computer 206 connected to a camera 200. In other preferred embodiments, the camera can be located further away from the computer. Camera 200 has within its range or field of vision, a person 202 with her arms extended, as if in the middle of an arm flap gesture.
  • the image of person 202 performing the gesture is composited onto a destination image 208 which is displayed on a computer monitor as shown in Figure 2.
  • Once the system recognizes that the person is performing this gesture, it will perform an operation associated with that gesture. Examples of this are shown in Figures 3 and 4 below.
  • the person's image does not need to be composited onto a destination image or displayed on the computer monitor. The system can simply recognize the gesture and perform an operation, without having to composite the image of the person.
  • Although the person may be located in a room with background items that are static, such as furniture, or non-static, such as a television screen or open window showing moving objects, such items are not composited onto a destination image; only the human figure is composited.
  • Figure 3 shows a series of screen shots showing a human figure performing a gesture, in this case an arm flap, and the resulting function performed by the system, i.e., transforming the human figure to an image of a flying bird.
  • the human figure can perform other types of gestures and be transformed to another figure or be augmented, as shown in Figure 4 below.
  • the person is initially flapping her arms up and down at a rate acceptable to the system. This rate can vary in various embodiments but is generally dependent on factors such as camera frame speed or CPU clock speed.
  • In shots 302 and 304, the person is moving her arms up and down in full range and is performing the complete gesture of arm flapping.
  • the system transforms the person to a bird as shown at shot 306.
  • Transforming the human figure to a bird is one example of a function or operation the computer can perform once it recognizes the arm flapping gesture. More generally, once recognized the computer can perform any type of function that the computer was programmed to perform upon recognition, such as, changing applications or turning the computer on or off. Performing the recognized gesture is essentially the same as pressing a key on the keyboard or clicking a button on a mouse.
  • Figure 4 shows another example of a preferred embodiment where the human figure performing a recognizable gesture — in this case jumping up and down — is augmented with a new hat by the system once it recognizes the gesture.
  • the figure or subject is not transformed as in Figure 3, but rather is augmented (i.e., a less significant change to the figure) by having an object, the hat, added to it.
  • the figure is standing still.
  • In shots 402 and 404, the figure is shown jumping straight up and down at a rate acceptable to the system, as described above.
  • the computer performs the function of augmenting the figure by placing a hat on the figure's head as shown at 406.
  • FIG. 5a is a flow diagram showing a process for a preferred embodiment of object gesture recognition of the present invention.
  • the system creates or digitally builds a background model by capturing several frames of a background image.
  • the background image is essentially the setting the system is being used in, for example, a child's playroom, an office, or a living room. It is the setting in which the subject, e.g. a person, will enter and, possibly, perform a gesture.
  • a preferred embodiment of creating a background model is described in an application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on October 15, 1997.
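  • The preferred way of building the background model is defined in the referenced application and is not reproduced here. As a simplified stand-in, the sketch below averages several background frames pixel by pixel and then flags pixels of a later frame that differ from that average. The grayscale list-of-rows representation, the function names, and the threshold of 30 are assumptions.

```python
def build_background_model(frames):
    """Per-pixel mean of several grayscale frames (each a list of rows)."""
    if not frames:
        raise ValueError("at least one background frame is required")
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / len(frames) for c in range(cols)]
        for r in range(rows)
    ]


def foreground_mask(frame, model, threshold=30):
    """True where the frame differs from the model by more than threshold."""
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, model)
    ]


# Two hypothetical 2x3 background frames, then one frame containing a subject.
background = [
    [[10, 10, 10], [10, 10, 10]],
    [[12, 12, 12], [12, 12, 12]],
]
model = build_background_model(background)
mask = foreground_mask([[11, 200, 11], [11, 210, 11]], model)
```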
  • the system preprocesses an image frame within which the subject is performing a particular gesture in block 502.
  • this preprocessing involves compositing the object onto a destination image and displaying the destination image on a computer monitor, as described with respect to Figure 2 above.
  • the compositing process can involve sub-processes for reducing the effect of shadows and filling holes and gaps in the object once composited.
  • the destination image can be an image very different from the background image, such as an outdoor scene, outer space, or other type of imaginary scene. This gives the effect of the person performing a gesture, and being augmented or transformed, in an unusual environment or setting.
  • a preferred embodiment of the compositing process is described in detail in co-pending application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on October 15, 1997.
  • the system analyzes the person's gesture by performing a gesture recognition process using as data a sequence of image frames captured in block 502.
  • a preferred embodiment of the gesture recognition process is described in greater detail with respect to Figure 6.
  • the gesture recognition process is performed using a gesture database as shown in block 506.
  • Gesture database 506 contains data arrays representing gestures known to the system and other information such as status reports, described in greater detail below.
  • the gesture recognition process deconstructs and analyzes the gesture or gestures being made by the person.
  • the system determines whether the gesture performed by the person is actually a recognized or known gesture.
  • the system has a set of recognizable gestures to which the gesture being performed by the person is compared.
  • the data representing the recognizable gestures is stored in data arrays, described in greater detail with respect to Figure 6 below. If the gesture performed by the person is a recognizable gesture, the system proceeds to block 510.
  • the system performs a particular function or operation based on the semantic meaning of the recognized gesture. As described above this meaning can translate to transforming the person to another figure, like a bird, or augmenting the person, for example, by adding a hat.
  • Once the system recognizes a gesture and performs an operation based on the gesture, the system returns to block 502 and continues analyzing image frames of the person performing further gestures. That is, even though the person has performed a gesture recognizable to the system and the system has carried out an operation based on the gesture, the processing continues as long as the image frames are being sent to the system. The system will continue processing movements by the person to see if they match any of its recognizable gestures. However, if the gesture performed by the person is not recognized by the system, control also returns to block 502 where the system captures and preprocesses the next frame of the person continuing performance of a gesture (i.e., the person's continuing movements in front of the camera).
  • Figure 5b shows data stored in a frame data set as derived from an image frame containing the person performing a gesture as described in block 502 of Figure 5a.
  • the frame data set shown in Figure 5b contains x and y coordinate values of certain portions of a person performing a gesture.
  • these portions can include: a left extremity, a right extremity, a center of mass, width, top of head, and center of head.
  • the left and right extremities can be the end of a person's right and left arms and the width can be the person's shoulder span.
  • the coordinates can be of other significant or relevant portions depending on the subject performing the movements and the type of movement.
  • the frame data set contains information on the positions (via x and y coordinates) of significant or meaningful portions of the subject's "body". What is significant or meaningful can depend on the nature and range of gestures expected to be performed by the object or that are recognized or known to the system. For example, the left and right extremities of a person are significant because one of the recognizable gestures is flapping of the arms which is determined by the movement of the ends of the person's arms.
  • each image or data frame captured has a corresponding frame data set.
  • the sequence of frame data sets is analyzed by the gesture recognition process as shown in block 504 of Figure 5a and described in greater detail in Figures 6 and 7.
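  • The frame data set of Figure 5b can be pictured as a small record with one entry per significant portion of the subject. The field names below follow the portions listed in the description; the class name, the tuple representation of a point, the treatment of width as a single number, and the sample values are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates


@dataclass(frozen=True)
class FrameDataSet:
    """Positions of significant portions of the subject in one frame."""
    left_extremity: Point
    right_extremity: Point
    center_of_mass: Point
    width: float          # e.g., shoulder span; a scalar is an assumption
    top_of_head: Point
    center_of_head: Point


# One hypothetical frame of a person with arms extended.
frame = FrameDataSet(
    left_extremity=(40.0, 120.0),
    right_extremity=(280.0, 118.0),
    center_of_mass=(160.0, 200.0),
    width=90.0,
    top_of_head=(160.0, 60.0),
    center_of_head=(160.0, 80.0),
)
```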
  • FIG. 6 is a flow diagram showing in greater detail block 504 of Figure 5a.
  • In step 600, the system processes the frame data for a known gesture (gesture #1). This process is repeated for each known gesture contained in the gesture database shown in Figure 5a as item 506.
  • the system determines whether the gesture made by the moving subject meets any of the completion requirements for the known gestures in the system in block 606. If the moving subject's gesture does not meet the requirements for any of the known gestures, control returns to block 502 of Figure 5 in which the system preprocesses a new frame of the moving subject. If the moving subject's gesture meets the requirements of any of the known gestures, the system then performs an operation based on the semantic meaning of the recognized gesture.
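  • The control flow of Figures 5a and 6 can be sketched as a loop that hands each frame data set to every known gesture and runs the associated operation when a gesture's completion requirements are met. The function name, the dictionary-based interfaces, and the toy jump recognizer are hypothetical; a real recognizer would be stateful and examine a sequence of frames rather than a single value.

```python
def run_recognition(frame_data_sets, recognizers, operations):
    """Feed each frame data set to every known gesture's recognizer.

    recognizers: {gesture_name: callable(frame_data) -> bool}, returning True
                 when that gesture's completion requirements are met.
    operations:  {gesture_name: callable()} run when a gesture is recognized.
    """
    recognized = []
    for frame_data in frame_data_sets:             # block 502: next frame
        for name, process in recognizers.items():  # block 600, once per gesture
            if process(frame_data):                # block 606: requirements met?
                operations[name]()                 # operation for that gesture
                recognized.append(name)
        # Control returns to the next frame whether or not anything matched.
    return recognized


# Hypothetical usage: a jump is "recognized" when the head is high enough.
events = []
recognized = run_recognition(
    frame_data_sets=[{"top_of_head": 0.5}, {"top_of_head": 0.95}],
    recognizers={"jump": lambda fd: fd["top_of_head"] > 0.9},
    operations={"jump": lambda: events.append("add hat")},
)
```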
  • FIG. 7 is a flowchart showing in greater detail block 600 of Figure 6 in which the system processes the frame data to determine whether it matches the completion point of a known gesture.
  • the system begins processing a frame data set representative of a captured image frame.
  • An example of a frame data set is shown in Figure 5b.
  • the frame data set contains coordinates of various significant positions of the moving subject.
  • the frame data set contains information on the moving subject at one particular point in time.
  • the system continues capturing image frames and, thus, deriving frame data sets, as long as there is movement by the subject within view of the camera.
  • the system will extract from the frame data set positional coordinates it needs in order to perform a proper comparison with each of the gestures known or recognizable to the system.
  • a known gesture such as squatting
  • the system extracts relevant coordinates from the frame data set (in some cases it may be all the available coordinates) for comparison to known gestures.
  • the system compares the extracted positional coordinates from the frame data set to the positional coordinates of a particular point of the characteristic pattern of each known gesture.
  • Each of the known gestures in the system is made up of one or more dimensions.
  • the flapping gesture may have four dimensions: normalized x and y for the right arm and normalized x' and y' for the left arm.
  • a jump may have only two dimensions: one for the normalized top of the head and another for the normalized center of mass.
  • Each dimension turns out a characteristic pattern of positional coordinates representing the expected movements of the gesture in a particular space over time.
  • the extracted positional coordinates from the frame data set are compared to a particular point along each of these dimensional patterns for each gesture.
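  • The idea that each known gesture uses its own subset of coordinates (four dimensions for a flap, two for a jump) can be sketched as a lookup table plus an extraction step. The table contents mirror the examples in the description, but the dictionary layout, the names, and the normalization by image size are assumptions; the disclosure does not say how coordinates are normalized.

```python
# Which coordinates each known gesture needs (names are illustrative).
GESTURE_DIMENSIONS = {
    "flap": [("right_extremity", "x"), ("right_extremity", "y"),
             ("left_extremity", "x"), ("left_extremity", "y")],
    "jump": [("top_of_head", "y"), ("center_of_mass", "y")],
}


def extract_dimensions(frame_data, gesture, image_width, image_height):
    """Return the normalized coordinates one gesture is compared against."""
    values = []
    for portion, axis in GESTURE_DIMENSIONS[gesture]:
        x, y = frame_data[portion]
        # Assumed normalization: scale pixel coordinates into [0, 1].
        values.append(x / image_width if axis == "x" else y / image_height)
    return values


# One hypothetical frame data set from a 320x240 image.
frame_data = {
    "left_extremity": (40, 120),
    "right_extremity": (280, 120),
    "top_of_head": (160, 60),
    "center_of_mass": (160, 180),
}
jump_input = extract_dimensions(frame_data, "jump", 320, 240)
flap_input = extract_dimensions(frame_data, "flap", 320, 240)
```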
  • Each dimensional pattern has a number of key points, also referred to as states.
  • a key point can be a characteristic pose for a particular gesture. For example, in an arm flapping gesture, a key point can be when the arms are at the highest or lowest positions. In the case of a jump, a key point may be when the object reaches the highest point. Thus, a key point can be a point where the object has a significant change in direction.
  • Each dimension is typically made up of a few key points and flexible zones which are the areas between the key points.
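  • Since a key point can be a point where the subject changes direction significantly, one way to locate candidate key points in a single dimension is to look for turning points in its trajectory. The following sketch does this for a list of sampled values; the function name is hypothetical, and a practical version would likely also ignore small jitter, which this one does not.

```python
def find_key_points(values):
    """Return indices where a 1-D trajectory changes direction."""
    key_points = []
    previous_direction = 0
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        direction = (delta > 0) - (delta < 0)   # +1 rising, -1 falling, 0 flat
        if direction == 0:
            continue                            # flat stretch: no new information
        if previous_direction and direction != previous_direction:
            key_points.append(i - 1)            # turning point is the prior sample
        previous_direction = direction
    return key_points


# Hypothetical height of one arm over six frames: up, then down, then up.
arm_height = [0, 1, 2, 1, 0, 1]
turning_points = find_key_points(arm_height)
```

  • The samples between consecutive turning points correspond to the flexible zones between key points.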
  • the system determines whether a new state has been reached. In the course of comparing the positional data to the dimensional patterns, the system determines whether the input (potential) gesture has reached a key point for any of the known gestures.
  • the system may interpret that as a key point for the jump gesture or possibly a squatting or sitting gesture.
  • the system will make this determination. If a new state has been reached for any of the gestures, the system updates a status report to reflect this event at 710. This informs the system that the person has performed at least a part of one known gesture.
  • This information can be used for a partial completion query to determine whether a person's movement is likely to be a known gesture. For example, a system can inquire or automatically be informed when an input gesture has met three-quarters or two-thirds of a known gesture. This can be determined by probing the status report to see how many states of a known gesture have been reached. The system can then begin preparing for the completion of the known event. Essentially, the system can get a head start in performing the operation associated with the known gesture.
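  • A partial completion query amounts to reading, from the status report, how many states of a known gesture have been reached and dividing by the total. The sketch below assumes the status report is a dictionary of (states reached, total states) pairs; that layout and the function names are hypothetical.

```python
def update_status(status_report, gesture, states_reached, total_states):
    """Record how many states (key points) of a gesture have been reached."""
    status_report[gesture] = (states_reached, total_states)


def fraction_complete(status_report, gesture):
    """Partial completion query: fraction of a gesture's states reached."""
    reached, total = status_report.get(gesture, (0, 1))
    return reached / total


def gestures_at_least(status_report, fraction):
    """Names of gestures whose completion meets or exceeds the fraction."""
    return sorted(g for g in status_report
                  if fraction_complete(status_report, g) >= fraction)


# Hypothetical status after a few frames of movement.
status = {}
update_status(status, "flap", 3, 4)
update_status(status, "jump", 1, 3)
nearly_done = gestures_at_least(status, 0.75)
```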
  • the system checks whether there is a severe mismatch between data from the frame data set and the allowable positional coordinates for each dimensional pattern of each known gesture.
  • a severe mismatch would result, for example, from coordinates indicating a change in direction that clearly shows that the gesture does not conform to a particular known gesture (e.g., an arm going up when the system would expect it to go down for a certain gesture).
  • a severe mismatch would first be detected at one of a known gesture's key points. If there is a severe mismatch the system resets the data array for the known gesture with which there was a mismatch at block 714.
  • the system maintains data arrays for each gesture in which the system stores information regarding the "history" of the movements performed by the person and captured by the camera. This information is no longer needed if it is determined that it is highly unlikely that the movements by the person will match a particular known gesture. Once these data arrays are cleared so they can begin storing new information, the system also resets the status reports to reflect the mismatch at block 716. By clearing the status report regarding a particular gesture, the system will not provide misleading information when a partial completion query is made regarding that gesture.
  • the status report will indicate, at the time there is a severe mismatch, that no part of the particular gesture has been completed.
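  • The reset behavior on a severe mismatch can be sketched as follows. Here a severe mismatch is simplified to "the value moves against the expected direction by more than a tolerance"; that test, the function signature, and the use of 0 in the status report to mean that no part of the gesture is complete are assumptions.

```python
def handle_frame(gesture, value, expected_direction, history, status_report,
                 tolerance=0.0):
    """Append a value to a gesture's history, or reset on a severe mismatch.

    expected_direction: +1 if the value should be rising, -1 if falling.
    Returns True if the frame was accepted, False if the gesture was reset.
    """
    data = history.setdefault(gesture, [])
    if data:
        delta = value - data[-1]
        if delta * expected_direction < -tolerance:  # moving the wrong way
            history[gesture] = []                    # block 714: reset data array
            status_report[gesture] = 0               # block 716: reset status
            return False
    data.append(value)
    return True


# Hypothetical flap in progress: the arm should be rising but drops instead.
history = {}
status = {"flap": 2}
accepted = [
    handle_frame("flap", 0.2, +1, history, status),
    handle_frame("flap", 0.5, +1, history, status),
    handle_frame("flap", 0.3, +1, history, status),
]
```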
  • the system will continue obtaining and processing input image frames of the person performing movements in the range of the camera as shown generally in Figure 5a.
  • the system continues with block 712 where it checks for any severe mismatches. If there are no severe mismatches, the system checks whether there is a match between the coordinates in the frame data set and any of the known gestures in block 720. Once again, this is done by comparing the positional coordinates from the frame data to the coordinates of a particular point along the characteristic pattern of each dimension of each of the known gestures.
  • the most recent data in the known gesture's data arrays is kept and older data is discarded at 722. This is also done if a timing constraint for a state has been violated. This can occur if a person holds a position in a gesture for too long.
  • the subject's gesture should be continuous. New data stored in the array is stored from where the most recent data was kept. The system then continues obtaining new image input frames as shown in block 718.
  • a recognition flag for that gesture is set at 724.
  • a match is found when the sequence of positional coordinates from consecutive frame data sets match each of the patterns of positional coordinates of each dimension for a known gesture.
  • the system can perform an operation associated with the known and recognized gesture, such as transforming the person to another image or augmenting the person, as shown on a computer monitor.
  • the system will also continue obtaining input image frames as long as the person is moving within the range of the camera. Thus, control returns to block 718.
  • the training feature can also be used to show the system how a particular person does one of the already known gestures, such as the arm flap. For example, a particular person may not raise her arms as high as someone with longer arms. By showing the system how a particular person performs a gesture, the system will be more likely to recognize that gesture done by that person and recognize it sooner and with a greater confidence level. This is a useful procedure for frequent users or for users who perform one gesture frequently.
  • Figures 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture.
  • the system collects samples of the new gesture.
  • One method of providing samples of the new gesture is for a person to enter the field of the camera and do the gesture a certain number of times. This, naturally, requires some user intervention. In a preferred embodiment, the user or users perform the new gesture about 30 times.
  • the number of users and the number of samples have a direct bearing on the accuracy of the model representing the new gesture and the accuracy of the statistics of each key point (discussed in greater detail below). The more representative samples provided to the system, the more robust the recognition process will be.
  • the number of key points in the gesture is entered as well as the complete time it takes to finish one full gesture, from start to finish.
  • the system is provided with a sequence of key points and flexible zones.
  • the number of key points will vary depending on the complexity of the new gesture.
  • the key points determine what coordinates from the input frame data set should be extracted. For example, if the new gesture is a squatting movement, the motion of the hands or arms is irrelevant.
  • the system determines what dimensions to use to measure the frame data set. For example, a squatting gesture may have two dimensions whereas a more complex gesture may have four or five dimensions.
  • the system determines the location of the key points in a model representing the new gesture based on the starting and ending times provided by the user. The system does this by finding the most prominent peaks and valleys for each dimension, and then aligning these extrema across all the dimensions of the new gesture.
  • the system calculates a probability distribution of each state or key point in the model.
  • the system has a set of routines for calculating the statistics at the key points given the set of sample gestures.
  • the statistics of interest include the mean and variance values for each dimension of the gesture and statistics regarding the timing with respect to the start of the gesture.
  • the system sets the allowable upper and lower bounds for the key points, which are used during the recognition phase to accept or reject the incoming input frame data sets as a possible gesture match.
  • the system will examine the samples and derive a probability for each key point. For example, if an incoming gesture reaches the third state of a four-state gesture, the probability that the incoming gesture will match the newly entered gesture may be 90%. On the other hand, if an incoming gesture meets the newly entered gesture's first state, there may only be a 10% probability that the incoming gesture will match the newly entered gesture. This is done for each key point in each dimension for the newly entered gesture.
  • the system refines the model representing the new gesture by trying out different threshold values based on a Gaussian distribution.
  • a first version of the model has already been created.
  • the system runs the same data from the initial samples and some extraneous data that clearly falls outside the model through the model.
  • the system determines how much of the first set of data can be recognized by the initial model.
  • the thresholds of each state are initially set narrowly and are expanded until the model can recognize all the initial samples but not any of the extraneous data entered that should not fall within the model. The purpose of this is to ensure that the refined model is sufficiently broad to recognize all the samples of the gesture but not so broad as to accept arbitrary gestures (as represented by the extraneous data). Essentially, the system is determining what is an acceptable gesture and what is not.
  • the system checks if there are any more new gestures to be entered into the system by examining frames of the subject's movements. If the system does not detect any additional movements by the subject, it proceeds to block 814.
  • the system updates a gesture confusion matrix.
  • the matrix has an entry for each gesture known to the system.
  • the system checks the newly trained gesture against existing gestures in the library for confusability. If the newly trained gesture is highly confusable with one or more existing gestures, it should be retrained using more features or different features.
  • the matrix would be made up of rows and columns in which the columns represent the known gestures and the rows represent or contain data on each of the gestures.
  • a cell in which the data for a gesture, for example, jump, intersects with the jump column, should contain the highest confusability indicator.
  • a cell in which a jump column intersects with a row for arm flap data should contain a low confusability factor or indicator.
  • the system continues monitoring for additional movements by the subject starting with block 502 of Figure 5a.
  • the image of the person performing the gesture does not need to be composited onto a destination image and then displayed on the computer monitor.
  • the system can, for example, simply recognize the gesture and perform a particular function based on the semantic meaning of the gesture.
  • the system can obtain data frames from another medium such as a video or film created at an earlier time, instead of obtaining the data frames from a live figure whose movements are captured by a camera in real-time.
  • the frame data set can contain coordinates of sections of a moving subject other than coordinates specifically for a human body.
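
The threshold-refinement step described in the training items above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `fit_bounds`, the step size, and the use of a single dimension at a single key point are all hypothetical. Bounds start narrow around the sample mean and widen in multiples of the sample standard deviation until every training sample is accepted; the result is rejected if extraneous data would then also be accepted.

```python
from statistics import mean, stdev


def fit_bounds(samples, extraneous, step=0.25, max_k=10.0):
    """Return (low, high) bounds for one key point in one dimension, or None.

    samples    -- values observed at this key point across the training samples
    extraneous -- values that clearly should NOT be accepted by the model
    step/max_k -- hypothetical tuning parameters (multiples of the std. dev.)
    """
    mu, sigma = mean(samples), stdev(samples)
    k = step
    while k <= max_k:
        low, high = mu - k * sigma, mu + k * sigma
        if all(low <= v <= high for v in samples):
            # Narrowest bounds that accept every sample; widening further
            # could only admit more extraneous data, so decide here.
            if any(low <= v <= high for v in extraneous):
                return None  # model too broad; more or different features needed
            return (low, high)
        k += step
    return None


bounds = fit_bounds([9.0, 10.0, 11.0], extraneous=[2.0, 20.0])
```

Returning `None` when extraneous data falls inside the bounds is a design choice of this sketch; it loosely mirrors the remark above that a gesture may need to be retrained with more or different features.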

Abstract

A system and method are disclosed for real-time recognition of gestures made by a moving subject within an image and for performing an operation based on the semantic meaning of the gesture (510). A subject enters the viewing field of a camera connected to a computer and performs a gesture (504). The gesture is then examined by the system one image frame (506) at a time. Positional data is derived from the input frames (502) and compared to data representing gestures already known to the system (508). A frame of the input image containing the subject is obtained after a background image model has been created (500). An input frame (502) is used to derive a frame data set that contains particular coordinates of the subject at a given moment in time. This series of frame data sets is examined to determine whether it conveys a gesture that is known to the system.

Description

METHOD AND APPARATUS FOR REAL-TIME GESTURE RECOGNITION
BACKGROUND OF THE INVENTION
1. BACKGROUND The present invention relates generally to methods and apparatus for computer-implemented real-time gesture recognition. More particularly, the present invention relates to capturing a sequence of images of a moving subject performing a particular movement or gesture, extracting relevant data points from these images, and comparing the resulting sequence of data points to patterns of data points for known gestures to determine if there is a match.
2. Prior Art
An emerging and increasingly important procedure in the field of computer science is gesture recognition. In order to make gesture recognition systems commercially useful and widespread, they must recognize known gestures in real-time and must do so with minimum or reduced use of the CPU. From a process perspective a gesture is defined as a time-dependent trajectory following a prescribed pattern through a feature space, e.g., a bodily movement or handwriting. Prior art methods for gesture recognition typically use neural networks or Hidden Markov Models (HMM's), with HMM's being the most prevalent choice.
A Hidden Markov Model is a model made up of interconnected nodes or states. Each state contains information concerning itself and its relation to other states in the model. More specifically, each state contains (1) the probability of producing a particular observable output and (2) the probabilities of going from that state to any other state in the model. Since only the output is observed a system based on HMM's does not know which state it is in at any given time; it only knows what the probabilities are that a particular model produces the outputs seen thus far. Knowledge of the state is hidden from the system or application. Examples of gesture recognition systems based on Hidden Markov Models include a tennis stroke recognition system, an American sign language recognition system, a system for recognizing lip movements, and systems for recognizing handwriting. The statistical nature of HMM's can capture the variance in the way different people perform gestures at different times. However, the same statistical nature makes HMM a "black box." For example, one state in the model may represent one particular point in a bodily gesture. An HMM-based application may know many things about this point, such as the probabilities that the gesturer will change position or move in other directions. However, the application will not be able to determine precisely when it has reached that point. Thus, the application is not able to determine whether the person has completed 25% or 50% of a known gesture.
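
To make the "hidden" state concrete, the following minimal sketch computes the quantity an HMM-based recognizer does have access to: the probability that a given model produced the outputs seen so far (the standard forward algorithm). The two-state model and all of its numbers are invented purely for illustration and are not taken from any system discussed here.

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM (forward algorithm).

    start[s]    -- probability of starting in state s
    trans[p][s] -- probability of moving from state p to state s
    emit[s][o]  -- probability that state s produces output symbol o
    """
    n = len(start)
    # alpha[s] = probability of the outputs so far AND being in state s now
    alpha = [start[s] * emit[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            emit[s][obs] * sum(alpha[p] * trans[p][s] for p in range(n))
            for s in range(n)
        ]
    # Summing over states discards which state the model is actually in.
    return sum(alpha)


# Illustrative model: output symbol 0 = "arm low", 1 = "arm high".
start = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
emit = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(start, trans, emit, [0, 1])
```

The final summation over states is where the state information is lost: the recognizer obtains a likelihood for the whole model but no definite answer about how far through the gesture the subject is.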
Therefore, it would be desirable to have a real-time gesture recognition system that removes the "hidden" layer found in current systems, which use Hidden Markov Models, while still capturing the variance in the way different people perform a gesture at different times. In addition, it would be desirable to have a system that would allow more control over the training and recognition of gestures.
SUMMARY OF THE INVENTION
The present invention provides a system for recognizing gestures made by a subject within a sequence of images and performing an operation based on the semantic meaning of the gesture. In a preferred embodiment, a subject, such as a human being, enters the viewing field of a camera connected to a computer and performs a gesture. The gesture is then examined by the system one image frame at a time. Positional data is derived from the input frame and compared to previously derived data representing gestures known to the system. The comparisons are done in real time and the system can be trained to better recognize known gestures or to recognize new gestures. In a preferred embodiment, a computer-implemented gesture recognition system is described. A background image model is created by examining frames of an average background image before the subject that will perform the gesture enters the image. A frame of the input image containing the subject, such as a human being, is obtained after the background image model has been created. The frame captures the person in the action of performing the gesture at one moment in time. The input frame is used to derive a frame data set that contains particular coordinates of the subject at that given moment. This sequence of frame data sets taken over a period of time is compared to sequences of positional data making up one or more recognizable gestures, i.e., gestures already known to the system. If the gesture performed by the subject is recognizable to the system, an operation based on the semantic meaning of the gesture may be performed by the system.
In another embodiment the gesture recognition procedure includes a routine setting its confidence level according to the degree of mismatch between the input gesture data and the patterns of positional data making up the system's recognizable gestures. If the confidence passes a threshold, a match is considered found. In yet another preferred embodiment the gesture recognition procedure includes a partial completion query routine that updates a status report which provides information on how many of the requirements of the known gestures have been met by the input gesture. This allows queries of how much or what percentage of a known gesture is completed by probing the status report. This is done by determining how many key points of a recognizable gesture have been met.
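
A partial completion query of this kind might look like the sketch below. The `GestureStatus` class, the `status_report` dictionary, and the gesture names are hypothetical stand-ins chosen for illustration; the source does not specify how the status report is stored.

```python
from dataclasses import dataclass


@dataclass
class GestureStatus:
    """Hypothetical status-report entry for one known gesture."""
    total_key_points: int
    key_points_met: int = 0

    def fraction_complete(self) -> float:
        # Share of the gesture's key points reached by the input so far.
        return self.key_points_met / self.total_key_points


def nearly_complete(report, threshold=2 / 3):
    """Names of gestures whose completed fraction is at or above threshold."""
    return [name for name, status in report.items()
            if status.fraction_complete() >= threshold]


status_report = {
    "arm_flap": GestureStatus(total_key_points=4, key_points_met=3),
    "jump": GestureStatus(total_key_points=3, key_points_met=0),
}
candidates = nearly_complete(status_report)
```

An application could poll `nearly_complete` on each frame and begin preparing the operation associated with any gesture it returns.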
In yet another embodiment the gesture recognition procedure includes a routine for training the system to recognize new gestures or to recognize certain gestures performed by an individual more efficiently. Several samples of the subject, i.e., individual, performing the new gesture are used by the system to extract the number of key points, the dimensions, and other relevant characteristics of the gesture. A probability distribution for each key point indicating the likelihood of producing a particular observable output at that key point is also derived. Once a characteristic data pattern is obtained for the new gesture, it can be compared to patterns of previously stored known gestures to produce a confusion matrix. The confusion matrix describes possible similarities between the new gesture and known gestures as well as the likelihood that the system will confuse these similar gestures.
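
One plausible way to populate such a confusion matrix is sketched below. The source does not state how confusability is measured, so this sketch assumes a simple measure: the fraction of one gesture's samples that another gesture's model would accept. The model representation (one pair of bounds per key point, single dimension) and all data values are hypothetical.

```python
def accepts(model, sample):
    """True if every value in sample lies within the model's key-point bounds.

    model  -- list of (low, high) bounds, one per key point
    sample -- list of observed values, one per key point
    """
    return len(sample) == len(model) and all(
        low <= value <= high for (low, high), value in zip(model, sample)
    )


def confusion_matrix(models, samples):
    """matrix[row][col] = fraction of row's samples accepted by col's model."""
    return {
        row: {
            col: sum(accepts(models[col], s) for s in samples[row]) / len(samples[row])
            for col in models
        }
        for row in samples
    }


models = {
    "jump": [(0, 1), (4, 6), (0, 1)],
    "arm_flap": [(2, 3), (5, 8), (2, 3)],
}
samples = {
    "jump": [[0.5, 5, 0.5], [0.2, 4.5, 0.9]],
    "arm_flap": [[2.5, 6, 2.5], [2.1, 7, 2.9]],
}
matrix = confusion_matrix(models, samples)
```

Under this measure the diagonal entries are highest, and a large off-diagonal entry would flag a pair of gestures the system is likely to confuse.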
In yet another embodiment the gesture recognition procedure visually displays the subject performing the gesture and any resulting transformations or augmentations to the subject on a computer monitor through model-based compositing. Such a compositing method includes shadow reduction and hole and gap filling routines for isolating the subject being composited.
In another aspect of the present invention a computer-based system for extracting data to be used to recognize gestures made by a subject is described. In a preferred embodiment an image modeler for creating a background model that does not contain the subject is used to create an initial background model. The system includes a frame capturer for obtaining an image frame and a frame analyzer for analyzing the image thereby determining particular coordinates of the subject at a particular time. Also described is a data set creator for creating a frame data set from the particular coordinates and a data set analyzer for examining the coordinates in the frame data set and comparing them to positional data representing a known gesture. Advantages of the methods and systems described and claimed are real-time recognition of gestures made by subjects within a dynamic background image. Gestures are recognized and processed immediately in a computer system that can also be trained to recognize new gestures or to recognize certain known gestures more efficiently. In addition, the subject is composited onto a destination image without distorting effects from shadows cast by the subject or from color uniformity between the subject and the background. This provides for a clean, well-defined composited subject on a display monitor which can be processed by the computer system according to the semantic meaning of the recognized or known gesture.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings in which:
Figure 1 is a schematic illustration of a general purpose computer system suitable for implementing the present invention.
Figure 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and with the image composited onto a computer monitor through the use of a camera.
Figure 3 shows a series of screen shots showing a human figure performing a gesture, an arm flap, and the resulting function performed by the system of transforming the human figure to an image of a flying bird.
Figure 4 shows another series of screen shots showing a human figure performing another recognizable gesture, jumping, and the system augmenting the human figure once the gesture is recognized.
Figure 5a is a flowchart showing a process for a preferred embodiment for gesture recognition of the present invention.
Figure 5b shows data stored in a frame data set as derived from a data or image frame containing a subject performing a gesture as described in block 502 of Figure 5a.
Figure 6 is a flowchart showing in greater detail block 504 of Figure 5a in which the system runs the gesture recognition process.
Figures 7a and 7b are flowcharts showing in greater detail block 600 of Figure 6 in which the system processes the frame data to determine whether it matches a recognized gesture.
Figures 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Reference will now be made in detail to a preferred embodiment of the invention. An example of the preferred embodiment is illustrated in the accompanying drawings. While the invention will be described in conjunction with a preferred embodiment, it will be understood that it is not intended to limit the invention to one preferred embodiment. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.
The present invention employs various processes involving data stored in computer systems. These processes are those requiring physical manipulation of physical quantities.
Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It is sometimes convenient, principally for reasons of common usage, to refer to these signals as bits, values, elements, variables, characters, data structures, or the like. It should be remembered, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Further, the manipulations performed are often referred to in terms such as identifying, running, comparing, or detecting. In any of the operations described herein that form part of the present invention, these operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or other similar devices. In all cases, it should be borne in mind the distinction between the method of operations in operating a computer and the method of computation itself. The present invention relates to method blocks for operating a computer in processing electrical or other physical signals to generate other desired physical signals. The present invention also relates to a computer system for performing these operations. This computer system may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. The processes presented herein are not inherently related to any particular computer or other computing apparatus. In particular, various general purpose computing machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized computer apparatus to perform the required method blocks.
Figure 1 is a schematic illustration of a general purpose computer system suitable for implementing the process of the present invention. The computer system includes a central processing unit (CPU) 102, which CPU is coupled bi-directionally with random access memory (RAM) 104 and unidirectionally with read only memory (ROM) 106. Typically RAM 104 includes programming instructions and data, including text objects as described herein in addition to other data and instructions for processes currently operating on CPU 102. ROM 106 typically includes basic operating instructions, data and objects used by the computer to perform its functions. In addition, a mass storage device 108, such as a hard disk, CD ROM, magneto-optical (floptical) drive, tape drive or the like, is coupled bi-directionally with CPU 102. Mass storage device 108 generally includes additional programming instructions, data and text objects that typically are not in active use by the CPU, although the address space may be accessed by the CPU, e.g., for virtual memory or the like. Each of the above described computers further includes an input/output source 110 that typically includes input media such as a keyboard, pointer devices (e.g., a mouse or stylus) and the like. Each computer can also include a network connection 112 over which data, including, e.g., text objects, and instructions can be transferred. Additional mass storage devices (not shown) may also be connected to CPU 102 through network connection 112. It will be appreciated by those skilled in the art that the above described hardware and software elements are of standard design and construction.
As discussed above, Hidden Markov Models are typically used in current gesture recognition systems to account for variance in possible movements in a gesture. The present invention uses the HMM construct and removes the hidden nature of the model by allowing the application to determine which state in the model it is in. The present invention also forces the application to move in a certain direction by removing all the connections from a particular state to the other states except for one. For example, at state one in a Hidden Markov Model, an application may be able to go to states two, three, or four. State one would have the probabilities that from it, the gesture would go to any one of those states. In a preferred embodiment of the present invention, the connections to states three and four are removed, thus forcing the application or system to go to state two or to stay in state one. It should be noted that the HMM construct also allows for this case, which is generally known as the left-to-right HMM. However, in an HMM implementation, state one will have two probabilities: one indicating the probability that it will stay in state one and another that it will go to state two. In the present invention, there are no transition probabilities. The application will stay in state one until it meets the criteria, such as reaching a local extremum, for moving to state two. Also included in a preferred embodiment of the present invention is a timing constraint built into the application. This timing constraint applies to individual states in the model. For example, a state may have a timing constraint such that the person cannot stay in a particular pose or position in the gesture for more than a predetermined length of time. Furthermore, by removing the hidden layer in the HMM, the system can determine at any time how much of a particular gesture has been completed since the system knows what state the gesture is in.
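
The explicit, non-hidden state progression described above can be sketched as a small state machine. This is an illustrative simplification, not the claimed implementation: the class name `GestureTracker`, the representation of each state's criterion as a (low, high) range on a single value, and the frame-count timing constraint are all assumptions made for the example.

```python
class GestureTracker:
    """Left-to-right state sequence with no transition probabilities.

    The tracker stays in its current state until the criterion for the next
    key point is met, and resets if it waits in one state for too long.
    """

    def __init__(self, key_points, max_frames_per_state):
        self.key_points = key_points            # list of (low, high) ranges
        self.max_frames = max_frames_per_state  # timing constraint per state
        self.reset()

    def reset(self):
        self.state = 0
        self.frames_in_state = 0

    def fraction_complete(self):
        # The state is known at all times, so progress can be reported directly.
        return self.state / len(self.key_points)

    def update(self, value):
        """Feed one frame's value; return True when the gesture is completed."""
        low, high = self.key_points[self.state]
        if low <= value <= high:
            self.state += 1
            self.frames_in_state = 0
            if self.state == len(self.key_points):
                self.reset()
                return True
            return False
        self.frames_in_state += 1
        if self.frames_in_state > self.max_frames:
            self.reset()  # timing constraint violated: start over
        return False


tracker = GestureTracker([(0, 1), (8, 10), (0, 1)], max_frames_per_state=3)
results = [tracker.update(v) for v in [0.5, 4, 9, 5, 0.5]]
```

Because the only permitted moves are "stay" or "advance to the next state", `fraction_complete` can answer at any frame how much of the gesture has been performed, which a likelihood-only HMM recognizer cannot do directly.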
In another preferred embodiment of the gesture recognition system of the present invention, a training interface is included which requires a small degree of human intervention. A person can "teach" the system new gestures for it to recognize by performing samples of the new gesture in front of a camera. The user can then enter certain information about the new gesture allowing the system to create a model of the new gesture to store in its library.
Figure 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and having the image composited on a computer monitor through the use of a camera. It shows a computer 206 connected to a camera 200. In other preferred embodiments, the camera can be located further away from the computer. Camera 200 has within its range or field of vision, a person 202 with her arms extended, as if in the middle of an arm flap gesture.
In a preferred embodiment, the image of person 202 performing the gesture is composited onto a destination image 208 which is displayed on a computer monitor as shown in Figure 2. Assuming one of the system's recognizable gestures is arm flapping, once the system recognizes that the person is performing this gesture it will perform an operation associated with that gesture. Examples of this are shown in Figures 3 and 4 below. In other preferred embodiments, the person's image does not need to be composited onto a destination image or displayed on the computer monitor. The system can simply recognize the gesture and perform an operation, without having to composite the image of the person. In a preferred embodiment, although the person may be located in a room with background items that are static, such as furniture, or non-static, such as a television screen or open window showing moving objects, such items are not composited onto a destination image; only the human figure is composited.
Figure 3 shows a series of screen shots showing a human figure performing a gesture, in this case an arm flap, and the resulting function performed by the system, i.e., transforming the human figure to an image of a flying bird. In other preferred embodiments, the human figure can perform other types of gestures and be transformed to another figure or be augmented, as shown in Figure 4 below. At 300 of Figure 3, the person is initially flapping her arms up and down at a rate acceptable to the system. This rate can vary in various embodiments but is generally dependent on factors such as camera frame speed or CPU clock speed. At shots 302 and 304, the person is moving her arms up and down in full range and is performing the complete gesture of arm flapping. Once this is done and the system recognizes the gesture, the system transforms the person to a bird as shown at shot 306. Transforming the human figure to a bird is one example of a function or operation the computer can perform once it recognizes the arm flapping gesture. More generally, once the gesture is recognized the computer can perform any type of function that it was programmed to perform upon recognition, such as changing applications or turning the computer on or off. Performing the recognized gesture is essentially the same as pressing a key on the keyboard or clicking a button on a mouse.
Figure 4 shows another example of a preferred embodiment where the human figure performing a recognizable gesture, in this case jumping up and down, is augmented with a new hat by the system once it recognizes the gesture. In this example, the figure or subject is not transformed as in Figure 3, but rather is augmented (i.e., a less significant change to the figure) by having an object, the hat, added to it. At shot 400 the figure is standing still. At shots 402 and 404 the figure is shown jumping straight up and down at an acceptable rate to the system as described above. Once this gesture is recognized by the system, the computer performs the function of augmenting the figure by placing a hat on the figure's head as shown at 406. As described above, this system can perform any type of function that it could normally perform from a user pressing a key or clicking a mouse, once it recognizes the gesture. This gesture recognition and training process is described in greater detail with respect to Figures 5 through 9.
Figure 5a is a flow diagram showing a process for a preferred embodiment of object gesture recognition of the present invention. At 500, the system creates or digitally builds a background model by capturing several frames of a background image. The background image is essentially the setting the system is being used in, for example, a child's playroom, an office, or a living room. It is the setting in which the subject, e.g. a person, will enter and, possibly, perform a gesture. A preferred embodiment of creating a background model is described in an application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on October 15, 1997.
Once the background model is created in block 500, in a preferred embodiment, the system preprocesses an image frame within which the subject is performing a particular gesture in block 502. In a preferred embodiment, this preprocessing involves compositing the object onto a destination image and displaying the destination image on a computer monitor, as described with respect to Figure 2 above. The compositing process can involve sub-processes for reducing the effect of shadows and filling holes and gaps in the object once composited. The destination image can be an image very different from the background image, such as an outdoor scene, outer space, or other type of imaginary scene. This gives the effect of the person performing a gesture, and being augmented or transformed, in an unusual environment or setting. A preferred embodiment of the compositing process is described in detail in co-pending application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on October 15, 1997.
At 504 the system analyzes the person's gesture by performing a gesture recognition process using as data a sequence of image frames captured in block 502. A preferred embodiment of the gesture recognition process is described in greater detail with respect to Figure 6. The gesture recognition process is performed using a gesture database as shown in block 506. Gesture database 506 contains data arrays representing gestures known to the system and other information such as status reports, described in greater detail below. The gesture recognition process deconstructs and analyzes the gesture or gestures being made by the person. At 508 the system determines whether the gesture performed by the person is actually a recognized or known gesture. The system has a set of recognizable gestures to which the gesture being performed by the person is compared. The data representing the recognizable gestures is stored in data arrays, described in greater detail with respect to Figure 6 below. If the gesture performed by the person is a recognizable gesture, the system proceeds to block 510.
At 510 the system performs a particular function or operation based on the semantic meaning of the recognized gesture. As described above this meaning can translate to transforming the person to another figure, like a bird, or augmenting the person, for example, by adding a hat. Once the system recognizes a gesture and performs an operation based on the gesture, the system returns to block 502 and continues analyzing image frames of the person performing further gestures. That is, even though the person has performed a gesture recognizable to the system and the system has carried out an operation based on the gesture, the processing continues as long as the image frames are being sent to the system. The system will continue processing movements by the person to see if they match any of its recognizable gestures. However, if the gesture performed by the person is not recognized by the system, control also returns to block 502 where the system captures and preprocesses the next frame of the person continuing performance of a gesture (i.e. the person's continuing movements in front of the camera).
Figure 5b shows data stored in a frame data set as derived from an image frame containing the person performing a gesture as described in block 502 of Figure 5a. In a preferred embodiment, the frame data set shown in Figure 5b contains x and y coordinate values of certain portions of a person performing a gesture. For example, these portions can include: a left extremity, a right extremity, a center of mass, width, top of head, and center of head. In this example, the left and right extremities can be the ends of a person's left and right arms and the width can be the person's shoulder span. In other preferred embodiments, the coordinates can be of other significant or relevant portions depending on the subject performing the movements and the type of movement. The frame data set contains information on the positions (via x and y coordinates) of significant or meaningful portions of the subject's "body". What is significant or meaningful can depend on the nature and range of gestures expected to be performed by the subject or that are recognized or known to the system. For example, the left and right extremities of a person are significant because one of the recognizable gestures is flapping of the arms, which is determined by the movement of the ends of the person's arms. In a preferred embodiment, each image or data frame captured has a corresponding frame data set. The sequence of frame data sets is analyzed by the gesture recognition process as shown in block 504 of Figure 5a and described in greater detail in Figures 6 and 7. As will be described in greater detail below, information from the frame data set is extracted in various combinations and can also be scaled as needed by the system. For example, with an arm flapping gesture the system would extract width coordinates, coordinates of right and left extremities, and center of mass coordinates, and possibly others.
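By way of illustration only, a frame data set of the kind shown in Figure 5b might be represented as in the following minimal sketch. The class name `FrameDataSet`, the field names, the representation of width as an (x, y) pair, and the `scale` helper are assumptions made for this example and are not taken from the figures.

```python
from dataclasses import dataclass, fields
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position in image coordinates


@dataclass(frozen=True)
class FrameDataSet:
    """Positions of significant parts of the subject in one captured frame."""
    left_extremity: Point
    right_extremity: Point
    center_of_mass: Point
    width: Point  # assumed here to be stored as a coordinate pair like the others
    top_of_head: Point
    center_of_head: Point


def scale(frame: FrameDataSet, factor: float) -> FrameDataSet:
    """Return a copy of the frame data set with every coordinate multiplied by factor."""
    return FrameDataSet(**{
        f.name: (getattr(frame, f.name)[0] * factor,
                 getattr(frame, f.name)[1] * factor)
        for f in fields(frame)
    })


# Hypothetical values for a single frame.
frame = FrameDataSet(
    left_extremity=(10.0, 50.0),
    right_extremity=(90.0, 50.0),
    center_of_mass=(50.0, 60.0),
    width=(40.0, 0.0),
    top_of_head=(50.0, 10.0),
    center_of_head=(50.0, 20.0),
)
half = scale(frame, 0.5)
```

One such record would be produced per captured frame, and the recognition process would consume the resulting sequence.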
Essentially, the frame data set indicates the location of significant parts of the moving subject at a given moment in time.
Figure 6 is a flow diagram showing in greater detail block 504 of Figure 5a. In step 600 the system processes the frame data for a known gesture (gesture #1). This process is repeated for each known gesture contained in the gesture database shown in Figure 5a as item 506. Once the frame data has been compared to gesture data as shown in blocks 600 through 604 (known gesture #N), the system then determines whether the gesture made by the moving subject meets any of the completion requirements for the known gestures in the system in block 606. If the moving subject's gesture does not meet the requirements for any of the known gestures, control returns to block 502 of Figure 5a in which the system preprocesses a new frame of the moving subject. If the moving subject's gesture meets the requirements of any of the known gestures, the system then performs an operation based on the semantic meaning of the recognized gesture.
For example, if the gesture by the moving subject is recognized to be a flapping gesture, the system can then transform the human figure on the monitor into a bird or another object. The transformation to an image of a bird would be an example of a semantic meaning of the arm flapping gesture.
Figure 7 is a flowchart showing in greater detail block 600 of Figure 6 in which the system processes the frame data to determine whether it matches the completion point of a known gesture. At 700 the system begins processing a frame data set representative of a captured image frame. An example of a frame data set is shown in Figure 5b. As described above, the frame data set contains coordinates of various significant positions of the moving subject. The frame data set contains information on the moving subject at one particular point in time. As will be described below, the system continues capturing image frames and, thus, deriving frame data sets, as long as there is movement by the subject within view of the camera.
At 702 the system will extract from the frame data set positional coordinates it needs in order to perform a proper comparison with each of the gestures known or recognizable to the system. For example, a known gesture, such as squatting, may only have two relevant or necessary coordinates that need to be checked, such as top of head and center of mass. Other coordinates do not need to be checked in order to determine whether a person is performing a squatting movement. Thus, in block 702 the system extracts relevant coordinates from the frame data set (in some cases it may be all the available coordinates) for comparison to known gestures.
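The extraction in block 702 can be sketched as follows, purely as an illustration. The table `RELEVANT_PARTS`, the gesture names, and the dictionary layout of the frame data set are assumptions of this example; the description does not prescribe a particular data layout.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]

# Hypothetical table: which parts of the frame data set each known gesture needs.
RELEVANT_PARTS: Dict[str, Tuple[str, ...]] = {
    "squat": ("top_of_head", "center_of_mass"),
    "arm_flap": ("left_extremity", "right_extremity", "width", "center_of_mass"),
}


def extract(frame: Dict[str, Point], gesture: str) -> Dict[str, Point]:
    """Return only the coordinates needed to compare the frame with one gesture."""
    return {part: frame[part] for part in RELEVANT_PARTS[gesture]}


# Hypothetical frame data set for one captured frame.
frame = {
    "left_extremity": (10.0, 50.0),
    "right_extremity": (90.0, 50.0),
    "center_of_mass": (50.0, 60.0),
    "width": (40.0, 0.0),
    "top_of_head": (50.0, 10.0),
    "center_of_head": (50.0, 20.0),
}
squat_input = extract(frame, "squat")
```

Here the squatting gesture reads only the top of head and center of mass, so arm positions never enter its comparison.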
At 704 the system compares the extracted positional coordinates from the frame data set to the positional coordinates of a particular point of the characteristic pattern of each known gesture. Each of the known gestures in the system is made up of one or more dimensions. For example, the flapping gesture may have four dimensions: normalized x and y for the right arm and normalized x' and y' for the left arm. A jump may have only two dimensions: one for the normalized top of the head and another for the normalized center of mass. Each dimension yields a characteristic pattern of positional coordinates representing the expected movements of the gesture in a particular space over time. The extracted positional coordinates from the frame data set are compared to a particular point along each of these dimensional patterns for each gesture.
Each dimensional pattern has a number of key points, also referred to as states. A key point can be a characteristic pose for a particular gesture. For example, in an arm flapping gesture, a key point can be when the arms are at the highest or lowest positions. In the case of a jump, a key point may be when the object reaches the highest point. Thus, a key point can be a point where the object has a significant change in direction. Each dimension is typically made up of a few key points and flexible zones which are the areas between the key points. At 706 the system determines whether a new state has been reached. In the course of comparing the positional data to the dimensional patterns, the system determines whether the input (potential) gesture has reached a key point for any of the known gestures. Thus, if a person bends her knees to a certain point, the system may interpret that as a key point for the jump gesture or possibly a squatting or sitting gesture. Another example is a person moving her arms up to a certain point and then moving them down. The point at which the person begins moving her arms down can be interpreted by the system as a key point for the arm flap gesture. At 708 the system will make this determination. If a new state has been reached for any of the gestures, the system updates a status report to reflect this event at 710. This informs the system that the person has performed at least a part of one known gesture.
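As a simplified illustration of blocks 706 through 710, the sketch below tracks a single dimension of a single gesture. It assumes that each key point is stored as an allowable (lower, upper) range of a normalized coordinate, which is one possible encoding and not necessarily the one used in the preferred embodiment. The names `GestureTracker`, `update`, and `status_report` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class GestureTracker:
    """Tracks progress through the key points (states) of one dimension."""
    name: str
    key_points: List[Tuple[float, float]]  # assumed (lower, upper) bound per key point
    states_reached: int = 0

    def update(self, value: float) -> bool:
        """Return True if this value reaches the next key point of the gesture."""
        if self.states_reached >= len(self.key_points):
            return False  # every key point has already been reached
        lower, upper = self.key_points[self.states_reached]
        if lower <= value <= upper:
            self.states_reached += 1
            return True
        return False  # value lies in a flexible zone between key points


# Hypothetical arm flap pattern: arms high, arms low, arms high again.
flap = GestureTracker("arm_flap", [(0.8, 1.0), (0.0, 0.2), (0.8, 1.0)])
status_report: Dict[str, int] = {}

# Normalized height of one arm over six consecutive frames (made-up data).
for height in [0.5, 0.9, 0.5, 0.1, 0.6, 0.95]:
    if flap.update(height):
        # A new state was reached, so the status report is updated.
        status_report[flap.name] = flap.states_reached
```

Values that fall between key points leave the state unchanged, which corresponds to the flexible zones described above.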
This information can be used for a partial completion query to determine whether a person's movement is likely to be a known gesture. For example, a system can inquire or automatically be informed when an input gesture has met three-quarters or two-thirds of a known gesture. This can be determined by probing the status report to see how many states of a known gesture have been reached. The system can then begin preparing for the completion of the known event. Essentially, the system can get a head start in performing the operation associated with the known gesture.
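A partial completion query of this kind could look like the following sketch. It assumes the status report is a mapping from gesture name to the number of states reached; the function names and the example counts are hypothetical.

```python
from typing import Dict, List


def completion_fraction(status_report: Dict[str, int],
                        total_states: Dict[str, int],
                        gesture: str) -> float:
    """Fraction of a known gesture's states that the input gesture has reached."""
    return status_report.get(gesture, 0) / total_states[gesture]


def likely_gestures(status_report: Dict[str, int],
                    total_states: Dict[str, int],
                    threshold: float) -> List[str]:
    """Known gestures whose completion fraction meets or exceeds the threshold."""
    return [g for g in total_states
            if completion_fraction(status_report, total_states, g) >= threshold]


# Hypothetical state counts: arm flap has four states, jump has three.
total_states = {"arm_flap": 4, "jump": 3}
status_report = {"arm_flap": 3, "jump": 1}

candidates = likely_gestures(status_report, total_states, threshold=2 / 3)
```

With these made-up counts, the arm flap is three-quarters complete and is reported, while the jump, one-third complete, is not.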
At 712 the system checks whether there is a severe mismatch between data from the frame data set and the allowable positional coordinates for each dimensional pattern of each known gesture. A severe mismatch would result, for example, from coordinates indicating a change in direction that clearly shows that the gesture does not conform to a particular known gesture (e.g., an arm going up when the system would expect it to go down for a certain gesture).
A severe mismatch would first be detected at one of a known gesture's key points. If there is a severe mismatch the system resets the data array for the known gesture with which there was a mismatch at block 714. The system maintains data arrays for each gesture in which the system stores information regarding the "history" of the movements performed by the person and captured by the camera. This information is no longer needed if it is determined that it is highly unlikely that the movements by the person will match a particular known gesture. Once these data arrays are cleared so they can begin storing new information, the system also resets the status reports to reflect the mismatch at block 716. By clearing the status report regarding a particular gesture, the system will not provide misleading information when a partial completion query is made regarding that gesture. The status report will indicate, at the time there is a severe mismatch, that no part of the particular gesture has been completed. At 718 the system will continue obtaining and processing input image frames of the person performing movements in the range of the camera as shown generally in Figure 5a.
Returning to block 708, if a new state has not been reached for any of the known gestures, the system continues with block 712 where it checks for any severe mismatches. If there are no severe mismatches, the system checks whether there is a match between the coordinates in the frame data set and any of the known gestures in block 720. Once again, this is done by comparing the positional coordinates from the frame data to the coordinates of a particular point along the characteristic pattern of each dimension of each of the known gestures.
If there is a less-than-severe mismatch, but a mismatch nonetheless, between the positional coordinates and a known gesture, the most recent data in the known gesture's data arrays is kept and older data is discarded at 722. This is also done if a timing constraint for a state has been violated. This can occur if a person holds a position in a gesture for too long. In a preferred embodiment, the subject's gesture should be continuous. New data stored in the array is stored from where the most recent data was kept. The system then continues obtaining new image input frames as shown in block 718.
If the system determines that the movements performed by the person match a known gesture, a recognition flag for that gesture is set at 724. A match is found when the sequence of positional coordinates from consecutive frame data sets matches each of the patterns of positional coordinates of each dimension for a known gesture. Once a match is found, the system can perform an operation associated with the known and recognized gesture, such as transforming the person to another image or augmenting the person, as shown on a computer monitor. However, the system will also continue obtaining input image frames as long as the person is moving within the range of the camera. Thus, control returns to block 718.
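The branches at blocks 712 through 724 can be summarized in the following sketch for a single known gesture. The `Outcome` enumeration, the `apply_outcome` function, and the `keep_recent` parameter are hypothetical names introduced for this example. The sketch also assumes that a less-than-severe mismatch leaves the status report unchanged, which the description does not state explicitly.

```python
from enum import Enum
from typing import List, Tuple


class Outcome(Enum):
    SEVERE_MISMATCH = "severe_mismatch"
    MILD_MISMATCH = "mild_mismatch"  # also used here for a timing violation
    MATCH = "match"
    NO_CHANGE = "no_change"


def apply_outcome(history: List[float], states_reached: int, outcome: Outcome,
                  keep_recent: int = 5) -> Tuple[List[float], int, bool]:
    """Return (new data array, new states reached, recognition flag)."""
    if outcome is Outcome.SEVERE_MISMATCH:
        # Blocks 714 and 716: clear the data array and reset the status report.
        return [], 0, False
    if outcome is Outcome.MILD_MISMATCH:
        # Block 722: keep only the most recent data and discard older data.
        start = max(0, len(history) - keep_recent)
        return history[start:], states_reached, False
    if outcome is Outcome.MATCH:
        # Block 724: set the recognition flag for this gesture.
        return history, states_reached, True
    return history, states_reached, False


history = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # made-up stored positions
after_mild = apply_outcome(history, 2, Outcome.MILD_MISMATCH, keep_recent=3)
after_severe = apply_outcome(history, 2, Outcome.SEVERE_MISMATCH)
```

In every branch the caller would then continue with block 718 and obtain the next input frame.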
In a preferred embodiment of the present invention, it is possible for the user to enter new gestures into the system, thereby adding them to the system library of known or recognized gestures. One process for doing this is through training the system to recognize the new gesture.
The training feature can also be used to show the system how a particular person does one of the already known gestures, such as the arm flap. For example, a particular person may not raise her arms as high as someone with longer arms. By showing the system how a particular person performs a gesture, the system will be more likely to recognize that gesture done by that person and recognize it sooner and with a greater confidence level. This is a useful procedure for frequent users or for users who perform one gesture frequently.
Figures 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture. At 800 the system collects samples of the new gesture. One method of providing samples of the new gesture is for a person to enter the field of the camera and do the gesture a certain number of times. This, naturally, requires some user intervention. In a preferred embodiment, the user or users perform the new gesture about 30 times. The number of users and the number of samples have a direct bearing on the accuracy of the model representing the new gesture and the accuracy of the statistics of each key point (discussed in greater detail below). The more representative samples provided to the system, the more robust the recognition process will be.
At 802 the number of key points in the gesture is entered as well as the complete time it takes to finish one full gesture, from start to finish. Essentially, in blocks 800 and 802, the system is provided with a sequence of key points and flexible zones. The number of key points will vary depending on the complexity of the new gesture. The key points determine what coordinates from the input frame data set should be extracted. For example, if the new gesture is a squatting movement, the motion of the hands or arms is irrelevant. At 804 the system determines what dimensions to use to measure the frame data set. For example, a squatting gesture may have two dimensions whereas a more complex gesture may have four or five dimensions. In block 806 the system determines the location of the key points in a model representing the new gesture based on the starting and ending times provided by the user. The system does this by finding the most prominent peaks and valleys for each dimension, and then aligning these extrema across all the dimensions of the new gesture.
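As an illustration of block 806 for one dimension, the sketch below finds local peaks and valleys in a sampled trajectory and keeps the most prominent ones. Ranking extrema by their distance from the mean of the trajectory is an assumption of this example, since the description does not define prominence, and alignment across dimensions is omitted. The function names are hypothetical.

```python
from typing import List, Sequence


def find_extrema(values: Sequence[float]) -> List[int]:
    """Indices of strict local maxima (peaks) and minima (valleys)."""
    indices = []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            indices.append(i)
    return indices


def key_point_indices(values: Sequence[float], num_key_points: int) -> List[int]:
    """Pick the most prominent extrema and return them in time order."""
    average = sum(values) / len(values)
    # Assumed prominence measure: distance of the extremum from the mean.
    ranked = sorted(find_extrema(values),
                    key=lambda i: abs(values[i] - average),
                    reverse=True)
    return sorted(ranked[:num_key_points])


# Made-up normalized arm height over eight frames of one sample gesture.
trajectory = [0.0, 0.5, 1.0, 0.5, 0.0, 0.4, 0.9, 0.5]
key_points = key_point_indices(trajectory, num_key_points=2)
```

The number of key points passed in corresponds to the value entered at block 802.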
At 808 the system calculates a probability distribution of each state or key point in the model. The system has a set of routines for calculating the statistics at the key points given the set of sample gestures. The statistics of interest include the mean and variance values for each dimension of the gesture and statistics regarding the timing with respect to the start of the gesture. Using these means and variances, the system sets the allowable upper and lower bounds for the key points, which are used during the recognition phase to accept or reject the incoming input frame data sets as a possible gesture match. The system will examine the samples and derive a probability for each key point. For example, if an incoming gesture reaches the third state of a four-state gesture, the probability that the incoming gesture will match the newly entered gesture may be 90%. On the other hand, if an incoming gesture meets the newly entered gesture's first state, there may only be a 10% probability that the incoming gesture will match the newly entered gesture. This is done for each key point in each dimension for the newly entered gesture.
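One way the means and variances could be turned into allowable bounds is sketched below for a single key point in a single dimension. Setting the bounds at the mean plus or minus `k` sample standard deviations is an assumption of this example, as are the function names and the sample values.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def key_point_bounds(samples: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Lower and upper bounds for a key point: mean -/+ k sample standard deviations."""
    center = mean(samples)
    spread = stdev(samples)
    return center - k * spread, center + k * spread


def accepts(value: float, bounds: Tuple[float, float]) -> bool:
    """True if an incoming coordinate falls within the allowable bounds."""
    lower, upper = bounds
    return lower <= value <= upper


# Made-up normalized arm heights at the first key point across five training samples.
samples = [0.90, 0.92, 0.88, 0.91, 0.89]
bounds = key_point_bounds(samples)
```

During recognition, an incoming coordinate outside these bounds would count against a match at that key point.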
At 810 the system refines the model representing the new gesture by trying out different threshold values based on a Gaussian distribution. At this stage a first version of the model has already been created. The system then runs the same data from the initial samples and some extraneous data that clearly falls outside the model through the model. The system then determines how much of the first set of data can be recognized by the initial model. The thresholds of each state are initially set narrowly and are expanded until the model can recognize all the initial samples but not any of the extraneous data entered that should not fall within the model. The purpose of this is to ensure that the refined model is sufficiently broad to recognize all the samples of the gesture but not so broad as to accept arbitrary gestures (as represented by the extraneous data). Essentially, the system is determining what is an acceptable gesture and what is not.
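The widening procedure of block 810 can be sketched for one key point in one dimension as follows. The threshold is expressed here as a multiple `k` of the sample standard deviation, and the starting value, step size, and upper limit are arbitrary values chosen for this example. The function name `refine_threshold` is hypothetical.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def refine_threshold(samples: Sequence[float], extraneous: Sequence[float],
                     start: float = 0.5, step: float = 0.25,
                     max_k: float = 10.0) -> Optional[float]:
    """Widen the threshold until every sample is accepted.

    Returns the threshold multiple, or None if the model would also accept
    extraneous data or no threshold up to max_k accepts every sample.
    """
    center, spread = mean(samples), stdev(samples)
    k = start
    while k <= max_k:
        lower, upper = center - k * spread, center + k * spread
        if all(lower <= v <= upper for v in samples):
            if any(lower <= v <= upper for v in extraneous):
                return None  # too broad: extraneous data would be recognized
            return k
        k += step  # threshold still too narrow, so expand it
    return None


samples = [0.90, 0.92, 0.88, 0.91, 0.89]  # made-up training samples
extraneous = [0.5, 0.2]                   # made-up data that should be rejected
k = refine_threshold(samples, extraneous)
```

A result of `None` would suggest that the chosen dimension cannot separate the gesture from the extraneous data on its own.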
At 812 the system checks if there are any more new gestures to be entered into the system by examining frames of the subject's movements. If the system does not detect any additional movements by the subject, it proceeds to block 814.
At 814 the system updates a gesture confusion matrix. The matrix has an entry for each gesture known to the system. The system checks the newly trained gesture against existing gestures in the library for confusability. If the newly trained gesture is highly confusable with one or more existing gestures, it should be retrained using more features or different features. In a preferred embodiment the matrix would be made up of rows and columns in which the columns represent the known gestures and the rows represent or contain data on each of the gestures. A cell in which the data for a gesture, for example, jump, intersects with the jump column, should contain the highest confusability indicator. In another example, a cell in which a jump column intersects with a row for arm flap data should contain a low confusability factor or indicator. Once the confusion matrix has been set for the newly entered gesture, the system continues monitoring for additional movements by the subject starting with block 502 of Figure 5a.
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. For example, the image of the person performing the gesture does not need to be composited onto a destination image and then displayed on the computer monitor. The system can, for example, simply recognize the gesture and perform a particular function based on the semantic meaning of the gesture. In another example, the system can obtain data frames from another medium such as a video or film created at an earlier time, instead of obtaining the data frames from a live figure whose movements are captured by a camera in real-time. In yet another example, the frame data set can contain coordinates of sections of a moving subject other than coordinates specifically for a human body.
Furthermore, it should be noted that there are alternative ways of implementing both the process and apparatus of the present invention. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Claims

What is claimed is:
1. A computer-implemented method of storing and recognizing gestures made by a moving subject within an image, the method including: a) building a background model by obtaining at least one frame of a data stream; b) obtaining a data frame containing a subject performing part of a gesture; c) analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the gesture; d) adding the particular coordinates to a frame data set; e) examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; f) repeating b through e for a plurality of data frames; and g) determining whether the plurality of the data frames when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
2. A method as recited in claim 1 wherein building a background model further includes determining whether there is significant activity in the background image thereby restarting the process for building the background model.
3. A method as recited in claim 1 wherein obtaining a data frame further includes separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
4. A method as recited in claim 1 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change or pose.
5. A method as recited in claim 4 wherein analyzing the data frame further includes: determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that another key point in the dimension has been reached; and using the certainty score at each key point to determine whether a subject gesture matches a recognizable gesture.
6. A method as recited in claim 1 wherein determining whether the plurality of the data frames conveys a subject gesture further includes comparing the frame data set to positional data corresponding to a dimensional pattern for a recognizable gesture.
7. A method as recited in claim 1 further including: obtaining a next data frame thereby determining whether the subject gesture has reached a next key point; and updating a status report containing data on key points reached in a dimension.
8. A method as recited in claim 7 further including checking the status report to determine if the subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
9. A method as recited in claim 1 further including: determining whether the particular coordinates in the frame data set match the positional data for a potential gesture; resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures; discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
10. A method as recited in claim 9 further including discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
11. A method as recited in claim 1 wherein the step of examining the particular coordinates further includes extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
12. A method as recited in claim 1 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
13. A method as recited in claim 1 wherein adding the particular coordinates to a frame data set further includes storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for each recognizable gesture.
14. A method as recited in claim 1 further including: storing a plurality of samples of a subject gesture; inputting a number of key points that fit in the subject gesture and a time value representing the time for the subject gesture to complete; inputting a number of dimensions of the subject gesture; determining locations of key points in a model representative of the subject gesture; and calculating a probability distribution for key points indicating the likelihood that a certain output will be observed.
15. A method as recited in claim 14 further including refining the model such that the plurality of samples of the subject gesture fit within the model.
16. A method as recited in claim 14 further including calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the new gesture to previously stored recognizable gestures can be determined.
17. A method as recited in claim 1 further including pre-processing the data frame such that the subject is visually displayed on a computer display monitor.
18. A method as recited in claim 17 wherein the subject is composited onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
19. A computer readable medium including program instructions implementing the process of claim 1.
20. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising: an image modeller for creating a background model by examining a plurality of frames of an input image that does not contain a subject; a frame capturer for obtaining a data frame containing the subject performing part of a subject gesture; a frame analyzer for analyzing the data frame thereby determining relevant coordinates of the subject at a particular time while the subject is performing the subject gesture; a data set creator for creating a frame data set by collecting the relevant coordinates; a data set analyzer for examining the particular coordinates in the frame data set such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein each recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and a gesture recognizer for determining whether a plurality of the data frames, wherein a data frame is represented by a frame data set, when examined in a particular sequence, conveys a gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
21. A system as recited in claim 20 wherein the image modeller further comprises an image initializer for initializing the input image that does not contain the subject.
22. A system as recited in claim 20 wherein the frame capturer further comprises a frame separator for categorizing the subject represented in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
23. A system as recited in claim 20 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change.
24. A system as recited in claim 23 wherein the frame analyzer further comprises: a probability evaluator for determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that a sequence of outputs observed belongs to a gesture model; and a gesture recognizer for determining whether a subject gesture matches a recognizable gesture by using the certainty score at each key point.
25. A system as recited in claim 20 wherein the gesture recognizer further comprises a data comparator for comparing the frame data set to positional data corresponding to a dimension of a recognizable gesture.
26. A system as recited in claim 20 further comprising a status updater for updating a status report containing data on key points reached in a dimension after obtaining a next data frame thereby determining whether the subject gesture has reached a next key point.
27. A system as recited in claim 26 further comprising a status checker for checking the status report to determine if the subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
28. A system as recited in claim 20 further comprising: a position comparator for determining whether the particular coordinates in the frame data set match the positional data for a potential gesture; a data resetter for resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures; a data discarder for discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and a match indicator for signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
29. A system as recited in claim 28 wherein the data discarder further comprises a timer for discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
30. A system as recited in claim 20 wherein the data set analyzer further comprises a data extractor for extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
31. A system as recited in claim 20 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
32. A system as recited in claim 20 wherein the data set creator further comprises a data set allocator for storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for a recognizable gesture.
33. A system as recited in claim 20 further comprising: a sample receiver for storing a plurality of samples of a subject gesture; a gesture data intaker for accepting a plurality of key points that fits in the subject gesture, a time value representing the time for the subject gesture to complete and a plurality of dimensions of the subject gesture; a key point locator for determining locations of key points in a model representative of the subject gesture; and a probability evaluator for calculating a probability distribution at the key points indicating the likelihood of observing a particular output.
34. A system as recited in claim 33 further including refining the model such that the plurality of samples of the subject gesture fit within the model.
35. A system as recited in claim 33 further comprising a gesture confusion evaluator for calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the subject gesture to previously stored recognizable gestures can be determined.
36. A system as recited in claim 20 further comprising a data frame processor for preprocessing the data frame such that the subject is visually displayed on a computer display monitor.
37. A system as recited in claim 36 further comprising a subject compositor for compositing the subject onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
38. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising: means for building a background model by obtaining at least one frame of an image; means for obtaining a data frame containing a subject performing a part of a subject gesture; means for analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the subject gesture; means for adding the particular coordinates to a frame data set; means for examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and means for determining whether a plurality of data frames, where a data frame is represented by the frame data set, when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
39. A system as recited in claim 38 wherein means for building a background model further includes means for determining whether there is significant activity in the background image thereby restarting the process for building the background model.
40. A system as recited in claim 38 wherein means for obtaining a data frame further includes means for separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
41. A system as recited in claim 38 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change.
42. A system as recited in claim 41 wherein means for analyzing the data frame further includes: means for determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that a sequence of previous data frames fit a gesture model; and means for using the certainty score at each key point to determine whether a subject gesture matches a recognizable gesture.
43. A system as recited in claim 38 wherein means for determining whether the plurality of the data frames conveys a subject gesture further includes means for comparing the frame data set to positional data corresponding to a dimension for a recognizable gesture.
44. A system as recited in claim 38 further including: means for obtaining a next data frame thereby determining whether the subject gesture has reached a next key point; and means for updating a status report containing data on key points reached in a dimension.
45. A system as recited in claim 44 further including means for checking the status report to determine if a subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
46. A system as recited in claim 38 further including: means for determining whether the particular coordinates in the frame data set match the positional data for a potential gesture; means for resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures; means for discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and means for signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
47. A system as recited in claim 46 further including means for discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
48. A system as recited in claim 38 wherein means for examining the particular coordinates further includes means for extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
49. A system as recited in claim 38 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
50. A system as recited in claim 38 wherein means for adding the particular coordinates to a frame data set further includes means for storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for each recognizable gesture.
51. A system as recited in claim 38 further including: means for storing a plurality of samples of a subject gesture; means for inputting a number of key points that fit in the gesture and a time value representing the time for the subject gesture to complete; means for inputting a number of dimensions of the subject gesture; means for determining locations of key points in a model representative of the subject gesture; and means for calculating a probability distribution for key points indicating the likelihood of observing a particular output.
52. A system as recited in claim 51 further including means for refining the model such that the plurality of samples of the subject gesture fit within the model.
53. A system as recited in claim 51 further including means for calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the subject gesture to previously stored recognizable gestures can be determined.
54. A system as recited in claim 38 further including means for pre-processing the data frame such that the subject is visually displayed on a computer display monitor.
55. A system as recited in claim 54 wherein the subject is composited onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
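
The match, mismatch, and timeout handling recited in claims 46 and 47 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `GestureTracker`, the thresholds `match_tol` and `severe_tol`, the `timeout` value, and the use of a single scalar coordinate per key point are all assumptions made for illustration. The sketch also simplifies by treating every incoming coordinate as a candidate observation of the next key point, and by clearing the status report as well as the data array on timeout, which the claims do not require.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GestureTracker:
    """Tracks one dimension of one recognizable gesture (illustrative only)."""
    key_points: List[float]       # positional data: expected coordinate at each key point
    match_tol: float = 0.1        # assumed: error at or below this counts as a match
    severe_tol: float = 0.5       # assumed: error above this counts as a severe mismatch
    timeout: float = 2.0          # assumed: seconds before stale data is discarded
    samples: List[float] = field(default_factory=list)  # the "data array"
    reached: int = 0              # the "status report": key points reached so far
    started_at: Optional[float] = None

    def reset(self) -> None:
        """Reset both the data array and the status report."""
        self.samples.clear()
        self.reached = 0
        self.started_at = None

    def update(self, coord: float, t: float) -> str:
        """Compare one coordinate, observed at time t, with the next key point."""
        # Claim 47: discard stored data once a predetermined time has passed.
        # Assumption: the status report is cleared as well.
        if self.started_at is not None and t - self.started_at > self.timeout:
            self.reset()

        error = abs(coord - self.key_points[self.reached])
        if error > self.severe_tol:
            # Severe mismatch: reset the data array and the status report.
            self.reset()
            return "reset"
        if error > self.match_tol:
            # Lesser mismatch: discard the data array, keep the status report.
            self.samples.clear()
            return "discard"

        # Match: record the sample and advance to the next key point.
        if self.started_at is None:
            self.started_at = t
        self.samples.append(coord)
        self.reached += 1
        if self.reached == len(self.key_points):
            # All key points reached: signal that the requirements are met.
            self.reset()
            return "recognized"
        return "progress"
```

A caller would hold one such tracker per dimension of each recognizable gesture and feed it the coordinates extracted from each data frame, acting on the first tracker that returns "recognized".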
PCT/US1998/021718 1997-10-15 1998-10-14 Method and apparatus for real-time gesture recognition WO1999019788A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU10867/99A AU1086799A (en) 1997-10-15 1998-10-14 Method and apparatus for real-time gesture recognition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/951,070 1997-10-15
US08/951,070 US6072494A (en) 1997-10-15 1997-10-15 Method and apparatus for real-time gesture recognition

Publications (1)

Publication Number Publication Date
WO1999019788A1 (en) 1999-04-22

Family

ID=25491219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/021718 WO1999019788A1 (en) 1997-10-15 1998-10-14 Method and apparatus for real-time gesture recognition

Country Status (3)

Country Link
US (2) US6072494A (en)
AU (1) AU1086799A (en)
WO (1) WO1999019788A1 (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075568A1 (en) * 2000-03-30 2001-10-11 Ideogramic Aps Method for gesture based modeling
EP1324269A1 (en) * 2000-10-06 2003-07-02 Sony Computer Entertainment Inc. Image processing apparatus, image processing method, record medium, computer program, and semiconductor device
WO2008107733A1 (en) * 2007-03-07 2008-09-12 Sony Ericsson Mobile Communications Ab Method and system for a self timer function for a camera and camera equipped mobile radio terminal
DE102008026030A1 (en) 2008-05-30 2009-12-03 Continental Automotive Gmbh Information and assistance system and a method for its control
EP2278823A3 (en) * 2009-07-20 2011-03-16 J Touch Corporation Stereo image interaction system
EP2399243A2 (en) * 2009-02-17 2011-12-28 Omek Interactive, Ltd. Method and system for gesture recognition
CN103020648A (en) * 2013-01-09 2013-04-03 北京东方艾迪普科技发展有限公司 Method and device for identifying action types, and method and device for broadcasting programs
US8639020B1 (en) 2010-06-16 2014-01-28 Intel Corporation Method and system for modeling subjects from a depth map
US8958631B2 (en) 2011-12-02 2015-02-17 Intel Corporation System and method for automatically defining and identifying a gesture
US9477303B2 (en) 2012-04-09 2016-10-25 Intel Corporation System and method for combining three-dimensional tracking with a three-dimensional display for a user interface
DE102007005027B4 (en) * 2006-03-27 2017-02-23 Volkswagen Ag Display and operating device for a motor vehicle with an interactive user interface
US9910498B2 (en) 2011-06-23 2018-03-06 Intel Corporation System and method for close-range movement tracking
EP2772812A3 (en) * 2013-02-27 2018-03-14 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with redundant system input support
EP2772811A3 (en) * 2013-02-27 2018-03-28 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with confidence-based decision support
CN110636964A (en) * 2017-05-23 2019-12-31 奥迪股份公司 Method for determining a driving instruction
EP2577483B1 (en) * 2010-05-28 2020-04-29 Microsoft Technology Licensing, LLC Cloud-based personal trait profile data
CN111125437A (en) * 2019-12-24 2020-05-08 四川新网银行股份有限公司 Method for identifying lip language picture in video
US11048333B2 (en) 2011-06-23 2021-06-29 Intel Corporation System and method for close-range movement tracking

Families Citing this family (686)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8352400B2 (en) 1991-12-23 2013-01-08 Hoffberg Steven M Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore
US6947571B1 (en) 1999-05-19 2005-09-20 Digimarc Corporation Cell phones with optical capabilities, and related applications
US6750848B1 (en) * 1998-11-09 2004-06-15 Timothy R. Pryor More useful man machine interfaces and applications
GB9722766D0 (en) 1997-10-28 1997-12-24 British Telecomm Portable computers
EP0919906B1 (en) * 1997-11-27 2005-05-25 Matsushita Electric Industrial Co., Ltd. Control method
US6514083B1 (en) * 1998-01-07 2003-02-04 Electric Planet, Inc. Method and apparatus for providing interactive karaoke entertainment
US6971882B1 (en) 1998-01-07 2005-12-06 Electric Planet, Inc. Method and apparatus for providing interactive karaoke entertainment
KR100595920B1 (en) 1998-01-26 2006-07-05 웨인 웨스터만 Method and apparatus for integrating manual input
US7808479B1 (en) 2003-09-02 2010-10-05 Apple Inc. Ambidextrous mouse
US9292111B2 (en) 1998-01-26 2016-03-22 Apple Inc. Gesturing with a multipoint sensing device
US9239673B2 (en) 1998-01-26 2016-01-19 Apple Inc. Gesturing with a multipoint sensing device
US7844914B2 (en) 2004-07-30 2010-11-30 Apple Inc. Activating virtual keys of a touch-screen virtual keyboard
US7663607B2 (en) 2004-05-06 2010-02-16 Apple Inc. Multipoint touchscreen
US7614008B2 (en) 2004-07-30 2009-11-03 Apple Inc. Operation of a computer with touch screen interface
US8479122B2 (en) 2004-07-30 2013-07-02 Apple Inc. Gestures for touch sensitive input devices
US7036094B1 (en) * 1998-08-10 2006-04-25 Cybernet Systems Corporation Behavior recognition system
US6265993B1 (en) * 1998-10-01 2001-07-24 Lucent Technologies, Inc. Furlable keyboard
US6351222B1 (en) * 1998-10-30 2002-02-26 Ati International Srl Method and apparatus for receiving an input by an entertainment device
US7966078B2 (en) 1999-02-01 2011-06-21 Steven Hoffberg Network media appliance system and method
US7760905B2 (en) 1999-06-29 2010-07-20 Digimarc Corporation Wireless mobile phone with content processing
US7261612B1 (en) 1999-08-30 2007-08-28 Digimarc Corporation Methods and systems for read-aloud books
US7406214B2 (en) 1999-05-19 2008-07-29 Digimarc Corporation Methods and devices employing optical sensors and/or steganography
US7113918B1 (en) * 1999-08-01 2006-09-26 Electric Planet, Inc. Method for video enabled electronic commerce
JP4052498B2 (en) 1999-10-29 2008-02-27 株式会社リコー Coordinate input apparatus and method
US8391851B2 (en) * 1999-11-03 2013-03-05 Digimarc Corporation Gestural techniques with wireless mobile phone devices
US7224995B2 (en) * 1999-11-03 2007-05-29 Digimarc Corporation Data entry method and system
JP2001184161A (en) 1999-12-27 2001-07-06 Ricoh Co Ltd Method and device for inputting information, writing input device, method for managing written data, method for controlling display, portable electronic writing device, and recording medium
SE0000850D0 (en) * 2000-03-13 2000-03-13 Pink Solution Ab Recognition arrangement
US7106887B2 (en) * 2000-04-13 2006-09-12 Fuji Photo Film Co., Ltd. Image processing method using conditions corresponding to an identified person
CN1367990A (en) * 2000-04-24 2002-09-04 三菱电机株式会社 Cellular phone and remote control system
FI20001429A (en) * 2000-06-15 2001-12-16 Nokia Corp Choosing an alternative
US6803906B1 (en) 2000-07-05 2004-10-12 Smart Technologies, Inc. Passive touch system and method of detecting user input
JP5042437B2 (en) * 2000-07-05 2012-10-03 スマート テクノロジーズ ユーエルシー Camera-based touch system
US6795068B1 (en) * 2000-07-21 2004-09-21 Sony Computer Entertainment Inc. Prop input device and method for mapping an object from a two-dimensional camera image to a three-dimensional space for controlling action in a game program
US7227526B2 (en) * 2000-07-24 2007-06-05 Gesturetek, Inc. Video-based image control system
WO2002015560A2 (en) * 2000-08-12 2002-02-21 Georgia Tech Research Corporation A system and method for capturing an image
US7071914B1 (en) 2000-09-01 2006-07-04 Sony Computer Entertainment Inc. User input device and method for interaction with graphic images
US7000200B1 (en) * 2000-09-15 2006-02-14 Intel Corporation Gesture recognition system recognizing gestures within a specified timing
US7058204B2 (en) * 2000-10-03 2006-06-06 Gesturetek, Inc. Multiple camera control system
US6904408B1 (en) * 2000-10-19 2005-06-07 Mccarthy John Bionet method, system and personalized web content manager responsive to browser viewers' psychological preferences, behavioral responses and physiological stress indicators
US7095401B2 (en) * 2000-11-02 2006-08-22 Siemens Corporate Research, Inc. System and method for gesture interface
US6894714B2 (en) 2000-12-05 2005-05-17 Koninklijke Philips Electronics N.V. Method and apparatus for predicting events in video conferencing and other applications
US7030861B1 (en) 2001-02-10 2006-04-18 Wayne Carl Westerman System and method for packing multi-touch gestures onto a hand
US7412412B2 (en) * 2001-02-12 2008-08-12 Avotus Inc. Network reverse auction and spending analysis methods
US6804396B2 (en) 2001-03-28 2004-10-12 Honda Giken Kogyo Kabushiki Kaisha Gesture recognition system
US7274800B2 (en) * 2001-07-18 2007-09-25 Intel Corporation Dynamic gesture recognition from stereo sequences
US7284201B2 (en) * 2001-09-20 2007-10-16 Koninklijke Philips Electronics N.V. User attention-based adaptation of quality level to improve the management of real-time multi-media content delivery and distribution
US7545949B2 (en) * 2004-06-09 2009-06-09 Cognex Technology And Investment Corporation Method for setting parameters of a vision detector using production line information
US9092841B2 (en) 2004-06-09 2015-07-28 Cognex Technology And Investment Llc Method and apparatus for visual detection and inspection of objects
US20050226490A1 (en) * 2002-01-29 2005-10-13 Phillips Brian S Method and apparatus for improved vision detector image capture and analysis
US6990639B2 (en) 2002-02-07 2006-01-24 Microsoft Corporation System and process for controlling electronic components in a ubiquitous computing environment using multimodal integration
US6938222B2 (en) * 2002-02-08 2005-08-30 Microsoft Corporation Ink gestures
TW554293B (en) * 2002-03-29 2003-09-21 Ind Tech Res Inst Method for extracting and matching hand gesture features of image
US7366645B2 (en) * 2002-05-06 2008-04-29 Jezekiel Ben-Arie Method of recognition of human motion, vector sequences and speech
US7209883B2 (en) * 2002-05-09 2007-04-24 Intel Corporation Factorial hidden markov model for audiovisual speech recognition
US20030212552A1 (en) * 2002-05-09 2003-11-13 Liang Lu Hong Face recognition procedure useful for audiovisual speech recognition
US7165029B2 (en) 2002-05-09 2007-01-16 Intel Corporation Coupled hidden Markov model for audiovisual speech recognition
US20040001144A1 (en) 2002-06-27 2004-01-01 Mccharles Randy Synchronization of camera images in camera-based touch system to enhance position determination of fast moving objects
US7089185B2 (en) * 2002-06-27 2006-08-08 Intel Corporation Embedded multi-layer coupled hidden Markov model
US7656393B2 (en) 2005-03-04 2010-02-02 Apple Inc. Electronic device having display and surrounding touch sensitive bezel for user interface and control
US11275405B2 (en) 2005-03-04 2022-03-15 Apple Inc. Multi-functional hand-held device
US7161579B2 (en) 2002-07-18 2007-01-09 Sony Computer Entertainment Inc. Hand-held computer interactive device
US7623115B2 (en) 2002-07-27 2009-11-24 Sony Computer Entertainment Inc. Method and apparatus for light input device
US7646372B2 (en) * 2003-09-15 2010-01-12 Sony Computer Entertainment Inc. Methods and systems for enabling direction detection when interfacing with a computer program
US7102615B2 (en) * 2002-07-27 2006-09-05 Sony Computer Entertainment Inc. Man-machine interface using a deformable device
US8797260B2 (en) 2002-07-27 2014-08-05 Sony Computer Entertainment Inc. Inertially trackable hand-held controller
US9393487B2 (en) 2002-07-27 2016-07-19 Sony Interactive Entertainment Inc. Method for mapping movements of a hand-held controller to game commands
US7760248B2 (en) 2002-07-27 2010-07-20 Sony Computer Entertainment Inc. Selective sound source listening in conjunction with computer interactive processing
US9474968B2 (en) 2002-07-27 2016-10-25 Sony Interactive Entertainment America Llc Method and system for applying gearing effects to visual tracking
US8313380B2 (en) 2002-07-27 2012-11-20 Sony Computer Entertainment America Llc Scheme for translating movements of a hand-held controller into inputs for a system
US7627139B2 (en) * 2002-07-27 2009-12-01 Sony Computer Entertainment Inc. Computer image and audio processing of intensity and input devices for interfacing with a computer program
US8686939B2 (en) 2002-07-27 2014-04-01 Sony Computer Entertainment Inc. System, method, and apparatus for three-dimensional input control
US8570378B2 (en) 2002-07-27 2013-10-29 Sony Computer Entertainment Inc. Method and apparatus for tracking three-dimensional movements of an object using a depth sensing camera
US9682319B2 (en) 2002-07-31 2017-06-20 Sony Interactive Entertainment Inc. Combiner method for altering game gearing
US7200266B2 (en) * 2002-08-27 2007-04-03 Princeton University Method and apparatus for automated video activity analysis
US7171043B2 (en) 2002-10-11 2007-01-30 Intel Corporation Image recognition using hidden markov models and coupled hidden markov models
US8494859B2 (en) * 2002-10-15 2013-07-23 Gh, Llc Universal processing system and methods for production of outputs accessible by people with disabilities
US6954197B2 (en) * 2002-11-15 2005-10-11 Smart Technologies Inc. Size/scale and orientation determination of a pointer in a camera-based touch system
US7472063B2 (en) * 2002-12-19 2008-12-30 Intel Corporation Audio-visual feature fusion and support vector machine useful for continuous speech recognition
US7203368B2 (en) 2003-01-06 2007-04-10 Intel Corporation Embedded bayesian network for pattern recognition
US7224830B2 (en) * 2003-02-04 2007-05-29 Intel Corporation Gesture detection from digital video images
US9177387B2 (en) 2003-02-11 2015-11-03 Sony Computer Entertainment Inc. Method and apparatus for real time motion capture
US7629967B2 (en) 2003-02-14 2009-12-08 Next Holdings Limited Touch screen signal processing
US8456447B2 (en) 2003-02-14 2013-06-04 Next Holdings Limited Touch screen signal processing
US8508508B2 (en) 2003-02-14 2013-08-13 Next Holdings Limited Touch screen signal processing with single-point calibration
US7532206B2 (en) * 2003-03-11 2009-05-12 Smart Technologies Ulc System and method for differentiating between pointers used to contact touch surface
US7665041B2 (en) 2003-03-25 2010-02-16 Microsoft Corporation Architecture for controlling a computer using hand gestures
US8745541B2 (en) 2003-03-25 2014-06-03 Microsoft Corporation Architecture for controlling a computer using hand gestures
US20040196400A1 (en) * 2003-04-07 2004-10-07 Stavely Donald J. Digital camera user interface using hand gestures
US7256772B2 (en) 2003-04-08 2007-08-14 Smart Technologies, Inc. Auto-aligning touch system and method
EP1617374A4 (en) * 2003-04-11 2008-08-13 Nat Inst Inf & Comm Tech Image recognizing device and image recognizing program
US8072470B2 (en) * 2003-05-29 2011-12-06 Sony Computer Entertainment Inc. System and method for providing a real-time three-dimensional interactive environment
JP3752246B2 (en) * 2003-08-11 2006-03-08 学校法人慶應義塾 Hand pattern switch device
US7874917B2 (en) 2003-09-15 2011-01-25 Sony Computer Entertainment Inc. Methods and systems for enabling depth and direction detection when interfacing with a computer program
US8323106B2 (en) 2008-05-30 2012-12-04 Sony Computer Entertainment America Llc Determination of controller three-dimensional location using image analysis and ultrasonic communication
US10279254B2 (en) 2005-10-26 2019-05-07 Sony Interactive Entertainment Inc. Controller having visually trackable object for interfacing with a gaming system
US7411575B2 (en) 2003-09-16 2008-08-12 Smart Technologies Ulc Gesture recognition method and touch system incorporating the same
US7607097B2 (en) * 2003-09-25 2009-10-20 International Business Machines Corporation Translating emotion to braille, emoticons and other special symbols
CN1860429A (en) * 2003-09-30 2006-11-08 皇家飞利浦电子股份有限公司 Gesture to define location, size, and/or content of content window on a display
US7274356B2 (en) 2003-10-09 2007-09-25 Smart Technologies Inc. Apparatus for determining the location of a pointer within a region of interest
US8133115B2 (en) 2003-10-22 2012-03-13 Sony Computer Entertainment America Llc System and method for recording and displaying a graphical path in a video game
GB2424723B (en) * 2003-11-13 2007-09-19 Japan Science & Tech Agency Method for driving robot
US20050131744A1 (en) * 2003-12-10 2005-06-16 International Business Machines Corporation Apparatus, system and method of automatically identifying participants at a videoconference who exhibit a particular expression
US20050131697A1 (en) * 2003-12-10 2005-06-16 International Business Machines Corporation Speech improving apparatus, system and method
US7355593B2 (en) 2004-01-02 2008-04-08 Smart Technologies, Inc. Pointer tracking across multiple overlapping coordinate input sub-regions defining a generally contiguous input region
US7663689B2 (en) * 2004-01-16 2010-02-16 Sony Computer Entertainment Inc. Method and apparatus for optimizing capture device settings through depth information
US7707039B2 (en) 2004-02-15 2010-04-27 Exbiblio B.V. Automatic modification of web pages
US8442331B2 (en) 2004-02-15 2013-05-14 Google Inc. Capturing text from rendered documents using supplemental information
FI117308B (en) * 2004-02-06 2006-08-31 Nokia Corp gesture Control
US7812860B2 (en) 2004-04-01 2010-10-12 Exbiblio B.V. Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device
US10635723B2 (en) 2004-02-15 2020-04-28 Google Llc Search engines and systems with handheld document data capture devices
US7232986B2 (en) * 2004-02-17 2007-06-19 Smart Technologies Inc. Apparatus for detecting a pointer within a region of interest
JP2005269605A (en) * 2004-02-20 2005-09-29 Fuji Photo Film Co Ltd Digital picture book system, and picture book retrieving method and program therefor
JP2005242694A (en) * 2004-02-26 2005-09-08 Mitsubishi Fuso Truck & Bus Corp Hand pattern switching apparatus
US8081849B2 (en) 2004-12-03 2011-12-20 Google Inc. Portable scanning and memory device
US7894670B2 (en) 2004-04-01 2011-02-22 Exbiblio B.V. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US20060081714A1 (en) 2004-08-23 2006-04-20 King Martin T Portable scanning device
US9116890B2 (en) 2004-04-01 2015-08-25 Google Inc. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US8146156B2 (en) 2004-04-01 2012-03-27 Google Inc. Archive of text captures from rendered documents
US7990556B2 (en) 2004-12-03 2011-08-02 Google Inc. Association of a portable scanner with input/output and storage devices
US9143638B2 (en) 2004-04-01 2015-09-22 Google Inc. Data capture from rendered documents using handheld device
US9008447B2 (en) 2004-04-01 2015-04-14 Google Inc. Method and system for character recognition
US20060098900A1 (en) 2004-09-27 2006-05-11 King Martin T Secure data gathering from rendered documents
US8713418B2 (en) 2004-04-12 2014-04-29 Google Inc. Adding value to a rendered document
US8345918B2 (en) * 2004-04-14 2013-01-01 L-3 Communications Corporation Active subject privacy imaging
US8489624B2 (en) 2004-05-17 2013-07-16 Google, Inc. Processing techniques for text capture from a rendered document
US8874504B2 (en) 2004-12-03 2014-10-28 Google Inc. Processing techniques for visual capture data from a rendered document
US8620083B2 (en) 2004-12-03 2013-12-31 Google Inc. Method and system for character recognition
US7460110B2 (en) 2004-04-29 2008-12-02 Smart Technologies Ulc Dual mode touch system
US7492357B2 (en) 2004-05-05 2009-02-17 Smart Technologies Ulc Apparatus and method for detecting a pointer relative to a touch surface
US7538759B2 (en) 2004-05-07 2009-05-26 Next Holdings Limited Touch panel display system with illumination and detection provided from a single edge
US8120596B2 (en) 2004-05-21 2012-02-21 Smart Technologies Ulc Tiled touch system
US20050276445A1 (en) * 2004-06-09 2005-12-15 Silver William M Method and apparatus for automatic visual detection, recording, and retrieval of events
US8243986B2 (en) * 2004-06-09 2012-08-14 Cognex Technology And Investment Corporation Method and apparatus for automatic visual event detection
US8891852B2 (en) * 2004-06-09 2014-11-18 Cognex Technology And Investment Corporation Method and apparatus for configuring and testing a machine vision detector
US8127247B2 (en) * 2004-06-09 2012-02-28 Cognex Corporation Human-machine-interface and method for manipulating data in a machine vision system
US7788606B2 (en) * 2004-06-14 2010-08-31 Sas Institute Inc. Computer-implemented system and method for defining graphics primitives
US8466893B2 (en) * 2004-06-17 2013-06-18 Adrea, LLC Use of a two finger input on touch screens
US8346620B2 (en) 2004-07-19 2013-01-01 Google Inc. Automatic modification of web pages
US7653883B2 (en) * 2004-07-30 2010-01-26 Apple Inc. Proximity detector in handheld device
US8381135B2 (en) 2004-07-30 2013-02-19 Apple Inc. Proximity detector in handheld device
CN100555200C (en) 2004-08-16 2009-10-28 苹果公司 The method of the spatial resolution of touch sensitive devices and raising touch sensitive devices
JP4433948B2 (en) * 2004-09-02 2010-03-17 株式会社セガ Background image acquisition program, video game apparatus, background image acquisition method, and computer-readable recording medium recording the program
JP4419768B2 (en) * 2004-09-21 2010-02-24 日本ビクター株式会社 Control device for electronic equipment
US20060072009A1 (en) * 2004-10-01 2006-04-06 International Business Machines Corporation Flexible interaction-based computer interfacing using visible artifacts
US20060071933A1 (en) 2004-10-06 2006-04-06 Sony Computer Entertainment Inc. Application binary interface for multi-pass shaders
GB2418974B (en) * 2004-10-07 2009-03-25 Hewlett Packard Development Co Machine-human interface
JP2006133937A (en) * 2004-11-04 2006-05-25 Fuji Xerox Co Ltd Behavior identifying device
US7583819B2 (en) * 2004-11-05 2009-09-01 Kyprianos Papademetriou Digital signal processing methods, systems and computer program products that identify threshold positions and values
US7636449B2 (en) * 2004-11-12 2009-12-22 Cognex Technology And Investment Corporation System and method for assigning analysis parameters to vision detector using a graphical interface
US7386150B2 (en) * 2004-11-12 2008-06-10 Safeview, Inc. Active subject imaging with body identification
US7720315B2 (en) * 2004-11-12 2010-05-18 Cognex Technology And Investment Corporation System and method for displaying and using non-numeric graphic elements to control and monitor a vision system
US9292187B2 (en) 2004-11-12 2016-03-22 Cognex Corporation System, method and graphical user interface for displaying and controlling vision system operating parameters
CN100345085C (en) * 2004-12-30 2007-10-24 中国科学院自动化研究所 Method for controlling electronic game scene and role based on poses and voices of player
US7598942B2 (en) * 2005-02-08 2009-10-06 Oblong Industries, Inc. System and method for gesture based control system
US7664571B2 (en) * 2005-04-18 2010-02-16 Honda Motor Co., Ltd. Controlling a robot using pose
US20060260624A1 (en) * 2005-05-17 2006-11-23 Battelle Memorial Institute Method, program, and system for automatic profiling of entities
US7636126B2 (en) 2005-06-22 2009-12-22 Sony Computer Entertainment Inc. Delay matching in audio/video systems
US7774713B2 (en) 2005-06-28 2010-08-10 Microsoft Corporation Dynamic user experience with semantic rich objects
JP5091857B2 (en) * 2005-06-30 2012-12-05 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ System control method
US20070057912A1 (en) * 2005-09-14 2007-03-15 Romriell Joseph N Method and system for controlling an interface of a device through motion gestures
US7697827B2 (en) 2005-10-17 2010-04-13 Konicek Jeffrey C User-friendlier interfaces for a camera
US7599520B2 (en) * 2005-11-18 2009-10-06 Accenture Global Services Gmbh Detection of multiple targets on a plane of interest
US8209620B2 (en) 2006-01-31 2012-06-26 Accenture Global Services Limited System for storage and navigation of application states and interactions
US20070130547A1 (en) * 2005-12-01 2007-06-07 Navisense, Llc Method and system for touchless user interface control
US8593502B2 (en) * 2006-01-26 2013-11-26 Polycom, Inc. Controlling videoconference with touch screen interface
US8872879B2 (en) * 2006-01-26 2014-10-28 Polycom, Inc. System and method for controlling videoconference with touch screen interface
US8537111B2 (en) 2006-02-08 2013-09-17 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US9823747B2 (en) 2006-02-08 2017-11-21 Oblong Industries, Inc. Spatial, multi-modal control device for use with spatial operating system
US8370383B2 (en) 2006-02-08 2013-02-05 Oblong Industries, Inc. Multi-process interactive systems and methods
US9910497B2 (en) * 2006-02-08 2018-03-06 Oblong Industries, Inc. Gestural control of autonomous and semi-autonomous systems
US9075441B2 (en) * 2006-02-08 2015-07-07 Oblong Industries, Inc. Gesture based control using three-dimensional information extracted over an extended depth of field
US8537112B2 (en) * 2006-02-08 2013-09-17 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US8531396B2 (en) 2006-02-08 2013-09-10 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US8578282B2 (en) * 2006-03-15 2013-11-05 Navisense Visual toolkit for a virtual user interface
US7538760B2 (en) 2006-03-30 2009-05-26 Apple Inc. Force imaging input device and system
US7978181B2 (en) 2006-04-25 2011-07-12 Apple Inc. Keystroke tactility arrangement on a smooth touch surface
US8279180B2 (en) 2006-05-02 2012-10-02 Apple Inc. Multipoint touch surface controller
US7965859B2 (en) 2006-05-04 2011-06-21 Sony Computer Entertainment Inc. Lighting control of a user environment via a display device
US7880746B2 (en) 2006-05-04 2011-02-01 Sony Computer Entertainment Inc. Bandwidth management through lighting control of a user environment via a display device
WO2007132451A2 (en) * 2006-05-11 2007-11-22 Prime Sense Ltd. Modeling of humanoid forms from depth maps
CN104965621B (en) 2006-06-09 2018-06-12 苹果公司 Touch screen LCD and its operating method
US8259078B2 (en) 2006-06-09 2012-09-04 Apple Inc. Touch screen liquid crystal display
KR102125605B1 (en) 2006-06-09 2020-06-22 애플 인크. Touch screen liquid crystal display
WO2008014826A1 (en) * 2006-08-03 2008-02-07 Alterface S.A. Method and device for identifying and extracting images of multiple users, and for recognizing user gestures
JP4267648B2 (en) * 2006-08-25 2009-05-27 株式会社東芝 Interface device and method thereof
US7725547B2 (en) * 2006-09-06 2010-05-25 International Business Machines Corporation Informing a user of gestures made by others out of the user's line of sight
EP2067119A2 (en) 2006-09-08 2009-06-10 Exbiblio B.V. Optical scanners, such as hand-held optical scanners
US8781151B2 (en) 2006-09-28 2014-07-15 Sony Computer Entertainment Inc. Object detection using video input combined with tilt angle information
USRE48417E1 (en) 2006-09-28 2021-02-02 Sony Interactive Entertainment Inc. Object direction using video input combined with tilt angle information
US8310656B2 (en) 2006-09-28 2012-11-13 Sony Computer Entertainment America Llc Mapping movements of a hand-held controller to the two-dimensional image plane of a display screen
US8005257B2 (en) * 2006-10-05 2011-08-23 The United States Of America As Represented By The Secretary Of The Navy Gesture recognition apparatus and method
GB2452644B (en) * 2006-10-10 2009-09-16 Promethean Ltd Automatic tool docking
US9442607B2 (en) 2006-12-04 2016-09-13 Smart Technologies Inc. Interactive input system and method
US8493330B2 (en) 2007-01-03 2013-07-23 Apple Inc. Individual channel phase delay scheme
US9710095B2 (en) 2007-01-05 2017-07-18 Apple Inc. Touch screen stack-ups
US7840031B2 (en) * 2007-01-12 2010-11-23 International Business Machines Corporation Tracking a range of body movement based on 3D captured image streams of a user
US8269834B2 (en) 2007-01-12 2012-09-18 International Business Machines Corporation Warning a user about adverse behaviors of others within an environment based on a 3D captured image stream
US7877706B2 (en) * 2007-01-12 2011-01-25 International Business Machines Corporation Controlling a document based on user behavioral signals detected from a 3D captured image stream
US8295542B2 (en) * 2007-01-12 2012-10-23 International Business Machines Corporation Adjusting a consumer experience based on a 3D captured image stream of a consumer response
US7971156B2 (en) * 2007-01-12 2011-06-28 International Business Machines Corporation Controlling resource access based on user gesturing in a 3D captured image stream of the user
US8588464B2 (en) * 2007-01-12 2013-11-19 International Business Machines Corporation Assisting a vision-impaired user with navigation based on a 3D captured image stream
US7792328B2 (en) * 2007-01-12 2010-09-07 International Business Machines Corporation Warning a vehicle operator of unsafe operation behavior based on a 3D captured image stream
US7801332B2 (en) * 2007-01-12 2010-09-21 International Business Machines Corporation Controlling a system based on user behavioral signals detected from a 3D captured image stream
US7770136B2 (en) 2007-01-24 2010-08-03 Microsoft Corporation Gesture recognition interactive feedback
US8060841B2 (en) * 2007-03-19 2011-11-15 Navisense Method and device for touchless media searching
US8005238B2 (en) 2007-03-22 2011-08-23 Microsoft Corporation Robust adaptive beamforming with enhanced noise suppression
EP2135155B1 (en) 2007-04-11 2013-09-18 Next Holdings, Inc. Touch screen system with hover and click input methods
AU2007351713B2 (en) 2007-04-20 2011-11-17 Softkinetic Software Volume recognition method and system
WO2008134452A2 (en) * 2007-04-24 2008-11-06 Oblong Industries, Inc. Proteins, pools, and slawx in processing environments
US8005237B2 (en) 2007-05-17 2011-08-23 Microsoft Corp. Sensor array beamformer post-processor
US8237099B2 (en) * 2007-06-15 2012-08-07 Cognex Corporation Method and system for optoelectronic detection and location of objects
US8103109B2 (en) * 2007-06-19 2012-01-24 Microsoft Corporation Recognizing hand poses and/or object classes
US8302033B2 (en) 2007-06-22 2012-10-30 Apple Inc. Touch screen device, method, and graphical user interface for providing maps, directions, and location-based information
US8094137B2 (en) 2007-07-23 2012-01-10 Smart Technologies Ulc System and method of detecting contact on a display
KR20100072198A (en) 2007-08-19 2010-06-30 Ringbow Ltd. Finger-worn device and related methods of use
US8432377B2 (en) 2007-08-30 2013-04-30 Next Holdings Limited Optical touchscreen with improved illumination
US8384693B2 (en) 2007-08-30 2013-02-26 Next Holdings Limited Low profile touch panel systems
US8482613B2 (en) * 2007-09-10 2013-07-09 John Kempf Apparatus and method for photographing birds
JP5430572B2 (en) * 2007-09-14 2014-03-05 Intellectual Ventures Holding 67 Llc Gesture-based user interaction processing
US8144780B2 (en) * 2007-09-24 2012-03-27 Microsoft Corporation Detecting visual gestural patterns
US8103085B1 (en) 2007-09-25 2012-01-24 Cognex Corporation System and method for detecting flaws in objects using machine vision
US8629976B2 (en) * 2007-10-02 2014-01-14 Microsoft Corporation Methods and systems for hierarchical de-aliasing time-of-flight (TOF) systems
US8094090B2 (en) * 2007-10-19 2012-01-10 Southwest Research Institute Real-time self-visualization system
US8159682B2 (en) 2007-11-12 2012-04-17 Intellectual Ventures Holding 67 Llc Lens system
KR101079598B1 (en) * 2007-12-18 2011-11-03 Samsung Electronics Co., Ltd. Display apparatus and control method thereof
US20090166684A1 (en) * 2007-12-26 2009-07-02 3Dv Systems Ltd. Photogate cmos pixel for 3d cameras having reduced intra-pixel cross talk
US8149210B2 (en) * 2007-12-31 2012-04-03 Microsoft International Holdings B.V. Pointing device and method
US8327272B2 (en) 2008-01-06 2012-12-04 Apple Inc. Portable multifunction device, method, and graphical user interface for viewing and managing electronic calendars
US8405636B2 (en) 2008-01-07 2013-03-26 Next Holdings Limited Optical position sensing system and optical position sensor assembly
US8166421B2 (en) * 2008-01-14 2012-04-24 Primesense Ltd. Three-dimensional user interface
US8933876B2 (en) 2010-12-13 2015-01-13 Apple Inc. Three dimensional user interface session control
US9035876B2 (en) 2008-01-14 2015-05-19 Apple Inc. Three-dimensional user interface session control
US8259163B2 (en) 2008-03-07 2012-09-04 Intellectual Ventures Holding 67 Llc Display with built in 3D sensing
US10642364B2 (en) 2009-04-02 2020-05-05 Oblong Industries, Inc. Processing tracking and recognition data in gestural recognition systems
US9952673B2 (en) 2009-04-02 2018-04-24 Oblong Industries, Inc. Operating environment comprising multiple client devices, multiple displays, multiple users, and gestural control
US9495013B2 (en) 2008-04-24 2016-11-15 Oblong Industries, Inc. Multi-modal gestural interface
US9740922B2 (en) 2008-04-24 2017-08-22 Oblong Industries, Inc. Adaptive tracking system for spatial input devices
US8723795B2 (en) 2008-04-24 2014-05-13 Oblong Industries, Inc. Detecting, representing, and interpreting three-space input: gestural continuum subsuming freespace, proximal, and surface-contact modes
US9684380B2 (en) 2009-04-02 2017-06-20 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US9740293B2 (en) 2009-04-02 2017-08-22 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US8902193B2 (en) 2008-05-09 2014-12-02 Smart Technologies Ulc Interactive input system and bezel therefor
US20090298020A1 (en) * 2008-06-03 2009-12-03 United Parcel Service Of America, Inc. Systems and methods for improving user efficiency with handheld devices
US9343034B2 (en) * 2008-06-07 2016-05-17 Nokia Technologies Oy User interface, device and method for displaying a stable screen view
KR101652535B1 (en) * 2008-06-18 2016-08-30 Oblong Industries, Inc. Gesture-based control system for vehicle interfaces
US8385557B2 (en) 2008-06-19 2013-02-26 Microsoft Corporation Multichannel acoustic echo reduction
US9513705B2 (en) 2008-06-19 2016-12-06 Tactile Displays, Llc Interactive display with tactile feedback
US8217908B2 (en) 2008-06-19 2012-07-10 Tactile Displays, Llc Apparatus and method for interactive display with tactile feedback
US8665228B2 (en) 2008-06-19 2014-03-04 Tactile Displays, Llc Energy efficient interactive display with energy regenerative keyboard
US8115745B2 (en) 2008-06-19 2012-02-14 Tactile Displays, Llc Apparatus and method for interactive display with tactile feedback
US8514251B2 (en) * 2008-06-23 2013-08-20 Qualcomm Incorporated Enhanced character input using recognized gestures
US8325909B2 (en) 2008-06-25 2012-12-04 Microsoft Corporation Acoustic echo suppression
US8203699B2 (en) 2008-06-30 2012-06-19 Microsoft Corporation System architecture design for time-of-flight system having reduced differential pixel size, and time-of-flight systems so designed
WO2010011923A1 (en) * 2008-07-24 2010-01-28 Gesturetek, Inc. Enhanced detection of circular engagement gesture
US8339378B2 (en) 2008-11-05 2012-12-25 Smart Technologies Ulc Interactive input system with multi-angle reflector
US8379987B2 (en) * 2008-12-30 2013-02-19 Nokia Corporation Method, apparatus and computer program product for providing hand segmentation for gesture analysis
US8681321B2 (en) * 2009-01-04 2014-03-25 Microsoft International Holdings B.V. Gated 3D camera
US9069385B1 (en) * 2009-01-08 2015-06-30 Sprint Communications Company L.P. Communicating physical gestures as compressed data streams
US20120202569A1 (en) * 2009-01-13 2012-08-09 Primesense Ltd. Three-Dimensional User Interface for Game Applications
US8704767B2 (en) * 2009-01-29 2014-04-22 Microsoft Corporation Environmental gesture recognition
US20100199228A1 (en) * 2009-01-30 2010-08-05 Microsoft Corporation Gesture Keyboarding
US20100199231A1 (en) * 2009-01-30 2010-08-05 Microsoft Corporation Predictive determination
US8577084B2 (en) * 2009-01-30 2013-11-05 Microsoft Corporation Visual target tracking
US8577085B2 (en) * 2009-01-30 2013-11-05 Microsoft Corporation Visual target tracking
US8565477B2 (en) 2009-01-30 2013-10-22 Microsoft Corporation Visual target tracking
US8565476B2 (en) 2009-01-30 2013-10-22 Microsoft Corporation Visual target tracking
US8448094B2 (en) * 2009-01-30 2013-05-21 Microsoft Corporation Mapping a natural input device to a legacy system
US7996793B2 (en) 2009-01-30 2011-08-09 Microsoft Corporation Gesture recognizer system architecture
US8295546B2 (en) * 2009-01-30 2012-10-23 Microsoft Corporation Pose tracking pipeline
US8267781B2 (en) 2009-01-30 2012-09-18 Microsoft Corporation Visual target tracking
US8866821B2 (en) 2009-01-30 2014-10-21 Microsoft Corporation Depth map movement tracking via optical flow and velocity prediction
US8588465B2 (en) * 2009-01-30 2013-11-19 Microsoft Corporation Visual target tracking
US8487938B2 (en) * 2009-01-30 2013-07-16 Microsoft Corporation Standard Gestures
US8294767B2 (en) 2009-01-30 2012-10-23 Microsoft Corporation Body scan
US9652030B2 (en) * 2009-01-30 2017-05-16 Microsoft Technology Licensing, Llc Navigation of a virtual plane using a zone of restriction for canceling noise
US8682028B2 (en) * 2009-01-30 2014-03-25 Microsoft Corporation Visual target tracking
TWI395145B (en) 2009-02-02 2013-05-01 Ind Tech Res Inst Hand gesture recognition system and method
DE202010018601U1 (en) 2009-02-18 2018-04-30 Google LLC (n.d.Ges.d. Staates Delaware) Automatically collecting information, such as gathering information using a document recognizing device
US8447066B2 (en) 2009-03-12 2013-05-21 Google Inc. Performing actions based on capturing information from rendered documents, such as documents under copyright
WO2010105245A2 (en) 2009-03-12 2010-09-16 Exbiblio B.V. Automatically providing content associated with captured information, such as information captured in real-time
US20100235786A1 (en) * 2009-03-13 2010-09-16 Primesense Ltd. Enhanced 3d interfacing for remote devices
US8773355B2 (en) 2009-03-16 2014-07-08 Microsoft Corporation Adaptive cursor sizing
US8988437B2 (en) 2009-03-20 2015-03-24 Microsoft Technology Licensing, Llc Chaining animations
US9256282B2 (en) 2009-03-20 2016-02-09 Microsoft Technology Licensing, Llc Virtual object manipulation
US9313376B1 (en) 2009-04-01 2016-04-12 Microsoft Technology Licensing, Llc Dynamic depth power equalization
US20100257462A1 (en) 2009-04-01 2010-10-07 Avaya Inc Interpretation of gestures to provide visual queues
US9317128B2 (en) 2009-04-02 2016-04-19 Oblong Industries, Inc. Remote devices used in a markerless installation of a spatial operating environment incorporating gestural control
US10824238B2 (en) 2009-04-02 2020-11-03 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US20100277470A1 (en) * 2009-05-01 2010-11-04 Microsoft Corporation Systems And Methods For Applying Model Tracking To Motion Capture
US9377857B2 (en) 2009-05-01 2016-06-28 Microsoft Technology Licensing, Llc Show body position
US8660303B2 (en) * 2009-05-01 2014-02-25 Microsoft Corporation Detection of body and props
US8638985B2 (en) 2009-05-01 2014-01-28 Microsoft Corporation Human body pose estimation
US8181123B2 (en) 2009-05-01 2012-05-15 Microsoft Corporation Managing virtual port associations to users in a gesture-based computing environment
US8942428B2 (en) * 2009-05-01 2015-01-27 Microsoft Corporation Isolate extraneous motions
US9898675B2 (en) * 2009-05-01 2018-02-20 Microsoft Technology Licensing, Llc User movement tracking feedback to improve tracking
US8340432B2 (en) 2009-05-01 2012-12-25 Microsoft Corporation Systems and methods for detecting a tilt angle from a depth image
US8503720B2 (en) 2009-05-01 2013-08-06 Microsoft Corporation Human body pose estimation
US8649554B2 (en) 2009-05-01 2014-02-11 Microsoft Corporation Method to control perspective for a camera-controlled computer
US8253746B2 (en) 2009-05-01 2012-08-28 Microsoft Corporation Determine intended motions
US9015638B2 (en) 2009-05-01 2015-04-21 Microsoft Technology Licensing, Llc Binding users to a gesture based system and providing feedback to the users
US9498718B2 (en) 2009-05-01 2016-11-22 Microsoft Technology Licensing, Llc Altering a view perspective within a display environment
US20100295771A1 (en) * 2009-05-20 2010-11-25 Microsoft Corporation Control of display objects
US20100295782A1 (en) 2009-05-21 2010-11-25 Yehuda Binder System and method for control based on face ore hand gesture detection
US9417700B2 (en) 2009-05-21 2016-08-16 Edge3 Technologies Gesture recognition systems and related methods
US8009022B2 (en) 2009-05-29 2011-08-30 Microsoft Corporation Systems and methods for immersive interaction with virtual objects
US20100306685A1 (en) * 2009-05-29 2010-12-02 Microsoft Corporation User movement feedback via on-screen avatars
US9400559B2 (en) * 2009-05-29 2016-07-26 Microsoft Technology Licensing, Llc Gesture shortcuts
US8542252B2 (en) * 2009-05-29 2013-09-24 Microsoft Corporation Target digitization, extraction, and tracking
US9383823B2 (en) 2009-05-29 2016-07-05 Microsoft Technology Licensing, Llc Combining gestures beyond skeletal
US8856691B2 (en) 2009-05-29 2014-10-07 Microsoft Corporation Gesture tool
US8379101B2 (en) 2009-05-29 2013-02-19 Microsoft Corporation Environment and/or target segmentation
US8145594B2 (en) * 2009-05-29 2012-03-27 Microsoft Corporation Localized gesture aggregation
US8625837B2 (en) 2009-05-29 2014-01-07 Microsoft Corporation Protocol and format for communicating an image from a camera to a computing environment
US8693724B2 (en) 2009-05-29 2014-04-08 Microsoft Corporation Method and system implementing user-centric gesture control
US8744121B2 (en) 2009-05-29 2014-06-03 Microsoft Corporation Device for identifying and tracking multiple humans over time
US8803889B2 (en) 2009-05-29 2014-08-12 Microsoft Corporation Systems and methods for applying animations or motions to a character
US8176442B2 (en) * 2009-05-29 2012-05-08 Microsoft Corporation Living cursor control mechanics
US20100306716A1 (en) * 2009-05-29 2010-12-02 Microsoft Corporation Extending standard gestures
US9182814B2 (en) 2009-05-29 2015-11-10 Microsoft Technology Licensing, Llc Systems and methods for estimating a non-visible or occluded body part
US8418085B2 (en) 2009-05-29 2013-04-09 Microsoft Corporation Gesture coach
US8509479B2 (en) * 2009-05-29 2013-08-13 Microsoft Corporation Virtual object
US8320619B2 (en) 2009-05-29 2012-11-27 Microsoft Corporation Systems and methods for tracking a model
US8487871B2 (en) 2009-06-01 2013-07-16 Microsoft Corporation Virtual desktop coordinate transformation
US8390680B2 (en) 2009-07-09 2013-03-05 Microsoft Corporation Visual representation expression based on player expression
US8692768B2 (en) 2009-07-10 2014-04-08 Smart Technologies Ulc Interactive input system
US9159151B2 (en) 2009-07-13 2015-10-13 Microsoft Technology Licensing, Llc Bringing a visual representation to life via learned input from the user
US20110025689A1 (en) * 2009-07-29 2011-02-03 Microsoft Corporation Auto-Generating A Visual Representation
US9563350B2 (en) * 2009-08-11 2017-02-07 Lg Electronics Inc. Mobile terminal and method for controlling the same
US8565479B2 (en) * 2009-08-13 2013-10-22 Primesense Ltd. Extraction of skeletons from 3D maps
US8654524B2 (en) 2009-08-17 2014-02-18 Apple Inc. Housing as an I/O device
US8264536B2 (en) * 2009-08-25 2012-09-11 Microsoft Corporation Depth-sensitive imaging via polarization-state mapping
US9141193B2 (en) 2009-08-31 2015-09-22 Microsoft Technology Licensing, Llc Techniques for using human gestures to control gesture unaware programs
US8508919B2 (en) * 2009-09-14 2013-08-13 Microsoft Corporation Separation of electrical and optical components
US8330134B2 (en) 2009-09-14 2012-12-11 Microsoft Corporation Optical fault monitoring
US8428340B2 (en) * 2009-09-21 2013-04-23 Microsoft Corporation Screen space plane identification
US8976986B2 (en) * 2009-09-21 2015-03-10 Microsoft Technology Licensing, Llc Volume adjustment based on listener position
US8760571B2 (en) * 2009-09-21 2014-06-24 Microsoft Corporation Alignment of lens and image sensor
US9014546B2 (en) 2009-09-23 2015-04-21 Rovi Guides, Inc. Systems and methods for automatically detecting users within detection regions of media devices
US8452087B2 (en) * 2009-09-30 2013-05-28 Microsoft Corporation Image selection techniques
US8723118B2 (en) * 2009-10-01 2014-05-13 Microsoft Corporation Imager for constructing color and depth images
US20110083108A1 (en) * 2009-10-05 2011-04-07 Microsoft Corporation Providing user interface feedback regarding cursor position on a display screen
US8867820B2 (en) * 2009-10-07 2014-10-21 Microsoft Corporation Systems and methods for removing a background of an image
US8564534B2 (en) 2009-10-07 2013-10-22 Microsoft Corporation Human tracking system
US7961910B2 (en) * 2009-10-07 2011-06-14 Microsoft Corporation Systems and methods for tracking a model
US8963829B2 (en) 2009-10-07 2015-02-24 Microsoft Corporation Methods and systems for determining and tracking extremities of a target
US9971807B2 (en) 2009-10-14 2018-05-15 Oblong Industries, Inc. Multi-process interactive systems and methods
US9933852B2 (en) 2009-10-14 2018-04-03 Oblong Industries, Inc. Multi-process interactive systems and methods
US9400548B2 (en) * 2009-10-19 2016-07-26 Microsoft Technology Licensing, Llc Gesture personalization and profile roaming
US20110099476A1 (en) * 2009-10-23 2011-04-28 Microsoft Corporation Decorating a display environment
US8988432B2 (en) * 2009-11-05 2015-03-24 Microsoft Technology Licensing, Llc Systems and methods for processing an image for target tracking
US20110109617A1 (en) * 2009-11-12 2011-05-12 Microsoft Corporation Visualizing Depth
US8843857B2 (en) 2009-11-19 2014-09-23 Microsoft Corporation Distance scalable no touch computing
US9081799B2 (en) 2009-12-04 2015-07-14 Google Inc. Using gestalt information to identify locations in printed information
US9323784B2 (en) 2009-12-09 2016-04-26 Google Inc. Image search using text-based elements within the contents of images
US9244533B2 (en) 2009-12-17 2016-01-26 Microsoft Technology Licensing, Llc Camera navigation for presentations
US20110151974A1 (en) * 2009-12-18 2011-06-23 Microsoft Corporation Gesture style recognition and reward
US20110150271A1 (en) * 2009-12-18 2011-06-23 Microsoft Corporation Motion detection using depth images
US8320621B2 (en) 2009-12-21 2012-11-27 Microsoft Corporation Depth projector system with integrated VCSEL array
EP2339507B1 (en) 2009-12-28 2013-07-17 Softkinetic Software Head detection and localisation method
KR20110076458A (en) * 2009-12-29 2011-07-06 LG Electronics Inc. Display device and control method thereof
US8902259B1 (en) * 2009-12-29 2014-12-02 Google Inc. Finger-friendly content selection interface
TWI408610B (en) * 2009-12-30 2013-09-11 Ind Tech Res Inst Methods and systems for gesture recognition, and computer program products thereof
US8862576B2 (en) 2010-01-06 2014-10-14 Apple Inc. Device, method, and graphical user interface for mapping directions between search results
US20110164032A1 (en) * 2010-01-07 2011-07-07 Prime Sense Ltd. Three-Dimensional User Interface
US9268404B2 (en) * 2010-01-08 2016-02-23 Microsoft Technology Licensing, Llc Application gesture interpretation
US9019201B2 (en) * 2010-01-08 2015-04-28 Microsoft Technology Licensing, Llc Evolving universal gesture sets
US8631355B2 (en) * 2010-01-08 2014-01-14 Microsoft Corporation Assigning gesture dictionaries
US8284157B2 (en) * 2010-01-15 2012-10-09 Microsoft Corporation Directed performance in motion capture system
US8933884B2 (en) 2010-01-15 2015-01-13 Microsoft Corporation Tracking groups of users in motion capture system
US8334842B2 (en) 2010-01-15 2012-12-18 Microsoft Corporation Recognizing user intent in motion capture system
US8676581B2 (en) 2010-01-22 2014-03-18 Microsoft Corporation Speech recognition analysis via identification information
US8265341B2 (en) 2010-01-25 2012-09-11 Microsoft Corporation Voice-body identity correlation
US8864581B2 (en) * 2010-01-29 2014-10-21 Microsoft Corporation Visual based identitiy tracking
US8891067B2 (en) 2010-02-01 2014-11-18 Microsoft Corporation Multiple synchronized optical sources for time-of-flight range finding systems
US8687044B2 (en) * 2010-02-02 2014-04-01 Microsoft Corporation Depth camera compatibility
US8619122B2 (en) * 2010-02-02 2013-12-31 Microsoft Corporation Depth camera compatibility
US8717469B2 (en) * 2010-02-03 2014-05-06 Microsoft Corporation Fast gating photosurface
US8499257B2 (en) * 2010-02-09 2013-07-30 Microsoft Corporation Handles interactions for human—computer interface
US8659658B2 (en) * 2010-02-09 2014-02-25 Microsoft Corporation Physical interaction zone for gesture-based user interfaces
US8633890B2 (en) * 2010-02-16 2014-01-21 Microsoft Corporation Gesture detection based on joint skipping
US20110199302A1 (en) * 2010-02-16 2011-08-18 Microsoft Corporation Capturing screen objects using a collision volume
US8928579B2 (en) * 2010-02-22 2015-01-06 Andrew David Wilson Interacting with an omni-directionally projected display
US9400695B2 (en) * 2010-02-26 2016-07-26 Microsoft Technology Licensing, Llc Low latency rendering of objects
US8787663B2 (en) * 2010-03-01 2014-07-22 Primesense Ltd. Tracking body parts by combined color image and depth processing
US8411948B2 (en) 2010-03-05 2013-04-02 Microsoft Corporation Up-sampling binary images for segmentation
US8422769B2 (en) 2010-03-05 2013-04-16 Microsoft Corporation Image segmentation using reduced foreground training data
US8655069B2 (en) 2010-03-05 2014-02-18 Microsoft Corporation Updating image segmentation following user input
US20110223995A1 (en) 2010-03-12 2011-09-15 Kevin Geisner Interacting with a computer based application
US20110221755A1 (en) * 2010-03-12 2011-09-15 Kevin Geisner Bionic motion
US8279418B2 (en) 2010-03-17 2012-10-02 Microsoft Corporation Raster scanning for depth detection
US8213680B2 (en) * 2010-03-19 2012-07-03 Microsoft Corporation Proxy training data for human body tracking
US20110234481A1 (en) * 2010-03-26 2011-09-29 Sagi Katz Enhancing presentations using depth sensing cameras
US8514269B2 (en) * 2010-03-26 2013-08-20 Microsoft Corporation De-aliasing depth images
US8523667B2 (en) * 2010-03-29 2013-09-03 Microsoft Corporation Parental control settings based on body dimensions
US8605763B2 (en) 2010-03-31 2013-12-10 Microsoft Corporation Temperature measurement and control for laser and light-emitting diodes
US9098873B2 (en) 2010-04-01 2015-08-04 Microsoft Technology Licensing, Llc Motion-based interactive shopping environment
US9646340B2 (en) 2010-04-01 2017-05-09 Microsoft Technology Licensing, Llc Avatar-based virtual dressing room
US10719131B2 (en) 2010-04-05 2020-07-21 Tactile Displays, Llc Interactive display with tactile feedback
US20200393907A1 (en) 2010-04-13 2020-12-17 Tactile Displays, Llc Interactive display with tactile feedback
US8351651B2 (en) 2010-04-26 2013-01-08 Microsoft Corporation Hand-location post-process refinement in a tracking system
US8379919B2 (en) 2010-04-29 2013-02-19 Microsoft Corporation Multiple centroid condensation of probability distribution clouds
US9539510B2 (en) 2010-04-30 2017-01-10 Microsoft Technology Licensing, Llc Reshapable connector with variable rigidity
US8284847B2 (en) 2010-05-03 2012-10-09 Microsoft Corporation Detecting motion for a multifunction sensor device
US8885890B2 (en) 2010-05-07 2014-11-11 Microsoft Corporation Depth map confidence filtering
US8498481B2 (en) 2010-05-07 2013-07-30 Microsoft Corporation Image segmentation using star-convexity constraints
US10786736B2 (en) 2010-05-11 2020-09-29 Sony Interactive Entertainment LLC Placement of user information in a game space
US8457353B2 (en) 2010-05-18 2013-06-04 Microsoft Corporation Gestures and gesture modifiers for manipulating a user-interface
US8396252B2 (en) 2010-05-20 2013-03-12 Edge 3 Technologies Systems and related methods for three dimensional gesture recognition in vehicles
US20130120280A1 (en) * 2010-05-28 2013-05-16 Tim Kukulski System and Method for Evaluating Interoperability of Gesture Recognizers
US20130120282A1 (en) * 2010-05-28 2013-05-16 Tim Kukulski System and Method for Evaluating Gesture Usability
US8594425B2 (en) 2010-05-31 2013-11-26 Primesense Ltd. Analysis of three-dimensional scenes
US8803888B2 (en) 2010-06-02 2014-08-12 Microsoft Corporation Recognition system for sharing information
US8602887B2 (en) 2010-06-03 2013-12-10 Microsoft Corporation Synthesis of information from multiple audiovisual sources
US9008355B2 (en) 2010-06-04 2015-04-14 Microsoft Technology Licensing, Llc Automatic depth camera aiming
US8751215B2 (en) 2010-06-04 2014-06-10 Microsoft Corporation Machine based sign language interpreter
US9557574B2 (en) 2010-06-08 2017-01-31 Microsoft Technology Licensing, Llc Depth illumination and detection optics
US8330822B2 (en) 2010-06-09 2012-12-11 Microsoft Corporation Thermally-tuned depth camera light source
US8675981B2 (en) 2010-06-11 2014-03-18 Microsoft Corporation Multi-modal gender recognition including depth data
US9384329B2 (en) 2010-06-11 2016-07-05 Microsoft Technology Licensing, Llc Caloric burn determination from body movement
US8749557B2 (en) 2010-06-11 2014-06-10 Microsoft Corporation Interacting with user interface via avatar
US8982151B2 (en) 2010-06-14 2015-03-17 Microsoft Technology Licensing, Llc Independently processing planes of display data
US8558873B2 (en) 2010-06-16 2013-10-15 Microsoft Corporation Use of wavefront coding to create a depth image
US8670029B2 (en) 2010-06-16 2014-03-11 Microsoft Corporation Depth camera illuminator with superluminescent light-emitting diode
US8296151B2 (en) 2010-06-18 2012-10-23 Microsoft Corporation Compound gesture-speech commands
US8381108B2 (en) 2010-06-21 2013-02-19 Microsoft Corporation Natural user input for driving interactive stories
US8416187B2 (en) 2010-06-22 2013-04-09 Microsoft Corporation Item navigation using motion-capture data
US8878656B2 (en) 2010-06-22 2014-11-04 Microsoft Corporation Providing directional force feedback in free space
US9086727B2 (en) 2010-06-22 2015-07-21 Microsoft Technology Licensing, Llc Free space directional force feedback apparatus
US20110317871A1 (en) * 2010-06-29 2011-12-29 Microsoft Corporation Skeletal joint recognition and tracking system
US20120016641A1 (en) 2010-07-13 2012-01-19 Giuseppe Raffa Efficient gesture processing
US9201501B2 (en) 2010-07-20 2015-12-01 Apple Inc. Adaptive projector
CN102959616B (en) 2010-07-20 2015-06-10 Apple Inc. Interactive reality augmentation for natural interaction
US9075434B2 (en) 2010-08-20 2015-07-07 Microsoft Technology Licensing, Llc Translating user motion into multiple object responses
US8613666B2 (en) 2010-08-31 2013-12-24 Microsoft Corporation User selection and navigation based on looped motions
WO2012030872A1 (en) 2010-09-02 2012-03-08 Edge3 Technologies Inc. Method and apparatus for confusion learning
US8655093B2 (en) 2010-09-02 2014-02-18 Edge 3 Technologies, Inc. Method and apparatus for performing segmentation of an image
US8582866B2 (en) 2011-02-10 2013-11-12 Edge 3 Technologies, Inc. Method and apparatus for disparity computation in stereo images
US8666144B2 (en) 2010-09-02 2014-03-04 Edge 3 Technologies, Inc. Method and apparatus for determining disparity of texture
US20120058824A1 (en) 2010-09-07 2012-03-08 Microsoft Corporation Scalable real-time motion recognition
US8437506B2 (en) 2010-09-07 2013-05-07 Microsoft Corporation System for fast, probabilistic skeletal tracking
US8417058B2 (en) 2010-09-15 2013-04-09 Microsoft Corporation Array of scanning sensors
US8582867B2 (en) 2010-09-16 2013-11-12 Primesense Ltd Learning-based pose estimation from depth maps
US8963836B2 (en) * 2010-09-17 2015-02-24 Tencent Technology (Shenzhen) Company Limited Method and system for gesture-based human-machine interaction and computer-readable medium thereof
CN102402279B (en) * 2010-09-17 2016-05-25 Tencent Technology (Shenzhen) Company Limited Man-machine interaction method based on gesture and system
US8638364B2 (en) * 2010-09-23 2014-01-28 Sony Computer Entertainment Inc. User interface system and method using thermal imaging
US8988508B2 (en) 2010-09-24 2015-03-24 Microsoft Technology Licensing, Llc. Wide angle field of view active illumination imaging system
US8959013B2 (en) 2010-09-27 2015-02-17 Apple Inc. Virtual keyboard for a non-tactile three dimensional user interface
US8681255B2 (en) 2010-09-28 2014-03-25 Microsoft Corporation Integrated low power depth camera and projection device
US8548270B2 (en) 2010-10-04 2013-10-01 Microsoft Corporation Time-of-flight depth imaging
US9484065B2 (en) 2010-10-15 2016-11-01 Microsoft Technology Licensing, Llc Intelligent determination of replays based on event identification
US8223589B2 (en) * 2010-10-28 2012-07-17 Hon Hai Precision Industry Co., Ltd. Gesture recognition apparatus and method
US9195345B2 (en) * 2010-10-28 2015-11-24 Microsoft Technology Licensing, Llc Position aware gestures with visual feedback as input method
US8592739B2 (en) 2010-11-02 2013-11-26 Microsoft Corporation Detection of configuration changes of an optical element in an illumination system
US8866889B2 (en) 2010-11-03 2014-10-21 Microsoft Corporation In-home depth camera calibration
US8667519B2 (en) 2010-11-12 2014-03-04 Microsoft Corporation Automatic passive and anonymous feedback system
US10726861B2 (en) 2010-11-15 2020-07-28 Microsoft Technology Licensing, Llc Semi-private communication in open environments
US9349040B2 (en) 2010-11-19 2016-05-24 Microsoft Technology Licensing, Llc Bi-modal depth-image analysis
US10234545B2 (en) 2010-12-01 2019-03-19 Microsoft Technology Licensing, Llc Light source module
US8553934B2 (en) 2010-12-08 2013-10-08 Microsoft Corporation Orienting the position of a sensor
US8872762B2 (en) 2010-12-08 2014-10-28 Primesense Ltd. Three dimensional user interface cursor control
US8618405B2 (en) 2010-12-09 2013-12-31 Microsoft Corp. Free-space gesture musical instrument digital interface (MIDI) controller
US8408706B2 (en) 2010-12-13 2013-04-02 Microsoft Corporation 3D gaze tracker
US9171264B2 (en) 2010-12-15 2015-10-27 Microsoft Technology Licensing, Llc Parallel processing machine learning decision tree training
US8884968B2 (en) 2010-12-15 2014-11-11 Microsoft Corporation Modeling an object from image data
US8920241B2 (en) 2010-12-15 2014-12-30 Microsoft Corporation Gesture controlled persistent handles for interface guides
US8448056B2 (en) 2010-12-17 2013-05-21 Microsoft Corporation Validation analysis of human target
US8803952B2 (en) 2010-12-20 2014-08-12 Microsoft Corporation Plural detector time-of-flight depth mapping
US9821224B2 (en) 2010-12-21 2017-11-21 Microsoft Technology Licensing, Llc Driving simulator control with virtual skeleton
US8994718B2 (en) 2010-12-21 2015-03-31 Microsoft Technology Licensing, Llc Skeletal control of three-dimensional virtual world
US8385596B2 (en) 2010-12-21 2013-02-26 Microsoft Corporation First person shooter control with virtual skeleton
US9823339B2 (en) 2010-12-21 2017-11-21 Microsoft Technology Licensing, Llc Plural anode time-of-flight sensor
US9848106B2 (en) 2010-12-21 2017-12-19 Microsoft Technology Licensing, Llc Intelligent gameplay photo capture
US8804056B2 (en) 2010-12-22 2014-08-12 Apple Inc. Integrated touch screens
US9123316B2 (en) 2010-12-27 2015-09-01 Microsoft Technology Licensing, Llc Interactive content creation
US8488888B2 (en) 2010-12-28 2013-07-16 Microsoft Corporation Classification of posture states
CN103415825B (en) * 2010-12-29 2016-06-01 汤姆逊许可公司 System and method for gesture identification
US8401242B2 (en) 2011-01-31 2013-03-19 Microsoft Corporation Real-time camera tracking using depth maps
US9247238B2 (en) 2011-01-31 2016-01-26 Microsoft Technology Licensing, Llc Reducing interference between multiple infra-red depth cameras
US8401225B2 (en) 2011-01-31 2013-03-19 Microsoft Corporation Moving object segmentation using depth images
US8587583B2 (en) 2011-01-31 2013-11-19 Microsoft Corporation Three-dimensional environment reconstruction
US8724887B2 (en) 2011-02-03 2014-05-13 Microsoft Corporation Environmental modifications to mitigate environmental factors
CN103347437B (en) 2011-02-09 2016-06-08 苹果公司 Gaze detection in 3D mapping environment
US10025388B2 (en) 2011-02-10 2018-07-17 Continental Automotive Systems, Inc. Touchless human machine interface
US8970589B2 (en) 2011-02-10 2015-03-03 Edge 3 Technologies, Inc. Near-touch interaction with a stereo camera grid structured tessellations
US8942917B2 (en) 2011-02-14 2015-01-27 Microsoft Corporation Change invariant scene recognition by an agent
US8497838B2 (en) 2011-02-16 2013-07-30 Microsoft Corporation Push actuation of interface controls
US9551914B2 (en) 2011-03-07 2017-01-24 Microsoft Technology Licensing, Llc Illuminator with refractive optical element
US9067136B2 (en) 2011-03-10 2015-06-30 Microsoft Technology Licensing, Llc Push personalization of interface controls
US8571263B2 (en) 2011-03-17 2013-10-29 Microsoft Corporation Predicting joint positions
US9857868B2 (en) * 2011-03-19 2018-01-02 The Board Of Trustees Of The Leland Stanford Junior University Method and system for ergonomic touch-free interface
US9470778B2 (en) 2011-03-29 2016-10-18 Microsoft Technology Licensing, Llc Learning from high quality depth measurements
US9858343B2 (en) 2011-03-31 2018-01-02 Microsoft Technology Licensing, Llc Personalization of queries, conversations, and searches
US10642934B2 (en) 2011-03-31 2020-05-05 Microsoft Technology Licensing, Llc Augmented conversational understanding architecture
US9842168B2 (en) 2011-03-31 2017-12-12 Microsoft Technology Licensing, Llc Task driven user intents
US9760566B2 (en) 2011-03-31 2017-09-12 Microsoft Technology Licensing, Llc Augmented conversational understanding agent to identify conversation context between two humans and taking an agent action thereof
US9298287B2 (en) 2011-03-31 2016-03-29 Microsoft Technology Licensing, Llc Combined activation for natural user interface systems
US9244984B2 (en) 2011-03-31 2016-01-26 Microsoft Technology Licensing, Llc Location based conversational understanding
US8503494B2 (en) 2011-04-05 2013-08-06 Microsoft Corporation Thermal management system
US8824749B2 (en) 2011-04-05 2014-09-02 Microsoft Corporation Biometric recognition
US8620113B2 (en) 2011-04-25 2013-12-31 Microsoft Corporation Laser diode modes
US9259643B2 (en) 2011-04-28 2016-02-16 Microsoft Technology Licensing, Llc Control of separate computer game elements
US8702507B2 (en) 2011-04-28 2014-04-22 Microsoft Corporation Manual and camera-based avatar control
US10671841B2 (en) 2011-05-02 2020-06-02 Microsoft Technology Licensing, Llc Attribute state classification
US8888331B2 (en) 2011-05-09 2014-11-18 Microsoft Corporation Low inductance light source module
US9137463B2 (en) 2011-05-12 2015-09-15 Microsoft Technology Licensing, Llc Adaptive high dynamic range camera
US9454962B2 (en) 2011-05-12 2016-09-27 Microsoft Technology Licensing, Llc Sentence simplification for spoken language understanding
US9064006B2 (en) 2012-08-23 2015-06-23 Microsoft Technology Licensing, Llc Translating natural language utterances to keyword search queries
US8788973B2 (en) 2011-05-23 2014-07-22 Microsoft Corporation Three-dimensional gesture controlled avatar configuration interface
US8740702B2 (en) 2011-05-31 2014-06-03 Microsoft Corporation Action trigger gesturing
US8760395B2 (en) 2011-05-31 2014-06-24 Microsoft Corporation Gesture recognition techniques
US8657683B2 (en) 2011-05-31 2014-02-25 Microsoft Corporation Action selection gesturing
US8845431B2 (en) 2011-05-31 2014-09-30 Microsoft Corporation Shape trace gesturing
US8526734B2 (en) 2011-06-01 2013-09-03 Microsoft Corporation Three-dimensional background removal for vision system
US9594430B2 (en) 2011-06-01 2017-03-14 Microsoft Technology Licensing, Llc Three-dimensional foreground selection for vision system
US8929612B2 (en) 2011-06-06 2015-01-06 Microsoft Corporation System for recognizing an open or closed hand
US9724600B2 (en) 2011-06-06 2017-08-08 Microsoft Technology Licensing, Llc Controlling objects in a virtual environment
US8597142B2 (en) 2011-06-06 2013-12-03 Microsoft Corporation Dynamic camera based practice mode
US9013489B2 (en) 2011-06-06 2015-04-21 Microsoft Technology Licensing, Llc Generation of avatar reflecting player appearance
US10796494B2 (en) 2011-06-06 2020-10-06 Microsoft Technology Licensing, Llc Adding attributes to virtual representations of real-world objects
US8897491B2 (en) 2011-06-06 2014-11-25 Microsoft Corporation System for finger recognition and tracking
US9208571B2 (en) 2011-06-06 2015-12-08 Microsoft Technology Licensing, Llc Object digitization
US9098110B2 (en) 2011-06-06 2015-08-04 Microsoft Technology Licensing, Llc Head rotation tracking from depth-based center of mass
US9597587B2 (en) 2011-06-08 2017-03-21 Microsoft Technology Licensing, Llc Locational node device
US8881051B2 (en) 2011-07-05 2014-11-04 Primesense Ltd Zoom-based gesture user interface
US9459758B2 (en) 2011-07-05 2016-10-04 Apple Inc. Gesture-based interface with enhanced features
US9377865B2 (en) 2011-07-05 2016-06-28 Apple Inc. Zoom-based gesture user interface
US9342817B2 (en) 2011-07-07 2016-05-17 Sony Interactive Entertainment LLC Auto-creating groups for sharing photos
EP2737436A4 (en) * 2011-07-28 2015-06-17 Arb Labs Inc Systems and methods of detecting body movements using globally generated multi-dimensional gesture data
US10088924B1 (en) * 2011-08-04 2018-10-02 Amazon Technologies, Inc. Overcoming motion effects in gesture recognition
US9030498B2 (en) 2011-08-15 2015-05-12 Apple Inc. Combining explicit select gestures and timeclick in a non-tactile three dimensional user interface
US8786730B2 (en) 2011-08-18 2014-07-22 Microsoft Corporation Image exposure using exclusion regions
US8771206B2 (en) * 2011-08-19 2014-07-08 Accenture Global Services Limited Interactive virtual care
US9218063B2 (en) 2011-08-24 2015-12-22 Apple Inc. Sessionless pointing user interface
US9122311B2 (en) 2011-08-24 2015-09-01 Apple Inc. Visual feedback for tactile and non-tactile user interfaces
US9002099B2 (en) 2011-09-11 2015-04-07 Apple Inc. Learning-based estimation of hand and finger pose
US20130077820A1 (en) * 2011-09-26 2013-03-28 Microsoft Corporation Machine learning gesture detection
US9811255B2 (en) 2011-09-30 2017-11-07 Intel Corporation Detection of gesture data segmentation in mobile devices
US9557836B2 (en) 2011-11-01 2017-01-31 Microsoft Technology Licensing, Llc Depth image compression
US9117281B2 (en) 2011-11-02 2015-08-25 Microsoft Corporation Surface segmentation from RGB and depth images
US8854426B2 (en) 2011-11-07 2014-10-07 Microsoft Corporation Time-of-flight camera with guided light
US9672609B1 (en) 2011-11-11 2017-06-06 Edge 3 Technologies, Inc. Method and apparatus for improved depth-map estimation
US8724906B2 (en) 2011-11-18 2014-05-13 Microsoft Corporation Computing pose and/or shape of modifiable entities
US8509545B2 (en) 2011-11-29 2013-08-13 Microsoft Corporation Foreground subject detection
US8635637B2 (en) 2011-12-02 2014-01-21 Microsoft Corporation User interface presenting an animated avatar performing a media reaction
US8803800B2 (en) 2011-12-02 2014-08-12 Microsoft Corporation User interface control based on head orientation
US9100685B2 (en) 2011-12-09 2015-08-04 Microsoft Technology Licensing, Llc Determining audience state or interest using passive sensor data
US8971612B2 (en) 2011-12-15 2015-03-03 Microsoft Corporation Learning image processing tasks from scene reconstructions
US8879831B2 (en) 2011-12-15 2014-11-04 Microsoft Corporation Using high-level attributes to guide image processing
US8630457B2 (en) 2011-12-15 2014-01-14 Microsoft Corporation Problem states for pose tracking pipeline
US8811938B2 (en) 2011-12-16 2014-08-19 Microsoft Corporation Providing a user interface experience based on inferred vehicle state
US9342139B2 (en) 2011-12-19 2016-05-17 Microsoft Technology Licensing, Llc Pairing a computing device to a user
US9651499B2 (en) 2011-12-20 2017-05-16 Cognex Corporation Configurable image trigger for a vision system and method for using the same
US10691219B2 (en) 2012-01-17 2020-06-23 Ultrahaptics IP Two Limited Systems and methods for machine control
US8638989B2 (en) 2012-01-17 2014-01-28 Leap Motion, Inc. Systems and methods for capturing motion in three-dimensional space
US9679215B2 (en) 2012-01-17 2017-06-13 Leap Motion, Inc. Systems and methods for machine control
US9501152B2 (en) 2013-01-15 2016-11-22 Leap Motion, Inc. Free-space user interface and control using virtual constructs
US8693731B2 (en) 2012-01-17 2014-04-08 Leap Motion, Inc. Enhanced contrast for object detection and characterization by optical imaging
US9070019B2 (en) 2012-01-17 2015-06-30 Leap Motion, Inc. Systems and methods for capturing motion in three-dimensional space
US11493998B2 (en) 2012-01-17 2022-11-08 Ultrahaptics IP Two Limited Systems and methods for machine control
US9720089B2 (en) 2012-01-23 2017-08-01 Microsoft Technology Licensing, Llc 3D zoom imager
US9336456B2 (en) 2012-01-25 2016-05-10 Bruno Delean Systems, methods and computer program products for identifying objects in video data
US9229534B2 (en) 2012-02-28 2016-01-05 Apple Inc. Asymmetric mapping for tactile and non-tactile user interfaces
CN104246682B (en) 2012-03-26 2017-08-25 苹果公司 Enhanced virtual touchpad and touch-screen
US8898687B2 (en) 2012-04-04 2014-11-25 Microsoft Corporation Controlling a media program based on a media reaction
US9025111B2 (en) 2012-04-20 2015-05-05 Google Inc. Seamless display panel using fiber optic carpet
US9047507B2 (en) 2012-05-02 2015-06-02 Apple Inc. Upper-body skeleton extraction from depth maps
US9210401B2 (en) 2012-05-03 2015-12-08 Microsoft Technology Licensing, Llc Projected visual cues for guiding physical movement
CA2775700C (en) 2012-05-04 2013-07-23 Microsoft Corporation Determining a future portion of a currently presented media program
US9823742B2 (en) 2012-05-18 2017-11-21 Microsoft Technology Licensing, Llc Interaction and management of devices using gaze detection
CN104395929B (en) 2012-06-21 2017-10-03 微软技术许可有限责任公司 Avatar construction using depth camera
US9836590B2 (en) 2012-06-22 2017-12-05 Microsoft Technology Licensing, Llc Enhanced accuracy of user presence status determination
US20140018169A1 (en) * 2012-07-16 2014-01-16 Zhong Yuan Ran Self as Avatar Gaming with Video Projecting Device
US9305229B2 (en) 2012-07-30 2016-04-05 Bruno Delean Method and system for vision based interfacing with a computer
US9696427B2 (en) 2012-08-14 2017-07-04 Microsoft Technology Licensing, Llc Wide angle depth detection
US8819812B1 (en) * 2012-08-16 2014-08-26 Amazon Technologies, Inc. Gesture recognition for device input
US9557846B2 (en) 2012-10-04 2017-01-31 Corning Incorporated Pressure-sensing touch system utilizing optical and capacitive systems
US9014417B1 (en) 2012-10-22 2015-04-21 Google Inc. Method and apparatus for themes using photo-active surface paint
US9164596B1 (en) 2012-10-22 2015-10-20 Google Inc. Method and apparatus for gesture interaction with a photo-active painted surface
US9195320B1 (en) 2012-10-22 2015-11-24 Google Inc. Method and apparatus for dynamic signage using a painted surface display system
US9473740B2 (en) 2012-10-24 2016-10-18 Polycom, Inc. Automatic positioning of videoconference camera to presenter at presentation device
US9019267B2 (en) 2012-10-30 2015-04-28 Apple Inc. Depth mapping with enhanced resolution
US8867786B2 (en) * 2012-10-31 2014-10-21 Microsoft Corporation Scenario-specific body-part tracking
US9285893B2 (en) 2012-11-08 2016-03-15 Leap Motion, Inc. Object detection and tracking with variable-field illumination devices
CN103841358B (en) * 2012-11-23 2017-12-26 中兴通讯股份有限公司 Low-bitrate video conferencing system and method, sending-end device, and receiving device
US10682016B2 (en) * 2012-11-29 2020-06-16 Vorwerk & Co. Interholding Gmbh Food processor
WO2014083029A1 (en) 2012-11-29 2014-06-05 Vorwerk & Co. Interholding Gmbh Food processor
US8882310B2 (en) 2012-12-10 2014-11-11 Microsoft Corporation Laser die light source module with low inductance
TWI499879B (en) 2012-12-21 2015-09-11 Ind Tech Res Inst Workflow monitoring and analysis system and method thereof
US9857470B2 (en) 2012-12-28 2018-01-02 Microsoft Technology Licensing, Llc Using photometric stereo for 3D environment modeling
US10609285B2 (en) 2013-01-07 2020-03-31 Ultrahaptics IP Two Limited Power consumption in motion-capture systems
US9465461B2 (en) 2013-01-08 2016-10-11 Leap Motion, Inc. Object detection and tracking with audio and optical signals
US9632658B2 (en) 2013-01-15 2017-04-25 Leap Motion, Inc. Dynamic user interactions for display control and scaling responsiveness of display objects
US9459697B2 (en) 2013-01-15 2016-10-04 Leap Motion, Inc. Dynamic, free-space user interactions for machine control
US9251590B2 (en) 2013-01-24 2016-02-02 Microsoft Technology Licensing, Llc Camera pose estimation for 3D reconstruction
US9052746B2 (en) 2013-02-15 2015-06-09 Microsoft Technology Licensing, Llc User center-of-mass and mass distribution extraction using depth images
US9940553B2 (en) 2013-02-22 2018-04-10 Microsoft Technology Licensing, Llc Camera/object pose from predicted coordinates
US9524028B2 (en) 2013-03-08 2016-12-20 Fastvdo Llc Visual language for human computer interfaces
US9135516B2 (en) 2013-03-08 2015-09-15 Microsoft Technology Licensing, Llc User body angle, curvature and average extremity positions extraction using depth images
US9092657B2 (en) 2013-03-13 2015-07-28 Microsoft Technology Licensing, Llc Depth image processing
US20140267611A1 (en) * 2013-03-14 2014-09-18 Microsoft Corporation Runtime engine for analyzing user motion in 3d images
US9274606B2 (en) 2013-03-14 2016-03-01 Microsoft Technology Licensing, Llc NUI video conference controls
US9702977B2 (en) 2013-03-15 2017-07-11 Leap Motion, Inc. Determining positional information of an object in space
US10721448B2 (en) 2013-03-15 2020-07-21 Edge 3 Technologies, Inc. Method and apparatus for adaptive exposure bracketing, segmentation and scene organization
US9953213B2 (en) 2013-03-27 2018-04-24 Microsoft Technology Licensing, Llc Self discovery of autonomous NUI devices
US10620709B2 (en) 2013-04-05 2020-04-14 Ultrahaptics IP Two Limited Customized gesture interpretation
US9916009B2 (en) 2013-04-26 2018-03-13 Leap Motion, Inc. Non-tactile interface systems and methods
US9442186B2 (en) 2013-05-13 2016-09-13 Microsoft Technology Licensing, Llc Interference reduction for TOF systems
US9747696B2 (en) 2013-05-17 2017-08-29 Leap Motion, Inc. Systems and methods for providing normalized parameters of motions of objects in three-dimensional space
US9829984B2 (en) * 2013-05-23 2017-11-28 Fastvdo Llc Motion-assisted visual language for human computer interfaces
USD781869S1 (en) 2013-08-01 2017-03-21 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD749631S1 (en) * 2013-08-01 2016-02-16 Palantir Technologies Inc. Display screen or portion thereof with icon set
USD796550S1 (en) 2013-08-01 2017-09-05 Palantir Technologies, Inc. Display screen or portion thereof with icon set
USD757028S1 (en) 2013-08-01 2016-05-24 Palantir Technologies Inc. Display screen or portion thereof with graphical user interface
USD749127S1 (en) * 2013-08-01 2016-02-09 Palantir Technologies Inc. Display screen or portion thereof with icon set
AU355184S (en) 2013-08-01 2014-05-01 Palantir Tech Display screen
USD749126S1 (en) * 2013-08-01 2016-02-09 Palantir Technologies Inc. Display screen or portion thereof with icon set
US10281987B1 (en) 2013-08-09 2019-05-07 Leap Motion, Inc. Systems and methods of free-space gestural interaction
US9721383B1 (en) 2013-08-29 2017-08-01 Leap Motion, Inc. Predictive information for free space gesture control and communication
US9462253B2 (en) 2013-09-23 2016-10-04 Microsoft Technology Licensing, Llc Optical modules that reduce speckle contrast and diffraction artifacts
US9632572B2 (en) 2013-10-03 2017-04-25 Leap Motion, Inc. Enhanced field of view to augment three-dimensional (3D) sensory space for free-space gesture interpretation
US9443310B2 (en) 2013-10-09 2016-09-13 Microsoft Technology Licensing, Llc Illumination modules that emit structured light
US9996638B1 (en) 2013-10-31 2018-06-12 Leap Motion, Inc. Predictive information for free space gesture control and communication
US9674563B2 (en) 2013-11-04 2017-06-06 Rovi Guides, Inc. Systems and methods for recommending content
US9769459B2 (en) 2013-11-12 2017-09-19 Microsoft Technology Licensing, Llc Power efficient laser diode driver circuit and method
US9508385B2 (en) 2013-11-21 2016-11-29 Microsoft Technology Licensing, Llc Audio-visual project generator
IN2013MU04097A (en) * 2013-12-27 2015-08-07 Tata Consultancy Services Ltd
KR102285915B1 (en) * 2014-01-05 2021-08-03 마노모션 에이비 Real-time 3d gesture recognition and tracking system for mobile devices
US9971491B2 (en) 2014-01-09 2018-05-15 Microsoft Technology Licensing, Llc Gesture library for natural user input
US20150199022A1 (en) * 2014-01-13 2015-07-16 Fisoc, Inc. Gesture recognition for drilling down into metadata in augmented reality devices
US9613262B2 (en) 2014-01-15 2017-04-04 Leap Motion, Inc. Object detection and tracking for providing a virtual device experience
US9990046B2 (en) 2014-03-17 2018-06-05 Oblong Industries, Inc. Visual collaboration interface
CN204480228U (en) 2014-08-08 2015-07-15 厉动公司 Motion sensing and imaging device
US11308928B2 (en) 2014-09-25 2022-04-19 Sunhouse Technologies, Inc. Systems and methods for capturing and interpreting audio
US9536509B2 (en) 2014-09-25 2017-01-03 Sunhouse Technologies, Inc. Systems and methods for capturing and interpreting audio
USD780770S1 (en) 2014-11-05 2017-03-07 Palantir Technologies Inc. Display screen or portion thereof with graphical user interface
USD786931S1 (en) * 2014-12-30 2017-05-16 Sony Corporation Portion of a display panel or screen with graphical user interface
US10674139B2 (en) 2015-06-03 2020-06-02 University Of Connecticut Methods and systems for human action recognition using 3D integral imaging
WO2016199749A1 (en) * 2015-06-10 2016-12-15 コニカミノルタ株式会社 Image processing system, image processing device, image processing method, and image processing program
JP6222405B2 (en) * 2015-06-11 2017-11-01 コニカミノルタ株式会社 Motion detection system, motion detection device, motion detection method, and motion detection program
US10043279B1 (en) 2015-12-07 2018-08-07 Apple Inc. Robust detection and classification of body parts in a depth map
US10412280B2 (en) 2016-02-10 2019-09-10 Microsoft Technology Licensing, Llc Camera with light valve over sensor array
US10257932B2 (en) 2016-02-16 2019-04-09 Microsoft Technology Licensing, Llc Laser diode chip on printed circuit board
US10462452B2 (en) 2016-03-16 2019-10-29 Microsoft Technology Licensing, Llc Synchronizing active illumination cameras
USD802016S1 (en) 2016-06-29 2017-11-07 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD802000S1 (en) 2016-06-29 2017-11-07 Palantir Technologies, Inc. Display screen or portion thereof with an animated graphical user interface
USD858572S1 (en) 2016-06-29 2019-09-03 Palantir Technologies Inc. Display screen or portion thereof with icon
USD803246S1 (en) 2016-06-29 2017-11-21 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD826269S1 (en) 2016-06-29 2018-08-21 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
US10529302B2 (en) 2016-07-07 2020-01-07 Oblong Industries, Inc. Spatially mediated augmentations of and interactions among distinct devices and applications via extended pixel manifold
USD835646S1 (en) 2016-07-13 2018-12-11 Palantir Technologies Inc. Display screen or portion thereof with an animated graphical user interface
USD847144S1 (en) 2016-07-13 2019-04-30 Palantir Technologies Inc. Display screen or portion thereof with graphical user interface
USD811424S1 (en) 2016-07-20 2018-02-27 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD836673S1 (en) * 2016-07-25 2018-12-25 3M Innovative Properties Company Electronic display or portion thereof with icons
US10366278B2 (en) 2016-09-20 2019-07-30 Apple Inc. Curvature-based face detector
USD808991S1 (en) 2016-12-22 2018-01-30 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD868827S1 (en) 2017-02-15 2019-12-03 Palantir Technologies, Inc. Display screen or portion thereof with set of icons
USD834039S1 (en) 2017-04-12 2018-11-20 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD839298S1 (en) 2017-04-19 2019-01-29 Palantir Technologies Inc. Display screen or portion thereof with graphical user interface
USD822705S1 (en) 2017-04-20 2018-07-10 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD872736S1 (en) 2017-05-04 2020-01-14 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD837234S1 (en) 2017-05-25 2019-01-01 Palantir Technologies Inc. Display screen or portion thereof with transitional graphical user interface
USD874472S1 (en) 2017-08-01 2020-02-04 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
US11113885B1 (en) 2017-09-13 2021-09-07 Lucasfilm Entertainment Company Ltd. Real-time views of mixed-reality environments responsive to motion-capture data
US10176366B1 (en) 2017-11-01 2019-01-08 Sorenson Ip Holdings Llc Video relay service, communication system, and related methods for performing artificial intelligence sign language translation services in a video relay service environment
USD872121S1 (en) 2017-11-14 2020-01-07 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
CN109992102A (en) * 2017-12-30 2019-07-09 广州大正新材料科技有限公司 Gesture recognition and transmission device and system thereof
US11566993B2 (en) 2018-01-24 2023-01-31 University Of Connecticut Automated cell identification using shearing interferometry
USD883997S1 (en) 2018-02-12 2020-05-12 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
US11269294B2 (en) 2018-02-15 2022-03-08 University Of Connecticut Portable common path shearing interferometry-based holographic microscopy system with augmented reality visualization
USD883301S1 (en) 2018-02-19 2020-05-05 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
USD888082S1 (en) 2018-04-03 2020-06-23 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
USD869488S1 (en) 2018-04-03 2019-12-10 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
USD886848S1 (en) 2018-04-03 2020-06-09 Palantir Technologies Inc. Display screen or portion thereof with transitional graphical user interface
USD885413S1 (en) 2018-04-03 2020-05-26 Palantir Technologies Inc. Display screen or portion thereof with transitional graphical user interface
US11875012B2 (en) 2018-05-25 2024-01-16 Ultrahaptics IP Two Limited Throwable interface for augmented reality and virtual reality environments
USD879821S1 (en) 2018-08-02 2020-03-31 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
US11461592B2 (en) 2018-08-10 2022-10-04 University Of Connecticut Methods and systems for object recognition in low illumination conditions
TWI715903B (en) * 2018-12-24 2021-01-11 財團法人工業技術研究院 Motion tracking system and method thereof
USD919645S1 (en) 2019-01-02 2021-05-18 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
USD916789S1 (en) 2019-02-13 2021-04-20 Palantir Technologies, Inc. Display screen or portion thereof with transitional graphical user interface
USD953345S1 (en) 2019-04-23 2022-05-31 Palantir Technologies, Inc. Display screen or portion thereof with graphical user interface
US11200691B2 (en) 2019-05-31 2021-12-14 University Of Connecticut System and method for optical sensing, visualization, and detection in turbid water using multi-dimensional integral imaging
CN111178308A (en) * 2019-12-31 2020-05-19 北京奇艺世纪科技有限公司 Gesture track recognition method and device
CN112330713B (en) * 2020-11-26 2023-12-19 南京工程学院 Improvement method for speech understanding degree of severe hearing impairment patient based on lip language recognition
US20230359280A1 (en) * 2022-05-09 2023-11-09 KaiKuTek Inc. Method of customizing hand gesture
CN117111751B (en) * 2023-10-25 2024-04-02 北京大学 Gesture change detection method, device, equipment and medium based on pulse array

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5148477A (en) * 1990-08-24 1992-09-15 Board Of Regents Of The University Of Oklahoma Method and apparatus for detecting and quantifying motion of a body part
US5454043A (en) * 1993-07-30 1995-09-26 Mitsubishi Electric Research Laboratories, Inc. Dynamic and static hand gesture recognition through low-level image analysis
US5548659A (en) * 1991-12-27 1996-08-20 Kabushiki Kaisha Toshiba Method and apparatus for detecting changes in dynamic images
US5570113A (en) * 1994-06-29 1996-10-29 International Business Machines Corporation Computer based pen system and method for automatically cancelling unwanted gestures and preventing anomalous signals as inputs to such system
US5581276A (en) * 1992-09-08 1996-12-03 Kabushiki Kaisha Toshiba 3D human interface apparatus using motion recognition based on dynamic image processing

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4641349A (en) * 1985-02-20 1987-02-03 Leonard Flom Iris recognition system
US4843568A (en) * 1986-04-11 1989-06-27 Krueger Myron W Real time perception of and response to the actions of an unencumbered participant/user
US5577179A (en) * 1992-02-25 1996-11-19 Imageware Software, Inc. Image editing system
US5594469A (en) * 1995-02-21 1997-01-14 Mitsubishi Electric Information Technology Center America Inc. Hand gesture machine control system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUANG C-L, WU M-S: "A MODEL-BASED COMPLEX BACKGROUND GESTURE RECOGNITION SYSTEM", 1996 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS. INFORMATION INTELLIGENCE AND SYSTEMS. BEIJING, OCT. 14 - 17, 1996., NEW YORK, IEEE., US, vol. 01, 1 October 1996 (1996-10-01), US, pages 93 - 98, XP002916848, ISBN: 978-0-7803-3281-2, DOI: 10.1109/ICSMC.1996.569805 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075568A1 (en) * 2000-03-30 2001-10-11 Ideogramic Aps Method for gesture based modeling
US7096454B2 (en) 2000-03-30 2006-08-22 Tyrsted Management Aps Method for gesture based modeling
EP1324269A1 (en) * 2000-10-06 2003-07-02 Sony Computer Entertainment Inc. Image processing apparatus, image processing method, record medium, computer program, and semiconductor device
EP1324269B1 (en) * 2000-10-06 2017-01-25 Sony Interactive Entertainment Inc. Image processing apparatus, image processing method, record medium, computer program, and semiconductor device
DE102007005027B4 (en) * 2006-03-27 2017-02-23 Volkswagen Ag Display and operating device for a motor vehicle with an interactive user interface
WO2008107733A1 (en) * 2007-03-07 2008-09-12 Sony Ericsson Mobile Communications Ab Method and system for a self timer function for a camera and camera equipped mobile radio terminal
DE102008026030A1 (en) 2008-05-30 2009-12-03 Continental Automotive Gmbh Information and assistance system and a method for its control
EP2399243A4 (en) * 2009-02-17 2013-07-24 Omek Interactive Ltd Method and system for gesture recognition
US8824802B2 (en) 2009-02-17 2014-09-02 Intel Corporation Method and system for gesture recognition
EP2399243A2 (en) * 2009-02-17 2011-12-28 Omek Interactive, Ltd. Method and system for gesture recognition
EP2278823A3 (en) * 2009-07-20 2011-03-16 J Touch Corporation Stereo image interaction system
EP2577483B1 (en) * 2010-05-28 2020-04-29 Microsoft Technology Licensing, LLC Cloud-based personal trait profile data
US8639020B1 (en) 2010-06-16 2014-01-28 Intel Corporation Method and system for modeling subjects from a depth map
US9330470B2 (en) 2010-06-16 2016-05-03 Intel Corporation Method and system for modeling subjects from a depth map
US11048333B2 (en) 2011-06-23 2021-06-29 Intel Corporation System and method for close-range movement tracking
US9910498B2 (en) 2011-06-23 2018-03-06 Intel Corporation System and method for close-range movement tracking
US8958631B2 (en) 2011-12-02 2015-02-17 Intel Corporation System and method for automatically defining and identifying a gesture
US9477303B2 (en) 2012-04-09 2016-10-25 Intel Corporation System and method for combining three-dimensional tracking with a three-dimensional display for a user interface
CN103020648B (en) * 2013-01-09 2016-04-13 艾迪普(北京)文化科技股份有限公司 Action type recognition method, and program broadcasting method and device
CN103020648A (en) * 2013-01-09 2013-04-03 北京东方艾迪普科技发展有限公司 Method and device for identifying action types, and method and device for broadcasting programs
EP2772811A3 (en) * 2013-02-27 2018-03-28 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with confidence-based decision support
EP2772812A3 (en) * 2013-02-27 2018-03-14 Rockwell Automation Technologies, Inc. Recognition-based industrial automation control with redundant system input support
CN110636964A (en) * 2017-05-23 2019-12-31 奥迪股份公司 Method for determining a driving instruction
CN111125437A (en) * 2019-12-24 2020-05-08 四川新网银行股份有限公司 Method for identifying lip language picture in video
CN111125437B (en) * 2019-12-24 2023-06-09 四川新网银行股份有限公司 Method for recognizing lip language picture in video

Also Published As

Publication number Publication date
US6072494A (en) 2000-06-06
US6256033B1 (en) 2001-07-03
AU1086799A (en) 1999-05-03

Similar Documents

Publication Publication Date Title
US6072494A (en) Method and apparatus for real-time gesture recognition
US6052481A (en) Automatic method for scoring and clustering prototypes of handwritten stroke-based data
Amor et al. Action recognition using rate-invariant analysis of skeletal shape trajectories
Chen et al. Hand gesture recognition using a real-time tracking method and hidden Markov models
Wang et al. Unsupervised analysis of human gestures
Devanne et al. 3-D human action recognition by shape analysis of motion trajectories on Riemannian manifold
Wilson et al. Parametric hidden Markov models for gesture recognition
Wilson et al. Recognition and interpretation of parametric gesture
JP4208898B2 (en) Object tracking device and object tracking method
Li et al. Model-based segmentation and recognition of dynamic gestures in continuous video streams
Kadous Machine recognition of Auslan signs using PowerGloves: Towards large-lexicon recognition of sign language
US5343537A (en) Statistical mixture approach to automatic handwriting recognition
Han et al. Modelling and segmenting subunits for sign language recognition based on hand motion analysis
EP0550865A2 (en) A continuous parameter hidden Markov model approach to automatic handwriting recognition
KR20020037660A (en) Object activity modeling method
WO2008139399A2 (en) Method of determining motion-related features and method of performing motion classification
CN112101243A (en) Human body action recognition method based on key posture and DTW
Elakkiya et al. Enhanced dynamic programming approach for subunit modelling to handle segmentation and recognition ambiguities in sign language
CN102314591B (en) Method and equipment for detecting static foreground object
Mohandes et al. Arabic sign language recognition an image-based approach
Just et al. HMM and IOHMM for the recognition of mono- and bi-manual 3D hand gestures
Sharma et al. Exploiting speech/gesture co-occurrence for improving continuous gesture recognition in weather narration
WO1996008787A1 (en) System and method for automatic subcharacter unit and lexicon generation for handwriting recognition
Fakhfakh et al. Gesture recognition system for isolated word sign language based on key-point trajectory matrix
Stoll et al. Applications of HMM modeling to recognizing human gestures in image sequences for a man-machine interface

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: KR

NENP Non-entry into the national phase

Ref country code: CA

122 EP: PCT application non-entry in European phase