Summary of the invention
The objective of the invention is to overcome the problems of existing fingertip-extraction methods for pointing gestures, which require contact with an external device (such as a glove), require a simple background, or handle only a single pointing-gesture object, by providing an improved fingertip localization method for pointing gestures. The method locates the fingertip quickly according to the features of the pointing hand in the pointing behavior, thereby improving the flexibility and simplicity of human-computer interaction.
To achieve the above objective, the concept of the present invention is as follows: the background subtraction method is adopted to extract the pointing-gesture object; the skin-color segmentation method is used to extract the hand region of the pointing gesture; and, since the fingertip of the pointing gesture lies on the hand-edge contour at the point farthest from the centroid of the finger region, the fingertip point is determined automatically, so that the fingertip position of the pointing gesture is located quickly and effectively.
According to the foregoing inventive concept, the present invention adopts the following technical solution:
A fingertip localization method for pointing gestures: according to the hand features of the pointing gesture in the pointing behavior, the fingertip position of the pointing gesture is determined automatically. The concrete steps are as follows:
1) Start the pointing-gesture image acquisition system: capture video images;
2) Obtain the background image
Continuously capture scene images that do not contain the pointing-gesture target. If the difference between two images within a set time interval is less than a set threshold, take an image within that interval as the background image; otherwise, capture again until two images within the set time interval differ by less than the set threshold;
3) Segment the pointing-gesture object
Subtract the background image obtained in step 2) from the current frame captured by the camera, and segment out the pointing-gesture object region;
4) Extract the hand region;
5) Determine the hand region of the pointing gesture;
6) Determine the fingertip position of the pointing gesture.
The concrete operations of step 4) above are as follows:
(1) Color-space conversion, computing the color values Cr and Cb: from the red (R), green (G), and blue (B) components of the RGB color space, compute the color values Cr and Cb of the YCbCr color space:
Cr=0.5×R-0.4187×G-0.0813×B
Cb=-0.1687×R-0.3313×G+0.5×B
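As an illustration only (the function name and per-pixel calling convention are our assumptions, not part of the invention), the conversion of step (1) can be sketched in Python; note that practical 8-bit YCbCr pipelines usually add an offset of 128 to Cr and Cb, which the formulas above omit:

```python
def rgb_to_crcb(r, g, b):
    """Compute the Cr and Cb color values of the YCbCr space from the
    R, G, B components, using the coefficients given in step (1).
    No +128 offset is added, following the formulas as written."""
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b
    return cr, cb
```

For a gray pixel (R = G = B) both chrominance values vanish, since each coefficient row sums to zero.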
(2) Skin-color region extraction: determine the thresholds T1, T2, T3, T4, and T5 for the color values Cr, Cb, and Cr/Cb respectively; the region formed by all pixels satisfying the following formula is defined as the skin-color region S:
S=(Cr≥T1∩Cr≤T2)∩(Cb≥T3∩Cb≤T4)∩(Cr/Cb≥T5)
where ∩ is the "logical AND" operator;
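A minimal per-pixel sketch of this skin test (the function name and the explicit division-by-zero guard are our additions):

```python
def is_skin(cr, cb, t1, t2, t3, t4, t5):
    """Skin-color test of step (2): the pixel belongs to region S when
    T1 <= Cr <= T2, T3 <= Cb <= T4 and Cr/Cb >= T5."""
    if cb == 0:  # guard against division by zero; not stated in the text
        return False
    return t1 <= cr <= t2 and t3 <= cb <= t4 and cr / cb >= t5
```

With the embodiment's thresholds (87, 133, 78, 127, 1.05), a pixel with Cr = 100, Cb = 90 passes, while Cr = Cb = 100 fails the ratio test.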
(3) Extract the skin-color region of the possible pointing-gesture object: image regions that simultaneously satisfy step 3) and step (2) are taken as the skin-color region of the possible pointing-gesture object;
(4) Extract the hand region: perform a connected-region search on the binary image of step (3); compute the ratio of the connected region's height Sl to its width Sw, the number of holes H in the connected region, and the connected-region size W; the region formed by all pixels satisfying the following formula is considered a non-hand region and is removed from the binary-image region of step (3):
F=(Sl/Sw≥T6∩Sl/Sw≤T7)∩(H>1)∩(W<T8)
where T6, T7, and T8 are thresholds.
In step 5) above, the method of determining the hand region of the pointing gesture is: for each connected region of the binary image obtained in step (4), compute the ordinate By=max(j) and the abscissa Bx=i (j=By) of its bottom pixel (i, j); the connected region containing the minimum By value and its corresponding Bx is defined as the hand region of the pointing gesture;
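Under the assumption that connected regions are given as lists of (i, j) pixels with j increasing downwards (our representation), step 5) reduces to:

```python
def select_hand_region(regions):
    """Step 5): By = max(j) is a region's bottom ordinate; the region
    with the minimum By -- the one whose bottom sits highest in the
    image -- is taken as the hand region of the pointing gesture."""
    return min(regions, key=lambda region: max(j for _, j in region))
```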
In step 6) above, the method of determining the fingertip position of the pointing gesture is: compute the centroid (Cx, Cy) of the hand region of the pointing gesture, and the distance D from each pixel coordinate (i, j) on the contour line of the hand region to the centroid (Cx, Cy); the pixel coordinate (i, j) at which D takes its maximum value is defined as the fingertip position (Px, Py) of the pointing gesture: Px=i (D=max(D)), Py=j (D=max(D)).
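A direct sketch of step 6), assuming the contour is a list of (i, j) pixel coordinates and, as a simplification, taking the centroid of the contour points themselves rather than of the full hand region:

```python
import math

def fingertip(contour):
    """Step 6): compute a centroid (Cx, Cy) -- here of the contour
    points, a simplification of the text's hand-region centroid --
    and return the contour point at maximum Euclidean distance D
    from it as the fingertip position (Px, Py)."""
    cx = sum(i for i, _ in contour) / len(contour)
    cy = sum(j for _, j in contour) / len(contour)
    return max(contour, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
```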
The principle of the present invention is as follows:
In the technical solution of the present invention, the background subtraction method provides relatively complete feature data: any perceptible target motion in the scene is reflected as a change in the scene image sequence, so the difference between the current image and the background image is used to detect the pointing-gesture object in the video image.
Suppose that within the time interval Δt, two frames f(t_{n-1}, x, y) and f(t_n, x, y) are obtained at the two moments t_{n-1} and t_n. Taking the pixel-by-pixel difference of the two images gives the difference image Diff(x, y):
DiffR(x,y)=|fR(t_n,x,y)-fR(t_{n-1},x,y)|
DiffG(x,y)=|fG(t_n,x,y)-fG(t_{n-1},x,y)|
DiffB(x,y)=|fB(t_n,x,y)-fB(t_{n-1},x,y)|
where DiffR, DiffG, and DiffB are the red, green, and blue components of the difference image respectively, and |f| is the absolute value of f. If the differences of the two sequence images f(t_{n-1}, x, y) and f(t_n, x, y) within Δt satisfy DiffR(x,y)≤T | DiffG(x,y)≤T | DiffB(x,y)≤T, where T is a threshold and | is the "logical OR" operator, then no object has changed within the interval Δt, so the image at any moment between t_{n-1} and t_n can be taken as the background image.
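A sketch of this static-scene test, with frames represented as equal-length lists of (R, G, B) tuples (our representation). The text combines the three channel conditions with a logical OR, which is followed here; a logical AND per channel would be the stricter variant:

```python
def scene_unchanged(f_prev, f_curr, t):
    """Return True when every pixel of the two frames satisfies
    DiffR <= T | DiffG <= T | DiffB <= T, i.e. no object has changed
    within the interval, so either frame may serve as background."""
    for (r1, g1, b1), (r2, g2, b2) in zip(f_prev, f_curr):
        if not (abs(r1 - r2) <= t or abs(g1 - g2) <= t or abs(b1 - b2) <= t):
            return False
    return True
```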
Using the obtained background image and the currently captured frame, the background subtraction method is applied to segment out the pointing-gesture object region.
Meanwhile, although human skin color varies from person to person, it exhibits a good clustering characteristic in the YCbCr color space; it is insensitive to pose changes and can overcome the effects of rotation, expression, and other variations, giving strong robustness. In the YCbCr model, the Y component represents the luminance of a color, while the Cr and Cb components represent the red and blue chrominance respectively. The conversion formulas from RGB space to YCbCr space are as follows:
Y=0.299×R+0.587×G+0.114×B
Cr=0.5×R-0.4187×G-0.0813×B
Cb=-0.1687×R-0.3313×G+0.5×B
In YCbCr space, the distribution of skin color over Cb, Cr, and Cr/Cb lies within a stable range; that is, in the currently obtained image, the pixels satisfying the condition (Cr≥T1∩Cr≤T2)∩(Cb≥T3∩Cb≤T4)∩(Cr/Cb≥T5) are taken as the skin-color region of the current image, where ∩ is the "logical AND" operator. To overcome the influence of skin-color-like information in the current image (such as wooden floors and wooden cabinets), the regions that simultaneously belong to the skin-color region extracted under this skin-color condition and to the pointing-gesture object region segmented by the above background subtraction method are taken as the pointing-gesture object region.
To extract the finger region of the pointing gesture, the face region, which also satisfies both of the above conditions, must be excluded from the obtained pointing-gesture object region. Because features such as the eyes, eyebrows, and lips are not of facial skin color, an extracted face region contains holes, and the height-to-width ratio of the facial skin-color area is distributed within a stable range; on this basis, the influence of the face on the extraction of the pointing-gesture finger region is rejected.
According to the behavioral characteristics of the pointing gesture, the finger region of the pointing gesture lies higher in the image than the finger regions of non-pointing gestures; therefore, the connected region whose bottom pixel has the minimum vertical coordinate is defined as the hand region of the pointing gesture. Because the index-finger region of the pointing gesture is smaller in size than the arm region, the fingertip must lie at the point farthest from the centroid of the finger region, thereby realizing the localization of the pointing-gesture fingertip.
Compared with the prior art, the present invention has the following evident substantive features and remarkable advantages. According to the hand features of the pointing gesture in the pointing behavior, and exploiting the fact that human skin color, although it varies from person to person, exhibits a good clustering characteristic in the YCbCr color space, the invention combines skin-color segmentation with the background subtraction method to extract the hand region of the pointing-gesture object, and locates the fingertip point at the position on the hand-edge contour farthest from the centroid of the finger region. The computation is simple, flexible, and easy to realize. It overcomes the deficiencies of existing fingertip-extraction methods for pointing gestures, which require contact with an external device (such as a glove) or a simple background, handle only a single pointing-gesture object, are sensitive to dynamic scene changes, suffer from large noise, and involve complex computation. It improves the robustness of pointing-gesture fingertip extraction and can adapt to automatic fingertip localization under complex background conditions.
Embodiment
A specific embodiment of the present invention is as follows: the running procedure is shown in Figure 1. The original background image of this example is shown in Figure 2, and the current pointing-gesture image in Figure 3. According to the hand features of the pointing gesture in the pointing behavior, the fingertip of the pointing gesture is located in the color image shown in Figure 3. The concrete steps are as follows:
1) Start the pointing-gesture image acquisition system: capture video images;
2) Obtain the background image
Continuously capture scene images that do not contain the pointing-gesture target. If the difference between two images within a set time interval is less than a set threshold, take an image within that interval as the background image; otherwise, capture again until two images within the set time interval differ by less than the set threshold;
3) Segment the pointing-gesture object
Subtract the original background image (Figure 2) from the current pointing-gesture image (Figure 3) to segment out the pointing-gesture object region, as shown in Figure 4;
4) Extract the hand region
The concrete operation steps are as follows:
(1) Color-space conversion, computing the color values Cr and Cb: from the red (R), green (G), and blue (B) components of the RGB color space, compute the color values Cr and Cb of the YCbCr color space:
Cr=0.5×R-0.4187×G-0.0813×B
Cb=-0.1687×R-0.3313×G+0.5×B
(2) Skin-color region extraction: determine the threshold ranges of the color values Cr, Cb, and Cr/Cb respectively; the region formed by all pixels satisfying the following formula is defined as the skin-color region S, as shown in Figure 5;
S=(Cr≥87∩Cr≤133)∩(Cb≥78∩Cb≤127)∩(Cr/Cb≥1.05)
where ∩ is the "logical AND" operator.
(3) Extract the skin-color region of the possible pointing-gesture object: the binary-image regions that simultaneously satisfy step 3) and step (2) are taken as the skin-color region of the possible pointing-gesture object, as shown in Figure 6;
(4) Extract the hand region: perform a connected-region search on the binary image shown in Figure 6; compute the ratio of the connected region's height Sl to its width Sw, the number of holes H in the connected region, and the connected-region size W; the region formed by all pixels satisfying the following formula is considered a non-hand region and is removed from the binary-image region of Figure 6, yielding the hand region shown in Figure 7;
F=(Sl/Sw≥0.75∩Sl/Sw≤2.5)∩(H>1)∩(W<300)
5) Determine the hand region of the pointing gesture: for each connected region of the binary image shown in Figure 7, compute the ordinate By=max(j) and the abscissa Bx=i (j=By) of its bottom pixel (i, j); the connected region containing the minimum By value and its corresponding Bx is defined as the hand region of the pointing gesture, as shown in Figure 8;
6) Determine the fingertip position of the pointing gesture: compute the centroid (Cx, Cy) of the hand region of the pointing gesture, and the distance D from each pixel coordinate (i, j) on the contour line of the hand region to the centroid (Cx, Cy); the pixel coordinate (i, j) at which D takes its maximum value is defined as the fingertip position (Px, Py) of the pointing gesture: Px=i (D=max(D)), Py=j (D=max(D)), as indicated by the cross mark in Figure 8.
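Putting the embodiment's concrete numbers together, a minimal end-to-end sketch (the data layout, names, and synthetic pixel values are all our assumptions) is:

```python
import math

# Embodiment thresholds from step (2) of this example.
T1, T2, T3, T4, T5 = 87, 133, 78, 127, 1.05

def skin_pixels(crcb):
    """crcb: {(i, j): (Cr, Cb)}.  Returns, in sorted order, the pixels
    passing the embodiment's skin-color condition of step (2)."""
    return [p for p, (cr, cb) in sorted(crcb.items())
            if T1 <= cr <= T2 and T3 <= cb <= T4
            and cb != 0 and cr / cb >= T5]

def fingertip(pixels):
    """Farthest pixel from the centroid of the given pixels, as in
    step 6)."""
    cx = sum(i for i, _ in pixels) / len(pixels)
    cy = sum(j for _, j in pixels) / len(pixels)
    return max(pixels, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
```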