CN103279767A - Human-machine interaction information generation method based on multi-feature-point combination - Google Patents

Human-machine interaction information generation method based on multi-feature-point combination

Info

Publication number
CN103279767A
CN2013101751997A CN201310175199A CN103279767A
Authority
CN
China
Prior art keywords
human
machine interaction
face
interaction information
feature point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101751997A
Other languages
Chinese (zh)
Inventor
佘青山
杨伟健
昌凤玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN2013101751997A priority Critical patent/CN103279767A/en
Publication of CN103279767A publication Critical patent/CN103279767A/en
Pending legal-status Critical Current

Abstract

The invention relates to a method for generating human-machine interaction information based on a combination of multiple feature points. In the field of vision-controlled human-machine interfaces, head-pose recognition accuracy remains low in complex environments, so incorrect interaction information is easily generated. The invention generates interaction information by locating facial feature points of the user and combining them. The method comprises the following steps: acquiring a head video sequence with a camera; preprocessing the acquired image sequence by denoising and enhancement; detecting the face with the AdaBoost algorithm and locating the feature points within the face region; defining three feature-point combinations and identifying them with designed recognition rules; and finally generating the corresponding interaction information according to the recognition results of the three combinations. The method generates human-machine interaction information with high accuracy and has broad application prospects in vision-controlled human-machine interfaces, particularly in intelligent human-machine interaction.

Description

Human-machine interaction information generation method based on multi-feature-point combination
Technical field
The invention belongs to the field of human-machine interfaces and relates to a method for generating human-machine interaction information based on a combination of multiple feature points.
Background technology
With economic development and rising living standards, society pays increasing attention to the living conditions of people with disabilities. The second national sample survey of disabled persons, published in 2007, showed that China has about 82.96 million disabled people, more than 6% of the total population; of these, 24.12 million have physical disabilities, the largest proportion among all disability categories. Meanwhile, the "Research Report on the Prediction of China's Population Aging Trend" estimates that China's elderly population will reach 200 million in 2014 and 437 million by 2051. The elderly frequently suffer from stroke, and many stroke patients are left with physical disabilities of varying degrees. Improving the living conditions, self-care ability, and quality of life of disabled people has become a major concern of the whole society and an important, pressing task in building a well-off and harmonious society.
Improving patients' mobility and enlarging their range of movement is a fundamental purpose of rehabilitation for the physically disabled. For severely disabled patients such as amputees, the loss or impairment of motor function restricts the range and space of activity; enhancing such patients' mobility has become an important topic in biomedical and engineering fields. On the one hand, starting from the source of human motion, one can analyze how the brain governs and controls limb movement, obtain the brain's motor intention, and derive limb-action commands, realizing brain-machine interaction. On the other hand, starting from the body itself, rehabilitation aids such as prostheses and wheelchairs can compensate for lost motor function and enlarge the patient's range of activity and self-care ability. Breaking away from traditional interfaces such as joysticks and buttons, and letting users communicate naturally with smart devices (including rehabilitation aids) through voice, body language, and other human habits, so that the devices can offer intelligent assistance, has become an important research direction in human-machine interaction and has attracted wide attention from research institutions and researchers. However, applying pattern recognition and control technology to rehabilitation aids with voice and body language as the expression modality still faces many difficulties: pattern-recognition accuracy is low in complex environments, and interaction is not natural enough. In particular, under interference such as illumination changes and cluttered backgrounds, the precision of key feature points and the accuracy of head-pose recognition remain low, and erroneous interaction information is generated. These are common difficulties on the way to practical, intelligent novel human-machine interfaces. Research and development on head-pose estimation therefore has important technical value and broad market prospects.
Summary of the invention
The purpose of the present invention is to provide a method for generating human-machine interaction information based on a combination of multiple feature points, addressing the problem in vision-controlled human-machine interfaces that low head-pose recognition accuracy in complex environments causes erroneous interaction information to be generated.
Head visual signals can convey the operator's intention well and are feasible as a contactless source of natural interaction information. During interaction, environmental changes often cause some feature points to be mislocated or missed, so the system easily generates erroneous interaction information and causes misoperation. The method of the present invention, based on combinations of multiple feature points, improves the accuracy of interaction-information generation in complex environments and effectively avoids misoperation caused by mislocated or missed feature points.
To achieve the above purpose, the method of the invention mainly comprises the following steps:
Step (1): head video sequence acquisition. Head visual information is acquired by an optical lens and a CMOS image sensor assembly.
Step (2): image sequence preprocessing. The acquired image sequence is processed by grayscale conversion, histogram equalization, and morphological operations to denoise the head image sequence and enhance its effective information.
Step (3): face and feature-point detection. The AdaBoost algorithm first detects the face in the image sequence enhanced in step (2); the face image is then taken as the input image, and the AdaBoost algorithm likewise locates the eyes, nose, and mouth, yielding the feature points.
Step (4): feature-point combination definition and identification. Based on the feature points located in step (3), three feature-point combinations are defined, and recognition rules are designed to identify them.
Step (5): interaction-information generation. The corresponding human-machine interaction information is generated from the recognition results of the three feature-point combinations in step (4) according to a user-defined design.
Compared with existing methods for generating human-machine interaction information, the present invention has the following characteristics:
1. The user's head movement is unrestricted.
In existing intelligent human-machine interaction, generating interaction information requires the user's head to stay near a certain position: for example, the head must keep a fixed distance from the camera, or remain near the center of the video image. This makes the user feel constrained and unnatural. The present invention designs the recognition rules so that the user's head may move left and right or back and forth without affecting the correct generation of interaction information, improving ease of operation and allowing the user to interact naturally.
2. The invention generates interaction information in real time and with high accuracy.
Repeated tests under varying environments show that the three feature-point combinations achieve high recognition rates, so interaction information can be generated with high accuracy, fully supporting real-time interaction.
Description of drawings
Fig. 1 is the implementation flowchart of the invention.
Fig. 2 shows the coordinate parameters of each feature point.
Fig. 3 shows the recognition rule for head turning left and right.
Fig. 4 shows the recognition rule for the mouth turning upward.
Fig. 5 shows the recognition rule for eye closing.
Embodiment
The method of the present invention is described in detail below with reference to the accompanying drawings; Fig. 1 is the implementation flowchart.
As shown in Fig. 1, the implementation of the method mainly comprises five steps: (1) acquire head visual information with an optical lens and a CMOS image sensor assembly; (2) apply grayscale conversion, histogram equalization, and morphological operations to denoise the acquired image sequence; (3) first detect the face in the enhanced image sequence with the AdaBoost algorithm, then take the face image as input and likewise locate the eyes, nose, and mouth with the AdaBoost algorithm; (4) based on the located feature points, define three feature-point combinations and design recognition rules to identify them; (5) generate human-machine interaction information from the recognition results of the three combinations according to user-defined settings. Each step is detailed below.
Step 1: the head video sequence obtains
Head visual information is acquired by an optical lens and a CMOS image sensor assembly.
Step 2: image sequence pre-service
Because of the optical system and electronic devices, images inevitably contain noise and must be denoised. Concretely, grayscale conversion, histogram equalization, and morphological preprocessing are applied to the acquired head image sequence to remove noise and enhance the effective information.
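The grayscale conversion and histogram equalization of this step can be sketched as follows. This is a minimal illustration under assumptions, not the patent's implementation: the luma weights and the global histogram equalization are standard textbook choices, and the morphological step is omitted.

```python
import numpy as np

def to_gray(rgb):
    """Luma-weighted grayscale conversion of an H x W x 3 uint8 image
    (weights are the common Rec. 601 choice, assumed here)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.clip(rgb @ weights, 0, 255).astype(np.uint8)

def equalize_histogram(gray):
    """Global histogram equalization for an 8-bit grayscale image:
    remap gray levels through the normalized cumulative histogram so
    the output levels spread over the full 0..255 range."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first nonzero CDF value
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]                   # apply the lookup table
```

A low-contrast head image passed through `equalize_histogram(to_gray(frame))` ends up with its gray levels stretched across the full range, which is the enhancement this step relies on before face detection.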
Step 3: people's face and feature point detection
A face classifier trained by the AdaBoost learning algorithm first locates the face in the enhanced video sequence. The face region is then taken as the region of interest, and eye, nose, and mouth classifiers trained by the AdaBoost algorithm locate the eyes, nose, and mouth respectively.
Let X be the pattern (sample) space containing N training patterns x_i with corresponding class labels y_i. Since this is a two-class problem, y_i ∈ {+1, −1}, representing positive and negative samples respectively. The AdaBoost procedure for training a strong classifier is as follows:
(1) Initialize the number of weak classifiers T.
(2) Set the classifier training round t = 1, and initialize identical sample weights w_1(i) = 1/N.
(3) In round t, call the weak learner with the set of training samples and the weights w_t to obtain a weak classifier h_t; the weak classifier h_t assigns a real value to each pattern x_i.
(4) Compute the weighted classification error ε_t:
ε_t = Σ_{i: h_t(x_i) ≠ y_i} w_t(i)    (1)
where w_t(i) is the empirical probability observed on the training samples.
If ε_t < 1/2, the performance evaluation factor of the weak classifier h_t is
α_t = (1/2) ln((1 − ε_t) / ε_t)    (2)
If ε_t ≥ 1/2, the weak classifier of this round is deleted, and the algorithm stops.
(5) Update the weights:
w_{t+1}(i) = w_t(i) exp(−α_t y_i h_t(x_i)) / Z_t    (3)
where Z_t is the normalization factor that makes Σ_i w_{t+1}(i) = 1.
(6) Set t = t + 1.
(7) If t ≤ T, go to step (3).
(8) The final strong classifier is defined as
H(x) = sign(Σ_{t=1}^{T} α_t h_t(x))    (4)
Clearly the error ε_t in step (4) is computed with respect to the weight distribution w_t, and ε_t and α_t are inversely related: the error corresponds to the probability Pr_{i∼w_t}[h_t(x_i) ≠ y_i], which depends both on the weight distribution w_t and on how correctly the samples x_i are classified. Note that when ε_t ≥ 1/2, the algorithm deletes the weak classifier of the current round and terminates, because the weight update mechanism of step (5) would then fail: the weights of misclassified samples would decrease while the weights of correctly classified samples would increase.
The final strong classifier is chosen by a weighted vote of all T weak classifiers, each weighted by its performance evaluation factor α_t.
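The AdaBoost training loop above can be sketched with one-dimensional threshold stumps as weak classifiers. This is an illustrative toy implementation, not the Haar-feature classifier the patent trains; the exhaustive stump search and the small floor on the error are assumptions made to keep the sketch self-contained.

```python
import numpy as np

def train_adaboost(X, y, T):
    """AdaBoost following steps (1)-(8), with threshold stumps on a
    single feature. X: (N,) feature values; y: (N,) labels in {-1, +1};
    T: number of weak classifiers."""
    N = len(X)
    w = np.full(N, 1.0 / N)                      # step (2): identical initial weights
    stumps = []                                  # list of (threshold, polarity, alpha)
    for t in range(T):
        # Step (3): pick the stump h(x) = polarity * sign(x - threshold)
        # with the smallest weighted error under the current weights.
        best = None
        for thr in X:
            for pol in (+1, -1):
                pred = pol * np.sign(X - thr)
                pred[pred == 0] = pol            # break ties at the threshold
                eps = w[pred != y].sum()         # step (4), eq. (1)
                if best is None or eps < best[0]:
                    best = (eps, thr, pol, pred)
        eps, thr, pol, pred = best
        if eps >= 0.5:                           # no better than chance:
            break                                # delete it and stop, as in step (4)
        eps = max(eps, 1e-10)                    # avoid log/divide-by-zero
        alpha = 0.5 * np.log((1 - eps) / eps)    # eq. (2)
        w = w * np.exp(-alpha * y * pred)        # eq. (3): raise weights of mistakes
        w /= w.sum()                             # normalization factor Z_t
        stumps.append((thr, pol, alpha))
    return stumps

def predict(stumps, X):
    """Strong classifier of eq. (4): sign of the alpha-weighted vote."""
    score = np.zeros(len(X))
    for thr, pol, alpha in stumps:
        pred = pol * np.sign(X - thr)
        pred[pred == 0] = pol
        score += alpha * pred
    return np.sign(score)
```

On linearly separable toy data a single stump already reaches zero weighted error, and the vote of eq. (4) reproduces the labels; the face, eye, nose, and mouth detectors of step 3 apply the same training scheme to Haar-like features instead of a raw scalar.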
Step 4: definition and the identification of unique point combination
To generate human-machine interaction information, the invention defines three feature-point combinations: left eye, right eye, and nose (LREN); nose and mouth (NM); and left eye and right eye (LRE). The three combinations are identified by the designed rules.
To describe the identification clearly, Fig. 2 marks the coordinate parameters of each feature point. The concrete identification of LREN, NM, and LRE proceeds as follows:
(1) LREN. When the face points straight ahead, the abscissa of the midpoint of the line connecting the two eyes equals the abscissa of the nose, namely
(x_le + x_re) / 2 = x_n
When the head turns, (x_le + x_re)/2 − x_n ≠ 0 (if the difference is greater than zero, the head turns right; if less than zero, the head turns left), with the effect shown in Fig. 3.
The recognition rule is defined as follows: if (x_le + x_re)/2 − x_n ≠ 0 holds, LREN is identified.
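The LREN rule can be sketched as a small function. The tolerance value, the parameter names, and the sign convention (positive offset meaning a right turn) are illustrative assumptions, since they depend on the image coordinate layout of Fig. 2 and on whether the camera image is mirrored.

```python
def lren_identify(x_left_eye, x_right_eye, x_nose, threshold=5.0):
    """LREN rule: compare the abscissa of the midpoint between the two
    eyes with the abscissa of the nose. Returns 'right', 'left', or None.
    threshold (pixels) is an assumed tolerance for 'facing straight ahead'."""
    offset = (x_left_eye + x_right_eye) / 2.0 - x_nose
    if offset > threshold:
        return 'right'      # assumed sign convention: positive = right turn
    if offset < -threshold:
        return 'left'
    return None             # facing straight ahead (within tolerance)
```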
(2) NM. When the face points straight ahead in its natural state, and the distance to the camera remains unchanged, the distance d between the nose and the mouth is a constant d_0. Therefore, as shown in Fig. 4, when the mouth turns slightly upward, the change in this distance is easy to detect.
The recognition rule is defined as follows: if d < d_0 holds, NM is identified.
According to this recognition rule, when the user's head moves back and forth, the nose-mouth distance also changes, so this state would be identified as well and erroneous interaction information would be generated. The present invention therefore adopts the following design:
By the geometry of the face and its feature points, no matter how large or small the face appears, the relative positions of its feature points are unchanged; that is to say, the ratios between them remain constant. The invention therefore compares the ratio of the nose-mouth distance to the face height before and after the mouth turns up. If the ratio after the mouth turns up is smaller than the ratio before, the state is identified. Its mathematical expression is:
d_a / h_a < d_b / h_b    (5)
where d_b is the nose-mouth distance before the mouth turns up; d_a is the nose-mouth distance after the mouth turns up; h_b is the face height before the mouth turns up; h_a is the face height after the mouth turns up.
Therefore, with this design, the system generates accurate interaction information whether the user turns the mouth up while rocking the head back and forth or while keeping the head still.
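Inequality (5) can be sketched directly. The margin parameter is an assumed noise tolerance not specified in the patent; the point of the ratio form is that it is invariant to the face scaling uniformly as the head moves toward or away from the camera.

```python
def nm_identify(d_before, h_before, d_after, h_after, margin=0.02):
    """NM rule of eq. (5): the mouth-upturn state is identified when the
    nose-mouth distance relative to the face height shrinks. Dividing by
    face height cancels the scale change caused by back-and-forth head
    motion. margin is an assumed tolerance against measurement noise."""
    return d_after / h_after < d_before / h_before - margin
```

If the head simply moves closer and every measurement doubles, the two ratios stay equal and the rule does not fire; only a genuine relative decrease of the nose-mouth distance triggers it.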
(3) LRE. As shown in Fig. 5, when both eyes are open, the two eyes can be detected, and the flag is set to E = 1. When both eyes are closed, the eyes cannot be detected, and the flag is set to E = 0.
The recognition rule is defined as follows: if E = 0 holds, LRE is identified.
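The three rules can be combined into a single per-frame state classifier, roughly as follows. All parameter names and threshold values are illustrative assumptions; the patent does not prescribe this exact structure or priority order.

```python
def classify_state(eyes_detected, x_left_eye, x_right_eye, x_nose,
                   nm_ratio, resting_nm_ratio, turn_threshold=5.0):
    """Combine the LRE, LREN, and NM rules into one of the four states of
    step 5, or None when no state is recognized. nm_ratio is the current
    nose-mouth distance divided by face height; resting_nm_ratio is the
    same ratio in the natural state. Thresholds are assumed values."""
    if not eyes_detected:                         # LRE: eyes closed (flag E = 0)
        return 'eyes_closed'
    offset = (x_left_eye + x_right_eye) / 2.0 - x_nose
    if abs(offset) > turn_threshold:              # LREN: head turned
        return 'turn_right' if offset > 0 else 'turn_left'
    if nm_ratio < resting_nm_ratio - 0.02:        # NM: mouth turned up, eq. (5)
        return 'mouth_up'
    return None
```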
Step 5: human-machine interactive information generates
According to the four states in step 4 — head turning left, head turning right, mouth turning up, and eyes closing — the user can generate at least four corresponding kinds of human-machine interaction information, chosen according to actual needs:
(1) One-to-one: one state generates one kind of interaction information. Taking an electric wheelchair as an example (likewise below), a head turn to the left makes the wheelchair turn left.
(2) One-to-many: one state generates several kinds of interaction information. For example, the first mouth-up event makes the wheelchair move forward, the next one stops it, the next one makes it move forward again, and so on; the mouth-up state alternately generates forward and stop information.
(3) Many-to-one: several states generate one kind of interaction information. For example, the combination of a head turn to the left and a head turn to the right can generate the wheelchair stop information.
(4) Many-to-many: several states generate several kinds of interaction information. For example, the combination of a left head turn and a mouth-up can generate the forward information, while the combination of a right head turn and a mouth-up can generate the backward information.
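The one-to-one and one-to-many modes can be sketched together as a small command mapper. The wheelchair command names and the stop-on-eyes-closed behaviour are illustrative assumptions, not mappings fixed by the patent.

```python
class WheelchairCommandMapper:
    """Illustrative mapping of recognized states to wheelchair commands:
    head turns map directly (one-to-one), while repeated mouth-up events
    toggle between 'forward' and 'stop' (the one-to-many mode)."""

    def __init__(self):
        self.moving = False

    def command(self, state):
        if state == 'turn_left':
            return 'left'                    # one-to-one mapping
        if state == 'turn_right':
            return 'right'
        if state == 'mouth_up':
            self.moving = not self.moving    # one-to-many: alternate commands
            return 'forward' if self.moving else 'stop'
        if state == 'eyes_closed':
            return 'stop'                    # assumed safety behaviour
        return None
```

Feeding the mapper the per-frame state of step 4 yields a command stream in which successive mouth-up events alternate between forward and stop, exactly the one-to-many behaviour described above.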

Claims (1)

1. A method for generating human-machine interaction information based on a combination of multiple feature points, characterized in that the method comprises the following steps:
Step (1): head video sequence acquisition, specifically: head visual information is acquired by an optical lens and a CMOS image sensor assembly;
Step (2): image sequence preprocessing, specifically: the acquired image sequence is processed by grayscale conversion, histogram equalization, and morphological operations to denoise the head image sequence and enhance its effective information;
Step (3): face and feature-point detection, specifically: the AdaBoost algorithm first detects the face in the image sequence enhanced in step (2); the face image is then taken as the input image, and the AdaBoost algorithm likewise locates the eyes, nose, and mouth, yielding the feature points;
Step (4): feature-point combination definition and identification, specifically: based on the feature points located in step (3), three feature-point combinations are defined, and recognition rules are designed to identify them;
Step (5): interaction-information generation, specifically: the corresponding human-machine interaction information is generated from the recognition results of the three feature-point combinations in step (4) according to a user-defined design.
CN2013101751997A 2013-05-10 2013-05-10 Human-machine interaction information generation method based on multi-feature-point combination Pending CN103279767A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101751997A CN103279767A (en) 2013-05-10 2013-05-10 Human-machine interaction information generation method based on multi-feature-point combination


Publications (1)

Publication Number Publication Date
CN103279767A true CN103279767A (en) 2013-09-04

Family

ID=49062282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013101751997A Pending CN103279767A (en) 2013-05-10 2013-05-10 Human-machine interaction information generation method based on multi-feature-point combination

Country Status (1)

Country Link
CN (1) CN103279767A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6492986B1 (en) * 1997-06-02 2002-12-10 The Trustees Of The University Of Pennsylvania Method for human face shape and motion estimation based on integrating optical flow and deformable models
CN101561710A (en) * 2009-05-19 2009-10-21 重庆大学 Man-machine interaction method based on estimation of human face posture
CN102982316A (en) * 2012-11-05 2013-03-20 安维思电子科技(广州)有限公司 Driver abnormal driving behavior recognition device and method thereof


Similar Documents

Publication Publication Date Title
Lian et al. Attention guided U-Net for accurate iris segmentation
CN104463100B (en) Intelligent wheel chair man-machine interactive system and method based on human facial expression recognition pattern
Meena A study on hand gesture recognition technique
CN108294759A (en) A kind of Driver Fatigue Detection based on CNN Eye state recognitions
CN109993068A (en) A kind of contactless human emotion's recognition methods based on heart rate and facial characteristics
Kumar Sign language recognition for hearing impaired people based on hands symbols classification
Khan et al. Nose tracking cursor control for the people with disabilities: An improved HCI
Rupanagudi et al. A video processing based eye gaze recognition algorithm for wheelchair control
Chaskar et al. On a methodology for detecting diabetic presence from iris image analysis
CN103258208A (en) Method for distinguishing whether head has intentions or not based on vision
Fan et al. Nonintrusive driver fatigue detection
CN117274960A (en) Non-driving gesture recognition method and system for L3-level automatic driving vehicle driver
CN103279767A (en) Human-machine interaction information generation method based on multi-feature-point combination
Taher et al. An extended eye movement tracker system for an electric wheelchair movement control
Wang et al. Objective facial paralysis grading based on p face and eigenflow
Viswanatha et al. An Intelligent Camera Based Eye Controlled Wheelchair System: Haar Cascade and Gaze Estimation Algorithms
Huynh A new eye gaze detection algorithm using PCA features and recurrent neural networks
Jerry et al. Convolutional neural networks for eye detection in remote gaze estimation systems
Karthigayan et al. Genetic algorithm and neural network for face emotion recognition
Yalla et al. Wheel chair movement through eyeball recognition using raspberry Pi
Kantharia et al. Facial behavior recognition using soft computing techniques: A survey
Liu et al. An efficient method for driver fatigue state detection based on deep learning
Benoit et al. Fusing bio-inspired vision data for simplified high level scene interpretation: Application to face motion analysis
Huang et al. Driver fatigue expression recognition research based on convolutional neural network
Zhou Eye-Blink Detection under Low-Light Conditions Based on Zero-DCE

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130904