Summary of the invention
The technical problem solved by the present invention is to provide a video capture control device and method that combine head tracking, face tracking, and motion tracking. When the target person has his or her back to the camera, head tracking and motion tracking are used to follow the person, which solves the problem of face tracking failure and achieves effective tracking of the target person.
To solve the above problem, a video capture control device of the present invention comprises a video acquisition unit, a face tracking unit, a head tracking unit, a motion tracking unit, a tracking fusion control unit, and a storage unit, wherein:
The video acquisition unit is configured to capture video images frame by frame;
The storage unit is configured to store a face template, a head template, and the tracking information of video images;
The face tracking unit is configured to perform face feature extraction on the current frame captured by the video acquisition unit and, if a face matching the face template is found, to output the face region position information corresponding to that face to the tracking fusion control unit;
The motion tracking unit is configured to perform motion feature extraction on the current frame captured by the video acquisition unit and, if a moving object is found, to output the motion region position information corresponding to that moving object to the tracking fusion control unit;
The head tracking unit is configured to perform head feature extraction on the current frame captured by the video acquisition unit and, if a head matching the head template is found, to output the head region position information corresponding to that head to the tracking fusion control unit;
The tracking fusion control unit operates on the outputs of the face tracking unit, the motion tracking unit, and the head tracking unit. If face region position information exists, it is used as the tracking information for the next frame. If no face region position information exists, the unit determines whether motion region position information exists: if it does, the motion region position information and the head region position information are fused, and the fused region position information is used as the tracking information for the next frame; if it does not, the tracking information of the video image held in the storage unit is used as the tracking information for the next frame. The tracking fusion control unit stores the determined tracking information in the storage unit;
The tracking fusion control unit uses the tracking information to control the capture of video images by the video acquisition unit.
Optionally, the device further comprises a preprocessing unit configured to preprocess the video image output by the video acquisition unit, remove noise from the current frame, and send the denoised video image to the face tracking unit, the head tracking unit, and the motion tracking unit.
Optionally, after determining the tracking information of the video image, the tracking fusion control unit generates a movement instruction corresponding to that tracking information and sends it to the video acquisition unit;
The video acquisition unit comprises a movable mechanical bracket and a dome camera that is mounted on, and controlled by, the mechanical bracket. The mechanical bracket receives the movement instruction sent by the tracking fusion control unit and moves accordingly, driving the dome camera.
The present invention also provides a video capture control method, comprising:
Performing face feature extraction, motion feature extraction, and head feature extraction on the acquired current frame;
When face feature extraction yields a face matching the face template, using the face region position information corresponding to that face as the tracking information;
When motion feature extraction yields a moving object, fusing the position information of the moving object's region with the position information of the head that head feature extraction found to match the head template, and using the fused position as the tracking information;
When motion feature extraction yields no moving object, using the tracking information of the previous frame as the tracking information;
Controlling the capture of the next frame with the tracking information.
Optionally, before face feature extraction, motion feature extraction, and head feature extraction are performed on the current frame, the current frame is preprocessed to remove noise.
Optionally, the face template and the head template are obtained by learning.
Optionally, using the face region position information as the tracking information comprises:
Determining a square region centered on the center of the face that matches the face template, with a first length as its side length;
Using the position information of the square region as the tracking information.
Optionally, fusing the position information of the moving object's region with the position information of the head matching the head template, and using the fused position as the tracking information, comprises:
Determining a square region centered on the center of the moving object, with a third length as its side length;
Determining a square region centered on the center of the head, with a second length as its side length;
Determining a square region centered on the midpoint of the line connecting the center of the moving object and the center of the head, with the smaller of the third length and the second length as its side length;
Using the position information of the last-determined square region as the tracking information.
Correspondingly, the present invention provides a recording and broadcasting system comprising the video capture control device.
Correspondingly, the present invention provides a recording and broadcasting method comprising the video capture control method.
Compared with the prior art, the present invention provides a head tracking unit, a motion tracking unit, a face tracking unit, and a tracking fusion control unit. The outputs of the head tracking unit, the motion tracking unit, and the face tracking unit are fused to form the tracking information. This avoids the tracking failure that occurs in the prior art when face tracking is used and the face is turned away from, or sideways to, the camera, and thus ensures the tracking performance of the recording and broadcasting system.
Embodiment
The inventors found that person tracking based on face tracking alone requires the face to be directly facing the camera; as a result, tracking can fail and the tracking results are unstable.
Accordingly, the present invention provides a video capture control device comprising a video acquisition unit, a face tracking unit, a head tracking unit, a motion tracking unit, a tracking fusion control unit, and a storage unit, wherein:
The video acquisition unit is configured to capture video images frame by frame;
The storage unit is configured to store a face template, a head template, and the tracking information of video images;
The face tracking unit is configured to perform face feature extraction on the current frame captured by the video acquisition unit and, if a face matching the face template is found, to output the face region position information corresponding to that face to the tracking fusion control unit;
The motion tracking unit is configured to perform motion feature extraction on the current frame captured by the video acquisition unit and, if a moving object is found, to output the motion region position information corresponding to that moving object to the tracking fusion control unit;
The head tracking unit is configured to perform head feature extraction on the current frame captured by the video acquisition unit and, if a head matching the head template is found, to output the head region position information corresponding to that head to the tracking fusion control unit;
The tracking fusion control unit operates on the outputs of the face tracking unit, the motion tracking unit, and the head tracking unit. If face region position information exists, it is used as the tracking information for the next frame. If no face region position information exists, the unit determines whether motion region position information exists: if it does, the motion region position information and the head region position information are fused, and the fused region position information is used as the tracking information for the next frame; if it does not, the tracking information of the video image held in the storage unit is used as the tracking information for the next frame. The tracking fusion control unit stores the determined tracking information in the storage unit;
The tracking fusion control unit uses the tracking information to control the capture of video images by the video acquisition unit.
The video capture control device of the present invention can be applied to person tracking, for example tracking a teacher in classroom teaching or tracking participants in a meeting. The following description takes classroom teaching as the application, with the teacher as the target person.
Refer to Fig. 1, a schematic structural diagram of a video capture control device according to one embodiment of the present invention. The video capture control device comprises a video acquisition unit 101, a face tracking unit 104, a head tracking unit 105, a motion tracking unit 106, a tracking fusion control unit 103, and a storage unit 107.
In a preferred embodiment of the present invention, the video capture control device further comprises a preprocessing unit 102 configured to preprocess the video image output by the video acquisition unit, remove noise from the current frame, and send the denoised video image to the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106.
The tracking fusion control unit 103 controls the operation of the video acquisition unit 101, the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106. The tracking fusion control unit 103 includes an operating unit through which the user turns the tracking fusion control unit 103 on and off. By entering a turn-on or turn-off instruction into the operating unit, the user causes the tracking fusion control unit 103 to turn the video acquisition unit 101, the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106 on or off accordingly.
In one embodiment, after determining the tracking information of the video image, the tracking fusion control unit 103 generates a movement instruction corresponding to that tracking information and sends it to the video acquisition unit 101.
In this embodiment, the video acquisition unit 101 comprises a movable mechanical bracket and a dome camera that is mounted on, and controlled by, the mechanical bracket. The mechanical bracket receives the movement instruction sent by the tracking fusion control unit 103 and moves accordingly, driving the dome camera. Specifically, the movement instruction includes information such as the direction and speed of movement; for example, the movement instruction may direct the mechanical bracket to move horizontally at a speed of 2 centimeters per minute.
Because the video acquisition unit 101 captures video images frame by frame, the tracking information corresponding to the video images is also updated frame by frame, and so is the movement instruction received by the mechanical bracket. The mechanical bracket can therefore drive the dome camera to follow and shoot the target person.
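As a rough illustration of how per-frame tracking information might be turned into a movement instruction, the following Python sketch maps the horizontal offset of the tracked region from the frame center to a direction and a speed. The description only states that an instruction carries a direction and a speed; the `MovementInstruction` type, the `gain` and `dead_zone` parameters, and the proportional rule are all assumptions made for this example and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class MovementInstruction:
    """Hypothetical instruction format: a direction and a speed."""
    direction: str  # "left", "right", or "hold"
    speed: float    # arbitrary units chosen for this sketch


def instruction_from_tracking(region_center_x: float,
                              frame_width: float,
                              gain: float = 0.01,
                              dead_zone: float = 20.0) -> MovementInstruction:
    """Derive a horizontal movement instruction from the tracked region.

    Assumption: the bracket should move toward the side of the frame on
    which the tracked region lies, faster when the region is farther from
    the frame center, and should hold still inside a small dead zone.
    """
    offset = region_center_x - frame_width / 2
    if abs(offset) <= dead_zone:
        return MovementInstruction("hold", 0.0)
    direction = "right" if offset > 0 else "left"
    return MovementInstruction(direction, abs(offset) * gain)


# One instruction per frame, since tracking information is updated per frame.
for center_x in (320.0, 520.0, 100.0):
    print(instruction_from_tracking(center_x, frame_width=640.0))
```

A dead zone is used here so that small jitter in the tracked position does not cause the bracket to move on every frame; whether the actual device does this is not stated.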
In this embodiment, the face tracking unit 104 is configured to perform face feature extraction on the video image captured by the video acquisition unit 101 and, if a face matching the face template is found, to output the face region position information corresponding to that face to the tracking fusion control unit 103. The head tracking unit 105 is configured to perform head feature extraction on the video image captured by the video acquisition unit 101 and, if a head matching the head template is found, to output the head region position information corresponding to that head to the tracking fusion control unit 103. The motion tracking unit 106 is configured to perform motion feature extraction on the video image captured by the video acquisition unit and, if a moving object is found, to output the motion region position information corresponding to that moving object to the tracking fusion control unit 103.
In one embodiment of the present invention, the face position information is the position information (position coordinates) of a square centered on the center of the face, with a first length as its side length. The face position information may alternatively be the position information of a circle centered on the face center with the first length as its diameter, or of a region of another shape. The first length can be set according to the application, the resolution of the dome camera of the video acquisition unit, and the required tracking accuracy. Because each frame captured by the dome camera is an array of pixels arranged in a matrix, the first length can be expressed as a distance spanning a number of pixels in the video image, so that the square whose side is the first length corresponds to a sub-region of the video image. The smaller the first length, the higher the tracking accuracy. For example, for a video capture control device whose dome camera has 300,000 pixels, the first length may be a distance of 500 pixels, in which case the face position information is a square region centered on the face center with a side length of 500 pixels.
The head position information is the position information (position coordinates) of a square centered on the center of the head, with a second length as its side length. The head position information may alternatively be the position information of a circle centered on the head center with the second length as its diameter, or of a region of another shape. The second length can be set according to the application, the resolution of the dome camera of the video acquisition unit, and the required tracking accuracy. The second length is determined in the same way as the first length, which is not repeated here. In one embodiment, for a video capture control device whose dome camera has 300,000 pixels, the second length may be a distance of 600 pixels, in which case the head position information is a square region centered on the head center with a side length of 600 pixels.
The motion region information is the position information (position coordinates) of a square centered on the center of the moving object, with a third length as its side length. The motion region information may alternatively be the position information of a circle centered on the center of the moving object with the third length as its diameter, or of a region of another shape. The third length can be set according to the application, the resolution of the dome camera of the video acquisition unit, and the required tracking accuracy. The third length is determined in the same way as the first length, which is not repeated here. In one embodiment, for a video capture control device whose dome camera has 300,000 pixels, the third length may be a distance of 800 pixels, in which case the motion region information is a square region centered on the center of the moving object with a side length of 800 pixels.
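The three square regions described above share one construction: a center point and a side length. A minimal Python sketch of that construction follows. The function name `square_region`, the corner-coordinate output format, and the example center coordinates are assumptions made for illustration; only the side lengths of 500, 600, and 800 pixels come from the embodiment.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)

# Example side lengths from the embodiment, in pixels.
FIRST_LENGTH = 500   # face region
SECOND_LENGTH = 600  # head region
THIRD_LENGTH = 800   # motion region


def square_region(center: Point, side: float) -> Box:
    """Return the corner coordinates of a square centered on `center`."""
    cx, cy = center
    half = side / 2
    return (cx - half, cy - half, cx + half, cy + half)


# Hypothetical detected centers, in pixel coordinates.
face_box = square_region((1000, 800), FIRST_LENGTH)
head_box = square_region((1000, 780), SECOND_LENGTH)
motion_box = square_region((1020, 900), THIRD_LENGTH)
print(face_box, head_box, motion_box)
```

The sketch does not clip the square to the image boundary; how regions near the edge of the frame are handled is not specified in the description.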
The tracking fusion control unit 103 obtains tracking information from the outputs of the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106. This tracking information serves as the tracking information for the next frame. After obtaining it, the tracking fusion control unit 103 stores the determined tracking information in the storage unit 107. The face tracking unit 104 usually works from the face template, the head tracking unit 105 recognizes the elliptical contour of the head, and the motion tracking unit 106 works from the brightness of the current frame. Consequently, the face tracking unit 104 tracks a person more reliably than the motion tracking unit 106, and the motion tracking unit 106 is more reliable than the head tracking unit 105. The tracking fusion control unit 103 of the present invention therefore gives priority to the output of the face tracking unit 104.
If the face tracking unit 104 outputs face position information, that face position information is used as the tracking information. If the face tracking unit 104 outputs no position information, it is determined whether the motion tracking unit 106 has output motion region information. If no motion region information has been output, the tracking target (the teacher in this embodiment) is stationary, and the tracking information of the previous frame is used directly as the tracking information of the video image. If the motion tracking unit 106 has output motion region information, the motion region information output by the motion tracking unit 106 is fused with the head region information output by the head tracking unit 105, and the result is used as the tracking information.
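The priority order just described can be sketched in Python as a single selection function. This is a sketch under stated assumptions, not the embodiment's implementation: the function name, the use of `None` for "no output", and the `fuse` callback are hypothetical. The description also does not say what happens when a moving object is found but no head matches the head template; the sketch falls back to the motion region in that case, which is purely an assumption.

```python
from typing import Callable, Optional, TypeVar

Region = TypeVar("Region")


def select_tracking_info(face: Optional[Region],
                         motion: Optional[Region],
                         head: Optional[Region],
                         previous: Region,
                         fuse: Callable[[Region, Region], Region]) -> Region:
    """Choose the tracking information for the next frame.

    `face`, `motion`, and `head` are the outputs of the three tracking
    units for the current frame, or None when a unit found nothing.
    `previous` is the tracking information of the previous frame.
    `fuse` combines a motion region and a head region into one region.
    """
    if face is not None:
        # Face tracking is the most reliable source and takes priority.
        return face
    if motion is None:
        # No face and no motion: the target is treated as stationary.
        return previous
    if head is None:
        # Assumption: this case is not covered by the description.
        return motion
    return fuse(motion, head)
```

Passing `fuse` as a parameter keeps the priority logic separate from the particular fusion rule, which the description says may vary in practice.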
The fusion of the present invention is as follows: a square region is obtained that is centered on the midpoint of the line connecting the center of the head and the center of the moving object and whose side length is the smaller of the second length and the third length; the position information of that square region is used as the tracking information. In practice, fusion may also be performed by other methods, which are not unduly limited here.
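The midpoint-and-smaller-side rule can be written out in a few lines of Python. The tuple layout `(center_x, center_y, side_length)` and the function name `fuse_regions` are assumptions made for this sketch; the rule itself follows the description.

```python
from typing import Tuple

# A square region as (center_x, center_y, side_length), in pixels.
Region = Tuple[float, float, float]


def fuse_regions(motion: Region, head: Region) -> Region:
    """Fuse a motion region and a head region into one square region.

    The fused square is centered on the midpoint between the two centers
    and takes the smaller of the two side lengths.
    """
    mx, my, motion_side = motion
    hx, hy, head_side = head
    center_x = (mx + hx) / 2
    center_y = (my + hy) / 2
    return (center_x, center_y, min(motion_side, head_side))


# Example with the side lengths from the embodiment (800 and 600 pixels)
# and hypothetical centers.
print(fuse_regions((400, 300, 800), (420, 200, 600)))  # (410.0, 250.0, 600)
```

Taking the smaller side length keeps the fused region at least as tight as the tighter of its two inputs, which is consistent with the earlier statement that a smaller length gives higher tracking accuracy.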
The face template, the head template, and the previous-frame video image of the present invention can be obtained by learning. The learning process comprises:
When the video capture control device is turned on, the tracking fusion control unit 103 locks the video acquisition unit 101 onto a predetermined area, and video images of the predetermined area are captured frame by frame. Each frame is sent to the preprocessing unit 102 and, after processing by the preprocessing unit 102, is sent to the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106;
The face tracking unit 104 performs face feature extraction on the video image to obtain a face, takes that face as the face template of the target person, and sends the face template to the storage unit 107;
The head tracking unit 105 performs head feature extraction on the video image to obtain a head, takes that head as the head template of the target person, and sends the head template to the storage unit 107;
The motion tracking unit 106 performs motion feature extraction on the video image to obtain the brightness information of the video image, and sends that brightness information to the storage unit 107;
After obtaining the face template and the head template from the storage unit 107, the tracking fusion control unit 103 takes the face position corresponding to the face template of the face tracking unit 104 as the initial tracking information, unlocks the video acquisition unit 101, and converts the initial tracking information into a movement instruction so that the video acquisition unit 101 follows and shoots the target person.
For example, the predetermined area may be the podium. When the video capture control device is turned on, the user stands on the podium while the face tracking unit 104, the head tracking unit 105, and the motion tracking unit 106 perform learning. Typically, the tracking fusion control unit 103 checks the storage unit 107 and, once it determines that the storage unit 107 has obtained the face template, the head template, and the brightness information, signals the user that teaching can begin. The learning period is usually short, roughly the time needed to capture 3 to 5 frames. The face template and the head template obtained by the face tracking unit 104 and the head tracking unit 105 might otherwise not belong to the same target person. To avoid this, in a preferred embodiment the face tracking unit 104 operates before the head tracking unit 105: after the face tracking unit 104 obtains the face template, the head tracking unit 105 performs head feature extraction near the position of the face template and takes the head closest to the face template position as the head template.
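The preferred embodiment's rule for keeping the face template and the head template on the same person, namely choosing the head closest to the face template position, can be sketched in Python as follows. The function name, the representation of positions as `(x, y)` center points, and the use of Euclidean distance are assumptions; the description only says "closest".

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def pick_head_template(face_center: Point,
                       head_candidates: Sequence[Point]) -> Optional[Point]:
    """Return the head candidate nearest to the face template position.

    Returns None when head feature extraction produced no candidates.
    """
    if not head_candidates:
        return None
    return min(head_candidates,
               key=lambda head: math.dist(face_center, head))


# Hypothetical example: one face and three head candidates in the frame.
face = (100.0, 100.0)
heads = [(300.0, 90.0), (110.0, 95.0), (100.0, 400.0)]
print(pick_head_template(face, heads))  # (110.0, 95.0)
```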
The present invention also provides a video capture control method. Refer to Fig. 2, a schematic flowchart of the video capture control method of the present invention. The method comprises:
Step S1: performing face feature extraction, motion feature extraction, and head feature extraction on the acquired current frame;
Step S2: when face feature extraction yields a face matching the face template, using the face region position information corresponding to that face as the tracking information;
When motion feature extraction yields a moving object, fusing the position information of the moving object's region with the position information of the head that head feature extraction found to match the head template, and using the fused position as the tracking information;
When motion feature extraction yields no moving object, using the tracking information of the previous frame as the tracking information;
Step S3: controlling the capture of the next frame with the tracking information.
Before face feature extraction, motion feature extraction, and head feature extraction are performed on the current frame, the current frame may also be preprocessed to remove noise. Face feature extraction, motion feature extraction, and head feature extraction are then performed on the preprocessed video image, which increases the speed of all three extractions.
Face feature extraction, motion feature extraction, and head feature extraction may be performed simultaneously or one after another in a given order.
In one embodiment of the present invention, the face template and the head template are obtained by learning. The learning method was described above in connection with the operation of the video capture control device and is not repeated here. The face template and the head template may also be stored in the storage unit in advance.
Using the face region position information as the tracking information comprises the following steps:
Obtaining the face that matches the face template;
Determining a square region centered on the center of the face, with the first length as its side length;
Using the position information of the square region as the tracking information.
Fusing the position information of the moving object's region with the position information of the head that head feature extraction found to match the head template, and using the fused position as the tracking information, comprises:
Obtaining the moving object;
Determining a square region centered on the center of the moving object, with the third length as its side length;
Obtaining the head that matches the head template;
Determining a square region centered on the center of the head, with the second length as its side length;
Determining a square region centered on the midpoint of the line connecting the center of the moving object and the center of the head, with the smaller of the third length and the second length as its side length;
Using the position information of the last-determined square region as the tracking information.
The present invention also provides a recording and broadcasting system comprising the video capture control device. The recording and broadcasting system can be used in settings such as teaching and meetings, where it can follow and shoot a teacher in a classroom or a speaker in a meeting.
The present invention also provides a recording and broadcasting method comprising the video capture control method.
In summary, the video capture control device and method provided by the present invention solve the tracking failure that occurs in person tracking when the face is turned away from, or sideways to, the camera, and ensure the effectiveness of target tracking.
Although the present invention is disclosed above by way of preferred embodiments, they are not intended to limit the present invention. Any person skilled in the art may, without departing from the spirit and scope of the present invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the present invention. Therefore, any simple modification, equivalent change, or refinement made to the above embodiments in accordance with the technical essence of the present invention, without departing from the content of the technical solution of the present invention, falls within the scope of protection of the technical solution of the present invention.