WO2005064525A1 - A method and apparatus for providing information relating to a body part of a person, such as for identifying the person

Info

Publication number
WO2005064525A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
providing
radiation
body part
person
Application number
PCT/DK2004/000924
Other languages
French (fr)
Inventor
Kield Martin Kieldsen
Claus Gramkow
Carsten Panch Pedersen
Ole K. Neckelmann
Original Assignee
Kield Martin Kieldsen
Claus Gramkow
Carsten Panch Pedersen
Neckelmann Ole K
Application filed by Kield Martin Kieldsen, Claus Gramkow, Carsten Panch Pedersen, Neckelmann Ole K
Publication of WO2005064525A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions

Definitions

  • The present invention relates to a method and an apparatus for providing information relating to a body part, such as a face or a hand, of a person, in order to e.g. identify the person.
  • This identification may be used for e.g. border control, passport control or at ATMs, where the owner of the card must be identified before being able to withdraw cash.
  • The present invention aims at improving the existing techniques, and in a first aspect, the invention relates to a method of providing information relating to a body part, such as a face, of a person, the method comprising: providing 3D information relating to the body part, determining, from the 3D information, one or more predetermined positions on the body part, providing radiation information relating to radiation emitted from, transmitted by, or reflected by the body part, and providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
  • Any body part may be used. It is normally preferred that the body part used is one normally exposed to the surroundings, such as the face or a hand of the person, in order to avoid wasting time and embarrassing the person by requiring the person to remove part of or all of his/her clothing. Apart from the clothing issue, nothing prevents the use of larger or other parts of the body for this method.
  • 3D information will, in the present context, mean information relating to the shape of an outer surface of the body part. As will become clear below, a number of manners of providing this information exist.
  • The positions determined using the 3D information are preferably positions on the outer surface of the body part. Such positions may be determined, as will become clear below, in any suitable manner of determining points of a surface. Saddle points, intersections with predetermined planes or lines, maxima, minima, extreme points, points determined by curvatures or other graphs, intersections with predetermined graphs or the like may all be used for determining these positions.
  • The radiation information may be information relating to any type of radiation transmitted by (such as X-rays or ultraviolet radiation), reflected by (such as visible, infrared or ultraviolet radiation) or emitted by (typically infrared radiation) the body part. All these types of radiation provide information relating to the body part, such as the temperature thereof (revealing blood vessels, scars, etc.), the colour thereof (revealing birth marks, scars, etc.), the internal structure thereof (blood vessels, bone structure), the texture thereof (scars, different types of skin), moisture (increased reflection), and so on.
  • The methods of providing the 3D information and the radiation information may be separated and optimized individually.
  • The correspondence between the radiation information and the 3D information aims at identifying the parts of the radiation information where radiation is detected or determined as emitted/reflected/transmitted by the body part at the predetermined points.
  • This correspondence may require scaling, translation, etc. in order to be performed.
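As a minimal sketch of such a correspondence, assuming the determined positions have already been projected into the plane of the radiation image, a similarity transform could map them to pixel coordinates. The scale, rotation and translation values below are illustrative assumptions, not values from this description:

```python
import numpy as np

# Hypothetical sketch: map positions derived from the 3D information into
# pixel coordinates of the radiation image with a similarity transform
# (scale s, rotation angle theta, translation t).
def similarity_transform(points, s, theta, t):
    """Apply p' = s * R(theta) @ p + t to an (N, 2) array of points."""
    c, si = np.cos(theta), np.sin(theta)
    R = np.array([[c, -si], [si, c]])
    return s * points @ R.T + np.asarray(t)

positions = np.array([[10.0, 20.0], [30.0, 40.0]])   # from the 3D data
pixels = similarity_transform(positions, s=2.0, theta=0.0, t=(5.0, -5.0))
```

In practice the transform parameters would come from the calibration of the cameras mentioned below.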
  • The relative positions of the cameras could be known or calibrated. This is standard in the art.
  • The step of providing the radiation information may comprise providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part, such as wherein the radiation information comprises information relating to a temperature of the body part at the one or more positions or parts.
  • Firstly, this temperature is hard to falsify in the body part, should it be desired to cheat the system. Secondly, this temperature is determined by a number of factors, of which some are visible on the outside of the body part (e.g. scars) and some are not (e.g. blood vessels). Thus, a large amount of information is obtainable using infrared/thermal information.
  • Alternatively, the step of providing the radiation information may comprise providing a 2D or 3D image of emission/reflection of visible radiation from the body part.
  • Visible radiation provides a number of alternative features of the person, such as birth marks, eye colour, hair colour and skin colour, in addition to the colour and structure of e.g. scars.
  • The step of providing the radiation information may, for the sake of simplicity, comprise providing the information as a 2D image.
  • The 3D information may be provided in a number of manners.
  • One such manner is one wherein the steps of providing the 3D information and the radiation information comprise providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken. This may be stereo vision or enhanced versions where more than two images are used.
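The standard stereo-vision relation behind this can be sketched as follows. The focal length and baseline values are assumed example values, not parameters given in this description:

```python
# Standard stereo vision: for two horizontally displaced, calibrated cameras
# with focal length f (in pixels) and baseline b (in metres), a point seen at
# horizontal pixel positions xL and xR has depth z = f * b / (xL - xR).
def depth_from_disparity(x_left, x_right, f=700.0, baseline=0.12):
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must be in front of both cameras")
    return f * baseline / disparity

z = depth_from_disparity(400.0, 380.0)   # a disparity of 20 pixels
```

Repeating this for every matched point pair yields the 3D surface information.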
  • The step of providing the 3D information may comprise the steps of: illuminating at least part of the body part with predetermined radiation, detecting radiation scattered by or reflected from the body part, and generating the 3D information on the basis of the radiation detected.
  • The step of providing the predetermined radiation may comprise providing radiation having a predetermined spatial distribution. This distribution may be a net or a scanning line provided over the body part, or an image of a number of coloured lines or spots/areas.
  • The step of providing the predetermined radiation could also comprise providing radiation having a predetermined phase, such as is used for time-of-flight measurements, which are also known for providing 3D - or depth - information relating to surfaces.
  • Any suitable manner of obtaining the information in the radiation information may be used, such as using the information at the exact positions corresponding to the predetermined positions on the body part.
  • One such manner is one wherein at least part of the information relating to the body part is determined from a position by: defining an area in the radiation information at the position, performing a predetermined mathematical operation on the area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
  • The area in the radiation information may be defined by: defining a first area in the 3D information at the position, and defining the area in the radiation information as an area corresponding to the first area.
  • The first area will normally be an area of the surface of the body part. This area may be determined from the single point, such as the area defined within a predetermined distance of the point, where the distance may be the distance between two other positions in the 3D information or an overall, predetermined distance.
  • Another such manner is one wherein at least part of the information relating to the body part is determined from a plurality of the positions by: defining an area in the radiation information defined by the plurality of positions, performing a predetermined mathematical operation on the area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
  • The area may be defined on the basis of the positions in the 3D information by: defining a first area in the 3D information defined by the plurality of positions, and defining the area of the radiation information as an area corresponding to the first area.
  • Any mathematical operation may be used, such as a mathematical operation selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function (such as for modelling blood vessels or scars).
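A sketch of these operations, under the assumption that the radiation information is a digitized 2D array of pixel values and that "the area" is a square window around a determined position (the window shape and the one-row polynomial fit are illustrative choices, not specified in the text):

```python
import numpy as np

# Compute several of the listed operations over an area of the radiation
# information around a determined position (row, col).
def area_features(image, row, col, half=2):
    patch = image[row-half:row+half+1, col-half:col+half+1].astype(float)
    gy, gx = np.gradient(patch)
    # Fit a polynomial along the central row, as a simple stand-in for
    # "a fit of a mathematical function over the information in the area".
    coeffs = np.polyfit(np.arange(patch.shape[1]), patch[half], deg=2)
    return {
        "min": patch.min(), "max": patch.max(), "mean": patch.mean(),
        "p90": np.percentile(patch, 90), "std": patch.std(),
        "grad_mag_mean": np.hypot(gx, gy).mean(),
        "fit_coeffs": coeffs,
    }

image = np.arange(100).reshape(10, 10)     # toy radiation image
feats = area_features(image, 5, 5)
```

The resulting dictionary of scalars and fit parameters is the kind of compact descriptor that could be stored and compared.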
  • The curvature may be a curvature of grey levels or pixel values of the information (if e.g. digitized).
  • The fitting of the function over the area may provide detailed information on e.g. blood vessels or scars, due to the fact that the function may be of any suitable type which may precisely fit any desired shape.
  • The result of the fit may therefore be a number of parameters describing the information precisely.
  • A very interesting embodiment is one relating to the use of the provided information for identification purposes.
  • Here, the method further comprises the step of comparing the information relating to the body part to one or more sets of: information relating to a body part of a person, and the identity (or other pertinent information) of the person, in order to identify the person.
  • These sets of information may be provided in a database which may be specific for a particular purpose, such as admittance control of a company, or may be general purpose, such as for use in identifying the user of a credit card or for checking the person's identity when crossing a border.
  • A simple use of this method is one where a supermarket wishes to count the number of customers but avoid counting the same customer twice.
  • This analysis may be provided anonymously in that the identity of the person is not required.
  • The radiation information and 3D information are provided each time a person enters the supermarket, and a match is sought with an earlier person (the database holding the information relating to the body part, and the identity information merely representing the fact that this information is already present). If a match is found, no new entry is made in the database and no increase in the number of customers is made.
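The counting logic above can be sketched as follows. Representing each entrant as a feature vector, and the specific distance metric and threshold, are illustrative assumptions:

```python
import numpy as np

# Anonymous customer counting: each entrant yields a feature vector derived
# from the 3D and radiation information; a new database entry (and a count
# increase) is made only when no sufficiently close match already exists.
def count_customer(db, features, threshold=0.5):
    for entry in db:
        if np.linalg.norm(entry - features) < threshold:
            return False            # match found: do not count again
    db.append(features)
    return True                     # new customer counted

db, count = [], 0
entrants = [np.array([1.0, 2.0]), np.array([5.0, 6.0]), np.array([1.1, 2.1])]
for f in entrants:                  # third entrant matches the first
    count += count_customer(db, f)
```

Note that no identity is stored, only the body-part descriptor itself.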
  • In one situation, the step of providing the radiation information is performed at a first point in time, and the providing of the 3D information and the comparing step are performed at a second, later point in time.
  • Then, the method may be used for actually identifying the body part or person in the radiation information, due to the fact that a match between the 3D information and the radiation information (providing the identity of the person from which the 3D information was derived) will, ideally, only be found if the radiation information was derived from the same person.
  • Preferably, the steps of providing the 3D information and the radiation information are performed within a predetermined period of time, such as within 10 seconds, such as within 1 second, preferably within 500 ms, such as within 10 ms. In this manner, it is ensured that the images or information relate to the same angles, grimace etc.
  • The step of providing the 3D information may comprise providing the 3D information using one or more first cameras, and the step of providing the radiation information may comprise providing the radiation information using one or more second cameras, the method then further comprising the step of providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras.
  • A second aspect of the invention relates to a more specific use, in that it relates to a method of determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the method comprising:
  • step 3) providing information relating to the body part of the first person from positions or parts in the image corresponding to the determined positions, 4) comparing the information provided in step 3) to first information relating to radiation emitted from or reflected by the body part of the second person from positions or parts corresponding to the determined positions, and
  • Corresponding body parts will normally be the same body parts of the two persons (e.g. the right hand of each of the two persons).
  • Here, the information, which in the first aspect all knowingly relates to the same person, does not necessarily belong to the same person.
  • One situation of this type is one where the step of providing the image is performed at a point in time earlier than any of the steps 1)-5).
  • A particular example is one where the image represents radiation information relating to a person who is sought after, e.g. by the police.
  • A suspect is apprehended, and the 3D information is provided.
  • The suspect's identity is known, so now the comparison is made as if the radiation information and 3D information did, in fact, relate to the same person. If the corresponding information did, in fact, identify the same identity, a strong indication is obtained that the apprehended suspect is, in fact, the person in the radiation information.
  • Step 4) may comprise providing the first information as an image representing the radiation information.
  • Alternatively, step 4) may comprise deriving the first information from a database in which the identity of the second person is related to the first information.
  • Step 3) may comprise determining a direction from which the image has been taken and rotating the 3D information in order to obtain an overlap between the 3D information and the image.
  • Thus, the 3D information may be rotated in order to generate the same angle as in the radiation information, e.g. as seen by an operator of a computer performing the present method, while still preserving the precise positions of the determined positions.
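Such a rotation is a rigid transform, so the determined positions keep their mutual geometry exactly. A minimal sketch, assuming a rotation about the vertical (y) axis by an angle matching the image's view direction:

```python
import numpy as np

# Rotate a 3D point cloud (and its determined positions) about the y axis.
# Being rigid, the transform preserves all distances between the positions.
def rotate_about_y(points, angle):
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    return np.asarray(points) @ R.T

landmarks = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
rotated = rotate_about_y(landmarks, np.pi / 2)   # quarter turn
```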
  • Information may not be present in the image for all the determined positions. In that situation, only part of the desired information is provided, and it may not be possible to fully identify the suspect; however, a number of other suspects may be ruled out, whereby a narrowing of the investigation is obtained.
  • Step 4) may comprise both providing a plurality of sets of first information and then comparing each set of first information with the information provided in step 3).
  • Such sets of first information may e.g. be images relating to different angles or directions of view of the body part. This may take into account different angles of view of different images analyzed.
  • It may be desired to compare images or information using the AAM (Active Appearance Model) routine, in that it is designed to compare or overlay images.
  • Here, the determined positions may be transferred from one set of information (e.g. an image) to another.
  • A third aspect of the invention relates to an apparatus for providing information relating to a body part, such as a face, of a person, the apparatus comprising: means for providing 3D information relating to the body part, means for determining, from the 3D information, one or more predetermined positions on the body part, means for providing radiation information relating to radiation emitted from, transmitted by, or reflected by the body part, and means for providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
  • The means for providing the radiation information may comprise means for providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part.
  • These means for providing the radiation information preferably comprise means for providing information relating to a temperature of the body part at the one or more positions or parts.
  • Alternatively, the means for providing the radiation information may comprise means for providing a 2D or 3D image of emission/reflection/transmission of visible radiation from the body part.
  • The means for providing the radiation information may comprise means for providing the information as a 2D image.
  • This may be a standard camera, such as a digital or analogue still camera or video camera.
  • The means for providing the 3D information and the radiation information may, in one embodiment, comprise means for providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken. This may be standard stereoscopy or enhanced versions using more than two images.
  • Alternatively, the means for providing the 3D information may comprise means for: illuminating at least part of the body part with predetermined radiation, detecting radiation scattered or reflected from the body part, and generating the 3D information on the basis of the radiation detected.
  • The means for providing the predetermined radiation may comprise means for providing radiation having a predetermined spatial distribution.
  • This spatial distribution may be a net-shaped radiation pattern on the body part, a scanning line of radiation, or a number of lines or areas/points of different colours.
  • The radiation provider may be e.g. a laser or a projector, and the detector may again be a camera of any desired type.
  • Alternatively, the means for providing the predetermined radiation may comprise means for providing radiation having a predetermined phase, such as is useful for time-of-flight measurements.
  • The means for determining the information relating to the body part may be adapted to determine at least part of the information relating to the body part from a position by: defining an area in the radiation information at the position, performing a predetermined mathematical operation on the area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
  • The area may be determined in the 3D information by: defining a first area or volume in the 3D information at the position, and defining the area in the radiation information as one corresponding to the first area.
  • Alternatively, the means for determining the information relating to the body part may be adapted to determine at least part of the information relating to the body part from a plurality of the positions by: defining an area in the radiation information defined by the plurality of positions, performing a predetermined mathematical operation on that area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
  • The area may be determined by: defining a first area in the 3D information defined by the plurality of positions, and defining the area of the radiation information as one corresponding to the first area.
  • The mathematical operation may be selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function.
  • An especially interesting embodiment of this third aspect is one comprising means for comparing the information relating to the body part to one or more sets of: information relating to a body part of a person, and the identity of the person, in order to identify the person.
  • Here, the means for providing the radiation information may be adapted to be operated at a first point in time, and the means for providing the 3D information and the means for comparing may be adapted to be operated at a second, later point in time.
  • Preferably, the apparatus further comprises means for controlling the means for providing the 3D information and the means for providing the radiation information so as to provide the 3D information and the radiation information within a predetermined period of time. In that manner, it is ensured that the images or information relate to the same person with the same position/angle, the same grimace etc.
  • The means for providing the 3D information may comprise one or more first cameras and the means for providing the radiation information may comprise one or more second cameras, the apparatus further comprising means for providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras. As described above, this facilitates the transfer of positions from one set of information to the other.
  • This correspondence may e.g. be in image data between pairs of one of the first cameras and one of the second cameras, or may be in 3D information provided on the basis of image data from the first and second camera(s), respectively.
  • A fourth aspect of the invention corresponds to the second aspect and relates to a particular situation where the apparatus is for determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the apparatus comprising:
  • The comparing means may comprise means for providing the first information as an image representing the radiation information.
  • Alternatively, the comparing means may comprise means for deriving the first information from a database in which the identity of the second person is related to the first information.
  • The apparatus may comprise means for providing the image at a point in time earlier than any of the means 1)-5) are operated.
  • The means for providing the information relating to the body part of the first person may comprise means for determining a direction from which the image has been taken and means for rotating the 3D information in order to obtain an overlap between the 3D information and the image.
  • The comparing means may comprise means for providing a plurality of sets of first information and may be adapted to compare each set of first information with the information provided by the means for providing the information relating to the body part of the first person. As mentioned above, this may be used for taking into account e.g. different characteristics of different images to be analyzed.
  • Figure 1 illustrates a setup for quickly obtaining the 3D and 2D information relating to a face of a person, and
  • Figure 2 illustrates an alternative setup for obtaining the same information.
  • The following preferred embodiments are directed to a method of identifying a person on the basis of 3D information relating to the face of the person as well as both an RGB image and an IR image of the face. It is clear that B/W images, X-ray images or any other type of image or radiation information (the radiation being transmitted through, reflected by, or emitted by the body part) relating to the face or other body part may equally well be used.
  • The method and apparatus derive 3D information from the face and derive information from the radiation information in order to provide information relating to the face. This information is compared to information in a database in order to identify the person.
  • This database needs to be set up.
  • This definition relates both to the definition of the actual information desired from the radiation information (and maybe also from the 3D information) and to how to structure the database and future searches.
  • The overall purpose of the profile scanning is to extract 3D information relating to the face being scanned.
  • The 3D information will consist of a data structure containing three-dimensional coordinate information about all detected points on the face.
  • The choices of methods for performing the profile scanning are, e.g.:
  • Laser scanner projecting a dot matrix pattern, with a camera displaced therefrom to detect the corresponding points in the pattern matrix.
  • One setup of the camera positioning may be seen in fig. 1, wherein the face 50 of the person is positioned in relation to four mirrors 20 positioned on an ellipse having two focal points.
  • The face 50 is directed toward the middle of the four mirrors 20, which are adapted to direct light from the face 50 toward the other focal point of the ellipse.
  • There, four other mirrors, illustrated at 30, are provided which direct the light from the face to a camera 25.
  • The use of the elliptic setup ensures that the light path from the camera 25 to the face 50 is the same for each of the mirrors 20.
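This equal-path-length property follows directly from the definition of an ellipse: for any point P on the ellipse with foci F1 and F2 and semi-major axis a, |P - F1| + |P - F2| = 2a. With the face at one focus and the camera receiving via the other, every mirror 20 on the ellipse gives the same face-to-camera path length. A quick numerical check, with illustrative semi-axes:

```python
import numpy as np

a, b = 2.0, 1.5                          # illustrative semi-axes
c = np.sqrt(a**2 - b**2)                 # focal distance from centre
f1, f2 = np.array([-c, 0.0]), np.array([c, 0.0])

# Any candidate mirror position on the ellipse yields the same total path.
for t in np.linspace(0.0, 2.0 * np.pi, 7):
    p = np.array([a * np.cos(t), b * np.sin(t)])
    path = np.linalg.norm(p - f1) + np.linalg.norm(p - f2)
    assert abs(path - 2 * a) < 1e-9
```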
  • The mirrors 30 are, in fact, positioned along an axis extending out of the plane of the figure, so that the camera 25 is only able to receive light from a single mirror 20 at a time, via the pertaining mirror 30.
  • Thus, the camera 25 is presented with different images relating to different directions of view of the face 50.
  • As the mirrors 30 each have a sufficient extension along the direction of movement, sufficient time is allowed for the camera 25 to provide the image even though the mirror 30 is, in fact, moving relative to the camera.
  • The four images may be obtained in 0.2 s due to the fact that the camera is not moved and that the reciprocating mass may be made quite small. If the camera were moved, time would also have to be spent allowing the camera to come to rest, in order to avoid shaken and unclear images.
  • The camera 25 may be adapted to provide both images of visible radiation and of IR radiation emitted by or from the face 50. Otherwise, one of the mirrors 30 may be directed toward another camera providing the information not provided by the camera 25. These two cameras may be positioned, relative to each other, along a line extending perpendicular to the plane of figure 1. In addition, the relative positioning of the cameras is preferably calibrated and known, in order to facilitate referring positions in images of one camera to positions in images of the other camera.
  • The setup may additionally provide means (not illustrated) for providing structured light to the face 50, if the selected method of providing the 3D information requires such light.
  • A computer or controller 28 is provided for controlling the camera(s) 25, any light emitters and the movement of the mirrors 30, and for generating the 3D information and other information, deriving the information from the data received, comparing that data to a database of data, and identifying the person.
  • This computer 28 may alternatively be a cluster of computers in a network or any other type of processor/group of processors.
  • The computer 28 may in addition comprise a monitor for providing information to an operator, such as for providing different alternatives to the operator (e.g. different alternatives for the identity of the person) or for providing additional information relating to the person or persons having been imaged/identified.
  • Figure 2 illustrates an alternative embodiment dispensing with the mirrors 20 and instead providing three cameras 31, 32, and 33, where the cameras 31 and 33 are used for providing 3D information relating to the body part 50 using standard stereo vision. This means that the relative positions of the cameras 31 and 33 are known, calibrated or determinable.
  • The camera 32 is used for generating IR image data relating to the body part 50 at the same point in time. Naturally, more cameras may be used for this task in order to also provide the IR data as 3D data.
  • The cameras 31-33 may be moved vertically in order to take into account different heights of different persons. Preferably, however, the relative positioning thereof is fixed.
  • In this embodiment, the 3D data is provided using stereo vision using two or more cameras.
  • Alternatively, structured light may be used, as described in the following:
  • The system comprises at least:
  • A device controlled by the computer 28 to project a light pattern on the body part 50.
  • A digital camera 25, 31, 32 or 33 displaced horizontally from the projector, and rotated to be able to obtain images of the projected pattern on the body part.
  • An additional camera can be placed symmetrically on the other side of the projector to increase robustness by avoiding occluded projector points not visible to one camera.
  • Two vertically displaced cameras can be placed above and under the projector in order to increase the 3D scan resolution remarkably.
  • An enhancement of the system could be to place two projectors that form an angle towards the person. This way, light stripes can be projected on the ears of the person in order to obtain 3D information about these as well.
  • The generator specifies a pattern where the colours of three consecutive stripes have not occurred previously in the sequence.
  • The algorithm resembles a de Bruijn pattern, although several restrictions regarding colours have been incorporated into the model.
  • The colours used are the ones in the corners of the RGB colour cube: red, green, yellow, white, blue, magenta and cyan. The black colour is reserved, as explained later.
  • Black stripes are inserted between the colour stripes, in order to increase the intensity changes when a stripe transition occurs. All stripes have equal width.
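A generator in the spirit of the description above might be sketched as follows. The greedy selection strategy and the exact restrictions (no two adjacent stripes of the same colour, unique colour triples) are assumptions, since the patent text does not give the algorithm in detail:

```python
# Colour stripes are drawn from the seven non-black corners of the RGB cube;
# black separator stripes are interleaved; each triple of consecutive colour
# stripes must not have occurred earlier in the sequence.
COLOURS = ["red", "green", "yellow", "white", "blue", "magenta", "cyan"]

def generate_stripes(n):
    seq, seen = [], set()
    while len(seq) < n:
        for colour in COLOURS:
            if len(seq) >= 2:
                triple = (seq[-2], seq[-1], colour)
                if triple in seen or colour == seq[-1]:
                    continue
                seen.add(triple)
            elif seq and colour == seq[-1]:
                continue
            seq.append(colour)
            break
        else:
            break  # no valid extension found; stop early
    # interleave black separator stripes between the colour stripes
    pattern = []
    for colour in seq:
        pattern.extend([colour, "black"])
    return pattern[:-1]

pattern = generate_stripes(10)
```

Because every colour triple is unique, detecting any three consecutive colour stripes localizes the position within the projected pattern.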
  • the lines described can either be vertical or horizontal. The choice is highly dependent on the choice of line pattern as well as the setup of the system. In order to take advantage of the epipolar geometry discussed later, vertical lines are preferred when the cameras are horizontally displaced from each other. When using horizontal lines, the cameras are preferably displaced vertically from each other with respect to the projector.
  • a dynamic programming approach is utilized, based on the article by Li Zhang, Brian Curless and Steven M. Seitz (Li Zhang, Brian Curless and Steven M. Seitz: "Rapid Shape Acquisition Using Colour Structured Light and Multi-pass Dynamic Programming". Department of Computer Science and Engineering, University of Washington, Seattle WA 98195, USA. From the 1st IEEE International Symposium on 3D Data Processing, Visualization, and Transmission, June 2002, pages 24-36.)
  • One of the fundamentals of the algorithm is a dynamic programming algorithm, which incorporates a score-based line detection for each projector stripe. This is described in detail in the article mentioned above.
  • multiple images can be obtained by time-shifting the image pattern.
  • the RGB and IR spectral image information is added to the 3D model by superposition. This is achieved by calibration of the various acquisition modules.
  • the images are acquired from (such as by cameras provided at) positions very close to each other compared to the distance to the person. This implies that view-point differences can be approximated by pure translations in the image. This gives rise to a very simple calibration that may be performed on-line.
  • the acquisition modules, or view-points, can be spaced further apart. The requirement is then a more elaborate calibration, where information about the offsets and view direction differences between view-points is calibrated off-line.
  • the 3D model is projected to match the 2D spectral images, and the RGB and IR information is added by employing the image correspondences.
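The superposition step above can be sketched as sampling a 2D spectral image at the projection of each 3D model node. The function below is illustrative only: it assumes a pinhole camera with an intrinsic matrix `K`, and models the view-point difference between acquisition modules as the pure image translation `offset` of the simple on-line calibration:

```python
import numpy as np

def add_spectral_info(nodes_3d, image, K, offset=(0.0, 0.0)):
    """Sample a 2D spectral image (RGB or IR) at the projection of each
    3D model node (an N x 3 array). Hypothetical sketch: pinhole
    projection plus a pure-translation offset between modules."""
    x, y, z = nodes_3d[:, 0], nodes_3d[:, 1], nodes_3d[:, 2]
    # pinhole projection: u = fx*x/z + cx, v = fy*y/z + cy
    u = np.round(K[0, 0] * x / z + K[0, 2] + offset[0]).astype(int)
    v = np.round(K[1, 1] * y / z + K[1, 2] + offset[1]).astype(int)
    h, w = image.shape[:2]
    u = np.clip(u, 0, w - 1)        # clamp to the image borders
    v = np.clip(v, 0, h - 1)
    return image[v, u]              # one spectral sample per 3D node
```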
  • This step may be performed before or after the actual determination (below) of the landmarks in the 3D information.

Determining the landmarks in the 3D information
  • the system relies on accurate placement of facial landmarks to ensure that the recognition of each individual is based on the shape and spectral information in comparable regions.
  • Landmarks are preferably positioned in a dense pattern near regions with a high information content (e.g. eyes, nose, mouth), and more sparsely in low information regions (e.g. cheeks, jaw, upper forehead).
  • the system uses geometrical features like saddle points, min/max/total surface curvature, crest lines, and extreme points to place landmarks automatically.
  • the system uses the spectral information to aid the placement of landmarks on e.g. eyebrows and lips.
  • the nose and ears are identified as maxima, the back of the nose as a saddle point, and the cheeks may be determined at an intersection of the 3D model and predetermined planes.
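The curvature-based placement can be illustrated on a depth map. The sketch below computes Gaussian curvature, whose sign separates elliptic points such as the nose tip from saddle points; it is an illustration of the geometric idea, not the patent's landmark detector:

```python
import numpy as np

def gaussian_curvature(z):
    """Gaussian curvature K of a depth map z(x, y) on a regular grid.
    K < 0 marks saddle points (e.g. the back of the nose); K > 0 marks
    elliptic points such as the nose-tip maximum, which are candidate
    landmark locations. Illustrative sketch only."""
    zy, zx = np.gradient(z)          # first derivatives on the grid
    zxy, zxx = np.gradient(zx)       # second derivatives
    zyy, _ = np.gradient(zy)
    # curvature formula for a Monge patch z = f(x, y)
    return (zxx * zyy - zxy**2) / (1.0 + zx**2 + zy**2) ** 2
```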
  • sample points both generate 3D information and may at the same time be used as points for determination of information from the RGB image as well as from the IR image.
  • the actual positions of the landmarks may be used for determining e.g. a temperature or a colour/texture at that specific position in the RGB/IR images, or may be used for determining other positions (defined by one or more of the landmarks) where this information may be derived.
  • information may be derived from areas around a landmark (such as a circle or ellipse defined by one or more landmarks and predetermined data, such as distances) or multiple landmarks (defining e.g. corners of a polygon defining the area).
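The derivation of information from a landmark-defined area can be sketched as a mask-and-reduce operation. The even-odd polygon test and the particular statistics below are illustrative choices, not the patent's prescribed operations:

```python
import numpy as np

def area_statistics(image, polygon):
    """Derive values (mean/min/max here, as examples) from the pixels
    inside a polygonal area whose corners are landmark positions given
    as (row, col) pairs. Illustrative sketch."""
    h, w = image.shape[:2]
    rr, cc = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    inside = np.zeros((h, w), bool)
    n = len(polygon)
    for i in range(n):               # even-odd point-in-polygon rule
        (r1, c1), (r2, c2) = polygon[i], polygon[(i + 1) % n]
        crosses = (r1 > rr) != (r2 > rr)
        with np.errstate(divide="ignore", invalid="ignore"):
            edge_c = (c2 - c1) * (rr - r1) / (r2 - r1) + c1
            inside ^= crosses & (cc < edge_c)
    vals = image[inside]
    return {"mean": vals.mean(), "min": vals.min(), "max": vals.max()}
```

The warmest spot of a region in an IR image would correspond to `"max"`, and a pale birth mark in an RGB channel to `"min"`.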
  • An area may be used for determining e.g. blood vessels, birth marks, scars or the like. Blood vessels are relatively hot, whereas scars are relatively cold. Thus, the area of colder/warmer skin (paler or darker skin) may be interesting, or the coldest/warmest spot may be interesting. Also, the shape of a blood vessel may be interesting.

Resampling the 3D model
  • the different areas or patches may be used as the areas above.
  • the information contained in the reduced 3D model is organized in vectors that are used in the shape and texture analysis stage.
  • the shape information is put in a vector by concatenating the three coordinates for each node in the model in one long vector.
  • colours (RGB), temperatures, and the derived measures, are concatenated to form two vectors encapsulating colour and temperature information.
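The vectorization above can be sketched as plain concatenation; argument names are illustrative:

```python
import numpy as np

def vectorize(nodes, rgb, temperature):
    """Pack one person's resampled model into the three analysis
    vectors: shape (x, y, z of every node concatenated into one long
    vector), colour (R, G, B per node) and temperature (one value per
    node). A minimal sketch."""
    shape_vec = np.asarray(nodes, float).reshape(-1)
    colour_vec = np.asarray(rgb, float).reshape(-1)
    temp_vec = np.asarray(temperature, float).reshape(-1)
    return shape_vec, colour_vec, temp_vec
```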
  • the steps of obtaining the 3D information as well as the RGB/IR images, vectorization etc. are repeated for each of a number of persons in order to build the recognition database. All types of persons that should be recognized by the 3D system are preferably added in this manner. It should, however, be noted that it is quite possible to add additional persons after the database has been built.
  • This variation could be gestures, short-term and long-term colour and temperature variation, artefacts (like glasses), and in general illustrate the stability of the landmarks selected.
  • the result of this analysis is built into the model through the shape analysis step below.
  • the extracted information is used to perform a look-up in a database in the recognition process. It is therefore advantageous to employ a data reduction scheme on the three vectors from the vectorization step. Data reduction can be performed using e.g. Principal Component Analysis (PCA), Maximum Autocorrelation Factors (MAF), or the FastMap method.
  • Pre-processing ensures that the shape and texture parameters in the reduced data are spent only on describing the variations with relation to an average person, sitting in a fixed position, and with a standard light setting.
  • Normalization and Centralizing may, e.g., be seen in:
  • "FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional Multimedia Datasets" by Faloutsos, C. and Lin, King-Ip, 1995, Institute of Systems Research and Dept. of Computer Science, Univ. of Maryland, College Park, for the ACM 1995 SIGMOD conference.
  • An Average Person is determined from the three vectors deduced in the vectorization step. This is obtained by calculating the mean vector of the colour and temperature vectors, and by performing a so-called Procrustes Mean analysis on the shape vector. This analysis leads to a mean shape where differences in face position, orientation, and scale are eliminated. The scale is recorded for recognition purposes.
  • the Procrustes Mean analysis is based on reliable regions in the eye and nose neighbourhood.
  • the three vectors from the vectorization step are modified to describe only the variation with respect to the average person. That is, the colour and temperature vectors are centralized around the average person by translation, and the shape vector is centralized, rotated and normalized according to the parameters determined through the Procrustes Mean analysis.
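The Procrustes normalization of the shape vector can be sketched with a standard alignment construction (the Kabsch/SVD method); this is a generic formulation over an N x 3 node array, not the patent's exact procedure, and the removed scale is returned so it can be recorded for recognition:

```python
import numpy as np

def procrustes_align(shape, mean_shape):
    """Centre, scale-normalize and rotate one person's shape (N x 3
    node array) onto the mean shape. Sketch of a standard Procrustes
    alignment; returns the aligned shape and the removed scale."""
    X = np.asarray(shape, float)
    X = X - X.mean(axis=0)              # remove position
    scale = np.linalg.norm(X)
    X = X / scale                       # remove scale (recorded)
    M = np.asarray(mean_shape, float)
    M = M - M.mean(axis=0)
    M = M / np.linalg.norm(M)
    # optimal rotation by the Kabsch/SVD construction
    U, _, Vt = np.linalg.svd(X.T @ M)
    R = U @ Vt
    if np.linalg.det(R) < 0:            # exclude reflections
        U[:, -1] *= -1
        R = U @ Vt
    return X @ R, scale
```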
  • the three vectors from all individuals output from the normalization step are used as input in a Principal Component Analysis (PCA).
  • PCA determines the directions in the vector space that encompass most of the variation across the set of all persons.
  • the output of the PCA is a new coordinate basis with a variance associated with each coordinate axis describing how much of the total variation the axis accounts for.
  • the axes with little variation are discarded and the remaining axes define a new coordinate system that allows each person to be described with a reduced number of parameters.
  • These parameters are the recognition descriptors.
  • the information in the IR, RGB, and shape vectors leads to three reduced recognition descriptor vectors.
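The reduction from normalized vectors to recognition descriptors can be sketched with a plain SVD-based PCA; the variance threshold used to discard axes is an illustrative parameter:

```python
import numpy as np

def pca_descriptors(vectors, keep=0.95):
    """Reduce per-person vectors (one row per person) to recognition
    descriptors: keep only the principal axes that together account for
    a fraction `keep` of the total variation. A minimal sketch."""
    X = vectors - vectors.mean(axis=0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    var = s**2 / (s**2).sum()          # variation per coordinate axis
    k = int(np.searchsorted(np.cumsum(var), keep)) + 1
    axes = Vt[:k]                      # the retained coordinate basis
    return X @ axes.T, axes            # descriptors and basis
```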
  • If the average person in the normalization step is calculated for each individual that has been acquired multiple times, and each of these persons is only centralized/normalized with respect to her own average, it is possible to perform a PCA that captures the intra-person variation, i.e. the variation displayed by the database population as a result of different gestures, long-term and short-term temperature and colour variation, ill-determined landmarks etc. This analysis is performed in a separate step to determine sub-spaces in the vector spaces that are not reliable in discriminating between different individuals.
  • the PCA is based on the common average person determined from all persons in the recognition database. This PCA is performed subject to the constraint that intra-person variation cannot be a descriptor in the new coordinate system. This is achieved by pre-processing the three input vectors for each person in order to remove the intra-person variation components prior to performing the inter-person PCA. The resulting new coordinate system will describe the variation between individuals in the database independent of intra-person variation.
  • Each axis in the new coordinate system output by the PCA describes a mode of variation.
  • the coordinate axes are ordered according to descending amounts of variation, but there need be no natural physical interpretation to these axes. Once the axes are selected as recognition descriptors, the actual ordering becomes insignificant and any linear mixing of the axes constitutes an equally well-suited coordinate basis.
  • the physical features are chosen to match common verbal descriptors or those of other biometric systems, e.g. the temperature between the eyes or the mean temperature of an area defined on the cheek of the person.
  • Another descriptor may be a mean colour of the cheek, the colour of an eye - or the reddest point on the cheek.
  • PCA descriptors such as the size of the head, the distance between the eyes, the length or width of the nose, and the length-to-width ratio of the face, etc.
  • the desired physical features are placed in the first positions in the sequence of descriptors.
  • the variation corresponding to each feature is recorded.
  • the remaining coordinate axes that are not associated with a physical feature are kept as-is.
  • the result of this step is the actual information desired, in the form of a single vector - called the Condensed Parameter Vector (CPV) - used to perform identification of the person.
  • a larger amount of information was obtained in order to obtain information to select from, but this information has now been reduced. It is seen that information from both the IR image, the RGB image as well as from the 3D information may be selected.
  • only the landmarks and information relating to the result of the modification step (or the PCA step, in case the modification step is not performed) need be obtained.
  • the coordinate axes output from the modification (or the PCA) step are the final coordinate basis describing the variation of the database with respect to colour, temperature and shape.
  • Each individual is now parameterized with respect to this basis by linear regression, i.e. each of the three vectors from each person is projected upon the coordinate axes of the corresponding basis to give the recognition coordinates for colour, temperature, and shape.
  • the measure of equality is the length of the difference vector. This length can be measured using isotropic scales or a different scale for each axis, reflecting the observed variation along the axes.
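The measure of equality can be sketched directly; the per-axis scales are the observed variations along the axes (names illustrative):

```python
import numpy as np

def similarity_distance(cpv_a, cpv_b, axis_scales=None):
    """Length of the difference vector between two condensed parameter
    vectors, optionally weighting each axis by its observed variation
    (a non-isotropic scale). A minimal sketch."""
    d = np.asarray(cpv_a, float) - np.asarray(cpv_b, float)
    if axis_scales is not None:
        d = d / np.asarray(axis_scales, float)
    return float(np.sqrt((d * d).sum()))
```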
  • the actual searching may be divided into a number of sub steps:
  • the information is sent to the database in order to perform a similarity search on the data already stored.
  • Some of the data relating to the person may, however, be used to pre-index into the database in order to decrease search time.
  • a preliminary search could be performed to filter out the most unlikely of entries in the database. This is done by setting upper and lower thresholds for entries in the CPV that concern height, eye colour, gender, etc. For instance, if the height is measured to be 170 cm, all entries in the database not between 165 and 175 cm are filtered out.
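The threshold pre-filtering on the height example can be sketched as below; the field name and tolerance are illustrative, not fields prescribed by the patent:

```python
def prefilter(entries, height_cm, tol=5):
    """Preliminary search: filter out the most unlikely database
    entries by thresholding an indexed field, e.g. a measured height of
    170 cm keeps only entries between 165 and 175 cm. Sketch only."""
    return [e for e in entries
            if height_cm - tol <= e["height"] <= height_cm + tol]
```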
  • the top candidates, such as the 10 or 100 best matches, may be chosen for final analysis and decision-making.
  • correlation techniques are used to obtain a similarity measure between the recognition and database images.
  • 3D profile and IR information is compared using statistical methods such as least squares, etc.
  • if the discrimination is not satisfactory, the matching process may be set to yield an unsuccessful result. If, however, the discrimination is satisfactory, all information regarding the matching database entry is finally returned to e.g. an operator for display.
  • the PCA procedure described in the previous steps is well-suited for data reduction and fast database searching, but it may, depending on the situation, not be suited for capturing small individual details that may be of significance in the recognition process. By storing all raw material, it is possible to discriminate further between a set of candidates drawn from the database.
  • the final discrimination can be based on the residual, i.e. the shape and texture not accounted for by the PCA model.
  • a birth mark could be a feature that is not described by the PCA model, but which is still a valid discriminative feature to refine the recognition, or to request extra attention from e.g. a system operator.
  • the system may store all available identification information such as ID number (country specific: personal ID number, passport number etc.), height, gender, race, nationality etc. This information may also be used to organize the database to speed up the search for recognition candidates.
  • the knowledge of gender and race may be used to monitor if the variation between races or gender is significant compared to the variation within the same race or gender. If this is so, the system may split the database population and perform separate texture and shape analyses for each sub-class. This may keep the variation models compact, and is believed to strengthen the discriminative power of the models.
  • A situation where the full desired set of data, i.e. the 3D information and the RGB and/or IR images, is not available for the same person may be one where a photo taken of a person is to be connected to a person e.g. in the custody of the police.
  • the photo may be a mug shot or a photo taken at a crime scene.
  • the 3D-information is derived from the person and the landmarks desired are determined.
  • the 3D model (with the landmarks) is rotated in order to have the same angle of view as that of the photo.
  • the landmarks are still provided on the basis of the 3D information of the suspect. Rotating this model will not reduce the precision with which the landmarks are positioned.
  • the information from the photo is now derived as it normally would be. It is noted that if the photo is an RGB image and no IR information exists from the crime scene, only information relating to the RGB image may be derived. This gives a non-optimal condensed parameter vector, also due to the fact that only information visible in the photo from that angle of view is available.
  • This method may also be used at different types of gates or borders, where each person passing is automatically compared to registers of wanted persons.
  • One manner of actually comparing or overlaying images is using the AAM model.
  • the AAM model is normally built from regular 2D images (B/W or RGB) with manually placed landmarks.
  • the shape and texture information give rise to a PCA model that accounts for the primary variation in the observed data.
  • the landmarks and the PCA regression parameters are estimated simultaneously in an iterative procedure in order to produce a good match with the input image. This can be done in several ways, but in general, a strategy is employed to produce possible outcomes of the model, and these outcomes are evaluated in terms of their similarity with the image. If the similarity is improved, the parameters are updated, and the algorithm will eventually achieve a (sub-) optimal solution to the match problem.
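The generate-evaluate-update strategy above can be sketched as a simple stochastic hill climb; this is a generic stand-in for the AAM parameter search, with the similarity function and perturbation scale as placeholders:

```python
import numpy as np

def fit_model(image_similarity, initial_params, perturb, n_iter=300, seed=0):
    """Iteratively propose perturbed model parameters, keep them
    whenever the similarity with the input image improves, and converge
    to a (sub-)optimal match. `image_similarity` scores a parameter
    vector (higher is better). Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    params = np.asarray(initial_params, float)
    best = image_similarity(params)
    for _ in range(n_iter):
        candidate = params + rng.normal(scale=perturb, size=params.shape)
        score = image_similarity(candidate)
        if score > best:            # similarity improved: update
            params, best = candidate, score
    return params, best
```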
  • the parameters are the texture and shape descriptors that can be used to recognize the person.
  • the 2D shape and texture model is built without manually annotating the images with landmarks. Instead, the 3D face models are projected to 2D including their automatically found landmarks. This leads to 2D images with landmarks that can be used to build the AAM model directly.
  • This approach makes it possible to build AAM models from any desired view-point, i.e. one model can be built for frontal images and another model can be built for profile images. It is also possible to model different light settings.
  • the recognition proceeds by parameter estimation, and by recognition in the same way as in the 3D case described above.

Abstract

A method and an apparatus for providing information relating to a body part of a person. 3D information is provided using visible radiation and 2D or 3D information is provided using IR radiation. Positions in the visible 3D information are provided and correlated to positions in the IR information, and the information relating to the person is derived from the positions in the IR information. The information may be derived at different points in time so that persons viewed on surveillance cameras may subsequently be identified.

Description

A METHOD AND AN APPARATUS FOR PROVIDING INFORMATION RELATING TO A BODY PART OF A PERSON, SUCH AS FOR IDENTIFYING THE PERSON
The present invention relates to a method and an apparatus for providing information relating to a body part, such as a face or a hand, of a person in order to e.g. identify the person.
Especially automatic identification of persons has been sought after for a number of years. This identification may be used for e.g. border control, passport control or at ATM's where the owner of the card must be identified before being able to withdraw cash.
Today's identification is normally performed by manual identification or the use of different types of cards or access tokens optionally accompanied by the entering of a code on a keyboard. However, identification is desired at more points in everyday life. Also, a higher security (positive identification and lower risk of being fooled by e.g. thieves or terrorists) is desired, while the cost of the system should be as low as possible.
Identification and aspects thereof may e.g. be seen in the following references:
US-A-5,163,094, 6,301,370, 4,699,149, 6,097,029 and 6,526,161, US 2001/31072, 2001/31073, 2002/106114, 2002/122573, 2002/136435, 2003/53664, 2003/108223, 2003/123713, and 2003/215115, WO00/129769, 02/09024, 99/27838, DE 40 09 051 and 197 12 844, "Automatic Face Authentification..." by Beumier and Acheroy, British Machine Vision Conference, 1998, "History, Current Status, and Future of Infrared Identification" by Francine Prokoski, IEEE, 2000, "Comparison of Visible and Infra-Red..." by Wilder et al, IEEE, 1996, pp 182-187, "Face Identification Using Thermal..." by Yoshitomi et al, IEEE International Workshop on Robot and Human Communication, IEEE, 1997, pp 374-379, "Skin Colour-Based Video..." by Sigal et al, Boston University Computer Science Technical Report No. 2003-006, 25/3-2003, "Illumination Invariant Face Recognition..." by Socolinsky et al, IEEE, 2001, pp 1-527 to 1-534, "Human Identification Technical Challenges" by Phillips, IEEE ICIP, 2002, pp 1-49 to 1-52, and "FastMap: A Fast Algorithm..." by Faloutsos and Lin, ACM 1995 SIGMOD conference.
The present invention aims at improving the existing techniques, and in a first aspect, the invention relates to a method of providing information relating to a body part, such as a face, of a person, the method comprising: providing 3D information relating to the body part, determining, from the 3D information, one or more predetermined positions on the body part, providing radiation information relating to radiation emitted from, transmitted by, or reflected by the body part, providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
In the present context, any body part may be used. Normally, it is preferred that the body part used is normally exposed to the surroundings, such as the face or a hand of the person, in order to avoid wasting time and embarrassing the person by requiring the person to remove part of or all of his/her clothing. Apart from the clothing issue, nothing prevents the use of larger or other parts of the body for this method.
"3D information" will, in the present context, mean that information relating to the shape of an outer surface of the body part is provided. As will become clear below, a number of manners exist of providing this information.
The positions determined using the 3D information are preferably positions of the outer surface of the body part. Such positions may be determined, as will become clear below, in any suitable manner of determining points of a surface. Saddle points, intersections with predetermined planes or lines, maxima, minima, extreme points, points determined by curvatures or other graphs, intersections with predetermined graphs or the like may all be used for determining these positions.
The radiation information may be information relating to any type of radiation transmitted by (such as X-rays or ultraviolet radiation), reflected by (such as visible, infrared or ultraviolet radiation) or emitted by (typically infrared radiation) the body part. All these types of radiation provide information relating to the body part, such as the temperature thereof (unveiling blood vessels, scars, etc), the colour thereof (unveiling birth marks, scars, etc), internal structure thereof (blood vessels, bone structure), the texture thereof (scars, different types of skin), moisture (increased reflection) etc. etc.
Thus, according to the present invention, the methods of providing the 3D information and the radiation information may be separated and optimized individually. A number of manners exist of providing the 3D information and a number of different manners exist of providing information relating to the radiation emission/transmission/reflection of the body part. These manners include the use of any of visible, IR, UV, NIR, and/or X-ray radiation.
In the present context, the correspondence between the radiation information and the 3D information (in order to transfer the locations of the predetermined points - often denoted landmarks - to the radiation information) aims at identifying the parts of the radiation information where radiation is detected or determined as emitted/reflected/transmitted by the body part at the predetermined points. This correspondence may require scaling/translation etc, in order for it to be performed. Also, the relative positions of the cameras could be known or calibrated. This is standard in the art.
In one preferred embodiment, the step of providing the radiation information comprises providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part, such as wherein the radiation information comprises information relating to a temperature of the body part at the one or more positions or parts. This temperature, firstly, is hard to falsify in the body part, if it was desired to cheat the system. Secondly, this temperature is determined by a number of factors of which some are visible on the outside of the body part (e.g. scars) and of which some are not (e.g. blood vessels). Thus, a large amount of information is obtainable using infrared/thermal information.
In another preferred embodiment, or in addition to the above preferred embodiment, the step of providing the radiation information comprises providing a 2D or 3D image of emission/reflection of visible radiation from the body part. Visible radiation provides a number of alternative features of the person, such as birth marks, eye colour, hair colour, skin colour, etc. in addition to the colour and structure of e.g. scars.
The step of providing the radiation information may, for the sake of simplicity, comprise providing the information as a 2D image. A number of standard components, such as still cameras or video cameras, exist for providing this type of information (in all the above-mentioned wavelength intervals).
The 3D information may be provided in a number of manners. One such manner is one wherein the steps of providing the 3D information and the radiation information comprise providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken. This may be stereo vision or enhanced versions where more than two images are used.
Another manner is one wherein the step of providing the 3D information comprises the steps of: illuminating at least part of the body part with predetermined radiation, detecting radiation scattered by or reflected from the body part, and generating the 3D information on the basis of the radiation detected. In this situation, the step of providing the predetermined radiation comprises providing radiation having a predetermined spatial distribution. This distribution may be the projection of a net or a scanning line over the body part as well as an image of a number of coloured lines or spots/areas.
Also, the step of providing the predetermined radiation could comprise providing radiation having a predetermined phase, such as is used for time-of-flight measurements, also known for providing 3D - or depth - information relating to surfaces.
Naturally, any suitable manner of obtaining the information in the radiation information may be used, such as using the information at the exact positions corresponding to the predetermined positions in the body part.
However, more sophisticated manners also exist.
One such manner is one wherein at least part of the information relating to the body part is determined from a position by: defining an area in the radiation information at the position, performing a predetermined mathematical operation on the area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
In fact, the area in the radiation information may be defined by: defining a first area in the 3D information at the position, and defining the area in the radiation information as an area corresponding to the first area.
The first area will normally be an area of the surface of the body part. This area may be determined by that single point, such as the area defined within a predetermined distance of the point, wherein the distance may be the distance between two other positions in the 3D information or an overall, predetermined distance.
Another such method is one wherein at least part of the information relating to the body part is determined from a plurality of the positions by: defining an area in the radiation information defined by the plurality of positions, performing a predetermined mathematical operation on the area of the radiation information, and providing the at least part of the information on the basis of a result of the mathematical operation.
In fact, the area may be defined on the basis of the positions in the 3D information by: defining a first area in the 3D information defined by the plurality of positions, and defining the area of the radiation information as an area corresponding to the first area.
In general, any mathematical operation may be used, such as a mathematical operation selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function (such as for modelling blood vessels or scars).
Naturally, different mathematical models may be used for different positions or areas, and more than a single mathematical operation may be used for any position or area.
In this context, the curvature may be a curvature of grey levels or pixel values of the information (if e.g. digitized).
The fitting of the function over the area may provide detailed information of e.g. blood vessels or scars due to the fact that the function may be of any suitable type which may precisely fit to any desired shape. The result of the fit may therefore be a number of parameters describing the information precisely.
A very interesting embodiment is one relating to the use of the provided information for identification purposes. In this situation, the method further comprises the step of comparing the information relating to the body part to one or more sets of: information relating to a body part of a person, and the identity (or other pertinent information) of the person, in order to identify the person.
Naturally, these sets of information may be provided in a database which may be specific for a particular purpose, such as admittance control of a company, or may be general purpose, such as for use in identifying the user of a credit card and for checking the person's identity when crossing a border. A simple use of this method is one where a supermarket wishes to count the number of customers but avoid counting the same customer twice. This analysis may be provided anonymously in that the identity of the person is not required. The radiation information and 3D information is provided each time a person enters the supermarket, and a match with an earlier entry is searched for (the database holding the information relating to the body part, the presence of which represents the fact that the person has already been counted). In this manner, if a match is found, no new entry is made in the database and the number of customers is not increased.
In one situation, the step of providing the radiation information is performed at a first point in time and wherein the providing of the 3D information and the comparing step is provided at a second, later point in time. In this manner, the method may be used for actually identifying the body part or person in the radiation information due to the fact that a match between the 3D information and the radiation information (providing the identity of the person from which the 3D information was derived) will, ideally, only be found if the radiation information was derived from the same person.
In another situation, the steps of providing the 3D information and the radiation information are provided within a predetermined period of time, such as within 10 seconds, such as within 1 second, preferably within 500 ms, such as within 10 ms. In this manner, it is ensured that the images or information relate to the same angles, grimace etc.
Alternatively or in addition, in the first aspect, the step of providing the 3D information may comprise providing the 3D information using one or more first cameras and wherein the step of providing the radiation information comprises providing the radiation using one or more second cameras, the method then further comprising the step of providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras.
With this correspondence, the transfer of positions from one set of information to the other is easy. The alternative would be to identify certain features in both sets of information and generate the correspondence in that manner, such as from person to person.
A second aspect of the invention relates to a more specific use in that it relates to a method of determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the method comprising:
1) providing 3D information relating to a body part of the second person, the body part of the second person corresponding to the body part of the first person,
2) determining, from the 3D information, one or more predetermined positions on the body part of the second person,
3) providing information relating to the body part of the first person from positions or parts in the image corresponding to the determined positions,
4) comparing the information provided in step 3) to first information relating to radiation emitted from or reflected by the body part of the second person from positions or parts corresponding to the determined positions, and
5) determining, from the comparison, whether the first and second persons are one and the same person.
In this context, corresponding body parts will normally be the same body parts of the two persons (e.g. the right hand of each of the two persons).
Thus, in this situation, the information which, in the first aspect, all knowingly relates to the same person, does not necessarily belong to the same person. One situation of this type is one where the image is provided at a point in time earlier than any of the steps 1)-5).
A particular example is one where the image represents radiation information relating to a person who is sought after, e.g. by the police. A suspect is apprehended and the 3D information is provided. The suspect's identity is known, so the comparison is now made as if the radiation information and the 3D information did, in fact, relate to the same person. If the comparison does identify the same identity, a strong indication is obtained that the apprehended suspect is, in fact, the person in the radiation information.
In one embodiment, step 4) comprises providing the first information as an image representing the radiation information.
In another or the same embodiment, step 4) comprises deriving the first information from a database in which the identity of the second person is related to the first information.
Naturally, the radiation information may be provided from a direction which is non-optimal with respect to identification purposes. In any case, step 3) may comprise determining a direction from which the image has been taken and rotating the 3D information in order to obtain an overlap between the 3D information and the image. Thus, due to the 3D information being three-dimensional, it may be rotated in order to generate the same angle as the radiation information, e.g. as seen by an operator of a computer performing the present method, while still preserving the precise positions of the determined positions. Naturally, information may not be present in the image from all the determined positions. In that situation, only part of the information desired is provided. In that situation, it may not be possible to fully identify the suspect, but a number of other suspects may be ruled out, whereby a narrowing of the investigation is obtained.
It may be preferred to have step 4) comprise both providing a plurality of sets of first information and then comparing each set of first information with the information provided in step 3). These sets of first information may be e.g. images relating to different angles or directions of view of the first information on the body part. This may take into account the different angles of view of the different images analyzed.
This may be relevant in situations where a number of images are sought through, such as a mug shot database, in order to determine whether the person exists in the database. If the different images in the database are not taken of persons from the same angle or direction, the different sets of first information may take the different directions into account and thereby still enable the search in spite of the different directions of view.
It may be desired to compare images or information using the AAM (Active Appearance Model) routine, in that it is designed to compare or overlay images. In this manner, the determined positions may be transferred from one set of information (e.g. an image) to another.
A third aspect of the invention relates to an apparatus for providing information relating to a body part, such as a face, of a person, the apparatus comprising:
- means for providing 3D information relating to the body part,
- means for determining, from the 3D information, one or more predetermined positions on the body part,
- means for providing radiation information relating to radiation emitted from, transmitted by, or reflected by the body part, and
- means for providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
As indicated above, the means for providing the radiation information may comprise means for providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part. These means for providing the radiation information preferably comprise means for providing information relating to a temperature of the body part at the one or more positions or parts. Alternatively or in addition, the means for providing the radiation information may comprise means for providing a 2D or 3D image of emission/reflection/transmission of visible radiation from the body part.
In general, a cheap and suitable manner is to have the means for providing the radiation information comprise means for providing the information as a 2D image. This may be a standard camera, such as a digital or analogue still camera or video camera.
The means for providing the 3D information and the radiation information may, in one embodiment, comprise means for providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken. This may be standard stereoscopy or enhanced versions using more than two images.
Another embodiment is one wherein the means for providing the 3D information comprises means for:
- illuminating at least part of the body part with predetermined radiation,
- detecting radiation scattered or reflected from the body part, and
- generating the 3D information on the basis of the radiation detected.
In this situation, the means for providing the predetermined radiation may comprise means for providing radiation having a predetermined spatial distribution. This spatial distribution may be the providing of a net shaped radiation on the body part or a scanning line of radiation, or a number of lines or areas/points of different colours. This radiation provider may be e.g. a laser or a projector, and the detector may again be a camera of any desired type.
Yet another situation is one, wherein the means for providing the predetermined radiation comprises means for providing radiation having a predetermined phase, such as is useful for time of flight measurements.
In one particular embodiment, the means for determining the information relating to the body part are adapted to determine at least part of the information relating to the body part from a position by:
- defining an area in the radiation information at the position,
- performing a predetermined mathematical operation on the area of the radiation information, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
As mentioned above, the area may be determined in the 3D information by:
- defining a first area or volume in the 3D information at the position, and
- defining the area in the radiation information as one corresponding to the first area.
In another embodiment, or in addition to the above embodiment, the means for determining the information relating to the body part are adapted to determine at least part of the information relating to the body part from a plurality of the positions by:
- defining an area in the radiation information defined by the plurality of positions,
- performing a predetermined mathematical operation on the area of the radiation information corresponding to the first area, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
The area may be determined by:
- defining a first area in the 3D information defined by the plurality of positions, and
- defining the area of the radiation information as one corresponding to the first area.
In any of the above two embodiments, the mathematical operation may be selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function.
An especially interesting embodiment of this third aspect is one comprising means for comparing the information relating to the body part to one or more sets of information relating to a body part of a person and the identity of the person, in order to identify the person.
In a particular situation, the means for providing the radiation information are adapted to be operated at a first point in time and the means for providing the 3D information and the means for comparing are adapted to be operated at a second, later point in time. In this manner, the above-mentioned later identification of a person or body part in the radiation information may be provided on the basis of earlier information in e.g. a database. In another embodiment, the apparatus further comprises means for controlling the means for providing the 3D information and the means for providing the radiation information so as to provide the 3D information and the radiation information within a predetermined period of time. In that manner, it is ensured that the images or information relate to the same person with the same position/angle and the same grimace etc.
According to the third aspect, in addition or alternatively, the means for providing the 3D information may comprise one or more first cameras and the means for providing the radiation information may comprise one or more second cameras, the apparatus further comprising means for providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras. As described above, this facilitates the transfer of positions from one set of information to the other.
This correspondence may e.g. be in image data between pairs of one of the first cameras and one of the second cameras or may be on 3D information provided on the basis of image data from the first and second camera(s), respectively.
A fourth aspect of the invention corresponds to the second aspect and relates to a particular situation where the apparatus is for determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the apparatus comprising:
1) means for providing 3D information relating to a body part of the second person, the body part of the second person corresponding to the body part of the first person,
2) means for determining, from the 3D information, one or more predetermined positions on the body part of the second person,
3) means for providing information relating to the body part of the first person from positions or parts in the image corresponding to the determined positions,
4) means for comparing the information provided by the means for providing the information relating to the body part of the first person to first information relating to radiation emitted from or reflected by the body part of the second person from positions or parts corresponding to the determined positions, and
5) means for determining, from the comparison, whether the first and second persons are one and the same person. Then, the comparing means may comprise means for providing the first information as an image representing the radiation information.
Also, the comparing means may comprise means for deriving the first information from a database in which the identity of the second person is related to the first information.
In addition, the apparatus may comprise means for providing the image at a point in time earlier than any of the means 1)-5) are operated.
Also, the means for providing the information relating to the body part of the first person may comprise means for determining a direction from which the image has been taken and means for rotating the 3D information in order to obtain an overlap between the 3D information and the image.
The comparing means may comprise means for providing a plurality of sets of first information and may be adapted to compare each set of first information with the information provided by the means for providing the information relating to the body part of the first person. As mentioned above, this may be used for taking into account e.g. different characteristics of different images to be analyzed.
In the following, preferred embodiments of the invention will be described with reference to the drawings, wherein:
- Figure 1 illustrates a setup for quickly obtaining the 3D and 2D information relating to a face of a person, and
- Figure 2 illustrates an alternative setup for obtaining the same information.
The following preferred embodiments are directed to a method of identifying a person on the basis of 3D information relating to the face of the person as well as both an RGB image and an IR image of the face. It is clear that B/W images, X-ray images or any other type of image or radiation information (the radiation being transmitted through, reflected by, or emitted by the body part) relating to the face or other body part may equally well be used.
In general, the method and apparatus derive 3D information from the face and derive information from the radiation information in order to provide information relating to the face. This information is compared to information in a database in order to identify the person.
Naturally, this database needs to be set up. In the following, the preferred method of defining the database is described. This definition both relates to the definition of the actual information desired from the radiation information (and maybe also from the 3D-information) as well as how to structure the database and future searches.
At the end, situations of identifying a person on the basis of a less-than-perfect set of data are described. This illustrates that the present invention actually is quite useful in situations where information is available from different sources and where identification or at least an indication of the identity of the person is desired.
Extracting 3D profile information
The overall purpose of the profile scanning is to extract 3D information relating to the face being scanned. The 3D information will consist of a data structure containing three-dimensional coordinate information about all detected points on the face. Possible choices of methods for performing the profile scanning are, e.g.:
• Simple line laser scanner and a camera displaced from each other to obtain the depth information.
• Laser scanner projecting a dot matrix pattern, and a camera displaced therefrom to detect the corresponding points in the pattern matrix.
• 3D lasers employing a time of flight method, removing the need for a camera to obtain corresponding points on the surface.
• Projection of a structured colour light pattern and subsequent estimation of depth using a camera displaced from the projector. Stereo vision can also be employed in this setup.
• Calibrated stereo vision using two cameras that estimate depth by solving the point correspondence problem. This entails locating specific points in both camera images in the absence of any structured lighting.
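As an illustrative aside (not part of the original disclosure), the depth recovery in the calibrated stereo option reduces, for a rectified camera pair, to triangulation from the disparity. A minimal Python sketch, assuming a pinhole model with the focal length in pixels and the baseline in metres (all names and numbers are illustrative):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

# A point located at x = 420 px in the left image and x = 400 px in the
# right image has a disparity of 20 px.
depth_m = depth_from_disparity(focal_px=800.0, baseline_m=0.12, disparity_px=20.0)
```

With these illustrative numbers the point lies 4.8 m from the cameras; locating the same point in both images, the point correspondence problem mentioned above, is the hard part of the method.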
The methods for 3D profile scanning are numerous. However, only a few live up to the demands of fast acquisition. To minimize the effects of the person moving while being scanned, rapid acquisition is preferred. Also, due to the potential risk of permanent eye damage, solutions based on regular lasers may be less desirable.

System setup
One setup of the camera positioning may be seen in fig. 1, wherein the face 50 of the person is positioned in relation to four mirrors 20 positioned on an ellipse having two focal points. The face 50 is directed at the middle of the four mirrors 20, which are adapted to direct light from the face 50 toward the other focal point of the ellipse. In this other focal point, four other mirrors (illustrated at 30) are provided which direct the light from the face to a camera 25.
The use of the elliptic setup ensures that the light path from the camera 25 to the face 50 is the same for each of the mirrors 20.
The mirrors 30 are, in fact, positioned along an axis extending out of the plane of the figure so that the camera 25 is only able to receive light from a single mirror 20, via the pertaining mirror 30. By moving the mirrors 30, which are attached to a reciprocating rod (not illustrated), along the direction perpendicular to the plane of the figure, the camera 25 is presented with different images relating to different directions of view of the face 50.
When the mirrors 30 each have a sufficient extension along the direction of movement, sufficient time is allowed for the camera 25 to provide the image even though the mirror 30 is, in fact, moving relative to the camera.
In this manner, sufficient images are taken of the person in order to be able to provide both the 3D information and the 2D information. In fact, the four images may be obtained in 0.2 s due to the fact that the camera is not moved and that the reciprocating mass may be made quite small. If the camera were moved, time would also have to be spent allowing the camera to come to rest, in order to avoid shaken, unclear images.
The camera 25 may be adapted to provide both images of visible radiation and of IR radiation emitted by or from the face 50. Otherwise, one of the mirrors 30 may be directed toward another camera providing the information not provided by the camera 25. These two cameras may be positioned, relative to each other, along a line extending perpendicular to the plane of figure 1. In addition, the relative positioning of the cameras is preferably calibrated and known in order to facilitate referring positions in images of one camera to positions in images of the other camera.
The setup may additionally provide means (not illustrated) for providing structured light to the face 50 if the selected method of providing the 3D information requires such light. A computer or controller 28 is provided for controlling the camera(s) 25, any light emitters, and the movement of the mirrors 30, as well as for generating the 3D information and other information, deriving the desired information from the data received, comparing that data to a database, and identifying the person. This computer 28 may alternatively be a cluster of computers in a network or any other type of processor/group of processors. Naturally, the computer 28 may in addition comprise a monitor for providing information to an operator, such as for presenting different alternatives (e.g. different alternative identities of the person) or additional information relating to the person or persons having been imaged/identified.
Figure 2 illustrates an alternative embodiment dispensing with the mirrors 20 and instead providing three cameras 31, 32, and 33, where the cameras 31 and 33 are used for providing 3D-information relating to the body part 50 using standard stereo vision. This means that the relative positions of the cameras 31 and 33 are known, calibrated or determinable.
The camera 32 is used for generating IR image data relating to the body part 50 at the same point in time. Naturally, more cameras may be used for this task in order to also provide the IR data as 3D data.
Naturally, the cameras 31-33 may be moved vertically in order to take into account different heights of different persons. Preferably, however, the relative positioning thereof is fixed.
Preferably, the 3D data is provided using stereo vision using two or more cameras.
Alternatively, structured light may be used as described in the following:
Projector pattern
The system comprises at least:
• A device (not illustrated) controlled by the computer 28 to project a light pattern on the body part 50.
• A digital camera (25, 31, 32 or 33) displaced horizontally from the projector, and rotated to be able to obtain images of the projected pattern on the body part.
Optionally, an additional camera can be placed symmetrically on the other side of the projector to increase robustness by avoiding occluded projector points not visible to one camera. Additionally, two vertically displaced cameras can be placed above and under the projector in order to increase the 3D scan resolution considerably.
An enhancement of the system could be to place two projectors that form an angle towards the person. This way, light stripes can be projected on the ears of the person in order to obtain 3D information about these as well.
It is important for robust profile scanning that the projected colour line pattern is unique and that redundancy is avoided throughout the pattern. Therefore, the generator specifies a pattern where the colours of three consecutive stripes have not occurred previously in the sequence. The algorithm resembles a de Bruijn pattern, although several restrictions regarding colours have been incorporated into the model.
The colours used are the ones in the corners of the RGB colour cube. These are red, green, yellow, white, blue, magenta and cyan. The black colour is reserved, as explained later.
When the sequence of colour stripes has been generated, black stripes are inserted between each colour stripe, in order to increase the intensity changes when a stripe transition occurs. All stripes have equal width.
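The stripe-sequence constraints just described (no repeated triple of consecutive colours, neighbouring stripes of different colours, black separators in between) can be sketched as follows. This is an illustrative backtracking generator, not the algorithm of the disclosure:

```python
# Colours at the corners of the RGB colour cube; black is reserved as separator.
COLOURS = ["red", "green", "blue", "yellow", "magenta", "cyan", "white"]

def generate_stripe_sequence(n_stripes):
    """Build a colour sequence in which no window of three consecutive
    colours occurs twice and no two neighbouring stripes share a colour."""
    seq, seen_triples = [], set()

    def extend():
        if len(seq) == n_stripes:
            return True
        for c in COLOURS:
            if seq and c == seq[-1]:
                continue  # neighbouring stripes must differ
            triple = tuple(seq[-2:]) + (c,) if len(seq) >= 2 else None
            if triple in seen_triples:
                continue  # this colour triple was already used
            seq.append(c)
            if triple:
                seen_triples.add(triple)
            if extend():
                return True
            seq.pop()
            if triple:
                seen_triples.discard(triple)
        return False

    if not extend():
        raise ValueError("no pattern of that length exists")
    return seq

def interleave_black(seq):
    """Insert a black stripe between colour stripes to sharpen transitions."""
    out = []
    for c in seq:
        out += [c, "black"]
    return out[:-1]

pattern = interleave_black(generate_stripe_sequence(15))
```

With 7 colours there are at most 7 x 6 x 6 distinct admissible triples, so sequences of a few hundred stripes are possible before the constraint is exhausted.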
There are numerous ways of developing efficient colour patterns, and the above mentioned is just one. However, due to the generic nature of the algorithm that detects the stripes in the image, very few adjustments have to be made in order to incorporate other stripe types. A few of these patterns could be:
• Binary pattern resembling a bar code, with alternating black and white stripes of various widths.
• Colour lines without black lines.
• Two consecutive colour stripe images, one consisting of a random dot pattern, another of a series of white stripes.
The lines described can either be vertical or horizontal. The choice is highly dependent on the choice of line pattern as well as the setup of the system. In order to take advantage of the epipolar geometry discussed later, vertical lines are preferred when the cameras are horizontally displaced from each other. When using horizontal lines, the cameras are preferably displaced vertically from each other with respect to the projector.
To detect the structured light pattern, a dynamic programming approach is utilized, based on the article by Li Zhang, Brian Curless and Steven M. Seitz: "Rapid Shape Acquisition Using Colour Structured Light and Multi-pass Dynamic Programming", Department of Computer Science and Engineering, University of Washington, Seattle WA 98195, USA; from the 1st IEEE International Symposium on 3D Data Processing, Visualization, and Transmission, June 2002, pages 24-36.
Based on prior knowledge of the placement of the colour stripes in the image and by incorporating the constraints posed by the epipolar geometry from stereo vision, a very efficient algorithm is developed that quickly analyzes an image and outputs a depth structure that is passed on for further processing.
One of the fundamentals of the algorithm is a dynamic programming algorithm, which incorporates a score-based line detection for each projector stripe. This is described in detail in the article mentioned above.
To enhance the resolution of the depth map, multiple images can be obtained by time-shifting the image pattern.
Merging the 3D data and the IR/RGB data
In any of the above embodiments, the RGB and IR spectral image information is added to the 3D model by superposition. This is achieved by calibration of the various acquisition modules. In the compact versions of the system, the images are acquired from (such as by cameras provided at) positions very close to each other compared to the distance to the person. This implies that view-point differences can be estimated as pure translations in the image. This gives rise to a very simple calibration that may be performed on-line. In the extended versions of the system, the acquisition modules, or view-points, can be spaced further apart. The requirement is then a more elaborate calibration, where information about the offsets and view direction differences between view-points is calibrated off-line. Using this calibration, the 3D model is projected to match the 2D spectral images, and the RGB and IR information is added by employing the image correspondences.
This step may be performed before or after the actual (below) determination of the landmarks in the 3D information.

Determining the landmarks in the 3D information
The system relies on accurate placement of facial landmarks to ensure that the recognition of each individual is based on the shape and spectral information in comparable regions.
Some landmarks are found by use of geometrical operators and spectral information, and other landmarks are placed automatically with respect to these. Landmarks are preferably positioned in a dense pattern near regions with a high information content (e.g. eyes, nose, mouth), and more sparsely in low information regions (e.g. cheeks, jaw, upper forehead).
The system uses geometrical features like saddle points, min/max/total surface curvature, crest lines, and extreme points to place landmarks automatically. In addition, the system uses the spectral information to aid the placement of landmarks on e.g. eyebrows and lips.
For example: the nose and ears are identified as maxima, the bridge of the nose as a saddle point, and the cheeks may be determined at an intersection of the 3D model and predetermined planes.
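As an illustrative sketch (not the disclosed implementation), the maximum/saddle-point classification can be demonstrated on a sampled height map using the sign of the discrete Hessian determinant, computed with central differences:

```python
def classify_surface_point(z, i, j, h=1.0):
    """Classify grid point (i, j) of a height map z via the discrete Hessian:
    det > 0 with zxx < 0 gives a local maximum (e.g. the nose tip),
    det < 0 gives a saddle point (e.g. the bridge of the nose)."""
    zxx = (z[i + 1][j] - 2 * z[i][j] + z[i - 1][j]) / h ** 2
    zyy = (z[i][j + 1] - 2 * z[i][j] + z[i][j - 1]) / h ** 2
    zxy = (z[i + 1][j + 1] - z[i + 1][j - 1]
           - z[i - 1][j + 1] + z[i - 1][j - 1]) / (4 * h ** 2)
    det = zxx * zyy - zxy ** 2
    if det > 0:
        return "peak" if zxx < 0 else "pit"
    if det < 0:
        return "saddle"
    return "flat"

# A dome-shaped patch, z = -(x^2 + y^2), has a peak at its centre ...
dome = [[-((i - 2) ** 2 + (j - 2) ** 2) for j in range(5)] for i in range(5)]
# ... while z = x^2 - y^2 has a saddle point there.
saddle = [[(i - 2) ** 2 - (j - 2) ** 2 for j in range(5)] for i in range(5)]
```

A robust landmarking system would of course smooth the data and threshold the curvature values; this only illustrates the geometric idea.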
Now, a large amount of data is available in that the sample points provide 3D information and may at the same time be used as points for deriving information from the RGB image as well as from the IR image.
The actual positions of the landmarks may be used for determining e.g. a temperature or a colour/texture at that specific position in the RGB/IR images or may be used for determining other positions (defined by one or more of the landmarks) where this information may be derived. In addition, information may be derived from areas around a landmark (such as a circle or ellipse defined by one or more landmarks and predetermined data, such as distances) or multiple landmarks (defining e.g. corners of a polygon defining the area). Within these areas, different types of information may be derived, such as a maximum, a minimum, a proportion, an average, a standard deviation, a percentile, a gradient, a curvature, data relating to a fit of a mathematical curve to the data in the area, or the like.
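One hedged reading of such area-based feature extraction, with an illustrative feature set and illustrative names, is:

```python
import statistics

def area_features(values):
    """Summary features over the values (e.g. temperatures or intensities)
    sampled inside an area defined by one or more landmarks."""
    s = sorted(values)
    return {
        "min": s[0],
        "max": s[-1],
        "mean": statistics.mean(s),
        "stdev": statistics.stdev(s),
        # simple nearest-rank style 90th percentile
        "p90": s[min(len(s) - 1, int(round(0.9 * (len(s) - 1))))],
    }

# Illustrative IR temperature samples (degrees C) from a cheek region.
features = area_features([36.1, 36.4, 36.2, 36.8, 36.3, 36.5])
```

The returned dictionary corresponds to one row of the per-landmark data that is later concatenated in the vectorization step.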
An area may be used for determining e.g. blood vessels, birth marks, scars or the like. Blood vessels are relatively hot, whereas scars are relatively cold. Thus, the area of colder/warmer skin (paler or darker skin) may be interesting, or the coldest/warmest spot may be interesting. Also, a shape of a blood vessel may be interesting.

Resampling the 3D model
It is possible to resample the 3D model to obtain a reduced polygon surface model, where the 3D landmarks are nodes. The spectral information for each surface patch is recorded, and this represents the RGB and IR information which is built into the recognition model.
In this situation, the different areas or patches may be used as the areas above.
Vectorization
Having now obtained this large amount of landmarks and corresponding data from the 3D model and the IR/RGB images, it is desired to put these into a form in which they may be analyzed in order to determine the information which enables identification of a person, which separates persons, and which at the same time provides a fast search. Thus, as little information as possible is desired while retaining a high probability of performing the correct identification.
The information contained in the reduced 3D model is organized in vectors that are used in the shape and texture analysis stage. The shape information is put in a vector by concatenating the three coordinates for each node in the model in one long vector. Similarly, colours (RGB), temperatures, and the derived measures, are concatenated to form two vectors encapsulating colour and temperature information.
Thus, three long vectors having the 3D-positions, the RGB information and the IR information are provided.
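The vectorization step amounts to simple concatenation; a minimal sketch with illustrative helper names:

```python
def concatenate_shape(landmarks_3d):
    """Concatenate the (x, y, z) coordinates of every landmark node
    into one long shape vector."""
    return [coord for point in landmarks_3d for coord in point]

def concatenate_samples(samples_per_landmark):
    """Likewise for per-landmark colour (R, G, B) or temperature samples."""
    return [v for sample in samples_per_landmark for v in sample]

shape_vec = concatenate_shape([(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)])
colour_vec = concatenate_samples([(255, 240, 230), (250, 235, 225)])
```

The three resulting vectors (shape, RGB, IR) are what the normalization and PCA steps below operate on.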
Building the recognition database
The steps of obtaining the 3D information as well as the RGB/IR images, vectorization etc. are repeated for each of a number of persons in order to build the recognition database. All types of persons that should be recognized by the 3D system are preferably added in this manner. It should, however, be noted that it is quite possible to add additional persons after building of the database.
In order to analyze the typical variation within a single individual, some persons are, in addition, recorded multiple times. This variation could include gestures, short-term and long-term colour and temperature variation, and artefacts (like glasses), and in general illustrates the stability of the landmarks selected. The result of this analysis is built into the model through the shape analysis step below.
Normalizing the vectors
The extracted information is used to perform a look-up in a database in the recognition process. It is therefore advantageous to employ a data reduction scheme on the three vectors from the vectorization step. Data reduction can be performed using e.g. Principal Component Analysis (PCA), or Maximum Autocorrelation Factors (MAF), or the FastMap method, etc.
It is common to all methods, that performance is enhanced if the data is pre-processed in order to avoid a large number of parameters describing the generic appearance of a face and to eliminate trivial deviations originating from e.g. pose or ambient light level.
Pre-processing ensures that the shape and texture parameters in the reduced data are spent only on describing the variations with relation to an average person, sitting in a fixed position, and with a standard light setting.
Normalization and centralization are described in, e.g.:
"On Properties of Active Shape Models" by Stegmann, M. B, (Project supervisor: Rune Fisker), March, 2000, Department of Mathematical Modelling, Technical University of Denmark, DTU
"Shape Modelling using Maximum Autocorrelation Factors" by Larsen, Rasmus, Department of Mathematical Modelling , Technical University of Denmark, DTU
FastMap: A Fast Algorithm for indexing, Data-Mining and Visualization of Traditional Multimedia Datasets by Faioutsos, C. and Lin, King I, 1995, Institute of Systems Research and Dept. of Computer Science, Univ. of Maryland, College Park. For the ACM 1995 SIGMOD conference.
An Average Person is determined from the three vectors deduced in the vectorization step. This is obtained by calculating the mean vector of the colour and temperature vectors, and by performing a so-called Procrustes Mean analysis on the shape vector. This analysis leads to a mean shape where differences in face position, orientation, and scale are eliminated. The scale is recorded for recognition purposes. The Procrustes Mean analysis is based on reliable regions in the eye and nose neighbourhood. The three vectors from the vectorization step are modified to describe only the variation with respect to the average person. That is, the colour and temperature vectors are centralized around the average person by translation, and the shape vector is centralized, rotated and normalized according to the parameters determined through the Procrustes Mean analysis.
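The translation and scale part of this centralization can be sketched as below (the rotation alignment also performed by the Procrustes analysis requires an eigen/SVD step and is omitted here; names are illustrative):

```python
import math

def centre_and_scale(points):
    """Centre a 2-D landmark set at the origin and scale it to unit norm;
    return the normalized points together with the recorded scale,
    which is kept for recognition purposes."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centred))
    return [(x / scale, y / scale) for x, y in centred], scale

# A 2x2 square of landmarks, centred and normalized.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
normalized, scale = centre_and_scale(square)
```

After this step, differences in face position and size no longer contribute to the shape vector, as the text describes.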
Performing Principal Component Analysis - PCA
The three vectors from all individuals output from the normalization step are used as input in a Principal Component Analysis (PCA). The PCA determines the directions in the vector space that encompass most of the variation across the set of all persons. The output of the PCA is a new coordinate basis with a variance associated with each coordinate axis describing how much of the total variation the axis accounts for. The axes with little variation are discarded and the remaining axes define a new coordinate system that allows each person to be described with a reduced number of parameters. These parameters are the recognition descriptors. The information in the IR, RGB, and shape vectors leads to three reduced recognition descriptor vectors.
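As a non-limiting sketch of this step (Python/NumPy assumed; the 95% variance threshold is an illustrative choice, not prescribed above):

```python
import numpy as np

def pca_descriptors(X, variance_kept=0.95):
    """X holds one centralized vector per person (rows). Returns the
    retained coordinate axes and each person's reduced recognition
    descriptors (coordinates in the new basis)."""
    X = X - X.mean(axis=0)
    # SVD of the data matrix: the rows of Vt are the principal axes, and
    # the singular values give the variance associated with each axis.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    var = s ** 2 / (len(X) - 1)
    explained = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(explained, variance_kept)) + 1
    axes = Vt[:k]            # axes with little variation are discarded
    return axes, X @ axes.T  # basis and reduced descriptor vectors

# In the method above, this analysis is run three times: once each for
# the IR, RGB, and shape vectors, yielding three reduced descriptor vectors.
```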
When the average person in the normalization step is calculated for each individual that has been acquired multiple times, and each of these individuals is centralized/normalized only with respect to his or her own average, it is possible to perform a PCA that captures the intra-person variation, i.e. the variation displayed by the database population as a result of different gestures, long-term and short-term temperature and colour variation, ill-determined landmarks, etc. This analysis is performed in a separate step to determine sub-spaces in the vector spaces that are not reliable in discriminating between different individuals.
To determine the inter-person variation, the PCA is based on the common average person determined from all persons in the recognition database. This PCA is performed subject to the constraint that intra-person variation cannot be a descriptor in the new coordinate system. This is achieved by pre-processing the three input vectors for each person in order to remove the intra-person variation components prior to performing the inter-person PCA. The resulting new coordinate system will describe the variation between individuals in the database independent of intra-person variation.
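One possible way to remove the intra-person variation components prior to the inter-person PCA, assuming the intra-person axes are available as orthonormal row vectors, is projection onto their orthogonal complement (illustrative Python/NumPy sketch):

```python
import numpy as np

def remove_intra_person(X, intra_axes):
    """Subtract from each row of X its projection onto the intra-person
    variation subspace spanned by the (orthonormal) rows of intra_axes,
    leaving only the inter-person part of each vector."""
    projector = intra_axes.T @ intra_axes
    return X - X @ projector
```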
Modification of the PCA descriptors
Each axis in the new coordinate system output by the PCA describes a mode of variation. The coordinate axes are ordered according to descending amounts of variation, but these axes need not have a natural physical interpretation. Once the axes are selected as recognition descriptors, the actual ordering becomes insignificant, and any linear mixing of the axes constitutes an equally well-suited coordinate basis.
This fact may be exploited in a process where the new coordinate basis is rotated to obtain a number of descriptors with a physical interpretation. This may be desirable for a number of reasons, one being improved understandability of the system, since the individual descriptors can then be described with reference to features of the face.
The physical features are chosen to match common verbal descriptors or those of other biometric systems, e.g. the temperature between the eyes or the mean temperature of an area defined on the cheek of the person. Another descriptor may be a mean colour of the cheek, the colour of an eye - or the reddest point on the cheek.
It is clear that information relating to features derivable from the 3D information may also be selected as PCA descriptors, such as the size of the head, the distance between the eyes, the length or width of the nose, the length-to-width ratio of the face, etc.
In this manner, the desired physical features are placed in the first positions in the sequence of descriptors. In the same process the variation corresponding to each feature is recorded. The remaining coordinate axes that are not associated with a physical feature are kept as-is.
The result of this step is the actual information desired, in the form of a single vector - called the Condensed Parameter Vector (CPV) - used to perform identification of the person. Initially, a larger amount of information was acquired in order to have information to select from, but this information has now been reduced. It is seen that information from the IR image, the RGB image, as well as from the 3D information may be selected. In the future, only the landmarks and the information relating to the result of the modification step (or the PCA step, if the modification step is not required) need be obtained.
Identifying a person in the user database
The coordinate axes output from the modification (or the PCA) step are the final coordinate basis describing the variation of the database with respect to colour, temperature and shape. Each individual is now parameterized with respect to this basis by linear regression, i.e. each of the three vectors from each person is projected upon the coordinate axes of the corresponding basis to give the recognition coordinates for colour, temperature, and shape. These are the variables that are compared in the recognition process. The measure of equality is the length of the difference vector. This length can be measured using isotropic scales or a different scale for each axis, reflecting the observed variation along the axes.
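The measure of equality described above may be sketched as follows (Python/NumPy assumed; passing the observed per-axis standard deviations corresponds to the non-isotropic scaling mentioned, while omitting them gives the isotropic case):

```python
import numpy as np

def descriptor_distance(a, b, axis_scale=None):
    """Length of the difference vector between two recognition-coordinate
    vectors; axis_scale, if given, weights each axis individually to
    reflect the observed variation along that axis."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    if axis_scale is not None:
        d = d / np.asarray(axis_scale, dtype=float)
    return float(np.linalg.norm(d))
```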
The actual search may be divided into a number of sub-steps:
When the condensed parameter vector has been generated for the person to be recognized, the information is sent to the database in order to perform a similarity search on the data already stored.
Some of the data relating to the person may, however, be used to pre-index into the database in order to decrease search time. A preliminary search could be performed to filter out the most unlikely entries in the database. This is done by setting upper and lower thresholds for entries in the CPV that concern height, eye colour, gender, etc. For instance, if the height is measured to be 170 cm, all entries in the database not between 165 and 175 cm are filtered out.
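The height example above may be sketched as a simple pre-filter (illustrative Python; the entry layout and the 5 cm tolerance are assumptions for the sketch):

```python
def prefilter_by_height(entries, measured_cm, tolerance_cm=5):
    """Discard database entries whose stored height falls outside the
    tolerance band around the measured height."""
    low, high = measured_cm - tolerance_cm, measured_cm + tolerance_cm
    return [entry for entry in entries if low <= entry["height"] <= high]
```

For a measured height of 170 cm this keeps only entries between 165 and 175 cm, as in the example above.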
Then, all parameters from the condensed parameter vector are taken into consideration during the search and are used simultaneously to obtain the closest matches in the database. Specifically, the information generated from merging the 2D texture and temperature images into the 3D profile is used as comparison parameters.
Depending on the computational power available and the user requirements for the specific system, a number of top candidates, such as the 10 or 100 closest matches, may be chosen for final analysis and decision-making.
For the final analysis, a number of approaches are available, the simplest one being the selection of the closest of the above matches.
However, depending on the amount of storage and computing capacity available and on the importance of the identification of the person, all available data for the top matches in the database may actually be used.
Thus, in fact, the following information may be used:
• The 3D profile scan
• The 3D IR information (2D IR merged into 3D profile)
• The 3D texture information (2D frontal image merged into 3D profile)
• Any 2D Profile images
• Any 2D Frontal image
• All physical and biometrical data available: Height, gender, eye colour, eye-eye distance, etc.
Thus, for image comparison, correlation techniques are used to obtain a similarity measure between the recognition and database images. 3D profile and IR information is compared using statistical methods such as least squares, etc.
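A common correlation-based similarity measure of the kind referred to above is the normalized cross-correlation (illustrative Python/NumPy sketch; the description does not prescribe this particular variant):

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Similarity in [-1, 1] between two equally sized images; a value of 1
    indicates a perfect match up to brightness and contrast changes."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()   # remove mean intensity (brightness invariance)
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```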
For each of the closest matches, an overall value expresses the similarity between the matches and the recognition information. These distinct values are then compared to yield the final result.
However, it may be desired that, for a successful match, the following criteria are fulfilled:
• The highest value of the top matches is sufficiently high.
• The distance from the highest value to the second-highest exceeds a certain threshold.
If these criteria are not met, the matching process may be set to yield an unsuccessful result. However, if the discrimination is satisfactory, all information regarding the matching database entry is then finally returned to e.g. an operator for displaying.
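The two acceptance criteria, together with the unsuccessful-result fallback, may be sketched as follows (illustrative Python; the threshold values are assumptions for the sketch):

```python
def accept_match(similarities, min_score=0.8, min_margin=0.1):
    """Accept the best candidate only if its similarity is sufficiently
    high AND it exceeds the second-highest by a clear margin; otherwise
    the matching process yields an unsuccessful result."""
    ranked = sorted(similarities, reverse=True)
    if ranked[0] < min_score:
        return False  # best match not convincing enough
    if len(ranked) > 1 and ranked[0] - ranked[1] < min_margin:
        return False  # discrimination from the runner-up is insufficient
    return True
```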
For very important recognition applications
If a large number of persons exists in the database, so that the distance between individual persons' vectors is small, or in applications where it is essential to identify the right person, it may be desired that all raw data is stored in the database for future reference and to allow for refined matching.
The PCA procedure described in the previous steps is well-suited for data reduction and fast database searching, but it may, depending on the situation, not be suited for capturing small individual details that may be of significance in the recognition process. By storing all raw material, it is possible to discriminate further between a set of candidates drawn from the database.
The final discrimination can be based on the residual, i.e. the shape and texture not accounted for by the PCA model. For instance, a birth mark could be a feature that is not described by the PCA model, but which is still a valid discriminative feature to refine the recognition, or to request extra attention from e.g. a system operator.
Along with the raw image material, the system may store all available identification information such as ID number (country specific: personal ID number, passport number, etc.), height, gender, race, nationality, etc. This information may also be used to organize the database to speed up the search for recognition candidates. At the same time, the knowledge of gender and race may be used to monitor whether the variation between races or genders is significant compared to the variation within the same race or gender. If this is so, the system may split the database population and perform separate texture and shape analyses for each sub-class. This may keep the variation models compact, and is believed to strengthen the discriminative power of the models.
Identification of a person on the basis of a reduced set of information
A situation where the full desired set of data, i.e. the 3D information and the RGB and/or IR images, is not available for the same person may be one where a photo taken of a person is to be connected to a person in, e.g., the custody of the police. The photo may be a mug shot or a photo taken at a crime scene.
In this situation, the 3D-information is derived from the person and the landmarks desired are determined.
In order to be able to ascertain that the person in the photo and the person in the custody of the police (and of which the 3D information is taken) is the same, the 3D model (with the landmarks) is rotated in order to have the same angle of view as that of the photo.
Thus, the landmarks are still provided on the basis of the 3D information of the suspect. Rotating this model will not reduce the precision with which the landmarks are positioned.
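Rotating the landmarked 3D model to the view angle of the photo and projecting it into the image plane may be sketched as follows (Python/NumPy; an orthographic projection and a pure yaw rotation about the vertical axis are simplifying assumptions, whereas a real system would use the full camera geometry):

```python
import numpy as np

def project_landmarks(points3d, yaw_deg):
    """Rotate (N, 3) landmarks about the vertical axis by yaw_deg degrees,
    then drop the depth coordinate to obtain the 2D image positions."""
    t = np.radians(yaw_deg)
    rotation = np.array([[np.cos(t), 0.0, np.sin(t)],
                         [0.0,       1.0, 0.0],
                         [-np.sin(t), 0.0, np.cos(t)]])
    rotated = np.asarray(points3d, dtype=float) @ rotation.T
    return rotated[:, :2]  # orthographic projection onto the image plane
```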
Having obtained the correct angle, the information from the photo is now derived as it normally would be. It is noted that if the photo is an RGB image and no IR information exists from the crime scene, only information relating to the RGB image may be derived. This gives a non-optimal condensed parameter vector, also due to the fact that only information visible in the photo from that angle of view is available.
Thus, it may not be possible to provide sufficient proof or indication that the person in the photo is the person in the custody of the police, but it may be sufficient to rule out other suspects in the case.
Also, if it was desired to look through a large database of images, such as a mug shot database, it may be desired to make this analysis automatic. This may be done in the same manner, where 3D and 2D information relating to a given person is compared to all images in the database.
In databases of this type, however, not all images need be taken from the same direction of view. To accommodate this, a number of 2D images may be generated from the 3D information from different directions of view, and all these sets of 2D information are then compared to the database.
This method may also be used at different types of gates or borders, where each person passing is automatically compared to registers of wanted persons.
One manner of actually comparing or overlaying images is using the AAM model.
The AAM model is normally built from regular 2D images (B/W or RGB) with manually placed landmarks. As in the 3D case described above, the shape and texture information give rise to a PCA model that accounts for the primary variation in the observed data. In the recognition (or reconstruction) process, the landmarks and the PCA regression parameters are estimated simultaneously in an iterative procedure in order to produce a good match with the input image. This can be done in several ways, but in general, a strategy is employed to produce possible outcomes of the model, and these outcomes are evaluated in terms of their similarity with the image. If the similarity is improved, the parameters are updated, and the algorithm will eventually achieve a (sub-) optimal solution to the match problem. As in the 3D case, the parameters are the texture and shape descriptors that can be used to recognize the person.
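The iterative propose-and-evaluate strategy described for the AAM match may be illustrated with a deliberately simplified random-search sketch (Python/NumPy; a linear model and a least-squares similarity measure are assumed here, whereas real AAM implementations use more refined parameter-update schemes):

```python
import numpy as np

def fit_model(target, basis, mean, steps=200, step_size=0.1, seed=0):
    """Propose random perturbations of the model parameters and keep a
    proposal only if it improves the similarity with the target; returns
    the (sub-)optimal parameters and the remaining mismatch."""
    rng = np.random.default_rng(seed)
    params = np.zeros(basis.shape[0])
    best_err = np.linalg.norm(mean + params @ basis - target)
    for _ in range(steps):
        trial = params + step_size * rng.normal(size=params.shape)
        err = np.linalg.norm(mean + trial @ basis - target)
        if err < best_err:          # keep the proposal only if it improves
            params, best_err = trial, err
    return params, best_err
```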
In the framework presented herein, the 2D shape and texture model is built without manually annotating the images with landmarks. Instead, the 3D face models are projected to 2D, including their automatically found landmarks. This leads to 2D images with landmarks that can be used to build the AAM model directly. This approach makes it possible to build AAM models from any desired view-point, i.e. a model can be built for frontal images and another model can be built for profile images. It is also possible to model different light settings.
Once the AAM models are built for 2D images, the recognition proceeds by parameter estimation, and by recognition in the same way as in the 3D case described above.

Claims

1. A method of providing information relating to a body part, such as a face, of a person, the method comprising:
- providing 3D information relating to the body part,
- determining, from the 3D information, one or more predetermined positions on the body part,
- providing radiation information relating to radiation emitted from or reflected by the body part,
- providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
2. A method according to claim 1, wherein the step of providing the radiation information comprises providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part.
3. A method according to claim 2, wherein the radiation information comprises information relating to a temperature of the body part at the one or more positions or parts.
4. A method according to claim 1, wherein the step of providing the radiation information comprises providing a 2D or 3D image of emission/reflection of visible radiation from the body part.
5. A method according to any of the preceding claims, wherein the step of providing the radiation information comprises providing the information as a 2D image.
6. A method according to any of claims 1-4, wherein the steps of providing the 3D information and the radiation information comprise providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken.
7. A method according to any of the preceding claims, wherein at least part of the information relating to the body part is determined from a position by:
- defining a first area in the 3D information at the position,
- performing a predetermined mathematical operation on a second area of the radiation information corresponding to the first area, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
8. A method according to any of the preceding claims, wherein at least part of the information relating to the body part is determined from a plurality of the positions by:
- defining a first area in the 3D information defined by the plurality of positions,
- performing a predetermined mathematical operation on a second area of the radiation information corresponding to the first area, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
9. A method according to claim 7 or 8, wherein the mathematical operation is selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function.
10. A method according to any of the preceding claims, wherein the step of providing the 3D information comprises the steps of:
- illuminating at least part of the body part with predetermined radiation,
- detecting radiation scattered or reflected from the body part, and
- generating the 3D information on the basis of the radiation detected.
11. A method according to claim 10, wherein the step of providing the predetermined radiation comprises providing radiation having a predetermined spatial distribution.
12. A method according to claim 10, wherein the step of providing the predetermined radiation comprises providing radiation having a predetermined phase.
13. A method according to any of the preceding claims, further comprising the step of comparing the information relating to the body part to one or more sets of:
- information relating to a body part of a person and
- the identity of the person,
in order to identify the person.
14. A method according to claim 13, wherein the step of providing the radiation information is performed at a first point in time and wherein the providing of the 3D information and the comparing step is provided at a second, later point in time.
15. A method according to any of claims 1-13, wherein the steps of providing the 3D information and the radiation information are provided within a predetermined period of time.
16. A method according to any of the preceding claims, wherein the step of providing the 3D information comprises providing the 3D information using one or more first cameras and wherein the step of providing the radiation information comprises providing the radiation using one or more second cameras, the method further comprising the step of providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras.
17. A method of determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the method comprising:
1) providing 3D information relating to a body part of the second person, the body part of the second person corresponding to the body part of the first person,
2) determining, from the 3D information, one or more predetermined positions on the body part of the second person,
3) providing information relating to the body part of the first person from positions or parts in the image corresponding to the determined positions,
4) comparing the information provided in step 3) to first information relating to radiation emitted from or reflected by the body part of the second person from positions or parts corresponding to the determined positions, and
5) determining, from the comparison, whether the first and second person is one and the same person.
18. A method according to claim 17, wherein step 4) comprises providing the first information as an image representing the radiation information.
19. A method according to claim 17, wherein step 4) comprises deriving the first information from a database in which the identity of the second person is related to the first information.
20. A method according to claim 17, further comprising the step of providing the image at a point in time earlier than any of the steps l)-5).
21. A method according to claim 17, wherein step 3) comprises determining a direction from which the image has been taken and rotating the 3D information in order to obtain an overlap between the 3D information and the image.
22. A method according to any of claims 17-21, wherein step 4) comprises providing a plurality of sets of first information and comparing each set of first information with the information provided in step 3).
23. An apparatus for providing information relating to a body part, such as a face, of a person, the apparatus comprising:
- means for providing 3D information relating to the body part,
- means for determining, from the 3D information, one or more predetermined positions on the body part,
- means for providing radiation information relating to radiation emitted from or reflected by the body part,
- means for providing the information relating to the body part from positions or parts in the radiation information corresponding to the determined positions.
24. An apparatus according to claim 23, wherein the means for providing the radiation information comprises means for providing a 2D or 3D image relating to infrared radiation emitted/reflected from the body part.
25. An apparatus according to claim 24, wherein the means for providing the radiation information comprises means for providing information relating to a temperature of the body part at the one or more positions or parts.
26. An apparatus according to claim 23, wherein the means for providing the radiation information comprises means for providing a 2D or 3D image of emission/reflection of visible radiation from the body part.
27. An apparatus according to any of claims 23-26, wherein the means for providing the radiation information comprises means for providing the information as a 2D image.
28. An apparatus according to any of claims 23-27, wherein the means for providing the 3D information and the radiation information comprises means for providing the information on the basis of multiple images of the body part, each image being taken from a direction different from the direction(s) from which the other image(s) is/are taken.
29. An apparatus according to any of claims 23-28, wherein the means for determining the information relating to the body part are adapted to determine at least part of the information relating to the body part from a position by:
- defining a first area in the 3D information at the position,
- performing a predetermined mathematical operation on a second area of the radiation information corresponding to the first area, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
30. An apparatus according to any of claims 23-29, wherein the means for determining the information relating to the body part are adapted to determine at least part of the information relating to the body part from a plurality of the positions by:
- defining a first area in the 3D information defined by the plurality of positions,
- performing a predetermined mathematical operation on a second area of the radiation information corresponding to the first area, and
- providing the at least part of the information on the basis of a result of the mathematical operation.
31. An apparatus according to claim 29 or 30, wherein the mathematical operation is selected from the group consisting of a minimum value, a maximum value, a mean value, a percentile, an average, a gradient, a curvature, a standard deviation, and/or a fit of a mathematical function, such as a polynomial, over the information in the area, the result of the operation being characteristics of the function.
32. An apparatus according to any of claims 23-31, wherein the means for providing the 3D information comprises means for:
- illuminating at least part of the body part with predetermined radiation,
- detecting radiation scattered or reflected from the body part, and
- generating the 3D information on the basis of the radiation detected.
33. An apparatus according to claim 32, wherein the means for providing the predetermined radiation comprises means for providing radiation having a predetermined spatial distribution.
34. An apparatus according to claim 32, wherein the means for providing the predetermined radiation comprises means for providing radiation having a predetermined phase.
35. An apparatus according to any of claims 23-34, further comprising means for comparing the information relating to the body part to one or more sets of:
- information relating to a body part of a person and
- the identity of the person,
in order to identify the person.
36. An apparatus according to claim 35, wherein the means for providing the radiation information are adapted to be operated at a first point in time and wherein the means for providing the 3D information and the comparing means are adapted to be operated at a second, later point in time.
37. An apparatus according to any of claims 23-35, further comprising means for controlling the means for providing the 3D information and the means for providing the radiation information so as to provide the 3D information and the radiation information within a predetermined period of time.
38. An apparatus according to any of claims 32-37, wherein the means for providing the 3D information comprises one or more first cameras and wherein the means for providing the radiation information comprises one or more second cameras, the apparatus further comprising means for providing a correspondence between positions in image information provided by the one or more first cameras and positions in image information provided by the one or more second cameras.
39. An apparatus for determining whether a first person depicted in an image and a second person are one and the same, the image representing radiation information relating to radiation emitted from or reflected by a body part of the first person, the apparatus comprising:
1) means for providing 3D information relating to a body part of the second person, the body part of the second person corresponding to the body part of the first person,
2) means for determining, from the 3D information, one or more predetermined positions on the body part of the second person,
3) means for providing information relating to the body part of the first person from positions or parts in the image corresponding to the determined positions,
4) means for comparing the information provided by the means for providing the information relating to the body part of the first person to first information relating to radiation emitted from or reflected by the body part of the second person from positions or parts corresponding to the determined positions, and
5) means for determining, from the comparison, whether the first and second persons are one and the same person.
40. An apparatus according to claim 39, wherein the comparing means comprise means for providing the first information as an image representing the radiation information.
41. An apparatus according to claim 39, wherein the comparing means comprise means for deriving the first information from a database in which the identity of the second person is related to the first information.
42. An apparatus according to claim 39, further comprising means for providing the image at a point in time earlier than any of the means l)-5) are operated.
43. An apparatus according to claim 39, wherein the means for providing the information relating to the body part of the first person comprise means for determining a direction from which the image has been taken and means for rotating the 3D information in order to obtain an overlap between the 3D information and the image.
44. An apparatus according to any of claims 39-43, wherein the comparing means comprise means for providing a plurality of sets of first information and is adapted to compare each set of first information with the information provided by the means for providing the information relating to the body part of the first person.
PCT/DK2004/000924 2003-12-30 2004-12-29 A method and apparatus for providing information relating to a body part of a person, such as for identifying the person WO2005064525A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DKPA200301951 2003-12-30
DKPA200301951 2003-12-30

Publications (1)

Publication Number Publication Date
WO2005064525A1 (en) 2005-07-14

Family

ID=34717092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2004/000924 WO2005064525A1 (en) 2003-12-30 2004-12-29 A method and apparatus for providing information relating to a body part of a person, such as for identifying the person

Country Status (1)

Country Link
WO (1) WO2005064525A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4699149A (en) * 1984-03-20 1987-10-13 Joseph Rice Apparatus for the identification of individuals
US5163094A (en) * 1991-03-20 1992-11-10 Francine J. Prokoski Method for identifying individuals from analysis of elemental shapes derived from biosensor data
DE19712844A1 (en) * 1997-03-26 1998-10-08 Siemens Ag Method for three-dimensional identification of objects
WO1999027838A2 (en) * 1997-12-01 1999-06-10 Eraslan Arsev H Three-dimensional face identification system
WO2002009024A1 (en) * 2000-07-25 2002-01-31 Bio4 Limited Identity systems


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008020038A1 (en) * 2006-08-16 2008-02-21 Guardia A/S A method of identifying a person on the basis of a deformable 3d model
DE102008002730A1 (en) * 2008-06-27 2009-12-31 Robert Bosch Gmbh Distance image generating method for three-dimensional reconstruction of object surface from correspondence of pixels of stereo image, involves selecting one of structural elements such that each element exhibits different intensity value
DE102008002730B4 (en) 2008-06-27 2021-09-16 Robert Bosch Gmbh Method and device for 3D reconstruction
DE102010013580A1 (en) * 2010-03-31 2011-10-06 Rohde & Schwarz Gmbh & Co. Kg Device and method for identifying persons
CN102253392A (en) * 2010-04-15 2011-11-23 赛德斯安全与自动化公司 Time of flight camera unit and optical surveillance system
US8878901B2 (en) 2010-04-15 2014-11-04 Cedes Safety & Automation Ag Time of flight camera unit and optical surveillance system
US9332246B2 (en) 2010-04-15 2016-05-03 Rockwell Automation Safety Ag Time of flight camera unit and optical surveillance system
EP2378310B1 (en) * 2010-04-15 2016-08-10 Rockwell Automation Safety AG Time of flight camera unit and optical surveillance system
US10372974B2 (en) 2017-01-11 2019-08-06 Microsoft Technology Licensing, Llc 3D imaging recognition by stereo matching of RGB and infrared images
CN113485058A (en) * 2021-06-18 2021-10-08 苏州小优智能科技有限公司 Compact high-precision three-dimensional face imaging device and three-dimensional face imaging method
CN113485058B (en) * 2021-06-18 2022-07-08 苏州小优智能科技有限公司 Compact high-precision three-dimensional face imaging method

Similar Documents

Publication Publication Date Title
Papatheodorou et al. Evaluation of automatic 4D face recognition using surface and texture registration
EP1629415B1 (en) Face identification verification using frontal and side views
US7221809B2 (en) Face recognition system and method
US7881524B2 (en) Information processing apparatus and information processing method
US5450504A (en) Method for finding a most likely matching of a target facial image in a data base of facial images
CN108549873A (en) Three-dimensional face identification method and three-dimensional face recognition system
US20060039600A1 (en) 3D object recognition
JP2000306095A (en) Image collation/retrieval system
Li et al. Efficient 3D face recognition handling facial expression and hair occlusion
Godil et al. Face recognition using 3D facial shape and color map information: comparison and combination
Beumier et al. Automatic Face Authentication from 3D surface.
CN109145716B (en) Boarding gate verifying bench based on face recognition
Hu et al. Real-time view-based face alignment using active wavelet networks
KR20020022295A (en) Device And Method For Face Recognition Using 3 Dimensional Shape Information
JP4426029B2 (en) Image collation method and apparatus
Beumier et al. Automatic face verification from 3D and grey level clues
JP3577908B2 (en) Face image recognition system
Fransens et al. Parametric stereo for multi-pose face recognition and 3D-face modeling
Zappa et al. Stereoscopy based 3D face recognition system
WO2005064525A1 (en) A method and apparatus for providing information relating to a body part of a person, such as for identifying the person
WO2006061365A1 (en) Face recognition using features along iso-radius contours
WO2006019350A1 (en) 3d object recognition
Li et al. Exploring face recognition by combining 3D profiles and contours
JP2002208011A (en) Image collation processing system and its method
Ibikunle et al. Face recognition using line edge mapping approach

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase