WO1999052059A2 - Method and apparatus for performing robust recognition - Google Patents

Method and apparatus for performing robust recognition

Info

Publication number
WO1999052059A2
Authority
WO
WIPO (PCT)
Prior art keywords
input
subspace
input item
training data
general characteristics
Prior art date
Application number
PCT/IB1999/000975
Other languages
French (fr)
Other versions
WO1999052059A3 (en)
Inventor
Wenyi Zhao
Original Assignee
Lg Electronics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lg Electronics, Inc. filed Critical Lg Electronics, Inc.
Publication of WO1999052059A2 publication Critical patent/WO1999052059A2/en
Publication of WO1999052059A3 publication Critical patent/WO1999052059A3/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification

Definitions

  • the present invention relates to a robust face recognition system and method, and, more particularly, to a robust face recognition system and method capable of compensating for differences between the general characteristics of input data and training data in a classification system.
  • PCA is a standard technique used to approximate original data with lower dimensional feature vectors. More specifically, using PCA, the number of vector dimensions required to represent original data is reduced, thereby simplifying calculations.
  • the basic approach of PCA recognition is to compute the eigenvectors of a covariance matrix corresponding to vectors representing the original data, and to classify the original data based on a linear combination of only the highest-order eigenvectors.
  • Although conventional application of PCA generally introduces error by considering less than all of the dimensions of the vector representing the original data, the error is generally small since the highest order eigenvalues are used.
  • Subspace LDA has been employed to improve upon conventional PCA and LDA based systems.
  • Subspace LDA involves performing LDA using a space or subspace that is generated based upon the original input space, e.g., through PCA.
  • a description of systems employing conventional PCA, LDA, and subspace LDA, and related mathematics, can be found in "Statistical Pattern Recognition" by K. Fukunaga, "Using Discriminant Eigenfeatures for Image Retrieval” by D.L. Swets and J. Weng, and “Mathematical Statistics” by S.S. Wilks, which references are herein incorporated by reference in their entirety.
  • the conventional PCA, LDA, and subspace LDA systems are each susceptible to errors introduced when the general characteristics of the original data differ from the general characteristics of the training data to which it is matched. Namely, classification errors result from differences between the general characteristics of input data and training data, such as rotational orientation, translational orientation, scale, and illumination.
  • the present invention includes an apparatus and method for classifying input data.
  • One of the methods of the present invention includes the steps of reducing differences between general characteristics of an input item and training data used to classify the input item, and classifying the input item through comparison with the training data.
  • the step of reducing differences between general characteristics of the input item and the training data includes manipulating general characteristics of an original subspace defined by the training data, projecting the input item into the manipulated subspace before classifying the input item, and determining projection coefficients that are used to project the input item into the manipulated subspace.
  • the step of classifying the input item includes comparing the projected input item to the training data, and classifying the input item by comparing the projected input item and the training data based on differences between the projection coefficients of the input item and the projection coefficients of the training data that are defined by a projection of the training data into the original subspace. These differences are determined by mapping the projection coefficients of the input item and the projection coefficients of the training data into a classification space before comparison.
  • Another method of the present invention includes classifying input data by representing the input data using an input space, manipulating the input space, projecting the input data into the manipulated input space, and classifying the input data based on projection coefficients used to project the input data into the manipulated input space.
  • the input item and training data may correspond to images, sounds, colors or other data of varying dimension, where manipulation is performed by comparing one or more of the general characteristics of the original subspace including rotational orientation, translational orientation, scale, and illumination.
  • the space or subspace to be used for classification of the image is selected based on whether the input item corresponds more closely to the training data after being projected into the manipulated subspace or after being projected into the original space or subspace.
  • An apparatus of the present invention includes a processor, collection of processors, or a program having one or more modules that are capable of performing the above-described functions. Further scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of example only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description. BRIEF DESCRIPTION OF THE ATTACHED DRAWINGS The present invention will become more fully understood from the detailed description given below and the accompanying drawings, which are given by way of illustration only and which are therefore not limiting of the present invention, and wherein:
  • Figure 1 illustrates the components of the face recognition system of the present invention;
  • FIG. 2 is a flowchart showing an example of the steps performed by the processor of the present invention.
  • Figures 3A-3B illustrate a system and method for correcting a two-dimensional rotation of the input face image with respect to the images used to form the training data
  • Figures 4A-4B illustrate a system and method for correcting misalignment of the input face image with respect to the images used to form the training data
  • Figures 5A-5B illustrate a system and method for adjusting the scale of the input face image to more closely correspond with the scale of the images used to form the training data
  • Figures 6A-6B illustrate methods for correcting differences in illumination between the input face image and the images used to form the training data, or to compensate for shadowing in the input face image
  • Figures 7A-7B are a flowchart describing a method for integrating plural manipulations of the face subspace and/or input face image
  • Figure 8 is a block diagram showing the stages performed in one of many possible implementations of an integration between plural manipulations of the face subspace and/or input face image.
  • the face recognition system of the present invention includes several modules that provide robustness against variances in the input face image.
  • the modules operate to compensate for inconsistencies between general characteristics of the input face image and the training data before classification against pre-stored face images.
  • the modules are capable of correcting inconsistencies in one or more general characteristics between the input image data and the training images, such as differences in illumination, scale, alignment and two-dimensional orientation/rotation. These modules can be used independently or in combination, depending upon the particular application.
  • Figure 1 illustrates components of the face recognition system of the present invention, including an image input device (101), an image storage device (102), and a processor (103).
  • Image input device 101 is a camera, scanner or some other input device capable of supplying an input face image.
  • Image storage device 102 is a hard drive, optical disk, RAM, ROM, or other memory device capable of storing predetermined training data, such as a predetermined set of training images, projection coefficients or data corresponding thereto.
  • Processor 103 is a digital signal processor, a microprocessor, or other device capable of performing the manipulations and comparisons discussed hereinafter.
  • Figure 2 is a flowchart showing an example of steps performed by the processor 103 of Figure 1.
  • processor 103 generates an input vector corresponding to an input face image received from image input device 101 (step 201), performs pre-processing to manipulate the face subspace if necessary (step 202), projects the input vector into the manipulated face subspace (step 203), maps the input vector or projection coefficients from the manipulated face subspace to the classification space (step 204), and compares the input vector with vectors corresponding to training data in the classification space (step 205). More specifically, in step 201, the processor generates an input vector corresponding to the input face image.
  • One method of generating such an input vector is to concatenate the rows of pixels forming the input face image, where the dimension of the input vector is defined by the number of pixels forming the input face image .
  • the processor performs one or more manipulations of a face subspace, if necessary. It is necessary to perform manipulation(s) of the input face image when the input face image and the training images have inconsistent general characteristics such as rotational orientation, scale, illumination and translational position. If an input face image and the training images have inconsistent general characteristics, projection of each onto a common subspace will produce different projection coefficients, even if the input face image and the training image are of the same person/item. These differences will likely lead to classification errors since classification is ordinarily based on a comparison between coefficients necessary to project the input face image onto the face subspace, after projecting those coefficients into the classification space. However, these classification errors can be reduced or prevented by manipulating the face subspace when the general characteristics of the input face image are inconsistent with the training images. Specifically, the face subspace is defined based on characteristics of the training images that show differences between those different training images.
  • the particular dimensions of the face subspace are generally determined through the use of principal component analysis (PCA) or artificial neural networks (ANN). These and other techniques may be used to select a reduced group of dimensions that are representative of the trained or input face image, enabling a decrease in computational burden.
  • PCA: principal component analysis
  • ANN: artificial neural networks
  • by manipulating the face subspace to have general characteristics similar to those of the input face image, the relationship between the input face image and the manipulated subspace will become consistent with the relationship between the training images and the original subspace.
  • the projection coefficients α_i required to project the input face image onto the manipulated subspace will be consistent with the projection coefficients α_i required to project the training images onto the original subspace. Since the projection coefficients are rendered consistent by this manipulation, classification error will be avoided.
  • manipulation of the face subspace in step 202 is appropriate when the input face image is not perfectly normalized, imperfect normalization occurring when that input face image is rotated in two dimensions, misaligned, changed in scale or illuminated differently relative to the images used to form the training data.
  • the input face image is also imperfectly normalized when selective portions of that image are illuminated differently due to shadowing.
  • Manipulations performed by the processor include changes in rotation as described with respect to Figures 3A-3B, changes in alignment as described with respect to Figures 4A-4B, changes in scale as described with respect to Figures 5A-5B, changes with respect to illumination as described with respect to Figures 6A-6B, and other changes not explicitly illustrated that are useful in bringing the input image into conformity with the general characteristics of the training images.
  • the input face image I can be described using components within the face subspace as I ≈ Σ_{i=1}^{M} α_i Φ_i (equation (1)), and a distorted input face image I′ can be described using a correspondingly manipulated subspace as I′ ≈ Σ_{i=1}^{M} α_i Φ′_i (equation (3))
  • N is the dimension of the original image space
  • M is the dimension of the subspace
  • α_i represents a series of projection coefficients used to project the input face image into the face subspace.
  • Φ′_i is obtained from Φ_i with the same mapping which transforms I to I′.
  • This mapping includes geometric mapping, intensity mapping, and geometrical-intensity mapping such as filtering.
  • the face subspace may be manipulated to have general characteristics Φ′_i that are inconsistent with the general characteristics Φ_i of the original face subspace to the same extent that the general characteristics of the distorted input image data I′ are inconsistent with the general characteristics of the training image data I.
  • the projection coefficients of the distorted input image data I′, obtained by projecting it onto the manipulated subspace, can be compared with the projection coefficients α_i of the training image data without distortion or error.
  • the processor projects the input vector into the manipulated face subspace.
  • the coefficients required to project a distorted input face image onto the manipulated subspace are the same as the coefficients required to project a corresponding normal training image onto the original face subspace. As such, no error is introduced into the projection coefficients through the projection of the input face image in step 203.
  • the input face image is classified by comparing the projection coefficients of the input face image to the projection coefficients of the training images.
  • the input face image is effectively projected onto a classification space C, which is a space with dimensions that are the same as or less than those of the original space. That is, the projection coefficients α_i produced when projecting the input face image from the image space into the face subspace are projected into the classification space C.
  • the projection coefficients α_i used to project the input face image onto that manipulated face subspace are mapped, in step 204, onto the classification space C.
  • LDA: linear discriminant analysis
  • the coefficients may be separated to enable a more sensitive comparison.
  • the projection coefficients α_i of the input face image are compared to the projection coefficients α_i of the different training images within the classification space C. More specifically, once projected into the classification space C, the projected projection coefficients of each are compared to determine classification of the input face image. Similarity measurements, such as a distance like the Euclidean distance, are determined for comparison of the projection coefficients α_i within the classification space.
  • One or more comparison rules, such as the Nearest-Neighbor Rule, are then used to make comparisons based on the distance determined.
  • if the projection coefficients α_i of the input face image are sufficiently similar to the projection coefficients of a particular training image, the input face image is classified according to that training image. However, if the projection coefficients α_i of the input face image are not sufficiently similar to any of the projection coefficients of the training images, the operation may be repeated, or the input face image may be rejected.
  • steps 203-205 are performed in parallel with respect to both the original face subspace and one or more manipulated subspaces. If the projection coefficients α_i used to project the input image data into the original face subspace correlate most closely with the projection coefficients used to project any one of the training images into the same face subspace, then classification is performed based on the original, unmanipulated subspace.
  • Figures 3A-6B illustrate examples of systems and methods performed by the processor during preprocessing.
  • Figures 3A-3B illustrate a system and method for correcting a two-dimensional rotation of the input face image with respect to the images used to form the training data.
  • Figure 3A shows a system in which a face image is input to processor 103 by image input device 101, and at least one training image is input to processor 103 from image storage device 102.
  • Processor 103 is shown having multiple modules for projecting the images into original and rotating subspaces, performing linear discriminant mapping, and comparing the mapped coefficients using Euclidean measurements and the nearest neighbor rule.
  • Figure 3B shows the method for correcting the two- dimensional rotation of the input face image with respect to the training images.
  • Step 301 involves rotating the face subspace by one or more predetermined angles.
  • Step 302 involves projecting the input face image onto each of the rotated subspaces generated in step 301, with projection coefficients α_i defining the projection of the input face image into each of the rotated subspaces.
  • Step 303 involves mapping the different sets of projection coefficients α_i to a classification space C. This process may be performed using a linear discriminant transform W^T based on linear discriminant analysis (LDA), or a like method.
  • Step 304 involves comparing the projection coefficients α_i of the input face image to the projection coefficients α_i of the trained images in the classification space C.
  • LDA: linear discriminant analysis
  • Step 305 involves classifying the input face image based on the comparisons made in step 304.
  • the input face image may be projected into several rotated subspaces.
  • the projection coefficients corresponding to each of those different rotated subspaces would then be mapped into classification space C via the same discriminant mapping W^T as applied to the pre-stored normal training images.
  • the projection coefficients α_i of the input face image for each of the rotated subspaces are then compared to the projection coefficients α_i of the training images in the classification space C. Classification follows based on these comparisons and rules such as the nearest-neighbor rule.
  • Figures 4A-4B illustrate a system and method for correcting misalignment of the input face image with respect to the images used to form the training data.
  • the system and method of Figures 4A-4B are similar to the system and method of Figures 3A-3B, except that the original subspace is translated rather than rotated.
  • Figures 5A-5B illustrate a system and method for adjusting the scale of the input face image to more closely correspond with the scale of the images used to form the training data.
  • the system and method of Figures 5A-5B are similar to the system and method of Figures 3A-3B, except that the original subspace is changed in scale rather than rotated.
  • Figure 6A illustrates a method for detecting and compensating shadows appearing in the input face image. Shadows are recognized where an area of the input face image is significantly darker than the brightness under normal conditions.
  • In step 601, areas within the input face image are searched for shadows, excluding areas discovered based on geometric location information that are likely to experience hair growth.
  • the search for shadows involves first and second order statistics and geometric location information. For instance, shadows may be detected by comparing a threshold against the illumination, and changes in illumination, of areas in the input face image. As such, areas of an image with relatively low intensity and low variance are candidates for shadow. More specifically, assuming the orientation of the face image is known, a shadow may be detected if the area of an image corresponding to one side is significantly darker than an area of the image corresponding to the other side.
  • In step 602, the shadows are compensated by replacing or modifying the detected shadow area with the illumination of a corresponding area of the image not having a shadow.
  • the present invention replaces shadow areas detected on one side of the face image with corresponding non-shadow areas from the other side of the face image.
  • the processed data is then transmitted for classification or further manipulation of the image or subspace.
  • Figure 6B illustrates a method for correcting differences in illumination between the input face image and the images used to form the training data.
  • In step 611, for purposes of detection (not training), pixels positioned outside the facial area of the input face image are reset to zero using a face-shaped mask.
  • In step 612, local and global textures are determined and adjusted based on a texture preserving filter having the characteristics noted in equation (4) below:
  • h(n,m) is the 2D impulse response in the time domain
  • (2M+1)(2L+1) is the window size.
  • the intensity variation I(x,y) in a local neighborhood is assumed to comprise two components: I^l(x,y), corresponding to local texture, and I^g(x,y), corresponding to smooth global variation.
  • a balanced global intensity variation may be obtained for classification by averaging the intensities of two portions of the image, e.g., I^g_left(x,y) and I^g_right(x,y).
  • Since the texture preserving filter of Equation (4) includes a moving average filter, it essentially balances the global and smooth intensity variation and preserves the local textural variation. In addition, this filter will not change the illumination of the input face image under normal, balanced illumination.
  • the texture preserving filter of Equation 4 represents a simple filter used in two dimensions, but other filters can be used in two dimensions to balance the global and/or smooth intensity variations of the input face image, particularly more complex variations such as a sharp specular. Also, if the data has less than two dimensions (e.g., sound) or more than two dimensions (e.g., color or a sequence of images), the filter can be modified to apply in a corresponding number of dimensions.
  • In step 613, after the global illumination is corrected, the projection coefficients of the processed image are further pre-processed or projected into classification space C for classification.
  • Although Figures 3A-6B describe multiple discrete methods of manipulating the face subspace and/or input face image, a combination of the manipulations may be used to address multiple sources of normalization error.
  • Figures 7A-7B show a general method for integrating a combination of manipulations, such as those described with reference to Figures 3A-6B.
  • Step 701 involves projecting an input face image into one or more manipulated subspaces as well as the original subspace.
  • Step 702 maps the projection coefficients α_i corresponding to each of the manipulated subspaces into a classification space C, and compares those projection coefficients α_i with projection coefficients α_i of the training images similarly mapped.
  • Step 703 involves calculating a correlation between the projection coefficients α_i for each manipulated subspace and the projection coefficients of the training images.
  • Step 704 requires the determination of a highest correlation among those calculated in step 703.
  • In steps 705A-705B, the input face is automatically classified if the highest correlation exceeds a threshold (e.g., 0.99).
  • if the highest correlation does not exceed the threshold, it is used to select the manipulated subspace or the original subspace corresponding to that highest correlation.
  • In step 707, one or more additional manipulations may be performed on the selected subspace to correct additional inconsistencies between the general characteristics of the input image data and the training data.
  • In step 708, shadow detection and correction and/or illumination compensation may be performed in accordance with the methods described in Figures 6A and 6B.
  • In step 709, the resulting projection coefficients α_i are mapped into a classification space C for comparison with projection coefficients α_i of the training images, and, in step 710, classification of the input face image is performed.
  • Figure 8 shows a block diagram describing steps of a system capable of performing the manipulations and corrections described in the process of Figures 7A-7B.
  • an input face image I is simultaneously projected into an original subspace 801, a rotated subspace 802, a scaled subspace 803, and a translated subspace 804.
  • Projection coefficients α_i resulting from each projection are mapped to a linear discriminant classifier 805 for comparison with projection coefficients α_i used to project the trained images onto the original face subspace.
  • Classification is attempted based on the coefficients that correlate most closely with the coefficients of a training image. If the highest correspondence between the input image and any of the training images exceeds a predetermined threshold, the results of the classification are used for identification and the process is concluded.
  • Otherwise, the projection coefficients are passed through the switch 806 to the front-view detection module 807.
  • In the front-view detection module 807, the projection coefficients α_i of the selected subspace are compared against projection coefficients corresponding to a three-dimensional rotation. If the input face image is deemed to be rotated in three dimensions based on this comparison, either of two conditions exists, depending upon the correlation between the selected image and the training image. Specifically, if the correlation between the selected input face image and the training image exceeds a second, lower predetermined threshold (e.g., 0.85), the results of the attempted classification performed in module 805 are used for identification, and the process is concluded. Alternatively, if the correlation does not exceed the second, lower predetermined threshold, the attempted classification of the face image is rejected.
  • a second, lower predetermined threshold (e.g., 0.85)
  • the selected subspace and/or corresponding projection coefficients α_i are passed to the shadow detection module 808 and light balancing module 809.
  • the shadow detecting and light balancing modules operate as described previously with respect to Figures 6A and 6B.
  • the projection coefficients α_i are mapped into the classification space C by subspace LDA classifier 810 for comparison with projection coefficients α_i of training images, and classification is ultimately performed.
  • the present invention provides a system and method for compensating for inconsistencies between the general characteristics of an input face image and training images, thus enabling more accurate classification of input face images.
  • General characteristics that may be compensated using the present invention include, but are not limited to, rotational orientation, scale, translational orientation, and illumination.
  • Although the present invention is described as manipulating the face subspace to compensate for inconsistencies in the general characteristics of the input image data and the training data, it is also possible to manipulate the input image data to compensate for such inconsistencies.
  • Although the present invention has been described with respect to classification of facial images, it is also applicable to classification systems for other types of image and non-image data, particularly sound.
  • the input data and the training data are likely to differ with regard to different general characteristics. For instance, with sound, manipulations may be required for tone, pitch, and volume.
  • the input device and storage device will obviously handle different data types. Still further, if necessary, the dimensionality of at least the input space, the subspace, the classification space and the texture preserving filter is changed to reflect the dimensionality of the input data.

Abstract

An apparatus and method for classifying input data, where differences between general characteristics of the input item and the training data are reduced by manipulating general characteristics of an original subspace defined by the training data, projecting the input item into the manipulated subspace before classifying the input item and determining projection coefficients used to project the input item into the manipulated subspace. The input item is classified by mapping the projection coefficients of the input item and the projection coefficients of the training data into a classification space. The input item and training data may correspond to images, sounds, colors or other data of varying dimension, where manipulation is performed by comparing one or more of the general characteristics of the original subspace including rotational orientation, translational orientation, scale, and illumination.

Description

METHOD AND APPARATUS FOR PERFORMING ROBUST RECOGNITION
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a robust face recognition system and method, and, more particularly, to a robust face recognition system and method capable of compensating for differences between the general characteristics of input data and training data in a classification system.
2. Discussion of the Related Art
Conventional face recognition systems have been based on geometrical local feature based schemes and holistic template matching schemes. Two forms of analysis performed under the holistic approach are principal component analysis (PCA) and linear discriminant analysis (LDA).
PCA is a standard technique used to approximate original data with lower dimensional feature vectors. More specifically, using PCA, the number of vector dimensions required to represent original data is reduced, thereby simplifying calculations. The basic approach of PCA recognition is to compute the eigenvectors of a covariance matrix corresponding to vectors representing the original data, and to classify the original data based on a linear combination of only the highest-order eigenvectors. Although conventional application of PCA generally introduces error by considering less than all of the dimensions of the vector representing the original data, the error is generally small since the highest order eigenvalues are used.
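To make the PCA step concrete, the following sketch (illustrative Python/NumPy only, not taken from the patent; function and variable names are assumptions) computes eigenimages from a set of vectorized training images and keeps the M highest-order eigenvectors:

```python
import numpy as np

def pca_eigenimages(train_vectors: np.ndarray, M: int):
    """train_vectors: shape (num_images, N), one vectorized image per row.
    Returns (mean_vector, Phi) where Phi has shape (N, M): the M eigenvectors
    of the covariance matrix with the largest eigenvalues (the eigenimages)."""
    mean = train_vectors.mean(axis=0)
    centered = train_vectors - mean
    # Covariance matrix of the image vectors (N x N). For large N, the smaller
    # Gram-matrix ("snapshot") formulation is typically used instead.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric matrix -> eigh
    order = np.argsort(eigvals)[::-1][:M]           # keep the M largest eigenvalues
    Phi = eigvecs[:, order]
    return mean, Phi

def project(image_vector: np.ndarray, mean: np.ndarray, Phi: np.ndarray):
    """Projection coefficients alpha_i of an image onto the face subspace."""
    return Phi.T @ (image_vector - mean)
```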
By contrast, LDA is used to map an input of the LDA into a classification space in which class identification may be determined based on calculations such as Euclidean distance. More specifically, LDA produces an optimal linear discriminant function F(x) = W^T x which maps an input into a classification space. The matrix W is determined based on scatter matrices computed within the feature space.
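As a rough illustration of how such a discriminant transform can be obtained from scatter matrices, the following generic Fisher-LDA sketch may help (it is not the patent's specific procedure; all names are illustrative):

```python
import numpy as np

def lda_transform(features: np.ndarray, labels: np.ndarray, dims: int):
    """features: (num_samples, d) projection coefficients; labels: (num_samples,).
    Returns W of shape (d, dims) maximizing between-class over within-class scatter."""
    overall_mean = features.mean(axis=0)
    d = features.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        Xc = features[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem Sb w = lambda Sw w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:dims]
    return eigvecs[:, order].real

# Mapping into the classification space C is then y = W.T @ x.
```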
Recently, a third system known as subspace LDA has been developed based on the holistic approach. Subspace LDA has been employed to improve upon conventional PCA and LDA based systems. Subspace LDA involves performing LDA using a space or subspace that is generated based upon the original input space, e.g., through PCA. A description of systems employing conventional PCA, LDA, and subspace LDA, and related mathematics, can be found in "Statistical Pattern Recognition" by K. Fukunaga, "Using Discriminant Eigenfeatures for Image Retrieval" by D.L. Swets and J. Weng, and "Mathematical Statistics" by S.S. Wilks, which references are herein incorporated by reference in their entirety.
The conventional PCA, LDA, and subspace LDA systems are each susceptible to errors introduced when the general characteristics of the original data differ from the general characteristics of the training data to which it is matched. Namely, classification errors result from differences between the general characteristics of input data and training data, such as rotational orientation, translational orientation, scale, and illumination.
SUMMARY OF THE INVENTION
It is an object of the present invention to overcome the shortcomings of prior art classification systems.
It is also an object of the present invention to provide a system and method capable of compensating differences between general characteristics of input data and training data. It is an object of the present invention to provide a classification system and method for suppressing the effects of a difference in rotational orientation between input image data and training data against which the input image data is compared.
It is an object of the present invention to provide a classification system and method for suppressing the effects of a difference in scale between input image data and training data against which the input image data is compared.
It is an object of the present invention to provide a classification system and method for suppressing the effects of a difference in alignment between input image data and training data against which the input image data is compared.
It is an object of the present invention to provide a classification system and method for suppressing the effects of a difference in illumination between input image data and training data against which the input image data is compared.
It is an object of the present invention to provide a classification system and method for compensating for shadows on different portions of an input image.
To achieve these and other advantages and in accordance with the purpose of the present invention, as embodied and broadly described, the present invention includes an apparatus and method for classifying input data.
One of the methods of the present invention includes the steps of reducing differences between general characteristics of an input item and training data used to classify the input item, and classifying the input item through comparison with the training data. The step of reducing differences between general characteristics of the input item and the training data includes manipulating general characteristics of an original subspace defined by the training data, projecting the input item into the manipulated subspace before classifying the input item, and determining projection coefficients that are used to project the input item into the manipulated subspace. The step of classifying the input item includes comparing the projected input item to the training data, and classifying the input item by comparing the projected input item and the training data based on differences between the projection coefficients of the input item and the projection coefficients of the training data that are defined by a projection of the training data into the original subspace. These differences are determined by mapping the projection coefficients of the input item and the projection coefficients of the training data into a classification space before comparison.
Another method of the present invention includes classifying input data by representing the input data using an input space, manipulating the input space, projecting the input data into the manipulated input space, and classifying the input data based on projection coefficients used to project the input data into the manipulated input space.
In these and other methods of the present invention, the input item and training data may correspond to images, sounds, colors or other data of varying dimension, where manipulation is performed by comparing one or more of the general characteristics of the original subspace including rotational orientation, translational orientation, scale, and illumination. Thus, the space or subspace to be used for classification of the image is selected based on whether the input item corresponds more closely to the training data after being projected into the manipulated subspace or after being projected into the original space or subspace.
An apparatus of the present invention includes a processor, collection of processors, or a program having one or more modules that are capable of performing the above-described functions. Further scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of example only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE ATTACHED DRAWINGS
The present invention will become more fully understood from the detailed description given below and the accompanying drawings, which are given by way of illustration only and which are therefore not limiting of the present invention, and wherein:
Figure 1 illustrates the components of the face recognition system of the present invention;
Figure 2 is a flowchart showing an example of the steps performed by the processor of the present invention;
Figures 3A-3B illustrate a system and method for correcting a two-dimensional rotation of the input face image with respect to the images used to form the training data;
Figures 4A-4B illustrate a system and method for correcting misalignment of the input face image with respect to the images used to form the training data; Figures 5A-5B illustrate a system and method for adjusting the scale of the input face image to more closely correspond with the scale of the images used to form the training data;
Figures 6A-6B illustrate methods for correcting differences in illumination between the input face image and the images used to form the training data, or to compensate for shadowing in the input face image; Figures 7A-7B are a flowchart describing a method for integrating plural manipulations of the face subspace and/or input face image; and
Figure 8 is a block diagram showing the stages performed in one of many possible implementations of an integration between plural manipulations of the face subspace and/or input face image.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. In the drawings, similar components and processes are identified by like reference numerals.
In general terms, the face recognition system of the present invention includes several modules that provide robustness against variances in the input face image. The modules operate to compensate for inconsistencies between general characteristics of the input face image and the training data before classification against pre-stored face images. The modules are capable of correcting inconsistencies in one or more general characteristics between the input image data and the training images, such as differences in illumination, scale, alignment and two-dimensional orientation/rotation. These modules can be used independently or in combination, depending upon the particular application.
Figure 1 illustrates components of the face recognition system of the present invention, including an image input device (101), an image storage device (102), and a processor (103). Image input device 101 is a camera, scanner or some other input device capable of supplying an input face image. Image storage device 102 is a hard drive, optical disk, RAM, ROM, or other memory device capable of storing predetermined training data, such as a predetermined set of training images, projection coefficients or data corresponding thereto. Processor 103 is a digital signal processor, a microprocessor, or other device capable of performing the manipulations and comparisons discussed hereinafter.
Figure 2 is a flowchart showing an example of steps performed by the processor 103 of Figure 1. Generally, processor 103 generates an input vector corresponding to an input face image received from image input device 101 (step 201), performs pre-processing to manipulate the face subspace if necessary (step 202), projects the input vector into the manipulated face subspace (step 203), maps the input vector or projection coefficients from the manipulated face subspace to the classification space (step 204), and compares the input vector with vectors corresponding to training data in the classification space (step 205). More specifically, in step 201, the processor generates an input vector corresponding to the input face image. One method of generating such an input vector is to concatenate the rows of pixels forming the input face image, where the dimension of the input vector is defined by the number of pixels forming the input face image.
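For example, generating the input vector from a grayscale image by concatenating its rows of pixels might look like the following sketch (illustrative only; the patent does not prescribe a particular implementation):

```python
import numpy as np

def image_to_vector(image: np.ndarray) -> np.ndarray:
    """Concatenate the rows of a 2D grayscale image into a single vector.
    The vector dimension N equals the number of pixels in the image."""
    return image.reshape(-1).astype(np.float64)  # row-major flattening
```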
In step 202, the processor performs one or more manipulations of a face subspace, if necessary. It is necessary to perform manipulation(s) of the input face image when the input face image and the training images have inconsistent general characteristics such as rotational orientation, scale, illumination and translational position. If an input face image and the training images have inconsistent general characteristics, projection of each onto a common subspace will produce different projection coefficients, even if the input face image and the training image are of the same person/item. These differences will likely lead to classification errors since classification is ordinarily based on a comparison between coefficients necessary to project the input face image onto the face subspace, after projecting those coefficients into the classification space. However, these classification errors can be reduced or prevented by manipulating the face subspace when the general characteristics of the input face image are inconsistent with the training images. Specifically, the face subspace is defined based on characteristics of the training images that show differences between those different training images.
The particular dimensions of the face subspace are generally determined through the use of principal component analysis (PCA) or artificial neural networks (ANN). These and other techniques may be used to select a reduced group of dimensions that are representative of the trained or input face image, enabling a decrease in computational burden.
By manipulating the face subspace to have general characteristics similar to those of the input face image, the relationship between the input face image and the manipulated subspace will become consistent with the relationship between the training images and the original subspace. Thus, the projection coefficients α_i required to project the input face image onto the manipulated subspace will be consistent with the projection coefficients α_i required to project the training images onto the original subspace. Since the projection coefficients are rendered consistent by this manipulation, classification error will be avoided.
Moreover, manipulation of the face subspace in step 202 is appropriate when the input face image is not perfectly normalized, imperfect normalization occurring when that input face image is rotated in two dimensions, misaligned, changed in scale or illuminated differently relative to the images used to form the training data. The input face image is also imperfectly normalized when selective portions of that image are illuminated differently due to shadowing.
Manipulations performed by the processor include changes in rotation as described with respect to Figures 3A-3B, changes in alignment as described with respect to Figures 4A-4B, changes in scale as described with respect to Figures 5A-5B, changes with respect to illumination as described with respect to Figures 6A-6B, and other changes not explicitly illustrated that are useful in bringing the input image into conformity with the general characteristics of the training images.
Mathematically, using subspace representation, the input face image I can be described using components within the face subspace as follows:

I ≈ Σ_{i=1}^{M} α_i Φ_i ... (1),

where N is the dimension of the original image space, M is the dimension of the subspace, Φ_i represents a series of eigenimages arranged according to the decreasing order of the corresponding eigenvalues, and α_i represents a series of projection coefficients used to project the input face image into the face subspace. Similarly, a distorted input face image I′ having general characteristics that differ from the general characteristics of the training data can be described using components within the face subspace as follows:

I′ ≈ Σ_{i=1}^{M} α′_i Φ_i ... (2),

wherein α′_i represents a series of projection coefficients used to project the distorted input face image I′ into the face subspace. To obtain projection coefficients for the distorted input face image I′ that are consistent with those of the training data, equation (2) can be represented as follows:

I′ ≈ Σ_{i=1}^{M} α_i Φ′_i ... (3),

where Φ′_i is obtained from Φ_i with the same mapping which transforms I to I′. This mapping includes geometric mapping, intensity mapping, and geometrical-intensity mapping such as filtering. Thus, the face subspace may be manipulated to have general characteristics Φ′_i that are inconsistent with the general characteristics Φ_i of the original face subspace to the same extent that the general characteristics of the distorted input image data I′ are inconsistent with the general characteristics of the training image data I. As such, the projection coefficients of the distorted input image data I′, obtained by projecting it onto the manipulated subspace, can be compared with the projection coefficients α_i of the training image data without distortion or error. In step 203, the processor projects the input vector into the manipulated face subspace. Because the general characteristics of the face subspace are made consistent with the general characteristics of the input face image in step 202, the coefficients required to project a distorted input face image onto the manipulated subspace are the same as the coefficients required to project a corresponding normal training image onto the original face subspace. As such, no error is introduced into the projection coefficients through the projection of the input face image in step 203.
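The idea behind equations (1)-(3) can be illustrated with a small numerical sketch (illustrative only; it assumes the distortion T is a linear map of the image grid, such as a rotation or translation, so that applying T to each eigenimage yields the manipulated subspace):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 100, 5                                    # image-space and subspace dimensions
Phi = np.linalg.qr(rng.normal(size=(N, M)))[0]   # orthonormal stand-in eigenimages
alpha = rng.normal(size=M)                       # training-image coefficients
I = Phi @ alpha                                  # normalized image in the face subspace

T = np.linalg.qr(rng.normal(size=(N, N)))[0]     # stand-in linear (orthogonal) distortion
I_dist = T @ I                                   # distorted input image
Phi_man = T @ Phi                                # manipulated (distorted) eigenimages

# Projecting the distorted image onto the manipulated subspace recovers
# coefficients that match the undistorted training-style coefficients alpha.
alpha_rec = Phi_man.T @ I_dist
print(np.allclose(alpha_rec, alpha))             # True, since T keeps Phi_man orthonormal
```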
In steps 204 and 205, the input face image is classified by comparing the projection coefficients of the input face image to the projection coefficients of the training images. In step 204, the input face image is effectively projected onto a classification space C, which is a space with dimensions that are the same as or less than those of the original space. That is, the projection coefficients α_i produced when projecting the input face image from the image space into the face subspace are projected into the classification space C. Thus, if the face subspace is manipulated in step 202, the projection coefficients α_i used to project the input face image onto that manipulated face subspace are mapped, in step 204, onto the classification space C.
Here, linear discriminant analysis (LDA) is used to project coefficients, such as the projection coefficients α_i of the input face image and training images, into a classification space C for the pending comparison. Using LDA, the coefficients may be separated to enable a more sensitive comparison. In step 205, the projection coefficients α_i of the input face image are compared to the projection coefficients α_i of the different training images within the classification space C. More specifically, once projected into the classification space C, the projected projection coefficients of each are compared to determine classification of the input face image. Similarity measurements, such as a distance like the Euclidean distance, are determined for comparison of the projection coefficients α_i within the classification space. One or more comparison rules, such as the Nearest-Neighbor Rule, are then used to make comparisons based on the distance determined.
If the projection coefficients α_i of the input face image are sufficiently similar to the projection coefficients α_i of a particular training image, the input face image is classified according to that training image. However, if the projection coefficients α_i of the input face image are not sufficiently similar to any of the projection coefficients α_i of the training images, the operation may be repeated, or the input face image may be rejected.
Generally, to determine whether manipulation is necessary, steps 203-205 are performed in parallel with respect to both the original face subspace and one or more manipulated subspaces. If the projection coefficients α_i used to project the input image data into the original face subspace correlate most closely with the projection coefficients α_i used to project any one of the training images into the same face subspace, then classification is performed based on the original, unmanipulated subspace. However, if the projection coefficients α_i used to project the input image data into one of the manipulated subspaces most closely correlate to the projection coefficients α_i used to project one of the training images, then classification is performed based on that manipulated subspace rather than the original face subspace. This process is described in more detail with respect to Figures 7A-7B and 8. Alternatively, although not shown in the drawings, it is also possible to detect inconsistencies between the input image data and those of the training images, and to determine whether to manipulate based on such detections rather than continuously performing the processes in parallel.
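A compact sketch of the comparison in the classification space C (Euclidean distance plus a nearest-neighbor rule with rejection; the rejection threshold and the function names are illustrative assumptions, not values specified by the patent):

```python
import numpy as np

def classify(input_coeffs, train_coeffs, train_labels, W, reject_dist=None):
    """Map projection coefficients into the classification space C via the
    discriminant transform W, then apply the nearest-neighbor rule."""
    y = W.T @ input_coeffs                       # input mapped into C
    Y = train_coeffs @ W                         # each training item mapped into C
    dists = np.linalg.norm(Y - y, axis=1)        # Euclidean distances in C
    nearest = int(np.argmin(dists))
    if reject_dist is not None and dists[nearest] > reject_dist:
        return None                              # not similar enough: reject or repeat
    return train_labels[nearest]
```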
Figures 3A-6B illustrate examples of systems and methods performed by the processor during preprocessing.
Figures 3A-3B illustrate a system and method for correcting a two-dimensional rotation of the input face image with respect to the images used to form the training data. Figure 3A shows a system in which a face image is input to processor 103 by image input device 101, and at least one training image is input to processor 103 from image storage device 102. Processor 103 is shown having multiple modules for projecting the images into original and rotating subspaces, performing linear discriminant mapping, and comparing the mapped coefficients using Euclidean measurements and the nearest neighbor rule.
Figure 3B shows the method for correcting the two-dimensional rotation of the input face image with respect to the training images. Step 301 involves rotating the face subspace by one or more predetermined angles. Step 302 involves projecting the input face image onto each of the rotated subspaces generated in step 301, with projection coefficients α_i defining the projection of the input face image into each of the rotated subspaces. Step 303 involves mapping the different sets of projection coefficients α_i to a classification space C. This process may be performed using a linear discriminant transform W^T based on linear discriminant analysis (LDA), or a like method. Step 304 involves comparing the projection coefficients α_i of the input face image to the projection coefficients α_i of the trained images in the classification space C. Step 305 involves classifying the input face image based on the comparisons made in step 304. Using the above-described method, the input face image may be projected into several rotated subspaces. The projection coefficients corresponding to each of those different rotated subspaces would then be mapped into classification space C via the same discriminant mapping W^T as applied to the pre-stored normal training images. The projection coefficients α_i of the input face image for each of the rotated subspaces are then compared to the projection coefficients α_i of the training images in the classification space C. Classification follows based on these comparisons and rules such as the nearest-neighbor rule.
The rotated versions of the face subspace can be obtained from the original subspace by rotating the eigenimages associated therewith. Thus, the eigenimages associated with the rotated subspaces may be either stored or recomputed routinely, depending upon the availability of storage space and processor time.
Figures 4A-4B illustrate a system and method for correcting misalignment of the input face image with respect to the images used to form the training data. The system and method of Figures 4A-4B are similar to the system and method of Figures 3A-3B, except that the original subspace is translated rather than rotated. That is, the original subspace is shifted in an appropriate direction (left, right, up, down, diagonal, combinational direction) to account for a misalignment between the input face image and the pre-stored training images. Because of the similarities between Figures 4A-4B and Figures 3A-3B, further discussion of the system and method of Figures 4A-4B is omitted.
Figures 5A-5B illustrate a system and method for adjusting the scale of the input face image to more closely correspond with the scale of the images used to form the training data. The system and method of Figures 5A-5B are similar to the system and method of Figures 3A-3B, except that the original subspace is changed in scale rather than rotated. That is, the scale of the original subspace is changed (increased, decreased) to account for differences in scale between the input face image and the training images. Such differences may result from differences in the equipment used to generate the images, or the distance between the object and the equipment during image acquisition. Because of the similarities between Figures 5A-5B and Figures 3A-3B, further discussion of the system and method of Figures 5A-5B is omitted.
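One way to realize such manipulated subspaces is to apply the corresponding geometric transform to each stored eigenimage, as in the following sketch (illustrative only; it uses scipy.ndimage, which the patent of course does not reference, and the crop/pad handling for scaling is an assumption):

```python
import numpy as np
from scipy.ndimage import rotate, shift, zoom

def manipulate_subspace(Phi, image_shape, kind, amount):
    """Phi: (N, M) eigenimages stored as columns of vectorized images.
    Returns eigenimages transformed by a rotation (degrees), a translation
    ((dy, dx) in pixels), or a change of scale (zoom factor)."""
    transformed = []
    for i in range(Phi.shape[1]):
        eig = Phi[:, i].reshape(image_shape)
        if kind == "rotate":
            out = rotate(eig, angle=amount, reshape=False, order=1)
        elif kind == "translate":
            out = shift(eig, shift=amount, order=1)
        elif kind == "scale":
            z = zoom(eig, amount, order=1)
            out = np.zeros(image_shape)
            # crop or pad back to the original size so dimensions still match
            h = min(image_shape[0], z.shape[0]); w = min(image_shape[1], z.shape[1])
            out[:h, :w] = z[:h, :w]
        else:
            raise ValueError(kind)
        transformed.append(out.reshape(-1))
    return np.stack(transformed, axis=1)
```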
Figure 6A illustrates a method for detecting and compensating shadows appearing in the input face image. Shadows are recognized where an area of the input face image is significantly darker than the brightness under normal conditions.
In step 601, areas within the input face image are searched for shadows, excluding areas discovered based on geometric location information that are likely to experience hair growth. The search for shadows involves first and second order statistics and geometric location information. For instance, shadows may be detected by comparing a threshold against the illumination, and changes in illumination, of areas in the input face image. As such, areas of an image with relatively low intensity and low variance are candidates for shadow. More specifically, assuming the orientation of the face image is known, a shadow may be detected if the area of an image corresponding to one side is significantly darker than an area of the image corresponding to the other side.
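A rough sketch of the step 601 search, using first- and second-order statistics and a left/right comparison (the thresholds, region layout, and function name are illustrative assumptions, not values from the patent):

```python
import numpy as np

def detect_shadow_side(face: np.ndarray, dark_ratio=0.6, var_thresh=0.01):
    """Return 'left', 'right', or None if no shadow is detected.
    face: grayscale image normalized to [0, 1], assumed roughly frontal."""
    h, w = face.shape
    left, right = face[:, : w // 2], face[:, w // 2:]
    # First- and second-order statistics of each half.
    ml, mr = left.mean(), right.mean()
    vl, vr = left.var(), right.var()
    # A side that is significantly darker than the other, with low variance,
    # is a candidate shadow region.
    if ml < dark_ratio * mr and vl < var_thresh:
        return "left"
    if mr < dark_ratio * ml and vr < var_thresh:
        return "right"
    return None
```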
In step 602, the shadows are compensated by replacing or modifying the detected shadow area with the illumination of a corresponding area of the image not having a shadow. For instance, if the face image is a non-distorted frontal view (i.e., no significant two-dimensional distortions or three-dimensional rotation), the present invention replaces shadow areas detected on one side of the face image with corresponding non-shadow areas from the other side of the face image. The processed data is then transmitted for classification or further manipulation of the image or subspace.
Figure 6B illustrates a method for correcting differences in illumination between the input face image and the images used to form the training data. In step 611, for purposes of detection (not training), pixels positioned outside the facial area of the input face image are reset to zero using a face-shaped mask. In step 612, local and global textures are determined and adjusted based on a texture preserving filter having the characteristics noted in equation (4) below:
h(n,m) = δ[n]δ[m] - (1/((2M+1)(2L+1))) Σ_{k=-M..M} Σ_{j=-L..L} δ[n-k]δ[m-j] ... (4),

where h(n,m) is the 2D impulse response in the spatial domain, and (2M+1)(2L+1) is the window size. Specifically, the intensity variation I(x,y) in a local neighborhood is assumed to comprise two components: I^t(x,y), corresponding to local texture, and I^g(x,y), corresponding to smooth global variation. Thus, the intensity variations of two portions of an image, such as the left intensity variation I_left(x,y) and the right intensity variation I_right(x,y), are expressed as:

I_left(x,y) = I^g_left(x,y) + I^t_left(x,y),

and I_right(x,y) = I^g_right(x,y) + I^t_right(x,y) ... (5),
where the globally varying intensities are related by an unknown unbalanced illumination function. Rather than obtaining this function, a balanced global intensity variation may be obtained for classification by averaging the intensities of two portions of the image, e.g., I^g_left(x,y) and I^g_right(x,y). By adding the balanced global intensity variation to the original local texture extracted using the texture preserving filter of Equation (4), a face image can then be obtained. Since the texture preserving filter of Equation (4) includes a moving average filter, it essentially balances the global and smooth intensity variation while preserving the local textural variation. In addition, this filter will not change the illumination of the input face image under normal, balanced illumination. The texture preserving filter of Equation (4) represents a simple filter used in two dimensions, but other filters can be used in two dimensions to balance the global and/or smooth intensity variations of the input face image, particularly for more complex variations such as a sharp specular highlight. Also, if the data has fewer than two dimensions (e.g., sound) or more than two dimensions (e.g., color or a sequence of images), the filter can be modified to apply in a corresponding number of dimensions.
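The decomposition around Equation (4) can be sketched in a few lines of numpy/scipy. In the example below, which is only an illustrative reading of the text and not the specification's implementation, a moving average estimates the smooth global component, the residual is kept as local texture, and the global component is balanced by averaging the left half with the mirrored right half; the window size and the function name are assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

def balance_illumination(face, window=(9, 9)):
    """Keep local texture (image minus its moving average) and replace the
    smooth global component with one balanced between the two halves.
    (Illustrative sketch; the window size is an arbitrary example.)"""
    face = face.astype(float)
    global_part = uniform_filter(face, size=window, mode='nearest')  # smooth I^g
    texture = face - global_part                                     # local I^t
    h, w = face.shape
    balanced = global_part.copy()
    left = global_part[:, : w // 2]
    right = global_part[:, w - w // 2 :]
    avg = 0.5 * (left + np.fliplr(right))          # balanced global variation
    balanced[:, : w // 2] = avg
    balanced[:, w - w // 2 :] = np.fliplr(avg)
    return texture + balanced

Under already balanced illumination the two halves of the global component are nearly equal, so the output is essentially the input image, matching the behavior described above.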
In step 613, after the global illumination is corrected, the projection coefficients of the processed image are further pre-processed or projected into the classification space C for classification. Although Figures 3A-6B describe multiple discrete methods of manipulating the face subspace and/or the input face image, a combination of these manipulations may be used to address multiple sources of normalization error. Figures 7A-7B show a general method for integrating a combination of manipulations, such as those described with reference to Figures 3A-6B.
In Figures 7A-7B, step 701 involves projecting an input face image into one or more manipulated subspaces as well as the original subspace. Step 702 maps the projection coefficients αi corresponding to each of the manipulated subspaces into a classification space C, and compares those projection coefficients with the projection coefficients αi of the training images similarly mapped. Step 703 involves calculating a correlation between the projection coefficients αi for each manipulated subspace and the projection coefficients of the training images. Step 704 requires determining the highest correlation among those calculated in step 703. In steps 705A-705B, the input face is automatically classified if the highest correlation exceeds a threshold (e.g., 0.99). However, if the highest correlation does not exceed the threshold in steps 705A-705B, the highest correlation is used to select the manipulated subspace or the original subspace corresponding to that highest correlation. In step 707, one or more additional manipulations may be performed on the selected subspace to correct additional inconsistencies between the general characteristics of the input image data and the training data. In step 708, shadow detection and correction and/or illumination compensation may be performed in accordance with the methods described in Figures 6A and 6B. In step 709, the resulting projection coefficients αi are mapped into the classification space C for comparison with the projection coefficients αi of the training images, and, in step 710, classification of the input face image is performed.
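Steps 701 through the subspace selection can be summarized by the following numpy sketch, which is only an illustrative reading of this paragraph: each candidate subspace basis projects the image, the coefficients are mapped into the classification space, and the best correlation against the mapped training images decides whether to classify immediately or to select that subspace for further correction. The variable names, the cosine-style correlation, and the data layout are assumptions; the 0.99 acceptance threshold follows the example in the text.

import numpy as np

def best_subspace_match(image, subspaces, W_lda, train_feats, accept=0.99):
    """subspaces: dict name -> (K, H*W) basis; W_lda: (d, K) map into the
    classification space C; train_feats: (N, d) mapped training images.
    Returns the best-correlated subspace and whether it clears the threshold."""
    x = image.ravel().astype(float)
    best = {'corr': -1.0, 'subspace': None, 'train_index': None}
    for name, basis in subspaces.items():
        alpha = basis @ x                                # projection coefficients
        feat = W_lda @ alpha                             # classification-space vector
        corrs = (train_feats @ feat) / (
            np.linalg.norm(train_feats, axis=1) * np.linalg.norm(feat) + 1e-12)
        i = int(np.argmax(corrs))
        if corrs[i] > best['corr']:
            best = {'corr': float(corrs[i]), 'subspace': name, 'train_index': i}
    best['classified'] = best['corr'] >= accept          # steps 705A-705B
    return best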
Figure 8 shows a block diagram of a system capable of performing the manipulations and corrections described in the process of Figures 7A-7B. In Figure 8, an input face image I is simultaneously projected into an original subspace 801, a rotated subspace 802, a scaled subspace 803, and a translated subspace 804. The projection coefficients αi resulting from each projection are mapped to a linear discriminant classifier 805 for comparison with the projection coefficients αi used to project the training images onto the original face subspace. Classification is attempted based on the coefficients that correlate most closely with the coefficients of a training image. If the highest correspondence between the input image and any of the training images exceeds a predetermined threshold, the results of the classification are used for identification and the process is concluded. Otherwise, the projection coefficients are passed through the switch 806 to the front-view detection module 807. In the front-view detection module 807, the projection coefficients αi of the selected subspace are compared against projection coefficients corresponding to a three-dimensional rotation. If the input face image is deemed to be rotated in three dimensions based on this comparison, one of two conditions exists depending upon the correlation between the selected image and the training image. Specifically, if the correlation between the selected input face image and the training image exceeds a second, lower predetermined threshold (e.g., 0.85), the results of the attempted classification performed in module 805 are used for identification, and the process is concluded. Alternatively, if the correlation does not exceed the second, lower predetermined threshold, the attempted classification of the face image is rejected.
However, if the input face image is deemed a frontal view based on the comparison performed in module 807, the selected subspace and/or the corresponding projection coefficients αi are passed to the shadow detection module 808 and the light balancing module 809. The shadow detection and light balancing modules operate as described previously with respect to Figures 6A and 6B. Upon completion of shadow detection and light balancing, the projection coefficients αi are mapped into the classification space C by the subspace LDA classifier 810 for comparison with the projection coefficients αi of the training images, and classification is ultimately performed. Although Figures 7A-7B and 8 describe a specific order among the processes, the noted manipulations and/or the correction of illumination may be performed in any order, with or without any of the modules specifically mentioned. Furthermore, these processes may be performed in parallel, in series, or in any combination thereof.
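The acceptance and rejection logic of Figure 8 reduces to a few branches. The sketch below is only a schematic summary of the two preceding paragraphs; the function name is hypothetical, and the 0.99 and 0.85 values follow the example thresholds given in the text.

def decide(best_correlation, is_frontal, high=0.99, low=0.85):
    """Schematic decision logic for Figure 8 (illustrative only)."""
    if best_correlation >= high:
        return 'accept'                 # classifier 805 is confident enough
    if not is_frontal:                  # module 807: rotated in three dimensions
        return 'accept' if best_correlation >= low else 'reject'
    return 'compensate'                 # frontal view: run modules 808-810

Whether the compensation path simply re-runs classification in module 810 or also feeds back through additional subspace manipulations is an ordering choice, as noted above.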
Moreover, the present invention provides a system and method for compensating for inconsistencies between the general characteristics of an input face image and the training images, thus enabling more accurate classification of input face images. General characteristics that may be compensated using the present invention include, but are not limited to, rotational orientation, scale, translational orientation, and illumination.
Although the present invention is described as manipulating the face subspace to compensate for inconsistencies in the general characteristics of the input image data and the training data, it is also possible to manipulate the input image data to compensate for such inconsistencies.
It is also possible to use a composite projection, rather than a series of projections (e.g., image space to original/manipulated subspace to classification space) , effectively reducing storage requirements and processing time. In addition, standard intensity normalization can be used, such as histogram equalization or unit variance normalization.
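Because both projections are linear, the image-to-subspace mapping and the subspace-to-classification-space mapping can be pre-multiplied into a single matrix, which is one way to realize the composite projection mentioned above. The sketch below is an illustration under assumed matrix names and shapes, not the specification's implementation.

import numpy as np

# W_pca: (K, H*W) subspace basis; W_lda: (d, K) mapping into classification space.
# Pre-multiplying yields a single (d, H*W) matrix, so recognition needs only one
# matrix-vector product per input image instead of two.
def composite_projection(W_pca, W_lda):
    return W_lda @ W_pca

# Illustrative usage:
# W = composite_projection(W_pca, W_lda)
# feature = W @ input_image.ravel()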
In addition, although the present invention has been described with respect to the classification of facial images, it is also applicable to classification systems for other types of image and non-image data, particularly sound. However, in those contexts, the input data and the training data are likely to differ with regard to different general characteristics. For instance, with sound, manipulations may be required for tone, pitch, and volume. Furthermore, the input device and storage device will obviously handle different data types. Still further, if necessary, the dimensionality of at least the input space, the subspace, the classification space, and the texture preserving filter is changed to reflect the dimensionality of the input data.
While there have been illustrated and described what are presently considered the preferred embodiments of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made, and equivalents may be substituted for elements thereof, without departing from the true scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teaching of the present invention without departing from the central scope thereof. Therefore, it is intended that the present invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out the present invention, but that the present invention include all embodiments falling within the scope of the appended claims.

Claims

What is claimed is:
1. A method for classifying input data, comprising: reducing differences between general characteristics of an input item and training data used to classify the input item; and classifying the input item through comparison with the training data.
2. The method recited by claim 1, wherein the input item and the training data correspond to images.
3. The method recited by claim 1, wherein reducing differences between general characteristics of the input item and the training data comprises: manipulating general characteristics of an original subspace defined by the training data; and projecting the input item into the manipulated subspace before classifying the input item.
4. The method recited by claim 3, wherein classifying the input item comprises: comparing the projected input item to the training data; and classifying the input item based on the comparison between the projected input item and the training data.
5. The method recited by claim 4, wherein reducing differences between general characteristics of the input item and the pre-stored training data further comprises: determining projection coefficients of the input item that are used to project the input item into the manipulated subspace, and wherein comparing the projected input item to the training data comprises: comparing the projection coefficients of the input item with projection coefficients of the training data that are defined by a projection of the training data into the original subspace.
6. The method recited by claim 5, wherein comparing the projected input item to the training data further comprises: mapping the projection coefficients of the input item and the projection coefficients of the training data into a classification space before comparison.
7. The method recited by claim 3, wherein manipulating general characteristics of the original subspace comprises: changing at least one of the general characteristics of the original subspace including rotational orientation, translational orientation, scale, and illumination.
8. The method recited by claim 7, further comprising: selecting a subspace for classification based on whether the input item corresponds more closely to the training data after being projected into the manipulated subspace or after being projected into the original subspace.
9. The method recited by claim 7, wherein manipulation involves changing more than one of the general characteristics of the original subspace.
10. An apparatus for classifying input data, comprising: means for reducing differences between general characteristics of an input item and training data used to classify the input item; and means for classifying the input item through comparison with the training data.
11. The apparatus recited by claim 10, wherein the input item and the training data correspond to images.
12. The apparatus recited by claim 10, wherein the means for reducing differences between general characteristics of the input item and the training data comprise: means for manipulating general characteristics of an original subspace defined by the training data; and means for projecting the input item into the manipulated subspace before classifying the input item.
13. The apparatus recited by claim 12, wherein the means for classifying the input item comprise: means for comparing the projected input item to the training data; and means for classifying the input item based on the comparison between the projected input item and the training data.
14. The apparatus recited by claim 13, wherein the means for reducing differences between general characteristics of the input item and the pre-stored training data further comprise: means for determining projection coefficients of the input item that are used to project the input item into the manipulated subspace, and wherein the means for comparing the projected input item to the training data comprise: means for comparing the projection coefficients of the input item with projection coefficients of the training data that are defined by a projection of the training data into the original subspace.
15. The apparatus recited by claim 14, wherein the means for comparing the projected input item to the training data further comprise: means for mapping the projection coefficients of the input item and the projection coefficients of the training data into a classification space before comparison.
16. The apparatus recited by claim 12, wherein the means for manipulating general characteristics of the original subspace comprise: changing at least one of the general characteristics of the original subspace including rotational orientation, translational orientation, scale, and illumination.
17. The apparatus recited by claim 16, further comprising: means for selecting a subspace for classification based on whether the input item corresponds more closely to the training data after being projected into the manipulated subspace or after being projected into the original subspace.
18. The apparatus recited by claim 16, wherein manipulation involves changing more than one of the general characteristics of the original subspace.
19. A method for classifying input data, comprising: representing the input data using an input space; manipulating the input space; projecting the input data into the manipulated input space; and classifying the input data based on projection coefficients used to project the input data into the manipulated input space.
20. The method recited by claim 19, wherein the input data represents an image, and wherein manipulating the input space comprises: forming a subspace based on the input space; and changing at least one of the general characteristics of the subspace including rotational orientation, translational orientation, scale, and illumination.
Drawings:
FIG. 1 (drawing).
FIG. 2 (flowchart): START; GENERATE INPUT VECTOR (201); MANIPULATE FACE SUBSPACE (202); PROJECT INPUT VECTOR INTO MANIPULATED FACE SUBSPACE (203); MAP PROJECTION COEFFICIENTS INTO CLASSIFICATION SPACE (204); CLASSIFY INPUT IMAGE (205); END.
FIG. 4A (drawing).
The remaining drawing sheets contain figures reproduced as images only.
PCT/IB1999/000975 1998-04-06 1999-04-06 Method and apparatus for performing robust recognition WO1999052059A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US5523198A 1998-04-06 1998-04-06
US09/055,231 1998-04-06

Publications (2)

Publication Number Publication Date
WO1999052059A2 true WO1999052059A2 (en) 1999-10-14
WO1999052059A3 WO1999052059A3 (en) 2000-08-31

Family

ID=21996540

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB1999/000975 WO1999052059A2 (en) 1998-04-06 1999-04-06 Method and apparatus for performing robust recognition

Country Status (2)

Country Link
KR (1) KR20010013501A (en)
WO (1) WO1999052059A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11928582B1 (en) * 2018-12-31 2024-03-12 Cadence Design Systems, Inc. System, media, and method for deep learning

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040042500A (en) * 2002-11-14 2004-05-20 엘지전자 주식회사 Face detection based on pca-lda

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5386103A (en) * 1993-07-06 1995-01-31 Neurnetics Ltd. Identification and verification system
US5497430A (en) * 1994-11-07 1996-03-05 Physical Optics Corporation Method and apparatus for image recognition using invariant feature signals
US5699449A (en) * 1994-11-14 1997-12-16 The University Of Connecticut Method and apparatus for implementation of neural networks for face recognition
US6038337A (en) * 1996-03-29 2000-03-14 Nec Research Institute, Inc. Method and apparatus for object recognition

Also Published As

Publication number Publication date
WO1999052059A3 (en) 2000-08-31
KR20010013501A (en) 2001-02-26

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

WWE Wipo information: entry into national phase

Ref document number: 1019997011508

Country of ref document: KR

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

122 Ep: pct application non-entry in european phase
WWP Wipo information: published in national office

Ref document number: 1019997011508

Country of ref document: KR

WWW Wipo information: withdrawn in national office

Ref document number: 1019997011508

Country of ref document: KR