Proceedings of the 25th Chinese Control Conference
7-11 August, 2006, Harbin, Heilongjiang

Face Recognition with Only One Training Sample

Chong Lu 1,2, Wanquan Liu 2 and Senjian An 2
1. Dept. of Computer Science, YiLi Normal College, Yining, China 835000
E-mail: [email protected]
2. Dept. of Computing, Curtin University of Technology, WA 6102
E-mail: {lu,wanquan,senjian}@curtin.edu.au

Abstract: In this paper, we compare the face recognition performance of five different methods that use only one training sample. Firstly, we investigate the Singular Value Decomposition (SVD) of the face image and propose an augmenting algorithm that uses only one sample to generate a group of training samples. We then perform face recognition with the Discrete Cosine Transform (DCT) and Two-Dimensional Principal Component Analysis (2DPCA). Secondly, we implement a face recognition approach via DCT directly with one training sample. Thirdly, we use DCT to generate several low-frequency matrices in the frequency domain, which are then converted into the spatial domain as independent training images; 2DPCA is then used for face recognition. Finally, we use DCT to generate several low-frequency matrices in the frequency domain and use DCT for face recognition. Experiments on the AMP and Yale face databases show that the DCT+2DPCA approach produces better results on the AMP database, while the SVD+2DPCA approach produces better results on the Yale database.

Key Words: SVD, 2DPCA, DCT, face recognition, classification

1 INTRODUCTION

Face recognition has received extensive attention as one of the most significant applications of image understanding [1],[2],[3]. Research into face recognition has flourished in recent years due to the increased need for surveillance with more robust systems, and it has attracted a multidisciplinary research effort, in particular for techniques based on PCA [4],[5] and 2DPCA [6].
Those approaches usually use large and representative sets of training samples per person to enhance the recognition rate under variations in illumination, pose, facial expression, make-up, etc. However, large training sets cannot be guaranteed in practice, for example in identity card verification or passport verification. In such situations, only one frontal image per person, captured under controlled lighting conditions, is available for training. Some face recognition algorithms have been proposed to solve the face recognition problem with only a single training image [7],[8]. SPCA [7] combines the original training image with a derived image obtained by perturbing the image matrix's singular values and then performs PCA on the joined images. (PC)^2A [8] combines the original training image with its vertical and horizontal projections and then performs PCA on the enriched version of the image.

In this paper, we propose an algorithm based on an analysis of the Singular Value Decomposition (SVD) of the face image. Firstly, we combine the first and several of the largest singular values with their corresponding feature spaces to generate more training samples. Then DCT and 2DPCA are performed on all the derived training images available. To compare the results, we performed five types of experiments, as detailed in this paper. Experiments on the AMP and Yale face databases [9] show that DCT+2DPCA produces much better results on the AMP database. SVD+2DPCA and DCT+2DPCA produce better results on the Yale database.

2 FACE RECOGNITION ALGORITHM

2.1 2DPCA

In this section, we briefly outline the standard procedure for building the eigenvector space from a set of training images. We represent the input images as matrices $A_i \in R^{m \times n}$, $i = 1, 2, \ldots, M$, where $m \times n$ is the image size in pixels and $M$ is the number of images. We adopt the following criterion as in [6]:

$$J(X) = \operatorname{tr}(S_X) = \operatorname{tr}\{X^T [E(A - EA)^T (A - EA)] X\} \tag{1}$$

where $S_X$ is the covariance matrix of $A_i$ ($i = 1, 2, \ldots, M$) with the projection matrix $X$, and $E$ is the expectation of a stochastic variable. In fact, the covariance matrix $G \in R^{n \times n}$ with $M$ images is:

$$G = \frac{1}{M} \sum_{j=1}^{M} (A_j - \bar{A}_M)^T (A_j - \bar{A}_M)$$

where $\bar{A}_M = \frac{1}{M} \sum_{i=1}^{M} A_i$ is the mean image matrix of the $M$ training samples. Alternatively, the criterion in (1) can be expressed as follows:

$$J(X) = \operatorname{tr}(X^T G X)$$

where $X$ is a unitary column vector. The matrix $X$ that maximizes the criterion is called the optimal projection axis. The optimal projection $X_{opt}$ is a set of unitary vectors that maximize $J(X)$, i.e. the eigenvectors of $G$ corresponding to the largest eigenvalues. We usually choose a subset of only $d$ eigenvectors corresponding to the larger eigenvalues to be included in the model, that is, $\{X_1, \ldots, X_d\} = \arg\max J(X)$ satisfying $X_i^T X_j = 0$ ($i \neq j$; $i, j = 1, 2, \ldots, d$). Each image can thus be optimally approximated in the least-squares sense up to a predefined reconstruction error. For face recognition, every input image $A_i$ is projected into a point $Y_i = A_i X$ in the $d$-dimensional subspace spanned by the selected eigen matrix $X$ [10]. When given a testing image $B$, we also project $B$ into a point in the subspace with the eigen matrix $X$, namely $Y_B = BX$. Then a nearest neighbor classifier is used for classification:

$$d(Y_{i_0}, Y_B) = \min_i \|Y_i - Y_B\|$$

where $B$ is regarded to be the individual represented by $A_{i_0}$.

2.2 DCT

The DCT has been widely applied to solve numerous problems in the digital signal processing community [11]. The DCT of an $m \times n$ image $f(x, y)$ is defined by

$$C(u, v) = \frac{2}{\sqrt{mn}}\, a(u)\, a(v) \sum_{x=0}^{n-1} \sum_{y=0}^{m-1} f(x, y) \cos\left[\frac{(2x+1)u\pi}{2n}\right] \cos\left[\frac{(2y+1)v\pi}{2m}\right]$$

for $u = 0, 1, \ldots, n-1$, $v = 0, 1, \ldots, m-1$, and the inverse transform is defined by

$$f(x, y) = \frac{2}{\sqrt{mn}} \sum_{u=0}^{n-1} \sum_{v=0}^{m-1} a(u)\, a(v)\, C(u, v) \cos\left[\frac{(2x+1)u\pi}{2n}\right] \cos\left[\frac{(2y+1)v\pi}{2m}\right]$$

for $x = 0, 1, \ldots, n-1$, $y = 0, 1, \ldots, m-1$, where $a(\omega) = 1/\sqrt{2}$ for $\omega = 0$ and $a(\omega) = 1$ otherwise.

For an $m \times n$ image, we have an $m \times n$ DCT coefficient matrix covering all the spatial frequency components of the image. The most significant facial features, such as the eyes, mouth, hair and face outline, can be preserved by a very small number of low-frequency DCT coefficients. Those facial features are more stable than the variable high-frequency facial features. The feature matrices contain the low-to-mid frequency DCT coefficients. To recognize a particular input face, the system compares this face's feature matrices to the feature matrices of the database faces using a Euclidean distance nearest-neighbor classifier [14],[15]. The Euclidean distance between one image of the training set and the test image is

$$d = \mathrm{norm}(A_i - B_i) \quad (i = 1, 2, \ldots, n)$$

where $n$ is the number of low-frequency matrices per test image. A match is obtained by minimizing $d$.

2.3 SVD

Let $A \in R^{m \times n}$ denote a face image. The SVD of $A$, expressed as [U,D,V] = svd(A), produces a diagonal matrix $D$ of the same dimension as $A$, with nonnegative diagonal elements in decreasing order, and unitary matrices $U$ and $V$ so that $A = U D V^T = U_{m \times m} D_{m \times n} V^T_{n \times n}$ [13],[7]. We can approximate $A$ by the following combinations:

$$A_1 = \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T$$
$$A_2 = \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T + \lambda_3 u_3 v_3^T$$
$$A_3 = \lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T + \lambda_3 u_3 v_3^T + \lambda_4 u_4 v_4^T$$

where $u_i$ denotes the $i$-th column of $U$, $v_i^T$ denotes the transpose of the $i$-th column of $V$, and $\lambda_i$ denotes the $i$-th diagonal value of $D$. Therefore, we can produce several images from one image in this way. Then we can use 2DPCA and DCT to do face classification based on these derived images.

2.4 Face recognition

Based on the derived training samples, we can use five approaches to do face recognition, as shown in Chart 1.

Chart 1: Five approaches to do face recognition

Approach 1: We use SVD to obtain new training images $A_1$, $A_2$, $A_3$ from one sample $A$, and then use DCT to do face recognition. We denote this method as SVD+DCT.

Approach 2: We use SVD to obtain new training images $A_1$, $A_2$, $A_3$ from one sample $A$, and then use 2DPCA to do face recognition. We denote this method as SVD+2DPCA.

Approach 3: We use DCT to transform the training sample into the frequency domain and then use DCT to do classification. We denote this method as DCT directly.

Approach 4: We use DCT to generate low-frequency matrices, such as 8 × 8 or 16 × 16 matrices, which are then converted into the spatial domain with different low-frequency blocks, and then use 2DPCA to do classification. We denote this method as DCT+2DPCA.

Approach 5: We use DCT to generate low-frequency matrices and use these low-frequency matrices to do classification. We denote this method as DCT+DCT.

3 EXPERIMENTAL RESULTS AND ANALYSIS

In this section, we carry out several experiments to demonstrate the performance of the five proposed techniques. We use two face databases. One is the AMP face database, which contains 975 images of 13 individuals (each person has 75 different images) under various facial expressions and lighting conditions, with each image cropped and resized to 64 × 64 pixels in this experiment. The other is the Yale face database [8], which contains 640 face images of 10 people, including frontal views of faces with different facial expressions and lighting conditions. With these databases, we conduct five experiments. We use only the first image per person as the training sample; the others are used for testing.

Firstly: We use SVD to obtain new training images $A_1$, $A_2$, $A_3$ from only one sample $A$.
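The SVD-based derivation of $A_1$, $A_2$, $A_3$ used in this first step (Section 2.3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, the function name `svd_augment` and its `ranks` parameter are hypothetical, and a random 64 × 64 array stands in for a real face image.

```python
import numpy as np


def svd_augment(image, ranks=(2, 3, 4)):
    """Return truncated-SVD reconstructions of `image`, one per rank in `ranks`.

    Hypothetical helper: with ranks (2, 3, 4) the outputs correspond to
    A1, A2, A3 of Section 2.3.
    """
    # full_matrices=False gives U (m x k), s (k,), Vt (k x n) with k = min(m, n);
    # NumPy returns the singular values in s sorted in decreasing order.
    u, s, vt = np.linalg.svd(np.asarray(image, dtype=float), full_matrices=False)
    derived = []
    for r in ranks:
        # Sum of the first r rank-one terms lambda_i * u_i * v_i^T.
        derived.append((u[:, :r] * s[:r]) @ vt[:r, :])
    return derived


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((64, 64))        # placeholder for a 64 x 64 face image
    a1, a2, a3 = svd_augment(face)
    training_set = [face, a1, a2, a3]  # original image plus derived samples
    for r, img in zip((2, 3, 4), (a1, a2, a3)):
        err = np.linalg.norm(face - img) / np.linalg.norm(face)
        print(f"rank {r}: relative reconstruction error {err:.3f}")
```

Each derived image keeps only the leading singular components, so the reconstruction error shrinks as more terms are added; which subset of derived images to keep alongside the original is an experimental choice.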
Then we use DCT to do classification; the results are shown in Figure 1, and the best result is shown in Table 1. The same is done on the Yale database; the results are shown in Table 6 and Figure 2.

Secondly: We use SVD to obtain new training images $A_1$, $A_2$, $A_3$ from only one sample $A$. Then we use 2DPCA to do classification; the results are shown in Table 2 and Figure 1. We may also obtain more or fewer than three new training images, but only two new images combined with the original image gave the best result on the AMP database in our experiments. Only $A_1$ combined with the original image produces the best result on the Yale database, as shown in Table 7 and Figure 2.

Thirdly: We use DCT to transform the training image to the frequency domain as the training sample and do classification based on this one training sample. The results are shown in Table 3 on the AMP database and Table 8 on the Yale database. We may also obtain more or fewer than three new training images, but only three new images combined with the original image gave a better result on the AMP database in experiments, as shown in Figure 1 and Figure 2.

Fourthly: We use DCT to generate low-frequency matrices, such as 8 × 8 or 16 × 16 matrices, and then convert them into the spatial domain. We combined three converted spatial images with the original image to form the training samples. Then we use 2DPCA to do classification; the results are shown in Table 4. We may also obtain more or fewer than two new training images, but only two new images, derived from the 8 × 8 and 16 × 16 matrices, combined with the original image gave the best result on the AMP database in experiments. The same holds on the Yale database, as shown in Table 8.

Finally: We use DCT to generate low-frequency matrices and do classification with these low-frequency blocks. The results are shown in Table 5. We may also obtain more or fewer than two new training images, but only two new images, derived from the 8 × 8 and 16 × 16 matrices, combined with the original image gave the best result on the AMP database in experiments. On the Yale database, only two new images, derived from the 16 × 16 and 24 × 24 matrices, combined with the original image gave the best result. The results are shown in Table 10.

Table 1: SVD + DCT (AMP)
Individual    1    2    3    4    5    6    7
True         74   73   74   74   59   47   74
False         0    1    0    0   15   27    0
Individual    8    9   10   11   12   13  total
True         74   74   74   74   74   74   919
False         0    0    0    0    0    0    43

Table 2: SVD + 2DPCA (AMP)
Individual    1    2    3    4    5    6    7
True         74   60   65   74   63   74   74
False         0   14    9    0   11    0    0
Individual    8    9   10   11   12   13  total
True         54   74   74   74   72   74   901
False        20    0    0    0    2    0    57

Table 3: DCT directly (AMP)
Individual    1    2    3    4    5    6    7
True         74   70   74   73   54   46   74
False         0    4    0    1   20   28    0
Individual    8    9   10   11   12   13  total
True         74   74   74   74   74   74   909
False         0    0    0    0    0    0    53

Table 4: DCT + 2DPCA (AMP)
Individual    1    2    3    4    5    6    7
True         74   73   74   70   60   58   74
False         0    1    0    4   14   16    0
Individual    8    9   10   11   12   13  total
True         74   74   74   74   74   74   927
False         0    0    0    0    0    0    35

Table 5: DCT + DCT (AMP)
Individual    1    2    3    4    5    6    7
True         74   74   74   68   64   62   74
False         0    0    0    6   10   12    0
Individual    8    9   10   11   12   13  total
True         74   74   74   74   74   74   934
False         0    0    0    0    0    0    28

Figure 1: Comparison of five approaches under the AMP database. Axes: recognition accuracy versus number of features. Curves: low-frequency DCT, SVD+DCT, SVD+2DPCA, DCT+2DPCA, DCT directly.

Table 1: SVD + DCT (Yale)
Individual    1    2    3    4    5    6
True         28   32   32   23   29   52
False        35   31   31   40   34   11
Individual    8    9   10   11  total
True         44   44   63   25   372
False        19   19    0   38   258

Table 2: SVD + 2DPCA (Yale)
Individual    1    2    3    4    5    6
True         32   34   43   27   48   63
False        31   29   20   36   15    0
Individual    8    9   10   11  total
True         49   56   39   24   415
False        14    7   24   39   215

Table 3: DCT directly (Yale)
Individual    1    2    3    4    5    6
True         28   32   33   23   29   52
False        35   31   30   40   34   11
Individual    8    9   10   11  total
True         45   44   63   26   375
False        18   19    0   37   255

Table 4: DCT + 2DPCA (Yale)
Individual    1    2    3    4    5    6
True         28   34   31   25   32   39
False        35   29   32   38   31   24
Individual    8    9   10   11  total
True         56   46   63   32   386
False         7   17    0   31   244

Table 5: DCT + DCT (Yale)
Individual    1    2    3    4    5    6
True         27   33   32   23   29   52
False        36   30   31   40   34   11
Individual    8    9   10   11  total
True         43   44   63   26   372
False        20   19    0   37   258

Figure 2: Comparison of five approaches under the Yale database. Axes: recognition accuracy versus number of features. Curves: low-frequency DCT, SVD+DCT, SVD+2DPCA, DCT+2DPCA, DCT directly.

From Figure 1 and Figure 2, we can see that DCT+2DPCA is very stable on the two databases. With the large noise variations in the Yale database, SVD+2DPCA performs very well, and it can deal with different illumination and expressions. DCT+DCT works well on the AMP database, in which the faces are quite easy to recognize.

4 CONCLUSION

In this paper, we proposed how to derive more training samples using SVD and DCT. To enhance classification performance with only a single training sample, each original training image is combined with its reconstructed images produced by our proposed algorithm, and then 2DPCA and DCT are performed on the derived training images. We also performed experiments on one training sample with different schemes. Experiments on the AMP and Yale face databases show that DCT+2DPCA is quite stable because it combines frequency-domain knowledge with a spatial-domain technique. Further, the performance based on the derived training samples is improved over that based on only one training sample.

REFERENCES

[1] W. Zhao, R. Chellappa, A. Rosenfeld, and P.J. Phillips, "Face Recognition: A Literature Survey", Technical Report CAR-TR-948, Univ. of Maryland, CfAR, 2000.
[2] Lei Zhang and Dimitris Samaras, "Face Recognition from a Single Training Image under Arbitrary Unknown Lighting Using Spherical Harmonics", Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 3, March 2006.
[3] Ronny Tjahyadi, Wanquan Liu and Svetha Venkatesh, "Automatic Parameter Selection for Eigenface", Proceedings of the 6th International Conference on Optimization Techniques and Applications (ICOTA 2004).
[4] M.A. Grudin, "On Internal Representations in Face Recognition Systems", Pattern Recognition, vol. 33, no. 7, 1161-1177, 2000.
[5] L. Zhao and Y. Yang, "Theoretical Analysis of Illumination in PCA-Based Vision Systems", Pattern Recognition, vol. 32, no. 4, 547-564, 1999.
[6] Jian Yang et al., "Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 1, 131-137, Jan. 2004.
[7] D. Zhang, S. Chen, and Z.H. Zhou, "A New Face Recognition Method Based on SVD Perturbation for Single Example Image per Person", Applied Mathematics and Computation, 163(2), 895-907, 2005.
[8] J. Wu and Z.H. Zhou, "Face Recognition with One Training Image per Person", Pattern Recognition Letters, 23(14), 1711-1719, 2002.
[9] ftp//plucky.cs.yale.edu/CVC/pub/images/yalefaceB/Tarsets
[10] L. Sirovich and M. Kirby, "Low-Dimensional Procedure for Characterization of Human Faces", J. Optical Soc. Am., vol. 4, 519-524, 1987.
[11] N. Ahmed, T. Natarajan, and K. Rao, "Discrete Cosine Transform", IEEE Trans. on Computers, 23(1), 90-93, 1974.
[12] Z. Pan, A.G. Rust, and H. Bolouri, "Image Redundancy Reduction for Neural Network Classification Using Discrete Cosine Transforms", IEEE Neural Networks, vol. 3, 149-154, 2000.
[13] J.J. Gerbrands, "On the Relationships between SVD, KLT and PCA", Pattern Recognition, 14, 375-381, 1981.
[14] R.O. Duda and P.E. Hart, "Pattern Classification and Scene Analysis", Wiley: New York, NY, 1973.
[15] Ziad M. Hafed and Martin D. Levine, "Face Recognition Using the Discrete Cosine Transform", International Journal of Computer Vision, 43(3), 167-188, 2001.