We present a face recognition technique based on the sparse representation of facial patches. A dictionary is learned from data, and patches extracted from a face are decomposed sparsely over this dictionary. We focus in particular on the design of the dictionaries, which plays a crucial role in the final identification rates. Applied to several databases and modalities, this approach yields promising performance. We also propose a score fusion framework that quantifies the saliency of the classifiers' outputs and merges them according to these saliencies.

1. Introduction

Face recognition has attracted increasing interest over the last two decades due to its many possible applications: biometrics, video surveillance, advanced human-machine interfaces, and image/video indexing. Although considerable progress has been made in this domain, especially with the development of powerful methods such as Eigenfaces and Elastic Bunch Graph Matching, automatic face recognition is still not accurate enough in uncontrolled environments for widespread use. Many factors can degrade the performance of a facial biometric system: illumination variation creates artificial shadows that locally change the appearance of the face; head pose modifies the distances between localized features; facial expressions introduce global changes; and worn accessories, such as glasses or a scarf, may hide parts of the face. For the particular case of illumination, much work has been devoted to preprocessing the images to reduce the effect of illumination on the face. Another approach is to use a different imaging modality, such as infrared, which has been shown to be a promising alternative. An infrared capture of a face is nearly invariant to illumination changes and allows a system to operate under all illumination conditions, including total darkness.
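The core mechanism, learning a dictionary from patch data and sparsely decomposing face patches over it, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the toy image, patch size, dictionary size, and sparsity level are all assumed values, and scikit-learn's `DictionaryLearning` stands in for whichever learning algorithm is used.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(0)
face = rng.random((32, 32))            # placeholder for a real face image

# Extract small patches and flatten each into a signal vector.
patches = extract_patches_2d(face, (8, 8), max_patches=200, random_state=0)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=1, keepdims=True)     # remove the per-patch mean

# Jointly learn an overcomplete dictionary and the sparse codes of the
# patches; OMP limits each patch to at most 5 dictionary atoms.
dico = DictionaryLearning(n_components=100,
                          transform_algorithm="omp",
                          transform_n_nonzero_coefs=5,
                          max_iter=10, random_state=0)
codes = dico.fit_transform(X)          # one sparse coefficient row per patch
```

At identification time, the sparse codes (rather than raw pixels) would serve as the feature representation fed to a classifier.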
While visible-light cameras measure electromagnetic energy in the visible spectrum (0.4–0.7 μm), infrared sensors respond to radiation in the infrared spectrum (0.7–14.0 μm). The infrared spectrum can be divided mainly into reflected IR (Figure 1(b)) and emissive IR (Figure 1(c)). Reflected IR comprises near infrared (NIR, 0.7–0.9 μm) and short-wave infrared (SWIR, 0.9–2.4 μm). The thermal IR band is associated with the thermal radiation emitted by objects; it comprises midwave infrared (MWIR, 3.0–5.0 μm) and long-wave infrared (LWIR, 8.0–14.0 μm). Although reflected IR is by far the most studied, we use thermal long-wave IR in this study.

Figure 1: A face captured under (a) the visible spectrum, (b) reflected IR, and (c) emissive IR.