Evaluation of Face Recognition Algorithms

PAPERS: Early CSU Papers, 2000 to 2003

Summary of papers 2000 to 2003:

Here are published and unpublished papers related to our own work on the evaluation of face recognition algorithms. Abstracts and brief notes about each paper are included below, along with pointers to the papers themselves.

Abstracts and Background Information:


Marcio Luis Teixeira
Abstract
This thesis documents work that was performed using the Bayesian Intrapersonal/Extrapersonal Classifier (BIC). We examine the implementation of the algorithm and address several numerical stability issues identified in the original design of the classifier. We also examine the performance of the algorithm on standard FERET data sets and explore a hybrid classifier which combines features of the BIC with a standard nearest-neighbor classifier.
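The central idea the thesis examines, classifying an image-difference as intrapersonal or extrapersonal, can be sketched with two Gaussian models over difference vectors. This is an illustrative sketch under simplifying assumptions (diagonal covariances, no subspace projection, invented function names), not the thesis implementation:

```python
import numpy as np

def fit_diag_gaussian(diffs):
    """Fit a diagonal-covariance Gaussian to a set of difference vectors."""
    mu = diffs.mean(axis=0)
    var = diffs.var(axis=0) + 1e-6  # small variance floor for numerical stability
    return mu, var

def log_likelihood(d, mu, var):
    """Log-density of difference vector d under a diagonal Gaussian."""
    return -0.5 * np.sum((d - mu) ** 2 / var + np.log(2 * np.pi * var))

def same_person(d, intra_model, extra_model):
    """Decide whether difference d is intrapersonal or extrapersonal
    by comparing log-likelihoods under the two fitted models."""
    return log_likelihood(d, *intra_model) > log_likelihood(d, *extra_model)
```

The numerical-stability concerns the thesis addresses show up even in this toy version: the variance floor guards against near-zero variances that would otherwise blow up the log-likelihood.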
Reference:
The Bayesian Intrapersonal/Extrapersonal Classifier, Marcio Teixeira, Masters Thesis, CSU Computer Science Department, July 2003.
Notes:
This is Marcio Teixeira's Masters Thesis. It describes both his implementation of the Moghaddam and Pentland Intrapersonal/Extrapersonal Image Difference Face Recognition Algorithm and a series of experiments exploring the behavior of the algorithm on the FERET data.

David S. Bolme
Abstract
Elastic Bunch Graph Matching is a face recognition algorithm that is distributed with CSU's Evaluation of Face Recognition Algorithms System. The algorithm is modeled after the Bochum/USC face recognition algorithm used in the FERET evaluation. The algorithm recognizes novel faces by first localizing a set of landmark features and then measuring similarity between these features. Both localization and comparison use Gabor jets extracted at landmark positions. In localization, jets extracted from novel images are matched against a set of training/model jets. Similarity between novel images is expressed as a function of similarity between localized Gabor jets corresponding to facial landmarks. A study of how accurately a landmark is localized using different displacement estimation methods is presented. The overall performance of the algorithm subject to changes in the number of training/model images, choice of specific wavelet encoding, displacement estimation technique, and Gabor jet similarity measure is explored in a series of independent tests. Several findings were particularly striking, including results suggesting that landmark localization is less reliable than might be expected. However, it is also striking that this did not appear to greatly degrade recognition performance.
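The jet comparison step can be illustrated with the magnitude-only similarity measure common in the Elastic Bunch Graph Matching literature, a normalized dot product of jet magnitudes. This is a simplified sketch (phase-sensitive measures and the displacement estimation discussed above are omitted; the function names are invented):

```python
import numpy as np

def jet_similarity(a, b):
    """Normalized dot product (cosine similarity) of two Gabor jet
    magnitude vectors; 1.0 means identical direction, 0.0 orthogonal."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def face_similarity(jets1, jets2):
    """Similarity between two faces: mean jet similarity over
    corresponding landmark positions."""
    return float(np.mean([jet_similarity(a, b) for a, b in zip(jets1, jets2)]))
```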
Reference:
Elastic Bunch Graph Matching, David Bolme, Masters Thesis, CSU Computer Science Department, June 2003.
Notes:
This is David Bolme's Masters Thesis. It describes both this implementation of the Elastic Bunch Graph Matching algorithm and a series of experiments exploring the behavior of the algorithm on the FERET data.

Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme
Abstract
This paper discusses the design and use of linear, generalized linear, and generalized linear mixed models for evaluation and comparison of human face recognition algorithms. These models are introduced in a cohesive framework, and their benefits (compared to simple one-way descriptive comparisons) are reviewed. Several example analyses involving algorithm configuration and subject covariates illustrate the importance of using models that allow one to control for confounding variables, to estimate all important effects including interactions, and to isolate extraneous sources of variation.
Reference:
Analysis of Recognition Algorithms using Linear, Generalized Linear, and Generalized Linear Mixed Models, Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme, CSU Technical Report.
Notes:
This paper is a succinct overview of several types of linear models with clear illustrations of how they may be used in the context of evaluating face recognition algorithms. It summarizes the methodology and results from the two studies covered in: A Statistical Assessment of Subject Factors in the PCA Recognition of Human Faces and Using A Generalized Linear Mixed Model to Study the Configuration Space of a PCA+LDA Human Face Recognition Algorithm.

Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme
Abstract
Some people's faces are easier to recognize than others, but it is not obvious what subject-specific factors make individual faces easy or difficult to recognize. This study considers 11 factors that might make recognition easy or difficult for 1,072 human subjects in the FERET dataset. The specific factors are: race (white, Asian, African-American, or other), gender, age (young or old), glasses (present or absent), facial hair (present or absent), bangs (present or absent), mouth (closed or other), eyes (open or other), complexion (clear or other), makeup (present or absent), and expression (neutral or other). An ANOVA is used to determine the relationship between these subject covariates and the distance between pairs of images of the same subject in a standard Eigenfaces subspace. Some results are not terribly surprising. For example, the distance between pairs of images of the same subject increases for people who change their appearance, e.g., open and close their eyes, open and close their mouth, or change expression. Thus changing appearance makes recognition harder. Other findings are surprising. Distance between pairs of images for subjects decreases for people who consistently wear glasses, so wearing glasses makes subjects more recognizable. Pairwise distance also decreases for people who are either Asian or African-American rather than white. A possible shortcoming of our analysis is that minority classifications such as African-Americans and wearers-of-glasses are underrepresented in training. Follow-up experiments with balanced training address this concern and corroborate the original findings. Another possible shortcoming of this analysis is the novel use of pairwise distance between images of a single person as the predictor of recognition difficulty. A separate experiment confirms that larger distances between pairs of subject images imply a larger recognition rank for that same pair of images, thus confirming that the subject is harder to recognize.
Reference:
A Statistical Assessment of Subject Factors in the PCA Recognition of Human Faces, Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme, CVPR 2003 Workshop on Statistical Analysis in Computer Vision, June 2003.
Notes:
This paper presents one of the few large studies associating common characteristics of human subjects with recognition difficulty using a standard face recognition algorithm. This work was presented at the CVPR 2003 Workshop on Statistical Analysis in Computer Vision, and here are the PowerPoint and PDF versions of the talk.

Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme
Abstract
A generalized linear mixed model is used to estimate how rank 1 recognition of human faces with a PCA+LDA algorithm is affected by the choice of distance metric, image size, PCA space dimensionality, supplemental training and inclusion of subjects in the training. Random effects for replicated training sets and for repeated measures on people were included in the model. Results indicate between people variation was a dominant source of variability, and that there was moderate correlation within people. Statistically significant effects and interactions were found for all configuration factors except image size. Changes to the PCA+LDA configuration only improved recognition for subjects who had images included in the training data. For subjects not included in training, no configuration changes were helpful. This study is instructive for what it reveals about PCA+LDA. It is also a model for how to conduct such studies. For example, by accounting for subject variation as a random effect and explicitly looking for interaction effects, we are able to discern effects that might otherwise have been masked by subject variation and interaction effects.
Reference:
Using A Generalized Linear Mixed Model to Study the Configuration Space of a PCA+LDA Human Face Recognition Algorithm, Geof H. Givens, J. Ross Beveridge, Bruce A. Draper and David Bolme, CSU Computer Science Technical Report.
Notes:
This should be considered a draft manuscript. It is complete, but there are aspects of the experiment that we may refine in the next few months (3/20/03).

David S. Bolme, J. Ross Beveridge, Marcio Teixeira and Bruce A. Draper
Abstract
The CSU Face Identification Evaluation System provides standard face recognition algorithms and standard statistical methods for comparing face recognition algorithms. The system includes standardized image pre-processing software, three distinct face recognition algorithms, analysis software to study algorithm performance, and Unix shell scripts to run standard experiments. All code is written in ANSI C. The preprocessing code replicates features of the preprocessing used in the FERET evaluations. The three algorithms provided are Principal Components Analysis (PCA), a.k.a. Eigenfaces, a combined Principal Components Analysis and Linear Discriminant Analysis algorithm (PCA+LDA), and a Bayesian Intrapersonal/Extrapersonal Classifier (BIC). The PCA+LDA and BIC algorithms are based upon algorithms used in the FERET study contributed by the University of Maryland and MIT, respectively. There are two analysis tools. The first takes as input a set of probe images, a set of gallery images, and a similarity matrix produced by one of the three algorithms. It generates a Cumulative Match Curve of recognition rate versus recognition rank. The second analysis tool generates a sample probability distribution for recognition rate at recognition rank 1, 2, etc. It takes as input multiple images per subject, and uses Monte Carlo sampling in the space of possible probe and gallery choices. This procedure will, among other things, add standard error bars to a Cumulative Match Curve. The system is available through our website, and we hope it will be used by others to rigorously compare novel face identification algorithms to standard algorithms using a common implementation and known comparison techniques.
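The first analysis tool described above, which turns a similarity matrix into a Cumulative Match Curve, can be sketched as follows. This is a Python sketch for illustration only; the actual CSU system is written in ANSI C, and the closed-set assumption (every probe subject appears in the gallery) and function names are mine:

```python
import numpy as np

def cumulative_match_curve(sim, probe_ids, gallery_ids):
    """Cumulative match curve from a probe-by-gallery similarity matrix.

    sim[i, j] is the similarity of probe i to gallery image j (larger means
    more similar). Assumes every probe subject has at least one gallery
    image. Returns cmc where cmc[r] is the fraction of probes whose correct
    match appears within the top r+1 ranks.
    """
    sim = np.asarray(sim, float)
    n_probes, n_gallery = sim.shape
    ranks = np.empty(n_probes, dtype=int)
    for i in range(n_probes):
        order = np.argsort(-sim[i])  # gallery indices, best match first
        correct = [j for j in range(n_gallery) if gallery_ids[j] == probe_ids[i]]
        ranks[i] = min(np.where(np.isin(order, correct))[0])  # best 0-based rank
    return np.array([np.mean(ranks <= r) for r in range(n_gallery)])
```

The second tool would then resample probe/gallery choices and recompute this curve many times to attach error bars to it.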
Reference:
“The CSU Face Identification Evaluation System: Its Purpose, Features and Structure”, D. Bolme, R. Beveridge, M. Teixeira and B. Draper, International Conference on Vision Systems, pp. 304-311, Graz, Austria, April 1-3, 2003. Published by Springer-Verlag.
Notes:
This article is a condensed version of The CSU Face Identification Evaluation System User’s Guide: Version 5.0. The User's Guide is included with our current algorithm distribution.

Here are slides for the talk that accompanies this paper (PowerPoint) (PDF).


B. Draper, K. Baek, M.S. Bartlett and R. Beveridge
Abstract
This paper compares principal component analysis (PCA) and independent component analysis (ICA) in the context of a baseline face recognition system, a comparison motivated by contradictory claims in the literature. This paper shows how the relative performance of PCA and ICA depends on the task statement, the ICA architecture, the ICA algorithm, and (for PCA) the subspace distance metric. It then explores the space of PCA/ICA comparisons by systematically testing two ICA algorithms and two ICA architectures against PCA with four different distance measures on two tasks (facial identity and facial expression). In the process, this paper verifies the results of many of the previous comparisons in the literature, and relates them to each other and to this work. We are able to show that the FastICA algorithm configured according to ICA architecture II yields the highest performance for identifying faces, while the InfoMax algorithm configured according to ICA architecture II is better for recognizing facial actions. In both cases, PCA performs well but not as well as ICA.
Reference:
Recognizing Faces with PCA and ICA, B. Draper, K. Baek, M.S. Bartlett and R. Beveridge, Computer Vision and Image Understanding, (to appear).
Notes:
This paper greatly expands and largely supersedes PCA vs. ICA: A Comparison on the FERET Data Set.

J. Ross Beveridge, Kai She, Bruce Draper and Geof H. Givens.
Abstract
This paper reviews some of the major issues associated with the statistical evaluation of Human Identification algorithms, emphasizing comparisons between algorithms on the same set of sample images. A general notation is developed and common performance metrics are defined. A simple success/failure evaluation methodology where recognition rate depends upon a binomially distributed random variable, recognition count, is developed and the conditions under which this model is appropriate are discussed. Some nonparametric techniques are also introduced, including bootstrapping. When applied to estimating the distribution of recognition count for a single set of i.i.d. sampled probe images, bootstrapping is noted as equivalent to the parametric binomial model. Bootstrapping applied to recognition rate over resampled sets of images can be problematic. Specifically, sampling with replacement to form image probe sets is shown to introduce a conflict between assumptions required by bootstrapping and the way recognition rate is computed. In part to overcome this difficulty with bootstrapping, a different nonparametric Monte Carlo method is introduced, and its utility illustrated with an extended example. This method permutes the choice of gallery and probe images. It is used to answer two questions. Question 1: How much does recognition rate vary when comparing images of individuals taken on different days using the same camera? Question 2: When is the observed difference in recognition rates for two distinct algorithms significant relative to this variation? Two important general features of nonparametric methods are illustrated by the Monte Carlo study. First, within some broad limits, resampling generates sample distributions for any statistic of interest. Second, through careful choice of an appropriate statistic and subsequent estimation of its distribution, domain specific hypotheses may be readily formulated and tested.
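The Monte Carlo method described above, permuting which of each subject's images serves as gallery and which as probe, can be sketched as follows. The sketch assumes each subject has at least two images and that a rank-1 classifier is supplied by the caller; all names are illustrative:

```python
import random

def recognition_rate_distribution(images_by_subject, classify, trials=1000, seed=0):
    """Approximate the sampling distribution of rank-1 recognition rate
    by permuting gallery/probe choices.

    images_by_subject: dict mapping subject id -> list of that subject's
    images (at least two each). classify(probe, gallery) returns the subject
    id the classifier picks for `probe` against `gallery` (a dict of subject
    id -> gallery image). Returns one recognition rate per trial.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        gallery, probes = {}, {}
        for subj, imgs in images_by_subject.items():
            g, p = rng.sample(imgs, 2)  # pick two distinct images per subject
            gallery[subj], probes[subj] = g, p
        correct = sum(classify(p, gallery) == s for s, p in probes.items())
        rates.append(correct / len(probes))
    return rates
```

The resulting list of rates supports exactly the two questions posed above: its spread answers how much recognition rate varies over image choices, and comparing the per-trial difference between two algorithms answers whether their observed difference is significant relative to that variation.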
Reference:
“Parametric and Nonparametric Methods for the Statistical Evaluation of Human ID Algorithms”, J. Ross Beveridge, Kai She, Bruce Draper and Geof H. Givens, Third Workshop on the Empirical Evaluation of Computer Vision Systems, December 2001. (A version of this paper is under review by Image and Vision Computing.)
Notes:
This paper gives additional background on the nonparametric method used in "A Nonparametric Statistical Comparison of Principal Component and Linear Discriminant Subspaces for Face Recognition"

J. Ross Beveridge, Kai She, Bruce Draper and Geof H. Givens.
Abstract
The FERET evaluation compared recognition rates for different semi-automated and automated face recognition algorithms. We extend FERET by considering when differences in recognition rates are statistically distinguishable subject to changes in test imagery. Nearest Neighbor classifiers using principal component and linear discriminant subspaces are compared using different choices of distance metric. Probability distributions for algorithm recognition rates and pairwise differences in recognition rates are determined using a permutation methodology. The principal component subspace with Mahalanobis distance is the best combination; using L2 is second best. Choice of distance measure for the linear discriminant subspace matters little, and performance is always worse than the principal components classifier using either Mahalanobis or L1 distance. We make the source code for the algorithms, scoring procedures and Monte Carlo study available in the hopes others will extend this comparison to newer algorithms.
Reference:
“A Nonparametric Statistical Comparison of Principal Component and Linear Discriminant Subspaces for Face Recognition”, J. Ross Beveridge and Kai She and Bruce Draper and Geof H. Givens, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 535 – 542, December 2001.
Notes:
There is an update to the results section of this paper: Fall 2001 Update to CSU PCA Versus PCA+LDA Comparison. This update used a somewhat improved version of the PCA+LDA algorithm.

Kyungim Baek, Bruce A. Draper, J. Ross Beveridge and Kai She
Abstract
Over the last ten years, face recognition has become a specialized applications area within the field of computer vision. Sophisticated commercial systems have been developed that achieve high recognition rates. Although elaborate, many of these systems include a subspace projection step and a nearest neighbor classifier. The goal of this paper is to rigorously compare two subspace projection techniques within the context of a baseline system on the face recognition task. The first technique is principal component analysis (PCA), a well-known baseline for projection techniques. The second technique is independent component analysis (ICA), a newer method that produces spatially localized and statistically independent basis vectors. Testing on the FERET data set (and using standard partitions), we find that, when a proper distance metric is used, PCA significantly outperforms ICA on a human face recognition task. This is contrary to previously published results.
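The shared baseline, subspace projection followed by nearest-neighbor matching, can be sketched for the PCA case (the ICA comparison would simply substitute a different basis). This is an illustrative sketch, not the experimental code; names and the Euclidean-distance choice are assumptions:

```python
import numpy as np

def pca_subspace(train, k):
    """PCA basis from training images (one image per row):
    returns the mean image and the top-k principal directions."""
    mean = train.mean(axis=0)
    # SVD of the centered data yields the principal components without
    # explicitly forming the covariance matrix.
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, Vt[:k]

def project(x, mean, basis):
    """Coordinates of image x in the subspace."""
    return basis @ (x - mean)

def nearest_neighbor(probe, gallery_coords, gallery_labels):
    """Rank-1 nearest-neighbor match in the subspace (Euclidean distance)."""
    d = np.linalg.norm(gallery_coords - probe, axis=1)
    return gallery_labels[int(np.argmin(d))]
```

The paper's point is that holding this baseline fixed and swapping only the projection (PCA vs. ICA) and the distance metric is what makes the comparison meaningful.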
Reference:
PCA vs. ICA: A Comparison on the FERET Data Set, Kyungim Baek, Bruce A. Draper, J. Ross Beveridge and Kai She, International Conference on Computer Vision, Pattern Recognition and Image Processing, in conjunction with the 6th JCIS, Durham, North Carolina, March 8-14, 2002.
Notes:
There is a more recent extended version of this paper: Recognizing Faces with PCA and ICA.

J. Ross Beveridge
Abstract
This report develops a geometric interpretation of Fisher Linear Discriminants. The report builds upon an understanding of the connection between Principal Component Analysis and Gaussian distributions. It contains a running example showing how Fisher Discriminants are computed and what they look like for an illustrative 3-class problem in 3-dimensional space.
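The computation the report illustrates, Fisher discriminants as eigenvectors of the within-class scatter inverse times the between-class scatter, can be sketched for a small multi-class problem like the report's 3-class, 3-dimensional example. This is a generic sketch, not the Maple worksheet:

```python
import numpy as np

def fisher_discriminants(X, y):
    """Fisher linear discriminants: eigenvectors of inv(Sw) @ Sb.

    X: (n_samples, n_features) data; y: class labels. Returns discriminant
    directions as columns, sorted by decreasing eigenvalue. For c classes
    there are at most c - 1 useful directions.
    """
    X, y = np.asarray(X, float), np.asarray(y)
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)      # within-class scatter
        d = (mc - mean).reshape(-1, 1)
        Sb += len(Xc) * (d @ d.T)          # between-class scatter
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-vals.real)
    return vecs[:, order].real
```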
Reference:
The Geometry of LDA and PCA Classifiers Illustrated with 3D Examples, J. Ross Beveridge, CSU Computer Science Technical Report 01-101, May 2001.
Notes:
This paper was written in Maple and is best viewed in Maple. There is also an HTML version with 3D animations and a straight PDF version.

Wendy S. Yambor, Bruce A. Draper and J. Ross Beveridge
Abstract
This study examines the role of Eigenvector selection and Eigenspace distance measures on PCA-based face recognition systems. In particular, it builds on earlier results from the FERET face recognition evaluation studies, which created a large face database (1,196 subjects) and a baseline face recognition system for comparative evaluations. This study looks at using combinations of traditional distance measures (City-block, Euclidean, Angle, Mahalanobis) in Eigenspace to improve performance in the matching stage of face recognition. A statistically significant improvement is observed for the Mahalanobis distance alone when compared to the other three alone. However, no combination of these measures appears to perform better than Mahalanobis alone. This study also examines questions of how many Eigenvectors to select and according to what ordering criterion. It compares variations in performance due to different distance measures and numbers of Eigenvectors. Ordering Eigenvectors according to a like-image difference value rather than their Eigenvalues is also considered.
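The four distance measures compared above can be sketched as follows. The Mahalanobis variant shown, which whitens each eigenspace coordinate by the corresponding eigenvalue before comparing, is one common definition in this literature and may differ in detail from the paper's; the function names are illustrative:

```python
import numpy as np

def cityblock(u, v):
    """L1 / City-block distance."""
    return float(np.sum(np.abs(u - v)))

def euclidean(u, v):
    """L2 / Euclidean distance."""
    return float(np.sqrt(np.sum((u - v) ** 2)))

def angle(u, v):
    """Negative cosine similarity, so smaller values mean closer."""
    return float(-np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mahalanobis(u, v, eigvals):
    """Whiten each eigenspace coordinate by its eigenvalue, then use L2."""
    w = 1.0 / np.sqrt(eigvals)
    return euclidean(u * w, v * w)
```

Here u and v would be the eigenspace coordinates of two face images, and eigvals the eigenvalues of the retained Eigenvectors.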
Reference:
Analyzing PCA-based Face Recognition Algorithms: Eigenvector Selection and Distance Measures, W. Yambor, B. Draper and R. Beveridge, in Empirical Evaluation Methods in Computer Vision, H. Christensen and J. Phillips (eds.), World Scientific Press, Singapore, 2002.
Notes:
This is a relatively early work in our series on evaluating face recognition algorithms. It provides good background on PCA and covers material worth knowing if one is working with a standard PCA algorithm for face recognition.

Wendy S. Yambor (M. S. Thesis)
Abstract
One method of identifying images is to measure the similarity between images. This is accomplished by using measures such as the L1 norm, L2 norm, covariance, Mahalanobis distance, and correlation. These similarity measures can be calculated on the images in their original space or on the images projected into a new space. I discuss two alternative spaces in which these similarity measures may be calculated, the subspace created by the eigenvectors of the covariance matrix of the training data and the subspace created by the Fisher basis vectors of the data. Variations of these spaces will be discussed as well as the behavior of similarity measures within these spaces. Experiments are presented comparing recognition rates for different similarity measures and spaces using hand labeled imagery from two domains: human face recognition and classifying an image as a cat or a dog.
Reference:
Analysis of PCA-Based and Fisher Discriminant-Based Image Recognition Algorithms, Wendy S. Yambor, M.S. Thesis, Technical Report CS-00-103, Computer Science, July 2000.
Notes:
This paper includes a very basic tutorial-level introduction to PCA and Fisher's Linear Discriminants that should be of help to graduate students new to this material.

Contact: Ross Beveridge
