
Statistical Analysis of Massive Data Streams: Proceedings of a Workshop (2004)

Chapter: II. COMBINING WITHIN CLASS COVARIANCES AND LINEAR APPROXIMATIONS TO INVARIANCES


INCORPORATING INVARIANTS IN MAHALANOBIS DISTANCE BASED CLASSIFIERS: APPLICATION TO FACE RECOGNITION

Andrew M. Fraser, Portland State University and Los Alamos National Laboratory
Nicolas W. Hengartner, Kevin R. Vixie, and Brendt E. Wohlberg, Los Alamos National Laboratory, Los Alamos, NM 87545 USA

Abstract—We present a technique for combining prior knowledge about transformations that should be ignored with a covariance matrix estimated from training data, yielding an improved Mahalanobis distance classifier. Modern classification problems often involve objects represented by high-dimensional vectors or images (for example, sampled speech or human faces). The complex statistical structure of these representations is often difficult to infer from the relatively limited training data sets that are available in practice. We therefore wish to make efficient use of any available a priori information, such as transformations of the representations under which the associated objects are known to retain the same classification (for example, spatial shifts of an image of a handwritten digit do not alter the identity of the digit). These transformations, which are often relatively simple in the space of the underlying objects, are usually non-linear in the space of the object representation, making their inclusion within the framework of a standard statistical classifier difficult. Motivated by prior work of Simard et al., we have constructed a new classifier which combines statistical information from training data with linear approximations to known invariance transformations. When tested on a face recognition task, its performance was found to exceed, by a significant margin, that of the best algorithm in a reference software distribution.

I. INTRODUCTION

The task of identifying objects and features from image data is central to many active research fields. In this paper we address the inherent problem that a single object may give rise to many possible images, depending on factors such as the lighting conditions, the pose of the object, and its location and orientation relative to the camera. Classification should be invariant with respect to changes in such parameters, but recent empirical studies [1] have shown that the variation in the images produced from these sources for a single object is often of the same order of magnitude as the variation between different objects. Inspired by the work of Simard et al. [2], [3], we think of each object as generating a low-dimensional manifold in image space through a group of transformations corresponding to changes in position, orientation, lighting, etc. If the functional form of the transformation group is known, we can in principle calculate the entire manifold associated with a given object from a single image of it. Classification based on the entire manifold, instead of a single point, leads to procedures that are invariant under that group of transformations. The procedures we describe here approximate such a classification of equivalence classes of images. They are quite general, and we expect them to be useful in the many contexts outside of face recognition and image processing where classification should be invariant to known transformations. For example, they provide a framework for classifying near-field sonar signals by incorporating Doppler effects in an invariant manner.
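The scale of the within-object variation described above is easy to reproduce numerically. The sketch below is our own illustration, not code or data from the paper: the one-dimensional "images" and all names in it are hypothetical. It shows how two views of the same object, related by a small shift, can be nearly as far apart in pixel space as views of two different objects, even though all shifted views lie on a single one-parameter manifold.

```python
# Illustration: shifted copies of one signal can be as far apart in
# pixel space as two unrelated signals, although they lie on a single
# low-dimensional transformation manifold.
import numpy as np

x = np.linspace(0.0, 1.0, 256)

def bump(center):
    """A smooth synthetic one-dimensional 'image'."""
    return np.exp(-((x - center) ** 2) / 0.002)

face_a = bump(0.40)           # object A
face_a_shifted = bump(0.45)   # object A again, slightly translated
face_b = bump(0.60)           # a different object B

print(np.linalg.norm(face_a - face_a_shifted))  # within-object variation
print(np.linalg.norm(face_a - face_b))          # between-object variation
# The two Euclidean distances come out the same order of magnitude,
# which is the effect reported empirically in [1].
```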
Although the procedures are general, in the remainder of the paper we will use the terms faces or objects and image classification for concreteness. Of course, there are difficulties. Since the manifolds are highly nonlinear, finding the manifold to which a new point belongs is computationally expensive. For noisy data, the computational problem is further compounded by the uncertainty in the assigned manifold. To address these problems, we use tangents to the manifolds at selected points in image space. Using first and second derivatives of the transformations, our procedures provide substantial improvements over current image classification methods.

II. COMBINING WITHIN CLASS COVARIANCES AND LINEAR APPROXIMATIONS TO INVARIANCES

Here we outline our approach; for a more detailed development, see [4]. We start with the standard Mahalanobis distance classifier

    k̂(Y) = argmin_k (Y − µ_k)^T C_w^{−1} (Y − µ_k),

where C_w is the within class covariance for all of the classes, µ_k is the mean for class k, and Y is the image to be classified. We incorporate the known invariances while retaining this classifier structure by augmenting the within class covariance C_w to obtain a class-specific covariance C_k for each class k. We design the augmentations to allow excursions in directions tangent to the manifold generated by the transformations to which the classifier should be invariant. We have sketched a geometrical view of our approach in Fig. 1.

Denote the transformations with respect to which invariance is desired by τ(Y, θ), where Y and θ are the image and transformation parameters respectively. The second order Taylor series for the transformation is

    τ(Y, θ) = Y + Vθ + (1/2) θ^T H θ + R(Y, θ),

where R is the remainder, V is the dim(Y) × dim(Θ) matrix of first derivatives (the tangent), and H is the dim(Θ) × dim(Y) × dim(Θ) tensor of second derivatives (the curvature), the quadratic term being understood componentwise: (θ^T H θ)_d = θ^T H_d θ for each image component d.
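The tangent V and curvature H in this expansion can be checked numerically. The following sketch is ours, not the authors' code; it uses a hypothetical one-parameter sub-pixel shift as τ (so dim(Θ) = 1), estimates V and H by central differences, and compares the truncated Taylor series against the exact transformation.

```python
# Numerical check of the second order Taylor series of tau(Y, theta):
# estimate the tangent V and curvature H by central differences, then
# compare first and second order predictions with the exact transform.
import numpy as np

def tau(Y, theta):
    """Shift the signal Y by theta samples (linear interpolation)."""
    n = np.arange(Y.size)
    return np.interp(n - theta, n, Y)

x = np.linspace(0.0, 1.0, 128)
Y = np.exp(-((x - 0.5) ** 2) / 0.01)      # a smooth synthetic "image"

eps = 0.25                                # finite-difference step
V = (tau(Y, eps) - tau(Y, -eps)) / (2 * eps)         # first derivative
H = (tau(Y, eps) - 2 * Y + tau(Y, -eps)) / eps ** 2  # second derivative

theta = 2.0
first = Y + theta * V                     # first order approximation
second = first + 0.5 * theta ** 2 * H     # add the curvature term
print("first order error: ", np.linalg.norm(tau(Y, theta) - first))
print("second order error:", np.linalg.norm(tau(Y, theta) - second))
```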

Fig. 1. A geometrical view of classification with augmented covariance matrices: the dots represent the centers µ_k about which approximations are made, the curves represent the true invariant manifolds, the straight lines represent tangents to the manifolds, and the ellipses represent the pooled within class covariance C_w estimated from the data.

A new observation Y is assigned to a class using

    k̂(Y) = argmin_k (Y − µ_k)^T C_k^{−1} (Y − µ_k).

The novel aspect is our calculation of C_θ,k = α H̄_k^{−1}, where α is a parameter corresponding to a Lagrange multiplier, and H̄_k is a function of the tangent and curvature of the manifold (obtained from the first and second derivatives respectively), with directions weighted according to relevance as estimated by diagonalizing C_w. We define

    C_k ≡ C_w + V_k C_θ,k V_k^T,   (1)

where C_θ,k is a dim(Θ) × dim(Θ) matrix. We require that C_θ,k be non-negative definite. Consequently V_k C_θ,k V_k^T is also non-negative definite. When C_k^{−1} is used as a metric, the effect of the term V_k C_θ,k V_k^T is to discount displacement components in the subspace spanned by V_k, and the degree of the discount is controlled by C_θ,k.

We developed [4] our treatment of C_θ,k by thinking of θ as having a Gaussian distribution and calculating expected values with respect to its distribution. Here we present some of that treatment, minimizing the probabilistic interpretation. Roughly, C_θ,k characterizes the costs of excursions of θ. We choose C_θ,k to balance the conflicting goals:

Big: We want to allow θ to be large so that we can classify images with large displacements in the invariant directions.

Small: We want θ to be small so that the truncated Taylor series will be a good approximation.

We search for a resolution of these conflicting goals in terms of a norm on θ and the covariance C_θ,k. For the remainder of this section let us consider a single individual k and drop the extra subscript, i.e., we will denote the covariance of θ for this individual by C_θ.

If, for a particular image component d, the Hessian H_d has both a positive eigenvalue λ_1 and a negative eigenvalue λ_2, then the quadratic term θ^T H_d θ is zero along a direction e_0 which is a linear combination of the corresponding eigenvectors, i.e.

    e_0 ∝ e_1 + sqrt(λ_1/|λ_2|) e_2.

We suspect that higher order terms will contribute significant errors for excursions along such directions, so we eliminate the canceling effect by replacing H_d with its positive square root |H_d| = (H_d^2)^{1/2}, i.e. if an eigenvalue λ of H_d is negative, we replace it with −λ. This suggests the following mean root square norm:

    ‖θ‖_mrs^2 ≡ (1/dim(Y)) Σ_d θ^T |H_d| θ.   (2)

Consider the following objection to the norm in Eqn. (2). If there is an image component d which is unimportant for recognition and for which H_d is large, e.g. a sharp boundary in the background, then requiring ‖θ‖_mrs to be small might prevent parameter excursions that would only disrupt the background. To address this objection, we use the eigenvalues of the pooled within class covariance matrix C_w to quantify the importance of the components. If there is a large within class variance in the direction of component d, we will not curtail particular parameter excursions just because they cause errors in component d.

We develop our formula for C_θ in terms of the eigendecomposition C_w = Σ_d σ_d^2 w_d w_d^T as follows. Break the dim(Θ) × dim(Y) × dim(Θ) tensor H into components, one per eigendirection w_d of C_w:

    H̃_d ≡ Σ_i (w_d)_i H_i.   (3)

Then for each component, define the dim(Θ) × dim(Θ) matrix

    G_d ≡ |H̃_d| / σ_d,   (4)

and take the average to get

    H̄ ≡ (1/dim(Y)) Σ_d G_d.   (5)

Define the norm ‖θ‖_H̄^2 ≡ θ^T H̄ θ.
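Read together, Eqns. (1)-(5) suggest a concrete computation, sketched below in numpy. This is our own reading rather than the authors' code: the function names, the α and ε parameters, and in particular the exact 1/σ_d weighting in Eqn. (4) are assumptions on our part; only the overall shape (project the curvature tensor onto the eigendirections of C_w, flip negative eigenvalues, average, invert to obtain C_θ, and augment C_w as in Eqn. (1)) comes from the text.

```python
# Sketch of the augmented-covariance construction of Eqns. (1)-(5)
# and the resulting classifier; names and weighting are assumptions.
import numpy as np

def positive_sqrt(M):
    """|M| = (M^2)^(1/2): flip the sign of any negative eigenvalues."""
    lam, E = np.linalg.eigh(M)
    return (E * np.abs(lam)) @ E.T

def augmented_covariance(C_w, V, H, alpha=1.0, eps=1e-8):
    """C_w: (D, D) pooled within-class covariance.
    V:   (D, T) tangent of the invariance manifold at the class mean.
    H:   (D, T, T) curvature tensor, one T x T Hessian per component.
    alpha plays the role of the Lagrange-multiplier parameter; eps is
    a numerical regularizer. Both defaults are assumptions."""
    sigma2, W = np.linalg.eigh(C_w)      # C_w = sum_d sigma2_d w_d w_d^T
    T = V.shape[1]
    H_bar = np.zeros((T, T))
    for d in range(W.shape[1]):
        H_tilde = np.einsum('i,ijk->jk', W[:, d], H)              # Eqn. (3)
        H_bar += positive_sqrt(H_tilde) / np.sqrt(sigma2[d] + eps)  # Eqn. (4)
    H_bar /= W.shape[1]                                           # Eqn. (5)
    C_theta = alpha * np.linalg.inv(H_bar + eps * np.eye(T))  # alpha * H_bar^{-1}
    return C_w + V @ C_theta @ V.T                            # Eqn. (1)

def classify(Y, means, covs):
    """k_hat(Y) = argmin_k (Y - mu_k)^T C_k^{-1} (Y - mu_k)."""
    dists = [(Y - mu) @ np.linalg.solve(C, Y - mu)
             for mu, C in zip(means, covs)]
    return int(np.argmin(dists))
```

Replacing C_w with the per-class C_k inside the argmin is the entire change to the classifier; the rest of the Mahalanobis framework is untouched, which is the structure-preserving property the section emphasizes.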
