3
Lessons from Other Large-Scale Systems
The many large-scale biometric systems in use today are deployed
in a broad range of systems and social contexts. The successes and failures
of these biometric systems offer insights into what can be learned
from careful consideration of the larger system context, as well as purely
technological or component-level aspects, during planning. Common
characteristics of successful deployments include good project management
and definition of goals, alignment of biometric capabilities with the
underlying need and operational environment, and a thorough threat and
risk analysis of the system under consideration. Common contributors to
failures include the following:
• Inappropriate technology choices,
• Lack of sensitivity to user perceptions and requirements,
• Presumption of a problem that does not exist,
• Inadequate surrounding support processes and infrastructure,
• Inappropriate application of biometrics where other technologies
would better solve the problem,
• Lack of a viable business case, and
• Poor understanding of population issues, such as variability among
those to be authenticated or identified.
Many of these factors apply in any technology deployment,
biometrics-related or not.
While much can be learned from studying biometrics systems, it
seems appropriate, given their scale and scope, to consider whether the
biometrics community can learn lessons from large-scale systems that
have been deployed in other domains. This chapter explores some of
the technical/engineering and societal lessons learned from large-scale
systems in manufacturing and medical screening and diagnosis. In each
case, the discussion points out useful analogies to biometric systems and
applications.
MANUFACTURING SYSTEMS
Manufacturing systems convert initial materials into finished products
that must meet quality specifications. Each step in the conversion may
consist of a complex process sensitive to multiple characteristics of the
input materials and processing conditions. Each step also represents an
economic investment; modifications to the process that can achieve equal
or higher quality at lower cost are every company’s goal. Production-line
systems have been studied systematically since before World War II from
the perspectives of industrial engineering, statistics, experimental design,
operations research, and quality control. Insights gained from the study
of such systems have been generalized to better understand and improve
the performance of systems for product development and other industrial
processes and to facilitate improvements in corporate management.
A simple example, used in a 2005 briefing to the study committee by
Lynne Hare of Kraft, Incorporated, is the development of a new sensor
for a manufacturing production line. The process begins with identifying
the business need for the sensor and proceeds through its implementation
and then deployment in the production line. The stages include explicit
translation of the business need into the scientific requirements for the
sensor, fabrication of a prototype sensor, preliminary (static) testing,
formal static and dynamic testing, pilot installation and testing, and
production line implementation and validation. The process never ends, because
revalidation is scheduled at periodic intervals. At each stage of testing
and data collection, the information obtained may send the development
process back to an earlier stage to correct any observed deficiencies and
improve robustness of the sensor to varying conditions.
This example can be interpreted directly or as an analogy. Directly, it
gives a model for developing and implementing devices required by any
biometric system to sense biometric traits, for example, fingerprint
scanners, iris scanners, and audio recorders. There is also an analogy between
development and validation of a sensor and the development and
implementation of a biometric system. In this analogy, the multiple levels of
testing—preliminary static, formal static and dynamic, and production
line testing—are counterparts to technology, scenario, and operational
evaluations of a biometric modality. Motivations for these three levels of
testing in the sensor development environment can be informative for the
development and testing of an entire biometric system.
Additionally, a biometric system may be considered as a production
line, the inputs as individuals presenting for recognition, and the
output as a series of decisions that will achieve a high quality, reflected
in low values of the false match rate and false nonmatch rate, and in a
ratio appropriate for the system’s intended purpose(s). When a biometric
system is looked at in this way, it can be seen that the methods of
industrial engineering and statistical quality control can be applied to achieve
system quality.
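The two error rates named above can be estimated from labeled comparison scores. The following Python sketch is illustrative only and is not taken from the briefing or the chapter: the function name `error_rates`, the score lists, and the thresholds are hypothetical, and it assumes that a higher score means greater similarity.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_match_rate, false_nonmatch_rate) at a given threshold.

    Assumes higher scores indicate greater similarity and that a
    comparison is declared a match when score >= threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # Impostor comparisons wrongly accepted as matches
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine comparisons wrongly rejected as nonmatches
    false_nonmatches = sum(1 for s in genuine_scores if s < threshold)
    return (false_matches / len(impostor_scores),
            false_nonmatches / len(genuine_scores))


# Made-up scores, for illustration only
genuine = [0.91, 0.85, 0.40, 0.77, 0.95]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]

for t in (0.3, 0.5, 0.7):
    fmr, fnmr = error_rates(genuine, impostor, t)
    print(f"threshold={t:.1f}  FMR={fmr:.2f}  FNMR={fnmr:.2f}")
```

Sweeping the threshold shows how the ratio between the two rates can be tuned to the system's intended purpose: raising it trades false matches for false nonmatches.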
At least three fundamental insights into managing industrial processes
are also relevant to biometrics. The discussion below paraphrases
selected core concepts from the work of Deming, Shewhart, Box, and their
many successors.1
One insight is that careful articulation of requirements, preferably
in measurable terms and derived from an end product or process, is
exceptionally important to the successful development and
implementation of component parts. In the case of a production line, for example, a
requirement might be for the sensor to respond reliably and repeatedly
only to stimuli in the desired range and measure stimuli accurately, under
conditions in the production environment. The range of stimuli, sensor
sensitivity and resolution, and resistance to environmental disturbances
must be accurately specified during the design process in order for the
sensor to properly identify defective units, which is its ultimate purpose.
By analogy, biometric system design should be driven by clear objectives
for the recognition task in the context of the broader application rather
than merely by the existence of an attractive technology.
1 George E.P. Box and Owen L. Davies, The Design and Analysis of Industrial Experiments,
Edinburgh: Oliver and Boyd (1954); Walter A. Shewhart, Economic Control of Quality of
Manufactured Product, Milwaukee, Wisc.: Quality Press (1980); W. Edwards Deming, Out of the
Crisis, Cambridge, Mass.: MIT Press (1986).
A second insight is that a scientific approach is invaluable to understanding
systems, particularly the interrelatedness of system components.
The hallmark of the scientific approach is exploration through both theory
and data. The performance of complex systems can often be improved by
identifying and correcting bottlenecks or other localized problems whose
negative effects may not have been fully perceived and articulated. Such
problems and other aspects of interrelatedness and individual component
performance can be identified by planned data collection guided
by careful theorizing about the system. Such data may be collected by
observation, as exemplified by the use of statistical control charts in a
production line, or by direct experimentation on the system itself. In such
experimentation, system inputs or conditions are systematically modified
to learn their effects on the functioning of the system and the quality of its
output. Such experiments can be carried out from time to time or can be
built into the system itself. Evolutionary operation (EVOP), an example of
the latter, refers to the regular alteration of baseline system parameters by
small amounts during production runs. The changes made are too small
to disrupt system operation, and the system is run with these changes in
place just long enough to assess the effects on product quality and other
aspects of performance. Changes that most improve performance may
then be retained, and the process continued from new baseline values of
parameters. Iterations of such experimentation gradually nudge the
system toward optimal parameter values by exploring nonlinear regions of
the “response surface” that relates performance to different combinations
of parameter values.
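The EVOP loop described above can be sketched in a few lines of Python. This is a deliberately simplified, one-parameter illustration under stated assumptions, not the procedure used in any particular plant: classical EVOP typically varies two or more parameters in a small factorial pattern, and the function `evop`, the quality function, and all numeric values here are hypothetical.

```python
import random


def evop(measure_quality, baseline, step, cycles, runs_per_setting=5):
    """Nudge one process parameter toward better measured quality.

    Each cycle runs the process at baseline - step, baseline, and
    baseline + step, averages the noisy quality readings at each
    setting, and adopts the best setting as the new baseline.
    """
    for _ in range(cycles):
        candidates = [baseline - step, baseline, baseline + step]
        mean_quality = {}
        for setting in candidates:
            readings = [measure_quality(setting) for _ in range(runs_per_setting)]
            mean_quality[setting] = sum(readings) / len(readings)
        # Retain the change that most improved performance
        baseline = max(candidates, key=mean_quality.get)
    return baseline


# Hypothetical process: quality peaks at a setting of 7.0, plus noise.
rng = random.Random(0)


def noisy_quality(setting):
    return -(setting - 7.0) ** 2 + rng.gauss(0.0, 0.05)


final_setting = evop(noisy_quality, baseline=5.0, step=0.25, cycles=20)
print(f"final setting: {final_setting:.2f}")
```

Averaging several runs at each setting is what keeps the procedure from chasing short-run noise; the step is kept small so that no single cycle disrupts production.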
A third insight, stressed in statistical quality control and one of the
four pillars of Deming’s “system of profound knowledge,”2 is the importance
of understanding background variation in system performance and
identifying separable contributors to it. The first and foremost meaning of
“understanding” in this context is recognition that systems exhibit natural
variability due to random influences, and that inordinate reaction to
such short-run variability is often wasteful and of little benefit. Although
dramatic but relatively brief slumps and streaks are a major source of
discussion by sports analysts and some stock traders, basing major decisions
on such brief events rarely leads to prosperity for a baseball team or an
investor. A deeper level of understanding develops from the awareness
that random variation in output typically comes from multiple sources
that persist even as their momentary influences fluctuate. In technical
parlance, these sources and the measures of their strength are often referred
to as components of variation (or variance), and in industry parlance as
“the voice of the process.” In a manufacturing process they might include
variability in raw material batches, calibration drift of instruments
guiding system processes, problems with machinery maintenance, and human
error. Reducing the variation from such common sources can improve
product quality over the long run. Some variation arises, however, from
“special” sources that would not be expected to recur and against which
changes to the system can do little to protect. Thus, identification and
reduction of the largest components of variance from common sources is
generally accepted as critical to quality improvement in industrial
production systems. Some version of Deming-Shewhart plan-do-study-act
(PDSA) cycles is generally used to bring a scientific approach to bear on
this task.
2 See W.E. Deming, The New Economics for Industry, Government, Education, Cambridge,
Mass.: MIT Press (2000).
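One common way to avoid overreacting to short-run variability is a Shewhart-style control chart. The Python sketch below is a minimal illustration under stated assumptions: it builds an individuals chart whose sigma is estimated from the average moving range divided by 1.128 (the standard constant for ranges of two observations), and the function names and the daily false nonmatch figures are made up for the example.

```python
from statistics import mean


def control_limits(baseline):
    """Center line and 3-sigma limits for an individuals control chart.

    Sigma is estimated from the average moving range of consecutive
    points divided by 1.128 (the d2 constant for subgroups of size 2).
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(observations, lower, upper):
    """Indices of observations that fall outside the control limits."""
    return [i for i, x in enumerate(observations) if x < lower or x > upper]


# Made-up daily false nonmatch rates (percent) from a stable period
baseline_fnmr = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0, 2.0]
lower, center, upper = control_limits(baseline_fnmr)

# Made-up later observations to be judged against those limits
new_days = [2.1, 1.9, 2.2, 3.4, 2.0]
flagged = out_of_control(new_days, lower, upper)
print(f"limits: [{lower:.2f}, {upper:.2f}]  flagged days: {flagged}")
```

Points inside the limits are treated as background variation and left alone; only a point outside them prompts a search for a specific cause.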
The insights sketched above apply to biometric recognition systems
no less than to any other systems. But since they provide an approach
rather than a prescription for learning about and improving systems, their
implications will vary greatly according to context. Moreover, for many
operational biometric systems the “ground truth”—that is, the “correct”
answer in terms of system objectives—is indeterminate for many
transactions. The approach described above is invaluable in developing such
systems. The emphasis on examination of process variables in an
operational mode is potentially very helpful. Its potential benefits are even
greater if challenge experiments can be superimposed on the operational
system. There is substantial precedent for such challenge experiments
in other contexts, including evaluations of Internal Revenue Service tax
assistance and Transportation Security Administration airport passenger
and baggage screening.
So, independent of the particular biometric modality and its application,
the following lessons can be drawn from the experience and
methodologies that have evolved in industrial production:
• System objectives must be clarified at the outset if the system is to
be designed efficiently and if the ability to evaluate system performance
is to be preserved. In particular, the often-interrelated but distinct goals
of improving convenience, controlling access, detecting threats, lowering
costs, tracking and managing employees, and deterrence must be
distinguished and prioritized in system planning.
• The operational environment, including the range within which
environmental characteristics and characteristics of the populations
presenting to the system will vary, should be anticipated as much as possible
in systems development. This includes consideration of operation under
routine conditions; under unusual conditions unconnected to any specific
threat; under realistic threat scenarios for attempts to defeat the system
at the individual level; and under realistic threat scenarios for
penetrating, degrading performance, or shutting down operations at the system
level.
• To the extent that systems are mission-critical, large-scale, and
addressed to national security, controlled observation at the operational
level, including ongoing challenge experimentation, is essential. In
routine operation, many errors are likely to occur in which an individual
making a true recognition claim is at first erroneously restricted but the
mistake is later discovered and corrected, at which point these errors
become visible and available for analysis. However, when an individual
gains unauthorized access because, for example, a false claim of identity
goes undetected, the error may remain undiscovered for a long time.
Challenge experiments, which observe and compile system responses to
inputs representing (1) typical experience, (2) variations in conditions, (3)
difficult presentations requiring adjudication or systems adaptation, and
(4) attack modes, are the best way to identify the potential for such errors
and ways to prevent them. Erroneous rejections of true recognition claims
and erroneous acceptances of false claims should be documented and
subject to rigorous fault analysis, just as would take place in the case of
an investigation into a transportation crash. Such analysis should include
comparison with a sample of correct recognitions used as controls in order
to distinguish factors predisposing to errors from those predisposing to
correct decisions.
• Studies of system behavior, including those attempting to discover
and reduce the largest contributors to system error and the most variable
components of intermediate products that contribute to recognition
decisions, may be as revealing and helpful for biometric systems as they have
been for systems involving other repeatable processes.
MEDICAL SCREENING SYSTEMS
Medical screening systems collect diagnostic information, generally
in a staged sequence, in an attempt to locate individuals with an
undetected disease that can be more effectively treated early in its course. The
input to such a system is data from a population of individuals, some
with disease but most without. Results of the stages generally are
classified as positive or negative, and only individuals who test positive at
each stage are labeled by the system as having the disease. Consider the
following simplified (and not necessarily medically realistic) view of a
prostate cancer screening system. Screening is initiated by a digital rectal
examination. The patient with a normal exam is not screened further.
Abnormal palpation results, however, are followed by a prostate-specific
antigen (PSA) test. Patients with a PSA level below a certain point are not
screened further. When the PSA level is at or above this point, the prostate
is biopsied. When the pathologist finds the biopsy to be negative for
cancer, the patient is so informed. When the pathologist finds it to be positive
for cancer, the diagnosis is considered to be established, and the patient is
referred for consideration of treatment alternatives. The alternatives, and
indeed the importance of any treatment, may vary depending on age of
the patient, stage of the cancer, and rate of progression, which may often
be determined by a watch-and-wait period.3
3 The point here is that the disposition of a medical screening result may vary as a function
of patient factors, and what happens after that is, nonetheless, appropriately viewed as a
system output. Similarly, the disposition of a biometric recognition (or lack of recognition)
might vary according to situational and subject factors; because the consequences of the
system’s results affect system output, they are important in evaluating a biometric system.
Each component test of this progression will detect some prostate
cancers and miss others, its false negatives. Some men without prostate
cancer, perhaps with another disease such as benign prostatic hyperplasia
(BPH), will be classified positive at one or more steps. The proportion of
cancers detected is known technically as “sensitivity” and the proportion
of prostate cancer-free individuals classified as negative is known as the
“specificity.” The complementary proportions—that is, the proportion of
prostate cancers missed and the proportion of men without cancer who
are identified as positives—are called the false negative and false positive
rates. These are analogous to the false match and false nonmatch rates
in a biometric recognition application. Note that each component of the
screening system will have its own values of these numerical
characteristics describing the performance of that component and that another set
of values characterizes the performance of the screening system overall.
In practice, the true values are generally unknown, but hypothesized or
estimated values coupled with well-established mathematical
relationships can provide useful guidance for screening policies.
Screening systems have been extensively studied in a medical context.
Their general characteristics are well understood, but their specific
performance levels may be unclear. The following lessons are among those
that have been learned:
• Individual components in general usage are rarely as sensitive and
specific as the components when they were under development because
tests are usually developed and evaluated by researchers exceptionally
skilled in their use on subjects whose states of health or disease are well
known.
• The value of each component to the screening system is determined
not just by its individual properties but by the information it contributes
in addition to the contribution of the other components. For instance,
confirming the result of a test by repetition is less valuable than confirming it
by a different test that screens for a different disease marker.
• Limitations of individual components can vitiate the effectiveness
of other components. For instance, in the system described above, a
pathologist who cannot detect true prostate cancer renders the accuracy
of earlier components in the sequence virtually irrelevant.
• Effectiveness of a system is highly population-specific, even when
the system’s overall sensitivity and specificity are exceptionally high. This
is easily seen by considering a screening system implemented in a
population from which the disease in question is absent. No matter how high the
sensitivities and specificities of the system components and of the system
as a whole, all positives will be false positives and the screening system
will provide no health benefit.
• In view of the preceding item, the performance of a system is best
represented by its population-specific predictive values—that is, the
proportion of screen-positive individuals who truly have the disease (positive
predictive value) and the proportion of screen-negative individuals who truly
do not have the disease (negative predictive value). Alternatively, the ratios of
screen-positives with the disease to screen-positives without the disease and the
ratios of screen-negatives without the disease to screen-negatives with
it, may also be used to represent performance. These measures combine
information on the accuracy of the testing (sensitivity and specificity) with
information on the composition of the population, since both are critical
to determining whether screening is informative.
• The ability of a system to detect disease and the importance of
detection may vary by characteristics of the disease and the patients in
whom the disease occurs. For instance, screening is more likely to detect
slowly progressing (indolent) than rapidly progressing (aggressive)
disease, because the symptom-free period is longer for the former. But
sensitivity is less important in detecting indolent disease, because subsequent
rounds of screening may detect it before it has progressed much further.
In the case of prostate cancer, elderly men with the indolent form may be
more likely to die from something else before the cancer kills them.
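The relationships among component accuracy, system accuracy, and population composition in the lessons above can be made concrete with a short calculation. The Python sketch below is illustrative only: it assumes that stage results are conditionally independent given true disease status (which real tests often are not), the function names are hypothetical, and the stage values are invented rather than clinical estimates.

```python
from math import prod


def serial_system(sensitivities, specificities):
    """Overall (sensitivity, specificity) of a staged screen.

    A person is labeled positive only if every stage is positive.
    Assumes stage results are conditionally independent given the
    person's true disease status.
    """
    system_sens = prod(sensitivities)
    system_spec = 1 - prod(1 - s for s in specificities)
    return system_sens, system_spec


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) by Bayes' rule; None where undefined."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    ppv = tp / (tp + fp) if tp + fp > 0 else None
    npv = tn / (tn + fn) if tn + fn > 0 else None
    return ppv, npv


# Hypothetical three-stage screen (values are illustrative only)
sens, spec = serial_system([0.90, 0.85, 0.95], [0.80, 0.75, 0.99])

for prevalence in (0.0, 0.01, 0.10):
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With the disease absent (prevalence 0), the positive predictive value is zero however accurate the stages are, which is the point made in the list above; staging raises overall specificity at the cost of overall sensitivity.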
These observations are general, and the analogy to biometric systems
is imperfect. They do, however, have some implications for biometric
systems:
• Laboratory and scenario testing are apt to underestimate field error
rates of biometric applications.
• Combinations of independent or minimally dependent characteristics
and processes generally incorporate more information, and thus offer
higher potential for improved performance, than combinations of more
correlated components. Hence, in biometrics systems design, independent
features, components of multimodal biometrics, and components of
decision-making scores are preferable to combinations of correlated
alternatives of comparable cost.
• A poor adjudication process, or an ineffective backup process for
dealing with failures-to-acquire (see Chapter 2) in a biometric system, may
negate the benefits of good error rates in the basic biometric technology.
• Biometric technologies must be calibrated to the environment and
population in which they will be implemented. For instance, one might
expect different operational characteristics for biometric border-control
systems using identical technology on the Mexican border with Texas
and the Canadian border with New York, in part because the frequency
of attempted illegal border crossings in these places is so different.
• System performance characteristics may vary by major population
subgroups and by the types of challenges presented to the system.
Extrapolation of technological or system performance characteristics
across settings or challenges—for example, (1) from laptop access control
to auto theft control to border control or (2) from illegal immigrants to
narcotics smugglers to terrorists—is unlikely to be reliable.
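The preference for independent over correlated components, noted in the implications above, can be illustrated with a simple variance calculation. The Python sketch below rests on strong assumptions that are not drawn from the chapter: scores are fused by plain averaging, every component has the same noise standard deviation, and every pair of components shares the same correlation. The function name `fused_noise_variance` is hypothetical.

```python
def fused_noise_variance(sigma, rho, n=2):
    """Noise variance of the average of n equally noisy scores.

    Assumes each score has noise standard deviation sigma and every
    pair of scores has correlation rho, giving
    sigma^2 * (1 + (n - 1) * rho) / n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not (-1.0 / (n - 1) <= rho <= 1.0):
        raise ValueError("rho outside the valid range for n components")
    return sigma ** 2 * (1 + (n - 1) * rho) / n


# Two components, unit noise, increasing correlation between them
for rho in (0.0, 0.5, 0.9, 1.0):
    variance = fused_noise_variance(sigma=1.0, rho=rho)
    print(f"rho={rho:.1f}  fused noise variance={variance:.2f}")
```

Under these assumptions, two independent components halve the noise variance, while two perfectly correlated components give no reduction at all, so the second component adds cost without adding information.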