B
Watch-List Operational Performance and List Size: A First-Cut Analysis

Let p be the probability that someone presenting to a watch-list system has been previously enrolled, and F(·) be a prior distribution on this probability. F(·) may be discrete and even a point prior with all mass at one possible value π of p, a continuous distribution on the interval [0,1] such as a Beta distribution, or any other probability distribution function on a probability space on [0,1]. Two types of results may be distinguished here: matching an enrolled presenter to the correct prior enrollment sample or, less restrictively, recognizing that the presenter has previously enrolled, although perhaps by matching to the wrong enrollee. The latter is pertinent to watch-list performance because such a result would serve the intended function of denying privileges, even if for the wrong reason. We distinguish here between these two possibilities by referring to the first as identification and to the second as watch-list recognition.

Addressing the identification problem first, one is trying to match a person specifically with his or her enrollment record and is in error if the correct match is missed. The confidence we should place in a claimed match—that is, its “predictive value”—is the probability that a claimed match is correct:



\[
\begin{aligned}
\mathrm{PPV}(p) &= P(\text{true match with enrollment sample} \mid \text{claimed match with enrollment sample}) \\
&= \frac{P(\text{true match with presenter's enrollment sample})}{P(\text{claimed match with anyone's enrollment sample})} \\
&= \frac{p \times P(\text{true match} \mid \text{enrolled})}{p \times P(\text{any match} \mid \text{enrolled}) + (1 - p) \times P(\text{any match} \mid \text{unenrolled})}.
\end{aligned}
\]
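The identification PPV formula above can be evaluated directly once the three conditional probabilities are known. The following is a minimal sketch; the function name `ppv_identification`, its parameter names, and the numeric inputs are illustrative assumptions, not values from the text.

```python
def ppv_identification(p, true_match_given_enrolled,
                       any_match_given_enrolled, any_match_given_unenrolled):
    """Evaluate PPV(p) for identification.

    PPV(p) = p * P(true match | enrolled)
             / (p * P(any match | enrolled) + (1 - p) * P(any match | unenrolled))
    """
    denominator = (p * any_match_given_enrolled
                   + (1 - p) * any_match_given_unenrolled)
    if denominator == 0:
        raise ValueError("no claimed match is possible with these inputs")
    return p * true_match_given_enrolled / denominator


# Hypothetical inputs: 10% of presenters are enrolled, and the three
# conditional match probabilities are made up for illustration.
example = ppv_identification(0.1, 0.90, 0.92, 0.05)
```

With these made-up inputs the numerator is 0.09 and the denominator is 0.137, so `example` is about 0.66: roughly one claimed match in three would be wrong even though the system rarely errs on any single presenter.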

Consider the effect on this predictive value of enrolling one additional person in a watch list of length n, assuming the pattern of presentations to the list is fixed at proportion p of previous enrollees. In addition to comparisons with the slightly shorter previous list, the presenter is now compared to the new enrollee. This cannot increase and may decrease P(true match|enrolled), because each comparison offers an additional opportunity for an enrolled presenter to be erroneously matched with the wrong enrollee by matching more closely with someone else’s stored data than with his or her own. Similarly, both denominator terms cannot decrease and may increase, because the new comparison offers any presenter an opportunity of falsely matching with an extra enrollee. Hence the ratio, PPV(p), cannot increase and may decrease with watch-list length.

Using the subscript to indicate watch-list length, $\mathrm{PPV}_{n+1}(p) \le \mathrm{PPV}_n(p)$ for any specific p. Thus, the posterior means for the two list sizes over the distribution F(p) must hold the same relationship:

\[
E(\mathrm{PPV}_{n+1}(p)) = \int_0^1 \mathrm{PPV}_{n+1}(p)\, dF(p) \le \int_0^1 \mathrm{PPV}_n(p)\, dF(p) = E(\mathrm{PPV}_n(p)).
\]

These expectations are the marginal probabilities that a claimed match is correct for the different list sizes, so increasing list length by one enrollee cannot increase and may be expected to decrease the confidence warranted by a watch-list identification. Iterating this point shows that lengthening the list by any amount must have the same implications. However, this argument depends on decoupling the presentation distribution F(p) from enrollee characteristics. In a finite population setting, where increasing enrollment increases p, a much more complicated argument might be required, with the outcome dependent on the specifics of functional relationships. A general argument that would work in such a setting is not obvious.

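The monotonicity argument above can be checked numerically under one concrete model. The sketch below is illustrative only and rests on assumptions that are not in the text: comparisons are independent with uniform true- and false-match probabilities `m1` and `m0`, an enrolled presenter counts as correctly identified only when the presenter's own sample matches and no other enrollee's sample does, and F is a three-point discrete prior with made-up weights.

```python
def ppv_identification(p, n, m1, m0):
    """Identification PPV for list length n under the assumed simple model."""
    no_false_match_others = (1 - m0) ** (n - 1)
    # Assumption: correct identification needs own-sample match, no other match.
    true_match_enrolled = m1 * no_false_match_others
    any_match_enrolled = 1 - (1 - m1) * no_false_match_others
    any_match_unenrolled = 1 - (1 - m0) ** n
    return (p * true_match_enrolled
            / (p * any_match_enrolled + (1 - p) * any_match_unenrolled))


def expected_ppv(prior, n, m1, m0):
    """E(PPV_n(p)) over a discrete prior given as (p, weight) pairs."""
    return sum(weight * ppv_identification(p, n, m1, m0)
               for p, weight in prior)


# Hypothetical discrete prior F on p; the weights sum to 1.
prior = [(0.01, 0.5), (0.05, 0.3), (0.20, 0.2)]
m1, m0 = 0.95, 0.001  # made-up match probabilities

expectations = [expected_ppv(prior, n, m1, m0) for n in (10, 100, 1000)]
for n, value in zip((10, 100, 1000), expectations):
    print(f"n={n:5d}  E(PPV_n) = {value:.4f}")
```

Because PPV falls with n at every support point of the prior, the weighted average falls as well, which is the inequality between the two integrals in the special case of a discrete F.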
Our confidence in a nonmatch is

\[
\begin{aligned}
\mathrm{NPV}(p) &= P(\text{unenrolled and claimed nonmatch} \mid \text{claimed nonmatch}) \\
&= \frac{P(\text{unenrolled and claimed nonmatch})}{P(\text{claimed nonmatch})} \\
&= \frac{(1 - p) \times P(\text{claimed nonmatch} \mid \text{unenrolled})}{(1 - p) \times P(\text{claimed nonmatch} \mid \text{unenrolled}) + p \times P(\text{claimed nonmatch} \mid \text{enrolled})}.
\end{aligned}
\]

As noted above, increasing watch-list size by one new enrollment without changing p offers an additional opportunity for each unenrolled presenter to falsely match. Thus, P(claimed nonmatch|unenrolled) cannot increase and may decrease. The new enrollee can affect results only for those enrolled presenters failing to match their enrollment samples and gives such presenters an additional chance to match the watch list, although incorrectly, thus decreasing P(claimed nonmatch|enrolled). Assuming that list size does not affect the presentation distribution F(p), the net impact depends on the ratio of the two probabilities. In the simplest conceivable model, when comparisons between pairs of individuals are independent and true- and false-match probabilities are uniformly $m_1$ and $m_0$, these two probabilities are respectively $(1 - m_0)^n$ and $(1 - m_1)(1 - m_0)^{n-1}$ when n subjects are enrolled, and both are multiplied by $(1 - m_0)$ with each new enrollment, leaving their ratio and NPV(p) unchanged. But if $m_0$ depends on enrollment status, as might occur when attempts are made to compromise the identification process, then NPV(p) can decrease or increase when $m_0$ is higher for comparisons of unenrolled to enrolled presenters, or of one enrolled to other enrolled presenters, respectively. The expectation would change accordingly, in either direction.

Considering the watch-list recognition problem from the same perspective, one is now satisfied with a claim that the presenter matches someone on the list, without concern for whether the match is to the presenter’s own enrollment sample. The definition and above discussion of NPV remain unaltered because a false match of a presenting enrollee, which is the event adjudicated differently by identification and watch-list recognition, does not contribute to probabilities conditioned on the absence of a match.

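The behavior of NPV in the simple independent-comparison model described above can be illustrated with a short calculation. This is a sketch under stated assumptions: the function `npv` and the split of the false-match probability into `m0_unenrolled` (unenrolled presenter compared with an enrollee) and `m0_enrolled` (enrolled presenter compared with a different enrollee) are hypothetical names introduced here, and all numeric values are made up.

```python
def npv(p, n, m1, m0_unenrolled, m0_enrolled):
    """NPV(p) for list length n with independent pairwise comparisons."""
    # P(claimed nonmatch | unenrolled): no false match against any of n enrollees.
    nonmatch_unenrolled = (1 - m0_unenrolled) ** n
    # P(claimed nonmatch | enrolled): own sample missed, and no false match
    # against the other n - 1 enrollees.
    nonmatch_enrolled = (1 - m1) * (1 - m0_enrolled) ** (n - 1)
    numerator = (1 - p) * nonmatch_unenrolled
    return numerator / (numerator + p * nonmatch_enrolled)


p, m1 = 0.1, 0.95  # hypothetical values

# Uniform m0: NPV is the same at every list length.
uniform = [npv(p, n, m1, 0.001, 0.001) for n in (10, 100, 500)]

# m0 higher for unenrolled presenters: NPV falls as the list grows.
falling = [npv(p, n, m1, 0.002, 0.001) for n in (10, 100, 500)]

# m0 higher for enrolled presenters against other enrollees: NPV rises.
rising = [npv(p, n, m1, 0.001, 0.002) for n in (10, 100, 500)]
```

In the uniform case the common factor $(1 - m_0)^{n-1}$ cancels from numerator and denominator, which is why `uniform` holds a single repeated value up to floating-point rounding.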
Moreover,

\[
\begin{aligned}
\mathrm{PPV}(p) &= P(\text{true match with any enrollment sample} \mid \text{claimed match with list}) \\
&= \frac{P(\text{claimed match with list and true match with any enrollment sample})}{P(\text{claimed match with list})} \\
&= \frac{p \times P(\text{claimed match with anyone} \mid \text{enrolled})}{p \times P(\text{claimed match with anyone} \mid \text{enrolled}) + (1 - p) \times P(\text{false match} \mid \text{unenrolled})}.
\end{aligned}
\]

With a new enrollment to the list, an enrolled presenter who fails to match the correct enrollment sample has an added chance of matching the new enrollee and being correctly flagged as previously enrolled. This increases the numerator probability rather than decreasing it, as was the case for individual identification: numerator and denominator thus both increase. In the simple case described above, PPV can be shown to decline with list size, as was the case for identification. However, other scenarios and results are conceivable: match probabilities may differ for enrolled and unenrolled presenters, the prior distribution F(·) may depend on list size, and match comparisons may be dependent.

For an example of how linkage of F(p) to list size can change these results, consider a closed-set identification system scaled up by enrolling many more users, each of whom interacts with the system daily to obtain workplace access, perhaps in a rapidly expanding corporation. Unless the number of attempted intrusions increases greatly, F(p) is shifted to the right and p stochastically increases. In the resulting change, the increasing dominance of the PPV fraction by its numerator term outweighs the increasing chance of false recognition for any single impostor challenge, because impostor challenges occur with declining relative frequency. Confidence in a match would thereby increase rather than decrease.
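Both effects described above for watch-list recognition can be seen in a small numerical sketch. It assumes the simple independent-comparison model with uniform probabilities `m1` and `m0`, and it adds a hypothetical linkage between p and list size that is not in the text: each of the n enrollees presents once per day and a fixed number of impostor attempts occurs per day, so p = n / (n + impostors_per_day). All numbers are made up.

```python
def ppv_recognition(p, n, m1, m0):
    """Watch-list recognition PPV for list length n, simple model."""
    # Enrolled presenter is flagged if any of the n samples matches.
    match_anyone_enrolled = 1 - (1 - m1) * (1 - m0) ** (n - 1)
    # Unenrolled presenter is flagged only through a false match.
    false_match_unenrolled = 1 - (1 - m0) ** n
    numerator = p * match_anyone_enrolled
    return numerator / (numerator + (1 - p) * false_match_unenrolled)


def p_from_list_size(n, impostors_per_day):
    """Hypothetical linkage: share of daily presentations made by enrollees."""
    return n / (n + impostors_per_day)


m1, m0 = 0.95, 0.001  # made-up match probabilities
sizes = (100, 1000, 10000)

# p held fixed while the list grows: PPV declines.
fixed_p = [ppv_recognition(0.1, n, m1, m0) for n in sizes]

# p rises with enrollment (50 impostor attempts per day): PPV increases.
linked = [ppv_recognition(p_from_list_size(n, 50), n, m1, m0) for n in sizes]

for n, a, b in zip(sizes, fixed_p, linked):
    print(f"n={n:6d}  fixed p: {a:.3f}   p linked to n: {b:.3f}")
```

Other linkage functions or parameter values could give a different direction of change; the sketch shows only that the fixed-p conclusion need not carry over once p depends on n.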