equal variances per se, as measures of accuracy they share the same “symmetric” features of d’. A second class of standard measures of association, which do depend on marginal totals, are functions of the correlation coefficient; they include Cohen’s kappa and measures derived from the Chi-square coefficient, such as the Phi, or four-fold point, coefficient. Like the “percentage correct” index, these measures vary with the base rate of positive cases in the study sample and with the diagnostician’s decision threshold, in a way that is evident only when their ROCs are derived. Their ROCs are not widely known inasmuch as the measures were designed for single 2-by-2 or 2-by-3 tables, rather than for the 2-by-n table that represents the multiple possible thresholds used in estimating an ROC. However, these measures can be shown to predict an ROC of irregular form—one that is not concave downward or that intersects the ROC axes at places other than the (0.0, 0.0) and (1.0, 1.0) corners. Moreover, some of these latter measures were developed to determine statistical significance relative to hypotheses of no relationship, and they lack cogency for assessing degree of accuracy or effect size. Several of these alternative statistics have been analyzed and their theoretical ROCs compared with a broad sample of observed ROCs (Swets, 1986a, 1986b); the two classes of association statistics are discussed by Bishop, Fienberg, and Holland (1975).
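
The dependence of kappa and phi on the base rate can be illustrated with a minimal sketch. It holds the diagnostician's operating point fixed (a hit rate of 0.8 and a false-alarm rate of 0.2, values chosen only for illustration and not taken from this report) and varies the proportion of positive cases. The function names are hypothetical.

```python
import math

def two_by_two(hit_rate, false_alarm_rate, base_rate):
    """Cell proportions (tp, fn, fp, tn) of a 2-by-2 table at a fixed operating point."""
    tp = base_rate * hit_rate
    fn = base_rate * (1 - hit_rate)
    fp = (1 - base_rate) * false_alarm_rate
    tn = (1 - base_rate) * (1 - false_alarm_rate)
    return tp, fn, fp, tn

def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa: agreement beyond that expected from the marginal totals."""
    observed = tp + tn
    expected = (tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)
    return (observed - expected) / (1 - expected)

def phi_coefficient(tp, fn, fp, tn):
    """Phi (four-fold point) coefficient of a 2-by-2 table."""
    denom = math.sqrt((tp + fn) * (fp + tn) * (tp + fp) * (fn + tn))
    return (tp * tn - fp * fn) / denom

# Same hit and false-alarm rates, different base rates of positive cases.
for base_rate in (0.5, 0.2, 0.1):
    cells = two_by_two(0.8, 0.2, base_rate)
    print(base_rate, round(cohen_kappa(*cells), 3), round(phi_coefficient(*cells), 3))
```

With these illustrative inputs both indices equal 0.6 at a base rate of 0.5 and fall to about 0.35 (kappa) and 0.41 (phi) at a base rate of 0.1, although the hit and false-alarm rates have not changed.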


The accuracy index (A) is equal to the proportion of correct signal identifications that would be made by a diagnostician confronted repeatedly by pairs of random test results, one of which was drawn from the signal category and one from the noise category. For example, a decision maker repeatedly faced with two examinees, one of whom is truthful, will make the correct choice 8 out of 10 times by using a test with A = 0.8. In other situations, A does not translate easily to percent correct. Under many assumptions about test situations that are realistic in certain applications, the percent correct is quite different from A, as is illustrated in Table 2-1. (The measure A is applied to diagnostic performance in several fields; see Swets, 1988, 1996:Chapter 4.)
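
Both points can be sketched in a few lines. The first function estimates A directly from its paired-comparison definition. The second computes percent correct for a single threshold under an equal-variance Gaussian model, which is an assumption made here for illustration only. The function names, the base rates, and the thresholds are likewise hypothetical.

```python
from statistics import NormalDist

def pairwise_accuracy(signal_scores, noise_scores):
    """Proportion of (signal, noise) pairs in which the signal case scores higher.

    Ties count as half a correct choice.
    """
    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_scores) * len(noise_scores))

def d_prime_for_area(area):
    """Separation of two unit-variance normal distributions that yields the given A."""
    return 2 ** 0.5 * NormalDist().inv_cdf(area)

def percent_correct(area, base_rate, threshold):
    """Proportion of correct yes/no decisions at one threshold.

    Assumes noise ~ N(0, 1) and signal ~ N(d', 1); the threshold is on that scale.
    """
    std = NormalDist()
    d = d_prime_for_area(area)
    hit_rate = 1 - std.cdf(threshold - d)
    correct_rejection_rate = std.cdf(threshold)
    return base_rate * hit_rate + (1 - base_rate) * correct_rejection_rate

# A is held at 0.8 while the base rate and threshold change.
midpoint = d_prime_for_area(0.8) / 2
for base_rate, threshold in ((0.5, midpoint), (0.1, midpoint), (0.1, 3.0)):
    print(base_rate, round(threshold, 2), round(percent_correct(0.8, base_rate, threshold), 3))
```

Under these assumptions a test with A = 0.8 yields roughly 72 percent correct at the midpoint threshold, and about 90 percent correct at a base rate of 0.1 with a very strict threshold, where nearly every case is simply called negative.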


A conventional way of representing decision thresholds quantitatively is as the slope of the tangent to the ROC curve drawn at the cutoff point that defines the threshold. It can be shown that this slope is equal to the ratio of the height of the signal distribution to the height of the noise distribution (the “likelihood ratio”) at that threshold (see representations in Figure 2-1). At point F in Figure 2-2, this slope is 2, at point B it is 1, and at point S it is 1/2 (Swets, 1992, 1996:Chapter 5).
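
The equality of ROC slope and likelihood ratio can be checked numerically. The sketch below assumes Gaussian signal and noise distributions with equal variance and a separation of 1.5 standard deviations. Those parameters, and the function names, are illustrative assumptions and do not come from this report's figures.

```python
from statistics import NormalDist

def roc_point(threshold, noise, signal):
    """(false-alarm rate, hit rate) when scores above the threshold are called positive."""
    return 1 - noise.cdf(threshold), 1 - signal.cdf(threshold)

def likelihood_ratio(threshold, noise, signal):
    """Height of the signal density divided by height of the noise density."""
    return signal.pdf(threshold) / noise.pdf(threshold)

def roc_slope(threshold, noise, signal, step=1e-5):
    """Central finite-difference estimate of the ROC tangent slope at the threshold."""
    f_lo, h_lo = roc_point(threshold - step, noise, signal)
    f_hi, h_hi = roc_point(threshold + step, noise, signal)
    return (h_hi - h_lo) / (f_hi - f_lo)

noise = NormalDist(0.0, 1.0)    # assumed noise distribution
signal = NormalDist(1.5, 1.0)   # assumed signal distribution

for threshold in (0.0, 0.75, 1.5):
    print(threshold,
          round(roc_slope(threshold, noise, signal), 4),
          round(likelihood_ratio(threshold, noise, signal), 4))
```

In this model the two columns agree at every threshold, and the slope is 1 where the two densities cross, halfway between the means.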


Computer software exists to give maximum-likelihood fits to empirical ROC points (e.g., Metz, 1986, 1989, 2002; Swets, 1996). There are two common approaches: to draw straight line segments interpolating between estimated ROC points and the lower left and upper right corners of the plotting square; or to assume a curved form that follows from underlying distributions of the measure of evidence that are normal (Gaussian), often with arbitrary variances but sometimes with these assumed equal, and to use maximum likelihood estimation. In either case, A is determined as the area under the estimated ROC; standard errors and confidence bounds for A may also be computed. These methods have technical limitations when used on relatively small samples, but they are adequate to the level of accuracy needed here.
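
The two ways of obtaining A can be sketched as follows. The first function applies the trapezoidal rule to the straight-line segments joining the empirical points and the two corners. The second uses the standard closed form for the area under a binormal ROC, given an intercept a and slope b on normal-deviate axes. The sketch does not perform the maximum-likelihood fit itself or compute standard errors, and it is not the cited software. The function names and the example points are hypothetical.

```python
from statistics import NormalDist

def trapezoidal_area(roc_points):
    """Area under straight segments joining ROC points and the (0, 0) and (1, 1) corners.

    roc_points is an iterable of (false-alarm rate, hit rate) pairs.
    """
    points = sorted(set(roc_points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (f0, h0), (f1, h1) in zip(points, points[1:]):
        area += (f1 - f0) * (h0 + h1) / 2.0
    return area

def binormal_area(a, b):
    """Area under a binormal ROC with z(hit) = a + b * z(false alarm).

    b is the ratio of the noise to the signal standard deviation, so b = 1
    corresponds to the equal-variance case.
    """
    return NormalDist().cdf(a / (1 + b * b) ** 0.5)

# Illustrative points, as if from three decision thresholds.
example_points = [(0.1, 0.5), (0.3, 0.8), (0.6, 0.95)]
print(round(trapezoidal_area(example_points), 4))
print(round(binormal_area(1.2, 1.0), 4))
```

Because the straight segments lie on or below any concave curve through the same points, the trapezoidal value tends to be somewhat lower than the area under a fitted smooth curve when only a few thresholds are available.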


A different distinction between validity and utility is made in some writings on diagnostic testing (Cronbach and Gleser, 1965; Schmidt et al., 1979). That distinction concerns the practical value of a test with a given degree of accuracy in particular decision-making contexts, such as screening populations with low base rates of the target condition. We address these issues in this report (particularly in Chapter 7), but do not apply the term “utility” in that context. Our usage of “utility” in discussing the polygraph follows the usage of the term by polygraph practitioners.
