to determine scientifically whether or how well the polygraph (or any other technique for the psychophysiological detection of deception) “works.” The appropriate criterion of validity can be slippery; truth is often hard to determine; and it is difficult to disentangle the roles of physiological responses, interrogators’ skill, and examinees’ beliefs in order to make clear attributions of practical results to the validity of the test. Given all these confounding factors in the case evidence, even the most compelling anecdotes from practitioners do not constitute significant scientific evidence.
Evidence of scientific validity is essential to give confidence that a test measures what it is supposed to measure. Such evidence comes in part from scientifically collected data on the diagnostic accuracy of a test with certain examiners and examinees. Evidence of accuracy is critical to test validation because it can demonstrate that the test works well under specific conditions in which it is likely to be applied. Evidence of accuracy is not sufficient, however, to give confidence that a test will work well across all examiners, examinees, and situations, including those in which it has not been applied. This limitation is important whenever a test is used in a situation or on a population of examinees for which accuracy data are not available and especially when scientific knowledge suggests that the test may not perform in the same way in the new situation or with the new population. This limitation of accuracy data is particularly serious for polygraph security screening because the main target populations, such as spies and terrorists, have not been and cannot easily be subjected to systematic testing. Confidence in polygraph testing, especially for security screening, therefore also requires evidence of its construct validity, which depends, as we have noted, on an explicit and empirically supported theory of the mechanisms that connect test results to the phenomenon they purport to diagnose. A test with good construct validity uses methods that are defensible in light of the best theoretical and empirical understanding of those mechanisms, the external factors that may alter the mechanisms and affect test results, and the measurement issues affecting the ability to detect the signal of the phenomenon being measured and exclude extraneous influences. Only to the extent that a diagnostic test meets these construct validity criteria can one have confidence that it will work well in new situations and with different kinds of examinees.
A well-supported theory of the test is also essential to provide confidence that the test will work well in the face of efforts examinees may make to produce a false negative result. Spies and terrorists may be strongly motivated to learn countermeasures to polygraph tests and may develop potential countermeasures that have not been studied. To have confidence that such measures will fail or will be detected requires basic