Decision Analysis of Polygraph Security Screening
In recent decades, decision scientists and policy advisers have worked to develop systematic methods for resolving hard decision problems that arise in business, medicine and public policy (Raiffa, 1968; Quade, 1989; Gold et al., 1996; Hammond, Keeney, and Raiffa, 1999). These methods are used explicitly in many scientific articles, and they are used implicitly in practical advice, where the goal is to get decision makers to think systematically before acting.
It is useful to consider what such an analysis of counterespionage personnel policy, or of polygraph testing in that context, would entail. Six steps of such an analysis are typically recommended (Hammond, Keeney, and Raiffa, 1999): (1) understanding the problem and context of decision; (2) defining the goals and objectives of policy; (3) generating the alternative choices; (4) predicting their consequences; (5) evaluating those consequences and trading off results in different domains; and (6) using the analysis to help make the decision.
The different uses of polygraph examinations—for periodic screening of employees, preemployment screening, and event-specific investigation—present different decision problems. Consequently, the problems must be specified in each situation, even though some objectives, such as minimizing costs, are relevant to all situations.
Consider the example of periodic screening for espionage (the logic is the same for sabotage or terrorism, though the analysis would need to consider each of these separately). The main goal of periodic screening is to limit the damage to national interests by employees who are spies by detecting them and by deterring others who might otherwise be induced to become spies. A secondary goal is to reduce the damage from information leaks following security violations. Personnel programs might be evaluated against a variety of criteria, including the number of undetected spies working in the agency and the potential damage each could do, the financial costs of the program itself, and the costs to individuals and society of careers interrupted or changed because of false positive test results. We note that, currently, postemployment polygraph screening often involves periodic testing at known intervals, a policy that is likely to be less effective than aperiodic testing at unanticipated intervals.
Policy analysis must consider some set of alternatives for dealing with the problem. One might consider three alternative programs: periodic screening that includes a polygraph test like the Test for Espionage and Sabotage (TES); no security screening or a lower cost interrogation without the polygraph; and an intense screening with replacements or supplements for the polygraph, such as more pencil-and-paper testing or more extensive background investigation of finances and activities. Any final assessment would have to define the programs precisely, including major differences that distinguish different programs.
Formal policy analysis would then predict the consequences of each alternative policy, perhaps by mathematical modeling, using parameters that represent the key factors affecting results. Different parts of the analysis might use different kinds of models. Game theory might be useful for modeling deterrent effects and the use of countermeasures, while standard statistical models might be used for estimating the number of spies caught in the next year. The analysis would set a time horizon within which effects will be counted and specify how long the programs are assumed to be in place. The effects of detecting spies would be immediate, but deterrence might have longer-range effects. We first discuss three key parameters and then explain how the modeling might be performed. For simplicity, we consider only the goal of limiting the damage from espionage. (The analysis for other security violations is quite similar.)
The first parameter is p(a), the probability of a spy operating under screening policy a. If a is a tough screening policy that makes spying less attractive, p(a) would be lower than the probability given no rescreening. A second parameter is C(a), the annual costs of screening program a, which would normally be modeled as the sum of fixed costs, F(a), and a per-screen variable cost, V(a): C(a) = F(a) + N(a)V(a), with N(a) representing the number of employees screened under policy a per year. (Other, more subjective, costs are considered later as part of the evaluation of consequences.)
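The cost model C(a) = F(a) + N(a)V(a) can be sketched in a few lines of Python. This is only an illustration: the class name, field names, and all dollar figures are assumptions made for the example, not estimates from any screening program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningPolicy:
    """Cost parameters for one screening policy a (all values hypothetical)."""
    name: str
    fixed_cost: float        # F(a), dollars per year
    cost_per_screen: float   # V(a), dollars per employee screened
    screens_per_year: int    # N(a), employees screened per year

    def annual_cost(self) -> float:
        """C(a) = F(a) + N(a) * V(a)."""
        return self.fixed_cost + self.screens_per_year * self.cost_per_screen


# Illustrative numbers only; they are not taken from the text.
tes = ScreeningPolicy("periodic TES screening", 2_000_000.0, 800.0, 5_000)
print(f"{tes.name}: C(a) = ${tes.annual_cost():,.0f}")
```

Keeping F(a), V(a), and N(a) as separate fields makes it easy to see how the annual cost changes when a policy screens more employees or uses a costlier procedure.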
With tests that perfectly discriminate spies from others, the mathematics of prediction is simple and implies that one should only use the cheapest of the perfect tests, and use it if the annual costs of the test itself and of spying between tests were less than the annual costs of spies with no screening. Unfortunately, all currently known screening tests are imperfect. A third parameter, P(a), represents the performance (accuracy) of screening program a for detecting spies and avoiding false accusations. Because polygraph screening programs involve more than just the polygraph test (for example, the effect of the interrogation depends on examinees’ perceptions of polygraph accuracy), P(a) depends on more than just the polygraph test alone, and may be different from the accuracy index (A) of the polygraph test procedure. Bayes’ theorem can be used to calculate the number of false positives and true positives as a function of policy and to select the appropriate threshold for labeling an employee as deceptive (or, more specifically, as a security risk or a spy), given the calculations of net costs.
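As a rough sketch of the Bayes' theorem calculation, the Python below computes expected true and false positives for a given base rate and then picks, from a small set of candidate thresholds, the one with the lowest expected cost of errors. The function names, the three operating points (sensitivity and false positive rate pairs), and every cost figure are hypothetical and chosen only to make the arithmetic concrete.

```python
from typing import Dict, Tuple


def expected_outcomes(n_screened: int, base_rate: float,
                      sensitivity: float, false_positive_rate: float) -> Dict[str, float]:
    """Expected counts of screening outcomes at one decision threshold."""
    spies = n_screened * base_rate
    non_spies = n_screened - spies
    true_pos = spies * sensitivity
    false_pos = non_spies * false_positive_rate
    flagged = true_pos + false_pos
    return {
        "true_positives": true_pos,
        "false_negatives": spies - true_pos,
        "false_positives": false_pos,
        # Posterior probability P(spy | flagged) from Bayes' theorem.
        "prob_spy_given_flagged": true_pos / flagged if flagged > 0 else 0.0,
    }


# Hypothetical operating points: name -> (sensitivity, false positive rate).
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "flag_few": (0.20, 0.005),
    "flag_some": (0.50, 0.05),
    "flag_many": (0.80, 0.16),
}


def expected_error_cost(outcomes: Dict[str, float],
                        cost_per_missed_spy: float,
                        cost_per_false_positive: float) -> float:
    """Expected cost of missed spies plus expected cost of false accusations."""
    return (outcomes["false_negatives"] * cost_per_missed_spy
            + outcomes["false_positives"] * cost_per_false_positive)


def best_threshold(n_screened: int, base_rate: float,
                   cost_per_missed_spy: float,
                   cost_per_false_positive: float) -> str:
    """Name of the candidate threshold with the lowest expected error cost."""
    return min(
        THRESHOLDS,
        key=lambda name: expected_error_cost(
            expected_outcomes(n_screened, base_rate, *THRESHOLDS[name]),
            cost_per_missed_spy, cost_per_false_positive),
    )


# Illustrative inputs only: 10,000 employees, 1 spy per 1,000.
print(expected_outcomes(10_000, 0.001, 0.80, 0.16))
print(best_threshold(10_000, 0.001, 10_000_000.0, 100_000.0))
```

With a low base rate, most flagged employees are false positives even at high sensitivity, so the preferred threshold depends heavily on the assumed cost of a missed spy relative to a false accusation.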
To estimate the parameters for the model, one would need to use judgment (preferably informed by statistical evidence) to calculate the base rate of espionage and a plausible range of values. For example, the estimate of the probability that an employee is a spy might be based on the 139 known spies from 1940-1994 (Taylor and Snow, 1997) added to an estimate of the spies that were caught but not reported for security reasons, and the estimated number of spies who were not caught in this period, divided by the number of people working in that period with access to critical information. This probability would vary from agency to agency and over time.
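The base-rate estimate described above can be written out as a small Python sketch. Only the figure of 139 known spies comes from the text; the counts of unreported and uncaught spies and the number of people with access are placeholder assumptions, varied here to produce a plausible range rather than a single value.

```python
def espionage_base_rate(known_spies: int, est_unreported: float,
                        est_uncaught: float, people_with_access: float) -> float:
    """Estimated fraction of people with access who were spies in the period."""
    if people_with_access <= 0:
        raise ValueError("people_with_access must be positive")
    return (known_spies + est_unreported + est_uncaught) / people_with_access


KNOWN_SPIES = 139  # 1940-1994 (Taylor and Snow, 1997)

# Hypothetical scenarios: (unreported spies, uncaught spies, people with access).
SCENARIOS = {
    "low": (0, 50, 5_000_000),
    "central": (20, 150, 3_000_000),
    "high": (50, 500, 2_000_000),
}

for label, (unreported, uncaught, access) in SCENARIOS.items():
    rate = espionage_base_rate(KNOWN_SPIES, unreported, uncaught, access)
    print(f"{label}: about {rate * 100_000:.1f} spies per 100,000 people with access")
```

The result is a rate over the whole period rather than an annual rate, and an agency-specific analysis would substitute its own counts and time window.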
The variable costs of a screening program are primarily labor and could be estimated from the number of cases done each year, multiplied by the average salaries paid to examiners and examinees for the time they spend in the screening process. Fixed costs might be estimated by some standard overhead amount or by a detailed costing. Alternatively, the total monetary costs might be estimated by taking the annual polygraph program budget and estimating the portion used in screening activities.
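The two costing approaches in this paragraph, building up from labor time or apportioning an existing budget, might be sketched as follows. All hours, pay rates, overhead, and budget figures are hypothetical, and the function names are introduced only for illustration.

```python
def variable_cost_per_screen(examiner_hours: float, examiner_hourly_pay: float,
                             examinee_hours: float, examinee_hourly_pay: float) -> float:
    """Labor cost of one screening: time spent by examiner and by examinee."""
    return (examiner_hours * examiner_hourly_pay
            + examinee_hours * examinee_hourly_pay)


def total_cost_from_labor(cases_per_year: int, cost_per_screen: float,
                          fixed_overhead: float) -> float:
    """Bottom-up estimate: fixed overhead plus cases times per-screen cost."""
    return fixed_overhead + cases_per_year * cost_per_screen


def total_cost_from_budget(annual_polygraph_budget: float,
                           screening_share: float) -> float:
    """Top-down estimate: the portion of the program budget used for screening."""
    if not 0.0 <= screening_share <= 1.0:
        raise ValueError("screening_share must be between 0 and 1")
    return annual_polygraph_budget * screening_share


# Illustrative numbers only.
per_screen = variable_cost_per_screen(4.0, 50.0, 3.0, 40.0)
print(total_cost_from_labor(5_000, per_screen, 500_000.0))
print(total_cost_from_budget(3_000_000.0, 0.6))
```

Comparing the bottom-up and top-down figures gives a simple consistency check on the cost estimate.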
Chapter 5 is primarily concerned with assessing the accuracy of polygraph testing in various situations. Accuracy may depend on the testing procedure, the situation, and characteristics of examiners and examinees, as well as the base rate of espionage and the decision threshold selected for each decision point in a screening program. Historical data on performance is needed for estimating the likely numbers of false positives and false negatives, as well as a subjective assessment of the relationship of the historical data to the current context.
To evaluate the predicted consequences for each policy, it is necessary to frame the analysis by choosing a common perspective for all programs, which in this case would be a societal viewpoint, rather than that of a particular agency. The simplest way to combine outcomes in different domains is by a cost-consequence table, as shown in Table J-1 (Gold et al., 1996). Usually, the entries are incremental relative to a single reference program, such as interrogation without polygraph. If no one policy is dominant (best on all dimensions), this table might be used in a subjective assessment of the tradeoffs to get to the best choice. People might disagree on those tradeoffs, but the table entries, if correct, give the information needed for a reasoned choice.

TABLE J-1 Outcomes Under Alternative Screening Policies

| Policies Under Consideration | Costs of Screen | Number of Spies Remaining | Number of False Accusations | Number of Security Violations |
|---|---|---|---|---|
| Interrogation without polygraph | | | | |
| Some TES screening | | | | |
| Much TES screening | | | | |
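A cost-consequence table of the kind shown in Table J-1 can be sketched in Python, with entries expressed as increments over a reference program and a simple check for a dominant policy. Every number below is invented for illustration, and the check assumes that lower is better on all four outcomes.

```python
from typing import Dict, Optional

OUTCOMES = ("cost_of_screening", "spies_remaining",
            "false_accusations", "security_violations")

# Hypothetical absolute outcomes per policy (lower is better on every column).
POLICIES: Dict[str, Dict[str, float]] = {
    "Interrogation without polygraph": {
        "cost_of_screening": 1_000_000, "spies_remaining": 5.0,
        "false_accusations": 2, "security_violations": 120},
    "Some TES screening": {
        "cost_of_screening": 3_000_000, "spies_remaining": 4.0,
        "false_accusations": 20, "security_violations": 100},
    "Much TES screening": {
        "cost_of_screening": 6_000_000, "spies_remaining": 3.0,
        "false_accusations": 60, "security_violations": 90},
}


def incremental_table(policies: Dict[str, Dict[str, float]],
                      reference: str) -> Dict[str, Dict[str, float]]:
    """Express every entry as a difference from the reference policy."""
    ref = policies[reference]
    return {name: {k: vals[k] - ref[k] for k in OUTCOMES}
            for name, vals in policies.items()}


def dominant_policy(policies: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Return a policy that is at least as good as all others on every outcome."""
    for name, vals in policies.items():
        if all(vals[k] <= other[k]
               for other_name, other in policies.items() if other_name != name
               for k in OUTCOMES):
            return name
    return None


table = incremental_table(POLICIES, "Interrogation without polygraph")
for name, row in table.items():
    print(name, row)
print("Dominant policy:", dominant_policy(POLICIES))
```

In this made-up example no policy dominates, so the choice would rest on a subjective weighing of added cost and false accusations against fewer remaining spies and violations.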
There are many difficulties in estimating the costs for the analysis. It is easier to compute the total costs of polygraph examinations than their incremental costs and their effects in comparison with interrogation without polygraphs. The total costs are the incremental costs if polygraph examinations are added to whatever else is done and any confessions are due solely to the polygraph, but this assumption probably overstates both the incremental costs and benefits.
In principle, an alternative table might replace or supplement the columns for the number of spies remaining and number of false accusations by estimates of their costs. All cost estimates should include costs to the examinee and spillover costs, in addition to the direct costs of running the screening program.
The costs per false positive are much lower for preemployment screening than for periodic employee screening. In preemployment screening, there is a cost to the government of hiring less qualified people and a cost to an applicant of not getting a desired job. Unless the skills sought are very specialized, the government costs will be small. The costs to an applicant include bad feelings from failing the polygraph and the need to search for a different job. Costs are much higher in employee screening because national security jobs by their nature rely on specific human capital that must be learned on the job. For an employee who has not committed any serious security violations and who has settled into a social setting and learned many skills specific to his or her job, the costs to the government of putting that employee in some state of limbo involve training a replacement and perhaps damage to national security caused by the replacement of a valuable contributor with an inexperienced one. The costs to the employee include bad feelings, a waste of job-specific skills and knowledge, and perhaps a search for a new, probably inferior job. The costs to the government will be higher if there are negative side effects on morale or productivity of coworkers or on the ability to attract potentially productive employees.
The hardest estimate to make is the expected costs per undetected spy or terrorist. These will vary greatly by the potential of that person to do damage: from virtually none for ineffective spies to enormous amounts for successful ones who may compromise agents or give away invaluable technical information. A report on information collected on the 139 Americans who were officially charged with spying between 1940 and 1994 showed many to be low-level personnel who needed money and naively tried to sell some secrets (Taylor and Snow, 1997). Since 1978, 38 percent of spies caught were caught on their first attempt. In recent years, ideology has become much less important as a motive. Taylor and Snow (1997) credit the 1978 Foreign Intelligence Surveillance Act for both detecting and successfully prosecuting more spies than before. Despite the end of the cold war, foreign governments are still interested in U.S. secrets, with economic and nonmilitary technical information becoming relatively more important than they used to be.
The expected costs of an isolated security violation, such as taking classified information home, are the product of the value of that information to an adversary and the probability that the adversary gets it. Because many people with access to classified information slip up from time to time, it is fortunate that the probability of those mistakes leading to an important disclosure is quite small. This probability is hard to estimate, but the expected costs per violation might be approximated by dividing the costs of all leaks through inadvertent security violations (as opposed to espionage or hacking) by the number of such violations. An area with a very lax security system might attract attention from adversaries and increase the chance that any particular infraction there turns out badly.
For some purposes, it is useful to combine all the outcomes into one or two measures. In a cost-benefit analysis, all outcomes are replaced by an estimate of their dollar value. In a cost-effectiveness analysis, all outcomes but one are replaced by their dollar value, and the one nonfinancial outcome is called the effect. Typically in the health field, the effect is some measure of incremental health, such as years of life added. In employee screening, the effect would be undetected spies, so that the programs could be rated on their cost in relation to the number of undetected spies (because of deterrence, this is slightly different from the cost per detected spy). To get to a cost-benefit analysis, one would need to put a dollar value on the cost of each undetected spy. Indirect effects of the program are also included in a thorough analysis. These would include the effects of detected spies on deterrence, the effects of false positives on morale and on the quality of scientific personnel that work in an agency, and the effects on other parts of the security system (for example, placing too much reliance on polygraph screening may result in loosening of ordinary security precautions, thus increasing the chances that a spy who is cleared by a polygraph examination will succeed in stealing secrets).
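The distinction between the two summary measures can be illustrated with a short Python sketch. It treats the effect as the reduction in undetected spies relative to a reference program, which is one common way to form an incremental cost-effectiveness ratio and is an assumption here. The function names and all figures are hypothetical.

```python
from typing import Optional


def cost_per_spy_averted(cost: float, spies_remaining: float,
                         ref_cost: float, ref_spies_remaining: float) -> Optional[float]:
    """Cost-effectiveness: added cost per reduction of one undetected spy.

    Returns None when the program does not reduce undetected spies
    relative to the reference, since the ratio is then not meaningful.
    """
    averted = ref_spies_remaining - spies_remaining
    if averted <= 0:
        return None
    return (cost - ref_cost) / averted


def net_benefit(cost: float, spies_remaining: float,
                ref_cost: float, ref_spies_remaining: float,
                dollar_cost_per_undetected_spy: float) -> float:
    """Cost-benefit: dollar value of spies averted minus the added program cost."""
    averted = ref_spies_remaining - spies_remaining
    return averted * dollar_cost_per_undetected_spy - (cost - ref_cost)


# Illustrative comparison of a screening program against a reference program.
ratio = cost_per_spy_averted(3_000_000.0, 4.0, 1_000_000.0, 5.0)
benefit = net_benefit(3_000_000.0, 4.0, 1_000_000.0, 5.0, 10_000_000.0)
print(f"Added cost per undetected spy averted: ${ratio:,.0f}")
print(f"Net benefit at $10 million per undetected spy: ${benefit:,.0f}")
```

The cost-effectiveness ratio avoids pricing an undetected spy, while the net benefit calculation requires that dollar value as an explicit input.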
Most of the uncertainty in calculation and evaluation relates to modeling assumptions and subjective judgments rather than statistical noise. Also, policy makers typically are looking for choices that remain good even if conditions or goals change. For these reasons, analysts typically use sensitivity analysis to examine how choices and conclusions are affected by varying the subjective assumptions and parameter estimates over a reasonable range, rather than attempting to compute confidence intervals or make probabilistic statements about the best choice.
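A sensitivity analysis of this kind might be sketched as a grid over the subjective cost assumptions, recording which policy has the lowest expected total cost under each combination. The policy outcomes, the parameter ranges, and the total-cost formula below are all assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Tuple

# Hypothetical predicted outcomes: (program cost, spies remaining, false accusations).
POLICY_OUTCOMES: Dict[str, Tuple[float, float, float]] = {
    "interrogation without polygraph": (1_000_000.0, 5.0, 2.0),
    "some TES screening": (3_000_000.0, 4.0, 20.0),
    "much TES screening": (6_000_000.0, 3.0, 60.0),
}

# Ranges for the subjective dollar values, varied rather than fixed.
COST_PER_UNDETECTED_SPY = (1_000_000.0, 10_000_000.0, 100_000_000.0)
COST_PER_FALSE_ACCUSATION = (50_000.0, 500_000.0)


def total_cost(outcome: Tuple[float, float, float],
               cost_per_spy: float, cost_per_false_accusation: float) -> float:
    """Program cost plus the dollar-valued costs of both kinds of error."""
    program_cost, spies_remaining, false_accusations = outcome
    return (program_cost
            + spies_remaining * cost_per_spy
            + false_accusations * cost_per_false_accusation)


def preferred_policies() -> Dict[Tuple[float, float], str]:
    """Lowest-cost policy for every combination of assumed dollar values."""
    winners = {}
    for spy_cost, fa_cost in product(COST_PER_UNDETECTED_SPY,
                                     COST_PER_FALSE_ACCUSATION):
        winners[(spy_cost, fa_cost)] = min(
            POLICY_OUTCOMES,
            key=lambda name: total_cost(POLICY_OUTCOMES[name], spy_cost, fa_cost))
    return winners


for (spy_cost, fa_cost), policy in preferred_policies().items():
    print(f"spy=${spy_cost:,.0f}, false accusation=${fa_cost:,.0f}: {policy}")
```

If the preferred policy changes across the grid, as it does with these invented numbers, the choice is sensitive to the assumptions and the disputed values deserve closer scrutiny.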
From this brief discussion it should be evident that there would be considerable difficulties involved in any quantitative policy analysis of the use of polygraph in periodic or aperiodic screening. An argument for conducting such an analysis despite the difficulties is that it may lead to better decision making than alternative strategies for making choices. For instance, leaving the choice to specialists may lead to inertia in maintaining policies that are no longer appropriate to changed conditions. Also, professionals have been noted to emphasize service to their clients rather than to society as a whole and may come to have undue faith in what they do (Fischhoff et al., 1981).
References
Fischhoff, B., S. Lichtenstein, P. Slovic, S. Derby, and R. Keeney 1981 Acceptable Risk. New York: Cambridge University Press.
Gold, M.R., J.E. Siegel, L.B. Russell, and M.C. Weinstein 1996 Cost Effectiveness in Health and Medicine. New York: Oxford University Press.
Hammond, J.S., R.L. Keeney, and H. Raiffa 1999 Smart Choices, A Practical Guide to Making Better Decisions. Boston: Harvard Business School Press.
Quade, E.S. (revised by G.M. Carter) 1989 Analysis for Public Decisions, 3rd ed. New York: North Holland.
Raiffa, H. 1968 Decision Analysis, Introductory Lectures on Choices Under Uncertainty. Reading, MA: Addison Wesley.
Taylor, S.A., and D. Snow 1997 Cold war spies: Why they spied and how they got caught. Intelligence and National Security 12(2):101-125.