K
Behavioral-Surveillance Techniques and Technologies
The primary question in behavioral science as applied to the use of behavioral technologies in the antiterrorism effort is, How can detection of particular behaviors and the attendant biological activity be used to indicate current and future acts of terrorism?
K.1
THE RATIONALE FOR BEHAVIORAL SURVEILLANCE
Some behavioral methods attempt to detect terrorist activity directly (for example, through surveillance at bridges, docks, and weapon sites). However, the focus in this appendix is on behavioral methods that are more indirect. Such methods are used to try to detect patterns of behavior that are thought to be precursors or correlates of wrongdoing (such as deception and expression of hostile emotions) or that are anomalous in particular situations (for example, identifying a person who fidgets much more and has much more facial reddening than others in a security line).
Many behavioral-detection methods monitor biological systems (such as cardiac activity, facial expressions, and voice tone) and use physiological information to draw inferences about internal psychological states (for example, “on the basis of this pattern of physiological activity, this person is likely to be engaged in deception”). In most situations, the easiest and most accurate way to determine past, current, and future behavior might be to ask the person what he or she has been doing, is doing, and plans to do. However, the terrorist’s desire to avoid detection and the “cat and mouse” game that is played by terrorists and their pursuers make such a verbal mode of information-gathering highly unreliable.
Because verbal reports can be manipulated and controlled so easily, we might turn to biological systems that are less susceptible to voluntary control or that provide detectable signs when they are being manipulated. Once we move to the biological level, however, we have abandoned direct observation of terrorist behavior and moved into the realm of inference of likely behavior from more primitive and less specific sources. Biobehavioral methods can be powerful and useful, but they are intrinsically subject to three limitations:
-
Many-to-one. Any given pattern of physiological activity can result from or correlate with a number of quite different psychological or physical states.
-
Probabilistic. Any detected sign or pattern conveys some likelihood of the behavior, intent, or attitude of interest but not an absolute certainty.
-
Errors. In addition to the highly desirable true positives and true negatives that are produced, there will be the troublesome false positives (an innocent person is thought to be guilty) and false negatives (a guilty person is thought to be innocent). Depending on the robustness of the biobehavioral techniques involved, it may be possible in the face of countermeasures for a subject to induce false negatives by manipulating his or her behavior.
In addition, even if deception or the presence of an emotion can be accurately and reliably detected, information about the reason for deception, a given emotion, or a given behavior is not available from the measurements taken. A person exhibiting nervousness may be excited about meeting someone at the airport or about being late. A person lying about his or her travel plans may be concealing an extramarital affair. A person fidgeting may be experiencing back pain. None of those persons would be the targets of counterterrorist efforts, nor should they be—and the possibility that their true motivations and intents may be revealed has definite privacy implications.
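The four outcomes named under "Errors" above can be made concrete with a small tally. The following is a minimal sketch in Python; the class name, field names, and all counts are hypothetical illustrations and are not drawn from any study cited in this appendix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Outcome counts for a hypothetical screening technique."""
    true_pos: int   # guilty person flagged
    false_pos: int  # innocent person flagged
    true_neg: int   # innocent person cleared
    false_neg: int  # guilty person cleared

    def false_positive_rate(self) -> float:
        # Share of innocent people who are wrongly flagged.
        innocent = self.false_pos + self.true_neg
        return self.false_pos / innocent if innocent else 0.0

    def false_negative_rate(self) -> float:
        # Share of guilty people who are wrongly cleared.
        guilty = self.true_pos + self.false_neg
        return self.false_neg / guilty if guilty else 0.0


# Invented numbers, for illustration only.
counts = ScreeningCounts(true_pos=8, false_pos=50, true_neg=940, false_neg=2)
print(f"false-positive rate: {counts.false_positive_rate():.3f}")
print(f"false-negative rate: {counts.false_negative_rate():.3f}")
```

A subject who successfully applies countermeasures moves from the true-positive cell to the false-negative cell, which raises the false-negative rate without changing anything an observer can see.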
K.2
MAJOR BEHAVIORAL-DETECTION METHODS
Most behavioral methods are based on monitoring the activity of neural systems that are thought to be difficult to control voluntarily or that reveal measurable signs when they are being controlled.
K.2.1
Facial Expression
Facial muscles are involved in the expression and communication of emotional states. They can be activated both voluntarily and involuntarily,1 so there is ample opportunity for a person to interfere with the expression of emotion in ways that serve personal goals. There is strong scientific evidence that different configurations of facial-muscle contractions are associated with what are often called basic emotions.2 Those emotions include anger, contempt, disgust, fear, happiness, surprise, and sadness. There is also evidence that other emotions can be identified on the basis of patterns of movement in facial and bodily muscles (for example, embarrassment3) and that distinctions can be made between genuine felt happiness and feigned unfelt happiness according to whether a smile (produced by the zygomatic major muscles) is accompanied by the contraction of the muscles (orbicularis oculi) that circle the eyes.4
Facial-muscle activity can be measured accurately by careful examination of the changes in appearance that are produced as the muscles cause facial skin to be moved.5 Trained coders working with video recordings can analyze facial expressions reliably, but it is extremely time-consuming (it can take hours to analyze a few minutes of video fully). Greatly simplified methods that focus only on the key muscle actions involved in a few emotions of interest and that are appropriate for real-time screening are being developed and tested. Some basic efforts to develop automated computer systems for analyzing facial expressions have also been undertaken,6 but the problems inherent in adapting them for real-world, naturalistic applications are enormous.7
1. W.E. Rinn, “The neuropsychology of facial expression: A review of the neurological and psychological mechanisms for producing facial expressions,” Psychological Bulletin 95(1):52-77, 1984.
2. P. Ekman, “An argument for basic emotions,” Cognition and Emotion 6(3-4):169-200, 1992.
3. D. Keltner, “Signs of appeasement: Evidence for the distinct displays of embarrassment, amusement, and shame,” Journal of Personality and Social Psychology 68(3):441-454, 1995.
4. P. Ekman and W.V. Friesen, “Felt, false and miserable smiles,” Journal of Nonverbal Behavior 6(4):238-252, 1982.
5. P. Ekman and W.V. Friesen, Facial Action Coding System, Consulting Psychologists Press, Palo Alto, Calif., 1978.
6. J.F. Cohn, A.J. Zlochower, J. Lien, and T. Kanade, “Automated face analysis by feature point tracking has high concurrent validity with manual FACS coding,” Psychophysiology 36(1):35-43, 1999.
7. For example, according to a German field test of facial recognition conducted in 2007, an accuracy of 60 percent was possible under optimal conditions, 30 percent on average (depending on light and other factors). See Bundeskriminalamt (BKA), Face Recognition as a Tool for Finding Criminals: Picture-man-hunt, Final report, BKA, Wiesbaden, Germany, February 2007. Available in German at http://www.cytrap.eu/files/EU-IST/2007/pdf/2007-07-FaceRecognitionField-Test-BKA-Germany.pdf.
There are several other ways to measure facial-muscle activity. The electrical activity of the facial muscles themselves can be measured (with electromyography [EMG]). That requires the application of many electrodes to the face, each placed to maximize sensitivity to the action of particular muscles and minimize sensitivity to the action of other muscles. Because of the overlapping anatomy of facial muscles, their varied sizes, and their high density in some areas (such as around the mouth), the EMG method may be better suited to simple detection of emotional valence (positive or negative) and intensity than to the detection of specific emotions. Another indirect method of assessing facial-muscle activity is to measure the “heat signature” of the face associated with changes in blood flow to different facial regions.8 That information can be read remotely by using infrared cameras; however, the spatial and temporal resolutions are problematic.
Even if a method emerged that allowed facial-muscle activity to be measured reliably, comprehensively, economically, and unobtrusively, there would be the issue of its utility in a counterterrorism effort. Necessary (but not sufficient) conditions for utility would include:
-
The availability of tools that can determine the specific emotion that is being signaled if and when emotional facial expression is displayed.
-
The superiority of a facial-expression–based emotion-prediction system to a system based on any other biological or physiological markers.
-
The detectability of indicators that a person is attempting to conceal his or her true emotional state or to shut down facial expression entirely, such as
-
Small, fleeting microexpressions of the emotion being felt.
-
Tell-tale facial signs of attempted control (such as tightening of some mouth muscles).
-
Signs that particular emotions are being simulated.
-
Characteristic increases in cardiovascular activity mediated by the sympathetic branch of the autonomic nervous system.9
Because no specific facial sign is associated with committing or planning a terrorist act, using facial measurement in a counterterrorism effort will have to be based on some combination of the detection of facial expressions thought to indicate malevolent intent (such as signs of anger, contempt, or feigned happiness in some situations), the detection of facial expressions thought to indicate deception,10 and the detection of facial expressions that are anomalous compared with those of other people in the same situation.
Results of research on the connection between facial expression and emotional state suggest correlations between the two. However, the suggestive findings have generally not been subject to rigorous, controlled tests of accuracy in a variety of settings that might characterize real-world application contexts.
K.2.2
Vocalization
In addition to the linguistic information carried by the human voice, a wealth of paralinguistic information is carried in pitch, timbre, tempo, and the like and is thought to be related to a person’s emotional state.11 Those paralinguistic qualities of speech can be difficult to control voluntarily, so they are potentially useful for detecting underlying emotional states and deception. In the emotion realm, much of the promise of mapping paralinguistic qualities of vocalization onto specific emotions has yet to be realized, and the history of using paralinguistic markers in the deception realm is not very encouraging. At one time, a great deal of attention was given to the detection of deception by quantifying microtremors in the voice (“voice stress analyzers”), but this approach has failed to withstand scientific scrutiny.12
Why has more progress not been made in using paralinguistic qualities of speech to detect emotions and deception? There are several possible reasons. The relationships between paralinguistic qualities of speech and psychological states are much weaker than originally thought. The field has not yet identified the right characteristics to measure. And sizable individual differences in speech need to be accounted for before interindividual consistencies will emerge. In the interim, new approaches that do not rely on paralinguistic vocalizations in isolation but rather combine them with other indicators of deception and emotion (such as facial expressions and physiological indicators) may prove useful. It is ironic that it is fairly simple to obtain high-quality, noninvasive samples of vocalizations in real-world contexts. Moreover, cost-effective, accurate instrumentation for analyzing the acoustic properties of speech is readily available. Thus, the tools are already in place; it is just the science that is lagging.
10. P. Ekman and M. O’Sullivan, “From flawed self-assessment to blatant whoppers: The utility of voluntary and involuntary behavior in detecting deception,” Behavioral Sciences and the Law. Special Issue: Malingering 24(5):673-686, 2006.
11. K.R. Scherer, Vocal Measurement of Emotion, Academic Press, Inc., San Diego, Calif., 1989.
12. See, for example, National Research Council, The Polygraph and Lie Detection, The National Academies Press, Washington, D.C., 2003; Mitchell S. Sommers, “Evaluating voice-based measures for detecting deception,” The Journal of Credibility Assessment and Witness Psychology 7(2):99-107, 2006, available at http://truth.boisestate.edu/jcaawp/2006_No_2/2006_99-107.pdf; J. Masip, E. Garrido, and C. Herrero, “The detection of deception using voice stress analyzers: A critical review,” Estudios de Psicología 25(1):13-30, 2004.
K.2.3
Other Muscle Activity
Technology is readily available for quantifying the extent of overall motor activity (sometimes called gross motor activity or general somatic activity). It can be done with accelerometers attached to a person (some are built into watch-like casings) or with pressure-sensitive devices (such as piezoelectric transducers) placed under standing and sitting areas. The latter can be used to track motion in multiple dimensions and thus enable characterization of patterns of pacing, fidgeting, and moving. Although clearly not specifically related to any particular emotional or psychological state, high degrees of motor activity may be noteworthy when they are anomalous in comparison with usual levels of agitation and tension.
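The idea of flagging motor activity only when it is anomalous relative to usual levels can be sketched as a simple standardized comparison. The Python below is an illustrative assumption, not a method described in this appendix: the function names, the activity units, the reference readings, and the threshold of 3 standard deviations are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def activity_z_score(observed: float, reference: Sequence[float]) -> float:
    """Standardize one activity reading against readings from comparable people."""
    if len(reference) < 2:
        raise ValueError("need at least two reference readings")
    spread = stdev(reference)
    if spread == 0:
        raise ValueError("reference readings show no variation")
    return (observed - mean(reference)) / spread


def is_anomalous(observed: float, reference: Sequence[float],
                 threshold: float = 3.0) -> bool:
    # Only unusually HIGH activity is flagged, matching the text's concern
    # with high degrees of motor activity. The threshold is an assumption.
    return activity_z_score(observed, reference) > threshold


# Hypothetical movement counts per minute for others in the same setting.
reference_readings = [10, 12, 11, 9, 13, 10, 11, 12]
print(is_anomalous(25, reference_readings))  # far above the group
print(is_anomalous(12, reference_readings))  # within the usual range
```

A flag of this kind says nothing about why a person is moving more than others; as the text notes, gross motor activity is not specific to any emotional or psychological state.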
K.2.4
Autonomic Nervous System
The autonomic nervous system (ANS) controls the activity of the major organs, including the heart, blood vessels, kidneys, pancreas, lungs, stomach, and sweat glands. Decades of methodological development in medicine and psychophysiology have produced ways to measure a wide array of autonomic functions reliably and noninvasively. Some of the measures are direct (such as using the electrical activity of the heart muscle to determine heart rate), and some are indirect (such as estimating vascular constriction by using the reflection of infrared light to determine the amount of blood pooling in peripheral sites or using impedance methods to measure the contractile force of the left ventricle as it pumps blood from the heart to the rest of the body). Additional work has been directed toward developing methods of ambulatory monitoring that enable tracking of ANS activity in freely moving people. Remote sensing of autonomic function is still in its infancy, but some progress has been made in using variation in surface temperature to indicate patterns of blood flow.
Measures of ANS activity are essentially measures of arousal and reflect the relative activation and deactivation of various organ systems to provide the optimal milieu to support current body activity (such as sleep, digestion, aggression, and thinking). Debate has raged over the decades as to whether specific patterns of autonomic activity are associated with particular psychological states, including emotions. For emotion, the issue is whether the optimal bodily milieu for anger (ANS support for fighting) is different from that for disgust (ANS support for withdrawal and expulsion of harmful substances). Evidence in support of that kind of autonomic specificity for at least some of the basic emotions is drawn from experimentation, metaphors found in language (such as association of heat and pressure with anger or of coolness with fear), and observable signs of autonomic activity (such as crying during sadness but not during fear or gagging during disgust but not during anger). There are a number of reviews of these issues and the associated scientific evidence.13
Over the years, patterns of ANS activity have been mapped onto several nonemotional states. Among the more durable of them have been the distinction between stimulus intake (elevated skin conductance plus heart rate deceleration) and stimulus rejection (elevated skin conductance plus heart rate acceleration),14 and the more recent distinction between the cardiovascular responses to threat (moderate increases in cardiac contractility, no change or decrease in cardiac output, and no change or increase in total peripheral resistance) and to challenge (increase in cardiac contractility, increase in cardiac output, and decrease in total peripheral resistance).15
Regardless of the putative pattern, using the existence of any particular pattern of ANS activity by itself to infer psychological or emotional states is fraught with danger. The ANS is the slave to many masters, and any ANS pattern may reflect any of a host of nonpsychological and psychological states.
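The two pattern distinctions described above can be written down as simple lookup rules, which also makes the many-to-one caveat easier to see. The Python below is only a sketch: the function names and the encoding of each measure as a direction of change (-1 for a decrease, 0 for no change, +1 for an increase) are assumptions made for illustration, and the returned labels are candidate descriptions of a pattern, not inferences about a person's psychological state.

```python
from typing import Optional


def intake_or_rejection(skin_conductance: int, heart_rate: int) -> Optional[str]:
    """Label the intake/rejection pattern from directions of change."""
    if skin_conductance <= 0:
        return None  # both patterns require elevated skin conductance
    if heart_rate < 0:
        return "stimulus intake"
    if heart_rate > 0:
        return "stimulus rejection"
    return None


def threat_or_challenge(contractility: int, cardiac_output: int,
                        peripheral_resistance: int) -> Optional[str]:
    """Label the threat/challenge pattern from directions of change."""
    if contractility <= 0:
        return None  # both patterns require increased cardiac contractility
    if cardiac_output > 0 and peripheral_resistance < 0:
        return "challenge"
    if cardiac_output <= 0 and peripheral_resistance >= 0:
        return "threat"
    return None  # mixed pattern that matches neither description


print(intake_or_rejection(skin_conductance=1, heart_rate=-1))
print(threat_or_challenge(contractility=1, cardiac_output=0,
                          peripheral_resistance=1))
```

The same directions of change could equally arise from exercise, illness, medication, or ambient temperature, so a matching label should be read as one of many possible explanations.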
The other way in which ANS monitoring has been used extensively is to detect deception. The use of autonomic measurement in lie-detection technology has a long history in law enforcement, security screening, and personnel selection. Despite its history (which continues), most of the major scientific investigations of the validity of the polygraph have raised serious reservations. For example, an independent review of the use of the polygraph commissioned by the Office of Technology Assessment concluded that16
there is at present only limited scientific evidence for establishing the validity of polygraph testing. Even where the evidence seems to indicate that polygraph testing detects deceptive subjects better than chance (when using the control question technique in specific-incident criminal investigations), significant error rates are possible, and examiner and examinee differences and the use of countermeasures may further affect validity. (p. 96)
In 2003, a review by the National Research Council was similarly critical,17 concluding that the polygraph has a better than chance but far less than perfect performance in detecting specific incidents of deception but that it is not acceptable for use in general screening and is highly vulnerable to countermeasures. In considering the use of the polygraph in antiterrorism efforts, it is important to weigh its possible utility in “guilty knowledge” situations (for example, the person being interrogated is denying knowing something that he or she knows) against the likelihood that the person will be trained in using countermeasures. Empirically, “guilty knowledge” studies indicate that the polygraph confers at best a minimal advantage in identifying such situations and suggest that guilty parties may not need to take countermeasures at all to evade detection by a polygraph.
K.2.5
Central Nervous System
The brain is clearly the source of motivated behavior, both good and evil. Thus, measuring brain activity is appealing if the goal is to detect intentions, motives, planned behaviors, allegiances, and a host of other mental states related to terrorism and terrorist acts. The electrical activity of the brain can be measured directly with electroencephalography (EEG) and indirectly with such methods as magnetoencephalography (MEG, which detects changes in magnetic fields produced by the brain’s electrical activity), positron-emission tomography (PET, which uses radioactive markers to track blood flow into the brain areas that are most active), and functional magnetic resonance imaging (fMRI, which uses strong magnetic fields to detect changes in the magnetic properties of blood flowing through the brain that occur when active brain areas use the oxygen carried by red blood cells).
The electrical activity of the brain can be monitored using these technologies while an individual is undertaking fairly complex behavioral activities and can sometimes be linked to particular discrete stimulus events. An overarching goal of the research using these methods has been to understand how and where in the brain such basic mental activities as error detection, conflict monitoring, emotion activation, and behavioral regulation occur. In most brain research, the focus has been more on specific cognitive processes than on specific emotions. Some patterns of brain activity can be used to predict when a person is experiencing emotion but not the particular emotion. In addition to emotional activation, some patterns indicate attempts at emotion regulation and control.18
Each of the existing measures of brain activity has advantages and disadvantages in temporal resolution, spatial resolution, invasiveness, susceptibility to movement artifact, methodological requirements, and expense. Much of the current excitement in the field is focused on fMRI. Viewed from the perspective of counterterrorism, fMRI presents numerous challenges: subjects must be supine in a tube for a long period (typically 15 minutes to 2 hours), temporal resolution is low, and the method is highly vulnerable to movement artifacts (movements greater than 3 mm can result in unusable images). Although the committee heard testimony about detection of deception with fMRI, the paucity of research supporting it and the considerable constraints associated with it make it difficult to imagine its having any immediate antiterrorism utility.
K.3
ASSESSING BEHAVIORAL-SURVEILLANCE TECHNIQUES
Proponents and advocates (especially vendors) often seek to demonstrate the validity of a particular approach to behavioral surveillance or deception detection by presenting evidence that it discriminates accurately between truthfulness and deception in a particular sample of examinees. Although such evidence would be necessary to accept claims of validity, it is far from sufficient.
The 2003 National Research Council report on the polygraph and lie detection19 provided a set of questions that guide the collection of credible evidence to support claims of validity of any proposed technique for deception detection and a set of characteristics of high-quality studies that address issues of accuracy. Those questions and characteristics are presented in Box K.1.
K.4
BEHAVIORAL AND DATA MINING METHODS: SIMILARITIES AND DIFFERENCES
Behavioral and data mining methods have many similarities and some key differences. Perhaps most important, they face many of the same challenges and can both be evaluated in the overall framework presented in this report. These are some characteristics that are common to the two methods:
-
Probabilistic. Data mining and behavioral surveillance seek patterns that are likely to be associated with terrorist acts. Successful methods will need to have high rates of true positives and true negatives and low rates of false positives and false negatives. Because of the low base rate of terrorism in most contexts (for example, in airport security lines), both methods will detect many acts of malfeasance that are not directly related to terrorism (for example, acts by people who have committed or are planning other crimes). The value and cost of these “true positives of another sort” must be considered in evaluating any applications of the methods.
-
Remote and secret monitoring. Data mining and some kinds of behavioral surveillance allow information to be collected and analyzed without direct interaction with those being monitored.
-
Countermeasures. Data mining and behavioral surveillance are vulnerable to countermeasures and disinformation.
-
Gateways to human judgment. Data mining and behavioral surveillance may best be viewed as ways to identify situations that require follow-up investigation by skilled interviewers, analysts, and scientists.
-
Privacy. Data mining and behavioral surveillance raise serious concerns for the protection of individual liberties and privacy.
-
Need for prior empirical demonstration. Data mining and behavioral surveillance should be deployed operationally on a wide scale only after their utility has been empirically demonstrated in the laboratory and on a limited scale in operational contexts.
-
Need for continuing evaluation. The use of data mining and of behavioral surveillance should be accompanied by a continuing process of evaluation to assess utility, accuracy and error rates, and the extent of any violation of individual privacy.
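The effect of a low base rate, noted under "Probabilistic" above, can be illustrated with Bayes' rule. The Python below is a minimal sketch; the function name and every number in it (a base rate of one in a million and a technique that is 99 percent sensitive and 99 percent specific) are hypothetical and chosen only to show the arithmetic.

```python
def positive_predictive_value(base_rate: float, sensitivity: float,
                              specificity: float) -> float:
    """Probability that a flagged person is a true positive (Bayes' rule)."""
    true_pos = base_rate * sensitivity
    false_pos = (1.0 - base_rate) * (1.0 - specificity)
    if true_pos + false_pos == 0:
        raise ValueError("no one would be flagged under these parameters")
    return true_pos / (true_pos + false_pos)


# Hypothetical: one person in a million in the screened population is of
# interest, and the technique is 99% sensitive and 99% specific.
ppv = positive_predictive_value(base_rate=1e-6, sensitivity=0.99,
                                specificity=0.99)
print(f"share of flagged people who are true positives: {ppv:.6f}")
```

Under these assumed numbers, roughly one flagged person in ten thousand is a true positive, which is why such methods are better viewed as gateways to follow-up human judgment than as determinations in themselves.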
The following are some of the important differences:
-
Collection versus analysis. Techniques for detecting deception require the collection of physiological and biological data, whereas data mining is a technique for analyzing already-collected data.
-
Degree of intrusiveness. Traditional jurisprudence and ethics generally regard a person’s body as worthy of a higher degree of protection than his or her information, residences, or possessions. Thus, techniques that require the collection of physiological and biological data (especially data relevant to one’s thoughts) are arguably more intrusive than collection schemes directed at different kinds of personal data.
BOX K.1 Questions for Assessing Validity and Characteristics of Accurate Studies
Questions for Assessing Validity
Research Methods for Demonstrating Accuracy
SOURCE: National Research Council, The Polygraph and Lie Detection, The National Academies Press, Washington, D.C., 2003, pp. 222-224.