Conclusions and Recommendations
The committee’s work was informed by a number of basic premises. These premises framed the committee’s perspective in developing this report, and they can be regarded as the assumptions underlying the committee’s analysis and conclusions. The committee recognizes that others may have their own analyses with different premises, and so for analytical rigor, it is helpful to lay out explicitly the assumptions of the committee.
Premise 1. The United States faces two real and serious threats from terrorists. The first is from terrorist acts themselves, which could cause mass casualties, severe economic loss, and social dislocation to U.S. society. The second is from the possibility of inappropriate or disproportionate responses to the terrorist threat that can do more damage to the fabric of society than terrorists would be likely to do.
The events of September 11, 2001, provided vivid proof of the damage that a determined terrorist group can inflict on U.S. society. All evidence to date suggests that the United States continues to be a prime target for such terrorist groups as Al Qaeda, and future terrorist attacks could cause major casualties, severe economic loss, and social disruption.1 The danger of future terrorist attacks on the United States is both real and serious.
At the same time, inappropriate or disproportionate responses to the terrorist threat also pose serious dangers to society. History demonstrates that measures taken in the name of improving national security, especially in response to new threats or crises, have often proven to be both ineffective and offensive to the nation’s values and traditions of liberty and justice.2 So the danger of unsuitable responses to the terrorist threat is also real and serious.
Given the existence of a real and serious terrorist threat, it is a reasonable public policy goal to focus on preventing attacks before they occur—a goal that requires detecting the planning for such attacks prior to their execution. Given the possibility of inappropriate or disproportionate responses, it is also necessary that programs intended to prevent terrorist attacks be developed and operated without undue compromises of privacy.
Premise 2. The terrorist threat to the United States, serious and real though it is, does not justify government authorities conducting activities or operations that contravene existing law.
The longevity of the United States as a stable political entity is rooted in large measure in the respect that government authorities have had for the rule of law. Regardless of the merits or inadequacies of any legal regime, government authorities are bound by its requirements until the legal regime is changed, and, in the long term, public confidence and trust in government depend heavily on a belief that the government is indeed adhering to the laws of the land. The premises above would not change even if the United States were facing exigent circumstances. If existing legal authorities (including any emergency action provisions, of which there are many) are inadequate or unclear to deal with a given situation or contingency, government authorities should seek to change the law rather than to circumvent or disobey it.
A willingness of U.S. government authorities to circumvent or disobey the law in times of emergency is not unprecedented. For example, recently declassified Central Intelligence Agency (CIA) documents indicate widespread violations of the agency’s charter and applicable law in the 1960s and 1970s, during which time the CIA conducted surveillance operations on U.S. citizens under both Democratic and Republican presidents that were undertaken outside the agency’s charter.3
The U.S. Congress has also changed laws that guaranteed confidentiality in order to gain access to individual information collected under those guarantees. For example, Section 508 of the USA Patriot Act, passed in 2001, allows the U.S. Department of Justice (DOJ) to gain access to individual information originally collected by the National Center for Education Statistics under a pledge of confidentiality. In earlier times, the Second War Powers Act of 1942 retrospectively overrode the confidentiality provisions of the Census Bureau, and it is now known that bureau officials shared individually identifiable census information with other government agencies for the purposes of detaining foreign nationals.4
Today, many laws provide statutory protection for privacy. Conforming to such protections is not only obligatory, but it also builds necessary discipline into counterterrorism efforts that serves other laudable purposes. By making the government stop and justify its effort to a senior official, a congressional committee, or a federal judge, warrant requirements and other privacy protections often help bring focus and precision to law enforcement and national security efforts. In point of fact, courts rarely refuse requests for judicial authorization to conduct surveillance. As government officials often note, one reason for these high success rates is the quality of internal decision making that the need to obtain judicial authorization demands.
Premise 3. Challenges to public safety and national security do not warrant fundamental changes in the level of privacy protection to which nonterrorists are entitled.
The United States is a strong nation for many reasons, not the least of which is its commitment to the rule of law, civil liberties, and respect for diversity. Especially in times of challenge, it is important that this commitment remain strong and unwavering. New technological circumstances may necessitate an update of existing privacy laws and policy, but privacy and surveillance law already includes means of dealing with national security matters as well as criminal law investigations. As new technologies become more commonly used, these means will inevitably require extension and updating, but greater government access to private information does not trump the commitment to the bedrock civil liberties of the nation.
3 M. Mazzetti and T. Weiner, “Files on illegal spying show C.I.A. skeletons from Cold War,” New York Times, June 27, 2007.
4 W. Seltzer and M. Anderson, “Census confidentiality under the second War Powers Act (1942-1947),” paper prepared for the Annual Meeting of the Population Association of America, March 30, 2007, Population Association of America, New York, available at http://www.uwm.edu/~margo/govstat/Seltzer-AndersonPAA2007paper3-12-2007.doc.
Note that the term “privacy” has multiple meanings depending on context and interpretation. Appendix L (“The Science and Technology of Privacy Protection”) explicates a technical definition of the term, and the term is often used in this report, as in everyday discourse, with a variety of informal meanings that are more or less consistent with the technical definition.
Premise 4. Exploitation of new science and technologies is an important dimension of national counterterrorism efforts.
Although the committee recognizes that other sciences and technologies are relevant as well, the terms of reference call for this report to focus on information technologies and behavioral surveillance techniques. The committee believes that when large amounts of information, personal and otherwise, are determined to be needed for the counterterrorist mission, the use of information technologies will be necessary and counterterrorist authorities will need to collect, manage, and analyze such information. Furthermore, it believes that behavioral surveillance techniques may have some potential for inferring intent from observed behavior if the underlying science proves sound—a capability that could be very useful in counterterrorist efforts “on the ground” if realized in the future.
Premise 5. To the extent reasonable and feasible, counterterrorist programs should be formulated to provide secondary benefits to the nation in other domains.
Counterterrorism programs are often expensive and controversial. In some cases, however, a small additional expenditure or programmatic adjustment may enable them to provide benefits that go beyond their role in preventing terrorism. Thus, they would be useful to the nation even if terror attacks do not occur. For example, hospital emergency reporting systems can improve medical care by prompt reporting of influenza, food poisoning, or other health problems, as well as alerting officials of bioterrorist and chemical attacks.
At the same time, policy makers must be aware of the phenomenon of “statutory mission creep”—in which the goals and missions of a program are expanded explicitly as the result of a specific policy action, such as congressional amendment of an existing law—and avoid its snares. In some instances, such as hospital emergency reporting systems, privacy interests may not be seriously compromised by their application to multiple missions. But in others, such as the use of systems designed for screening terrorists to identify ordinary criminals, privacy interests may be deeply implicated because of the vast and voluminous new data sets that must be brought to bear on the expanded mission. Mission creep may also go beyond the original understandings of policy makers regarding the scope and nature of a program that they initially approve, and thus effectively circumvent careful scrutiny. In some cases, a sufficient amount of mission creep may even result in a program whose operation is not strictly legal.
CONCLUSIONS REGARDING PRIVACY
The rich digital record that is made of people’s lives today provides many benefits to most people in the course of everyday life. Such data may also have utility for counterterrorist and law enforcement efforts. However, the use of such data for these purposes also raises concerns about the protection of privacy and civil liberties. Improperly used, programs that do not explicitly protect the rights of innocent individuals are likely to create second-class citizens whose freedoms to travel, engage in commercial transactions, communicate, and practice certain trades will be curtailed—and under some circumstances, they could even be improperly jailed.
Conclusion 1. In the counterterrorism effort, some degree of privacy protection can be obtained through the use of a mix of technical and procedural mechanisms.
The primary goal of the nation’s counterterrorism effort is to prevent terrorist acts. In such an effort, identification of terrorists before they act becomes an important task, one that requires the accurate collection and analysis of their personal information. However, an imperfect understanding of which characteristics to search for, not to mention imperfect and inaccurate data, will necessarily draw unwarranted attention to many innocent individuals.
Thus, records containing personal information of terrorists cannot be examined without violating the privacy of others, and so absolute privacy protection (in the sense that the privacy of nonterrorists cannot be compromised) is not possible if terrorists are to be identified.
This technical reality does not preclude putting into place strong mechanisms that provide substantial privacy protection. In particular, restrictions on the use of personal information ensure that innocent individuals are strongly protected during the examination of their personal information, and strong and vigorous oversight and audit mechanisms can help to ensure that these restrictions are obeyed.
How much privacy protection is afforded by technical and procedural mechanisms depends on critical design features of both the technology and the organization that uses it. Two examples of relevant technical mechanisms are encryption of all data transports to protect against accidental loss or compromise and individually logged5 audit records that retain details of all queries, including those made by fully authorized individuals to protect against unauthorized use.6 But the mere presence of such mechanisms does not ensure that they will be used, and such mechanisms should be regarded as one enabler—one set of necessary but not sufficient tools—for the robust independent program oversight described in Recommendation 1c below.
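As an illustration only, the second technical mechanism (audit records that retain the details of every query, including those made by fully authorized individuals) might be sketched as follows. The class name `AuditLog`, its fields, and the use of hash chaining to make later alteration detectable are assumptions introduced for this sketch; the report does not prescribe any particular design.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical field names chosen for this sketch.
_FIELDS = ("analyst", "purpose", "query", "prev")


def _digest(body: dict) -> str:
    """Stable SHA-256 digest of one audit entry's contents."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only query log in which each entry records the hash of its predecessor."""
    entries: list = field(default_factory=list)

    def record(self, analyst: str, purpose: str, query: str) -> dict:
        """Log a query before it runs, whoever issues it."""
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"analyst": analyst, "purpose": purpose, "query": query, "prev": prev}
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered or removed after the fact."""
        prev = "0" * 64
        for entry in self.entries:
            body = {k: entry[k] for k in _FIELDS}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True
```

A log of this kind only supports oversight; someone independent of the analysts still has to run `verify()` and read the entries.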
Relevant procedural mechanisms include restrictions on data collection and restrictions on use. In general, such mechanisms govern important dimensions of information collection and use, including an explication of what data are collected, whether collection is done openly or covertly, how widely the data are disseminated, how long they are retained, the decisions for which they are used, whether the processing is performed by computer or human, and who has the right to grant permissions for subsequent uses.
Historically, privacy from government intrusion has been protected by limiting what information the government can collect: voice conversations collected through wiretapping, e-mail collected through access to stored data (authorized by the Electronic Communications Privacy Act, passed in 1986 and codified as 18 U.S.C. 2510), among others. However, in many cases today, the data in question have already been collected and access to them, under the third-party business records doctrine, will be readily granted with few strings attached. As a result, there is great potential for privacy intrusion arising from analysis of data that are accessible to government investigators with little or no restriction or oversight. In other words, powerful investigative techniques with significant privacy impact proceed in full compliance with existing law—but with significant unanswered privacy questions and associated concerns about data quality.
Analytical techniques that may be justified for the purpose of national security or counterterrorism investigations, even given their potential power for privacy intrusion, must come with assurances that the inferences drawn against an individual will not then be used for normal domestic criminal law enforcement purposes. Hence, what is called for, in addition to procedural safeguards for data quality, is a set of usage limitations that provide for full exploitation of new investigative tools when needed (and justified) for national security purposes, but that prevent those same inferences from being used in criminal law enforcement activity.
An example—for illustration only—of the latter is the use of personal data for airline passenger screening. Privacy advocates have often expressed concerns that the government use of large-scale databases to identify passengers who pose a potential risk to the safety of an airplane could turn into far-reaching enforcement mechanisms for all manner of offenses, such as overdue tax bills or child support payments. One way of dealing with this privacy concern would be to apply a usage-limiting privacy rule that allows the use of databases for the purpose of counterterrorism but prohibits the use of these same databases and analysis for domestic law enforcement. Those suspicious of government intentions are likely to find a rule limiting usage rather less comforting than a rule limiting collection, out of concern that government authorities will find it easier to violate a rule limiting usage than a rule limiting collection. Nevertheless, well-designed and diligently enforced auditing and oversight processes may help over time to provide reassurance that the rule is being followed as well as to provide some actual protection for individuals.
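A usage-limiting rule of this kind can be expressed as a simple check applied before any access is granted. The sketch below is illustrative only; the data set name `passenger_screening`, the `Purpose` values, and the policy table are hypothetical and are not drawn from any actual program.

```python
from enum import Enum


class Purpose(Enum):
    COUNTERTERRORISM = "counterterrorism"
    DOMESTIC_LAW_ENFORCEMENT = "domestic_law_enforcement"


# Hypothetical policy table: the purposes each data set may serve.
PERMITTED_USES = {
    "passenger_screening": {Purpose.COUNTERTERRORISM},
}


class UsageViolation(Exception):
    """Raised when a data set is requested for a purpose the rule prohibits."""


def authorize(dataset: str, purpose: Purpose) -> None:
    """Allow the request only if the rule lists this purpose for the data set."""
    if purpose not in PERMITTED_USES.get(dataset, set()):
        raise UsageViolation(f"{dataset} may not be used for {purpose.value}")
```

The check is only as strong as its enforcement, which is why the text pairs usage rules with auditing and oversight.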
Finally, in some situations, improving citizens’ privacy can also improve their security, and vice versa. For example, improvements in the quality of data (i.e., more complete, more accurate data) used in identifying potential terrorists are likely to increase security by enhancing the effectiveness of information-based programs to identify terrorists and to decrease the adverse consequences that may occur due to confidentiality violations for the vast majority of innocent individuals. In addition, strong audit controls that record the details of all accesses to sensitive personal information serve both to protect the privacy of individuals and to reduce barriers to information sharing between agencies or analysts. (Agencies or analysts are often reluctant to share information, even among themselves, because they feel a need to protect sources and methods, and audit controls that limit information access provide a greater degree of reassurance that sensitive information will not be improperly distributed.)
Conclusion 2. Data quality is a major issue in the protection of the privacy of nonterrorists.
As noted in Chapter 1, the issue of data quality arises internally as a result of measurement errors within databases and also as a consequence of efforts to link data or records across databases in the absence of clear, unique identifiers. Sharing personal information across agencies, even with “names” attached, offers no assurances that the linked data are sufficiently accurate for counterterrorism purposes; indeed, there are no metrics for accuracy that appear to be systematically used to assess such linking efforts.
Data of poor quality severely limit the value of data mining in a number of ways. First, the actual characteristics of individuals are often collected in error for a wide array of reasons, including definitional problems, identity theft, and misresponse on surveys.
These errors could obviously result in individuals being inaccurately represented by data mining algorithms as a threat when they are not (with the consequence that personal and private information about them might be inappropriately released for wider scrutiny). Second, poor data quality can be amplified during file matching, resulting in the erroneous merging of information for different individuals into a single file. Again, the results can be improper treatment of individuals as terrorist threats, but here the error is compounded, since entire clusters of information are now in error with respect to the individual who is linked to the information in the merged file.
Such problems are likely to be quite common and could greatly limit the utility of data mining methods used for counterterrorism. There are no obvious mechanisms for rectifying the current situation, other than collecting similar information from multiple sources and using the duplicative nature of the information to correct inaccuracies. However, given that today the existence of alternate sources is relatively infrequent, correcting individual errors will be extraordinarily difficult.
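The idea of using duplicative information to correct inaccuracies can be sketched as a simple agreement rule across sources. This is a minimal illustration under stated assumptions: the function name `reconcile`, the normalization step, and the two-source agreement threshold are all hypothetical choices, not a method described in this report.

```python
from collections import Counter
from typing import List, Optional


def reconcile(values: List[str], min_agreement: int = 2) -> Optional[str]:
    """Return the value that independent sources agree on, or None if they do not.

    Each element of `values` is the same attribute (for example, a date of
    birth) as reported by a different source.
    """
    counts = Counter(v.strip().lower() for v in values if v and v.strip())
    if not counts:
        return None
    best, n = counts.most_common(1)[0]
    tied = [v for v, c in counts.items() if c == n]
    # With too little agreement, or with a tie, there is no basis for correction.
    if n < min_agreement or len(tied) > 1:
        return None
    return best
```

With a single source the function returns `None`, which mirrors the point made above: where alternate sources do not exist, an individual error cannot be detected this way.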
Distinctions Between Capability and Intent
Conclusion 3. Inferences about intent and/or state of mind implicate privacy issues to a much greater degree than do assessments or determinations of capability.
Although it is true that capability and intent are both needed to pose a real threat, determining intent on the basis of external indicators is inherently a much more subjective enterprise than determining capability. Determining intent or state of mind is inherently an inferential process, usually based on indicators such as whom one talks to, what organizations one belongs to or supports, or what one reads or searches for online. Assessing capability is based on such indicators as purchase or other acquisition of suspect items, training, and so on. Recognizing that the distinction between capability and intent is sometimes unclear, it is nevertheless true that placing people under suspicion because of their associations and intellectual explorations is a step toward abhorrent government behavior, such as guilt by association and thought crime. This does not mean that government authorities should be categorically proscribed from examining indicators of intent under all circumstances—only that special precautions should be taken when such examination is deemed necessary.
CONCLUSIONS REGARDING THE ASSESSMENT OF COUNTERTERRORISM PROGRAMS
Conclusion 4. Program deployment and use must be based on criteria more demanding than “it’s better than doing nothing.”
In the aftermath of a disaster or terrorist incident, policy makers come under intense political pressure to respond with measures intended to prevent the event from occurring again. The policy impulse to do something (by which is usually meant something new) under these circumstances is understandable, but it is simply not true that doing something new is always better than doing nothing. Indeed, policy makers may deploy new information-based programs hastily, without a full consideration of (a) the actual usefulness of the program in distinguishing people or characteristic patterns of interest for follow-up from those not of interest, (b) an assessment of the potential privacy impacts resulting from the use of the program, (c) the procedures and processes of the organization that will use the program, and (d) countermeasures that terrorists might use to foil the program.
The committee developed the framework presented in Chapter 2 to help decision makers determine the extent to which a program is effective in achieving its intended goals, compliant with the laws of the nation, and reflective of the values of society, especially with regard to the protection of data subjects’ privacy. This framework is intended to be applied by taking into account the organizational and human contexts into which any given program will be embedded as well as the countermeasures that terrorists might take to foil the program.
The framework is discussed in greater detail in Chapter 2.
CONCLUSIONS REGARDING DATA MINING7
Policy and Law Regarding Data Mining
Conclusion 5. The current policy regime does not adequately address violations of privacy that arise from information-based programs using advanced analytical techniques, such as state-of-the-art data mining and record linkage.
For example, an activity for counterterrorist purposes, possibly a data mining activity, is likely to require the linking of data found in multiple databases. The literature on record linkage suggests that, even assuming the data found in any given database to be of high quality, the data derived from linkages (the “mosaic” consisting of the collection of linked data) are likely to be error-prone. Certainly, the better the quality of the individual lists, the fewer the errors that will be made in record linkage, but even with high-quality lists, the percentage of false matches and false nonmatches may still be uncomfortably high. In addition, certain data mining algorithms are less sensitive to record linkage errors as inputs, since they use redundant information in a way that can, at times, identify such errors and downweight or delete them. Again, even in the best circumstances, such problems are currently extremely difficult to overcome. Error-prone data are, of course, both a threat to privacy (as innocent individuals are mistakenly associated with terrorist activity) and a threat to effectiveness (as terrorists are overlooked because they have been hidden by errors in the data that would have suggested a terrorist connection).
7 Additional observations about data mining are contained in Appendix H.
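A rough calculation suggests why a linked mosaic can be error-prone even when each individual link is fairly reliable. The sketch below assumes, purely for illustration, that linkage errors are independent across links; the accuracy figures are invented and are not estimates from the record linkage literature.

```python
import math
from typing import Sequence


def mosaic_accuracy(link_accuracies: Sequence[float]) -> float:
    """Probability that every link in a mosaic is correct.

    Assumes each link is right or wrong independently of the others,
    which is a simplification made only for this illustration.
    """
    return math.prod(link_accuracies)


# Hypothetical accuracies for three successive linkage steps.
example = mosaic_accuracy([0.98, 0.95, 0.90])
```

With these assumed figures the mosaic is fully correct only about 84 percent of the time, which is worse than the weakest single link.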
The committee also notes that the use of analytical techniques such as data mining is not limited to government purposes; private parties, including corporations, criminals, divorce lawyers, and private investigators, also have access to such techniques. The large-scale availability of data and advanced analytical techniques to private parties carries clear potential for abuses of various kinds that might lead to adverse consequences for some individuals, but a deep substantive examination of this issue is outside the primary focus of this report on government policy.
The Promise and Limitations of Data Mining
Chapter 1 (in Section 1.6.1) notes that data mining covers a wide variety of analytical approaches for using large databases for counterterrorist purposes, and in particular it should be regarded as being much broader than the common notion of a technology underlying automated terrorist identification.
Conclusion 6. Because data mining has proven to be valuable in private-sector applications, such as fraud detection, there is reason to explore its potential uses in countering terrorism. However, the problem of detecting and preempting a terrorist attack is vastly more difficult than problems addressed by such commercial applications.
As illustrated in Appendix H (“Data Mining and Information Fusion”), data mining has proven valuable in a number of private-sector applications. But the data used by analysts to track sales, banks to assess loan applications, credit card companies to detect fraud, and telephone companies to detect fraud are fundamentally different from counterterrorism data. For example, private-sector applications generally have access to a substantial amount of relatively complete and structured data. In some cases, their data are more accurate than government data, and, in others, large volumes of relevant data sometimes enable statistical techniques to compensate8 to some extent for data of lower quality; either way, the data-cleaning effort required is reduced. In addition, a few false positives and false negatives are acceptable in private-sector applications, because a few false positives can usually be cleared up by contact with clients without a significant draw on resources, and a few false negatives are tolerable. Ground truth (that is, knowledge of what is actually true that can be used to validate or verify a new measurement or technique) is available in many private-sector applications, a point that enables automated learning and refinement to take place. All of the relevant data are available at once in private-sector applications.
These attributes are very different in the counterterrorism domain. Ground truth is rarely available in tracking terrorists, in large part because terrorists and terrorist activity are rare. Data specifically associated with terrorists (targeted collection efforts) are sparse and mostly collected in unstructured form (free text, video, audio recordings). The availability of much of the relevant data depends on the specific nature of data collected earlier (e.g., information may be needed to obtain a search warrant that then leads to additional information). Data tracks of terrorists in commercial and government administrative databases (as contrasted with government intelligence databases) are co-mingled with enormously larger volumes of similar data associated with innocent individuals, and they are not in any way apparent or obvious from the fact of their collection—that is, it is generally unknown who is a terrorist in any such database. And links among records in databases of varying accuracy will tend to reflect accuracies characteristic of the most inaccurate of the databases involved.
Such differences are not described here to argue that data mining for counterterrorist applications is ipso facto unproductive or operationally useless. But the existence of these differences underscores the difficulty of productively applying data mining techniques in the counterterrorist domain.
Conclusion 7. The utility of pattern-based data mining is found primarily if not exclusively in its role in helping humans make better decisions about how to deploy scarce investigative resources, and action (such as arrest, search, denial of rights) should never be taken solely on the basis of a data mining result. Automated terrorist identification through data mining (or any other known methodology) is neither feasible as an objective nor desirable as a goal of technology development efforts.
As noted in Appendix H, subject-based data mining and pattern-based data mining have very different characteristics. The common example of pattern-based data mining is what might be called automated terrorist identification, by which is meant an automated process that examines large databases in search of any anomalous pattern that might indicate a terrorist plot in the making. Automated terrorist identification is not technically feasible because the notion of an anomalous pattern, in the absence of some well-defined ideas of what might constitute a threatening pattern, is likely to be associated with many more benign activities than terrorist activities. In this situation, the number of false leads is likely to exhaust any reasonable limit on investigative or analytical resources. For these reasons, the desirability of technology development efforts aimed at automated terrorist identification is highly questionable.
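The point about false leads can be made concrete with simple base-rate arithmetic. The figures below (population size, number of true cases, and detection rates) are invented for illustration and are not estimates from this report; the function name `expected_leads` is likewise hypothetical.

```python
from typing import Tuple


def expected_leads(
    population: int,
    true_cases: int,
    true_positive_rate: float,
    false_positive_rate: float,
) -> Tuple[float, float, float]:
    """Expected true hits, false leads, and the share of leads that are genuine."""
    true_hits = true_cases * true_positive_rate
    false_leads = (population - true_cases) * false_positive_rate
    share_genuine = true_hits / (true_hits + false_leads)
    return true_hits, false_leads, share_genuine


# Hypothetical scenario: a very accurate screen applied to a very rare condition.
hits, false_leads, share = expected_leads(
    population=300_000_000,
    true_cases=3_000,
    true_positive_rate=0.99,
    false_positive_rate=0.001,
)
```

Under these assumptions, roughly 300,000 innocent individuals would be flagged alongside fewer than 3,000 genuine cases, so only about 1 lead in 100 would be genuine even though the screen errs on just 0.1 percent of innocent records.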
Other kinds of pattern-based data mining may be useful in helping analysts to search for known patterns of interest (i.e., when they have a basis for believing that such a pattern may signal terrorist intent). For example, analysts may determine that a pattern is suggestive of terrorist activity on the basis of historical experience. By searching for patterns known to be associated with (prior) terrorist incidents, it may well be possible to uncover tangible and useful evidence of similar terrorist plots in the making. The significance of uncovering such plots, even if they are similar to those that have occurred in the past, should not be underestimated. Terrorists learn from their past failures and successes, and to the extent that counterterrorist activities can force them to develop new—and unproven—approaches, they will be placed at a significant disadvantage.
Patterns of interest may also be identified by analysts thinking about sets of activities that are indicative of or associated with terrorist activity, even if there is no historical precedent for such associations. Under some circumstances, terrorists might well be limited in the options they might pursue in attacking a specific target. If so, it might be reasonable to search for patterns associated with the planning and execution of those options.
Still, patterns of interest identified using these techniques should be regarded as indicative rather than authoritative, and they should be used only to suggest when further investigation may be warranted rather than as definitive indications of terrorist activity. The committee believes that data mining routines should never be the sole arbiter prior to actions that have a substantial impact on people’s lives. Data mining should be used to help humans make decisions when the combination of human judgment and automated data mining results in better decisions than human judgment alone. But even when this is the case, it does not negate the fact that data mining routines, on their own, can make obvious mistakes in deciding the rankings and that the use of human judgment can dramatically reduce the rate of errors.
Conclusion 8. Although systems that support analysts in the identification of terrorists can be designed with features and functionality that enhance privacy protection without significant loss to their primary mission, privacy-preserving examination of individually identifiable records is fundamentally a contradiction in terms.
Systems can often be designed in ways that enhance privacy without compromising their primary mission. For example, in searching for a weapon at a checkpoint, a scanner might generate anatomically correct images of a person’s body in graphic detail. Since what is of interest is not those images but rather the presence or absence of weapons, a system could be designed to detect the presence or absence of a weapon in a particular scan and that fact (presence or absence) reported rather than the image itself. Procedural protections could also be put into place: for example, an individual might be given the choice of going through an imaging scanner or undergoing a pat-down search. (Note also that a different and broader set of privacy implications arises if images are stored for further use, as they may well be for system assessments.)
Nevertheless, in the absence of a near-perfect profile of a terrorist, it is not possible, even in principle, to somehow examine the records of an individual (who might or might not be a terrorist) but to expose those records only if he or she actually is a terrorist. (A profile of a terrorist is intended to enable the sorting of individuals into those who match the profile and those who do not. If the profile is perfect, and the data contained in individual records are entirely accurate, all of those who match can be regarded with certainty as terrorists and all of those who do not match can be regarded with certainty as nonterrorists. In practice, profiles are never perfect and data are not entirely accurate, and so the notion of degrees of match is much more relevant than the notion of simply match or nonmatch.)
As a result, any realistic system examining databases containing information about terrorists will bring a mix of terrorists and nonterrorists to the attention of analysts, who will decide whether these individuals warrant further investigation. “Further investigation” in this nonroutine context necessarily results in an examination of the private personal information for these individuals, and it may result in tangible inconvenience and loss of various freedoms.
Conclusion 9. Research and development on data mining techniques using real population data are inherently invasive of privacy to some extent.
Much of data mining is focused on looking for patterns of behavior, characteristics, or transactions that are a priori plausible (i.e., plausible on the basis of expert judgment and experience) as possible indicators of terrorist activity. But these expert judgments about patterns of interest must be empirically valid if they are to have significant operational utility, where validity is measured by a high true positive rate in identifying terrorist activity and a low false positive rate.
On one hand, a degree of empirical validity can be obtained through the use of synthetic and anonymized data or historical data. For example, large population databases can be seeded with data created to resemble data associated with real terrorist activity. Although such data are, by definition, based on assumptions about the nature and character of terrorist activities, the expert judgment of experienced counterterrorism analysts can provide such data with significant face validity.9 When various algorithms are tested in this environment, the simulated terrorist signatures provide a measure of ground truth against which the performance of each data mining approach can be assessed.
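The seeding approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the use of a single "degree of match" score per record, the beta distributions that generate the scores, and the population sizes. It is not a description of any actual program. It shows only how seeded records supply ground truth from which a true positive rate, a false positive rate, and the share of flagged records that are actually seeded can be computed.

```python
import random

def make_population(n_background, n_seeded, rng):
    """Build a toy synthetic population of (score, is_seeded) pairs.

    Scores are hypothetical "degree of match" values in [0, 1]. Seeded
    records are drawn from a distribution shifted toward higher scores;
    both distributions are illustrative assumptions.
    """
    background = [(rng.betavariate(2, 8), False) for _ in range(n_background)]
    seeded = [(rng.betavariate(8, 2), True) for _ in range(n_seeded)]
    return background + seeded

def evaluate(population, threshold):
    """Return (true positive rate, false positive rate, precision) at a threshold."""
    tp = sum(1 for score, seeded in population if seeded and score >= threshold)
    fn = sum(1 for score, seeded in population if seeded and score < threshold)
    fp = sum(1 for score, seeded in population if not seeded and score >= threshold)
    tn = sum(1 for score, seeded in population if not seeded and score < threshold)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fpr, precision

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    population = make_population(100_000, 20, rng)
    for threshold in (0.5, 0.7, 0.9):
        tpr, fpr, precision = evaluate(population, threshold)
        print(f"threshold={threshold:.1f} TPR={tpr:.2f} "
              f"FPR={fpr:.4f} precision={precision:.4f}")
```

Because seeded records are a tiny fraction of the population, the precision figure is worth reporting alongside the two rates: even a small false positive rate can mean that most flagged records belong to the background population.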
On the other hand and by definition, the use of synthetic data to simulate terrorist signatures does not provide real-world empirical validation. Only real data can be the basis for real-world empirical validation. Thus, another approach is to use historical data on terrorists. For example, a great deal is known today about the actual behavioral and activity signatures of the September 11, 2001, terrorists. Seeding large population databases with such data and requiring various algorithms to identify known terrorists provide a complementary approach to validation.
The use of historical data on terrorists is limited in one fundamental respect: it does not account for unprecedented events. But it is entirely reasonable to suggest that the successful application of proposed tools and techniques to known past events is a minimum and necessary (though not sufficient) metric of success.
Using real population databases—large databases filled with actual behavioral and activity data on actual individuals—presents a serious privacy issue. Almost all of these individuals will have no connection to terrorists, and the use of such data in this context means that their private personal information will indeed be compromised.
It is a policy decision as to whether the risks to privacy inherent in conducting research and development (R&D) on data mining techniques for counterterrorism using real population data are outweighed by the potential operational value of using those techniques. The committee recommends that such R&D be conducted on synthetic data (see Section 3.7), but if the decision is made to use real population data, the committee urges that policy makers face, acknowledge, and report on this issue explicitly.
CONCLUSIONS REGARDING DECEPTION DETECTION AND BEHAVIORAL SURVEILLANCE
Conclusion 10. Behavioral and physiological monitoring techniques might be able to play an important role in counterterrorism efforts when used to detect (a) anomalous states (individuals whose behavior and physiological states deviate from norms for a particular situation) and (b) patterns of activity with well-established links to underlying psychological states.
Scientific support for linkages between behavioral and physiological markers and mental state is strongest for elementary states (simple emotions, attentional processes, states of arousal, and cognitive processes), weak for more complex states (deception), and nonexistent for highly complex states (terrorist intent and beliefs). The status of the scientific evidence, the risk of false positives, and vulnerability to countermeasures argue for behavioral observation and physiological monitoring to be used at most as a preliminary screening method for identifying individuals who merit additional follow-up investigation. Indeed, there is no consensus in the relevant scientific community nor on the committee regarding whether any behavioral surveillance or physiological monitoring techniques are ready for use at all in the counterterrorist context given the present state of the science.
Conclusion 11. Further research is warranted for the laboratory development and refinement of methods for automated, remote, and rapid assessment of behavioral and physiological states that are anomalous for particular situations and for those that have well-established links to psychological states relevant to terrorist intent.
A number of techniques have been proposed for the machine-assisted detection of certain behavioral and physiological states. For example, advances in magnetic resonance imaging (MRI), electroencephalography (EEG), and other modern techniques have enabled measures of changes in brain activity associated with thoughts, feelings, and behaviors.10 Research in image analysis has yielded improvements in machine recognition of faces under a variety of circumstances (e.g., when a face is smiling or when it is frowning) and environments (e.g., in some nonlaboratory settings).
However, most of the work is still in the basic research stage, with much of the underlying science still to be validated or determined. If real-world utility of these techniques is to be realized, a number of issues—practical, technical, and fundamental—will have to be addressed, such as the limits to understanding, the largely unknown measurement validity of new technologies, the lack of standardization in the field, and the vulnerability to countermeasures. Public acceptability regarding the privacy implications of such techniques also remains to be demonstrated, especially if the resulting data are stored for unknown future uses or undefined lengths of time.
For example, the current state of the art in functional MRI technology can identify changes in the hemodynamics in certain regions of the brain, thus signaling activity in those regions. But such results are not necessarily consistent across individuals (i.e., different areas in the brains of different individuals may be active under the same stimulus) or even in the same individual (i.e., a slightly different part of the brain may become active even in the same individual under the same stimulus). Certain regions of the brain may be active under a variety of different stimuli. In short, understanding of what these regions do is still primitive. Furthermore, even if simple associations can be made reliably in laboratory settings, this does not necessarily translate into usable technology in less controlled situations. Behavior of interest to detect, such as terrorist intent, occurs in an environment that is very different from the highly controlled behavioral science laboratory.
Conclusion 12. Technologies and techniques for behavioral observation have enormous potential for violating the reasonable expectations of privacy of individuals.
Because the inferential chain from behavioral observation to possible adverse judgment is both probabilistic and long, behavioral observation has enormous potential for violating the reasonable expectations of privacy of individuals. It would not be unreasonable to suppose that most individuals would be far less bothered and concerned by searches aimed at finding tangible objects that might be weapons or by queries aimed at authenticating their identity than by technologies and techniques whose use will inevitably force targeted individuals to explain and justify their mental and emotional states. Even if behavioral observation and physiological monitoring are used only as a preliminary screening method for identifying individuals who merit additional follow-up investigation, these individuals will be subject to suspicion that would not fall on others not so identified.
CONCLUSIONS REGARDING STATISTICAL AGENCIES
Conclusion 13. Census and survey data collected by the federal statistical agencies are not useful for terrorism prevention: such data have little or no content that would be useful for counterterrorism. The content and sampling fractions of household surveys as well as the lack of personal identifiers make it highly unlikely that these data sets could be linked with any reasonable degree of precision to other databases of use in terrorism prevention.
The content of the data collected by the federal statistical agencies under the auspices of survey and census programs is generally inconsistent with the needs of counterterrorist activities, which require individually identifiable data. Even ignoring issues of access, the value of the data collected on national household or business surveys for terrorism prevention is minimal.
The reasons are several:
Censuses collect little information beyond name, address, and basic demographic data on age, sex, and race; such data are unlikely to be of much value for identifying terrorists or terrorist behavior.
Because a substantial proportion of individuals move frequently, the 10-year cycle of censuses means that the census information is unlikely to be timely, even in supplying current addresses.
The census long form, which has been collected on a sample basis, and its successor program, the American Community Survey (ACS), have more information but still very little that is directly relevant to predicting terrorist activity. Moreover, because these data are collected only for a sample, the probability that those of interest would be in the sample for a given year of the ACS is very slight, and, furthermore, the ability to match files without identifiers into other record systems would be limited. At best, these data might provide background information to provide a description of the socioeconomic make-up of a clustered group of blocks.
Other household surveys also collect little of direct relevance to terrorism prevention, and because they typically draw on much less than 1 percent of the population, the chances of identifying new information on an individual of interest are rather low.
Regarding establishment surveys, for terrorism detection one might be interested in businesses that have increased activity with people in various parts of the world, but such information is not contained in federal statistical system business censuses and surveys.
A variety of surveys collect information relevant to crime prevention and public health. Data collections on criminal activity, such as the National Crime Victimization Survey and the Uniform Crime Reports, contain data on victims of crime, and they are most useful in identifying geographic areas in which such criminal activity seems to be prevalent. Health surveys, such as the National Health Interview Survey, the National Health and Nutrition Examination Survey, and the National Ambulatory Medical Care Survey (largely collected by the National Center for Health Statistics) have value in broader public health programs, but they cannot provide timely information for purposes of biosurveillance or for addressing a bioterrorist attack.
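The sampling-fraction argument in the reasons listed above can be made concrete with a short calculation. The sketch below assumes simple random sampling in which each person is selected independently with the same probability; real survey designs are more complex, and the function names and any fractions passed to them are hypothetical, apart from the "less than 1 percent" bound mentioned above for household surveys.

```python
def prob_at_least_one_sampled(sampling_fraction, persons_of_interest):
    """Probability that at least one of several persons of interest is in the sample.

    Assumes each person is sampled independently with probability
    `sampling_fraction` (an illustrative simplification).
    """
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be between 0 and 1")
    if persons_of_interest < 0:
        raise ValueError("persons_of_interest must be non-negative")
    return 1.0 - (1.0 - sampling_fraction) ** persons_of_interest

def expected_number_sampled(sampling_fraction, persons_of_interest):
    """Expected count of persons of interest who fall in the sample."""
    return sampling_fraction * persons_of_interest

if __name__ == "__main__":
    # 1 percent is the upper bound cited for household surveys; the
    # smaller fraction and the group sizes are hypothetical.
    for fraction in (0.01, 0.001):
        for group_size in (1, 10):
            p = prob_at_least_one_sampled(fraction, group_size)
            print(f"fraction={fraction} group_size={group_size} "
                  f"P(at least one sampled)={p:.4f}")
```

Under these assumptions the chance that any particular individual appears in a given survey is simply the sampling fraction itself, which is why such surveys are a poor source of new information about a specific person.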
In addition, statistical agencies typically collect information under a promise of confidentiality, and the costs of altering or relaxing the rules for confidentiality protection are quite substantial. Reneging on such officially provided assurances could decrease respondents’ willingness to cooperate and thereby substantially reduce the quality of the data collected, resulting in much poorer data on the state of the nation.11
Aside from census and survey data, statistical agencies also hold considerable administrative data (which they have collected from other agencies); such data may be merged with data collected for statistical purposes and thus create the potential for data sets and databases that could at some point conceivably be useful for purposes of counterterrorism. While these derived data sets are currently protected by pledges of confidentiality if any of the component data sets are so protected, some additional consideration needs to be given to such constructs and how to respond to requests for them from other government agencies.
In light of the conclusions presented above, the committee has two central recommendations. The first recommendation has subparts a-d.
Systematic Evaluation of Every Information-Based Counterterrorism Program
Recommendation 1. U.S. government agencies should be required to follow a systematic process (such as the one described in the framework proposed in Chapter 2) to evaluate the effectiveness, lawfulness, and consistency with U.S. values of every information-based program, whether classified or unclassified, for detecting and countering terrorists before it can be deployed, and periodically thereafter.
Appendix J (“The Total/Terrorist Information Awareness Program”) recounts the story of the Total Information Awareness (TIA) program of the Defense Advanced Research Projects Agency (DARPA) and the intense controversy it engendered, which was a motivation for launching this study. The committee notes that in December 2003, the Department of Defense (DOD) inspector general’s (IG) audit of TIA concluded that the failure to consider privacy adequately during the early development of TIA led DOD to “risk spending funds to develop systems that may not be either deployable or used to their fullest potential without costly revision.”12 The DOD-IG report noted that this was particularly true with regard to the potential deployment of TIA for law enforcement: “DARPA need[ed] to consider how TIA will be used in terms of law enforcement to ensure that privacy is built into the developmental process.”13 Greater consideration of how the technology might be used not only would have served privacy but also would probably have contributed to making TIA more useful.
The committee believes that a systematic approach to the development, procurement, and use of information-based counterterrorism programs is necessary if their full value is to be obtained. The framework developed by the committee and provided in Chapter 2 is intended as a template for government decision makers to use in evaluating the effectiveness, appropriateness, and validity of every information-based counterterrorism program and system. The U.S. Department of Homeland Security (DHS), and all agencies of the U.S. government with counterterrorism responsibilities, should adopt the framework described in Chapter 2, or one similar to it, as a central element in their decision making about new deployments and existing programs in use. Failure to adopt such a systematic approach is likely to result in reduced operational effectiveness, wasted resources, privacy violations, mission creep, and diminished political support, not only for those programs but also for similar and perhaps for not-so-similar programs across the board.
To facilitate accountability, such evaluations (and the data on which they are based) should be made available to the broadest audience possible. Broad availability implies that these evaluations should be unclassified to the maximum extent possible—but even if evaluations are classified, they should still be performed and should be made available to those with the requisite clearances.
Such evaluations should be independent and comprehensive, and in particular they should assess both program effectiveness and privacy together, involving independent experts with the necessary technical, legal, and policy expertise to understand each of these areas and how interactions among them might affect the evaluation. For example, the meaning of privacy is in part technical, and an assessment of privacy cannot be left exclusively to individuals lacking such technical understanding.
Chapter 2 noted that much of the committee’s framework is not new and also that government decision makers have failed to implement many of the guidelines embedded in the framework even when they have been cognizant of them. It is the committee’s hope that by presenting to policy makers a comprehensive framework independent of any particular program, the pressures and exigencies associated with specific crises can be removed from the consideration and adoption of such a framework for application to all programs.
The committee also calls attention to four subrecommendations that derive from Recommendation 1.
Recommendation 1a. Periodically after a program has been operationally deployed, and in particular before a program enters a new phase in its life cycle, policy makers should apply a framework such as the one proposed in Chapter 2 to the program before allowing it to continue operations or to proceed to the next phase.
A systematic approach such as the framework in Chapter 2 is not intended to be applied only once in the life cycle of any given program. As noted in Appendix D (“The Life Cycle of Technology, Systems, and Programs”), a program undergoes a number of different phases in its lifetime: identification of initial needs, research and technology development, systems development, and operational deployment and continual operational monitoring. Each of these phases provides a desirable opportunity for applying the framework to help decide whether and how the program should transition to the next phase. Each of the framework’s questions should still be asked. But the answers to those questions as well as the interpretation of the answers will vary depending on the phase. Such a review may result in a significant modification or even a cancellation of a given program.
The committee calls special attention to the importance of operational monitoring, whose purpose is to ensure that the initial deployed capability remains both effective at contributing to the mission for which it was designed and acceptable from a privacy standpoint. Often after initial deployment, the operational environment changes. Improved base technologies or entirely new technologies become available. Existing threat actors change their tactics, or entirely new threats emerge. Executive branch policies change, or new administrations take office. Analysts gain experience, or new analysts arrive. Interpretations of existing law change through court decisions, or new legislation is passed. Data-based models may change simply because more data have become available that change the parameters and estimates on which the models are based. Error rates may change for similar reasons. Because every program is necessarily embedded in this milieu, the net result is that successful programs are almost always dynamic, in that they evolve in response to such changing circumstances.
An evolved program is, by definition, not the same as the original program—and it is a fair question to ask whether the judgments made about any program in its original form would be valid for an evolved program. For these reasons, a policy regime is necessary that provides for periodic reassessment and reevaluation of a program after initial deployment, at the same time promoting and fostering necessary changes—whether technological, procedural, legal, ethical, or other.
Recommendation 1a applies with particular force to programs already in existence: programs that exist today, and especially those that are operationally deployed today, should be evaluated against the framework. To the best of the committee’s knowledge, no such evaluations have been performed for any data mining or deception detection programs in operation, although this is not to say that none have been done. If such evaluations have been performed, they should be made available to policy makers (senior officials in the executive branch or the U.S. Congress), and if possible, the public as well. If not, they should be undertaken with all due speed. And if they cannot be performed without access to classified information, an independent group of experts with the requisite clearances should be chartered to perform such assessments.
Recommendation 1b. To protect the privacy of innocent people, the research and development of any information-based counterterrorism program should be conducted with synthetic population data. If and when a program meets the criteria for deployment in the committee’s illustrative framework described in Chapter 2, it should be deployed only in a carefully phased manner, e.g., being field tested and evaluated at a modest number of sites before being scaled up for general use. At all stages of a phased deployment, data about individuals should be rigorously subjected to the full safeguards of the framework.
Almost by definition, technology in the R&D stage is nascent and unproven. Nascent and unproven technologies are not sufficiently robust or reliable to warrant risking the privacy of individuals—that is, the very uncertain (perhaps nonexistent) benefit that would be derived from their use does not justify the very real cost to privacy that would inevitably accompany their widespread use in operational settings. Thus, the committee advocates R&D based on synthetic population data whose use poses very little risk of violating the privacy of innocent individuals. In addition, the successful use of synthetic data in many fields, such as epidemiology, medicine, and chemistry, for testing methods provides another reason to explore its potential uses in counterterrorism.
The committee believes that realistic synthetic population data could probably be created along the lines originally suggested in Rubin and in Little and further developed by Fienberg et al. and Reiter,14 for the specific purpose of providing the background against which terrorist signatures are sought. Furthermore, because it is difficult to create from entirely synthetic data large databases that are useful for testing and (partially) validating data mining techniques and algorithms, a partial substitute for entirely synthetic data is data derived from real population data in such a way that the individual identities of nonterrorists are masked while preserving some of the important large-scale statistical properties of those data.
Using synthetic population data as the background, a measure of the utility of various data mining approaches can be obtained in R&D. Such results must be evaluated in the most rigorous and independent manner possible in order to determine if the program should move into deployment.
If the results are determined to be sufficiently promising (e.g., with sufficiently low false positive and false negative rates) that they offer significant operational capability, it is reasonable to apply the new capabilities to real data in an operational context.15 But the change from synthetic to real data must be accompanied by a full array of operational safeguards that protect individuals from harm to their privacy, as suggested by the committee’s proposed framework. Put differently, if real data are to be used, they—and the individuals with whom they are associated—deserve the full benefit of the privacy protections associated with the program in question.
Transitioning to an operational context from R&D must also be done carefully and is best undertaken in small phases. The traditional approach to acquisition generally involves the deployment of operational capabilities in large blocks of capability (i.e., large functional components deployed on a wide scale). Experience indicates that this approach is often slow and cumbersome, and it increases technical, programmatic, and budgetary risks. The operational environment often changes significantly in the time between initial requirements specification and first deployment—thus, the capability may even be obsolete when it is first deployed. And deploying systems on a large scale before they are deployed on a small scale is almost always problematic, because small-scale operational trials are needed to shake out the inevitable bugs when R&D technologies meet the real world.
By contrast, phased deployment is based on a philosophy of “build-a-little, test-a-little, deploy-a-little.” Phased deployment recognizes that kinks and problems in the deployment of any new capability are inevitable, positing that by making small changes, system developers will be able to more easily identify and correct these problems than when everything changes all at once. Small changes are easier to reverse, should that become necessary. It also becomes feasible to test new capabilities offered by small changes in parallel with the baseline version, so that ground truth provided by the baseline version can be used to validate the new capabilities when their domain of operation is the same.
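The idea of testing a small change in parallel with the baseline version can be sketched as follows. This is a minimal illustration under stated assumptions: the names `ShadowReport` and `shadow_compare` are hypothetical, the baseline and candidate are modeled as plain functions over the same records, and only the baseline output is treated as operational.

```python
from dataclasses import dataclass, field

@dataclass
class ShadowReport:
    """Summary of a parallel run of a candidate against the baseline."""
    total: int
    agreements: int
    # Each entry is (record, baseline_output, candidate_output).
    disagreements: list = field(default_factory=list)

    @property
    def agreement_rate(self):
        return self.agreements / self.total if self.total else 1.0

def shadow_compare(records, baseline, candidate):
    """Run `candidate` alongside `baseline` on the same records.

    The baseline output is treated as ground truth for the shared domain
    of operation; the candidate output is only recorded for comparison
    and is never acted upon here.
    """
    records = list(records)
    agreements = 0
    disagreements = []
    for record in records:
        expected = baseline(record)
        observed = candidate(record)
        if expected == observed:
            agreements += 1
        else:
            disagreements.append((record, expected, observed))
    return ShadowReport(len(records), agreements, disagreements)

if __name__ == "__main__":
    # Hypothetical rules that differ only at a boundary value.
    report = shadow_compare(range(10), lambda x: x > 5, lambda x: x >= 5)
    print(report.agreement_rate, report.disagreements)
```

Keeping the list of disagreements, rather than only a rate, lets reviewers inspect each case in which the change altered an outcome before deciding whether to proceed or to reverse the change.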
The committee recognizes that, under this approach, operational capabilities will not have been subject to real-world empirical validation before deployment, although they will have had as much validation as possible with synthetic population data. And the phased deployment of privacy-sensitive capabilities reduces the likelihood of inappropriate or improper compromises of privacy from what it would have been under a more traditional acquisition model.16
The approach recommended above (synthetic data before deployment, deployment only in measured phases) places a high premium on two actions. First, every effort must be made to create good synthetic data that are useful for testing the validity of machine-learning tools and are simultaneously very realistic. For synthetic terrorist data, both historical data and expert judgment play a role in developing signatures that might plausibly be associated with terrorist activity, and plausibility should be assessed through independent panels of judges without a vested interest in any given scenario. Such judges must also be trained in or experienced with evasion or obfuscation techniques. For synthetic population data, every use must be made of known techniques for confidentiality protection and statistical disclosure limitation17 to reduce the likelihood that the privacy of individuals is compromised, and further research on the creation of better synthetic data to represent large-scale populations is certainly warranted.
Second, evaluation of R&D results must truly be independent and rigorous, with high standards of performance needed for a decision to deploy. As noted in Conclusion 4, the rule that “X is better than doing nothing” often drives deployment decisions, and, given the high potential costs to individual privacy of deployment, the benefits afforded by deployment must be more than marginal to warrant the potential cost.
Recommendation 1c. Any information-based counterterrorism program of the U.S. government should be subjected to robust, independent oversight of the operations of that program, a part of which would entail a practice of using the same data mining technologies to “mine the miners and track the trackers.”
In practice, operational monitoring is generally the responsibility of the program managers and operational personnel. But as discussed in Appendix G (“The Jurisprudence of Privacy Law and the Need for Independent Oversight”), oversight is necessary to ensure that actual operations have been conducted in accordance with stated policies.
The reason is that, in many cases, decision makers formulate policies in order to balance competing imperatives. For example, the public demands both a high degree of effectiveness in countering terrorism and a high degree of privacy. Program administrators themselves face multiple challenges: motivating high performance, adhering to legal requirements, staying within budget, and so on. But if operational personnel adhere to some elements of a policy and not to others, the balance that decision makers intended to achieve will not be realized in practice.
The committee emphasizes that independent oversight is necessary to ensure that commitments to minimizing privacy intrusions embedded in policy statements are realized in practice. The reason is that losses of privacy are easy to discount under the pressure of daily operations, and those elements of policy intended to protect privacy are more likely to be ignored or compromised. Without effective oversight mechanisms in place, public trust is less likely to be forthcoming. In addition, oversight can support continuous improvement and guide administrators in making organizational change.
For example, program oversight is essential to ensure that those responsible for the program do not bypass procedures or technologies intended to protect privacy. Noncompliance with existing privacy-protecting laws, regulations, and best practices diminishes public support and creates an environment in which counterterrorism programs may be curtailed or eliminated. Indeed, even if shortcuts and bypasses increase effectiveness in a given case, in the long run scandals and public outcry about perceived abuses will reduce the political support for the programs or systems involved—and may deprive the nation of important tools useful in the counterterrorist mission. Even if a program is effective in the laboratory and expected to be so in the field, its deployment must be accompanied by strong technical and procedural safeguards to ensure that the privacy of individuals is not placed at undue risk.
Oversight is also needed to protect against abuse and mission creep. Experience and history indicate that in many programs that collect or use personal information, some individuals may violate safeguards intended to protect individual privacy. Hospital clerks have been known to examine the medical records of celebrities without having a legitimate reason for doing so, simply because they are curious. Police officers have been known to examine the records of individuals in motor vehicle information systems to learn about the personal lives of individuals with whom they interact in the course of daily business. And, of course, compromised insiders have been known to use the information systems of law enforcement and intelligence agencies to further nefarious ends.
The phenomenon of mission creep is illustrated by the Computer-Assisted Passenger Prescreening System II (CAPPS II) program, initially described in congressional testimony as an aviation security tool and not a law enforcement tool but which morphed in a few months to a system that would analyze information on persons “with [any] outstanding state or federal arrest warrants for crimes of violence.”18
To guard against such practices, the committee advocates program oversight that mines the miners and tracks the trackers. That is, all operation and command histories and all accesses to data-based counterterrorism information systems should be logged on an individual basis, audited, and mined with the same technologies and the same zeal that are applied to combating terrorists. If, for example, such practices had been in place during Robert Hanssen’s tenure at the Federal Bureau of Investigation (FBI), his use of its computer systems for unauthorized purposes might have been discovered sooner.
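The logging and auditing described above can be illustrated with a minimal sketch. It is not a description of any actual government system; the names `AccessEvent`, `AccessLog`, `case_id`, and the threshold `max_unjustified` are hypothetical and introduced only for illustration. The sketch assumes that every access is attributed to an individual user and may carry a reference to a documented justification, and that an auditor wishes to flag users with an unusual number of accesses lacking one.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class AccessEvent:
    """One logged access, attributed to an individual user (hypothetical schema)."""
    user: str
    record_id: str
    case_id: Optional[str]  # None means no documented justification was given
    timestamp: datetime


class AccessLog:
    """Append-only log of accesses that can itself be audited and mined."""

    def __init__(self) -> None:
        self._events: List[AccessEvent] = []

    def record(self, user: str, record_id: str,
               case_id: Optional[str] = None,
               timestamp: Optional[datetime] = None) -> AccessEvent:
        # Every access is logged on an individual basis, justified or not.
        event = AccessEvent(user, record_id, case_id, timestamp or datetime.now())
        self._events.append(event)
        return event

    def unjustified_counts(self) -> Counter:
        # Count, per user, the accesses that cite no case or authorization.
        return Counter(e.user for e in self._events if e.case_id is None)

    def flag_users(self, max_unjustified: int = 2) -> List[str]:
        # Assumed audit rule: flag users who exceed the allowed number of
        # unjustified accesses. A real audit would use richer signals.
        counts = self.unjustified_counts()
        return sorted(u for u, n in counts.items() if n > max_unjustified)


if __name__ == "__main__":
    log = AccessLog()
    for i in range(3):
        log.record("analyst_a", f"record-{i}")            # no justification
    log.record("analyst_b", "record-1", case_id="C-1")    # justified access
    print(log.flag_users(max_unjustified=2))
```

A simple count threshold is used here only to keep the sketch short; the point is that the access log is treated as data to be analyzed in its own right, so that questionable patterns of use by insiders can surface during routine audits.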
Finally, the committee recognizes the phenomenon of statutory mission creep, as defined above in the discussion of Premise 5. It occurs, for example, because in responding to a crisis, policy makers will naturally focus on adapting existing programs and capabilities rather than creating new ones. On one hand, if successful, adaptation often promises to be less expensive and faster than creating a new program or capabilities from scratch. On the other hand, because an existing program is likely to be highly customized for specific purposes, adapting that program to serve other purposes effectively may prove difficult, perhaps even more difficult than creating a program from scratch. Equally important, adapting an existing program to new purposes may well be contrary to agreements and understandings established in order to initiate the original program in the first place.
The committee does not oppose expanding the goals and missions of a program under all circumstances. Nevertheless, it cautions that such expansion should not be undertaken hastily in response to crisis. In the committee’s view, following diligently the framework presented in Chapter 2 is an important step in exercising such caution.
Recommendation 1d. Counterterrorism programs should provide meaningful redress to any individuals inappropriately harmed by their operation.
Programs that are designed to balance competing interests (in the case of counterterrorism, collective security and individual privacy and civil liberties) will naturally be biased in one direction or another if their incentive/penalty structure is not designed to reflect this balance. The availability of redress to the individual harmed thus acts to promote the goal of compliance with stated policy—as does the operational oversight mentioned in Recommendation 1c—and to provide incentives for the government to improve the policies, technologies, and data underlying the operation of the program.
Although the committee makes no specific recommendation concerning the form of redress that is appropriate for any given privacy harm suffered by innocent individuals as the result of a counterterrorism program, it notes that many forms of redress are possible in principle, ranging from apology to monetary compensation. The most appropriate form of redress is likely to depend on the nature and purpose of the specific counterterrorism program involved. However, the committee believes that, at a minimum, an innocent individual should always be provided with at least an explicit acknowledgment of the harm suffered and an action that reduces the likelihood that such an incident will ever be repeated, such as correcting erroneous data that might have led to the harm. Note that responsibilities for correction should apply to the holder of erroneous data, regardless of whether the holder is the government or a third party.
The availability of redress might, in principle, enable terrorists to manipulate the system in order to increase their chances of remaining undetected. However, as noted in Item 7 of the committee’s framework on effectiveness, information-based programs should be robust and not easily circumvented by adversary countermeasures, and thus the possibility that terrorists might manipulate the system is not a sufficient argument against the idea of redress.
Periodic Review of U.S. Law, Policy, and Procedures for Protection of Privacy
Recommendation 2. The U.S. government should periodically review the nation’s laws, policies, and procedures that protect individuals’ private information for relevance and effectiveness in light of changing technologies and circumstances. In particular, Congress should reexamine existing law to consider how privacy should be protected in the context of information-based programs (e.g., data mining) for counterterrorism.
The technological environment in which policy is embedded is constantly changing. Although technological change is not new, the pace of technological change has dramatically increased in the digital age. As noted in Engaging Privacy and Information Technology in a Digital Age, advances in information technology make it easier and cheaper by orders of magnitude to gather, retain, and analyze information, and other trends have enabled access to new kinds of information that previously would have been next to impossible to gather about another individual.19 Furthermore, new information technologies have eroded the privacy protection once provided through obscurity or the passage of time. Today, it is less expensive to store information electronically than to decide to get rid of it, and new and more powerful data mining techniques and technologies make it much easier to extract and identify personally identifiable patterns that were previously protected by the vast amounts of data “noise” around them.
The security environment is also constantly changing. New adversaries emerge, and counterterrorist efforts must account for the fact that new practices and procedures for organizing, training, planning, and acquiring resources may emerge as well. Most importantly, new attacks appear. The number of potential terrorist targets in the United States is large,20 and although the different types of attack on these targets may be limited, attacks might be executed in myriad ways.
As an example of a concern ripe for examination and possible action, the committee found common ground in the proposition that policy makers should seriously consider restrictions on how personal information is used in addition to restrictions on how records are collected and accessed. Usage restrictions could be an important and useful supplement to access and collection limitation rules in an era in which much of the personal information that can be the basis for privacy intrusion is already either publicly available or easily accessible on request without prior judicial oversight. Privacy protection in the form of information usage restrictions can provide a helpful tool that balances the need to use powerful investigative tools, such as data mining, for counterterrorism purposes and the imperative to regulate privacy intrusions of such techniques through accountable adherence to clearly stated privacy rules. (Appendix G elaborates on this aspect of the recommendation.)
Such restrictions can serve an important function in helping to ensure that programs created to address a specific area stay focused on the problem that the programs were designed to address and in guarding against unauthorized or unconsidered expansion of government surveillance power. They also help to discourage mission creep, which often expands the set of purposes served by the program without explicit legislative authorization and into areas that are poorly matched by the original program’s structure and operation. An example of undesirable mission creep would be the use of personal data acquired from the population for counterterrorist purposes to uncover tax evaders or parents who have failed to make child support payments. This is not to say that finding such individuals is not a worthy social goal, but rather that the mismatch between such a goal and the intrusiveness of data collection measures for counterterrorist purposes is substantial indeed. Without clear legal rules defining the boundaries for use between counterterrorism and inappropriate law enforcement uses, debates over mission creep are likely to continue without constructive resolution.
A second example of a concern that may be ripe for legislative action involves the current legal uncertainty surrounding private-sector liability for cooperation with government data mining programs. Such uncertainty creates real risk in the private sector, as indicated by the present variety of private lawsuits against telecommunications service providers,21 and private-sector responsibilities and rights must be clarified along with government powers and privacy protections. What exists today is a mix of law, regulation, and informal influence in which the legal rights and responsibilities of private-sector entities are highly uncertain and not well understood.
A coherent, comprehensive legal regime regulating information-intensive surveillance, such as government data mining, would do much to reduce such uncertainty. As one example, such a regime might address the issue of liability limitation for private-sector data sources (database providers, etc.) that provide privacy-intrusive information to the government.
Without spelling out the precise scope and coverage of the comprehensive regime, the committee believes that to the extent that the government legally compels a private party to provide data or a private party otherwise complies with an apparently legal requirement to disclose information, that party should not be subject to liability simply for the act of complying with the government compulsion or legal requirement. Any such legal protection should not extend to the content of the information the party supplies, and the committee also believes that the regime should allow incentives for data providers to invest reasonable effort in ensuring the quality of the data they provide. Furthermore, the regime should provide effective legal remedies for those individuals who suffer harm as a result of provider negligence. Finally, the regime would necessarily preserve the ability of individuals to challenge the constitutionality of the underlying data access statute.
Listed below are other examples of how the adequacy of privacy-related law might be called into question by a changing environment (Appendix F elaborates on these examples).
Conducting general searches. On one hand, the Fourth Amendment forbids general searches—that is, searches that are not limited as to the location of the search or the type of evidence the government is seeking—by requiring that all searches and seizures must be reasonable and that all warrants must state with particularity the item to be seized and the place to be searched. On the other hand, machine-aided searching of enormous digital transaction records is in some ways analogous to a general search. Such a search can be a dragnet that sweeps through millions or billions of records, often containing highly sensitive information. Much like a general search in colonial times was not limited to a particular person or place, a machine-aided search through digital databases can be very broad. How, if at all, should database searches be regulated by the Fourth Amendment or by statute?
A related issue is that the historical difficulty of physical access to ostensibly public information has provided a degree of privacy protection for that information, what might be known as privacy through obscurity. But a search-enabled digital world erodes some of these previously inherent protections against invasions of privacy, changing the technological milieu that surrounds privacy jurisprudence.
Increased access to data; searches and surveillance of U.S. persons outside the United States. The Supreme Court has not yet addressed whether the Fourth Amendment applies to searches and surveillance for national security and intelligence purposes that involve U.S. persons22 who are connected to a foreign power or that are conducted wholly outside the United States.23 Lower courts, however, have found that there is an exception to the Fourth Amendment’s warrant requirement for searches conducted for intelligence purposes within the United States that involve only non-U.S. persons or agents of foreign powers.24 The Supreme Court has yet to rule on this important issue, and Congress has not supplied any statutory language to fill the gap.
Third-party records. Two Supreme Court cases (United States v. Miller, 1976, and Smith v. Maryland, 1979)25 have established the precedent that there is no constitutionally based reasonable expectation of privacy for information held by a third party, and thus the government today has access unrestricted by the Fourth Amendment to private-sector records on every detail of how people live their lives. Today, these third-party transactional records are available to the government subject to a very low threshold (subpoenas that can be written by almost any government agency without prior judicial oversight) and are one of the primary data feeds for a variety of counterterrorist data mining activities. Thus, the public policy response to privacy erosion as a result of data mining used with these records will have to address the scope of use for the data mining results, the legal standards for access to and use of transactional information, or both.26 (See also Appendix G for discussion of how usage limitations can fill gaps in current regulation of the confidentiality of third-party records.)
Electronic surveillance law. Today’s law regarding electronic surveillance is complex. Some of the complexity is due to the fact that the situations and circumstances in which electronic surveillance may be involved are highly varied, and policy makers have decided that different situations call for different regulations. But it is an open question as to whether these differences, noted and established in one particular set of circumstances, can be effectively maintained over time. Although there is broad agreement that today’s legal regime is not optimally aligned with the technological and circumstantial realities of the present, there is profound disagreement about whether the basic principles underlying today’s regime continue to be sound as well as in what directions changes to today’s regime ought to occur.
In making Recommendation 2, the committee intends the government’s reexamination of privacy law to cover the issues described above but not be limited to them. In short, Congress and the president should work together to ensure that the law is clear, appropriate, up to date, and responsive to real needs.
Greater clarity and coherence in the legal regime governing information-based programs would have many benefits, both for privacy protection and for the counterterrorist mission. It is perhaps obvious that greater clarity helps to protect privacy by eliminating what might be seen as loopholes in the law—ambiguities that can be exploited by well-meaning national security authorities, thereby overturning or circumventing the intent of previously established policy that balanced competing interests. But the benefits of greater clarity from the standpoint of improving the ability of the U.S. government to prosecute its counterterrorism responsibilities are less obvious and thus deserve some elaboration.
First and most importantly from this perspective, greater legal clarity would help to reduce public controversy over potentially important tools that might be used for counterterrorist purposes. Although many policy makers might wish that they had a free hand in pursuing the counterterrorist mission and that public debate and controversy would just go away, the reality is that public controversy does result when the government is seen as exploiting ambiguities and loopholes.
As discussed in Appendix I (“Illustrative Government Data Mining Programs and Activity”), a variety of government programs have been shut down, scaled back, delayed, or otherwise restricted over privacy considerations: TIA, CAPPS II for screening airline passengers, MATRIX (Multistate Anti-Terrorism Information Exchange) for linking law enforcement records across states with other government and private-sector databases, and a number of data-sharing experiments between the U.S. government and various airlines. Public controversy about these efforts may have prematurely compromised counterterrorism tools that might have been useful. Such controversy has also made the government more wary of national security programs that involve data matching and made the private sector more reluctant to share personal information with the government in the future.
In this regard, this first rationale for greater clarity is consistent with the conclusion of the Technology and Privacy Advisory Committee: “[privacy] protections are essential so that the government can engage in appropriate data mining when necessary to fight terrorism and defend our nation. And we believe that those protections are needed to provide clear guidance to DOD personnel engaged in anti-terrorism activities.”27
Second, greater legal clarity and coherence can enhance the effectiveness of certain information-based programs. For example, the Privacy Act of 1974 requires that personal data used by federal agencies be accurate, relevant, timely, and complete. On one hand, these requirements increase the likelihood that high-quality data are stored, thus enhancing the effectiveness of systems that use data subject to those requirements. On the other hand, both the FBI’s National Crime Information Center and the passenger screening database of the Transportation Security Administration have exemptions from some of these requirements;28 to the extent that these exemptions result in lower-quality data, these systems are likely to perform less well.
Third, the absence of a clear legal framework is likely to have a profound effect on the innovation and research that are necessary to improve the accuracy and effectiveness of information-based programs. Such clarity is necessary to support the investment of financial, institutional, and human resources in often risky research that may not pay dividends for decades. But that type of research is essential to counterterrorism efforts and to finding better ways of protecting privacy.
Technology and Privacy Advisory Committee, Safeguarding Privacy in the Fight Against Terrorism, U.S. Department of Defense, Washington, D.C., March 2004, p. 48, available at http://www.cdt.org/security/usapatriot/20040300tapac.pdf.
The Department of Justice and the Transportation Security Administration have published notices on these programs in the Federal Register, exempting them from certain provisions of the Privacy Act that are allowed under the act. In March 2003, the DOJ exempted the FBI’s National Crime Information Center from the Privacy Act’s requirements that data be “accurate, relevant, timely and complete,” Privacy Act of 1974; Implementation, 68 Federal Register 14140 (2003) (DOJ, final rule). In August 2003, the Department of Homeland Security exempted the TSA’s passenger screening database from the Privacy Act’s requirements that government records include only “relevant and necessary” personal information, Privacy Act of 1974: Implementation of Exemption, 68 Federal Register 49410 (2003) (DHS, final rule). Outside these exceptions, the Privacy Act otherwise applies to these programs. (Under the act, exemptions have to be published to be effective, and so the committee assumes that there are no “secret” exemptions.)
Finally, a clear and coherent legal framework will almost certainly be necessary to realize the potential of new technologies to fight terrorism. Because such technologies will operate in the political context of an American public concerned about privacy, the public, and congressional decision makers, will have to take measures that protect privacy when new technologies are deployed. All technological solutions will require a legal framework within which to operate, and there will always be gaps left by technological protections, which law will be essential to fill. Consequently, a lack of clarity in that framework may not only slow the development and deployment of such technologies, as described above, but also make technological solutions entirely unworkable.