To understand the uses of signals intelligence (SIGINT) data, several “use cases” are presented below. These are hypothetical scenarios that describe episodes in which analysts query SIGINT metadata as part of an investigation of a threat to national security, such as terrorism or the proliferation of weapons of mass destruction.1 (Some public sources of information on actual cases are listed in Box 3.1.) The committee asked National Security Agency (NSA) briefers for unclassified use cases illustrating the use of metadata, under any authority, whether collected in bulk or targeted, whether foreign or domestic. The committee focused on how metadata are used, not on the authorities or restrictions under which they are collected. Three categories of use cases are presented below, which, the committee was told, account for the majority of metadata use: contact chaining, finding alternate identifiers, and triage.2 This set is not, however, exhaustive.
1 For scenarios of four counterterrorism investigations studied by the Privacy and Civil Liberties Oversight Board, see Report on the Telephone Records Program Conducted under Section 215 of the USA PATRIOT Act and on the Operations of the Foreign Intelligence Surveillance Court, January 23, 2014, http://www.pclob.gov/library/215-Report_on_the_Telephone_Records_Program.pdf, p. 144 ff.
2 National Security Agency, presentation to the committee on August 28, 2014.
BOX 3.1 Some Specific Cases of Signals Intelligence in Use
Very little has been made public about actual cases where U.S. signals intelligence has contributed to counterterrorism. A principal reason is that the Intelligence Community (IC) carefully protects information about sources and methods from adversaries. Nevertheless, information on some cases can be found in public speeches and testimony to Congress by IC leaders and in two reports prepared by the Privacy and Civil Liberties Oversight Board.
The accounts of these cases are incomplete and possibly inconsistent. The selection of the cases that were made public, the details of the accounts, and their significance have all been controversial.
Pointers to some of this public information are provided below, not because the committee endorses the views of its authors, but simply to supplement the abstract use case categories presented in this chapter with some concrete examples:
• Testimony by Gen. Keith Alexander and others before the House Permanent Select Committee on Intelligence, June 18, 2013, http://icontherecord.tumblr.com/post/57812486681/hearing-of-the-house-permanent-select-committee-on.
• Four cases using Foreign Intelligence Surveillance Act (FISA) Section 215 authority:
  - Basaaly Moalin, financial support of Al Shabab.
  - Najibullah Zazi, plotted to bomb the New York Subway system.
  - David Coleman Headley, helped plan the 2008 Mumbai attack.
  - Khalid Ouazzani, suspected of plotting to bomb the New York Stock Exchange.
• Described in Privacy and Civil Liberties Oversight Board, Report on the Telephone Records Program Conducted under Section 215 of the USA PATRIOT Act and on the Operations of the Foreign Intelligence Surveillance Court, http://www.pclob.gov/library/215-Report_on_the_Telephone_Records_Program.pdf, p. 144 ff.
• Some uses of FISA Section 702 authority are described in Privacy and Civil Liberties Oversight Board, Report on the Surveillance Program Operated Pursuant to Section 702 of the Foreign Intelligence Surveillance Act, http://www.pclob.gov/library/702-Report.pdf, p. 104 ff.
The examples contain more detail than is strictly necessary to illustrate the use case categories. The detail is presented to show that an investigation may
• Depend on different kinds of data,
• Use different analysis techniques or use common techniques in different ways,
• Use both bulk and targeted SIGINT collection, and
• Expect to reveal U.S. persons, whose constitutional rights must be protected.
Note that the SIGINT data used in these examples are metadata collected from telephone and email communications. The only metadata elements used are the “to” and “from” identifiers in the form of telephone numbers or email addresses, or the Internet Protocol (IP) address of a computer used for communication. Collection methods are not described, and it is assumed that the data are collected in such a way that they contain the entries that are required to satisfy the scenario. The Intelligence Community (IC) may collect additional kinds of SIGINT metadata.
Communications metadata, domestic and foreign, are used to develop contact chains by starting with a target and using metadata records to indicate who has communicated directly with the target (one hop), who has in turn communicated with those people (two hops), and so on. Studying contact chains can help identify members of a network of people who may be working together; if one is known or suspected to be a terrorist, it becomes important to inspect others with whom that individual is in contact who may be members of a terrorist network. Similarly, studying contact chains can help analysts to understand the structure of an organization under investigation.
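As an illustration only, the hop-by-hop expansion described above can be sketched as a breadth-first search over a contact graph. The record format (pairs of “from” and “to” identifiers), the function names, and the sample identifiers are assumptions made for this sketch; they do not describe any actual IC system.

```python
from collections import defaultdict, deque


def build_contact_graph(records):
    """Build an undirected contact graph from (from_id, to_id) metadata records."""
    graph = defaultdict(set)
    for src, dst in records:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def contact_chain(graph, seed, max_hops):
    """Return {identifier: hop_count} for every identifier within max_hops of seed."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    del hops[seed]  # the seed itself is not a contact
    return hops


# Hypothetical records: T is the target, p* are one hop away, q1 two hops, r1 three.
records = [("T", "p1"), ("p2", "T"), ("p1", "q1"), ("q1", "r1")]
graph = build_contact_graph(records)
chain = contact_chain(graph, "T", max_hops=2)
```

With `max_hops=2`, the identifier `r1`, which is three hops from the seed, is not returned.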
In Use Case 1, the U.S. government has identified a Somali pirate network that includes target A. An analyst queries and displays all the call contacts to or from A’s telephone number in the last 18 days. Some contacts are identified as already known targets; others are undetermined. The analyst invokes a similar query and display for target B, who has communicated frequently with A, and notes that there are three people, not yet determined to be targets, who have been in contact with both A and B. The analyst can see this relationship immediately, because the contact sets of A and B are displayed as a network, with contacts as nodes, linked by lines to indicate calls. The analyst invokes the query-and-display function again on one of these three, C, and discovers this person is in contact not only with targets A and B but also with other known pirates. Perhaps C is a “missing link” between the networks in which A and B are operating.3
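The step of finding identifiers in contact with both A and B amounts to intersecting two contact sets. The following is a minimal sketch using invented call records; the helper `contacts_of` and all identifiers other than A, B, and C are hypothetical.

```python
def contacts_of(records, identifier):
    """Return every identifier that exchanged a call with `identifier`."""
    contacts = set()
    for caller, callee in records:
        if caller == identifier:
            contacts.add(callee)
        elif callee == identifier:
            contacts.add(caller)
    contacts.discard(identifier)  # ignore any self-call
    return contacts


# Invented call records as (from, to) pairs.
records = [
    ("A", "B"), ("A", "C"), ("C", "B"),
    ("A", "U1"), ("U1", "B"), ("U2", "A"), ("B", "U2"),
    ("A", "K1"), ("B", "K1"), ("C", "K2"), ("A", "D"),
]
known_targets = {"A", "B", "K1", "K2"}

# Identifiers in contact with both A and B.
shared = contacts_of(records, "A") & contacts_of(records, "B")
# Of those, the ones not yet determined to be targets.
undetermined_shared = shared - known_targets
# Repeating the query on C shows which known targets C is in contact with.
c_known_contacts = contacts_of(records, "C") & known_targets
```

In this invented data, three undetermined identifiers are shared by A and B, and one of them, C, is also in contact with a known target outside the A and B contact sets.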
Many contacts uncovered this way are ruled out as having no intelligence value. Calls to a car mechanic, an IT help desk, or an automated weather report are likely to be ruled out, although perhaps some may later be found to have intelligence value. Further, laws or regulations restrict what an analyst is allowed to do. For instance, there are special rules applied to subjects of interest who are or might be U.S. persons and various (and differing) sets of rules depending on which authority allowed the collection of the underlying information (see Section 1.4).
3 “Inside the NSA,” 60 Minutes, CBS News, video segment, December 15, 2013, 3:40-4:45, http://www.cbsnews.com/videos/inside-the-nsa/. The transcript for the 60 Minutes segment is at CBS News, “NSA Speaks out on Snowden, Spying,” December 15, 2013, http://www.cbsnews.com/news/nsa-speaks-out-on-snowden-spying/. Note that the video that plays on the page with the transcript is not guaranteed to be the correct segment of 60 Minutes; the URL for the correct video segment is given above.
Either bulk or targeted collection can lead to the result in Figure 3.1. Since A and B are targets, targeted collection using a discriminant that specifies “collect all calls to or from A or B” would collect all the contacts and subjects shown in the figure. However, if all calls between A or B and C occurred before either A or B was identified as a target, later collection targeted on A or B will not find C by way of A or B, but might find C because of communication with some other target. Bulk collection provides useful “history,” because it does not limit collection to only the targets known at the time of collection.
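The point about “history” can be made concrete with a small simulation. In this sketch, which rests entirely on invented dates and identifiers, targeted collection begins only when A and B are identified as targets, whereas bulk collection retains every record. The function names are assumptions of the sketch.

```python
from datetime import date

# Invented call records as (day, from, to); A and B become targets on 1 February.
records = [
    (date(2014, 1, 5), "A", "C"),
    (date(2014, 1, 9), "C", "B"),
    (date(2014, 2, 20), "A", "B"),
    (date(2014, 2, 22), "X", "Y"),  # unrelated traffic
]
targets = {"A", "B"}
targeting_start = date(2014, 2, 1)


def targeted_collection(stream, targets, start):
    """Keep only records from `start` onward that involve a target."""
    return [
        (day, src, dst)
        for day, src, dst in stream
        if day >= start and (src in targets or dst in targets)
    ]


def bulk_collection(stream):
    """Keep every record, whether or not a target is involved."""
    return list(stream)


def contacts_of_targets(collected, targets):
    """Return non-target identifiers seen communicating with any target."""
    found = set()
    for _, src, dst in collected:
        if src in targets:
            found.add(dst)
        if dst in targets:
            found.add(src)
    return found - targets


from_targeted = contacts_of_targets(
    targeted_collection(records, targets, targeting_start), targets
)
from_bulk = contacts_of_targets(bulk_collection(records), targets)
```

Because every call involving C took place before targeting began, C appears only in the result derived from the bulk store.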
FIGURE 3.1 A network of contacts among identifiers.
Targets may use several communication channels, each characterized by a specific identifier—in the example above, a telephone number, email address, or IP address. Targets may use different channels as a matter of convenience or as a form of operational security to try to evade detection by spreading their communications over several channels, by initiating new channels, or by stopping use of some channels. In some cases, identifiers may be assigned by the technology, such as an Internet service provider (ISP) that assigns a temporary IP address to a laptop.
An analyst can continue tracking a target by knowing the set of identifiers the target uses and tracking changes to the set over time, for example, when the target switches to a different email address. Activity detected using these identifiers is an important part of intelligence about the target. For example, a frequently used identifier that goes silent or that is found to have moved (e.g., by being detected at a different site) may indicate target activities of interest.
To succeed, alternate identifiers must be found at least as quickly as the target changes them. If targets are changing phone or email identifiers every day, the surveillance required to track the changes must be undertaken at a similar rate.
In Use Case 2, an international cyber-criminal is thwarted when U.S. government access to his email communication allows anticipation and mitigation of a cyber attack. In response, the criminal transfers his communications to an alternate identifier—using a smaller ISP within the United States that is known for outspoken resistance to government surveillance. The U.S. government, via a Foreign Intelligence Surveillance Court (FISC) order, obtains bulk email metadata from the ISP, also imposing a gag order on the ISP and preventing deliberate or inadvertent disclosure of surveillance to the cyber-criminal. The intent of this action is to uncover the criminal’s alternate identifier for the purpose of collecting additional intelligence.
The alternate identifier technique is applied to the email metadata obtained from the ISP, in order to find a new identifier that communicates with the same identifiers as the old email address, leading to the discovery of an alternate identifier used by the cyber-criminal.
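One way to picture this matching step is to score each new identifier by how closely its contact set resembles that of the old address. The source does not say how similarity is measured; Jaccard similarity is used here purely as a stand-in, and the function names and email identifiers are hypothetical.

```python
from collections import defaultdict


def contact_sets(records):
    """Map each identifier to the set of identifiers it communicated with."""
    contacts = defaultdict(set)
    for src, dst in records:
        contacts[src].add(dst)
        contacts[dst].add(src)
    return contacts


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_alternates(old_contacts, new_records, exclude=()):
    """Rank identifiers in new_records by overlap with the old contact set."""
    scores = []
    for ident, peers in contact_sets(new_records).items():
        if ident in old_contacts or ident in exclude:
            continue  # skip the known contacts and the old identifier itself
        score = jaccard(old_contacts, peers)
        if score > 0:
            scores.append((score, ident))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


# Contacts of the old address (assumed known from earlier collection).
old_contacts = {"p1", "p2", "p3"}
# Invented metadata obtained after the target changed identifiers.
new_records = [
    ("new@isp", "p1"), ("new@isp", "p2"), ("p3", "new@isp"),
    ("other@isp", "p1"), ("other@isp", "z9"),
]
ranked = rank_alternates(old_contacts, new_records, exclude={"old@isp"})
```

The highest-ranked identifier is only a candidate; a real investigation would need corroboration before treating it as belonging to the same person.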
In the following use case, alternate identifiers are used in a more complex scenario that combines communications surveillance with other intelligence methods.
In Use Case 3, country X is a U.S. adversary that produces chemical and biological warfare weapons. The U.S. policy community wants continued monitoring of the program and to know if the country is supplying terrorists with weapons of mass destruction (WMD). The IC knows the following about the program:
• It is run under the cover of a medical research institute at the major university in the country. The institute also conducts legitimate medical and pharmacological research. There are also a variety of known and suspected laboratories associated with the program spread throughout the country.
• The institute’s doctors, scientists, and researchers were trained in Europe, Russia, and the United States. The institute seeks medical research equipment from legitimate suppliers around the world.
• Plague, anthrax, and malaria are endemic to the country. The institute regularly works with the United Nations and international aid organizations to mitigate the threat posed by these and other diseases.
• Some working with the institute have provided “medical aid” to the Sons of the Western Sun, a U.S.-designated terrorist group attempting to overthrow the government in a neighboring country.
The IC goals are to
• Identify equipment and materials that the institute or its associated laboratories are attempting to purchase and who the suppliers are or could reasonably be.
• Locate and identify all the laboratories and facilities in the country associated with the institute.
• Determine research topics being pursued by members of the institute.
• Track communication between Sons of the Western Sun and members of the institute.
• Determine the view on WMD of the country’s leadership and the directions provided to them by the institute.
To obtain information relevant to these goals, the IC may collect against the institute, the country, and the terrorist organization. The collection options are constrained by the following:
• Only persons considered loyal are allowed to travel overseas. They are also very wary of communications intelligence activities.
• Few foreigners travel to the country. The U.S. Embassy is heavily watched, and the staff is small.
• The institute regularly buys material and equipment online and often will contact suppliers with unsolicited emails asking for information on a wide range of products and services.
• Almost all telephone communication is by cell phone. Twitter is a national pastime.
• The Sons of the Western Sun are believed to obtain substantial financial, logistical, and personnel support from elements in Europe and the United States, many of whom are unknown.
Use Case 3 illustrates a complex scenario in which several different ways of gathering intelligence may be involved. Most likely, the entire institute would be the focus of communications data collection. According to the committee’s definition, this might be considered bulk collection, because it would collect a significant amount of data about communications of legitimate researchers who have no role in WMD efforts. However, focusing collection on the institute is less intrusive than collecting on the whole country. The alternate identifier method may be applied to the identifiers of everyone in the institute in order to track all ongoing communications. Correlations with known members of Sons of the Western Sun may help distinguish targets from innocents.
In Use Case 4, following the events of Use Case 3, country X eliminated its WMD programs. The United States aided with the destruction of the weapons, but despite public declarations, the IC remains convinced that a number of facilities were never identified by the country. The new government has been rumored in press articles to want to re-establish the WMD program. U.S. policy makers are concerned about a new arms race in the region and want to know the status and intention of the country toward its WMD program. The IC knows the following:
• Because of the thaw in U.S.-X relations, the scientists who worked on the old WMD program have been traveling widely in the United States and Asia.
• Large numbers of citizens of X currently travel freely between X and the United States, and a number of U.S. tourists travel yearly to X to bathe in the renowned hot springs.
• Several key proliferators associated with the old WMD program have winter villas in X. They were known to buy goods from both U.S. and Asian suppliers for the program.
• The government of X and most of its leading citizens have bank accounts in the United States, among other places.
Analysts have to determine the current status of the WMD program and leadership intentions toward the program. Unfortunately, after X agreed to dismantle its WMD program, most collection efforts on the program were ended or drastically reduced. Thus, the IC goals are to
• Determine if old proliferators have started shopping for WMD-related materials and equipment.
• Identify the actual intention of the government of X or other senior policy makers toward the WMD program, despite public pronouncements.
• Identify current activities at all previously known WMD sites and possible new facilities, and identify the use of new agents to purchase WMD-related materials and equipment.
• If a WMD program is identified, determine the what, where, who, and how.
This example draws on many intelligence sources, of which SIGINT is only one. Part of the approach will be to collect bulk communications data focused on areas where IC analysts expect the scientists to be communicating. They may use some of the same identifiers they used before the WMD program shutdown. Doubtless, new identifiers will come to light, some identifying U.S. persons and, therefore, requiring minimization procedures.
Intelligence reports from the period before the shutdown may contain references to identifiers or other evidence that will help target or focus collection or seek contemporaneous alternate identifiers. It is possible that during the shutdown a reduced collection effort was sustained in order to monitor the termination of WMD activity; parts of that data that have not exceeded the IC’s retention limits may be used in the alternate identifier search.
Finding alternate identifiers depends on collecting timely communications metadata in bulk. The collection must be in bulk in order for the metadata to include the new identifiers, which are not known at the time of collection to be associated with a target because they are two hops from the old identifier. Collection that is focused around the target (i.e., communications channels and modes that the target is known to use) is used, in part, to limit intrusion on innocents. Focus is also driven by concerns of cost and computer processing time to run the correlation algorithms.4
Note that the two-hop restriction announced by the President in January 2014 on queries of domestic telephone metadata still allows alternate identifiers to be found for known domestic reasonable and articulable suspicion (RAS) targets.5 Starting from a target, a query will find all the target’s one-hop contacts, then find all two-hop contacts; among these, there may be an identifier that communicates with many of the target’s one-hop contacts. This identifier may be an alternate identifier for the target.
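The two-hop search just described can be sketched as a count, for each two-hop identifier, of how many of the target’s one-hop contacts it communicates with. The function name, the record format, and the identifiers are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def two_hop_overlap(records, target):
    """Count, for each two-hop identifier, the one-hop contacts it shares with target."""
    graph = defaultdict(set)
    for src, dst in records:
        graph[src].add(dst)
        graph[dst].add(src)
    one_hop = set(graph.get(target, set())) - {target}
    overlap = Counter()
    for contact in one_hop:
        for second in graph[contact]:
            # Count only identifiers exactly two hops away from the target.
            if second != target and second not in one_hop:
                overlap[second] += 1
    return one_hop, overlap


# Invented records: N talks to all three of T's contacts, M to only one.
records = [
    ("T", "p1"), ("T", "p2"), ("p3", "T"),
    ("N", "p1"), ("p2", "N"), ("N", "p3"),
    ("M", "p1"),
]
one_hop, overlap = two_hop_overlap(records, "T")
best_candidate, shared_count = overlap.most_common(1)[0]
```

An identifier such as `N`, which never communicates with `T` directly but reaches every one of `T`'s contacts, is the kind of two-hop result that might be an alternate identifier.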
Reverse targeting is another approach to find a target’s alternate identifiers by working backwards from persons known to be in contact with the target. In this approach, each identifier that communicates with the target is used as a query against bulk metadata or as a selector in future targeted collection, which will reveal any new identifier that communicates with the target’s previous communication partners. Because this method explicitly collects against persons known not to be persons of interest, apart from the fact that they communicate with the target, the use of this method raises extra privacy concerns. Current policy and statutes forbid the use of this method under certain authorities.6
Investigations or events may uncover lists of identifiers that need triage—that is, categorizing identifiers according to the danger that their owners might pose to national security. Queries about each identifier are made to the IC’s databases to determine whether the identifier can be matched against a currently known target, is related to a target, or exhibits other properties of a dangerous person. Most often, the list presented for triage cites only identifiers and not names of persons.
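A simplified picture of the triage lookup is a three-way classification of each identifier against stored data. This sketch covers only the first two tests named above (a match with a known target, or direct contact with one); the category labels, function name, and data are hypothetical.

```python
def triage(identifiers, known_targets, records):
    """Classify each identifier as a known target, related to one, or unmatched."""
    neighbors = {}
    for src, dst in records:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    result = {}
    for ident in identifiers:
        if ident in known_targets:
            result[ident] = "known target"
        elif neighbors.get(ident, set()) & known_targets:
            # At least one stored record links this identifier to a target.
            result[ident] = "related to target"
        else:
            result[ident] = "no match"
    return result


# Invented inputs: a list needing triage, known targets, and stored metadata.
to_triage = ["k1", "u1", "u2"]
known_targets = {"k1", "k2"}
stored_records = [("u1", "k2"), ("u2", "z")]
categories = triage(to_triage, known_targets, stored_records)
```

An identifier that falls into the last category is not thereby cleared; it simply has no match in the data consulted.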
Queries for triage are matched against historical metadata (both bulk and targeted) to find evidence of connections between the identifier and events or people that were or are of interest. The identifier may have once been a target, but the information about the target has since been discarded. Or the identifier may have been retained because it arose in the course of an investigation. In any case, the alternate identifier technique is used to find new identifiers related to the one offered for triage that might be used by the same person.
4 Note that these algorithms will examine metadata associated with many innocents. The set of identifiers examined will certainly include all those associated with reverse targeting, explained below.
5 The White House, “Remarks by the President on Review of Signals Intelligence,” Office of the Press Secretary, January 17, 2014, http://www.whitehouse.gov/the-press-office/2014/01/17/remarks-president-review-signals-intelligence.
6 See U.S. Code Title 50, Chapter 36, Subchapter VI, § 1881a, Procedures for targeting certain persons outside the United States other than United States persons.
In Use Case 4, when the IC resumes investigation of country X, the many identifiers obtained during the original investigation must be triaged for use in the new context.
In Use Case 5, a bombing suspect is identified, along with an associated email address. The triage process, which includes finding alternate identifiers used by the suspect, is used to quickly find possible associates of the suspect and other information about these identifiers held in IC databases.
Triage benefits from all retained metadata, bulk or targeted, because it seeks information about an identifier that may not have previously been the subject of explicit IC attention. A timely response in Use Case 5 requires historical information; initiating targeted collection using the relevant email addresses obtained only after a suspect was identified would mean that only future information would be available, probably trickling in too slowly to be useful. Other communications channels used by the suspect in the past, such as a cell phone, might never be found using present and future SIGINT data alone.
If an identifier presented for triage has never been associated with a target, then past bulk data are probably more likely than targeted data to reveal direct associates, and contact chaining or alternate identifier techniques may lead to additional associates. But the identifier may also be found in metadata targeted to a particular terrorist organization, even though the identifier was not previously known to be associated with the organization.
The use case categories described above (contact chaining, alternate identifiers, and triage) illustrate the growing importance in intelligence work of discovering and defining networks. Indeed, doing so may be as important as examining content to know what members of the network are saying. It is a task where bulk collection is especially important.
7 National Security Agency, presentation to the committee on August 28, 2014.
The use case categories above do not exhaust the use cases for SIGINT data, or even for telephony metadata. Many uses are ad hoc and do not fit into neat categories. For example, to find whether any of a number of targets associated with a terrorist group have communicated with any of a number of explosives suppliers, a query listing the targets and the suppliers can be constructed and applied against stored metadata. This is an iterative use of a simple query to determine whether one group contacted another.
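The group-to-group query in this example can be sketched as a single pass over stored metadata that keeps any record with one endpoint in each list. The function name and the identifiers are invented for illustration.

```python
def cross_group_contacts(records, group_a, group_b):
    """Return (a, b) pairs where a member of group_a communicated with a member of group_b."""
    hits = set()
    for src, dst in records:
        if src in group_a and dst in group_b:
            hits.add((src, dst))
        elif src in group_b and dst in group_a:
            hits.add((dst, src))  # normalize so the group_a member comes first
    return hits


# Invented lists of targets and explosives suppliers, and stored metadata.
group_targets = {"t1", "t2"}
group_suppliers = {"s1", "s2"}
stored_records = [("t1", "s1"), ("s2", "t2"), ("t1", "x"), ("y", "s1")]
hits = cross_group_contacts(stored_records, group_targets, group_suppliers)
```

Records that touch only one of the two groups, such as the last two above, are ignored.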
The committee had hoped that analyzing use cases might suggest alternatives to bulk collection. But this path is limited, for two reasons: (1) the three categories do not cover all uses of SIGINT data and (2) the use cases show that both bulk and targeted collection are used. If the use case is focused in some way, targeted collection may provide enough data, especially if the focus has been under targeted surveillance for some time. For investigations that have little or no prior targeting history, bulk collection may be the only source of useful information. Thus it appears that it is the context of the investigation, rather than the technique for using collected metadata, that most influences the value of bulk collection.