Broadly speaking, the intelligence function involves the collection, analysis, and dissemination of information to decision makers. Intelligence analysts use all available sources of such information to understand problems of interest to decision makers. These sources include human intelligence, imagery, and a variety of other kinds of intelligence in addition to signals intelligence (SIGINT). This report focuses on signals intelligence.
In general, the intelligence process starts with decisions by policy makers on the areas of national security interest for which intelligence will be useful. Some of these areas cover imminent or anticipated threats, while others pertain to strategic intelligence to develop an understanding of regions or organizations that might become threatening. Based on the priorities stated by decision makers, intelligence officials in the community identify specific collection methods and opportunities that are expected to yield useful information. These methods and opportunities interact with and support each other (much as the various elements in an ecosystem interact with each other), so that, for example, a piece of information from one method may cue collection with another method or may corroborate or support information derived from another.
The intelligence process seeks information about both tactical matters (i.e., specific dangerous persons, groups, or plots, such as known terrorist organizations or plans to bomb subways or investigations of recent bombings) and strategic matters (i.e., a broad picture of a threat, such as a country’s plans to build nuclear weapons). Increasingly, this is not a sharp distinction, because context is often important to understanding a tactical threat, and tactical information is required to respond to strategic threats.
A characteristic of tactical investigations is often (although not always) a highly compressed timeline. For example, in investigating a bombing, investigators must work quickly to determine whether the bomb that just exploded is the first in a series.
Signals intelligence is defined by the National Security Agency (NSA) to be “intelligence derived from electronic signals and systems used by foreign targets, such as communications systems, radars, and weapons systems.”1 In the modern world, distinctions between paper records and electronic recordings that may once have been technically meaningful are increasingly obsolete as all forms of information storage become electronic.
In this section, the committee presents a simplified conceptual model of the parts and functions of the SIGINT process, which is used for further discussion. Chapter 3 presents “use cases,” examples of the use of SIGINT data in plausible scenarios. The description below is primarily technical in nature. Constraints on SIGINT imposed by law, regulation, and policy are discussed in Section 1.3.
As with other forms of intelligence gathering, SIGINT is conducted in response to requirements for intelligence from policy makers. Priorities are established by different agencies in the policy community and are reviewed at least annually. Based on these priorities, agencies in the Intelligence Community (IC), including NSA, design and develop mechanisms for collecting information in different locations, information that will meet the wide variety of policy maker requirements. To the extent possible, collection mechanisms are consolidated for greater efficiency, both between the various intelligence agencies and within NSA, as the entity charged with SIGINT collection. Thus, a given collection mechanism may provide information that is useful for a variety of different topics. This process seeks to avoid the development and deployment of collection mechanisms individually for each and every target, an approach that would be inefficient and expensive.
The committee’s conceptual model of the SIGINT process is depicted in Figure 2.1. In this model, NSA extracts signals data from various sources, filters it for items of interest, stores the items, analyzes them, and disseminates selected information to policy makers and other units of the IC. (Not described here, and discussed later in the report, are the audits and other measures to establish compliance with rules and regulations concerning personal privacy.) There are many signal types; among the most important are the digital signals that carry the voice content of telephone calls. Information pertaining to telephony is also collected as SIGINT; this is information about the calling and called telephone numbers and time and duration of call, so-called “telephony metadata.” Internet communications, such as email or commands to search engines, may also be collected, and, once again, a distinction is drawn between content and metadata.
FIGURE 2.1 A conceptual model of signals intelligence.
Signals are derived from many sources, but the specific steps taken to winnow large data streams to those that are manageable and potentially productive are the same regardless of the source. Figure 2.1 shows how one signal might be collected. The first three steps in the SIGINT model, taken together, are what the committee informally calls collection:2
2 The committee’s definition of collection differs from that used by NSA in certain ways. See, for example, NSA, NSA’s Civil Liberties and Privacy Protections for Targeted SIGINT Activities Under Executive Order 12333, NSA Director of Civil Liberties and Privacy Office Report, October 7, 2014, https://www.nsa.gov/civil_liberties/_files/nsa_clpo_report_targeted_EO12333.pdf. See also footnote 3 in this chapter.
• Extract. The first step is to obtain the signal from a source, convert it into a digital stream, and parse the stream to extract the kind of information being sought, such as an email message or the digital audio of a telephone call. Extraction interprets layers of communications and Internet protocols, such as Optical Transport Network (OTN), Synchronous Digital Hierarchy (SDH), Ethernet, Internet Protocol (IP), Transmission Control Protocol (TCP), Simple Mail Transfer Protocol (SMTP), or Hypertext Transfer Protocol (HTTP). In cases where business records are sought, this step extracts and reformats relevant SIGINT data from a business record format used by the business.
• Filter. This step selects, from all the items extracted, items of interest that should be retained. It is sometimes controlled by a “discriminant,” which the IC agency running the collection provides to describe in precise terms the properties of an item that should be retained. For example, a discriminant might specify “all telephone calls from 301-555-1212 to Somalia,” “all telephone calls from France to Yemen,” or “all search-engine queries containing the word ‘sarin.’” If there is no discriminant, then all extracted items are retained.
• Store. Retained items are stored in a database operated by the U.S. government. This is the point at which collection is deemed in this model to occur for the retained data.3 By contrast, the previous steps are fleeting, with data processed in near real-time (keeping data only for short periods of time—minutes to hours—for technical reasons) as fast as it is supplied, with all but the items to be retained discarded. Items collected from separate sources are usually combined into a modest number of large databases to facilitate searching and analysis.
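The three steps above can be pictured as a small streaming pipeline. The following Python sketch is illustrative only: the record layout, the function names, and all telephone numbers other than 301-555-1212 are assumptions made for the example, and a discriminant is modeled simply as a true/false test applied to each extracted item.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Item:
    """One extracted item, here the metadata of a single telephone call."""
    caller: str
    called: str
    called_country: str


# Assumption: a discriminant is a predicate over extracted items.
Discriminant = Callable[[Item], bool]


def extract(raw_stream: Iterable[str]) -> Iterator[Item]:
    """Parse a hypothetical 'caller,called,country' feed into items."""
    for line in raw_stream:
        caller, called, country = line.strip().split(",")
        yield Item(caller, called, country)


def filter_items(items: Iterable[Item],
                 discriminant: Optional[Discriminant]) -> Iterator[Item]:
    """Keep matching items; with no discriminant, every item is retained."""
    for item in items:
        if discriminant is None or discriminant(item):
            yield item


def store(items: Iterable[Item], database: List[Item]) -> None:
    """Retention step: in the committee's model, collection occurs here."""
    database.extend(items)


def calls_from_number_to_somalia(item: Item) -> bool:
    """Discriminant for 'all telephone calls from 301-555-1212 to Somalia'."""
    return item.caller == "301-555-1212" and item.called_country == "Somalia"


# Made-up input feed for illustration.
raw = [
    "301-555-1212,+252-555-0100,Somalia",
    "301-555-1212,+33-555-0101,France",
    "202-555-0143,+252-555-0102,Somalia",
]

filtered_db: List[Item] = []
store(filter_items(extract(raw), calls_from_number_to_somalia), filtered_db)

unfiltered_db: List[Item] = []
store(filter_items(extract(raw), None), unfiltered_db)

print(len(filtered_db), len(unfiltered_db))  # prints: 1 3
```

Because `extract` and `filter_items` are generators, items flow through one at a time and anything that fails the discriminant is dropped without being retained, which mirrors the fleeting nature of the first two steps.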
In modern communication systems, traffic from many sources and destinations is aggregated into a single channel. For example, the radio signals to and from a base station serving all mobile phones in a cell are all on the same radio channels, and all of the IP packets between two routers may be carried on the same fiber. With rare exceptions, there is no single physical access point comparable to the central office connection of a landline telephone at which to observe only the items of interest and nothing more. Reflecting this reality, the committee’s definition of “collection” says that SIGINT data is collected only when it is stored, not when it is extracted. Put another way, every piece of data that passes by a potential monitoring point must be machine-filtered as part of the extraction process to determine whether it is potentially relevant or can be thrown away without further examination.
3 Not everyone agrees on a definition of the word collection, which is widely used in policy, law, and regulation pertaining to SIGINT. This lack of collective agreement extends to entities within the IC itself. Moreover, subtle distinctions among the definitions lead to different views on certain SIGINT properties, especially its intrusion on privacy.
The committee notes that there are at least two differing conceptions of privacy with respect to when data are acquired. One view asserts that a violation of privacy occurs when the electronic signal is first captured, irrespective of what happens to the signal after that point. Another view asserts that processing the signal only to determine if it is irrelevant does not compromise privacy rights in any way, even if that signal is held for a non-zero period of time. In a technological environment in which different communications streams are mixed together on the same physical channel, picking out the sought-after communication stream requires the latter approach. Further, note that the committee has made a technical judgment about a useful definition of collection while remaining silent about what does or does not constitute an appropriate definition of privacy.
The committee also uses “collection” as a term to describe only government retention of data. If non-government actors acquire information from or about various parties in some legal manner but the government does not have access to that information, the government is not engaging in collection as a result of the actions of those parties. In contrast, if the government gains access to that information through technical or legal means and stores some or all of it for government use, it is reasonable to consider this collection.
Note that intelligence agencies narrow their focus throughout the various steps of collection as much as possible, both to comply with rules about what is allowed and to use their limited resources efficiently. Privacy protections of different sorts are applied at various points throughout the process. These include choices about where to extract signals and what discriminants to use, minimization procedures used to protect information about U.S. persons, and controls on how collected information can be used.
Notwithstanding the operation of the predecessor program to Foreign Intelligence Surveillance Act (FISA) Section 215, outside of the requirements of FISA, most agree now that the IC can target U.S. persons only when permitted explicitly with Foreign Intelligence Surveillance Court (FISC) involvement using procedures designed to ensure Fourth Amendment protections. The legal protections provided by the Fourth Amendment and various domestic legislation, such as FISA, distinguish between foreign and U.S. persons; in particular, the latter enjoy the protections of the Fourth Amendment. In cases where information about U.S. persons is collected as a part of authorized foreign intelligence collection activities, minimization rules approved by the U.S. Attorney General require special handling for privacy protection, consistent with foreign intelligence needs, which typically will require removing the names of U.S. persons or other identifying information prior to dissemination. Of course, the names of U.S. persons can be included when necessary to understand the foreign intelligence information.4
TABLE 2.1 Hypothetical Call Detail Records as They Might Appear in a Signals Intelligence Database
|Caller||Called||Call Start Time||Call Duration|
|+1-415-555-0103||+963 99 2210403||2014:10:3:16:01:43||73:43|
NOTE: In this hypothetical example of call detail records as they might appear in a signals intelligence database, the call shown in the first line might be relaying a message through an intermediary at +1-703-555-0198. The call on the third line is to an international number, which might belong to a foreign national or a U.S. person. The call in the fourth line was probably ordering a pizza, since a directory of telephone numbers reveals that the called number is a pizza shop.
Stated policy calls for strict rules for the dissemination of identities of U.S. persons in intelligence reports.
Intelligence collection results in large databases holding records that are expected to have intelligence value. (Table 2.1 provides a hypothetical example of records in such a database.) In counterterrorism investigations, an analyst generally starts with a “seed,” an identifier of a communications endpoint that has been obtained in the course of intelligence gathering and is deemed relevant to a possible threat. The analyst uses the seed identifier to formulate one or more queries of the databases to seek more information, for example, identifiers for other parties communicating with the seed. The analyst may also query for communications content, if it exists or can be obtained. Thus, analysts can build a pattern of a seed’s connections to other parties and/or to other data that provide a richer and fuller picture of that party’s role within a larger enterprise, such as a terrorist organization. Other databases may be consulted as well. In this way, analysts can build a network that depicts how parties of interest relate to one another and characterize the activities of each of the parties in a network or more formally structured enterprise.
4 The committee’s understanding, based on the briefings it received, is that most data incidentally collected about U.S. persons are never examined, because U.S. person data is not returned in response to analyst queries for foreign intelligence information.
Analysts use a variety of software tools as they work with SIGINT data. They may use tools to formulate queries or display the results (e.g., see Figure 3.1). They may set up “standing queries” (which need special approval) that run each day to report new events associated with their active targets. Using results of queries of the data, they build a record of data and evidence for investigations in a “working store,” a set of digital files separate from the SIGINT databases.
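The seed-based analysis described above, in which queries return the identifiers that communicate with a seed and the results are assembled into a network, can be sketched as a breadth-first traversal of call records. The Python example below is a minimal illustration, not a description of any actual analyst tool: the records and identifier names are made up, and the hop limit is an assumption of the sketch rather than something this chapter specifies.

```python
from collections import deque

# Hypothetical call detail records as (caller, called) pairs.
records = [
    ("seed", "A"),
    ("A", "B"),
    ("B", "C"),
    ("D", "seed"),
    ("E", "F"),
]


def contacts_within(seed, records, max_hops):
    """Return {identifier: hop count} for identifiers within max_hops of the seed."""
    # Build an undirected contact map: a call links both endpoints.
    neighbors = {}
    for caller, called in records:
        neighbors.setdefault(caller, set()).add(called)
        neighbors.setdefault(called, set()).add(caller)

    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbors.get(current, ()):
            if other not in hops:
                hops[other] = hops[current] + 1
                queue.append(other)
    return hops


network = contacts_within("seed", records, max_hops=2)
print(sorted(network.items()))
# prints: [('A', 1), ('B', 2), ('D', 1), ('seed', 0)]
```

Identifiers `E` and `F` never appear in the result because no chain of calls connects them to the seed, and `C` is excluded only because it lies three hops away.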
The last step in the SIGINT process is dissemination. SIGINT analysts will routinely disseminate the results of their work to others, both inside and outside the IC. For example, NSA analysts working on a specific terrorism investigation might disseminate their findings to other analysts and collectors who are working on related issues or directly to policy makers who may choose to take action based on the SIGINT.
Like the initial collection, SIGINT dissemination is governed by various laws and regulations designed to protect the sources and methods involved in the collection as well as the privacy and civil liberties of the subjects of the collection, especially if the intelligence involves U.S. persons.5 With respect to the latter, and pursuant to U.S. Signals Intelligence Directive (USSID) 18,6 such reports will normally cloak the identity of U.S. persons until a reader of the report specifically asks for the identity to be disclosed and provides a valid reason for the release, such as initiating a further investigation. This process is designed to ensure that both the requesting agency and NSA, as the disseminator of the information, can verify that disclosing this sensitive information is appropriate and necessary to understand the foreign intelligence value of the report.
Presidential Policy Directive 28 (PPD-28) asks whether it is feasible to create software that could replace “bulk collection” with “targeted collection.”
5 Section 4 of PPD-28 indicates that the IC should endeavor to give the same protections to foreign persons as to U.S. persons with regard to the retention and dissemination of identifying information.
6 National Security Agency, “United States Signals Intelligence Directive USSID SP0018, (U) Legal Compliance and U.S. Persons Minimization Procedures,” Issue Date January 25, 2011, approved for release on November 13, 2013, referred to as USSID 18, http://www.dni.gov/files/documents/1118/CLEANEDFinalUSSIDSP0018.pdf.
Bulk collection results in a database in which a significant portion of the data pertains to identifiers not relevant to current targets. Such items usually refer to parties that have not been, are not now, and will not become subjects of interest. Moreover, they are not closely linked to anyone of that sort: knowing to whom these parties talk will not help locate threats or develop more information about threats. Bulk collection occurs because it is usually impossible to determine at the time of filtering and collection that a party will have no intelligence value. Although the amount of information retained from bulk collection is often large, and often larger than the amount of information retained from targeted collection, it is not the size of such collections that makes them “bulk.” Rather, it is the (larger) proportion of extra data beyond currently known targets that defines them.
Targeted collection tries to reduce, insofar as possible, items about parties with no past, present, or future intelligence value. This is achieved by using discriminants that narrowly select relevant items to store. For example, if the email address email@example.com was obtained from a terrorist’s smartphone when he was arrested, using a discriminant to instruct the filter to save only “email to or from email@example.com” would result in a targeted collection. Some or many of the people communicating with this person might turn out to have no intelligence value, but the collection is far more selective than, say, collecting all email to or from anyone with an email address served by aol.com. A discriminant could be a top-level Internet domain, a country code (e.g., .cn for China, .fr for France), a date on which communication occurred, a device type, and so on. A discriminant could even refer to the content in a communication, such as “all email with the word ‘nuclear’ in it.” Note that if a discriminant is broadly crafted, the filter may retain such a large proportion of data on people of no intelligence value that the collection cannot be called “targeted.”
PPD-28 seeks ways to reduce or avoid bulk collection in order to increase privacy and civil liberty protections for those not relevant to the intelligence collection purposes. Note that there is no precise threshold in collecting data on such “harmless” persons that will distinguish between bulk and targeted; it’s a matter of degree. Also note that the bulk/targeted distinction applies broadly to different data types: telephony content, metadata, business records, Internet searches, and so on.
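The point that proportion, not size, separates bulk from targeted collection can be made concrete with a small calculation. The Python sketch below is illustrative only: the records, identifiers, and the function name `non_target_fraction` are invented for the example, and no cutoff value is applied because the text gives no precise threshold.

```python
def non_target_fraction(retained, targets):
    """Fraction of retained records in which neither endpoint is a current target."""
    if not retained:
        return 0.0
    unrelated = [
        record for record in retained
        if record[0] not in targets and record[1] not in targets
    ]
    return len(unrelated) / len(retained)


targets = {"t1"}

# A small store kept with no discriminant: mostly unrelated parties.
small_unfiltered = [("t1", "a"), ("b", "c"), ("d", "e"), ("f", "g")]

# A much larger store kept with a narrow discriminant on the target.
large_filtered = [("t1", f"contact{i}") for i in range(1000)]

print(non_target_fraction(small_unfiltered, targets))  # prints: 0.75
print(non_target_fraction(large_filtered, targets))    # prints: 0.0
```

In this toy example, the four-record store is the more “bulk-like” of the two even though the other store is 250 times larger.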
7 The White House, Presidential Policy Directive/PPD-28, “Signals Intelligence Activities,” Office of the Press Secretary, January 17, 2014, http://www.whitehouse.gov/sites/default/files/docs/2014sigint_mem_ppd_rel.pdf.
information collection that may yield extremely valuable information about threats unknown at the time of collection and less intrusive information collection that may miss information about dangerous threats.
Bulk and targeted collection can apply to many different kinds of communication modalities—telephone, email, instant message, and so on. Various web-based applications, such as electronic banking or online shopping that allow users to exchange information electronically, are among these modalities, even if they are not usually thought of as means for communication, per se. Obtaining phone metadata under Section 215 authority also counts as bulk collection.
The laws, regulations, court rulings, and other writings about SIGINT use a number of terms to describe intelligence gathering and analysis. These terms are not always used precisely or consistently. Intelligence and law enforcement cultures use different words for the same concept or the same word for slightly different concepts. It is easy, when describing and debating intelligence processes, to stumble over problems of definition rather than of substance. Indeed, for several years in some instances, NSA analysts were accessing the database of domestic telephone metadata without proper reasonable and articulable suspicion (RAS) authority; this was due to differing NSA and FISC definitions of the word “archive.”8 Several presenters to the committee acknowledged these problems and indicated that the IC is continuing to work on them. The term target is also used loosely and in different forms throughout the community.
8 John DeLong testimony to committee; see also Memorandum of the United States in Response to the Court’s Order Dated January [sic] 28, 2009 at 11, In re Production of Tangible Things From [REDACTED], No. BR 08-13 (FISA Ct. February 17, 2009), http://www.dni.gov/files/documents/section/pub_Dec%2012%202008%20Supplemental%20Opinions%20from%20the%20FISC.pdf.
The preceding section addresses the definitions of bulk and targeted collection; this section provides working definitions adopted by the committee for several other key terms. For the purposes of this report, the committee has formulated the following lexicon (see Figure 2.2):
• Identifier: A text or bit string that denotes a communication end point.
• Unknown: An identifier that may or may not have intelligence value.
• Ruled out: An identifier that has been determined to have no intelligence value at the present time.
• Subject of interest: An identifier that may have intelligence value and is likely to be part of an intelligence investigation.
• Target: A subject of interest that may be a security threat.
• Seed: A subject of interest that is used as the starting point for an intelligence investigation.
• RAS target: A target for which there is a reasonable and articulable suspicion that the person is associated with a foreign terrorist organization.9
FIGURE 2.2 Classification of identifiers used in signals intelligence analysis.
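One way to read the lexicon above is as a small data model in which each category narrows the one before it. The Python sketch below is only an interpretation under stated assumptions: the class, field, and property names are hypothetical, and treating “target,” “seed,” and “RAS target” as flags layered on a subject of interest is a modeling choice made for this example, not a structure taken from the report.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNKNOWN = "unknown"                        # may or may not have intelligence value
    RULED_OUT = "ruled out"                    # no intelligence value at present
    SUBJECT_OF_INTEREST = "subject of interest"


@dataclass
class Identifier:
    value: str                                 # text or bit string for an endpoint
    status: Status = Status.UNKNOWN
    possible_threat: bool = False              # hypothetical flag
    used_as_seed: bool = False                 # hypothetical flag
    has_ras: bool = False                      # hypothetical flag

    @property
    def is_subject_of_interest(self) -> bool:
        return self.status is Status.SUBJECT_OF_INTEREST

    @property
    def is_target(self) -> bool:
        # A target is a subject of interest that may be a security threat.
        return self.is_subject_of_interest and self.possible_threat

    @property
    def is_seed(self) -> bool:
        # A seed is a subject of interest used to start an investigation.
        return self.is_subject_of_interest and self.used_as_seed

    @property
    def is_ras_target(self) -> bool:
        # A RAS target is a target with reasonable and articulable suspicion.
        return self.is_target and self.has_ras


unknown = Identifier("+1-415-555-0103")
ras_seed = Identifier("x000325", Status.SUBJECT_OF_INTEREST,
                      possible_threat=True, used_as_seed=True, has_ras=True)

print(unknown.is_target, ras_seed.is_ras_target, ras_seed.is_seed)
# prints: False True True
```

Modeling the narrower categories as derived properties means an identifier cannot be marked as a RAS target without also being a target and a subject of interest.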
For the purposes of this report only, and realizing that they may have different and possibly broader meanings in the IC, the committee uses the working definitions presented in Box 2.1, drawn from statutes and its understanding of IC practices in the context of SIGINT and technology. (For definitions used by the U.S. SIGINT System, see USSID SP0018, Section 9.10)
9 RAS is a term of art used in the context of Section 215 collection. See David Kris, On the bulk collection of tangible things, Journal of National Security Law and Policy 7:209, 2014.
10 National Security Agency, USSID 18, approved for release in 2013.
BOX 2.1 Working Definitions in Signals Intelligence and Technology
|identifier||
A text or bit string that denotes a communication endpoint, such as a telephone number, mobile phone subscriber number, Internet Protocol (IP) address, or email address.
|subject of interest||
An identifier of a party (person, group) that may have intelligence value and is likely to be part of an intelligence investigation.
|target (n, adj)||
A subject of interest in an intelligence investigation. This term is used liberally by the Intelligence Community (IC) to denote an identifier or person that is the subject of interest or surveillance.
A target need not be the principal subject of interest. For example, an associate of a known threat might be a target.
Note that a target can be a computer identified by its IP address.
Target identifiers may be used in selectors or discriminants to obtain, from a large collection of data, data pertaining only to the target.
|seed||
An initial target used to start an intelligence investigation.
|RAS target||
A target for which there is a reasonable, articulable suspicion (RAS) that it is associated with a foreign terrorist organization. Foreign Intelligence Surveillance Act Section 215 requires a RAS target designation to permit certain queries.
In 2012, fewer than 300 identifiers met the RAS standards and were used as seeds in the Section 215 collection.1
|query||
Detailed instructions for searching a database of collected data.
Note: this is consistent with computer technology usage, and is akin to an SQL query.
A query may have several “terms” or “selectors”:
Example: “calls made from identifier x000325 after July 2, 2013”
Example: “Internet search requests using the term ‘sarin’ or emails containing ‘poison gas’”
|discriminant||
Same meaning as query, but used in conjunction with filtering applied as part of collection. Discriminants must be simple enough to be applied in real time as signals intelligence (SIGINT) data is extracted and filtered.
This word appears explicitly in Presidential Policy Directive 28 (PPD-28) as part of the definition of “targeted collection.”
Example: “all the email addresses used in communications to or from Yemen”
|selector||
(usually) A query term that cites a specific identifier. (sometimes) Any query term.
Example: “calls made from identifier x000325”
Example: “calls made from identifier x000235 or identifier y4576”
|collection (of SIGINT data)||
Storing SIGINT data on government-controlled information technology (IT) systems so as to enable authorized access by IC analysts and the software tools they use.
Storing on a government-controlled IT system the results of a query of an external database constitutes collection in the sense of the committee’s definition.
Under the proposal to have communications carriers retain call detail records and allow authorized access to those records by IC analysts, the records at the carrier are not “collected” in the sense of the committee’s definition, but if the records transmitted to the government in answer to a query are stored, they are considered collected.
|bulk collection||
Collection in which a significant portion of the retained data pertains to identifiers that are not targets at the time of collection.
Note: Although the term “bulk” suggests that the set of collected data is large, and bulk data can indeed be large, size alone is not the controlling factor in defining bulk collection.
|targeted collection||
Collection that stores only the SIGINT data that remains after a filter discriminant removes most non-target data.
|minimization procedures||
Procedures, approved by the Foreign Intelligence Surveillance Court (FISC), that must be “reasonably designed in light of the purpose and technique of the particular surveillance, to minimize the acquisition and retention, and prohibit the dissemination, of nonpublicly available information concerning unconsenting U.S. persons consistent with the need of the United States to obtain, produce, and disseminate foreign intelligence information.”
See, e.g., 50 U.S.C. §§ 1801(h)(1) and 1821(4)(A); USSID-18.2
|metadata||
Loosely, “data about data,” distinct from the data itself (“contents”).
Sometimes called “non-content data.”3
There is no standard definition that enumerates metadata elements associated with a telephone call or an email transmission; instead, statutes and court orders that authorize collection of metadata list explicitly the elements that can be collected. However, metadata does not include “content of any telephone call, or the names, addresses, or financial information of any party to a call.”4
In telephony, generally includes calling and called numbers, duration of call, time of the call, and perhaps more information.
In email, generally includes from and to email addresses, time of sending, IP addresses of email services, and the like. The “subject” and “re” fields of email headers are considered content, not metadata.5
|call detail record (CDR)||
Business records kept by telephone service providers (usually for billing purposes) detailing for each call information such as the calling and called numbers, the time and duration of the call, and possibly additional information.
Sometimes called “telephony metadata.”
|business records||
“Tangible things, including books, records, papers, documents and other items.”
Further, the FISC has ruled that the electronic form of these records counts as a tangible thing whose production can be compelled under Section 215.
|foreign intelligence information||
“Foreign intelligence means information related to the capabilities, intentions, or activities of foreign governments or elements thereof, foreign organizations, foreign persons, or international terrorists.” (PPD-28 and Executive Order 12333).
For details see Foreign Intelligence Surveillance Act of 1978.8 Foreign intelligence collection priorities are set annually at the policy level.
|U.S. person||
“A citizen of the United States, an alien lawfully admitted for permanent residence, an … association [of citizens and permanent residents] or a corporation which is incorporated in the United States …”
For details see Foreign Intelligence Surveillance Act of 1978.9
1 “Administration White Paper: Bulk Collection of Telephony Metadata Under Section 215 of the USA Patriot Act,” August 9, 2013, p. 4. Found various places online, including http://big.assets.huffingtonpost.com/Section215.pdf.
2 National Security Agency, “United States Signals Intelligence Directive USSID SP0018, (U) Legal Compliance and U.S. Persons Minimization Procedures,” Issue Date January 25, 2011, approved for release on November 13, 2013, referred to as USSID 18, http://www.dni.gov/files/documents/1118/CLEANEDFinalUSSIDSP0018.pdf.
3 See U.S. Department of Justice, Justice News, “Acting Assistant Attorney General Elana Tyrangiel Testifies Before the U.S. House Judiciary Subcommittee on Crime, Terrorism, Homeland Security, and Investigations,” March 19, 2013, http://www.justice.gov/iso/opa/doj/speeches/2013/olp-speech-1303191.html.
4 “Administration White Paper,” 2013.
5 FISC ruling, U.S. Foreign Intelligence Surveillance Court, “In Re Motion of Propublica, Inc. for the Release of Court Records, Docket No.: Misc. 13-09, The United States’ Opposition to the Motion of Propublica, Inc. for the Release of Court Records,” http://www.dni.gov/files/documents/1118/CLEANEDPRTT%201.pdf, p. 11.
6 USA Patriot Act 2001, http://www.gpo.gov/fdsys/pkg/PLAW-107publ56/pdf/PLAW-107publ56.pdf, p. 17.
7 “Administration White Paper,” 2013.
8 Ibid., p. 2, item (e).
9 Ibid., p. 4, item (i).