Part I
Thinking About Privacy

Chapter 1 (“Thinking About Privacy”) introduces many of the concepts needed for an informed discussion about privacy. The chapter underscores that privacy is an elusive concept, even though many people have strong intuitions about what it is. Indeed, privacy is seen to be a concept that acquires specific meaning only in the context of specific circumstances and settings. Notions of privacy are influenced by many factors, including technological change, societal and organizational change, and changes in immediate circumstances. Relevant technical issues include concepts of false positives and false negatives, the nature of personal information, the distinction between privacy and anonymity, fair information practices, and reasonable expectations of privacy.





1

Thinking About Privacy

Just as recent centuries saw transitions from the agricultural to the industrial to the information age, with their associated societal and technological changes, the early 21st century will continue to pose dynamic challenges in many aspects of society. Most importantly from the standpoint of this report, advances in information technology are proceeding apace. In this rapidly changing technological context, individuals, institutions, and governments will be forced to reexamine core values, beliefs, laws, and social structures if their understandings of autonomy, privacy, justice, community, and democracy are to continue to have meaning. A central concept throughout U.S. history has been the notion of privacy and the creation of appropriate borders between the individual and the state. In the late 19th century, as industrial urban society saw the rise of large bureaucratic organizations, notions of privacy were extended to the borders between private organizations and the individual. This report focuses on privacy and its intersections with information technology and associated social and technological trends.

1.1 INTRODUCTION

One of the most discussed and worried-about aspects of today's information age is the subject of privacy. Building on a number of other efforts directed toward analyzing trends and impacts of information technology (including the evolution of the Internet, a variety of information security issues, and public-private tensions regarding uses of information and information technology), the National Research Council saw a need for a comprehensive assessment of privacy challenges and opportunities and thus established the Committee on Privacy in the Information Age.

The committee's charge had four basic elements:

• To survey and analyze potential areas of concern—privacy risks to personal information associated with new technologies and their interaction with non-technology-based risks, the incidence of actual problems relative to the potential for problems, trends in technology and practice that will influence impacts on privacy, and so on;

• To evaluate the technical and sociological context for those areas of concern as well as new collection devices and methodologies—why personal information is at risk given its storage, communication, combination with other information, and various uses, as well as trends in the voluntary and involuntary (and knowing and unknowing) sharing of that information;

• To assess what is and is not new about threats to the privacy of personal information today, taking into account the history of the use of information technology over several decades and developments in government and private sector practices; and

• To examine the tradeoffs (e.g., between more personalized marketing and more monitoring of personal buying patterns) involved in the collection and use of personal information, including the incidence of benefits and costs,1 and to examine alternative approaches to the collection and use of personal information.

Further, in an attempt to paint a big picture that would at least sketch the contours of the full set of interactions and tradeoffs, the charge called for these analyses to take into account changes in technology; business, government, and other organizational demand for and supply of personal information; and the increasing capabilities of individuals to collect and use, as well as disseminate, personal information. Within this big picture, and motivated by changes in the national security environment since the September 11, 2001, attacks on the World Trade Center and the Pentagon, the committee addressed issues related to law enforcement and national security somewhat more comprehensively than it did other areas in which privacy matters arise.

To what end does the committee offer this consideration of privacy in the 21st century? Most broadly, to raise awareness of the spider web of connectedness among the actions we take, the policies we pass, the expectations we change, and the "flip side" impacts that policies have on privacy.

1 Throughout this report, the term "benefits and costs" should be construed broadly, and in particular should not be limited simply to economic benefits and costs.

There should not be unintended consequences to privacy created by policies we write or change to address the continuing shifts in our society. We may decide to tolerate erosion on one side of a continuum—privacy and security sometimes pose a conflict, for example. We may decide it makes sense to allow security personnel to open our bags, to carry a "trusted traveler" card, or to accept "profiling" of people for additional examination. But we should not be surprised by the erosion of our own and other people's privacy that results from this shift in the continuum. Policies may create a new and desirable equilibrium, but they should not create unforeseen consequences.

The goal here is not to label as "good" or "bad" the changes in the continuums along which privacy moves, or the policies, technologies, and laws involved. Rather, the committee hopes that this report will contribute to a recalibration of the many issues that play a part in privacy and to the analysis of those issues. The degree of privacy traded for security or public health, for example, should be the result of thoughtful decisions following public discussion in which all parties can participate. Only then will the policies that emerge from the pressures at work during the early years of the 21st century be understood in their impacts on privacy.

To be clear, the committee does not claim that this report presents comprehensive solutions to the many privacy challenges confronting society today. Nor does it provide a thorough and settled definition of privacy. Debate will continue on this complicated and value-laden topic for the foreseeable future. This report does provide ways to think about privacy, its relationship to other values, and related tradeoffs. It emphasizes the need to understand context when evaluating the privacy impact of a given situation, piece of legislation, or technology. And it provides an in-depth look at ongoing information technology trends as related to privacy concerns.

1.2 WHAT IS PRIVACY?

The committee began by trying to understand what privacy is, and it quickly found that privacy is an ill-defined but apparently well-understood concept. It is ill-defined in the sense that people use the term to mean many different things. Any review of the literature on privacy will reveal that privacy is a complicated concept that is difficult to define at a theoretical level under any single, logically consistent "umbrella" theory, even if there are tenuous threads connecting the diverse meanings. Specifying the concept in a way that meets with universal consensus is a difficult if not impossible task, as the committee found in doing its work.

At the same time, the term "privacy" is apparently well understood in the sense that most people using the term believe that others share their particular definition. Nonetheless, privacy resists a clear, concise definition because it is experienced in a variety of social contexts. For example, a question may be an offensive privacy violation in one context and a welcome intimacy in another.

The committee believes that in everyday usage, the term "privacy" generally includes reference to the types of information available about an individual, whether they are primary or derived from analysis. These types of information include behavioral, financial, medical, biometric, consumer, and biographical. Privacy interests also attach to the gathering, control, protection, and use of information about individuals. Informational dimensions of privacy thus constitute a definitional center of gravity for the term as it is used in this report, even while recognizing that the term may in any given instance entail other dimensions as well—other dimensions that are recognized explicitly in the discussion.2

The multidimensional nature of privacy is explicated further in Chapter 2, and a theme that becomes apparent is the situational and contextual nature of privacy—that is, it depends on a number of specific factors that often do not cleanly and clearly overlap, rather than being identified by a sweeping universal calculus or definition. Moreover, privacy in any given situation may be in tension with other values or desires of the individual, subgroups, and society at large. Privacy, like most other values in modern democratic societies, is not an absolute but rather must be interpreted and weighed alongside other socially important values and goals. How this balancing (which need not mean equivalent weighing) is to be achieved is often the center of the controversy around privacy, because different people and groups balance these values that are in tension in different ways.

A further complication is that participants in the balancing debate often confuse the needs of privacy with other values that might be tied to privacy but that are, in fact, distinct from it. For example, concerns over whether an individual's HIV status should be private may in fact reflect, in part, a concern about his or her ability to obtain health insurance. In short, as with most interesting and contentious social topics, where privacy is concerned there are both costs and benefits, and these vary by the group, context, and time period in question, as well as by the means used to measure them. Sometimes, tradeoffs are inevitable (Box 1.1 provides some illustrative examples).

2 The term "private" can have both descriptive and normative meanings. To describe information as "private information" might mean "information that is not accessible to others," or it could mean "information that should not be accessible to others." Generally the context will specify the meaning, but these two different meanings are noteworthy.

BOX 1.1
Some Illustrative Tradeoffs in Privacy

• Government or privately controlled cameras monitoring the movement of ordinary citizens in public places for the stated purpose of increasing public safety.

• Government collection of data on people's political activities for the stated purpose of increasing public safety or homeland security.

• Collection by a retailer of personal information about purchases for the stated purpose of future marketing of products to specific individuals.

• Collection by a bank of personal financial information about an individual for the stated purpose of evaluating his or her creditworthiness for a loan.

• Aggregation by insurers of medical data obtained through third parties for the stated purpose of deciding on rates or availability of health insurance for an individual.

• Provision of information to law enforcement agencies about library patrons (including who they are and what they read or saw in the library) for the stated purpose of increasing public safety or homeland security, coupled with a prohibition on discussing or acknowledging that this has been done.

• Availability of public government records (including criminal records, family court proceedings, real estate transactions, and so on, formerly available only in paper format) on the World Wide Web for the stated purpose of increasing the openness of government.

• Geographic tracking of cell-phone locations at all times for the stated purpose of enabling emergency location.

Note also that privacy concerns are often grounded in information that may be used for purposes other than a stated purpose. Indeed, in each of the examples given above, another possible—and less benign—purpose might easily be envisioned and thus might change entirely one's framing of a privacy issue.

Advocates for various positions who argue vigorously for a given policy thus run the risk of casting their arguments in unduly broad terms. Though rhetorical excesses are often a staple of advocacy, in truth the factors driving the information age rarely create simple problems with simple solutions.

Perhaps the best known of the general tradeoffs in the privacy debate is the one that contrasts privacy with considerations of law enforcement and national security. At this writing, there is considerable debate over the Bush administration's use of warrantless wiretapping in its counterterrorism efforts against al-Qaeda. Furthermore, the USA PATRIOT Act, passed in the immediate wake of the September 11, 2001, attacks on the World Trade Center and the Pentagon and extended and amended in early 2006, changed a number of privacy-related laws in order to facilitate certain law enforcement and national security goals. (Chapter 9 contains an extensive discussion of these issues.)

But the law enforcement/national security versus privacy debate is hardly the only example of such tradeoffs. Box 1.1 provides some illustrations. Privacy concerns interact with the delivery of health care, with the information needed to contribute to public health, and with the information needed to discover and understand risk factors that any individual may have. Privacy concerns interact with the ability to do long- and short-term sociological studies. Techniques that are believed to increase productivity and profitability may come at a cost to the privacy interests of many consumers and workers. Privacy concerns are also reflected in the debates about new forms of intellectual property.

Privacy concerns also interact with sociological and policy research. In order to conduct these kinds of research, substantial amounts of personal information are often necessary. In general, however, these data never have to be associated with specific individuals. This situation contrasts sharply with the societal needs described above: law enforcement authorities are interested in apprehending a specific individual guilty of criminal wrongdoing, national security authorities are interested in identifying a particular terrorist, or a business wants to identify a specific customer who will buy a product. For these reasons, protected data collections such as those found in social science data archives and census public-use files serve the interests of groups and communities with less controversy; when controversy does exist, it usually relates to whether the data contained in these files and archives are sufficiently anonymized, or to specific nonstatistical uses of these data.

Tradeoffs are also not limited to the value of information to an organization versus the value of privacy to an individual—they also arise within the situation of an individual alone. For example, an individual might regard his or her personal information as a commodity that can be traded freely in exchange for some other good or service of value—and thus he or she might well be willing to provide personal information on shopping habits at a chain drugstore or supermarket in exchange for a 2 percent discount on all purchases. Furthermore, even if the tradeoffs do appear to pit value to an organization against value to an individual, some would argue that there is benefit to the individual as well (albeit not specific benefit to him or her) if the organization can be construed as "all or most of society." This point is discussed in greater detail in Section 6.4.

Not only are these tradeoffs complex, difficult, and sometimes seemingly intractable, but they are also often not made explicit in the discussions that take place around the policies that, when enacted, quietly embody the value tradeoffs. Clarifications on these points would not necessarily relieve the underlying tensions, but they would likely help illuminate the contours of the debate.

A major purpose of this report is to contribute to that illumination.

1.3 AN ILLUSTRATIVE CASE

In early 2005, a firm known as ChoicePoint announced that "a crime committed against ChoicePoint … MAY have resulted in [consumer] name[s], address[es], and Social Security number[s] being viewed by businesses that are not allowed access to such information."3 Specifically, ChoicePoint reported that "several individuals, posing as legitimate business customers, recently committed fraud by claiming to have a lawful purpose for accessing information about individuals, when in fact, they did not." ChoicePoint explained its business as verifying for its business customers information supplied by individuals as part of a business transaction, often as part of an application for insurance, a job, or a home loan. ChoicePoint notified approximately 143,000 individuals that their personal information might have been compromised.

In early 2006, the U.S. Federal Trade Commission (FTC) announced that ChoicePoint would pay $15 million in fines and other penalties for lax security standards in verifying the credentials of its business customers. Furthermore, the FTC noted that "this breach occurred because ChoicePoint failed to implement reasonable and appropriate procedures for approving new customers and for monitoring existing ones."4 It also said that more than 800 cases of identity theft arose from this breach in security.

For purposes of this study, the truth or falsity of the FTC's allegations about ChoicePoint's security practices per se is not relevant. But what is relevant is that the personal information of more than 143,000 individuals was released to parties that did not have a lawful purpose in receiving that information, and that a number of cases of identity theft arose from this release. Several questions immediately come to mind:

1. How is ChoicePoint able to aggregate such voluminous information? The data that ChoicePoint collects on individuals include criminal histories, Social Security numbers, and employment histories.

2. Why do ChoicePoint and other similar firms collect such voluminous data on individuals?

3. What was the harm suffered by the individuals whose identities were not stolen? Eight hundred individual cases of identity theft were attributed to the breach, a number corresponding to about ½ of 1 percent of the 143,000 individuals involved.

4. To what extent were the individuals notified by ChoicePoint surprised by the existence of such aggregations of personal data?

3 "Choicepoint's Letter to Consumers Whose Information Was Compromised," CSO Magazine, available at http://www.csoonline.com/read/050105/choicepoint_letter.html.

4 Federal Trade Commission, "ChoicePoint Settles Data Security Breach Charges; to Pay $10 Million in Civil Penalties, $5 Million for Consumer Redress," available at http://www.ftc.gov/opa/2006/01/choicepoint.htm.

Question 1 points to the availability of great quantities of personal information on a large scale to organizations that have no direct involvement in the creation of the data. ChoicePoint is not the primary collector of such information; it is an aggregator of it. Question 1 also points to the fact that information collected for one purpose (e.g., a job application with a certain employer) can be "repurposed" and used for an entirely different purpose (e.g., verification of job history in connection with a background investigation).

Question 2 points to a demand on the part of private businesses and government agencies for personal information about their employees and customers. Indeed, such information is so important to these businesses and government agencies that they are willing to pay to check and verify the accuracy of information provided by employees and customers. (Note also that by insisting that employees and customers provide personal information, these businesses and agencies often add to the personal information that is available to data aggregators.)

Question 3 focuses attention on the value of privacy and the nature of the harm that can accrue to individuals when their privacy is breached even if they have not been the victims of identity theft. In this case, the answer is that these individuals suffer the same harm that Damocles experienced when he was partying and feasting under the sword. No physical harm came to Damocles, yet the cost to his sense of well-being was high indeed. A person whose privacy has been breached is likely to be concerned about the negative consequences that might flow from the breach, and those kinds of psychological concerns constitute a type of actual though intangible harm, entirely apart from the other kinds of tangible harm that the law typically recognizes. A second kind of intangible harm experienced by Damocles might have been his reluctance to engage in dancing and making loud noises that might have caused the thread holding the sword to break—a so-called chilling effect on his activities and behaviors. In short, harm need not be tangible to be real or actual.5

5 Nor is "harm" a concept that is relevant only to individuals. As Section 2.3 addresses in greater detail, certain kinds of harm may relate to groups or to society as a whole. Group or societal harms may be related to individually suffered harm, but they are conceptually separate notions.

Question 4 alerts us to issues involving the commodification of personal information and its being treated as a kind of marketable property to be used as those who come to possess it choose. Question 4 also calls attention to several collateral issues surrounding privacy. In this case, the issue is the role of notification in privacy, and whether notification that personal information is being collected about an individual in any sense ameliorates any breach of privacy that might be associated with that collection. Given legal requirements to notify individuals after privacy violations have been documented, are such violations thus less likely?

Questions and issues such as these recur frequently in this report, although in no sense do these examples exhaust the kinds of questions and issues that arise. Privacy provides a useful filter through which to think about individual and societal benefits and costs.

1.4 THE DYNAMICS OF PRIVACY

Privacy is part of a social context that is subject to a range of factors. While a relationship between privacy and society has always existed, the factors (or pressures) affecting privacy in the information age are varied and deeply interconnected. These factors, individually and collectively, are changing and expanding in scale with unprecedented speed relative to our ability to understand and contend with their implications for our world in general and our privacy in particular. Some of these factors include the volume, magnitude, complexity, and persistence of information; the expanding number of ways to collect information; the number of people affected by information; and the geographic spread and reach of information technology.

1.4.1 The Information Age

What is meant by the term "information age," and what are the factors so profoundly affecting the dynamics of privacy? With respect to the information age, a great deal has been written about the fact that almost no part of our lives is untouched by computing and information technology. These technologies underlie new ways of collecting and handling information that in turn have ramifications throughout society, as they mediate much private and public communication, interaction, and transactions. They are central components of contemporary infrastructures involving (but certainly not restricted to) commerce, banking and finance, utilities, communications, national defense, education, and entertainment.

[...] those in the latter category up the chain of command for further investigation. A false positive is someone in the latter category who, upon further investigation, has no terrorist connection at all. A false negative is someone in the former category who should have received further investigation but did not.

Two important points arise in this discussion. First, for a given database and a given analytical approach, false positives and false negatives are in some sense complementary. More precisely, for a given database, one can drive the rate of false positives to zero, or the rate of false negatives to zero, but not both simultaneously. For example, it is easy to identify all individuals who are bad credit risks—just deny everyone credit. This approach catches all of the bad credit risks—but it also results in a huge number of false negatives (creditworthy individuals who are denied credit). Decreases in the false positive rate are inevitably accompanied by increases in the false negative rate, and vice versa, though not necessarily in the same proportion. However, if the quality of the data is improved, or if the classification technique is improved, it is possible to reduce both the false positive rate and the false negative rate.

Second, identifying false negatives in any given instance may be problematic. In the case of credit card issuers, the bank will probably not issue cards to the apparent bad credit risks. Thus, it may never learn that some of these individuals are in fact creditworthy—and these individuals may forevermore be saddled with another declination of credit on their records without being given the chance to prove their creditworthiness. In the case of the terrorist investigation, it is essentially impossible to know whether a person is a false negative until he or she commits the terrorist act.

False positives and false negatives are important in a discussion of privacy because they are the language in which the tradeoffs described in Section 1.2 are often cast. Banks obtain personal information on individuals for the purpose of evaluating their creditworthiness. All of these individuals surrender some financial privacy, but some do not receive the benefit of obtaining credit, and some of those not receiving credit are in fact deserving of it. A law enforcement official may obtain personal information on individuals in searching for evidence of criminal activity. All of these individuals surrender some privacy, and those who have not been involved in criminal activity have had their privacy violated despite the lack of such involvement.
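The complementary relationship between false positive and false negative rates can be made concrete with a small sketch. The scores, labels, and thresholds below are hypothetical and are not drawn from the report; the sketch simply adopts the credit-granting framing used above, in which a false negative is a creditworthy applicant who is denied credit and a false positive is a bad credit risk who is approved. Moving the approval threshold trades one kind of error for the other.

```python
# Illustrative only: hypothetical scores and labels, not data from the report.
# Framing follows the credit example above: approving an applicant is the
# "positive" decision, so a false positive is a bad credit risk who is approved
# and a false negative is a creditworthy applicant who is denied.

applicants = [
    # (score from a hypothetical credit model, truly creditworthy?)
    (0.95, True), (0.90, True), (0.80, True), (0.75, False), (0.70, True),
    (0.60, False), (0.55, True), (0.40, False), (0.30, False), (0.10, False),
]

def error_counts(threshold):
    """Count false positives and false negatives at a given approval threshold."""
    fp = sum(1 for score, good in applicants if score >= threshold and not good)
    fn = sum(1 for score, good in applicants if score < threshold and good)
    return fp, fn

for threshold in (0.00, 0.50, 0.65, 0.78, 1.01):
    fp, fn = error_counts(threshold)
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")

# threshold=0.00 approves everyone: no false negatives, the most false positives.
# threshold=1.01 denies everyone ("just deny everyone credit"): no false
# positives, but every creditworthy applicant becomes a false negative.
```

In this toy setting, improving the data or the scoring technique shifts the whole tradeoff rather than merely moving along it, which is the sense in which better data quality can reduce both error rates at once.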

Data quality is the property of data that allows them to be used effectively, economically, and rapidly to inform and evaluate decisions.19 Typically, data should be correct, current, complete, and relevant. Data quality is intimately related to false positives and false negatives: using data of poor quality is likely to result in larger numbers of false positives and false negatives than would be the case if the data were of high quality.

Data quality is a multidimensional concept. Measurement error and survey uncertainty detract from data quality, as do issues related to measurement bias. But in the context of using large-scale data sets assembled by multiple independent parties using different definitions and processes, many other issues come to the fore as well. It is helpful to distinguish between data quality issues in a single database and those associated with a collection of databases.

Data quality issues for a single database include (but are not limited to):

• missing data fields;
• inconsistent data fields in a given record, such as recording a pregnancy for a 9-year-old boy;
• data incorrectly entered into the database, as might result from a typographical error;
• measurement error;
• sampling error and uncertainty;
• timeliness (or lack thereof);
• coverage or comprehensiveness (or lack thereof);
• improperly duplicated records;
• data conversion errors, as might occur when a database of vendor X is converted to a comparable database using technology from vendor Y;
• use of inconsistent definitions over time; and
• definitions that become irrelevant over time.

Data quality issues for multiple databases include all of the issues for a single database, and also the following (the first two are illustrated in the sketch below):

• syntactic inconsistencies (one database records phone numbers in the form 202-555-1212 and another in the form 2025551212);
• semantic inconsistencies (weight measured in pounds vs. weight measured in kilograms);
• different provenance for different databases;
• inconsistent data fields for records contained in different databases on a given data subject; and
• lack of universal identifiers to specify data subjects.

19 Alan F. Karr, Ashish P. Sanil, and David L. Banks, "Data Quality: A Statistical Perspective," Statistical Methodology 3:137-173, 2006; Thomas C. Redman, "Data: An Unfolding Quality Disaster," DM Review Magazine, August 2004, available at http://www.dmreview.com/article_sub.cfm?articleId=1007211; Wayne W. Eckerson, "Data Warehousing Special Report: Data Quality and the Bottom Line," May 1, 2002, available at http://www.adtmag.com/article.aspx?id=6321&page=; Y. Wand and R. Wang, "Anchoring Data Quality Dimensions in Ontological Foundations," Communications of the ACM 39(11):86-95, November 1996; and R. Wang, H. Kon, and S. Madnick, "Data Quality Requirements Analysis and Modelling," Ninth International Conference of Data Engineering, Vienna, Austria, 1993.
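The following is a minimal sketch of the reconciliation that such inconsistencies force on anyone merging databases. The records, field names, and conversion below are invented for illustration and are not drawn from the report; the sketch addresses only the syntactic (phone number format) and semantic (pounds versus kilograms) inconsistencies named above.

```python
# Hypothetical records from two sources; names, fields, and values are invented
# purely to illustrate the syntactic and semantic inconsistencies listed above.
db_x = [{"name": "A. Smith", "phone": "202-555-1212", "weight_lb": 150}]
db_y = [{"name": "A. Smith", "phone": "2025551212", "weight_kg": 68}]

def normalize_phone(raw):
    """Syntactic fix: keep digits only, so 202-555-1212 and 2025551212 compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())

def weight_kg(record):
    """Semantic fix: express weight in kilograms regardless of how it was recorded."""
    if "weight_kg" in record:
        return float(record["weight_kg"])
    return round(record["weight_lb"] * 0.45359237, 1)

def normalize(record):
    return {"name": record["name"],
            "phone": normalize_phone(record["phone"]),
            "weight_kg": weight_kg(record)}

for record in db_x + db_y:
    print(normalize(record))

# After normalization the two records can at least be compared; with no
# universal identifier, however, matching on name alone remains error-prone.
```

Even this toy reconciliation embeds judgment calls (which source to trust, how to round), which is why the provenance and identifier issues in the list above have no purely mechanical fixes.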

1.5.3 Privacy and Anonymity

Privacy is an umbrella concept within which anonymity is located. A vandal may break a window, but his or her identity may not be directly known. Someone may send an unsigned or pseudonymous e-mail, or make an anonymous charitable contribution. Anonymity may involve a protected right, as in the delivery of political messages, or it may simply be an empirical condition generated by stealth or circumstance—unsigned graffiti illustrates the former and "faceless" individuals in a crowd the latter.

The distinction between privacy and anonymity is clearly seen in an information technology context. Privacy corresponds to being able to send an encrypted e-mail to another recipient. Anonymity corresponds to being able to send the contents of the e-mail in plain, easily readable form but without any information that enables a reader of the message to identify the person who wrote it. Privacy is important when the contents of a message are at issue, whereas anonymity is important when the identity of the author of a message is at issue. Depending on the context, privacy expectations (and actualities apart from the rules) may extend to content, to the identity of the sender, or to both.

The relationship between privacy and anonymity can be made more formal. If personal information about an individual is denoted by the set P, the individual has privacy to the extent that he or she can keep the value of any element in the set private. Consider then another set Q, a subset of P, which consists of all elements that could be used—individually or in combination—to identify the individual. The anonymity of the individual thus depends on keeping Q private. For example, one might define a number of different sets: the set of all people with black hair, the set of all people who work for the National Academies, the set of all people who type above a certain rate, and so on. Knowledge that an individual is in any one of these sets does not identify that individual uniquely—he or she is thus "anonymous" in the usual meaning of the term. But knowledge that an individual is in all of these sets—that is, considering the intersection of all of these sets—might well result in the ability to identify the individual uniquely (and hence in the loss of anonymity).20

Note also that anonymity is often tied to the identification of an individual rather than the specification of that individual. A person may be specified by his or her complete genomic sequence, but in the absence of databases that tie that sequence to a specific identity, the person is still anonymous. A fingerprint may be found on a gun used in a murder, but the fingerprint does not directly identify the shooter unless the fingerprint is on file in some law enforcement databank. In short, the specification of a unique individual is not necessarily the same thing as identifying that individual.21

20 More precisely, Q is the set of all subsets of P that could be used to identify the individual. Imagine that elements P2, P4, and P17 of P could be used together to identify the individual, as could elements P2, P3, and P14 taken together, and elements P3, P7, and P14. Then anonymity would require that these three sets be kept private, that is, {P2, P4, P17}, {P2, P3, P14}, and {P3, P7, P14}. In practice, this might well imply keeping private the union of all these sets, {P2, P3, P4, P7, P14, P17}.
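The set-based formulation above can be illustrated with a small sketch. The population, names, and attributes below are invented; the point is only that attributes that are individually non-identifying can, in combination, narrow the set of matching people to one, which is the loss of anonymity described in the text.

```python
# Hypothetical population; names and attributes are invented for illustration.
population = {
    "Alice": {"black hair", "works at the National Academies", "fast typist"},
    "Bob":   {"black hair", "fast typist"},
    "Carol": {"black hair", "works at the National Academies"},
    "David": {"works at the National Academies", "fast typist"},
}

def matching_bin(*attributes):
    """Return everyone who has all of the given attributes (the 'bin' of matches)."""
    return {name for name, attrs in population.items()
            if all(a in attrs for a in attributes)}

# Any single attribute leaves the individual anonymous within a larger bin ...
print(matching_bin("black hair"))                       # {'Alice', 'Bob', 'Carol'}
print(matching_bin("works at the National Academies"))  # {'Alice', 'Carol', 'David'}

# ... but the intersection of all three attributes identifies one person uniquely.
print(matching_bin("black hair", "works at the National Academies", "fast typist"))
# {'Alice'} -- a bin of size one, and hence a loss of anonymity
```

In the notation of the text and footnote 20, keeping Q private amounts to ensuring that no released combination of attributes shrinks the matching set to a single person.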

An additional consideration is that "identification" usually means unique identification—using any of these sets would result in a bin size of one. In other words, in the usual discussion of anonymity, an anonymous person is someone whose identity cannot be definitively ascertained. However, for some purposes, even a bin size of three would be insufficient to protect a person's identity—if a stool pigeon for an organized crime syndicate were kept "anonymous" within a bin of size three, it is easy to imagine that the syndicate would be perfectly willing and able to execute three murders rather than one. Here again is a situational factor that contributes to the relative nature of such concepts.

The anonymity dimension of privacy is central to the problem of protecting data collected for statistical purposes. For example, many agencies of the federal government collect information about the state of the nation—from the national economy to household use of Medicare—in order to evaluate existing programs and to develop new ones. That information is often derived from data collected by statistical agencies or others under a pledge of confidentiality. A most critical data source is microdata, which include personal information about individuals, households, and businesses, and a central concern of the federal statistical agencies is that the responses provided by information providers will be less candid if their confidentiality cannot be guaranteed.22 (This issue is addressed at greater length in Section 6.8.)

This issue also arises explicitly, although in a somewhat different form, in contemplating the significance of an organization's privacy—that is, of information about an organization with which a number of individuals may be associated. Information about an organization can reveal information about individuals, even though it may not be uniquely associated with any one individual. For example, if a survey of employers shows that a company pays a large amount in employee health care benefits to medical care providers that specialize in treating AIDS, then it can be inferred that some employees of that company have AIDS. This fact may have significance for all of the employees—those with AIDS may face a greater likelihood of having their status revealed, and those without AIDS may face higher health care premiums in the future if their past employment history becomes known.

21 It is worth noting that despite the common "intuitively obvious" usage of the term "identity," identity is fundamentally a social construct and thus has meaning only in context. I may know a person who sends me e-mail only by his or her e-mail address, but the identity "JohnL7534@yahoo.com" may be entirely sufficient for our relationship—and it may not matter if his first name is really John, whether his last name begins with L, or even whether this person is male or female. In this sense, specification might be regarded as a decontextualized identification.

22 See, for example, National Research Council, Expanding Access to Research Data: Reconciling Risks and Opportunities, The National Academies Press, Washington, D.C., 2005; and National Research Council, Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics, National Academy Press, Washington, D.C., 1993.
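A minimal sketch of the bin-size idea discussed above, in the spirit of what the disclosure-limitation literature calls k-anonymity (a term the text does not use). The records, attributes, and the threshold k = 3 are hypothetical; the sketch simply counts how many released records share each combination of quasi-identifying attributes and flags bins that are too small to protect anonymity.

```python
from collections import Counter

# Hypothetical released records; (ZIP code, age range, employer) stand in for
# quasi-identifying attributes, and all values are invented for illustration.
records = [
    ("20001", "30-39", "Acme Corp."),
    ("20001", "30-39", "Acme Corp."),
    ("20001", "30-39", "Acme Corp."),
    ("20001", "40-49", "Acme Corp."),
    ("20002", "30-39", "Widgets Inc."),
]

K = 3  # hypothetical policy: at least 3 indistinguishable records per bin

for combination, size in Counter(records).items():
    status = "ok" if size >= K else "too small -- members are identifiable"
    print(combination, "bin size =", size, "->", status)
```

As the text notes, even a bin of size three may be too small when the consequences of identification are severe, so the appropriate bin-size threshold is itself a situational judgment.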

1.5.4 Fair Information Practices

Fair information practices are standards of practice required to ensure that entities that collect and use personal information provide adequate privacy protection for that information. These practices include giving individuals notice that personal information about them is being collected and making them aware of that collection, providing individuals with choices about how their personal information may be used, enabling individuals to review the data collected about them in a timely and inexpensive way and to contest those data's accuracy and completeness, taking steps to ensure that the personal information of individuals is accurate and secure, and providing individuals with mechanisms for redress if these principles are violated.

Fair information practices were first articulated in a comprehensive manner in the U.S. Department of Health, Education, and Welfare's 1973 report Records, Computers and the Rights of Citizens.23 This report was the first to introduce the Code of Fair Information Practices (Box 1.3), which has proven influential in subsequent years in shaping the information practices of numerous private and governmental institutions and is still widely accepted as the gold standard for privacy protection.24 From their origin in 1973, fair information practices "became the dominant U.S. approach to information privacy protection for the next three decades."25 The five principles not only became the common thread running through various bits of sectoral regulation developed in the United States, but they were also reproduced, with significant extension, in the guidelines developed by the Organisation for Economic Co-operation and Development (OECD).

23 U.S. Department of Health, Education, and Welfare, Records, Computers and the Rights of Citizens, Report of the Secretary's Advisory Committee on Automated Personal Data Systems, MIT Press, Cambridge, Mass., 1973.

24 Fair information principles are a staple of the privacy literature. See, for example, the extended discussion of these principles in D. Solove, M. Rotenberg, and P. Schwartz, Information Privacy Law, Aspen Publishers, 2006; Alan Westin, "Social and Political Dimensions of Privacy," Journal of Social Issues 59(2):431-453, 2003; Helen Nissenbaum, "Privacy as Contextual Integrity," Washington Law Review 79:101-139, 2004; and an extended discussion and critique in Roger Clarke, "Beyond the OECD Guidelines: Privacy Protection for the 21st Century," available at http://www.anu.edu.au/people/Roger.Clarke/DV/PP21C.html.

25 Westin, "Social and Political Dimensions of Privacy," 2003, p. 436.

BOX 1.3
Codes of Fair Information Practice

Fair information practices are standards of practice required to ensure that entities that collect and use personal information provide adequate privacy protection for that information. As enunciated by the U.S. Federal Trade Commission (other formulations of fair information practices exist),1 the five principles of fair information practice are:

• Notice and awareness. Secret record systems should not exist. Individuals whose personal information is collected should be given notice of a collector's information practices before any personal information is collected and should be told that personal information is being collected about them. Without notice, an individual cannot make an informed decision as to whether and to what extent to disclose personal information. Notice should be given about the identity of the party collecting the data, how the data will be used and the potential recipients of the data, the nature of the data collected and the means by which they are collected, whether the individual may decline to provide the requested data and the consequences of a refusal to provide them, and the steps taken by the collector to ensure the confidentiality, integrity, and quality of the data.

• Choice and consent. Individuals should be able to choose how personal information collected from them may be used, and in particular how it can be used in ways that go beyond those necessary to complete the transaction at hand. Such secondary uses can be internal to the collector's organization, or can result in the transfer of the information to third parties. Note that genuinely informed consent is a sine qua non for observation of this principle. Individuals who provide personal information under duress or threat of penalty have not provided informed consent—and neither have individuals who provide personal information as a requirement for receiving necessary or desirable services from monopoly providers of those services.

• Access and participation. Individuals should be able to review in a timely and inexpensive way the data collected about them, and similarly to contest those data's accuracy and completeness. Thus, means should be available to correct errors or, at the very least, to append notes of explanation or challenges that would accompany subsequent distributions of the information.

• Integrity and security. The personal information of individuals must be accurate and secure. To assure data integrity, collectors must take reasonable steps, such as using only reputable sources of data, cross-referencing data against multiple sources, providing consumer access to data, and destroying untimely data or converting them to anonymous form. To provide security, collectors must take both procedural and technical measures to protect against loss and against the unauthorized access, destruction, use, or disclosure of the data.

• Enforcement and redress. Enforcement mechanisms must exist to ensure that the fair information principles are observed in practice, and individuals must have redress mechanisms available to them if these principles are violated.

1 See http://www.ftc.gov/reports/privacy3/fairinfo.htm.

These principles are extended in the context of the OECD guidelines governing "the protection of privacy and transborder flows of personal data," which include eight principles that have come to be understood as "minimum standards … for the protection of privacy and individual liberties."26 The guidelines also include a statement about the degree to which data controllers should be accountable for their actions; this generally means that there are costs associated with the failure of a data manager to enable the realization of these principles.

1.5.5 Reasonable Expectations of Privacy

A common phrase in discussions of privacy is "reasonable expectation of privacy." The phrase has a long history in case law, having first been introduced in Katz v. United States, 389 U.S. 347 (1967), and it reflects the fact that expectations are shaped by tradition, common social practices, technology, law, regulations, the formal and informal policies of organizations that are able to establish their own rules for the spaces that they control, and the physical and social context of any given situation.

Expectations of privacy vary depending on many factors, but place and social relationships are among the most important. Historically, the home has been the locale in which the expectation of privacy has been the most extensive and comprehensive. Yet there are different zones of privacy even within the home, and within the sets of interpersonal relationships that are common to one's home. While customs vary across cultures and individual families, there is a well-distributed sense of the nature of these spatial boundaries within the home. Kitchens and living rooms are common or relatively public spaces within the home, and they are places into which outsiders may be invited on special occasions. Bedrooms and bathrooms tend to be marked off from the more public or accessible spaces within the home because of the more intimate and personal activities that are likely to take place within them.

In U.S. workplaces, individuals have only very limited expectations of privacy. The loss of privacy begins for many with the job application, and it reaches quite personal levels for those jobs that require drug tests and personality assessments. On the other hand, privacy does not evaporate entirely on the job. Closets may be provided for the storage of personal effects, and depending on the relative permanence of assigned spaces, desk drawers may be treated as personal space. The presence or absence of doors within workspaces affects the ability of workers to control direct observation by others.

26 Marc Rotenberg, The Privacy Law Sourcebook 2001, Electronic Privacy Information Center, 2001, pp. 270-272.

Technology also affects reasonable expectations of privacy. Technology can be used to enhance human senses and cognitive capabilities, and these enhancements can affect the ability to collect information at a distance. The result is that space is not the marker it once was for indicating boundaries between private and public interactions. In the case of information technology, the "objects" about which one is private (digital objects such as electronic files or streams of bits constituting communications) are quite distinct from the objects that were originally the focus of privacy concerns (physical, tangible objects made of atoms). Thus, Kerr argues, for example, that the well-established body of Fourth Amendment law governing permissible searches (and also reasonable expectations of privacy) must be rethought in light of the manifest differences between physical and digital objects.27

Critical events such as the terrorist attacks of 2001 have dramatically increased the level of personal and records surveillance that travelers encounter. Heightened concern about threats of violence means that searches of personal effects are becoming more common at sporting events, popular tourist sites, and even schools.

Formal and informal policies that define the boundaries between the public and the private also help to shape the expectations of privacy that develop over time. Privacy policies are not established only by legislatures, administrative agencies, and the courts. Individual firms, trade unions, professional associations, and a host of other institutional actors have also developed policies to govern the collection and use of personal information. Individuals, too, have policies, or norms, that govern the ways in which they will interact with organizations and with other individuals. Indeed, individuals' reciprocal behavior with respect to asking for, and offering, information is conditioned by custom and manners that are no less significant for being less formal than written policies.

Cross-cultural differences with respect to expectations of privacy can also be noted. For example, compared with Western cultures, a number of Eastern cultures place a far lower value on certain kinds of privacy in the home, and an Asian child often grows up with very different expectations of privacy than might an American child.

Finally, the concept of "reasonable expectations of privacy" has a normative meaning as well as a descriptive meaning.

27 Orin Kerr, "Searches and Seizures in a Digital World," Harvard Law Review 119:531, 2005. Kerr's normative reformulation of Fourth Amendment law calls for maintaining "the specific goals of specific doctrinal rules in light of changing facts," although he clearly recognizes that other normative reformulations are possible.

For example, in a world where electronic surveillance technologies make surveillance easy to conduct on a wide scale, one could argue that no one today has a "reasonable expectation" that his or her phone calls will not be tapped. But both statutory law (e.g., Title III in the U.S. Code) and case law (e.g., Katz v. United States, 389 U.S. 347 (1967)) stipulate that under most circumstances, an individual does have a reasonable expectation that his or her phone calls will not be tapped.

1.6 LESSONS FROM HISTORY

In the history of the United States, a number of societal shifts have taken place that relate to contemporary visions of privacy (Appendix A). For example, the move from a primarily rural to a more urban (or suburban) society changed the scale of one's community and increased one's proximity to strangers. In addition, the impact of information technologies is often to compress time and distance in the social sphere, and one result has been an increasingly diminished utility of time and space as markers of the boundaries between private and public space. Associated changes in how trust is developed and sustained have shaped our understanding and appreciation of the value of privacy, and of the limits on it, in a more impersonal society.

Furthermore, there is an increased appetite on the part of many sectors of society for information collection, analysis, and verification. The kinds of interactions individuals have with institutions and with each other have changed as a result. Increased societal needs, increased interdependence, new kinds of risks, ever greater complexity, and an increase in the number of rules one must be aware of to move safely and smoothly through society have radically altered those interactions. Both private organizations and government agencies are increasingly concerned with the ability to document compliance and to discover violations, and this is a major motivation for the collection of information about individuals and about organizations.

As the discussion in Appendix A (on the history of surveillance and privacy in the United States) suggests, a number of lessons can be gleaned from history. The first is that surveillance has been intensifying as society has grown more complex.28

28 Living in small towns or tightly knit communities is often associated with lesser degrees of privacy (where "everyone knows everyone else's business"). But lesser privacy in these communities is not generally the result of explicit acts of surveillance or information gathering—rather, it is a by-product of routine day-to-day living.

The second lesson is that each technological advance in the spheres of sensing, communication, and information processing invites greater surveillance, and often those invitations are accepted. The invention of the telegraph led almost immediately to the invention of wiretapping. The invention of automated fingerprint matching led to the FBI's integrated automated fingerprint identification system. The development of the computer resulted in unprecedented record-keeping power, and the emergence of networking technology has further increased that power. This is not to suggest that technologies make things happen on their own, but they do facilitate the activities and ambitions of those who might use them and who can afford the costs of those new technologies.

The third lesson is that times of crisis or war are often marked by contractions in the scope of civil liberties. Often, when U.S. government leaders have come to believe that the security or the core interests of the nation were being threatened from without, the government has increased its surveillance of groups within its borders. In case after case—whether involving British Loyalists, Japanese-Americans, or Arab-Americans—the unequal weight of government surveillance on these groups has been justified on the basis of alleged links between the groups in question and threats to the national interest. Moreover, as the putative threat from these groups has faded with history, actions taken against them have generally been regarded with a degree of retrospective shame.

The fourth lesson is that although U.S. conceptions of privacy can be traced historically, the meaning of the concept has been highly varied and vague, and there has never been an agreed-upon meaning. One result is that the legal and regulatory framework surrounding privacy has been a patchwork without a unifying theme or driving principles. This state of affairs in the United States contrasts sharply with that of certain other nations (notably the member states of the European Union) that often take a more comprehensive approach to privacy-related issues. This point is discussed further in Chapter 4.

1.7 SCOPE AND MAP OF THIS REPORT

This report examines privacy from several perspectives and offers analysis and ways of thinking through privacy questions at the same time that it provides a snapshot of the current state of affairs.

Part I is this chapter (Chapter 1). Part II comprises Chapters 2 through 5, which are primarily expository. Chapters 2 and 3 seek to lay the groundwork for what privacy is and how it affects and is affected by societal and technological complexities. Chapters 4 and 5 address the legal landscape of privacy in the United States and the political forces shaping that landscape throughout recent history.

Part III (Chapters 6 through 9) considers privacy in context, examining privacy issues in different sectors of society. Chapter 6 looks broadly at institutional practice regarding privacy in several different sectors. Chapter 7 provides a more in-depth look at health and medical privacy. Chapter 8 explores privacy and the U.S. library community and also touches on the issue of intellectual property and privacy (where technology, policy, and privacy intersect strongly). Chapter 9 looks at law enforcement and national security. Part II can be skipped without loss of continuity if the reader wishes to consider the case studies in Part III first; however, Parts I and II supply important background information that provides context for Part III.

Part IV consists of a single and final chapter (Chapter 10), which provides the bulk of the report's look to the future. It examines mechanisms and options for privacy protection and presents the report's findings and recommendations.

Appendix A presents a short history of surveillance and privacy in the United States. Appendix B provides a look at international considerations.