Appendix A

Privacy for Research Data

Robert Gellman

INTRODUCTION

Scope and Purpose

The purpose of this paper is to describe privacy rules in the three most important areas relevant to research uses of information involving remotely sensed and self-identifying data. The three issues are (1) When is information sufficiently identifiable so that privacy rules apply or privacy concerns attach? (2) When does the collection of personal information fall under regulation? and (3) What rules govern the disclosure of personal information? In addition, a short discussion of liability for improper use or disclosure is included. The goal is to provide sufficient information to illustrate where lines, albeit vague, inconsistent, and incomplete, have been drawn.

Spatial information can have a variety of relationships with personal data. A home address is spatial information that is likely to be personally identifiable and will typically be included within the scope of statutory privacy protections along with name, number, and other personal data. Even in the absence of a statute, spatial data that are identifiable raise overt privacy issues. In other contexts, spatial information linked with otherwise nonidentifiable personal data (e.g., from an anonymous survey) may produce data that are personally identifiable or that may be potentially personally identifiable. Spatial information is not unique in being either identifiable or linkable. However, the manner in which spatial information can become linked with identifiable data or may create identifiable data differs in practice from that for other types of data in both overt and subtle ways.
In general, data about individuals are growing more identifiable as more information is collected, maintained, and available for public and private uses. Technological developments also contribute to the increasing identifiability of data that do not have overt identifiers. Spatial information has both of these characteristics, more data and better technology. Linking spatial information to research data can affect promises of confidentiality that were made at the time of data collection and in ways that were not foreseeable at that time. These are some of the challenges presented by the use of spatial information.

Two preliminary observations about the complexity of privacy regulation are in order. First, privacy regulation can be highly variable and unpredictable in application. In the United States, privacy standards established by statute may differ depending on the extent to which the information is identifiable, the type of information, the identity of the record keeper, the identity of the user, the purpose for which the information was collected or is being used, the type of technology employed, and other elements. For some information activities, such as surveillance, additional factors may be relevant, including the manner in which information is stored or transmitted, the location being surveilled, the place from which the surveillance is done, and the nationality of the target. This list of factors is not exhaustive.

Second, American privacy regulation is often nonexistent. Privacy statutes are often responsive to widely reported horror stories, and there are huge gaps in statutory protections for privacy. For many types of personal information, many categories of record keepers, and many types of information collection and disclosure activities, no privacy rules apply. Furthermore, where regulation exists, information can sometimes be transferred from a regulated to a nonregulated environment.
A person in possession of information regulated for privacy may be able to disclose the information to a third party who is beyond the regulatory scheme. Common law standards may apply at times, but they rarely provide clear guidance.

The paper begins by discussing terminology, particularly distinctions between privacy and confidentiality, and considers privacy as it is addressed in legislation, administrative process, professional standards, and litigation in the United States. Major legal and policy issues considered are identifiability of personal data, data collection limitations, disclosure rules, and liability for misuse of data.

A Note on Terminology

Privacy and confidentiality are troublesome terms because neither has a universally recognized definition. While broad definitions can be found, none is enlightening because definitions are at too high a level of abstraction and never offer operational guidance applicable in all contexts. Never-
defined scope and process for designation of information that requires protection in the interests of national defense and foreign policy. The other terms are secret and top secret. However, many other terms used by federal agencies (e.g., "for official use only" or "sensitive but unclassified") to categorize information as having some degree of confidentiality have no defined standards.

The term confidential is much harder to encircle with a definition, whether in whole or in part. It retains a useful meaning as broadly descriptive of information of any type that may not be appropriate for unrestricted public disclosure. Unadorned, however, a confidential designation cannot be taken as a useful descriptor of rights and responsibilities. It offers a sentiment and not a standard.

The terms privacy and confidentiality will not, by themselves, inform anyone of the proper way to process information or balance the interests of the parties to information collection, maintenance, use, or disclosure. In any context, the propriety and legality of any type of information processing must be judged by legal standards when applicable or by other standards, be they ethical, social, or local.

Local standards may arise from promises made by those who collect and use personal data. Standards may be found, for example, in website privacy policies or in promises made by researchers as part of the informed consent process. In nearly all cases, broad promises of confidentiality may create expectations that record keepers may not be able to fulfill. The laws that may allow or require disclosure of records to third parties (and particularly the federal government) create a reality that cannot be hidden behind a general promise of confidentiality. Other aspects of privacy (i.e., FIPs) may also require careful delineation. The vagueness of commonly used terminology increases the need for clarity and specificity.
IDENTIFIABILITY AND PRIVACY

Information privacy laws protect personal privacy interests by regulating the collection, maintenance, use, and disclosure of personal information. The protection of identifiable individuals is a principal goal of these laws.4 Usually, it is apparent when information relates to an identifiable individual because it includes a name, address, identification number, or other overt identifier associated with a specific individual. Personal information that cannot be linked to a specific individual typically falls outside the scope of privacy regulation. However, the line between the regulated and the unregulated is not always clear.

Removing overt identifiers does not ensure that the remaining information is no longer identifiable. Data not expressly associated with a specific individual may nevertheless be linked to that individual under some conditions. It may not always be easy to predict in advance when deidentified5 data can be linked. Factors that affect the identifiability of information about individuals include unique or unusual data elements; the number of available nonunique data elements about the data subject; specific knowledge about the data subject already in the possession of an observer; the size of the population that includes the data subject; the amount of time and effort that an observer is willing to devote to the identification effort; and the volume of identifiable information about the population that includes the subject of the data.

In recent decades, the volume of generally available information about individuals has expanded greatly. Partly because of an absence of general privacy laws, the United States is the world leader in the commercial collection, compilation, and exploitation of personal data. American marketers and data brokers routinely combine identifiable public records (e.g., voter registers, occupational licenses, property ownership and tax records, court records), identifiable commercial data (e.g., transaction information), and nonidentifiable data (e.g., census data). They use the data to create for nearly every individual and household a profile that includes name, address, telephone number, educational level, homeownership, mail buying propensity, credit card usage, income level, marital status, age, children, and lifestyle indicators that show whether an individual is a gardener, reader, golfer, etc.6 Records used for credit purposes are regulated by the Fair Credit Reporting Act,7 but other consumer data compilations are mostly unregulated for privacy. As the amount of available personal data increases, it becomes less likely that nonidentifiable data will remain nonidentifiable.
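Two of the factors listed in this section, the number of nonunique data elements available and the size of the population, can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from the paper: the records, the field names, and the helper `unique_fraction` are all invented for illustration. It measures what share of records become one of a kind as more nonunique fields are combined.

```python
from collections import Counter

def group_sizes(records, keys):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[k] for k in keys) for r in records)

def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` is shared with no other record."""
    sizes = group_sizes(records, keys)
    singles = sum(1 for r in records if sizes[tuple(r[k] for k in keys)] == 1)
    return singles / len(records)

# Invented toy data: no names or other overt identifiers, only nonunique attributes.
records = [
    {"zip": "20001", "birth_year": 1960, "sex": "F"},
    {"zip": "20001", "birth_year": 1960, "sex": "M"},
    {"zip": "20001", "birth_year": 1975, "sex": "F"},
    {"zip": "20002", "birth_year": 1960, "sex": "F"},
    {"zip": "20002", "birth_year": 1960, "sex": "F"},
    {"zip": "20002", "birth_year": 1982, "sex": "M"},
]

# Each added field splits the population into smaller groups.
for keys in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
    print(keys, round(unique_fraction(records, keys), 2))
```

In this toy population no record is unique on zip code alone, but most are unique once birth year and sex are added. A record that is unique on such fields is not thereby identified; it becomes identifiable only if an observer has another source that ties the same combination to a named person.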
Latanya Sweeney, a noted expert on identifiability, has said: "I can never guarantee that any release of data is anonymous, even though for a particular user it may very well be anonymous."8

For the statistician or researcher, identifiability of personal data is rarely a black and white concept. Whether a set of data is identifiable can depend on the characteristics of the set itself, on factors wholly external to the set, or on the identity of the observer. Data that cannot be identified by one person may be identifiable by another, perhaps because of different skills or because of access to different information sources. Furthermore, identifiability is not a static characteristic. Data not identifiable today may be identifiable tomorrow because of developments remote from the original source of the data or the current holder of the data. As the availability of geospatial and other information increases, the ability to link wholly nonidentifiable data or deidentified data with specific individuals will also increase.

From a legislative perspective, however, identifiability is more likely to be a black and white concept. Privacy legislation tends to provide express regulation for identifiable data and nonregulation for nonidentifiable data,
without any recognition of a middle ground. However, statutes do not yet generally reflect a sophisticated understanding of the issues. Until recently, policy makers outside the statistical community paid relatively little attention to the possibility of reidentification. Nevertheless, a selective review of laws and rules illustrates the range of policy choices to date.

U.S. Legislative Standards

The Privacy Act of 1974,9 a U.S. law applicable mostly to federal agencies, defines record to mean a grouping of information about an individual that contains "his name, or the identifying number, symbol, or other identifying particular assigned to the individual, such as a finger or voice print or a photograph."10 An identifier is an essential part of a record. The ability to infer identity or to reidentify a record is not sufficient or relevant.

A location may or may not be an identifier under the Privacy Act. A home address associated with a name is unquestionably an identifier. A home address without any other data element could be an identifier if only one individual lives at the address, but it might not be if more than one individual lives there. As data elements are added to the address, the context may affect whether the information is an identifier and whether the act applies. If the information associated with the address is about the property ("2,000 square feet"), then the information is probably not identifying information about an individual. If the information is about the resident ("leaves for work every day at 8:00 a.m."), it is more likely to be found to be identifying information. Part of the uncertainty here is that there is a split in the courts about how to interpret the act's concept of what is personal information. The difference does not relate specifically to location information, and the details are not enlightening.
However, the question of when a location qualifies as an identifier is an issue that could arise outside the narrow and somewhat loosely drafted Privacy Act of 1974.11 If a location is unassociated with an individual, then it is less likely to raise a privacy issue. However, it may be possible to associate location information with an individual, so that the addition of location data to other nonidentifiable data elements may make it easier to identify a specific individual.

Other federal laws are generally unenlightening on identifiability questions. Neither the Driver's Privacy Protection Act12 nor the Video Privacy Protection Act13 addresses identifiability in any useful way. The Cable Communications Policy Act excludes from its definition of personally identifiable information "any record of aggregate data which does not identify particular persons."14 This exclusion, which probably addressed a political issue rather than a statistical one, raises as many questions as it answers.
Congress took a more sophisticated approach to identifiability in the Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA).15 The law defines identifiable form to mean "any representation of information that permits the identity of the respondent to whom the information applies to be reasonably inferred by either direct or indirect means." This language is probably the result of the involvement of the statistical community in the development of the legislation. The standard is a reasonableness standard, and some international examples of reasonableness standards will be described shortly. CIPSEA's definition recognizes the possibility of using indirect inferences to permit identification, but it does not indicate the scope of effort that is necessary to render deidentified data identifiable. That may be subsumed within the overall concept of reasonableness.

No Standard

National privacy laws elsewhere do not always include guidance about identifiability. Canada's Personal Information Protection and Electronic Documents Act (PIPEDA) defines personal information as "information about an identifiable individual."16 The act includes no standard for determining identifiability or anonymity, and it does not address the issue of reidentification.
A treatise on the act suggests that "caution should be exercised in determining what is truly 'anonymous' information since the availability of external information in automated format may facilitate the reidentification of information that has been made anonymous."17

Strict Standard

The 1978 French data protection law defines information as "nominative" if in any way it directly or indirectly permits the identification of a natural person.18 According to an independent analysis, "the French law makes no distinction between information that can easily be linked to an individual and information that can only be linked with extraordinary means or with the cooperation of third parties."19 The French approach does not appear to recognize any intermediate possibility between identifiable and anonymous. Unless personal data in France are wholly nonidentifiable, they appear to remain fully subject to privacy rules. This approach may provide greater clarity, but the results could be harsh in practice if data only theoretically identifiable fall under the regulatory scheme for personal data. However, the French data protection law includes several provisions that appear to ameliorate the potentially harsh results.20
Reasonableness Standards

The definition of personal data in the European Union (EU) Data Protection Directive refers to an identifiable natural person as "an individual person . . . who can be identified, directly or indirectly."21 On the surface, the EU definition appears to be similar to the strict standard in French law. However, the directive's introductory Recital 26 suggests a softer intent when it states that privacy rules will not apply to "data rendered anonymous in such a way that the data subject is no longer identifiable." It also provides that "to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person."22 Thus, the directive offers a reasonableness standard for determining whether data have been adequately deidentified.

Variations on a reasonableness standard can be found elsewhere. The Council of Europe's recommendations on medical data privacy provide that an individual is not identifiable "if identification requires an unreasonable amount of time and manpower."23 An accompanying explanatory memorandum says that costs are no longer a reliable criterion for determining identifiability because of developments in computer technology.24 However, it is unclear why "time and manpower" are not just a proxy for costs.

The Australian Privacy Act defines personal information to mean "information . . . about an individual whose identity is apparent, or can reasonably be ascertained, from the information."25 It appears on the surface that a decision about identifiability is limited to determinations from the information itself and not from other sources. This language highlights the general question of just what activities and persons are included within the scope of a reasonableness determination inquiry.
Under the EU directive, it is clear that identification action taken by any person is relevant. The Council of Europe uses a time and manpower measure, but without defining who might make the identification effort. The Australian law appears to limit the question to inferences from the information itself. The extent to which these differences are significantly different in application or intent is not clear.

The British Data Protection Act's definition of personal data covers data about an individual who can be identified thereby or through "other information which is in the possession of, or is likely to come into the possession of, the data controller."26 The British standard does not expressly rely on reasonableness or on the effort required to reidentify data. It bases an identifiability determination more narrowly by focusing on information that a data controller has or is likely to acquire. This appears to be only a step removed from an express reasonableness test.

The Canadian Institutes of Health Research (CIHR) proposed a clarification of the definition of personal information from PIPEDA that may offer the most specific example of a reasonableness standard.27 The CIHR language refers to "a reasonably foreseeable method" of identification or linking of data with a specific individual. It also refers to anonymized information "permanently stripped" of all identifiers such that the information has "no reasonable potential for any organization to make an identification." In addition, the CIHR proposal provides that reasonable foreseeability shall "be assessed with regard to the circumstances prevailing at the time of the proposed collection, use or disclosure."

Administrative Process

The Alberta Health Information Act takes a different approach. It defines individually identifying to mean when a data subject "can be readily ascertained from the information,"28 and it defines nonidentifying to mean that the identity of the data subject "cannot be readily ascertained from the information."29 This appears to limit the identifiability inquiry to the information itself.

Alberta's innovation comes in its regulation of data matching,30 which is the creation of individually identifying health information by combining individually identifying or nonidentifying health information or other information from two or more electronic databases without the consent of the data subjects. The data matching requirements, which attach to anyone attempting to reidentify nonidentifying health information, include submission of a privacy impact assessment to the commissioner for review and comment.31 The Alberta law is different because it expressly addresses reidentification activities by anyone (at least, anyone using any electronic databases).
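The kind of data matching described here, combining records from two databases to produce individually identifying information, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy datasets, the field names, and the `link` helper are invented, and the matching rule (accept a link only when exactly one candidate matches) is one simple choice among many rather than anything the Alberta law or this paper specifies.

```python
def link(deidentified, identified, keys):
    """Pair each deidentified record with a name when exactly one identified row matches on `keys`."""
    index = {}
    for row in identified:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    matches = []
    for rec in deidentified:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # ambiguous or missing matches are left unlinked
            matches.append((rec, candidates[0]["name"]))
    return matches

# Invented "nonidentifying" health records: no names, only shared attributes.
health = [
    {"postal_code": "T5A", "birth_year": 1950, "diagnosis": "asthma"},
    {"postal_code": "T5B", "birth_year": 1971, "diagnosis": "diabetes"},
    {"postal_code": "T5C", "birth_year": 1988, "diagnosis": "migraine"},
]

# Invented identified register holding the same attributes plus a name.
register = [
    {"name": "Resident A", "postal_code": "T5A", "birth_year": 1950},
    {"name": "Resident B", "postal_code": "T5B", "birth_year": 1971},
    {"name": "Resident C", "postal_code": "T5B", "birth_year": 1971},
    {"name": "Resident D", "postal_code": "T6Z", "birth_year": 1988},
]

matches = link(health, register, ["postal_code", "birth_year"])
for rec, name in matches:
    print(name, "->", rec["diagnosis"])
```

Only the first health record is linked in this example: the second has two equally plausible candidates and the third has none. Neither dataset reveals a named person's diagnosis on its own; the identifying information is created by the act of matching, which is the conduct the Alberta provision addresses.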
In place of a fixed standard for determining whether identifiable information is at stake, the act substitutes an administrative process.32 The law regulates conduct more than information, thereby evading the definitional problem for information that is neither clearly identifiable nor wholly nonidentifiable.

Data Elements and Professional Judgment Standards

In the United States, general federal health privacy standards derive from a rule33 issued by the Department of Health and Human Services under the authority of the Health Insurance Portability and Accountability Act34 (HIPAA). The rule defines individually identifiable health information to include health information for which there is a reasonable basis to believe that the information can be used to identify an individual.35 This is an example of a reasonableness standard that by itself provides little interpretative guidance. HIPAA's approach to identifiability does not end with this definition, however. HIPAA offers what may be the most sophisticated approach to identifiability found in any privacy law.

The rule offers two independent methods to turn identifiable (regulated) data into deidentified (unregulated) data. The first method requires removal of 18 specific categories of data elements.36 With these elements removed, any risk of reidentification is deemed too small to be a concern. The HIPAA rule no longer applies to the stripped data, which can then be used and disclosed free of HIPAA obligations. The only condition is that the covered entity does not have actual knowledge that the information could be used, either on its own or in combination with other data, to identify an individual.37 The advantage of this so-called safe harbor method is that mechanical application of the rule produces data that can nearly always be treated as wholly nonidentifiable. Some critics claim that the resulting data are useless for many purposes.

The second way to create deidentified (unregulated) health data requires a determination by "a person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable."38 The required determination must be that "the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information."39 The person making the determination must document the methods used and the results of the analysis on which the determination is based.40

HIPAA includes another procedure for disclosure of a limited dataset that does not include overt identifiers but that has more data elements than the safe harbor method.
In order to receive a limited dataset, the recipient must agree to a data use agreement that establishes how the data may be used and disclosed, requires appropriate safeguards, and sets other terms for processing.41 Disclosures under the limited dataset procedure can be made only for activities related to research, public health, and health care operations. A recipient under this procedure is not by virtue of the receipt subject to HIPAA or accountable to the secretary of health and human services, but the agreement might be enforced by the covered entity that disclosed the data or, perhaps, by a data subject.

Litigation

Identifiability issues have arisen in a few court cases.

• One U.S. case involved a commercial dispute between two large health data processing companies. WebMD purchased a company (Envoy)
from Quintiles in 2000. As part of the acquisition, WebMD agreed to supply Quintiles with nonidentifiable patient claims data processed by Envoy. Quintiles processes large volumes of data to assess the usage of prescription drugs. Quintiles sells the resulting information in nonidentifiable form primarily to pharmaceutical manufacturers. The litigation arose because of concerns by WebMD that the combination of its data with identifiable data otherwise in the possession of Quintiles would allow reidentification.42 The resolution of this dispute did not involve a ruling on the identifiability issues raised, but it may be a precursor to other similar battles.

• A United Kingdom case43 involving identifiability began with a policy document issued by the British Department of Health. The document expressly stated that stripping of identifiers from patient information before disclosure to private data companies seeking information on the habits of physicians is not sufficient to avoid a breach of the physician's duty of confidentiality. Even the disclosure of aggregated data would be a violation of confidentiality. A company that obtains prescription data identifiable to physicians and not patients sued to overturn the policy. The lower court found that disclosure of patient information was a breach of confidence notwithstanding the anonymization. However, an appellate court found the reverse and overturned the department policy. Both courts proceeded on the theory that either personal data were identifiable, or they were not. Neither opinion recognized or discussed any middle ground.

• An Illinois case arose under the state Freedom of Information Act when a newspaper requested information from the Illinois Cancer Registry by type of cancer, zip code, and date of diagnosis.44 The registry denied the request because another statute prohibits the public disclosure of any group of facts that tends to lead to the identity of any person in the registry. The court reversed and ordered the data disclosed. Although an expert witness was able to identify most of the records involved, the court was not convinced. The court held that the "evidence does not concretely and conclusively demonstrate that a threat exists that other individuals, even those with skills approaching those of Dr. Sweeney, likewise would be able to identify the subjects or what the magnitude of such a threat would be, if it existed." The Illinois Supreme Court upheld the decision in 2006.45

• Litigation over the constitutionality of a federal law prohibiting so-called partial birth abortions produced a noteworthy decision on identifiability.46 The specific dispute was over disclosure during discovery of patient records maintained by physicians testifying as expert witnesses. The records were to be deidentified before disclosure so that a patient's identity could not reasonably be ascertained. The case was decided in part on grounds that there is still a privacy interest even if there were no possibility that the patient's identity could be determined.47 Arguments that wholly
nonidentifiable records retain a privacy interest are unusual, and the conclusion is all the more remarkable because the judge (Richard Posner) is a well-known critic of privacy.

Conclusion

Existing statutes and rules that address deidentification matters can be categorized roughly into three groups. One category establishes standards for determining whether data are sufficiently or potentially identifiable to warrant regulation. The standards can (a) be inward-looking (considering only the data themselves); (b) be outward-looking (considering other data actually or potentially available elsewhere as well as the capabilities for reidentification generally available to individuals or experts); (c) require professional statistical judgment; or (d) consider the time, effort, or cost required for reidentification. This is not an exhaustive list, and multiple standards may apply at the same time.

The second category involves an administrative process. The Alberta law requires an administrative review for privacy of some planned reidentification activities. An administrative process could also review deidentification efforts. Other forms of notice, review, and even approval are possible as well, but the Alberta law is the only known example to date.

The third category is a mechanical rule requiring the removal of specified data elements. While the first two categories are not exclusive (it is possible to have a standard and a process together, for example), a mechanical rule could be a complete alternative for a standard or a process, as HIPAA illustrates.

Statutes, both domestic and international, are all over the lot. The significance of the differences among the various legislative provisions on identifiability is uncertain. It is not clear how much attention legislators paid to identifiability standards, and the statutes may simply offer alternate word formulas produced without much consideration.
Better legislative standards on identifiability do not appear to be on anyone's agenda at present.

The few court decisions in the area are no better than the statutes. The abortion records case and the Illinois cancer registry decision reach conclusions that are hard to reconcile. One case found a privacy interest in wholly nonidentifiable data, and the other found no privacy interest in supposedly deidentified records that an expert proved were identifiable. It may be some time before the courts understand the basic issues or produce any meaningful standards on identifiability.

Finally, none of the statutes or court cases expressly addresses location information. Location information is just another data element that may contribute to the identifiability of personal data.
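The third category in this conclusion, a mechanical rule that removes specified data elements, is simple enough to sketch in code. The sketch below is illustrative only: the field list `REMOVE_FIELDS` and the sample record are hypothetical and are not the 18 categories in the HIPAA rule, which this paper does not enumerate.

```python
# Hypothetical list of fields to strip; NOT the 18 HIPAA safe harbor categories.
REMOVE_FIELDS = frozenset({"name", "street_address", "phone", "email", "record_number"})

def apply_mechanical_rule(record, remove_fields=REMOVE_FIELDS):
    """Return a copy of `record` without any field named in `remove_fields`."""
    return {k: v for k, v in record.items() if k not in remove_fields}

# Invented sample record for illustration.
record = {
    "name": "Resident A",
    "street_address": "1 Example Street",
    "phone": "555-0100",
    "zip": "20001",
    "birth_year": 1960,
    "diagnosis": "asthma",
}

stripped = apply_mechanical_rule(record)
print(sorted(stripped))  # the fields that survive the rule
```

The rule needs no judgment to apply, which is the advantage the paper attributes to this category. It also removes only what is on the list: the surviving fields (here zip code, birth year, and diagnosis) pass through untouched, so whether the result is truly nonidentifiable depends entirely on how the list was drawn up.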
COLLECTION

A second major privacy concern arises with the collection of personal information. In the United States, what personal information may be collected depends on who is doing the collection and what methods are being used. However, much actual and potential personal data collection is unregulated, especially for private parties. For example, many merchants collect transaction and other information from data subjects and from a large industry of data brokers, mailing list purveyors, and other commercial firms. Even the collection of information from web users through spyware was not clearly or expressly illegal anywhere a few years ago, although some spyware may violate unfair and deceptive trade practices laws.

In many other countries, however, general standards for collection exist as part of broadly applicable data protection laws, and the collection standards apply generally to all public and private record keepers.

Video Surveillance

Video (and visual) surveillance is of particular interest because it has the capability of recording location in addition to other data elements. Except for surveillance by the government for law enforcement purposes, however, there is little law on video surveillance or the data produced by video surveillance. The lengthy description here is intended to describe standards for personal information collection for arguably public data elements that might apply when statutes are rare or nonexistent.

U.S. laws and policies for all types of surveillance lack clarity, coherence, consistency, compactness, and currency.48 The rules governing surveillance vary depending on numerous factors. General surveillance jurisprudence in the United States is extensive for criminal matters, and the Fourth Amendment provides important standards for government actions.
Surveillance by private parties (other than wiretapping49) is only occasionally statutorily regulated, but it may be actionable through a privacy tort. For all types of visual surveillance, the most important factors are whether it takes place in a public or private place and whether there is a reasonable expectation of privacy. A general rule of thumb (with some exceptions) is that visual surveillance in public space is not restricted.

Supreme Court Decisions

In Katz v. United States,50 the main issue was whether to allow evidence of a telephone conversation overheard by government agents who attached an electronic device to a public telephone booth made of glass. The Supreme Court decided that the surveillance was subject to Fourth
Amendment protection, meaning that the surveillance needed a court order. Importantly, the Court held that the Fourth Amendment protects people and not places. Still, the Court said that "[w]hat a person knowingly exposes to the public, even in his own home or office, is not a subject of Fourth Amendment protection."51 This statement suggests almost directly that the Fourth Amendment does not protect against surveillance in public places. However, the Court did not decide that issue expressly.

In a concurring opinion, Justice John M. Harlan offered a test now widely used to assess when privacy should fall under the protections of the Fourth Amendment. Under the test, a reasonable expectation of privacy exists if (1) a person has exhibited an actual (subjective) expectation of privacy and (2) that expectation is one that society is prepared to recognize as reasonable.52 When this test is satisfied, a government search or surveillance activity that violates the reasonable expectation of privacy falls under the Fourth Amendment. A well-recognized problem with the reasonable expectation of privacy test is the "silent ability of technology to erode our expectations of privacy."53

In United States v. Knotts,54 the government surreptitiously attached an electronic beeper to an item purchased by a suspect and transported in his car. The Court held that "a person traveling in an automobile on public thoroughfares has no reasonable expectation of privacy in his movements from one place to another."55 Knotts implies that virtually any type of visual surveillance in a public place is free of Fourth Amendment constraints. Aware that its decision might be read to allow unrestricted public place surveillance, the Court said that "dragnet-type law enforcement practices" will be considered when they arise.56

In California v. Ciraolo,57 police officers in a private airplane flew over a house at an altitude of 1,000 feet and saw marijuana growing in the yard.
The issue for the Supreme Court was whether the warrantless aerial observation of a fenced yard adjacent to a home violated the Fourth Amendment. Privacy in a home receives the highest degree of Fourth Amendment protection. However, the Court concluded that observation of the yard from publicly navigable airspace was not unreasonable and that there was no Fourth Amendment protection.

Dow Chemical Company v. United States58 involved government aerial observation of a large chemical complex with security that barred ground-level public views and limited scrutiny from the air. The Supreme Court held that the complex fell under the doctrine of open fields, so aerial photographs from navigable airspace are not a Fourth Amendment search. The Court suggested (but did not decide) that use of "highly sophisticated surveillance equipment not generally available to the public, such as satellite technology, might be constitutionally proscribed absent a warrant."59 This decision came in 1986, long before satellite photos were available to every
Internet user. Both this case and the preceding case (Ciraolo) were decided by 5 to 4 majorities.60

Video Surveillance Statutes Generally

Statutes on video surveillance by private parties are rare but increasing. Recent years have seen a wave of legislation prohibiting video voyeurism. Washington State provides an example. Prior to a 2003 amendment, a statute defined the crime of voyeurism as viewing, photographing, or filming another person without that person's knowledge or consent, while the person is in a place where he or she would have a reasonable expectation of privacy.61 The law defined a place where an individual would have a reasonable expectation of privacy as being (1) a place where a reasonable person could disrobe in privacy without being concerned about being photographed or (2) a place where a person may reasonably expect to be safe from casual or hostile intrusion or surveillance.62

The law had to be changed when the State Supreme Court overturned the conviction of defendants who filmed in public places using a ground-level camera to take photographs up the skirts of women. The so-called upskirt photography took place in public, where there was no expectation of privacy. The state legislature quickly amended the statute, making it a crime to view, photograph, or film the intimate areas of another person without that person's knowledge and consent under circumstances in which the person has a reasonable expectation of privacy, whether in a public or private place.63 A roughly comparable Arizona law, however, has an exception for use of a child monitoring device,64 sometimes called a nanny cam.

Other state laws regulate videotaping in particular circumstances.
A Connecticut law prohibits employers from operating electronic surveillance devices in employee restrooms, locker rooms, or lounges.65 Texas passed a so-called granny cam law in 2001 that allows a nursing home resident "to place in the resident's room an electronic monitoring device that is owned and operated by the resident or provided by the resident's guardian."66 Some laws regulate cameras to catch red light running and cameras for racial profiling oversight.

Privacy Torts

Video surveillance can constitute an invasion of privacy that is actionable through a private lawsuit under state laws, but state laws can vary considerably. Many states have adopted some policies from the Restatement of Torts (Second). The Restatement defines four types of privacy invasions, of which unreasonable intrusion upon the seclusion of another is
the most important for surveillance purposes.67 This tort does not depend on any publicity given to the person whose interest is invaded.68 For the other privacy torts, actionable activities derive from the use to which a name, image, or information is put. Under the intrusion tort, mere surveillance can, under the right circumstances, give rise to a cause of action.

The Restatement is clear that the intrusion must occur in a private place or must otherwise invade a private seclusion that an individual has established for his or her person or affairs. The Restatement expressly excludes the possibility of liability for taking a photograph while an individual is walking on a public highway. Even in public, however, some matters about an individual "not exhibited to the public gaze" can be actionable. For example, photographing someone's underwear or lack of it could be invasive and actionable as a tort, regardless of a criminal statute.69

The public/private distinction so important to Fourth Amendment jurisprudence is equally important to the tort of intrusion upon seclusion. Surveillance of a public place, house, yard, car parked in a public place, at an airport counter, and at similar places would not give rise to liability. Surveillance in a private area, such as a dressing room or bathroom, could create liability.

Tort law recognizes some limits, however. Several precedents find liability for invasion of privacy even though the surveillance took place entirely in public space. Thus, unreasonable or intrusive surveillance of personal injury defendants will give rise to a claim for invasion of privacy. Consumer advocate Ralph Nader successfully sued General Motors for surveilling him and invading his privacy while in public.70 Jacqueline Kennedy Onassis sued a paparazzo who aggressively followed and photographed her and her children.71 Finding that the photographer insinuated himself into the very fabric of Mrs.
Onassis's life, the court issued a detailed injunction limiting the photographer from approaching her.

Extrapolating from the Nader and Onassis cases is difficult, however. Even regular surveillance of a particular individual may not always support an actionable invasion of privacy. In personal injury cases, for example, it has become common for an insurance company to hire a private investigator to determine the extent of a victim's injuries through surveillance. This type of surveillance is not always invasive, and the courts recognize it as a consequence of filing injury claims.

The use of tort law in response to unreasonable surveillance activities, even in public space, has a firm basis. However, the border between reasonable and unreasonable activities remains uncertain, depending on the facts of each case and the intent of the person conducting the surveillance.
Conclusion

The first issue in assessing the legality of surveillance is whether the surveillance is being done by the government or by a private actor. Rules regulating government surveillance are exquisitely complex, and rules governing private surveillance are mostly nonexistent. For both types of surveillance, however, the two most important factors in distinguishing permissible from impermissible visual surveillance are whether the area being surveilled is public or private and whether there is a reasonable expectation of privacy.

However, many questions about the legitimate use of visual surveillance remain unanswered because courts and legislatures often trail technological developments. For the most part, however, there is almost no law that regulates visual surveillance in general or in public places. The implication in Knotts that virtually any type of visual surveillance in a public place is free of Fourth Amendment constraints is not an assurance that anything goes for the government, but that may well be the result, at least when an exotic technology is not employed. For private activity, a lawsuit over visual surveillance in public places is always possible, but it might be difficult for a plaintiff to win in the absence of a lewd intent or other showing of bad faith.

The extent to which physical or camera surveillance of an individual is different from the association of location information with an individual is not clear. There is a qualitative difference between being followed or filmed, on one hand, and being tracked electronically with locations recorded (whether continuously or otherwise), on the other.

Whether the association of geocoding with other types of personal data would create any legally recognized violations of privacy is impossible to say. None of the existing precedents is directly on point, and much would depend on facts, intent, methods, locations (public or private), expectations, and uses.
Consider the possibility that compiled information would create evidence of a crime, produce a record that would break up a marriage, embarrass a public figure, disclose sensitive medical information (e.g., entering a drug abuse clinic), or constitute grounds for losing a job.

A collection of information that violated an agreement or understanding reached with a research subject might be actionable under several different legal theories, including contract law and tort law. The tort for intrusion upon seclusion is most relevant because it is not dependent on publicity (i.e., use of the information) given to the person whose interest is invaded. The mere collection of information could be enough to sustain a lawsuit. However, proving damages in privacy cases is often challenging, and it could present a significant barrier to recovery in an intrusion case. Recovering damages from a researcher would be difficult in many foreseeable factual
circumstances. However, ultimate success in litigation might provide limited comfort to a researcher obliged to pay for and live through a lawsuit.

Technology constantly changes the nature of surveillance and blurs the distinctions between traditional categories. Cell phones may (or may not) provide an example of a form of surveillance that is similar to but not quite the same as visual surveillance. Physically following an individual in public space is visual surveillance. Tracking an automobile with a beeper inside on public highways is also visual surveillance. Tracking an individual in public space by means of a cell phone may be different, especially if the phone is not in plain sight. This distinction between visually following an individual and using a cell phone as a tracking device may be important in a criminal context, and the courts are beginning to pay attention.72 However, criminal jurisprudence is not likely to be of great relevance to researchers.

Commercial tracking of cell phone locations73 may produce location information, but the availability of tracking information for secondary purposes is unknown and likely to be controlled by service contracts. There do not appear to be any statutes expressly regulating the use of cell phone location information for private purposes. It is common for private repositories of personal information to exist without any statutory regulation. Marketers have voracious appetites for personal data, and there may be a market for using or acquiring location information.

EU Data Protection Directive

Most national privacy laws implement internationally recognized Fair Information Practice principles.
The principle for collection limitation states "that there should be limits to the collection of personal data, that data should be collected by lawful and fair means, and that data should be collected, where appropriate, with the knowledge or consent of the subject."74 The EU Data Protection Directive75 implements this policy through several provisions.76 Article 6(1)(b) requires member states to provide that

personal data must be collected for specified, explicit and legitimate purposes and not further processed in a way incompatible with those purposes. Further processing of data for historical, statistical or scientific purposes shall not be considered as incompatible provided that Member States provide appropriate safeguards.

This policy is far removed from the anything-goes approach to personal information collection usually found in the United States in the absence of a statute that provides otherwise. In Europe, the purposes for collection and
processing must be specific, explicit, and legitimate. That means, among other things, that a data controller must define purposes in advance. Secondary uses may not be incompatible with the stated purposes, and that is a weaker test than an affirmative requirement that secondary uses be compatible.

This provision of the directive provides that processing for historical, statistical, or scientific purposes does not violate the compatibility standard if additional safeguards are provided. That allows disclosures of personal information to researchers and others, but it does not exempt the recipients from complying with data protection standards for the data they are processing.

The directive also requires as a condition of personal data processing (including collection and disclosure) that the data subject has given consent unambiguously. Exceptions to consent include if processing is necessary for the performance of a contract, to comply with a legal obligation, to protect the vital interests of the data subject, to carry out a task in the public interest, or for the purposes of the legitimate interests pursued by the controller.77 There are more terms and conditions to these exceptions.

European organizations cannot make unrestricted decisions about what to collect. In particular, the last justification for processing (for the purposes of the legitimate interests pursued by the controller) is worthy of additional discussion. It applies except when the data controller's interests are overridden by the interests or fundamental rights and freedoms of the data subject. The specific balance between the use of information for legitimate ordinary business activities (including but not limited to marketing) is something left to member states to decide. The policy allows considerable flexibility in implementation.
For example, Great Britain implements the principle by giving individuals a limited right to prevent processing likely to cause damage or distress and an absolute right to prevent processing for purposes of direct marketing.78 In the United States, by contrast, there is no general right to opt out of collection, marketing, or other types of processing. Some specific statutes grant limited rights to prevent some uses. Some companies have adopted privacy policies that grant greater rights to data subjects.

One distinction that is important when comparing statutory standards across jurisdictions is the breadth of privacy laws. In countries with omnibus privacy laws, all data controllers are likely to be subject to privacy regulation for identifiable data. Thus, a person who takes deidentified data and reidentifies them is likely to fall under the privacy regulatory scheme generally applicable to all record keepers immediately upon completion of the reidentification. The effect is that a European researcher who may have escaped data protection regulation because of the absence of identifiable data may become subject to regulation by linking those data with additional geographical or other data.
In the United States, however, unless a law directly regulates an entity's information processing activities, it is unlikely that any privacy restrictions will apply. The U.S. health privacy rule known as HIPAA offers an illustration.79 The rule regulates the use of individually identifiable health information only by covered entities, which are most health care providers and all health plans (insurers) and health care clearinghouses. Others who obtain and use health data and who are not operating as covered entities (or as their business associates) are not affected by the rule in their processing activities. Thus, a researcher, public health department, or court may obtain regulated health data (under specified standards/procedures) without becoming subject to the HIPAA rule.

Selected U.S. Statutes Limiting Collection of Personal Information

Not all existing U.S. privacy statutes limit the collection of personal information. A few examples of collection restrictions illustrate the diversity that exists among the laws.

Privacy Act of 1974

The Privacy Act of 1974,80 a law that applies only to federal government agencies and to a few government contractors (but no grantees), regulates collection in several ways. First, it allows agencies to maintain only such information about an individual as is relevant and necessary to accomplish an agency purpose. Second, it requires agencies to collect information to the greatest extent practicable directly from the data subject if an adverse determination may result.
Third, it prohibits the maintenance of information describing how an individual exercises any right guaranteed by the First Amendment, unless authorized by statute or pertinent to an authorized law enforcement activity.81 For a researcher working for a federal agency who collects and links geographic data, the first two restrictions are not likely to be meaningful, and the third would be relevant only in narrow instances (such as tracking individuals at a political demonstration).

Health Insurance Portability and Accountability Act

The federal health privacy rule, issued by the U.S. Department of Health and Human Services under the authority of the Health Insurance Portability and Accountability Act (HIPAA), reflects generally recognized fair information practices,82 except that information collection is barely mentioned. The apparent policy is to avoid dictating to health care providers what information they can and cannot collect when treating patients. The only limited exception comes with the application of the HIPAA privacy rule's
minimum necessary standard.83 In general, the rule seeks to limit uses of, disclosures of, and requests for personal health information to the minimum necessary to accomplish the purpose of the use, disclosure, or request. The minimum necessary rule has several exceptions, including a broad one for treatment activities. The rule directs a covered entity requesting personal health information from another covered entity to make reasonable efforts to limit the information requested to the minimum necessary to accomplish the intended purpose of the request. Data collection from a data subject or from any source other than another covered entity is not restricted by the minimum necessary rule.

Children's Online Privacy Protection Act

The Children's Online Privacy Protection Act (COPPA)84 makes it unlawful for a website operator to collect personal information from a child under the age of 13 without obtaining verifiable parental consent. Personal information includes a physical address. The law appears to apply to website operators located anywhere in the world. The law does not restrict collection of information by phone, fax, or other means or from older children.

Cable Communications Policy Act

Cable television operators may not use their cable system to collect personally identifiable information concerning a subscriber without consent.85 Exceptions cover service and theft detection activities. The law does not otherwise restrict collection, but it does restrict disclosure.

Conclusion

No general statute regulates personal information collection in the United States. A few U.S. laws restrict the collection of personal information in narrow contexts. The collection of personal information (including information from public sources, from private companies, by direct observation, or by linking of data from disparate sources) is only rarely the subject of overt regulation.
Legal challenges to the mere collection of information are likely to be hard to mount in the absence of legislation, but challenges are not impossible. When collected information is used or disclosed, however, standards are likely to apply that differ from those governing the mere collection of data. Use and disclosure regulations, while still rare, are found more frequently. No known federal law expressly addresses the collection of location information.
DISCLOSURE

A third major privacy concern is the disclosure of personal information. In the United States, disclosure is only sometimes subject to regulation. For many record keepers, the only limits on disclosure come from contracts with data subjects, the possibility of tort lawsuits, or market pressure. Many commercial and other institutions collect and disclose personal information without the knowledge or consent of the data subjects.

Some record keepers are subject to privacy or other laws with disclosure restrictions. Researchers are typically subject to human subject protection rules and to oversight by institutional review boards. In some instances, laws protect narrow classes of research records or statistical data from disclosure. Laws that mandate disclosure (open government or public record laws) may apply to government record keepers and to some others who receive grants from or do business with governments. Examples of all of these laws are discussed below.

Any record may become the subject of a search warrant, court order, subpoena, or other type of compulsory process. Some laws protect records from some types of compulsory process, and these are reviewed here. General laws, rules, and policies about compulsory process will not be examined here, with one exception. A general statute providing for court-ordered disclosures that has received considerable attention is the USA Patriot Act.86 Section 215 of the act allows the director of the Federal Bureau of Investigation (FBI) to seek a court order requiring the production of "any tangible things (including books, records, papers, documents, and other items) for an investigation to protect against international terrorism or clandestine intelligence activities."87 The technical procedure is not of immediate interest, but the law requires a judge to issue an order if the request meets the statute's standards.
The law also prevents the recipient of an order from disclosing that it provided materials to the FBI. The authority of this section makes virtually every record in the United States accessible to the FBI. It is unclear whether the USA Patriot Act was intended to override laws that protect research records against legal process. There may be different answers under different research protection laws.

The standards for disclosure under the EU Data Protection Directive (a reasonable proxy for most international data protection laws) are mostly the same as the standards described above for collection. The directive generally regulates processing of personal information, and processing includes collection and disclosure. As with collection, a data controller in the EU needs to have authority to make a disclosure (consent, legitimate interest, and others). International standards are not considered further in this section.
Laws Restricting Disclosure by Record Keepers

Most privacy laws restrict the disclosure of personal information by defined record keepers. A brief description of the restrictions from a sample of these laws follows.

Privacy Act of 1974

The Privacy Act of 1974,88 a law that applies only to federal government agencies and to a few government contractors (but no grantees), regulates disclosure of personal information maintained in a system of records in several ways. Generally, an agency can disclose a record only when the act allows the disclosure or with the consent of the subject of the record. The act describes 12 conditions of disclosure, which generally cover routine disclosures that might be appropriate for any government record (within the agency, for statistical uses, to the U.S. Government Accountability Office, to Congress, for law enforcement, pursuant to court order, etc.). One of the conditions of disclosure is for a routine use, or a disclosure that an agency can essentially establish by regulation.89 Each system of records can have its own routine uses determined by the agency to be appropriate for the system. As a practical matter, the Privacy Act imposes a clear procedural barrier (publication in the Federal Register) to disclosure, but the substantive barriers are low.

Fair Credit Reporting Act

Enacted in 1970, the Fair Credit Reporting Act (FCRA) was the first modern information privacy law. The act tells consumer reporting agencies (credit bureaus) that they can disclose credit reports on individuals only for a permissible purpose. The main allowable purposes are for credit transactions or assessments, employment, insurance, eligibility for a government license, or for a legitimate business need in connection with a transaction initiated by a consumer.
Some other governmental, law enforcement, and national security purposes also qualify.90

Health Insurance Portability and Accountability Act

The privacy rule issued under the authority of the Health Insurance Portability and Accountability Act controls all disclosures of protected health information by covered entities (health care providers, health plans, and clearinghouses).91 However, the rule allows numerous disclosures without consent of the data subject. Disclosures for research purposes are permitted if an institutional review board or a privacy board approved a waiver
of individual authorization.92 Once disclosed to a researcher, protected health information is no longer subject to regulation under HIPAA (unless the researcher is otherwise a covered entity). However, the researcher will still be subject to the institutional review board that approved the project, which may seek to oversee or enforce the conditions of the disclosure, including restrictions on redisclosure. Whether institutional review boards have adequate oversight or enforcement capabilities is an open question.

Confidential Information Protection and Statistical Efficiency Act

The Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA) provides generally that data acquired by a federal agency under a pledge of confidentiality and for exclusively statistical purposes must be used by officers, employees, or agents of the agency exclusively for statistical purposes.93 Information acquired under a pledge of confidentiality for exclusively statistical purposes cannot be disclosed in identifiable form for any use other than an exclusively statistical purpose, except with consent.

The law essentially seeks to provide for functional separation of records, which means ensuring that data collected for a research or statistical purpose cannot be used for an administrative purpose.94 Some other statistical confidentiality laws (see below) offer express protections against subpoenas, but CIPSEA does not directly address legal process. The law's definition of nonstatistical purpose can be read to exclude disclosures for legal process, but any exclusion is not express, and the law has not been tested.95

Driver's Privacy Protection Act

In 1994, Congress passed a law that prevents the states from disclosing motor vehicle and drivers' license records.
As later amended, the Driver's Privacy Protection Act requires affirmative consent before those records can be disclosed.96 The law allows disclosures for permissible purposes, and one of the purposes is for use in research activities and in producing statistical reports.97 Any personal information so used cannot be published, redisclosed, or used to contact individuals.

Highway Toll Records

At least one state has a strict law protecting the confidentiality of electronic toll collection system (E-Z Pass) records that excludes all secondary uses, apparently including law enforcement and research. New Hampshire law provides that
all information received by the department that could serve to identify vehicles, vehicle owners, vehicle occupants, or account holders in any electronic toll collection system in use in this state shall be for the exclusive use of the department for the sole purpose of administering the electronic toll collection system, and shall not be open to any other organization or person, nor be used in any court in any action or proceeding, unless the action or proceeding relates to the imposition of or indemnification for liability pursuant to this subdivision. The department may make such information available to another organization or person in the course of its administrative duties, only on the condition that the organization or person receiving such information is subject to the limitations set forth in this section. For the purposes of this section, administration or administrative duties shall not include marketing, soliciting existing account holders to participate in additional services, taking polls, or engaging in other similar activities for any purpose.98

No search was undertaken to locate comparable state laws.

Laws Protecting Research or Statistical Records

Several laws provide stronger protection for research or statistical records, sometimes shielding the records from legal process. These laws vary, sometimes significantly, from agency to agency. It is not clear whether the differences are intentional or are the result of legislative happenstance.

Census Bureau

For records of the Census Bureau, the law prohibits the use, publication, or disclosure of identifiable data (with limited statistical/administrative exceptions). It even provides that a copy of a census submission retained by the data subject is immune from legal process and is not admissible into evidence in court. This may be the most comprehensive statutory protection against judicial use in any law.

Health Agencies

A law applicable to activities undertaken or supported by the Agency for Healthcare Research and Quality protects identifiable information from being used for another purpose without consent and prohibits publication or release without consent.99 A similar law applies to the National Center for Health Statistics.100 Neither law expressly addresses protection against legal process, but the U.S. Department of Health and Human Services reportedly believes that both laws can be used to defeat subpoenas.

Justice Agencies

A law protects identifiable research and statistical records of recipients of assistance from the Office of Justice Programs, the National Institute of Justice, and the Bureau of Justice Assistance.101 The law prohibits secondary uses and makes records immune from legal process or admission into evidence without the consent of the data subject. While the protection appears to be broad, the law yields to uses and disclosures "provided by" federal law. Thus, it appears that any statute or regulation calling for a use or disclosure (including the USA Patriot Act) would be effective.

Controlled Substances Act

Through the Controlled Substances Act, the attorney general can give a grant of confidentiality that authorizes a researcher to withhold identifiers of research subjects.102 Disclosure of identifiers of research subjects may not be compelled in any federal, state, or local civil, criminal, administrative, legislative, or other proceeding. The scope of protection against compelled disclosure is impressive and more detailed than some other laws.

Institute of Education Sciences

A law applicable to the recently established Institute of Education Sciences at the U.S. Department of Education severely restricts the disclosure of individually identifiable information and includes immunity from legal process.103 However, this strong protection has a significant limitation added by the USA Patriot Act. That act makes the records available for the investigation and prosecution of terrorism.104 A court order is required, but the court is obliged to issue the order if the government certifies that there are specific and articulable facts giving reason to believe that the information is relevant to a terrorism investigation or prosecution.

The change in confidentiality protection previously afforded to education records is significant and potentially chilling.
First, it illustrates how Congress can easily amend statutory protections afforded to statistical or research records. Second, the change appears to be retroactive, meaning that all records previously obtained under the older, more complete confidentiality regime are no longer protected against terrorism uses. Third, the availability of the records for terrorism eliminates the functional separation previously provided by law.

Public Health Service Act

The Public Health Service Act105 authorizes the secretary of health and human services to provide a certificate of confidentiality to persons engaged
in biomedical, behavioral, clinical, or other research. The certificate protects a researcher from being compelled in any federal, state, or local civil, criminal, administrative, legislative, or other proceedings to identify data subjects.106 Certificates are not limited to federally supported research. A confidentiality certificate does not protect against voluntary or consensual disclosure by the researcher or the data subject. It is not certain that a certificate protects data if the data subject's participation in the research is otherwise known.

Laws That May Require Disclosure

Open Records Laws

Virtually all government agencies are subject to either federal or state open records laws. The federal Freedom of Information Act107 permits any person to request any record from a federal agency. The law's personal privacy exemption covers most identifiable information about individuals. The exemption would be likely to protect any personal data contained in research records maintained by government researchers. While many state open records laws are similar to the federal law, some are significantly different. For example, some state open records laws do not provide a privacy exemption at all. In those states, research records might be protected under other exemptions, other state laws, by constitutional limitations, or, conceivably, not at all.

In a 1999 appropriations law, Congress directed the U.S. Office of Management and Budget (OMB) to require federal awarding agencies to ensure that all data produced under a grant be made available to the public through the procedures established under the Freedom of Information Act (FOIA). The purpose was to provide for public access to government-funded research data. The extension of the FOIA to government grantees was unprecedented.
OMB Circular A-110 contains the implementing rules.108 The circular defines research data to exclude personal information that would be exempt from disclosure under the FOIA's personal privacy exemption "such as information that could be used to identify a particular person in a research study."109 The possibility for disclosure of identifiable research data is remote, but the OMB standard here, "could be used to identify a particular person," is not derived expressly from the FOIA itself. It is not clear how the phrase should be interpreted. See the discussion of identifiability standards above.

Public Records

Public records is a term that loosely refers to government records that contain personal information about individuals and that are available for
public inspection or copying either in whole or in part.110 State and local governments, rather than the federal government, maintain most public records. Examples include property ownership records, property tax records, occupational licenses, voting registration records, court records, ethics filings, and many more. Many states publicly disclosed drivers' license data before the federal Driver's Privacy Protection Act restricted such disclosures.111 Some public records are available only to some users or for some purposes.

Public records are relevant for several reasons. First, they are often source material for commercial or other data activities. Many details of an individual's life, activities, and personal characteristics can be found in public files of government agencies. Regular review of public records may not only reveal current information about individuals but will also permit the compilation of a history of former addresses, roommates, jobs, and other activities. Commercial data companies heavily exploit public records to build personal and household profiles. Second, the records typically contain address information. Third, the continued public availability of public records has become controversial in some states because of privacy and identity theft concerns. Legislatures are reviewing decisions about disclosure of the records.

Conclusion

Some privacy laws include provisions regulating the disclosure of personal information. Other laws regulate the disclosure of narrowly defined categories of records used for statistical or research purposes. Still other laws define the terms under which public records (largely maintained by state and local governments) are publicly available. Open records laws make all government records subject to disclosure procedures, but records containing personal information are often exempt from mandated disclosure. Many records in private hands are not regulated at all for disclosure.
There is no overarching theme or policy to be found in the law for disclosure of personal information, and it may require diligent research to determine when or if personal information in public or private hands is subject to disclosure obligations or restrictions.

LIABILITY

Liability for misuse of personal data is a complex issue, and it can be addressed here only briefly. A full treatment would result in a legal treatise of significant length that would not provide significant enlightenment.112 Some privacy laws expressly include criminal or civil penalties that may apply to record keepers or to record users. Other laws or policies may apply
to specific record keepers. Physicians, for example, have an ethical obligation to protect the confidentiality of patient information, and they could be sued or sanctioned under a variety of laws and theories for breaching that obligation. Credit bureaus are subject to the rules of the Fair Credit Reporting Act, and the law provides for administrative enforcement, civil liability, and criminal penalties.113 Credit bureaus also have qualified immunity that provides limited protection against lawsuits from data subjects.114 Some penalties apply to those who misuse credit reports. Most merchants are likely to have neither a statutory nor an ethical obligation to protect client data, but some may have customer agreements or formal privacy policies. Violations of those agreements or policies could give rise to liability under tort or contract law and perhaps under other theories as well.

Officers and employees of statistical agencies are subject to criminal penalties for wrongful disclosure of records.115 The Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA) expanded the class of individuals who may be subject to criminal penalties for wrongful disclosure.116 CIPSEA penalties cover officers and employees of a statistical agency, along with agents. An agent is a broadly defined category that appears to include anyone whom a statistical agency allows to perform a statistical activity that involves access to restricted statistical information.117 An agent must agree in writing to comply with agency rules.

CIPSEA does not include any provision that would expressly authorize a data subject to sue over a wrongful disclosure. However, other laws, including the Privacy Act of 1974, might provide a basis for a lawsuit for an individual against a federal agency that wrongfully used or disclosed personal data.
It is unlikely that the courts would conclude that CIPSEA creates a private right of action for an aggrieved data subject against an agency employee or agent who improperly used or disclosed statistical information, but state law might provide a tort or other remedy. The creativity of plaintiffs' lawyers in finding a basis for a cause of action for cases with attractive facts should not be discounted. Winning a lawsuit and receiving damages, however, are harder to accomplish.

Because of the patchwork quilt of privacy statutes and legal principles, the administrative, civil, and criminal liability of each record keeper and each record user must be analyzed separately. In the absence of statutes or regulations, the analysis would begin by identifying any duty that a record keeper may have to a data subject. In many instances, there will be no clear duty. In at least some circumstances, however, it may be possible for a data subject to have a legal remedy against a wholly unrelated third party, regardless of the source of the data used by the third party. The tort for intrusion upon seclusion and the tort for publicity given to private life permit a lawsuit to be filed against another person who has no relationship
with the data subject and no defined contractual or statutory duty of confidentiality.118 The availability of these torts varies from state to state.

An unexplored area of potential liability involves recipients of deidentified data who then reidentify the data subjects. In some instances, exploration of liability begins with a regulation. For example, under the federal health privacy rules issued under the authority of HIPAA, disclosure of a limited data set is permitted for some activities (including research) subject to conditions that include a prohibition against identifying the information. If the recipient is not a covered entity under the rule, then there is no administrative enforcement against the recipient.119 Other enforcement possibilities may be available regardless of an underlying law.

When a recipient obtains information from an entity that has a confidentiality duty to data subjects, liability over reidentification could arise in several ways. The reidentification activity might violate the agreement under which the data were transferred. The data supplier might be able to sue the recipient for breach of contract. Assuming that any privacy obligation falls directly on the data supplier only and not on the recipient, it is possible that the supplier could be sanctioned administratively for failing to properly control further use of the information.

If a recipient reidentifies data contrary to a contract or a law, it is possible that an aggrieved data subject could sue either the data supplier or the recipient. For the supplier, the principal question would be whether a breach of a duty of confidentiality resulted from an imprudent transfer of deidentified data. For a lawsuit against the recipient by an aggrieved data subject, the legal preliminaries are more complex.
A tort or contract lawsuit may be possible, but a data subject may be unable to sue the recipient relying on the contract because the data subject is not a party to the contract between the data supplier and the recipient. The data subject lacks privity (an adequate legal relationship) to the contract to be able to use the contract to enforce an interest. In general, the requirement for privity can be a major obstacle to enforcement of privacy rights for data subjects.120

However, the lack of privity can be trumped in some jurisdictions by the doctrine of third-party beneficiaries. Under current contract law principles, a contract with privacy clauses benefiting a data subject who is not a party to the contract may still be enforceable by the data subject. The conditions are that the parties to the contract (i.e., the supplier and the recipient) intended the data subject to benefit and that enforcement by the data subject is appropriate to achieve the intent of the parties.121 In other words, a data subject may be able to sue to enforce data protection provisions of a contract despite the lack of privity.122 The law on third-party beneficiaries varies from jurisdiction to jurisdiction, so different results are possible in different states.

Now consider the class of data recipients that reidentifies data after obtaining the data from public or other sources. These recipients may have no duty to the data subject and no direct relationship or obligation to the data suppliers. For example, Latanya Sweeney demonstrated that U.S. hospital discharge data, publicly available with all overt identifiers removed, can nevertheless be linked to individual patients.123 In one example, she identified the hospital record of the governor of Massachusetts from records that had been deidentified before public release.124 A public disclosure of this type of information could support a lawsuit against a researcher, although a public figure might have a more difficult case, especially if a newspaper published the reidentified data. Whether the agency that released the discharge data could also be sued is uncertain.

A federal government agency might conceivably be sued for disclosing potentially identifiable personal information in violation of the Privacy Act of 1974.125 However, the act also allows agencies to justify disclosures that are compatible with the purpose for which the information was collected. An agency that took steps to allow a disclosure of deidentified data might have a complete defense.126 In any event, the act may not cover deidentified data at all, and the agency might not be responsible for its subsequent reidentification by another party.

In all of these possible lawsuits, much would depend on the facts. If a reidentified record were used for a research purpose, a data subject might have difficulty convincing a jury that harm resulted. However, if the data were used to deny an insurance policy or to charge a higher price for a mortgage, proof of harm would be enhanced, as would the jury appeal of the case.

Because there are so many institutions, interests, and potential legal standards, no broad conclusion about legal liability for data disclosures can be offered.
Some statutes include clear sanctions, but much is uncertain otherwise. A data subject might have a remedy with respect to the disclosure or use of deidentified data that are later reidentified. The type of remedy and the likelihood of success would vary depending on the source of the data, the institutions involved, their relationship with the data subject, and other facts. No known case or statute clearly addresses the possibility of a lawsuit by a data subject over reidentification of personal data.

It is noteworthy, however, that remedies for the misuse and disclosure of identifiable personal information are often weak or absent. It seems unlikely that protections for deidentified data would be easier to achieve through the courts in the absence of clear statutes or other standards. However, the creativity of the plaintiffs' bar and the courts should not be discounted should a shocking misuse of data occur.

CONCLUDING OBSERVATIONS

The law surrounding the collection, maintenance, use, and disclosure of personal information by researchers and others is typically vague, incomplete, or entirely absent. The possibility of civil liability to a data subject for collection, use, or disclosure of personal information exists, but lawsuits are not frequent, successes are few, and cases are highly dependent on facts.

However, the research community faces other risks. For example, if an aggressive researcher or tabloid newspaper acquires deidentified research data and reidentifies information about politicians, celebrities, or sports heroes, the story is likely to be front-page news everywhere. The resulting public outcry could result in a major change in data availability or the imposition of direct restrictions on researchers. Many privacy laws originated with horror stories that attracted press attention. When a reporter obtained the video rental records of a U.S. Supreme Court nominee, nervous members of Congress quickly passed a privacy law restricting the use and disclosure of video rental records.127 The Driver's Privacy Protection Act also had its origins with a horror story.

The demise of Human Resources Development Canada's Longitudinal Labour Force File in the summer of 2000 offers an example of how privacy fears and publicity can affect a research activity. The file was the largest repository of personal information on Canadian citizens, with identifiable information from federal departments and private sources. The database operated with familiar controls for statistical records, including exclusive use for research, evaluation, and policy and program analysis. The public did not know about the database until the federal privacy commissioner raised questions about the "invisible citizen profile."128 The database was staunchly defended, but the public objections were too strong, and Canada dismantled the database.
The case for the database was not helped by its media designation as the "Big Brother Database."129

Methods for collecting and using data while protecting privacy interests exist, but how effective they are, how much they compromise research results, and how much they are actually used is unclear. It appears that there is room for improvement using existing policies, methodologies, and practices. However, there may be some natural limits to what can be accomplished. The availability of personal data and the technological capabilities for reidentification seem to increase routinely over time as the result of factors largely beyond control.

Basic transparency rules (for both privacy and human subjects protection) require that respondents be told of the risks and consequences of supplying data. For data collected voluntarily from respondents, it is possible that cooperation will vary inversely with the length of a privacy notice. Even when data activities (research or otherwise) include real privacy protections, people may still see threats regardless of the legal, contractual, or technical measures promised. Reports of security and other privacy breaches are commonplace.

Complex privacy problems will not be solved easily because of the many players and interests involved. Those who need data for legitimate purposes have incentives for reducing the risks that data collection and disclosure entail, but data users are often more focused on obtaining and using data and less on remote possibilities of bad publicity, lawsuits, and legislation. The risk to a data subject is a loss of privacy. The risks to data suppliers and users include legal liability for the misuse of data and the possibility of additional regulation. The risk to researchers, statisticians, and their clients is the loss of data sources. The risk to society is the loss of research that serves important social purposes. These risks should encourage all to work toward better rules governing the use and disclosure of sensitive personal information.

Risks can be minimized, but most cannot be eliminated altogether. Self-restraint and professional discipline may limit actions that threaten the user community, but controls may not be effective against all members of the community and they will not be effective against outsiders. Industry standards may be one useful way to minimize risks, maximize data usefulness, and prevent harsher responses from elsewhere. If standards do not come from elsewhere, however, then the courts and the legislatures may eventually take action. Judicial and legislative actions always follow technological and other developments, and any changes imposed could be harsh and wide-reaching, especially if the issue is raised as a result of a crisis. Privacy legislation often begins with a well-reported horror story.

NOTES

1.
Collection Limitation Principle: There should be limits to the collection of personal data and any such data should be obtained by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject.
Data Quality Principle: Personal data should be relevant to the purposes for which they are to be used and, to the extent necessary for those purposes, should be accurate, complete, and kept up-to-date.
Purpose Specification Principle: The purposes for which personal data are collected should be specified not later than at the time of data collection, and the subsequent use limited to the fulfillment of those purposes or such others as are not incompatible with those purposes, and as are specified on each occasion of change of purpose.
Use Limitation Principle: Personal data should not be disclosed, made available or otherwise used for purposes other than those specified in accordance with the Purpose Specification Principle except (a) with the consent of the data subject, or (b) by the authority of law.
Security Safeguards Principle: Personal data should be protected by reasonable security safeguards against such risks as loss or unauthorized access, destruction, use, modification or disclosure of data.
Openness Principle: There should be a general policy of openness about developments, practices and policies with respect to personal data. Means should be readily available of establishing the existence and nature of personal data, and the main purposes of their use, as well as the identity and usual residence of the data controller.
Individual Participation Principle: An individual should have the right (a) to obtain from a data controller, or otherwise, confirmation of whether or not the data controller has data relating to him; (b) to have communicated to him data relating to him within a reasonable time; at a charge, if any, that is not excessive; in a reasonable manner; and in a form that is readily intelligible to him; (c) to be given reasons if a request made under subparagraphs (a) and (b) is denied, and to be able to challenge such denial; and (d) to challenge data relating to him and, if the challenge is successful to have the data erased, rectified, completed, or amended.
Accountability Principle: A data controller should be accountable for complying with measures, which give effect to the principles stated above.
Organisation for Economic Co-Operation and Development (1980).
2. 5 U.S.C. § 552(b)(4).
3. Executive Order 12958.
4. Laws in other countries sometimes extend privacy protections to legal persons. Corporate confidentiality interests (whether arising under privacy laws, through statistical surveys that promise protection against identification, or otherwise) can raise similar issues of identification and reidentification as with individuals. Corporate confidentiality interests are beyond the scope of this paper. Another set of related issues is group privacy. Groups can be defined in many ways, but race, ethnicity, and geography are familiar examples.
If the disclosure of microdata can be accomplished in a way that protects individual privacy interests, the data may still support conclusions about identifiable racial, ethnic, or neighborhood groups that may be troubling to group members. Group privacy has received more attention in health care than in other policy arenas. See Alpert (2000).
5. The term deidentified is used here to refer to data without overt identifiers but that may still, even if only theoretically, be reidentified. Data that cannot be reidentified are referred to as wholly nonidentifiable data.
6. See generally Gellman (2001). For more on the growth in information collection and availability, see Sweeney (2001).
7. 15 U.S.C. § 1681 et seq.
8. National Committee on Vital and Health Statistics, Subcommittee on Privacy and Confidentiality (1998a).
9. 5 U.S.C. § 552a.
10. 5 U.S.C. § 552a(a)(4). The value of a fingerprint as an identifier is uncertain. Without access to a database of fingerprints and the ability to match fingerprints, a single fingerprint can rarely be associated with an individual. The same is true for a photograph. For example, a photograph of a four-year-old taken sometime in the last 50 years is not likely to be identifiable to anyone other than a family member.
11. Just to make matters even more complex, the federal Freedom of Information Act (5 U.S.C. § 552) has a standard for privacy that is not the same as the Privacy Act. In Forest Guardians v. U.S. FEMA (10th Cir. 2005), available: http://www.kscourts.org/ca10/cases/2005/06/04-2056.htm, the court denied a request for "electronic GIS files . . . for the 27 communities that have a flood hazard designated by FEMA . . .
showing all of the geocoded flood insurance policy data (with names and addresses removed) including the location of structures relative to the floodplain and whether the structure insured was constructed before or after the community participated in the NFIP." The court found that disclosure would constitute an unwarranted invasion of privacy, the privacy standard under the FOIA. The court reached this conclusion even though virtually identical information had been released in a paper file. The case turned mostly on the court's conclusion that there was a lack of public interest in disclosure, a relevant standard for FOIA privacy determinations. In striking a balance, the court found that any privacy interest, no matter how small, outweighed a nonexistent public interest in disclosure.
12. Personal information means information that identifies "an individual, including an individual's photograph, social security number, driver identification number, name, address (but not the 5-digit zip code), telephone number, and medical or disability information, but does not include information on vehicular accidents, driving violations, and driver's status." 18 U.S.C. § 2725(3).
13. Personally identifiable information "includes information which identifies a person as having requested or obtained specific video materials or services from a video tape service provider." 18 U.S.C. § 2710(a)(3).
14. 47 U.S.C. § 551(a)(2)(A).
15. E-Government Act of 2002, Pub. L. 107-347, Dec. 17, 2002, 116 Stat. 2899, 44 U.S.C. § 3501 note § 502(4).
16. S.C. 2000, c. 5, § 2(1), available: http://www.privcom.gc.ca/legislation/02_06_01_01_e.asp.
17. Perrin, Black, Flaherty, and Rankin (2001).
18. Loi No. 78-17 du 6 janvier 1978 at Article 4, available: http://www.bild.net/dataprFr.htm. A 2004 amendment added these words: "In order to determine whether a person is identifiable, all the means that the data controller or any other person uses or may have access to should be taken into consideration." Act of 6 August 2004 at Article 2, available: http://www.cnil.fr/fileadmin/documents/uk/78-17VA.pdf. The amendment does not appear to have changed the strict concept of identifiability or to have added any reasonableness standard.
19. Joel R. Reidenberg and Paul M.
Schwartz, Data Protection Law and Online Services: Regulatory Responses (1998) (European Commission), available: http://ec.europa.eu/justice_home/fsj/privacy/docs/studies/regul_en.pdf.
20. See Loi No. 78-17 du 6 janvier 1978 (as amended) at Article 32 (IV) (allowing the French data protection authority to approve anonymization schemes), Article 54 (allowing the French data protection authority to approve methodologies for health research that do not allow the direct identification of data subjects), and Article 55 (allowing exceptions to a requirement for coding personal data in some medical research activities), available: http://www.cnil.fr/fileadmin/documents/uk/78-17VA.pdf.
21. Directive on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data, Council Directive 95/46/EC, 1995 O.J. (L 281) 31, at Article 2(a), available: http://europa.eu.int/comm/internal_market/en/dataprot/law/index.htm.
22. Id. at Recital 26.
23. Council of Europe, Recommendation No. R (97) 5 of the Committee of Ministers to Member States on the Protection of Medical Data § 1 (1997), available: http://www.cm.coe.int/ta/rec/1997/word/97r5.doc.
24. Council of Europe, Explanatory Memorandum to Recommendation No. R (97) 5 of the Committee of Ministers to Member States on the Protection of Medical Data § 36 (1997), available: http://www.cm.coe.int/ta/rec/1997/ExpRec(97)5.htm.
25. Privacy Act 1988 § 6 (2001), available: http://www.privacy.gov.au/publications/privacy88.pdf.
26. UK Data Protection Act 1998 § 1(1) (1998), available: http://www.legislation.hmso.gov.uk/acts/acts1998/19980029.htm.
27. Canadian Institutes of Health Research, Recommendations for the Interpretation and Application of the Personal Information Protection and Electronic Documents Act (S.C.2000, c.5) in the Health Research Context 6 (Nov. 30, 2001), available: http://www.cihr.ca/about_cihr/ethics/recommendations_e.pdf.
1(a) For greater certainty, "information about an identifiable individual", within the meaning of personal information as defined by the Act, shall include only that information that can:
(i) identify, either directly or indirectly, a specific individual; or,
(ii) be manipulated by a reasonably foreseeable method to identify a specific individual; or
(iii) be linked with other accessible information by a reasonably foreseeable method to identify a specific individual.
1(b) Notwithstanding subsection 1(a), "information about an identifiable individual" shall not include:
(i) anonymized information which has been permanently stripped of all identifiers or aggregate information which has been grouped and averaged, such that the information has no reasonable potential for any organization to identify a specific individual; or
(ii) unlinked information that, to the actual knowledge of the disclosing organization, the receiving organization cannot link with other accessible information by any reasonably foreseeable method, to identify a specific individual.
(c) Whether or not a method is reasonably foreseeable under subsections 1(a) and 1(b) shall be assessed with regard to the circumstances prevailing at the time of the proposed collection, use or disclosure.
28. Alberta Health Information Act § 1(p) (1999), available: http://www.qp.gov.ab.ca/Documents/acts/H05.CFM.
29. Id. at § 1(r).
30. Id. at § 1(g).
31. Id. at § 68-72.
32. Nonstatutory administrative reviews of data disclosure may be commonplace.
For example, the National Center for Health Statistics in the Department of Health and Human Services uses an administrative review process with a Disclosure Review Board to assess the risk of disclosure for the release of microdata files for statistical research. National Center for Health Statistics, Staff Manual on Confidentiality (2004), http://www.cdc.gov/nchs/data/misc/staffmanual2004.pdf.
33. U.S. Department of Health and Human Services, "Standards for Privacy of Individually Identifiable Health Information," 65 Federal Register 82462-82829 (Dec. 28, 2000) (codified at 45 C.F.R. Parts 160 & 164).
34. Public Law No. 104-191, 110 Stat. 1936 (1996).
35. 45 C.F.R. § 160.103.
36. Id. at § 164.514(b)(2). The complete list of data elements includes "(A) Names; (B) All geographic subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code if, according to the current publicly available data from the Bureau of the Census: (1) The geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and (2) The initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000; (C) All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older; (D) Telephone numbers; (E) Fax numbers; (F) Electronic mail addresses;
(G) Social security numbers; (H) Medical record numbers; (I) Health plan beneficiary numbers; (J) Account numbers; (K) Certificate/license numbers; (L) Vehicle identifiers and serial numbers, including license plate numbers; (M) Device identifiers and serial numbers; (N) Web Universal Resource Locators (URLs); (O) Internet Protocol (IP) address numbers; (P) Biometric identifiers, including finger and voice prints; (Q) Full face photographic images and any comparable images; and (R) Any other unique identifying number, characteristic, or code."
37. Id. at § 164.514(b)(2)(ii).
38. 45 C.F.R. § 164.512(b)(1).
39. Id. at § 164.512(b)(1)(i). The commentary accompanying the rule includes references to published materials offering guidance on assessing risk, and it recognizes that there will be a need to update the guidance over time. Those materials are Federal Committee on Statistical Methodology, Statistical Policy Working Paper 22, Report on Statistical Disclosure Limitation Methodology (1994), available: http://www.fcsm.gov/working-papers/wp22.html; "Checklist on Disclosure Potential of Proposed Data Releases," 65 Federal Register 82709 (Dec. 28, 2000), available: http://www.fcsm.gov/docs/checklist_799.doc.
40. 45 C.F.R. § 164.512(b)(1)(ii).
41. 45 C.F.R. § 164.514(e).
42. Quintiles Transnational Corp. v. WebMD Corp., No. 5:01-CV-180-BO(3) (E.D. N.C. Mar. 21, 2002).
43. R. v. Dept of Health ex parte Source Informatics Ltd., 1 All E.R. 786, 796-97 (C.A. 2000), reversing 4 All E.R. 185 (Q.B. 1999).
44. The Southern Illinoisan v. Illinois Department of Public Health, 812 N.E.2d 27 (Ill. App. Ct. 2004), available: http://www.state.il.us/court/Opinions/AppellateCourt/2004/5thDistrict/June/html/5020836.htm.
45. The Court's opinion focused in significant part on the expert abilities of Sweeney and found a lack of evidence demonstrating whether other individuals could identify individuals in the same fashion.
Available: http://www.state.il.us/court/opinions/SupremeCourt/2006/February/Opinions/Html/98712.htm. The opinion suggests that a different result might be obtained with a better factual showing that identifiability capabilities were more widespread among the population. Just how difficult it would be for others to reidentify the records is not entirely clear. However, both courts ignored the possibility that a recipient of data could hire someone with Sweeney's skills and learn the names of patients. The court's basis for decision does not seem to be sustainable in the long run.
46. Northwestern Memorial Hospital v. Ashcroft, 362 F.3d 923 (7th Cir. 2004), available: http://www.ca7.uscourts.gov/tmp/I110H5XZ.pdf.
47. Two quotes from the decision are worth reproducing:
    Some of these women will be afraid that when their redacted records are made a part of the trial record in New York, persons of their acquaintance, or skillful "Googlers," sifting the information contained in the medical records concerning each patient's medical and sex history, will put two and two together, "out" the 45 women, and thereby expose them to threats, humiliation, and obloquy.
    ***
    Even if there were no possibility that a patient's identity might be learned from a redacted medical record, there would be an invasion of privacy. Imagine if nude pictures of a woman, uploaded to the Internet without her consent though without identifying her by name, were downloaded in a foreign country by people who will never meet her. She would still feel that
    her privacy had been invaded. The revelation of the intimate details contained in the record of a late-term abortion may inflict a similar wound.
48. See generally, Gellman (2005).
49. Extensive rules and laws govern surveillance by wire, whether by government actors or private parties.
50. 389 U.S. 347 (1967).
51. 389 U.S. at 351.
52. 389 U.S. at 361.
53. See Schwartz (1995).
54. 460 U.S. 276 (1983).
55. 460 U.S. at 281.
56. Id. at 284.
57. 476 U.S. 207 (1986).
58. 476 U.S. 227 (1986).
59. Id.
60. In Kyllo v. United States, 533 U.S. 27 (2001), the Supreme Court found that police use of heat imaging technology to search the interior of a private home from the outside was a Fourth Amendment search that required a warrant. The case turned in part on the use by the government of "a device that is not in general public use, to explore the details of the home that would previously have been unknowable without physical intrusion." Id. at 40. The broader implications of the Court's standard for technology not in general public use are not entirely clear.
61. Wash. Rev. Code § 9A-44-115.
62. Wash. Rev. Code § 9A-44-115(1)(c).
63. 2003 Wash. Laws § 213 (amending Wash. Rev. Code § 9A-44-115).
64. Ariz. Rev. Stat. § 13-3019(C)(4).
65. Conn. Gen. Stat. § 31-48b(b).
66. Tex. Health & Safety Code § 242.501(a)(5).
67. The other torts are for appropriation of a name or likeness, publicity given to private life, and publicity placing a person in a false light. 3 Restatement (Second) of Torts § 652A et seq. (1977).
68. Id. at § 652B.
69. Id. at comment c.
70. Nader v. General Motors Corp., 255 N.E.2d 765 (N.Y. 1970), 1970 N.Y. LEXIS 1618.
71. Galella v. Onassis, 487 F.2d 986 (2d Cir. 1973).
72. See, e.g., In the Matter of an Application of the United States For an Order (1) Authorizing the Use of a Pen Register and a Trap and Trace Device and (2) Authorizing Release of Subscriber Information and/or Cell Site Information, Magistrate's Docket No.
05-1093 (JO), available: www.eff.org/legal/cases/USA_v_PenRegister/celltracking_denial.pdf; Brief for amicus Electronic Frontier Foundation at 7, available: http://www.eff.org/legal/cases/USA_v_PenRegister/celltracking_EFFbrief.pdf ("The prospective collection of cell site data will therefore reveal the cell phone's location even when that information could not have been derived from visual surveillance, but only from a physical search" [footnote omitted]).
73. Note, Harvard Journal of Law and Technology (fall, 2004).
    Given current database and storage capacities, the door is open for an Orwellian scenario whereby law enforcement agents could monitor not just criminals, but anyone with a cell phone. If it sounds improbable, consider that commercial tracking services already provide real-time location information for families and businesses. (p. 316)
74. Organisation for Economic Co-Operation and Development, Council Recommendations Concerning Guidelines Governing the Protection of Privacy and Transborder Flows of Personal Data, 20 I.L.M. 422 (1981), O.E.C.D. Doc. C (80) 58 (Final) (Oct. 1, 1980), available: http://www.oecd.org/document/18/0,2340,en_2649_34255_1815186_1_1_1_1,00.html.
75. Council Directive 95/46, art. 28, on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data, 1995 O.J. (L281/47), available: http://europa.eu.int/comm/justice_home/fsj/privacy/law/index_en.htm.
76. Additional rules govern the processing of special categories of data (racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, and data concerning health or sex life). Generally, explicit consent is necessary for collection of these special categories, with some exceptions.
77. Article 7.
78. UK Data Protection Act 1998 §§ 10, 11 (1998), available: http://www.legislation.hmso.gov.uk/acts/acts1998/19980029.htm.
79. U.S. Department of Health and Human Services, "Standards for Privacy of Individually Identifiable Health Information," 65 Federal Register 82462-82829 (Dec. 28, 2000) (codified at 45 C.F.R. Parts 160 & 164).
80. 5 U.S.C. § 552a.
81. Id. at §§ 552a(e)(1), (2), & (7).
82. U.S. Department of Health and Human Services, "Standards for Privacy of Individually Identifiable Health Information," 65 Federal Register 82462-82464 (Dec. 28, 2000).
83. 45 C.F.R. § 164.502(b).
84. 15 U.S.C. § 6502.
85. 47 U.S.C. § 551(b).
86. Uniting and Strengthening America by Providing Appropriate Tools Required to Intercept and Obstruct Terrorism (USA Patriot Act) Act of 2001, Public Law No. 107-056, 115 Stat. 272, available: http://frwebgate.access.gpo.gov/cgi-bin/getdoc.cgi?dbname=107_cong_public_laws&docid=f:publ056.107.
87. 50 U.S.C. § 1861.
88. 5 U.S.C. § 552a.
89.
The conditions of disclosure are at 5 U.S.C. § 552a(b), with the routine use authority at (b)(3). The definition of routine use is at 5 U.S.C. § 552a(a)(7).
90. 15 U.S.C. § 1681b.
91. 45 C.F.R. § 164.512.
92. Id. at § 164.512(i).
93. 44 U.S.C. § 3501 note, § 512(a). An exception allows disclosure to a law enforcement agency for the prosecution of submissions of false statistical information under statutes imposing civil or criminal penalties. Id. at § 504(g).
94. See Privacy Protection Study Commission, Personal Privacy in an Information Society 573 (1977), available: http://www.epic.org/privacy/ppsc1977report/. See also National Research Council and the Social Science Research Council (1993:34-35).
95. 44 U.S.C. § 3501 note, § 502(5).
96. 18 U.S.C. § 2721.
97. Id. at § 2721(b)(5).
98. N.H. Rev. Stat. Online § 237:16-e (2004), available: http://www.gencourt.state.nh.us/rsa/html/XX/237/237-16-e.htm.
99. 42 U.S.C. § 934 (formerly 42 U.S.C. § 299c-3(c)).
100. 42 U.S.C. § 242m(d).
101. 42 U.S.C. § 3789g(a).
102. 21 U.S.C. § 872(c).
103. 20 U.S.C. § 9573. The law formerly applied only to the National Center for Education Statistics.
104. USA Patriot Act of 2001 at § 508 (amending 20 U.S.C. § 9007), Public Law No. 107-056, 115 Stat. 272, available: http://frwebgate.access.gpo.gov/cgi-bin/getdoc.cgi?dbname=107_cong_public_laws&docid=f:publ056.107.
105. 42 U.S.C. § 241(d).
106. The National Institutes of Health encourages investigators working on sensitive biomedical, behavioral, clinical, or other types of research to obtain certificates.
107. 5 U.S.C. § 552.
108. U.S. Office of Management and Budget, Circular A-110 (Uniform Administrative Requirements for Grants and Agreements with Institutions of Higher Education, Hospitals, and Other Non-Profit Organizations) (9/30/99), available: http://www.whitehouse.gov/omb/circulars/a110/a110.html.
109. Id. at .36(d)(2)(i)(A).
110. See generally, Gellman (1995).
111. 18 U.S.C. § 2721.
112. More on this general subject can be found in Perritt (2003).
113. 15 U.S.C. § 1681 et seq.
114. Id. at § 1681s-2.
115. See, e.g., 13 U.S.C. § 214 (Census Bureau employees).
116. 44 U.S.C. § 3501 note § 513. Interestingly, while CIPSEA regulates both use and disclosure of statistical information, id. at § 512, only wrong disclosure is subject to criminal penalties.
117. 44 U.S.C.
§ 3501 note § 502 ("The term 'agent' means an individual-- (A)(i) who is an employee of a private organization or a researcher affiliated with an institution of higher learning (including a person granted special sworn status by the Bureau of the Census under section 23(c) of title 13, United States Code), and with whom a contract or other agreement is executed, on a temporary basis, by an executive agency to perform exclusively statistical activities under the control and supervision of an officer or employee of that agency; (ii) who is working under the authority of a government entity with which a contract or other agreement is executed by an executive agency to perform exclusively statistical activities under the control of an officer or employee of that agency; (iii) who is a self-employed researcher, a consultant, a contractor, or an employee of a contractor, and with whom a contract or other agreement is executed by an executive agency to perform a statistical activity under the control of an officer or employee of that agency; or (iv) who is a contractor or an employee of a contractor, and who is engaged by the agency to design or maintain the systems for handling or storage of data received under this title; and (B) who agrees in writing to comply with all provisions of law that affect information acquired by that agency.")
118. 3 Restatement (Second) of Torts §§ 652B, 652D (1977).
119. The HIPAA criminal penalties may not apply, either. See U.S. Department of Justice, Office of Legal Counsel, Scope of Criminal Enforcement Under 42 U.S.C. § 1320d-6 (June 1, 2005), available: http://www.usdoj.gov/olc/hipaa_final.htm.
120. See, e.g., Reidenberg (1992).
121. Restatement (Second) of Contracts §§ 302, 303 (1981).
122. The original draft HIPAA privacy rule required business partner agreements to state that the agreements intended to create third-party beneficiary rights. In the final rule, the third-party beneficiary language was removed. The commentary stated that the rule's intent was to leave the law in this area where it was. The discussion in the final rule shows that there were strongly divergent views on the issue. See 65 Federal Register 82641 (Dec. 28, 2000).
123. Considerable amounts of patient-level information are available. For example, the Healthcare Cost and Utilization Project distributes four databases for health services research, with data dating back to 1988. This joint federal-state partnership is sponsored by the Agency for Healthcare Research and Quality, a part of the federal Department of Health and Human Services. The databases contain patient-level information for either inpatient or ambulatory surgery stays in a uniform format "while protecting patient privacy." Healthcare Cost and Utilization Project, Description of Healthcare Cost and Utilization Project (undated), available: http://www.ahcpr.gov/downloads/pub/hcup/appkitv15b.pdf. Whether the privacy protections are adequate to protect against reidentification under all conditions is uncertain. Numerous other medical data sets are available from other sources.
124. See National Committee on Vital and Health Statistics, Subcommittee on Privacy and Confidentiality (1998b).
125. 5 U.S.C. § 552a.
126. 5 U.S.C. § 552a(b)(3) allows agencies to define a routine use to justify a disclosure.
127. Video Privacy Protection Act ("Bork Law"), 18 U.S.C. § 2710.
128. Privacy Commissioner (Canada), Annual Report 1999-2000, available: http://www.privcom.gc.ca/information/ar/02_04_09_e.asp.
129. McCarthy (2000).

REFERENCES

Alpert, S.
2000 Privacy and the analysis of stored tissues. Pp.
A-1–A-36 in Research Involving Human Biological Materials: Ethical Issues and Policy Guidance (Volume II Commissioned Papers). Rockville, MD: National Bioethics Advisory Commission. Available: http://bioethics.georgetown.edu/nbac/hbmII.pdf [accessed December 2006].
Gellman, R.
1995 Public records: Access, privacy, and public policy. Government Information Quarterly 12:391-426.
2001 Public Record Usage in the United States. Paper presented at the 23rd International Conference of Data Protection Commissioners, September 25, Paris, France. Available: http://www.personaldataconference.com/eng/contribution/gellman_contrib.html [accessed December 2006].
2005 A general survey of video surveillance law in the United States. In S. Nouwt, B.R. de Vries, and C. Prins, eds., Reasonable Expectations of Privacy? Eleven Country Reports on Camera Surveillance and Workplace Privacy. Hague, Netherlands: T.M.C. Asser Press.
Harvard Journal of Law and Technology
2004 Who knows where you've been? Privacy concerns regarding the use of cellular phones as personal locators. Harvard Journal of Law and Technology 18(1):307, 316 (fall).
McCarthy, S.
2000 Ottawa pulls plug on big brother database, Canadians promised safeguards on data. Globe and Mail, May 30.
National Committee on Vital and Health Statistics, Subcommittee on Privacy and Confidentiality
1998a Proceedings of Roundtable Discussion: Identifiability of Data. Hubert Humphrey Building, January 28, Washington, DC. Transcript available: http://ncvhs.hhs.gov/980128tr.htm [accessed December 2006].
1998b Roundtable Discussion: Identifiability of Data. Available: http://ncvhs.hhs.gov/980128tr.htm [accessed January 2007].
National Research Council and the Social Science Research Council
1993 Private Lives and Public Policies. G.T. Duncan, T.B. Jabine, and V.A. de Wolf, eds. Panel on Confidentiality and Data Access. Committee on National Statistics, Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
Organisation for Economic Co-Operation and Development
1980 Council Recommendations Concerning Guidelines Governing the Protection of Privacy and Transborder Flows of Personal Data. O.E.C.D. Doc. C (80) 58 (Final). Available: http://www.oecd.org/document/18/0,2340,en_2649_34255_1815186_1_1_1_1,00.html [accessed December 2006].
Perrin, S., H.H. Black, D.H. Flaherty, and T.M. Rankin
2001 The Personal Information Protection and Electronic Documents Act: An Annotated Guide. Toronto, Canada: Irwin Law.
Perritt, H.H., Jr.
2003 Protecting Confidentiality of Research Data through Law. Paper prepared for Committee on National Statistics, National Research Council Data Confidentiality and Access Workshop, Washington, DC. Available: http://www7.nationalacademies.org/cnstat/Perritt_Paper.pdf [accessed January 2007].
Reidenberg, J.R.
1992 The privacy obstacle course: Hurdling barriers to transnational financial services. Fordham Law Review 60:S137, S175.
Reidenberg, J.R., and P.M.
Schwartz
1998 Data Protection Law and Online Services: Regulatory Responses. Commissioned from ARETE by Directorate General XV of the Commission of the European Communities. Available: http://ec.europa.eu/justice_home/fsj/privacy/docs/studies/regul_en.pdf [accessed December 2006].
Schwartz, P.
1995 Privacy and participation: Personal information and public sector regulation in the United States. Iowa Law Review 80:553, 573.
Sweeney, L.
2001 Information explosion. Chapter 3 in P. Doyle, J. Lane, J. Theeuwes, and L. Zayatz, eds., Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies. New York: North-Holland Elsevier.