The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
THE MEANS OF PROTECTING INDIVIDUAL PRIVACY IN EVALUATION RESEARCH

PHYSICAL PROTECTION

A pledge to hold information confidential is a means of striking a balance between an individual's right to privacy and the public's need to evaluate the effects of government programs or experiments. An evaluator asks people sensitive questions about their behavior, but promises to use the information exclusively for the purpose of evaluating the program or experiment and not to release it in a form that would permit identification of the individuals who furnished the information.

Researchers and evaluators clearly have two strong incentives to honor the confidentiality pledge. Their personal integrity—or the reputation of their research organization—is at stake, and their ability to obtain additional data from people in current or future evaluation research projects will be jeopardized if the pledge of confidentiality is broken. There are two major risks to the researchers' ability to keep the confidentiality promise: (1) that someone will get hold of the records containing identified individual information and use the information to injure the person who gave it or for the private benefit of someone else; (2) that even after a person's name and address have been removed from the record, someone will be able to identify that person from the pieces of information given and use the information to that person's detriment.

The first danger—that identified records will be stolen or misused—clearly increases with the number of people who have access to these records and with the extent to which these people lack a continuing commitment to evaluation research. If evaluations could be carried out by a single qualified researcher who conducted his or her own interviews, made a promise of confidentiality, and personally wrote up the results, the risk would be small.
But modern evaluation research often involves a large quantity of data and a large number of people processing that data—interviewers, coders, keypunch operators, computer programmers, and computer operators—in addition to statisticians and other researchers. Hence, the risk of misuse is significant, and rules for handling information need to be carefully formulated and strictly enforced.

The Committee feels strongly that every agency sponsoring evaluation research should have clear guidelines designed to minimize the risk of mishandling sensitive personal information. These guidelines should be strictly followed by its own evaluators and by grantees and contractors involved in evaluation research, and severe penalties should be imposed on violators.

The following rules should govern such guidelines:

(1) Sensitive information should not be collected unless it is clearly necessary to the evaluation and is to be used.

Social scientists have a tendency to load additional questions onto survey instruments because "it would be interesting to know that." Such loading should be severely discouraged, especially if the information is sensitive and might damage the respondent if revealed.

(2) Where it is feasible and does not undermine the validity of the evaluation, the anonymity of the respondent should be preserved from the beginning by not collecting identifying information at all.

Unfortunately, the collection of anonymous information usually limits the usefulness and validity of the evaluation. If identifying information is not collected, the data provided by the respondent cannot be checked. In particular, there is no way to follow up those who fail to give complete information or to estimate the likely error introduced by such incomplete responses. When the evaluation requires information from the same people at successive times (longitudinal data), identifiers must be collected so that the researchers can return for the later information. In many evaluations, the real interest centers on change brought about by a government program, and it is necessary to interview individuals more than once in order to estimate this change. Nevertheless, where highly sensitive information is needed—for example, when the behavior of interest is clearly criminal—the only way to collect such information without risk to the respondent may be to preserve absolute anonymity from the beginning.

(3) Identifying information, such as name and address or Social Security number, should be removed from the individual records at the earliest possible stage of analysis and replaced by a code number.

The key linking this code number to the identifying information should be stored in a safe place and access to it severely limited.
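Rule (3) can be illustrated with a short sketch. The code below is a minimal, hypothetical example, not a procedure prescribed by the Committee: records are assumed to arrive as Python dictionaries, and field names such as "name" and "ssn" are illustrative. It strips direct identifiers from each record, replaces them with a random code number, and keeps the linking key in a separate structure that can be locked away and later destroyed.

```python
import secrets

def pseudonymize(records, id_fields=("name", "address", "ssn")):
    """Replace direct identifiers with a random code number.

    Returns (deidentified_records, key), where `key` maps each code
    number back to the removed identifiers. The key should be stored
    separately under restricted access and destroyed when no longer
    needed (e.g., once all follow-up interviews are complete).
    """
    deidentified, key = [], {}
    for record in records:
        code = secrets.token_hex(8)  # unguessable code number
        # Move the identifying fields into the separately held key.
        key[code] = {f: record[f] for f in id_fields if f in record}
        # Keep only the substantive data, plus the code number.
        clean = {f: v for f, v in record.items() if f not in id_fields}
        clean["code"] = code
        deidentified.append(clean)
    return deidentified, key

def destroy_key(key):
    """Sever the link between code numbers and identities."""
    key.clear()
```

In practice the key would live in a physically secured, separately stored file rather than in memory; the point of the sketch is only the separation of identifiers from substantive data.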
This key should be destroyed as soon as it is no longer needed. The objective of these procedures should be to reduce to an absolute minimum the number of people who have access to identified records and to make sure that these people are fully committed to honoring the pledge of confidentiality and subject to severe penalties if they are not.

There have been relatively few problems of protecting identified information when evaluations have been carried out by federal employees and the data processed within a federal agency. Usually, however, evaluation research is carried out by private researchers or research organizations under contract with the government, and data are processed by non-government computer installations. In such situations, the contract should specify strict procedures for safeguarding identified confidential data. Interviewers and other employees handling identified data should have to sign a pledge not to reveal the information and should be subject to specified penalties if they break the pledge. Security arrangements should be specified, and the government should make sure they are observed and impose penalties if they are not. Special attention must be paid to safeguarding identified records that are stored in computers, especially computers that are accessible to many users on a time-sharing basis.

The safeguards needed become more complicated when additional researchers (other than the person or organization that originally collected and analyzed

the data) want access for reanalysis. Such reanalysis is useful not only to get an independent check on the original evaluation, but also to use data collected for evaluation purposes to test additional hypotheses about human behavior. The information collected from families in the course of the New Jersey income maintenance experiment, for example, is not only useful in evaluating the particular income maintenance plans tried out in that experiment, but also provides a rich source of data on low income families that can be used to explore a variety of hypotheses about labor force participation, spending patterns, and other family economic behavior.

It is essential, however, that reanalysis be carried out in ways that will not jeopardize the privacy of the respondents. Two basic procedures are possible. One is to release data files to outside researchers only after all identifiers have been removed and after ensuring that individuals cannot be identified from the information given about them.^ The other is not to release any raw data at all, but to provide funding so that the original researcher can respond to the requests of outside researchers and provide reanalysis to their specifications.

Under some circumstances it is useful to merge data on people from two or more confidential sources in order to improve the validity of evaluation research. For example, in comparing the effectiveness of two educational curricula for children from different socio-economic levels, children's test scores and other school performance data would be available from one source while information on their family background would be obtained from another. Merging two sets of data might be indispensable to an evaluation, but care should be taken that the merging process does not jeopardize respondents' privacy.
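The merging procedure described above can be sketched in code. This is a hypothetical illustration under assumed conventions: each file is a list of Python dictionaries, a shared "code" field plays the role of the matching key, and the school and family fields are invented. The two files are joined on the key, and the key itself is dropped from the merged set before any release.

```python
def merge_on_key(school_file, family_file, key_field="code"):
    """Merge two confidential files on a shared linking key, then
    drop the key so the merged set carries no direct link back to
    either source file.

    Records lacking a match in both files are omitted.
    """
    # Index the second file by its linking key for fast lookup.
    family_by_key = {rec[key_field]: rec for rec in family_file}
    merged = []
    for rec in school_file:
        match = family_by_key.get(rec[key_field])
        if match is None:
            continue
        combined = {**rec, **match}
        combined.pop(key_field)  # remove the linking key itself
        merged.append(combined)
    return merged
```

After the merge, the key file itself should be returned to safekeeping or destroyed, as the text recommends; the merged set still needs review for indirect identification before release.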
The key necessary to match two sets of data should be accessible to as few people as possible and should be returned to safekeeping or destroyed as soon as possible. Care should be taken to ensure that the new merged data set is not released in raw form if there is any significant chance that individuals can be identified from the data even after name, address, or other obvious identifiers have been removed. Methods of protecting privacy when more than one data file is utilized are discussed in detail in Appendix A.

Even after name, address, Social Security number, and other positive identifiers have been removed from individual records, there remains a risk that a malevolent and persistent person could take the information given in an unidentified record and track down the subject person or family. The potential for this kind of detective work rises as the number of pieces of information given about each individual or family increases; hence, merging two or more data files increases the risk of identification. It is especially high when a person can be identified as a member of a small group—for example, residents of a particular census tract or pediatricians in Nevada. Risk of identification is also increased when a piece of information in the confidential file is also a matter of public record, such as birth, marriage, or death records or property taxes and deeds.

Hence, if confidential data are to be released for reanalysis by additional researchers, extreme care must be taken to frustrate those who might try to match the records to real individuals.

^This point is discussed further below.

Variables that identify the person as

a resident of a small area or member of a small group should be removed from the record, along with variables that might be matched with public records. Another technique for impeding the identification of individuals is error inoculation, which involves actually altering the information on an individual record in a way that will not invalidate statistical analysis, but will leave anyone with access to the record unsure whether it is a real or a doctored record.5

LEGAL PROTECTION

The last several years have seen the emergence of a new threat to the confidentiality of data collected from people in the course of evaluation research: that a researcher will be legally compelled to reveal such information to a court or to Congress. Several incidents have brought the threat forcibly to the attention of the research community.

In the New Jersey negative income tax experiment mentioned above, the researchers felt strongly that protecting the privacy of participants in the experiment was not only right, but essential to the success of the experiment. To evaluate the effects of a negative income tax, they needed accurate information on individuals' earnings, hours worked, expenditures, living arrangements, etc. People would be reluctant to participate or to give honest answers if they thought the information might be made public or used against them. Moreover, public identification of the participants would subject them to publicity and possible pressure that might change their behavior and invalidate the results of the experiment. Hence, both OEO and Mathematica (the organization carrying out the experiment) made assiduous efforts to assure respondents that the information they gave would be held strictly confidential and used only to evaluate the experiment. Interviewers and other employees were required to sign a "confidentiality agreement," and strict rules were laid down within the project for preventing the release of identified information.
To the surprise and dismay of the experimenters, however, threats to the privacy of respondents arose from two directions. First, the Mercer County, New Jersey, prosecutor, seeing an opportunity to use experimental information to find out if any families in the experiment were collecting illegal welfare payments, subpoenaed the records. The experimenters negotiated with the prosecutor, offering to reimburse the government for any illegal overpayments rather than subject the respondents to prosecution and publicity, but in the end believed there were no valid legal grounds for resisting the subpoenas. At the same time, the Senate Finance Committee, which was debating welfare reform, became interested in the experiment and asked for identified individual family records. The experimenters eventually convinced the Committee it did not need identified case histories, but realized that if a showdown had come there would not have been any legal grounds for resisting a Congressional subpoena.

5This technique is discussed further in Appendix A.

This and other incidents demonstrate that researchers can no longer in good conscience promise that information collected for evaluation purposes will

be kept confidential unless they have a specific legal basis for the promise. If they are being honest with their respondents, they must alert them to the danger that the information they give might be made available to the courts or to the Congress. Such warnings are likely to decrease both the proportion of people willing to give information needed for evaluation research and the honesty of the responses.

Resolution of this dilemma involves a balancing of values. There is a strong public interest in enforcing the laws and making evidence available to courts, but these interests may conflict with other values, such as protecting the privacy of the individual from government snooping or arbitrary search and seizure of personal property. Deference to these other values has led to Constitutional and other limitations on the admissibility of certain kinds of evidence, such as that obtained by illegal wiretapping. In these instances society is willing to take the risk that some criminal acts will go unpunished in order to guard against greater threats to the welfare of all citizens.

Similar reasoning has led to the exclusion of certain "privileged" communications, such as those between lawyer and client, from consideration as evidence in a court proceeding. The general reasoning is that society has an interest in fostering certain exchanges of information that would be jeopardized if their confidential nature could not be guaranteed, and that this interest outweighs the interest in producing additional evidence. The interest in encouraging people with legal problems to seek the advice of lawyers and to speak freely to them is deemed considerable and depends on the assurance that the lawyers will not be required to testify against their clients in court. Similarly, confidences between husband and wife, priest and penitent, and doctor and patient have been absolutely or partly protected by common law or statutory privileges.
The Committee believes that society must now consider whether the public interest would be served by making the information furnished in the course of evaluation research or social experimentation a class of privileged communication. Arguments against an extension of some form of privilege to communications between researcher and respondent range from the practical problems of defining bona fide research (or researchers) and possible impediments to prosecution of illegal activities to broader considerations such as the current trend in society toward openness and the tendency of many legal experts to resist expanding the class of privileged communication. The Committee has no desire to protect illegal activities or weaken the effectiveness of the courts or of the Congressional power of inquiry, but it believes that the benefits of fostering a free and honest flow of information from individuals to evaluators of public programs are worth the decrease in otherwise available data.

It should be noted that this privilege would not deprive the courts or Congress of any substantial evidentiary resource currently available. Evaluation research and social experimentation are relatively new activities that neither the courts nor the Congress have previously relied on for any significant amount of evidence. Nor, in the long run, would the privilege deprive the courts or the Congress of evidence that would have been available absent the privilege; if researchers cannot guarantee the confidentiality of information from respondents, much experimentation and evaluation will not be undertaken and the data will be available to no one.

On balance, therefore, the Committee concluded that, in the public interest,

serious consideration should be given to protecting the confidentiality of information given by respondents for the purpose of bona fide evaluation research.

The Committee then turned to the difficult questions of how the privilege should be created and how broad or narrow it should be. Is a statute needed? If so, what kind of communications should be protected, under what circumstances, and with what exceptions?

It has been argued that no statute is needed because the Constitutional guarantee of freedom of the press, which protects journalists from having to disclose their sources in court, by extension protects researchers from being compelled to disclose information gathered for research purposes, since one product of research is publication of the results. Without this extension, the basis for statutory protection of researchers would be very different from that for journalists, since protection of news sources derives from an explicit Constitutional guarantee, and protection of research sources from only an implicit right of privacy. Even if extension of the Constitutional guarantee were to be accepted, however, it might afford little protection, since recent court decisions have gone against journalists who have refused to disclose their sources in court. For this reason, consideration is currently being given to the enactment of a federal "shield statute," although there is no unanimity in the news or legal professions as to the advantages or scope of such a statute.

The Committee believes that very serious consideration should be given to a researcher's shield statute and that the most valuable contribution the Committee could make to an informed debate would be to ask legal experts to draft such a statute and to publish it to stimulate discussion.
The Nejelski and Peyser paper, Appendix B, is the result of that request, and the Committee hopes that it will be widely discussed and debated by lawyers, researchers, legislators, and the interested public.

In brief, the draft statute would create a new class of privileged communication—communication between the subject of research and the researcher—analogous to privileged communications between client and lawyer or patient and doctor. If the statute were in force, a researcher could not be compelled by a subpoena issued by a court or a legislative committee to testify about research subjects or to produce information gathered in the course of research.

The researcher's shield statute proposed by Nejelski and Peyser reflects a position that might be described as "maximum protection" for research and researchers very broadly defined. It covers not only evaluation research, but all research in the public interest; it protects not only information communicated by individuals to the researcher but also the researcher's observations and intermediate work product, such as unfinished manuscripts; and it is an absolute privilege, applying under all conditions unless waived by both the subject and the researcher. The Committee believes that this "maximum protection" position deserves serious consideration; however, some members of the Committee would favor a statute that is narrower in one or more respects than that proposed by Nejelski and Peyser.

Three kinds of limitations to "maximum protection" were considered by the Committee:

(1) Limiting the privilege to federal agency evaluation research.

The original concern of the Committee was with "federal agency evaluation research," meaning research carried out by or under the sponsorship of a federal government agency and designed to estimate the effectiveness of a federal program, project, demonstration, or experiment. The Committee feels that the public importance of such research is clearly demonstrable and that the usefulness of the research results is clearly jeopardized when confidentiality cannot be guaranteed. It would be possible to draft a federal statute creating a research privilege that applied only to federal agency evaluation research, and some members of the Committee would favor this course of action. Proponents of this position believe that it is far easier to demonstrate to legislators the public usefulness of federal evaluation research than of research in general. They also fear that too broad a definition might invite abuse of the privilege by persons with tenuous connections to genuine research.

Nejelski and Peyser, however, rejected this narrow approach because they saw no valid reason why evaluation sponsored by federal agencies should be singled out for protection and no useful way to distinguish evaluation research from other research. It is society's broader interest in the knowledge created by research that Nejelski and Peyser believe should be protected by the creation of a research privilege. They would specify only that the research must be in the "public interest," and not, for example, market research for the benefit of a private company.

(2) Limiting the communications and materials protected.

All members of the Committee see a clear interest in protecting identifiable information communicated by an individual to a researcher. Such information would normally be in the form of questionnaires, notes, or other records of interviews.
Nejelski and Peyser would go further and protect all of a researcher's records and "work product," whether containing identifiable information or not. Some members of the Committee believe that extending the privilege to all of a researcher's work product is unnecessary and might invite suppression of evaluation results that are unfavorable to the sponsoring agency.

(3) Narrowing the scope of the privilege.

In general, privileges can be waived by the party whose privacy is being protected (the lawyer's client or the doctor's patient) and do not apply under certain circumstances (e.g., a client's communication to his lawyer concerning future crimes). Nejelski and Peyser propose the creation of a more comprehensive privilege, which could be waived only by both researcher and subject and which would cover all communications, including discussion of future crime. Some members of the Committee feel this position gives too heavy a weight to the value of protecting research.

These and related issues can and should be debated, but the Committee feels strongly that some kind of legal protection of research must be considered, to guarantee that respondents who give information about themselves to researchers—especially researchers evaluating the effectiveness of federal programs—need not fear that the information will be revealed to their detriment in a court or to an investigative body. Without such protection, it will become more and more difficult to obtain the information needed for valid evaluation of the effects of government programs.