Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics

3
Data Subjects

The principle of informed consent is, in essence, an expression of belief in the need for truthful and respectful exchanges between statisticians and human subjects.

International Statistical Institute, 1986

"Telling the truth," therefore, is not solely a matter of moral character; it is also a matter of correct appreciation of real situations and of serious reflection upon them.

Dietrich Bonhoeffer, 1965

The proposition that confidentiality can be protected by entirely prohibiting interagency transfers of identifiable data unless explicit consent is obtained would eliminate many valuable studies.

Office of Federal Statistical Policy and Standards, 1978

INTRODUCTION

Government statistical programs cannot function without the cooperation of data subjects and data providers. Experience in several Western European countries during the past two decades has shown clearly that even a mandatory census of population is dependent on the willingness of most people to respond and, to the best of their ability, provide accurate information (Butz, 1985a). Even when statistics are produced from administrative records, such as tax returns, an aroused public, through its elected representatives and advocacy groups, may block those statistical uses of the records that they consider objectionable.

Statistical agencies have legal and ethical responsibilities toward the data subjects and data providers who are the sources of data used in their statistical programs. It is sometimes difficult for statistical agencies to decide just what constitutes ethical treatment of data subjects and data providers, however. Legislation, especially the Privacy Act of 1974, sets some minimum requirements. Beyond that, much depends on the circumstances under which data are obtained from providers. When data are collected directly for statistical purposes, appropriate rules or guidelines may differ between mandatory censuses and surveys and those for which response is voluntary. And the method of data collection (telephone, face-to-face interview, or mail questionnaire) influences how the conditions of participation in a census or survey are communicated to respondents.

Secondary uses of administrative records for statistical purposes raise additional questions. How much should data providers, for example filers of tax returns, be told about statistical uses of their data and how much control should they have over such uses? Should they be notified of plans to use administrative records for research not directly related to the program for which the records are maintained and perhaps even given a chance to deny the use of their information for such purposes?

There are some important distinctions between individuals and organizations as data subjects, and their concerns are likely to be quite different, as is the legislation that governs the collection and use of information about them. Within the broad category of organizations, for example, there is great diversity. Reynolds (1993) distinguishes units of government, nonprofit organizations, businesses, and voluntary organizations, such as political parties and religious bodies. This chapter focuses mainly on individuals as data subjects and providers. However, many of the considerations that apply to relationships between statistical agencies and data providers apply equally to individuals and organizations.

This chapter explores the relationships of federal statistical agencies with data subjects and providers. How do the former communicate with the latter? How effective is the communication process? How can it be improved? In the section that follows, we examine direct communication between agencies and individual data subjects or providers through the use of informed consent and notification procedures. We also examine issues related to statistical uses of data sets based on information that individuals are obliged to provide to the government in order to obtain benefits or comply with legal requirements. Next, we review research relevant to the processes of communication between statistical agencies and their data providers. Some experimental research studies have explored the effects of variation in informed consent procedures. Recently, cognitive research techniques have been used in laboratory settings to study respondent understanding of informed consent statements. In addition, a few public opinion surveys have explored people's attitudes about various uses of the data that are collected from them by the government for statistical and other purposes. Finally, we examine the public information and educational activities of statistical agencies that focus on privacy and confidentiality issues. Such activities are directed at the general public and at organizations that attempt to represent the interests of data subjects and data providers.

INFORMED CONSENT AND NOTIFICATION

Ethics and law demand that data providers be told about the conditions under which they are asked to supply information that will be used for statistical and research purposes. If participation is voluntary, data collectors must let data providers know this and give them enough information to make an informed decision about whether to provide the information requested.

Throughout this report we make a distinction between informed consent and notification. The former term is appropriate only when data providers have a clear choice and will not be subject to penalties for failure to participate. The term notification is more appropriate for the decennial census of population, for which participation is mandatory, and for statistical and research uses of administrative records, such as tax returns or applications for welfare benefits, where failure to provide information needed for administrative purposes may expose individuals to penalties or lead to denial of benefits to which they would otherwise be entitled.

HISTORICAL DEVELOPMENT OF INFORMED CONSENT AND NOTIFICATION PROCEDURES

Clinical experiments with human subjects provided the setting for much of the early development of informed consent procedures. Reynolds (1993) cites four criteria that were found to be necessary for active informed consent: (1) a rational adult is making the decision, (2) full information is provided, (3) the decision is obtained without coercion, and (4) the subject is aware of the consequences. Some of these criteria might be regarded as less critical in statistical surveys. Yet, as Mugge (1993:346) points out in a paper prepared for the panel, "Even in the most innocuous such survey a subject may suffer inconvenience, time loss, embarrassment, or psychological distress in giving an interview, and one may also suffer harm through the disclosure, the misuse, or even the planned use of the data to be provided."

The Privacy Act of 1974

Prior to the passage of the Privacy Act of 1974 (P.L. 93–579), there were no general standards for informed consent and notification procedures in federally sponsored data collection activities. The information given to data providers varied widely from agency to agency. The Privacy Act, under the general heading "Agency Requirements" (Title 5 U.S.C. § 552a(e)(3)), required that each person asked to supply information be informed of the following: the authority under which the information is requested; whether provision of the information is mandatory or voluntary; the principal purposes for which the information is intended to be used; the "routine uses" that may be made of the information (routine uses generally involve disclosure of individually identifiable information to other agencies or organizations and must be described in a record systems notice published in the Federal Register); and the effects on the person, if any, of not providing all or any part of the requested information.

These requirements led to a much greater degree of uniformity in the informed consent and notification procedures used by federal statistical agencies. Legally, the Privacy Act requirements apply only to collection of data from individuals, but statistical agencies have also applied them, for the most part, to the collection of data from organizations. From 1974 on, agencies began to pay closer attention to the content of their statements to data providers and, in many instances, gave them more information than they did prior to the passage of the Privacy Act.
The process of improvement was evolutionary rather than immediate, however. It would not be difficult to find some examples of informed consent and notification statements used since 1974 that were incomplete or possibly misleading or that failed to inform data providers of important potential statistical uses of the data, such as the release of public-use microdata sets (see, e.g., Boruch and Kehr, 1983). However, Mugge's (1993) recent review of selected consent and notification procedures, conducted for the panel, found that federal statistical agencies were doing a good job of complying with the requirements of the Privacy Act (see below).

Professional Association Guidelines

The Privacy Act requirements were written in broad terms and did not answer all possible questions about what information should be included in informed consent and notification statements and how the information should be communicated to data providers in different kinds of data collections. Since the passage of the act, the American Statistical Association (ASA) and the International Statistical Institute (ISI) have examined some of the issues and have developed recommendations and guidelines.

The ASA's Ad Hoc Committee on Privacy and Confidentiality issued a report with numerous recommendations in 1977. The committee did not make a sharp distinction between informed consent and notification, but it did explore the question of what people should be told about planned and potential statistical and research uses of information supplied initially for administrative purposes. A particularly controversial question for the committee was whether data providers in voluntary surveys should be given specific information about planned linkages of their data with data from other sources, such as income tax returns. According to the committee's report (American Statistical Association, 1977:73), a majority of the members believed that was not necessary:

In informing respondents of the uses of the data, it is sufficient to state that the data will be used for statistical purposes only, if such is the case. It is neither feasible nor necessary to spell out the possibly manifold ways, some of which may not be known in advance, in which the statistics may be employed.

However, three members of the committee, including the chair, disagreed with this finding. They believed that data subjects should be given specific information about all linkages, whether planned at the time of data collection or subsequently. For surveys from which public-use microdata sets were to be released, the committee suggested a statement of purpose such as the following: "The data will be used only for statistical purposes, in which individual reports will not be identifiable." As Jabine (1986) subsequently pointed out, however, such absolute statements might not be justified because there are no statistical disclosure limitation techniques that could guarantee zero risk of disclosure.

In 1983, the ASA's Ad Hoc Committee on Professional Ethics published on a trial basis its Ethical Guidelines for Statistical Practice, which were formally adopted by the ASA Board in December 1988 (see American Statistical Association, 1983, 1989; see also Ellenberg, 1983). Two of the guidelines are directly relevant to informed consent in "collecting data for a statistical inquiry." The committee said that the data collectors should (American Statistical Association, 1989:24)

2.B. inform each potential respondent about the general nature and sponsorship of the inquiry, and the intended uses of the data;

2.C. establish their intentions, where pertinent, to protect the confidentiality of information collected from respondents; strive to ensure that these intentions realistically reflect their ability to do so; and clearly state pledges of confidentiality and their limitations to the respondents.

The committee's ethical guidelines were aimed at statistical inquiries. The committee did not explore the question of notification about planned or possible statistical and research uses of information collected initially for administrative purposes.

At the international level, a Declaration on Professional Ethics was adopted by the International Statistical Institute in August 1985 and published in 1986. Section 4 of the declaration, "Obligations to subjects," is considerably more detailed than the guidelines of the two ASA committees with respect to the content of informed consent statements. Noting that "no universal rules can be framed" (p. 235), the declaration suggests that data providers not be overwhelmed with unwanted and incomprehensible details.
On the other hand, it states unequivocally that "information that would be likely to affect a subject's willingness to participate should not be deliberately withheld" (p. 235). The declaration lists 12 topics that might be included in an informed consent statement and says that "in selecting from this list, the statistician should consider not only those items that he or she regards as material, but those which the potential subject is likely to regard as such" (p. 236).

Other aspects of the treatment of informed consent and notification in the ISI declaration are of special interest. First, it recognizes the difference between mandatory and voluntary data collections, saying that "statisticians should attempt to ensure that subjects appreciate the purpose of a statistical inquiry, even when the subject's participation is required by law" (p. 236). Second, the declaration approaches the question of statistical uses of administrative data from an unusual perspective, that of minimizing intrusions on data providers:

One way of avoiding inconvenience to potential subjects is to make more use of available data instead of embarking on a new inquiry. For instance, by making greater statistical use of administrative records, or by linking records, information about society may be produced that would otherwise have to be collected afresh. Although some subjects may have objections to the data's being used for a different purpose from that intended, they would not be adversely affected by such uses provided that their identities are protected and that the purpose is statistical, not administrative (p. 235).

The declaration also states that the guidelines are not meant to be limited to persons. A footnote to Section 4 (p. 234) says, "This section of the declaration refers to human subjects, including individuals, households and corporate entities."

Matching Survey and Administrative Data

The practice of matching survey and administrative records for the same individuals has become increasingly common over the past three decades. Such linkages are performed for statistical purposes, most commonly to enhance a survey data base with additional information for the same persons (for an example, see the description of the 1973 Exact Match Study in Chapter 6).
Advances in computer power, combined with widespread use of Social Security numbers as personal identifiers by units of government at all levels, have increased the economic and technical feasibility of matching individual records from different sources. During this same period, the trend has been toward the inclusion of more explicit information about planned linkages in informed consent or notification statements. If Social Security numbers are requested in survey interviews, it becomes a virtual necessity, regardless of any legal or ethical imperatives, to tell data providers how they will be used. Some survey respondents will surely want to know why the numbers are needed.

Waivers

Sometimes a statistical agency may want to use data gathered for statistical purposes in ways that are not ordinarily permitted under its statutes or policies. Such uses may or may not be contemplated when the data are obtained from data providers. If the uses are known at the time, the permission of data providers can be requested as part of the informed consent procedures. If they are not known when the data are being collected and the informed consent statement does not provide for unanticipated research use of the data, some kind of recontact may be necessary later. We use the term waiver for this process because it involves asking data providers to waive confidentiality or data access provisions that would normally apply to their data. In some instances, confidentiality statutes have been interpreted to deny data providers the right to waive the relevant provisions.

We provide one example here that relates to data for persons. This example was provided in one of the case study workshops (see Appendix A) that were organized by the Committee on National Statistics and the Social Science Research Council to provide background information for the panel.

The Longitudinal Retirement History Survey (LRHS) was conducted for the Social Security Administration by the Census Bureau, under the authority of Title 13 of the U.S. Code, using a sampling frame based partly on data from the 1960 decennial census. From 1969 to 1979, interviews were conducted at two-year intervals with a sample of persons who had been close to retirement age at the time the survey started. Social Security earnings data from administrative sources were added to the data base containing the survey information for persons in the sample. In addition to published reports and analyses, a public-use microdata file containing matched survey and administrative record data was released.

In the late 1980s, the National Institute on Aging (NIA) wanted to fund a follow-up study in which surviving members of the LRHS sample would be interviewed one or more times. Additional administrative record data for them, from various sources, would be added to the data set; for those who had died, the year and cause of death would be determined by means of a match to the National Death Index. The NIA, however, was willing to fund the study only if the resulting linked data files could be released to researchers working under NIA grants. Census Bureau policies for release of microdata, which had been revised subsequent to the initial survey, precluded releases of such linked data sets on the basis of the belief that there was a significant risk of reidentification of individuals by the agency holding the administrative data.

A possible alternative was for the Census Bureau to approach the persons in the sample to see if they would waive certain of the protections provided to their data under Title 13. Such waivers would make it possible to conduct the follow-up survey under a different authority (e.g., the authority given the Commerce Department by Title 15 of the U.S. Code) that would allow the resulting linked data set to be released to researchers. The Census Bureau conducted a test of waiver procedures with a cohort of the National Longitudinal Survey sample (this group of persons had also been surveyed under the authority of Title 13). The test revealed that about two-thirds of the persons approached were willing to sign a waiver.

Subsequent to the test, however, Office of Management and Budget (OMB) lawyers ruled that Census Bureau employees could not release such data, even though waivers had been obtained. Consequently, the NIA abandoned its effort to fund follow-up interviews and to add information from administrative record sources (for additional details, see Jabine, 1993a). The agency subsequently funded a new longitudinal study, the Health and Retirement Study, which is being conducted under arrangements that will permit the release of microdata files containing linked survey and administrative record data for persons who have provided written consent. However, an opportunity to enhance the utility of an existing and uniquely valuable data set had to be forgone.

CURRENT POLICIES AND PRACTICES

Mugge (1993) reviewed the informed consent or notification statements for 15 data collection programs of federal statistical agencies, plus the statement included with individual income tax forms. He also reviewed three examples of informed consent procedures used in connection with the collection of Social Security numbers in statistical surveys. He checked each statement for conformance with Privacy Act requirements and the inclusion of a confidentiality pledge. He also reviewed the formats used and other features of interest.

Mugge concluded that the statements for the 15 statistical surveys were scrupulous in their compliance with Privacy Act requirements, when applicable. Most of the requirements were also met for economic surveys in the group, even though not called for by the act. However, he questioned the adequacy of the statement in the tax return package, which tells filers that their information will be disclosed to other federal agencies and to state and local governments "as provided by law." The statement contains little information about the kinds of disclosures that will be made, and it makes no specific mention of statistical uses or linkages with other records.

When a government agency asks individuals to disclose their Social Security number, the Privacy Act requires that the agency "inform that individual whether that disclosure is mandatory or voluntary, by what statutory or other authority such number is solicited, and what uses will be made of it" (P.L. 93–579, Sec. 7(b)). Mugge found that these requirements were followed, in different ways, in each of the three examples he reviewed. However, two of the three questionnaires did not have provisions for distinguishing whether the absence of a Social Security number meant that the respondent refused to provide it, the respondent did not know it, or the data subject did not have one. Lacking this information, an agency that wanted to follow a policy of not linking data when the number was refused (recommended in Jabine, 1986) would be able to attempt linkages only for those cases for which Social Security numbers were reported by respondents.

Finally, Mugge found that notification statements were delivered to survey respondents in several different ways, including a transmittal letter, a separate Privacy Act notice, a question-and-answer sheet, a brochure describing various aspects of the survey, separate instructions for completing the questionnaire, and on the face sheet of the questionnaire itself.
In several instances a multitiered approach was used; that is, the information was supplied at more than one stage, and a telephone number, frequently toll-free, was provided for use by respondents wanting more information.

Special techniques have been developed for telephone surveys, because it is not always possible to meet the Privacy Act requirement that the notification statement be on the questionnaire form or on a paper to be retained by the data provider. Lawyers at the Department of Health and Human Services have approved the procedure developed by the National Center for Health Statistics of having telephone interviewers sign a statement that they have read the full notification statement to the respondent or, for computer-assisted surveys, enter information to that effect in the computer.

Evaluation studies, especially those in which census or survey responses are matched with administrative records and vice versa, pose special problems for the use of informed consent procedures. For example, what information about linkages should be given to participants in ethnographic studies designed to investigate factors associated with undercoverage in the census of population or to respondents to a postcensal household survey of census response behavior? What about a survey of voting behavior (voting tends to be overreported in household interviews) in which the survey responses are to be matched with voting records or the sample of respondents has been selected from voting records? The dilemma in such instances is that full disclosure of the purposes and procedures of the study may prejudice the accuracy and utility of the results. Social scientists have wrestled with such questions (see, e.g., Beauchamp et al., 1982:Pt. 3), but, insofar as we have been able to determine, federal statistical agencies that undertake or sponsor such studies do not have any generally accepted set of guidelines to refer to in developing their informed consent procedures.

MANDATORY DATA SETS: CONTROLLING INFORMATION ABOUT ONESELF

We turn our attention now to one of the more difficult questions that we, like some of our predecessors, faced: When individuals are required by law to provide personal information to the government, to what degree should they be allowed to control uses of that information for purposes other than the immediate ones for which they were required to furnish it?

For voluntary surveys, maintaining control over one's own information is fairly straightforward if the agency collecting the data provides accurate, complete, and clear information about the voluntary nature of the survey and the uses that will be made of the data.
If appropriate informed consent procedures are used and the agreed conditions of use are followed, data providers have full control over the information they supply. Prospective data providers who do not like the conditions associated with the survey may decline to participate. They may sometimes be asked for their reasons, but they cannot be required to give any. When provision of information is mandatory, however, data providers no longer have full control over the uses that will be made of their information. For most persons, the consequences of failing to file a tax return or refusing to provide information needed to establish eligibility for Social Security benefits are too unpleasant

condition attached to the majority proposal would be eliminated. A "passive waiver" procedure would be acceptable, that is, the data could be shared if the data provider did not explicitly forbid it after being notified of the proposed uses. This position was taken based on the member's subjective weighing of the value of individual autonomy and control over information about oneself compared with the value of more nearly complete and economically obtained data.

RESEARCH RELATED TO CONFIDENTIALITY AND DATA ACCESS

CONTROLLED SURVEY EXPERIMENTS

Three major U.S. studies have included experiments on informed consent procedures for a statistical survey of the general population. In the late 1970s, the Committee on National Statistics (National Research Council, 1979) conducted a face-to-face survey in which the confidentiality assurances given to respondents were systematically varied. Five separate versions were studied: (1) assurance of confidentiality in perpetuity, (2) assurance of confidentiality for 75 years, (3) assurance of confidentiality for 25 years, (4) no mention of confidentiality, and (5) a statement that replies could be given to other agencies and the public. Participation rates decreased monotonically with decreasing assurances of confidentiality, but the observed spread among the five versions was less than three percentage points. Further, two-thirds of the refusals came from persons who declined to participate before the interviewer had a chance to read the introduction to the survey. For those who did participate, nonresponse and underreporting to the income question, which was the most sensitive one included in the survey, were greater when weaker confidentiality assurances were given.
In an experiment that was also part of a face-to-face survey, Singer (1978) investigated the effects of more versus less information about sensitive subject matter in survey introductions, varying assurances of confidentiality, and requiring or not requiring a signature to document consent. Varying the information given to respondents about the content of the survey had no observable effect on participation rates or the quality of survey response, nor were participation rates affected by varying assurances of confidentiality. However, response rates to sensitive questions were higher for those who had been given absolute assurances of confidentiality.

Asking respondents to sign a consent form would have caused a significant drop in participation rates for the survey if it had been required in order to conduct the interview. However, signing the consent form was not required and most of the respondents who refused to sign were still willing to be interviewed.

In a telephone survey, Singer (1979) investigated the effects of variations in information about survey content and the purpose of the study on participation rates and response rates to individual questions. She found no significant differences.

Singer (1993) has prepared a comprehensive review of the above and other related studies, including some conducted in Germany. The review, originally prepared for the panel, includes some studies of passive consent procedures whereby data subjects or providers are notified that some step will be taken (e.g., their children will be enrolled in a school-related research study) unless they notify the researchers of their disapproval. Some of the studies, in addition to varying treatments experimentally, asked participants directly about their attitudes and perceptions on matters related to informed consent, privacy, and confidentiality. The results of the various studies do not always agree but, taken in all, they provide the basis for some tentative conclusions:

Behavior does not necessarily mirror expressed concerns. Most persons, when asked directly, thought that assurances of confidentiality would make people more willing to answer questions, but experiments embedded in surveys of the general public showed only small differences in participation rates between those given strong assurances and those given weak assurances or none at all. Moreover, some recent experiments have shown that elaborate assurances of confidentiality can reduce willingness to participate (Singer et al., 1990).

When potential survey respondents are contacted without prior notice, such as an advance mailing, most refusals to participate occur before the interviewer has had a chance to explain the confidentiality protections that will be given to survey responses.

Requiring a signature to document consent significantly reduces survey participation rates.

To a point, verbal assurances of confidentiality appear to result in less nonresponse and more accurate responses to questions on sensitive topics, like income. In some instances, however, especially when the assurances immediately preceded the sensitive question, there were more refusals to answer.

Studies involving technical means of ensuring confidentiality, such as randomized responses, have sometimes yielded higher estimates of sensitive behavior, but the lack of significant changes in some studies or for some variables indicates that the conditions required for this to occur are not clear.

There is evidence that many people do not hear, understand, or remember precisely what is said in the introduction to an interview (Singer, 1979).

COGNITIVE RESEARCH

The past decade has seen increasing recognition of the contributions that the cognitive sciences can make to survey research (see, e.g., Tanur, 1992). Three federal statistical agencies, the Bureau of Labor Statistics, the Census Bureau, and the National Center for Health Statistics, have established small units to undertake laboratory and field studies of the cognitive processes of respondents who are asked to comprehend survey questions, retrieve relevant information from memory, and make judgments about how to answer the questions. Most of the research has been aimed at improving questions on specific topics. However, the Behavioral Sciences Research Group at the Bureau of Labor Statistics, with support from the Statistics of Income Division, Internal Revenue Service (IRS), has recently begun a program of research on how providers of personal data understand and interpret words and phrases contained in informed consent and notification statements (van Melis-Wright et al., 1992). The research at the Bureau of Labor Statistics has two components. An initial investigation will identify and evaluate people's perceptions of the language, terms, and concepts that might be used in assurances of confidentiality.
The second phase will investigate the effect of various terms and concepts on participation decisions in order to determine which of them may be effectively and ethically used to promote participation and truthful answers in surveys. The analyses will include terms and concepts related to data providers' willingness to have the information they provide shared in identifiable form with another organization.

PUBLIC OPINION RESEARCH

Several U.S. public opinion surveys have addressed privacy questions, and a few have been devoted entirely to such issues, notably a series of surveys conducted by Louis Harris and Associates in recent years. However, these surveys have focused primarily on privacy and confidentiality issues related to uses of information about individuals by the private sector, and they provide only limited information on public attitudes about government statistical activities. A 1990 survey in the series was sponsored by Equifax Inc. (1990), a provider of consumer and business information services. Most of the questions were about information practices of commercial organizations and government agencies in general. One item asked respondents about how much they trusted several different organizations, including the Census Bureau, to collect information about them and treat it responsibly. The proportion who had high or moderate trust in the Census Bureau (81 percent, with 2 percent not sure) was as large as that for any other organization and considerably larger than most (e.g., 67 percent for the Internal Revenue Service, with 1 percent not sure).

The Internal Revenue Service (1984, 1987) has sponsored several national surveys of taxpayers to study their opinions and attitudes about IRS personnel, programs, and activities. Some of the results are relevant to the questions discussed earlier in this chapter concerning the use of information from mandatory data sets (such as those based on income tax returns) for statistical purposes not directly related to the purposes for which the information was obtained. Most of the taxpayer opinion surveys, for example, included questions about statistical and other nontax uses of information provided on tax returns.
The most recent survey, the 1990 Taxpayer Opinion Survey, included questions on knowledge of tax laws and policies governing sharing of tax data, views on what the provisions of those laws and policies should be, views on sharing tax data for specified purposes, and the possible use of IRS and Social Security Administration records in future population censuses. The 1990 survey data (see Table 3.1) indicated that most taxpayers had limited knowledge of the relevant laws and policies and had not read or thought much about the sharing of tax data with other parts of the government. Nevertheless, the majority said they had substantial interest in the subject.

When asked in general terms for their views about policies for sharing tax data, the majority of respondents took a fairly restrictive view. For example, 64 percent agreed strongly or agreed that the IRS should never release information about people's income to other government agencies under any circumstances. However, when respondents were subsequently asked about several kinds of releases of tax data to particular agencies for specific purposes, their reactions were quite different, as shown in Table 3.1. For example, when taxpayers were asked what they thought about the use of certain kinds of administrative records in the decennial census in order to reduce the cost of the census and the burden on respondents, 70 percent favored the use of Social Security information on date of birth and sex, and 61 percent favored the use of IRS information on place of residence and income.

TABLE 3.1 Taxpayer Attitudes on Specific Releases of Tax Data

| Agency | Purpose | Favor (%) | Oppose (%) | Not Sure (%) |
|---|---|---|---|---|
| The Department of Justice | For major criminal investigations (such as drugs and organized crime) | 77 | 21 | 1 |
| The Veterans Administration | For determining whether veterans are eligible to receive benefits because they are unable to work | 70 | 27 | 3 |
| The Census Bureau | For identifying where people have moved in order to study population trends (statistical use) | 65 | 33 | 3 |
| The Department of Agriculture | To maintain an up-to-date list of farms for crop and livestock surveys (statistical use) | 59 | 37 | 4 |
| State governments | For improving state collection of taxes from businesses | 56 | 42 | 2 |
| State governments | For improving state collection of taxes from individuals | 44 | 52 | 4 |
| Members of Congress | For any use they consider appropriate | 13 | 84 | 3 |

SOURCE: Tabulations from the 1990 Taxpayer Opinion Survey provided by the Research Division, Internal Revenue Service.

Data from the 1990 and earlier taxpayer opinion surveys should be used with caution, for several reasons. First, the wording and format of the questions about sharing tax information differed significantly from one survey to the next. Second, the data reflect only the opinions of the member of each tax-filing family (as defined for individual income tax purposes) considered to be most knowledgeable about that family's tax-filing practices. Third, most respondents to the 1990 survey had little knowledge of laws and policies governing nontax uses of their return information and had not given much thought to such matters. Subject to these caveats, the survey results suggest that the majority of taxpayers support the idea of sharing their tax information with other agencies for statistical purposes. However, minorities of taxpayers, generally in the range of 15 to 20 percent, are strongly opposed to such data sharing.

At the time this report was being prepared, the full IRS report and the public-use data file from the 1990 Taxpayer Opinion Survey had not been released. When these outputs become available, it will be possible to analyze how the demographic and social characteristics of taxpayers are associated with their attitudes toward statistical and other nontax uses of their tax return information.

FINDINGS AND RECOMMENDATIONS

Statistical agencies need to "know their respondents." How do data providers interpret concepts like privacy, confidentiality, disclosure, data sharing, and statistical purposes? How well do they understand informed consent and notification statements, and how are their decisions on survey participation influenced by different formats and modes of presentation? What kinds of information about themselves do they consider to be most sensitive? What do they think about the linkage of their information from two or more sources? How do their reactions vary by race-ethnic group, gender, and socio-economic status? The research described in this section has begun to provide useful answers to some of these questions.
Such research should continue because it is not complete and because public opinion about statistical data collection activities may vary gradually over time as a result of changes in the technological, social, and political environments in which public surveys are undertaken. Moreover, sudden changes in the environment for conducting public surveys are possible in response to highly publicized incidents involving improper disclosure and uses of confidential personal data in connection with nonstatistical activities, either in the public or private sector.

Recommendation 3.4 Statistical agencies should undertake and support continuing research, using the tools of cognitive and survey research, to monitor the views of data providers and the general public on informed consent, response burden, sensitivity of survey questions, data sharing for statistical purposes, and related issues.

PUBLIC INFORMATION ACTIVITIES OF STATISTICAL AGENCIES AND ORGANIZATIONS

CURRENT PRACTICES

Statistical agencies explain their positions on confidentiality questions and fair information practices to individual data providers through the use of informed consent procedures and notification statements. They also attempt to communicate their positions on such matters to the general public or to particular groups and organizations through various kinds of public information activities. The motivation for some of these activities is to enlist the support of the public and various organizations for specific programs, such as the decennial census. In other instances, public information activities develop as responses to specific allegations or suggestions that data collected for statistical purposes are not being kept confidential. Following are some examples of constructive public information activities that have come to the attention of the panel:

The Bureau of the Census (1985) has developed a general-purpose brochure, How the Census Bureau Keeps Your Information Strictly Confidential, which succinctly describes the legal and physical security protections that are provided for information collected by the Census Bureau. The brochure cites favorable comments from several newspapers and other sources about the confidentiality provided to Census Bureau data.

The Census Bureau discussed the application of statistical disclosure limitation techniques (see discussion of these techniques in Chapter 6) to small-area tabulations from the 1990 census with representatives of several organizations, including the American Civil Liberties Union (ACLU), to determine their views, as data users and privacy advocates. In the discussions, the ACLU representatives had to balance their privacy concerns against their interest in ensuring that minority groups receive proper voting representation. In this instance, the latter concern won out. It was agreed that the exact counts of persons aged 18 and over by race or ethnicity should be published and that the planned use of techniques to reduce disclosure risks would be applied only to other variables.

The American Statistical Association's (1991) Committee on Privacy and Confidentiality has developed a brochure on Surveys and Privacy to answer questions about privacy and confidentiality, clarify the responsibilities of survey takers, describe a respondent's reasonable expectations, and encourage survey response. In addition to survey respondents and survey takers, the intended audience for the brochure includes "the public, the media, and Congress." The committee obtained funding from four statistical agencies to print and distribute a large number of the brochures.

An example of unfavorable publicity requiring a response or clarification of the Census Bureau's confidentiality policies is provided by the announcement, early in 1990, by the Lotus Development Corporation and Equifax of plans to market a CD-ROM product called Lotus Marketplace: Households. Prospective purchasers were offered a data base containing information on 80 million U.S. households, including names, addresses, gender, marital status, income, buying preferences, and even "psychographic categories," such as "cautious young couple" or "inner-city single." The sales brochure for Lotus Marketplace: Households mentioned the Census Bureau as one of the product's important sources of data. A quick reading of the brochure might have given the impression that household incomes were being obtained directly from individual census returns (some press reports conveyed this impression; see, e.g., Lewis, 1991). They were not. As was made reasonably clear in the technical portion of the brochure, the household income values were model-based estimates obtained by using small-area data from the decennial census plus household size and other variables from other sources. Nevertheless, the Census Bureau received some indignant inquiries as to why individual census information was being made available for a commercial data base.

Because of strong objections about the privacy and confidentiality aspects of this data base by many organizations and individuals, Lotus Marketplace: Households was withdrawn from the market early in 1991. A particular objection to the product was that the CD-ROM format would preclude the possibility of corrections to individual data items. However, similar data bases in other formats are being developed and sold. Small-area data from the decennial census and geocoding systems developed and released by the Census Bureau are likely to be used in many of these systems. Statistical agencies can expect more questions to be raised about their relation to such commercial enterprises, and they must face the possibility that objections to them might have a spill-over effect on public willingness to participate in government surveys.

RECOMMENDATIONS

The panel believes that the risks of major or deliberate violations of privacy or confidentiality are extremely low in the federal statistical system. The risks are somewhat higher for federal administrative records, as illustrated by recent revelations of sales of Social Security records to private investigators (see Washington Post, December 28, 1991:A1; Baltimore Sun, February 29, 1992:1A), and probably highest of all for private sector record systems, as illustrated by the Lotus Marketplace example. The public, however, does not always distinguish among these different types of records, and there is a danger that violations not involving statistical data bases can create moral outrage and have damaging spill-over effects on federal statistical programs.

Recommendation 3.5 Federal statistical agencies should continue to develop systematic informational activities designed to inform the public of their ability to maintain the confidentiality of individually identifiable information, including use of legal barriers to disclosure and physical security procedures, and their intentions to minimize intrusions on privacy and the time and effort required to respond to statistical inquiries.

Recommendation 3.6 Agencies should be prepared to deal quickly and candidly with instances of "moral outrage" that may be directed at statistical programs from time to time as a result of actual or perceived violations of pledges of confidentiality given to data providers by data collectors. The agencies should be prepared to explain the purpose of specific data collection activities and the procedures used to protect confidentiality. They should accept full responsibility if a violation occurs and should announce measures to prevent future violations.

Recommendation 3.7 As part of the communication process, statistical agencies should work more closely with appropriate advocacy groups, such as those concerned with civil liberties and those that represent the rights of disadvantaged segments of the population, and with specialists on ethical issues and human rights. Some agencies may want to include members of such groups on their advisory committees.
