5 Administrative Records

In a chapter of the panel's interim report called "Administrative Records: Intriguing Prospects, Formidable Obstacles" and in an earlier letter report, we outlined some basic requirements for more effective use of administrative records and recommended several actions to explore and develop new uses of administrative records in the 2000 census and beyond. In the letter report, we recommended that the Census Bureau "undertake a planning study . . . that would develop one or more detailed design options for a 2010 administrative records census." The Panel on Census Requirements in the Year 2000 and Beyond, in its interim report, gave strong support to the recommendations in our letter report and also urged greater attention to enhanced uses of administrative records in the Census Bureau's current estimates programs.

In this report, we focus on many of the same issues, illuminated by the light of some significant developments since the interim report was issued. The prospects for a national health care information system and legislation to govern the use of health records, a July 1993 Interagency Conference on Statistical Uses of Administrative Records sponsored by the Census Bureau, increased interest in continuous measurement, and other developments have served to heighten the prospects for expanded uses of administrative records. November 1993 brought the publication of the final report of the Panel on Confidentiality and Data Access of the Committee on National Statistics, with several recommendations relevant to statistical uses of administrative records.

In this chapter we examine uses of administrative records not only in census operations, but also in other demographic programs. Although some of the obstacles associated with administrative record uses are indeed formidable, we
believe that the prospects are more intriguing and promising than ever before. The recommendations in this chapter are designed to take full advantage of these prospects.

Some of the topics covered here are of such central importance in improving the census (e.g., the development of a Master Address File and improved record linkage techniques) that they have been discussed in a broader context in Chapter 2. Here we address the particular relevance of these issues for uses of administrative records.

The panel is encouraged by the progress that has been made in response to its earlier recommendations about the use of administrative records. In particular, the Census Bureau should be commended for taking a major step in initiating government-wide discussion of the use of administrative records by convening an Interagency Conference on Statistical Uses of Administrative Records in July 1993 and subsequently issuing a report on the conference. We also commend the Census Bureau for its initiation of a cooperative arrangement with the Statistics of Income Division of the Internal Revenue Service (IRS) to support and extend the latter's research on the population coverage of IRS and Social Security Administration records. We believe that the Census Bureau's announced plans for using administrative records in the 1995 census test are consistent with the relevant recommendations in our interim report, subject to certain reservations discussed later in this chapter.

Other recommendations made at the time of the interim report pertain to longer-term goals, such as the call for a long-range research and development program relating to the use of administrative records for demographic data, and the need for ultimate coordination and oversight for statistical uses of administrative records through the Office of Management and Budget (OMB).
In this chapter, we reemphasize these recommendations and stress the importance of a proactive policy aimed at overcoming the formidable obstacles and taking full advantage of the intriguing prospects.

Early in its deliberations, the panel concurred with the Census Bureau's judgment that a 2000 census based entirely or primarily on administrative records would not be feasible. We recommended, therefore, that exploration of census uses of administrative records follow a two-track approach. The 2000 census track would identify and test possible uses of administrative records in the 2000 census as an adjunct to the more traditional modes of data collection; the long-range track would develop and test procedures for a possible administrative records census in 2010 or beyond, as well as uses of administrative records in other demographic data programs. We remain committed to this two-track approach. We believe that an administrative records census in 2010 is a live option that should be thoroughly examined and evaluated during the current decennial census cycle, in order to avoid putting off the question of its feasibility for yet another decade.

The next section of this chapter addresses some of the fundamental questions
38 COUNTING PEOPLE IN THE INFORMATION AGE
associated with the use of administrative records in statistical programs. Among them are those associated with access to administrative records for statistical and research purposes, the importance of public acceptance, and how best to address some of the specific technical requirements associated with the use of administrative records. Key recommendations in this section are that health care legislation should not preclude uses of new health care records in the decennial census and that the Census Bureau should be invited to participate in the design of new health care record systems to help ensure their suitability for statistical uses.

We then proceed to a description of the main features of an administrative records census and a discussion of the many issues that arise. We identify privacy, coverage, geography, content, and cost as considerations that need to be properly addressed. This section also emphasizes the crucial need for a Master Address File: without it, an administrative records census would be virtually impossible to conduct.

In the next section we examine the uses of administrative records in the 1995 census test and the 2000 census. Although a census based primarily on administrative records is not feasible for 2000, there are still many ways in which administrative records can be used to improve the quality and reduce the costs associated with the 2000 census. In addition, both the 1995 census test and the 2000 census provide rare opportunities to learn more about the potential benefits and the potential problems arising from an administrative records census.

Administrative records have current and potential uses not only in decennial census operations, but also in other demographic programs. They play a major role, for example, in the Census Bureau's current population estimates program, and they also have a potentially significant role to play in a continuous measurement system.
In the next to last section, we stress the importance of this broader view of the role of administrative records in demographic programs.

In the final section of the chapter we summarize the different ways in which administrative records have been or might be used to help satisfy needs for small-area demographic data. We identify the major components of a proactive policy to develop enhanced statistical uses of administrative records. We urge the Census Bureau to follow such a policy and we call on other executive branch agencies and the Congress to lend their support to it.

BASIC REQUIREMENTS FOR MORE EFFECTIVE USE OF ADMINISTRATIVE RECORDS

The panel believes that there are significant benefits to be obtained from greater use of administrative records both in the decennial census program and in the programs that provide current demographic data. For example, tabulations of birth and death records and extracts from tax and Social Security records are important inputs to the Census Bureau's current population estimates program and in the demographic analyses that have played a significant role in the evaluation of census coverage for several decades. However, the potential benefits of using administrative records, especially to produce more frequent and timely data for small geographic areas, are far from being fully realized.

In this section we discuss major policy and technical issues that must be addressed in order to be successful in efforts to develop effective new uses of administrative records. As indicated above, these issues were addressed in our interim report, and some of the relevant background presented in that discussion is repeated here. We also take note of some important developments over the past few months that are relevant to this discussion.

Access

Effective use of administrative records by the Census Bureau requires a legal right to access, the establishment of close and mutually beneficial ongoing relationships between the Census Bureau and the custodians of administrative records, and reasonable assurance of continued access to data that are suitable for the intended statistical uses. The value of administrative record systems for statistical uses will be enhanced if custodians are willing to consider making modest additions to or changes in content, when this can be done at a reasonable cost and without detriment to the program uses of the data.

By a legal right to access we mean at least the absence of legal prohibitions on access for statistical uses, and, preferably, positive statutory recognition of statistical uses as a permissible secondary use of the records. The Census Bureau probably has greater legal access to administrative records than any other U.S. statistical agency, but its access is by no means universal, especially to systems maintained at the state level, for which access is controlled in part by state laws.

Currently, the question of Census Bureau access to health records is of special interest.
Under the Clinton administration's proposal for health care reform, virtually everyone would be covered by one of the available health care plans. Administration of the new system will require the creation and continuous updating of enrollment records for all participants with information about their identities, current addresses, characteristics, and plan affiliations. Potentially, these health care enrollment records could provide more complete coverage of the U.S. population, with current information about each person's location and demographic characteristics, than any other national system of administrative records. The records are likely to include information on each person's race and ethnicity, data elements that are lacking or incomplete in other administrative record systems with broad national coverage.

The potential usefulness of such a record system for the decennial census and other demographic data programs is obvious. The extent of its actual usefulness will be determined by decisions about the specific content of health care records and about legislation governing use of and access to them. The schedule for reaching these decisions is difficult to predict, but relevant legislation on the
privacy and confidentiality of health records is being actively considered and may be enacted in the current session of the U.S. Congress. Also being considered is more general legislation that would modify and extend the scope of the Privacy Act of 1974 and establish a privacy protection commission.

The Committee on National Statistics, over the past several months, has paid close attention to the implications of health care reform for federal statistical activities. In March 1994, it approved and transmitted a letter report (Bradburn, 1994) to several members of the Congress who have played major roles in the development of legislation on the confidentiality of health care information. The position taken by the committee in this report can be summarized by the following excerpt:

The Committee has two concerns. The first and foremost is that privacy and confidentiality of health care information be adequately protected. The second is that the U.S. health care system, individual health care subscribers, and the public as a whole benefit from access to that information for research and other statistical purposes in ways that protect confidentiality. It is not necessary to sacrifice either confidentiality or the benefits of information: both are possible if legislation provides for responsible access and demonstrated, effective means to protect confidentiality.

In the report, the committee identifies ways in which access to health care enrollment information would permit the Census Bureau to reduce the cost of decennial census operations and improve the quality of current population estimates.
A recommendation of the Panel on Confidentiality and Data Access of the Committee on National Statistics (Duncan et al., 1993) called for expanded access to administrative records for statistical uses, subject to appropriate constraints: "Greater access should be permitted to key statistical and administrative data sets for the development of sampling frames and other statistical uses. Additional data sharing should only be undertaken in those instances in which the procedures for collecting the data comply with the panel's recommendations for informed consent or notification" (Duncan et al., 1993:99, Recommendation 4.1).

Legal access is necessary but not sufficient for effective use. Access in specific instances is often arranged only with great difficulty. These difficulties will continue unless statistical agencies and the custodians of record systems develop new ways of thinking about statistical uses of administrative records. The custodians need to regard satisfying the statistical requirements of the nation as one of the responsibilities on which they will be judged, not as an inconvenience or an intrusion on their territory. Some flexibility is needed on the part of both statistical and program agencies in adapting their operating procedures and schedules to meet basic statistical needs. There must also be willingness by the statistical agencies to assist administrative agencies in every possible way, for example, by providing technical support to the latter for the production of small-area tabulations of administrative records, to the extent that this can be done
while ensuring a one-way flow of administrative records for statistical uses. Statistics Canada's access to administrative records, which makes possible the demographic data system that is described later in this chapter, may provide a useful model.

These conditions might be more readily created if the Office of Management and Budget were to adopt a strong leadership position, as part of the current administration's campaign to reinvent government, in efforts to maximize the utility of information collected by federal government agencies. A 1976 policy statement of the Social Security Administration (U.S. Department of Health and Human Services, 1976:17) suggests a principle that might guide such efforts:

The operation of the social security system produces a vast and unique body of statistical information about employment, payrolls, life-time earnings histories, retirement, disability, mortality, and benefit claims and payments. The Social Security Administration has an obligation to develop these data according to the best scientific standards and with maximum economy and minimum delay, and to publish them in a form useful both to the program administrator and to social scientists generally. It also has an obligation to encourage the linkage of these data with other bodies of statistical information and to make the data available for research uses by other organizations, subject always to careful safeguarding of the confidentiality of information relating to individuals.

A similar policy would be appropriate for all major systems of administrative records that have potential statistical uses.

The Census Bureau's initiative in organizing the July 1993 Interagency Conference on Statistical Uses of Administrative Records and its intention to hold a similar meeting with custodians of state record systems are important steps.
As indicated in the proceedings of the first conference (Bureau of the Census, 1994h), many of the custodians of administrative record systems recognize that sharing their records for statistical uses would have benefits. Several officials said they would feel more comfortable sharing the records if specifically authorized to do so by their legislative authority. They were all conscious of the need to inform data subjects about how their data would be used and to inform the public about the benefits and risks associated with data sharing. They felt the need for some mechanism, such as an individual ombudsman, an expanded role for OMB, or a data protection board, "to develop and oversee policies on data sharing, and to provide a balance between the interests of data providers and data users."

The panel also applauds the Census Bureau's initiative to establish and maintain its Administrative Records Information System, which now provides detailed information about major federal and state administrative record systems that may have potential statistical and research uses beyond those that are directly related to the programs for which they were established. The coverage, content, and structure of administrative record systems change frequently, and it is important that the information system be periodically updated. The system, which is publicly accessible in electronic form, can be a valuable resource for all agencies and organizations wishing to explore possible statistical and research applications of administrative records.

Effective use of IRS records in the decennial census and other Census Bureau programs requires a close working relationship between the two agencies. In the past, the two agencies have occasionally had difficulty in agreeing on their respective roles and in developing and maintaining smooth working arrangements. Nonetheless, statistical uses of IRS records have led to major improvements and efficiencies in Census Bureau programs, especially in the quinquennial economic censuses, in which IRS and Social Security Administration records are a major input to the census mailing lists and eliminate the need to collect separate data for most of the small establishments. The negotiations between the two agencies for access to IRS records for use in the 1995 census test and the 2000 census will be an important test of this ongoing relationship.

The panel has been greatly encouraged by recent arrangements whereby the Census Bureau has funded work by a contractor to develop comprehensive methodological documentation of the IRS's research to compare estimated 1990 population counts based on a sample of linked individual income tax and informational return records with 1990 census counts, with and without adjustment (Czajka and Schirm, 1994). In a second phase of this project, the contractor will analyze the potential for using the same IRS data sets in the context of a census or a program of current population estimates. Census Bureau, IRS, and contractor personnel meet periodically to review the work and develop specific objectives.
The Census Bureau cannot commit itself to substantially greater reliance on administrative records unless it can have reasonable assurance of continued access. Proposed legislation that calls for the establishment of major new administrative records systems should be carefully monitored to ensure that possibilities for important statistical uses of the records are recognized and are not unnecessarily foreclosed.

We believe it is especially important for the Census Bureau, other statistical agencies, and the Statistical Policy Office of OMB to play a role in the development of new record systems in connection with health care reform. These agencies should have the opportunity to participate in decisions about content of the records, so that standard concepts and definitions that are suitable for both administrative and statistical purposes can be adopted. In our interim report, we recommended that the Statistical Policy Office recognize statistical uses of administrative records as one of its major areas of responsibility and assume an active role in facilitating effective working relationships between statistical and program agencies and in tracking relevant legislation.

Recommendation 5.1: Legislation that requires or authorizes the creation of individual record systems for administrative purposes should
not create unnecessary barriers to legitimate statistical uses of the records, including important uses not directly related to the programs that the records were developed to serve. Preferably, such legislation should explicitly allow for such uses, subject to strong protection of the confidentiality of individual information. The panel urges Congress, in considering legislation relevant to health care reform, not to foreclose possible uses of health care enrollment records for the decennial censuses and other basic demographic statistical programs.

Recommendation 5.2: To facilitate statistical uses of new health record systems, the responsible executive branch agencies should invite the Census Bureau and other federal statistical agencies to participate actively in the development of content and access provisions for these record systems.

Recommendation 5.3: The Office of Management and Budget should review identifiers, especially addresses, and demographic data items currently included in major administrative record systems with a view to promoting standardization and facilitation of statistical uses of information about individuals both in these record systems and in new ones that may be developed.

Public Acceptance

Expanded statistical uses of administrative records in the census will require at least the tacit acceptance of those who provide information about themselves to the program agencies. Greater use of administrative records could reduce the number of requests for individuals to provide information about themselves to the government and also has the potential for improving coverage and substantially reducing the cost of censuses. The key issues, however, are consent and confidentiality. Do people accept the use of information about themselves for statistical purposes that are not directly related to the purposes for which they supplied their information? Should they be able, as individuals, to prevent such uses?
Will the confidentiality of their data be adequately protected? Effective use of administrative records for census evaluation, coverage improvement, and supplementation of content may require that one or more identifiers, such as full name, date of birth, and Social Security number, be collected in the census and entered into census electronic files, along with addresses. Will this be acceptable?

These questions are not easy to answer. As already noted, the final report of the Panel on Confidentiality and Data Access called for increased sharing of data among federal agencies for statistical and research purposes, with greater access to "key statistical and administrative data sets for the development of sampling frames and other statistical uses"
(Duncan et al., 1993, Recommendation 4.1). That panel specified that such uses should be conditional on strong protection of the confidentiality of the data and the use of suitable informed consent or notification procedures. It also recommended that statistical agencies undertake and support continuing research to monitor the views of data providers and the general public on data sharing for statistical purposes and on several other aspects of federal statistical activities (Recommendation 3.4).

We believe that it is necessary to proceed with public debate about the ethical, legal, and policy issues associated with statistical uses of administrative records. The Census Bureau has had contacts with privacy advocates in connection with previous censuses and has informed us that it plans future discussions that will focus specifically on statistical uses of administrative records in the 1995 census test. However, there has been some reluctance on the part of the agencies involved to enter a public debate for fear that calling attention to these questions might lead to discontinuance of important existing activities, such as the use of income tax return and Social Security data in the Census Bureau's intercensal population estimates programs. We believe it is better to face these questions directly. Discussions are likely to be more productive if they focus on specific uses of administrative records, such as improvement of coverage in the 2000 census and the implications of an administrative records census in 2010, rather than on broad philosophical questions.

Carefully designed surveys and focus group interviews can help provide background for public discussion of these issues. The views expressed by data providers and the public in surveys do not always coincide with those of privacy advocates as reflected in congressional testimony, panel reports, and other public venues.
For example, some privacy advocates have expressed strong objections to the possible use of the Social Security number as an identifier for participants in health care plans. But in the 1993 Harris-Equifax Health Information Privacy Survey, about two-thirds of the respondents favored the use of the Social Security number for this purpose in preference to a new identifier (Louis Harris and Associates, 1993).

Relevant information on taxpayers' opinions about possible uses of their tax information in the census of population and for other statistical purposes has been collected in a series of surveys conducted for the IRS, most recently in the 1990 Taxpayer Opinion Survey. In that survey, the results suggested that the majority of taxpayers support the idea of their tax information being released to other agencies for statistical purposes, but that a minority, generally about 1 in 5, are strongly opposed. When they were specifically asked about the use of administrative records in a decennial census, 70 percent of the survey respondents favored the use of Social Security information on their date of birth and sex and 15 percent were strongly opposed. A smaller proportion, 61 percent, favored the use of income tax return information on their place of residence and income in the census and 23 percent were strongly opposed (Internal Revenue Service, 1993).
Another kind of evidence about people's feelings on this subject is provided by the responses of survey respondents to requests for their Social Security numbers or other identifiers, so that information about them residing in administrative record systems can be obtained and linked to their survey responses. When Social Security numbers have been requested in surveys sponsored by the Census Bureau, relatively small proportions of respondents, usually well under 10 percent, have refused to give them to the survey interviewers (Beresford, 1992). Response to these surveys and to individual survey items is voluntary.

Survey respondents' willingness to give permission for such linkages can be influenced considerably by the manner in which permission is sought. Common practice is to request the needed identifiers at some point in a survey, following a brief explanation of the purposes for which they will be used. Recent experiments conducted by Statistics Canada (1994) tested two different procedures for obtaining health insurance numbers from respondents to a national health survey, so that data from provincial health ministry administrative files could be linked to their survey responses. The common practice of asking for the numbers near the end of the survey, with an explanation of their intended use, resulted in a low refusal rate. However, a procedure that required a signed consent form was successful for only about one-half of the respondents.

The results of opinion surveys and experiments like the ones cited are sensitive to variations in many features: sponsorship and purpose of the survey; the population covered; the sample design and selection methods used; the wording, format, and order of questions; the context in which questions are asked; the mode of response; the level of nonresponse; and others.
To obtain more information for developing its findings and recommendations on this topic, the panel commissioned a review of relevant research (Blair, 1994). Some of the preliminary findings of this review, which includes extensive analysis of the data from the most recent (1990) national Taxpayer Opinion Survey sponsored by the IRS, are reflected in our conclusions on this topic.

Data from different surveys and other sources are difficult to compare. Only the IRS Taxpayer Opinion Survey series specifically addressed statistical uses of administrative records, and that was done in the context of surveys dealing primarily with the experience and views of knowledgeable taxpayers (not the general adult population) relating to tax compliance issues. The series of U.S. surveys about privacy issues sponsored by Equifax, including the 1993 Health Information Privacy Survey, has focused almost entirely on people's concerns about administrative (nonstatistical) uses of and access to information about them in a much more general context. Nevertheless, analysis of the data from these two sets of surveys suggests a certain amount of consistency in identification of the population subgroups that are most concerned about privacy issues. Women, blacks, people with less education, and, somewhat less clearly, middle-aged people exhibit the highest proportions with general and specific privacy concerns.
It is our view that surveys and other research undertaken to date have not yet provided a sufficiently clear picture of the public's understanding of and views on statistical uses of administrative records. The issues are conceptually complex, requiring care in both explanation and attitude measurement. Most people have given little thought to this subject, so responses to questions about it may be based mostly on generalized feelings about privacy issues and may be sensitive to minor variations in question wording and context. Experience in the series of taxpayer opinion surveys has shown that asking people how they feel about specific kinds of uses leads to results that are substantially at variance with responses to broad philosophical questions. More information is needed about the specific concerns of those who express strong resistance to secondary uses of their administrative record information. Opinions may change if administrative record uses in the decennial census should enter the arena of broad public debate. For these reasons, the panel believes that further research on public views about the use of administrative records is needed.

Recommendation 5.4: The Census Bureau, in cooperation with other agencies and organizations, should support a program of research on public views about statistical uses of administrative records in government. The research should focus on public reaction to very specific administrative record use scenarios, rather than on general questions of privacy.

Possible biasing effects of the sponsorship of such research need to be guarded against. Survey designs, instruments, protocols, and procedures should be reviewed by qualified independent experts not associated with the sponsoring organizations.
If legislation is passed that establishes a privacy protection commission or board, we believe it would be appropriate for such a body to sponsor research of this kind or to provide external advice on research plans to other organizations sponsoring such research.

Technical Requirements

Several of the central themes that are discussed in Chapters 1 and 2 (including address list development, record linkage research, the need for long-term planning, and the need for greater interagency cooperation and coordination) are critical to making better use of administrative records in the decennial census and in other programs that produce small-area population and housing data. The first two of these are technical requirements, and in this subsection we discuss briefly their particular relevance to statistical uses of administrative records.

As stated in Chapter 2, the panel believes that a geographic database that is fully integrated with a Master Address File is a basic requirement, whether for a traditional census or for one making greater use of administrative records. To be cost-effective and available for noncensus uses, the system should be continuously updated. Its coverage should be extended to rural areas, and conversion of rural-type addresses to street addresses should be promoted by all means possible. We were pleased to note that the Census Bureau and U.S. Postal Service are collaborating on efforts to do this (Bureau of the Census, 1993a: November). Appendix B to the interim report of the Panel on Census Requirements explained clearly how such a system could serve as the keystone for using administrative records data to provide small-area data more frequently and inexpensively.

In our interim report, we recommended (Recommendation 1.1) that the transition to a continuous, integrated system begin immediately, at least for the 1995 census test sites, so that it could receive its first major tryout in the 1995 census test. We note that the Census Bureau recently expressed its intentions in the following terms: "The Census Bureau is committed to having a continuously updated, permanent address list linked to the TIGER database. ... We plan to use the MAF [Master Address File] in the 1995 census test" (Bureau of the Census, 1994a:25).

Record linkage (that is, the identification of records belonging to the same unit, either within a single data set or in two different data sets) is critical to enhanced uses of administrative records, whether in the context of a full administrative records census or a more traditional design. If different administrative data sets are to be used to improve the coverage of the Master Address File or of persons at known addresses, duplicates must be identified and eliminated. Uses of administrative records to supply missing data or to evaluate the coverage and content of the census enumeration all require some type of record linkage for persons or addresses. Record linkage, like any element of census data collection and processing, is subject to error.
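As a rough illustration of the duplicate identification described above, the following Python sketch groups records that share a match key and keeps one record per group. The field names (`name`, `address`, `dob`), the `normalize` helper, and the choice of exact matching on normalized fields are assumptions made for illustration only; they are not drawn from Census Bureau procedures, and operational linkage systems generally need to tolerate spelling and formatting variation that exact matching would miss.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def link_key(record):
    """Build a hypothetical match key from name, address, and date of birth."""
    return (normalize(record["name"]), normalize(record["address"]), record["dob"])


def unduplicate(records):
    """Group records that share a match key and keep the first of each group."""
    groups = defaultdict(list)
    for record in records:
        groups[link_key(record)].append(record)
    return [group[0] for group in groups.values()]
```

Applied to records pooled from two files, `unduplicate` would return one record per distinct key, so a person listed as "Ann Lee, 12 Oak St." in one file and "ann lee, 12 OAK ST" in another is counted once, provided the date of birth agrees.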
The Census Bureau and other organizations have developed effective techniques for linking large data sets, but many aspects need further research and development as the techniques are applied in specific circumstances. How can the inputs be standardized to facilitate linkages? Over how wide an area should initial computerized matches be undertaken? What are the best keys (e.g., name, address, date of birth, Social Security number), alone or in combination, and what are the costs and other considerations for capturing these items in the computerized files for a census or evaluation survey? Additional research on these questions is a prerequisite to success in making more effective use of administrative records. As discussed in more detail in the next two sections of this chapter, testing should be carried out in conjunction with the 2000 census and the tests leading up to it and also in separate initiatives to explore the possibility of a 2010 census based primarily on administrative records.

Another technical requirement for enhanced uses of administrative records is knowledge about the quality of the data that they can provide. How well do administrative record systems, individually or in combination, cover the target population for a census? Recent work in the United States (Sailer et al., 1993; Czajka and Schirm, 1994) and in Canada (Standish et al., 1993) indicates that
well over 90 percent of the respective countries' enumerated populations can be identified in their tax systems, not supplemented by any other source. How much could this coverage be improved by adding records from other systems, and how would the coverage of subgroups defined by geography or other characteristics (differential coverage) be affected, both in absolute and relative terms? To what extent do the record systems proposed for use include information that makes it feasible to identify addresses and persons that are not members of the target population for the decennial census?

Besides coverage, the accuracy and relevance of data available from administrative records need to be considered. For example, suppose there are administrative data for persons whose census responses are incomplete. Is enough known about the quality of these data to make an informed choice among alternatives: further nonresponse follow-up, imputation based on similar persons, or substitution of the administrative data? To what extent could tax data be used either to evaluate or substitute for income data collected in a census? What are the implications of conceptual differences? Opportunities and resources should be sought to pursue questions like these.

Evaluation of the quality of information from administrative records should take into account the purposes for which the data may be used and the quality of comparable information that can be obtained in censuses and surveys. If data of equivalent or perhaps somewhat lower quality can be obtained from administrative records at a substantially lower cost, their use may be acceptable. Failure of administrative data to duplicate exactly the concepts used in the decennial census does not necessarily rule out their use for any statistical purpose.
As discussed later in this chapter, IRS data on taxable income may provide useful indicators of change in total money income as defined in the census.

AN ADMINISTRATIVE RECORDS CENSUS: KEY FEATURES AND ISSUES

As noted above, we agreed, at an early stage of our work, with the Census Bureau's conclusion that a census based primarily on administrative records was not a feasible option for 2000 (Bureau of the Census, 1992c). However, we also recommended that the Census Bureau initiate a separate program of research on uses of administrative records focusing on the 2010 census and on the current estimates program. This recommendation stemmed from our belief that research work must start now if an administrative records census is to be a possibility for 2010. Without an early start, the Census Bureau will miss the important opportunity provided by the 2000 census to try out and evaluate many aspects of this new approach. In this section we examine what a census based primarily on administrative records might mean, and consider the main political, managerial, and technical issues raised by this possibility.

In essence, an administrative records census represents a reversal of the roles
of enumeration and administrative record use in the census. In the traditional census, and even with the innovative changes proposed for 2000, the census depends first and foremost on enumeration of the population, with administrative records available for use as an aid and supplement to the enumeration process. In an administrative records census, the administrative records become the primary source of information, supplemented when necessary by enumeration or other methods of data collection.

In our view there are five primary issues to be addressed in assessing the feasibility and benefits of an administrative records census.

1. Privacy: Is the American public prepared to accept the use and linkage of administrative records for the purposes of a census?
2. Coverage: Can administrative records deliver the level and distribution of overall coverage needed from a census?
3. Geography: Can administrative records allocate individuals accurately by place of residence as required by a census?
4. Content: How much of the traditional content can we expect an administrative record census to deliver, and with what quality?
5. Cost: How would the cost of an administrative record census compare with other alternatives?

These are the issues on which a research program must focus. We examine each of them in more detail below, but first we will sketch out some possible scenarios, and potential record sources, for an administrative records census.

Definition of an Administrative Records Census

There are many approaches one could envisage for an administrative records census. We will sketch out here a generic approach that seems to fit the current statistical and administrative record infrastructure in the United States.

This scenario, which is illustrated in Figure 5.1, would start with a geocoded Master Address File purporting to contain all addresses in the country.
Administrative sources would probably be used to some degree in the construction of this initial Master Address File. Selected administrative files of individuals (or families or households) would be matched individually by address to the Master Address File. This would result in an enhanced file with a set (maybe an empty set) of individuals associated with each address. There would also be a set of addresses from administrative records that do not appear to have a match on the Master Address File. The next step would be an edit that would apply specified checks to each address to identify cases in which the set of individuals linked to the address fails internal consistency tests, or no individuals are linked to the address. These would be the addresses for which some form of follow-up and resolution is required.
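The match and edit steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: addresses are taken to be already standardized strings, and the record fields (`address`, `age`) and the single consistency check (at least one person aged 18 or over at the address) are hypothetical stand-ins for whatever checks would actually be specified.

```python
def match_to_master(master_addresses, admin_records):
    """Attach administrative records to Master Address File entries by address.

    Returns the enhanced file (address -> list of linked persons, possibly
    empty) and the list of records whose address is not on the master file.
    """
    enhanced = {address: [] for address in master_addresses}
    nonmatches = []
    for record in admin_records:
        if record["address"] in enhanced:
            enhanced[record["address"]].append(record)
        else:
            nonmatches.append(record)
    return enhanced, nonmatches


def edit(enhanced, adult_age=18):
    """Return the addresses needing follow-up, with the reason for each."""
    flagged = {}
    for address, persons in enhanced.items():
        if not persons:
            flagged[address] = "no individuals linked"
        elif all(person["age"] < adult_age for person in persons):
            # Hypothetical consistency test: a household with no adult.
            flagged[address] = "fails consistency check"
    return flagged
```

In this sketch the flagged addresses, together with the nonmatched records, would form the workload for the follow-up and resolution step.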
[Figure 5.1: flow diagram. Labels recoverable from the original: Master Address File; Administrative Files; Match (on address); Nonmatches (on address); Enhanced Master Address File; Edit (within address); Administrative Record Census Database.]

FIGURE 5.1 Schematic diagram of an administrative record census.

The next step would be resolution, in which cases referred from the edit, as well as addresses from administrative records that do not match to the Master Address File, are investigated and resolved. The precise form of this investigation can depend on the nature of the inconsistency found, but might include automated correction or imputation, manual review and decision, supplementary
matching to additional sources, and telephone and field follow-up. The application of these resolution decisions to the enhanced Master Address File produces the census database.

This generic scenario leaves much room for variations. One important possibility would be to investigate and resolve only a sample of unmatched cases. This would be analogous to the use of sampling for nonresponse follow-up in a conventional census. Even if a combination of administrative files is used to produce the census data base, we would not want to assume that coverage is sufficiently complete and uniform to require no further correction. Like a conventional census, a census based primarily on administrative records would require the fourth step that was identified at the beginning of Chapter 1: a coverage measurement process that estimates the size of the population not covered through the initial and follow-up processes. The coverage measurement, which would be undertaken for a sample of areas, could take various forms, such as field enumeration, more intensive investigation of unmatched cases, verification of household composition for matched addresses, or some combination of these. Estimation procedures would be applied to the results to produce a one-number administrative records census.

Other crucial issues are the choice of administrative records to match to the Master Address File, specification of the checks or edits to be applied to the individuals who are linked to a particular address, and the nature of resolution measures for those cases that fail the checks. The choices in these areas will depend on the results of research into the quality and coverage of administrative record sources, and the costs and effectiveness of different resolution measures. The closer one can move to the situation in which resolution through field follow-up is unnecessary, the greater the benefits from the administrative records approach.
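The variation mentioned above, in which only a sample of unmatched cases is investigated and resolved, can be illustrated with a simple weighting sketch. Everything here is an assumption made for illustration: the sample is a simple random sample with equal weights, and `resolve` is a hypothetical callback standing in for telephone or field follow-up of a single case. An actual design would likely involve stratification and more elaborate estimation.

```python
import random


def estimate_from_sample(unresolved_cases, sample_size, resolve, seed=0):
    """Estimate total persons at unresolved addresses from a followed-up sample.

    `resolve` is a hypothetical callback that returns the number of persons
    found for one case after follow-up.
    """
    rng = random.Random(seed)
    sample = rng.sample(unresolved_cases, sample_size)
    persons_found = sum(resolve(case) for case in sample)
    # Under simple random sampling, each sampled case represents this many cases.
    weight = len(unresolved_cases) / sample_size
    return weight * persons_found
```

The estimate is the number of persons found in the sample scaled up by the inverse of the sampling fraction; the trade-off is lower follow-up cost in exchange for sampling error in the resulting counts.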
Another crucial issue is timing. As long as the concept of a specific Census Day remains, it is important that any follow-up take place soon after that date. This implies that the administrative records to be used must be available before or soon after Census Day. Most administrative records have some time lag, so versions that predate Census Day will probably have to be used.

Variations of an administrative records census that do not require a Master Address File do not seem promising. Whereas the administrative records themselves could be used to develop an address list, there would still need to be some bonding of these addresses to a precise geographic location.

In fact, if we look at what other countries have done, those that have carried out an administrative records census have had not only the equivalent of a Master Address File, but also a permanent population register, i.e., a requirement that individuals register all changes of address with a local or central authority. For example, Denmark, which is the leader in administrative records censuses, having conducted them in 1981 and 1991 (Redfern, 1994), has a network of permanent registers of population, dwellings, and enterprises. Other Scandinavian
countries that have a population register, such as Sweden and Finland, have nonetheless conducted censuses that combine administrative records use with traditional enumerations in order to improve the quality and extend the content of the administrative record sources. Outside the Scandinavian countries, to our knowledge, only the Netherlands has conducted a census based primarily on administrative records. Therefore, in moving toward an administrative records census, the United States would be joining a select group of countries and would be breaking new ground by taking a census based on administrative records in the absence of a population register.

Record Sources for an Administrative Records Census

Three major requirements emerge when one considers what types of administrative records would be most suitable for use in a census based primarily on administrative records. First, the census should have as its core a nationally consistent set of administrative records. These could be held federally or could be decentralized but should be subject to precise national standards of content, quality, and timeliness. Although state or local administrative records can be very useful for local follow-up or supplementation of a traditional census, an approach that uses administrative records as the primary data source must start from a set of records that have broad and consistent coverage over most of the country. The prospects of trying to bring 50 or more independent administrative systems into conformity for a census, if they are not already following common standards, are not attractive. This preference is not meant to preclude the use of state and local record systems to supplement the basic administrative record set(s) used for the census. However, care should be taken to avoid the use of supplemental record systems in ways that would clearly lead to inequitable treatment of different localities or population groups.
The second requirement is, insofar as possible, to use administrative record systems that are updated on a continuous basis, rather than records that are subject only to periodic updating. Fulfillment of this requirement allows a snapshot to be taken at any point in time, in particular on Census Day, rather than being restricted to specific points in time when a file update has been completed. It is also almost certain that the average lag between real changes and their appearance in the file will be much less with a continuously updated system. For example, a record system for administering the provision and financing of health care may be updated every time there is a change in status affecting coverage, and every time there is a transaction with the health care system. In contrast, a record system for collecting income taxes may be updated for most people only once a year, following filing of their annual tax returns.

It is unlikely that the administrative record set(s) chosen as the basic source for the main collection phase of the census will provide equivalent coverage of all population subgroups defined in terms of geography or demographic and socioeconomic variables. The third requirement, therefore, is to supplement the core administrative record set(s) with other types of records that are expected to be promising sources for improving coverage of addresses or persons most likely to be missed by the basic administrative record sources. Some of the supplementation efforts could be made part of the main collection phase of the census; additional efforts could be part of special procedures analogous to the integrated coverage measurement survey that is proposed as a means of producing a one-number conventional census.

To illustrate some of the considerations that must go into the choice of administrative record systems, but without meaning to restrict consideration of other potentially useful sources, we next review two major contenders for the role of core administrative record system in an administrative records census: the individual taxation system and the health care system.

Income Tax and Social Security Records

About 65 percent of the population files a tax return either as primary or secondary taxpayers (Sailer et al., 1993). Children and other dependent individuals are claimed as exemptions by tax filers. In addition, many nonfiling individuals are known to the tax system through informational documents covering particular sources of income. By carefully combining and unduplicating these various sets of individuals, estimates of population can be developed from the individual tax system.

Research along these lines is under way at the IRS and has produced estimates of population that are 97.5 percent of the 1990 census population at the national level (Sailer et al., 1993). Coverage of males (99 percent) is higher than of females (96 percent). At the state level these ratios vary between 91 and 104 percent. Similar results have been obtained from comparable work in Canada.
These results are encouraging but not yet good enough to stand alone, especially when we remember that these percentages are in relation to a census count that is itself probably 1-2 percent below the true population count. On that basis, an estimate equal to 97.5 percent of the census count would correspond to roughly 95.5 to 96.5 percent of the true population. The variation in coverage between states may be due partly to the problem of counting tax filers at the appropriate location, a problem that is likely to become more apparent as one tries to make estimates for smaller geographic areas. Moreover, the observed net coverage rates may mask levels of offsetting under- and overcoverage higher than those estimated to have occurred in the 1990 census.

It is also the case that these high coverage rates have been achieved without any explicit attempt to adapt the tax system to become a source of population estimates. Further analysis of coverage rates should help to pinpoint sectors of the population that are not well covered and to identify measures that might be taken within the tax system to improve both coverage and geographic precision. Changes to tax rules themselves, particularly changes that encourage low-income
individuals to file to receive tax credits, may also help to fill some of the coverage gaps.

Information in IRS files can be and for some purposes has been linked with information in files of the Social Security Administration, using Social Security number as the primary match key. Such linkages have made it possible for Sailer and colleagues to develop national estimates of the 1990 population by gender and age group. The key Social Security file, the NUMIDENT file, also contains information on race and ethnicity, determined at the time each person's Social Security number was issued, but the race/ethnic data are incomplete and of questionable quality.

Continuing this line of research appears worthwhile. If its potential materializes, it should eventually result in a policy decision that the production of population estimates become a recognized objective of the tax system, rather than just an incidental by-product. This would serve to justify any changes to the administrative system needed primarily for the purposes of population estimation. Even if this approach does not develop into a replacement for the census, it will surely lead to valuable intercensal estimates of population change.

Recommendation 5.5: Research on the production of population estimates from Internal Revenue Service and Social Security Administration records should continue as a joint initiative of these agencies with the Census Bureau and should focus on identifying measures that could serve to reduce coverage differentials and improve geographic precision.

Health Care Records

In contrast to the well-established individual tax system, comprehensive uniform health care records do not yet exist. But, as discussed in the preceding section, they may soon, and that presents a statistical opportunity that should not be missed.
Given the coverage objectives for health care, one can expect that the records needed to support the system will provide very high coverage of the population. Furthermore, and again in contrast to the tax system, the health care system should have the advantage of being a system that almost all people will wish to be in rather than one they wish to avoid. The basic enrollment records are likely to include race and ethnicity information, based on the latest OMB requirements, for nearly all persons, something that is now lacking in the Social Security Administration and Internal Revenue Service record systems. This is potentially an extremely valuable source of population data.

There is a window of opportunity as a health care record system is put in place, but two important principles must be accepted. First, there must be recognition of the legitimacy of use of these records for census-type purposes. Second, there has to be statistical input to their design to enhance their value for this
purpose by, for example, inclusion of appropriate information on geographic location, individual characteristics (especially race and ethnicity), and family relationships.

Given the legitimate privacy concerns that surround health care records, it is important to distinguish the demographic and socioeconomic enrollment information from the medical information that reflects general health status and specific encounters with the health system. It is only the former that is relevant to consideration of census alternatives. The latter is certainly of interest for medical research and health policy, but its access should probably be subject to different rules than those for the basic demographic data.

Other Major Record Systems

An administrative record census would not necessarily have to be based entirely on a single core system. Supplementation of the core system with records from other systems that have national coverage and consistent content across states might prove cost-effective, especially if these additional record systems tend to cover populations that are more likely to be missed in an enumerative census. In the 1995 census test, the Census Bureau is planning to experiment with the use of several such files, including those maintained for the Food Stamp, Aid to Families With Dependent Children, and Medicaid programs. It is important to recognize, however, that ongoing initiatives for welfare reform are gathering strength, with the result that these programs and the associated record systems may soon undergo substantial change. The panel's Recommendations 5.1 and 5.2, which have singled out expected changes in health care records, could apply equally to new or modified record systems for welfare programs; that is, appropriate Census Bureau uses of records from these systems should not be precluded and the Census Bureau should have an opportunity to participate in the design of new record systems.
Summary of Key Factors Affecting Feasibility

We now return to the five primary issues identified earlier on which we feel a research program needs to focus if an administrative records census is to become a reality.

1. Public acceptance. The issue of public acceptance of the use of administrative records has been addressed in general in the preceding section. Because the census is the best known of all government statistical activities, the issue becomes paramount in the case of an administrative records census. The public benefits of using administrative records for a census, primarily in saving money and reducing respondent burden, have to be seen to outweigh any perceived invasion of privacy or violation of confidentiality from their use. A significant
public information effort would be needed to make clearer the distinction between statistical uses and administrative uses and the public benefits and benignancy of the former.

We believe that it is important to pursue actively the public debate about an administrative records census. (Some aspects were covered in a March 16, 1994, hearing of the House Subcommittee on Census, Statistics and Postal Personnel on uses of health information for research and statistical purposes.) If the idea is to be rejected on political grounds, it might as well be rejected before significant investment is made. And if the idea is to be judged on cost-benefit considerations, with some cost ascribed to the rights of individuals to control uses of information about themselves, then a debate on these costs and benefits, with an element of public education in it, should begin in parallel to technical developments.

2. Coverage. Because one of the prime objectives of census redesign is to reduce differential undercoverage, while at least maintaining overall coverage levels, the level of coverage that an administrative record census can achieve is a primary issue. We have reported above on work by the Internal Revenue Service to begin to assess what overall coverage rates might be achievable from tax records without any adaptation of tax reporting for purely statistical ends. We have recommended that this work continue.

Assessing differentials in coverage by demographic characteristics requires these characteristics to be available in the administrative files. For age and sex, and maybe marital status, this is not a problem, but for race, one of the primary correlates of undercoverage, it is. Options for associating a race variable with tax records would have to be developed. A complete refiling of Social Security numbers to eliminate duplicates, add personal identification numbers, and serve related purposes would offer one option.
Another would be to make greater use of the birth registration system, possibly including parents' Social Security numbers on the certificates.

In the case of health care records, the opportunity still exists to ensure that the required variables are included in the basic health care enrollment records. A recent news article (Vobejda, 1994) indicates that the main advocacy organizations for racial and ethnic minorities are in favor of this.

3. Geography. The importance of an up-to-date Master Address File to a successful census has already been stressed. For an administrative record census it is crucial. Three other areas of research are important to the geographic dimension of an administrative record census.

If the census requires persons to be counted at their usual residence, administrative records must contain residential addresses. The extent to which this is the case and the accuracy of de jure address recording need to be assessed.

Addresses themselves must reflect geographic location. In many rural areas this is not yet the case. The creation of urban-style addresses for rural areas is a prerequisite for an administrative record census, and the Census Bureau is working with the Post Office to bring this about (Bureau of the Census, 1993a: November). Postal codes could be very helpful in matching addresses, especially if they are detailed and reported accurately in the administrative records. Liaison with the Post Office in the future development of the postal code system should aim at making the system as consistent with census needs as possible.

Finally, as discussed in the preceding section, efficient address and person matching software will be crucial to the large-scale matching operation that will be required by an administrative record census.

4. Content. Administrative records will not be able to match the richness of content of a long-form census. In terms of replicating short-form content, the gaps are likely to be race and family relationships. If administrative records are to become a basis for statistical uses, including the census, it would be helpful to specify a core set of demographic variables that should be maintained on all administrative files. Particular administrative files will be rich in certain variables (e.g., income tax files are rich in income variables); other variables will not be available through any administrative record system and would have to be the subject of a follow-on or postcensal survey if they are to be linked directly to the census. Alternatively, estimates for these variables could be obtained at a fairly low level of geography from a continuous measurement system, as discussed in Chapter 6.

The fact that a variable is contained in an administrative record system does not ensure its relevance and quality. As noted earlier in Recommendation 5.3, we believe the Office of Management and Budget should take the lead in promoting standardization of identifiers and basic demographic variables that are commonly included in administrative records. Such standardization would benefit both statistical and program agencies.
The latter would find it easier to relate aggregate program data to Census Bureau population data to analyze coverage and other features of their programs. The quality assurance procedures that are used within each administrative record system will have to be assessed in determining usability in a census.

5. Cost. A major reason for considering an administrative record census is its potential to be cheaper than a revised traditional methodology. It is difficult to cost an administrative record census without a clearer idea of the record sources and the methods to be used. Nevertheless, a costing model framework should be developed so that initial crude cost estimates can be derived and subsequently revised as the features of a practical administrative record census methodology are tied down.

Recommendation 5.6: The Census Bureau should continue its development of a cost model for an administrative record census and should use the model to maintain current cost estimates for several versions of this option as they are developed.
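One way to picture a costing framework of the kind discussed above is as a small parameterized model whose inputs can be revised as the methodology is tied down. The sketch below is purely illustrative: the cost components (a fixed cost, a per-address matching cost, and a per-case follow-up cost), the class name `DesignOption`, and all parameter names are assumptions introduced here, not elements of any Census Bureau cost model, and no real cost figures are implied.

```python
from dataclasses import dataclass


@dataclass
class DesignOption:
    """Hypothetical cost inputs for one version of an administrative record census."""
    name: str
    addresses: int                  # addresses on the Master Address File
    fixed_cost: float               # file acquisition, systems, management
    cost_per_address_matched: float
    share_needing_followup: float   # fraction of addresses referred to resolution
    cost_per_followup: float

    def total_cost(self):
        """Fixed cost plus matching cost plus follow-up cost."""
        followups = self.addresses * self.share_needing_followup
        return (self.fixed_cost
                + self.addresses * self.cost_per_address_matched
                + followups * self.cost_per_followup)


def rank_options(options):
    """Return design options ordered from cheapest to most expensive."""
    return sorted(options, key=lambda option: option.total_cost())
```

Keeping the assumptions as explicit parameters makes it straightforward to rerun the comparison whenever research revises, for example, the share of addresses expected to need field follow-up.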
Modelled costs for several versions of an administrative record census should be compared with the costs of the alternative proposed census methodologies for 2000, not with 1990 costs. If the supposed significant savings do not materialize as research proceeds, the rationale for further pursuit of the administrative records option should be reexamined.

Testing an Administrative Records Census Approach

The previous section has identified a series of research issues to be addressed. Adequate exploration of these issues will require a series of tests that progressively expand the administrative record census process through the stages illustrated in Figure 5.1. For example, the following type of research program could be envisioned:

1. Pilot test(s) to construct a census purely from administrative records in a defined and limited geographic area, primarily to learn of the problems;
2. Test(s) as in (1), but in conjunction with a conventional census test to allow evaluation of coverage;
3. Test(s) that incorporate follow-up activities to investigate missing addresses and households for which the administrative data fail to meet standards for completeness and consistency (primarily procedural tests);
4. Test(s) as in (3), but in conjunction with a conventional census test (or a coverage measurement survey associated with a census test) to allow evaluation of coverage. This will require careful design to ensure that the administrative records test and the conventional census do not affect each other; and
5. Full stand-alone dress rehearsal(s) of an administrative records census in chosen areas, using the procedures found to be most effective in steps (1) to (4).
Since the 2000 census represents the last chance before 2010 to compare an administrative records census approach to the traditional approach under full census conditions, it is essential that research should by then have advanced to the stage at which a meaningful test (probably of type (4) above) can be undertaken in conjunction with the 2000 census. To meet this timetable will require an immediate start to the research program whose elements have been identified above.

Recommendation 5.7: During the 2000 census the Census Bureau should test one or more designs for an administrative records census in selected areas. Planning for this testing should begin immediately.

As stated at the start of this chapter, the panel believes that the Census Bureau should treat the possibility of a 2010 administrative record census as a live option, to be carefully explored and evaluated during the current decennial census cycle. The option may be rejected later in the cycle on the basis of new developments in legislation that governs coverage and content of and access to
administrative records or public opinion about secondary statistical uses of the records. Research may show that the cost reduction and other advantages are smaller than expected, that the quality of administrative records data is inadequate, or that there are other unforeseen problems. However, we do not know enough now to reach a firm conclusion. Even if a 2010 administrative records census should prove to be out of the question, the research necessary to evaluate its feasibility may facilitate and accelerate other beneficial uses of administrative records to produce small-area demographic data.

USE OF ADMINISTRATIVE RECORDS IN THE 2000 CENSUS

We have recommended that the Census Bureau follow a two-track approach to expanded uses of administrative records. The 2000 census track, which is the subject of this section, would identify and test new uses of administrative records considered feasible for use in the 2000 census. Possible uses include coverage improvement, content improvement, evaluation of the census, and measures to improve operational efficiency. The long-range track would develop and test procedures for a possible administrative records census in 2010 or beyond and uses of administrative records in other demographic data programs, such as current population estimates and current sample surveys, including any new surveys that might be started as part of a continuous measurement program (see Chapter 6).

The preceding section of this chapter discussed the administrative records census part of the long-range track. The section that follows this one will discuss the current estimates and surveys parts of the long-range track. Tests of administrative record uses prior to the 2000 census and uses of administrative records in the 2000 census, if designed with the long-range track in mind, can contribute in important ways to progress on that track.
Such progress requires that the Census Bureau take advantage of favorable opportunities to acquire knowledge about key administrative records systems and gain experience with their use.

The 1995 Census Test

A Census Bureau memorandum describing plans for testing various uses of administrative records in the 1995 census test was made available to the panel in January 1994 (Knott, 1994). A more widely distributed document, the "1995 Census Test Design Plan," which was issued in February, describes some of the uses of administrative records that the Census Bureau plans to evaluate in the test.

The Census Bureau's plans call for the development of an administrative records database for each of the test sites, using records from several federal, state, and local record systems. Negotiation for acquisition of these records is already under way. Several different uses of the administrative records database will be tested
directly as part of the 1995 census test or simulated and evaluated by comparison with the census test results. The primary activities for which uses of administrative records will be tested or simulated are coverage improvement for the count portion of the test and integrated coverage measurement. Administrative records, along with data from the 1990 census and other sources, will also be used to develop a targeting database for use in the identification of areas with specific barriers to enumeration and efficient stratification of samples for nonresponse follow-up and integrated coverage measurement operations. It is anticipated that new software for matching and unduplication of persons will be developed to support all of these uses. These plans are consistent with the panel's recommendations in its interim report (Chapter 4, Recommendations 4.3 to 4.5) and we hope that they can be successfully implemented.

Privacy and confidentiality aspects of the research will require close attention. Census Bureau staff have been working on the development of respondent notification statements for the census test questionnaires and on the language that will be included to describe planned uses of administrative records (response to the census test will be mandatory, so it is not appropriate to refer to informed consent procedures in this instance). These uses will also have to be described in the Privacy Act record systems notice or notices for the census test that are required to be published in the Federal Register. Staff also plan to meet with organizations and individuals who are interested in privacy issues to explain and discuss their plans for administrative records uses in the 1995 census test.

Maintaining the confidentiality of the identifiable record extracts that will be obtained from several different administrative sources is also an important consideration.
To maintain the continued trust and confidence of the public and of the federal and other agencies that supply these records, the Census Bureau will need a detailed plan for physical security and for control of access to the records at all stages (who will use them and for what purposes). An audit trail to record all instances of access should be part of the plan.

Because of the timing of this report, the panel was not in a position to comment in depth on technical details of plans for testing administrative record uses in the 1995 census test. What we have seen so far is encouraging. At this time, we have one general recommendation and comments on two specific features of the test plan.

Recommendation 5.8: The Census Bureau should plan its uses of administrative records in the 1995 census test and other tests leading up to the 2000 census and in the census itself in a manner that will also provide knowledge and experience of value for a possible administrative records census in 2010 or beyond and for uses of administrative records in demographic programs other than the census.

The primary reason for testing administrative records uses in the 1995 census test is unquestionably to determine what kinds of uses are most likely to be effective
in attaining the main goals for the 2000 census, namely, to reduce differential undercoverage and unit costs of enumeration. However, with careful planning, much can also be learned about other potential uses of the records, whether for current programs or for a future administrative records census. One implication of this strategy is that attempts to obtain specific administrative record files should not necessarily be abandoned as soon as it becomes evident that they cannot be obtained in time for operational use in the 1995 census test. If obtained later, these files can still be used to simulate their use in the test or for other purposes, such as improving current population estimates. We believe that these considerations are especially important for major federal and state record systems.

In keeping with this general theme, we have two specific suggestions. The first is that birth records for a time period surrounding the 1995 census test enumeration date should be obtained for at least some of the test sites. Birth records could play a role in integrated coverage measurement procedures or, failing that, in the evaluation of those procedures.

Second, as mentioned in our interim report (p. 77), we suggest that Social Security numbers be obtained, on a voluntary basis, for a sample of people in one or more of the test sites. Doing this would contribute to general research on record linkage techniques by providing a basis for evaluation of the relative effectiveness of the Social Security number as a match key compared with other identifiers like name, address, and exact date of birth. A second reason would be to facilitate a full two-way match of people counted in the test sites by conventional census methods with those identified by one or more administrative records sources.
In particular, it would be useful in an attempt to determine whether people enumerated but not identified in administrative records for the test sites were covered by administrative records for areas outside the test sites.

The 2000 Census

Earlier in this section we identified four kinds of administrative record uses that are worth considering for the 2000 census. We now consider each one in somewhat more detail.

1. Measures to improve coverage. Coverage improvement measures are of two kinds: those aimed at improving the Master Address File and those aimed at improving the coverage of individuals. The former can be used throughout the decade, or at least prior to the census, to ensure a good starting address list. The latter needs to use records current at the time of the census and will tend to focus on administrative sources that are rich in data on hard-to-enumerate subpopulations. Both kinds of uses of administrative records could be made either across the board or for a sample of blocks as part of a built-in adjustment process designed to produce a one-number census. Administrative records would be a
logical element of an integrated coverage measurement procedure (see Chapter 4) in which close to 100 percent coverage is sought for a sample of areas.

2. Measures to improve content. Administrative records have already been used to some extent to evaluate census responses, for example, in record checks with tax data to assess income reporting. A next step could be to use them as a source of data to replace data that are missing due to nonresponse or data that failed edit tests. The final level would be to use administrative records as the initial source of data for some variables, with some form of follow-up for missing data cases as necessary.

3. Measures to improve operational efficiency. The use of administrative records as a source of telephone numbers for use in nonresponse follow-up is one example of use for operational efficiency. Administrative records data could be used prior to the census to identify hard-to-enumerate areas for which special enumeration methods might be appropriate or as a basis for stratification of samples selected for nonresponse follow-up or integrated coverage measurement.

4. Measures to evaluate the census. Uses of administrative records for evaluation will depend very much on how the 2000 census methodology develops with regard to integrated coverage measurement processes that are designed as part of a one-number census. In this context, special attention would be given to the use of administrative records sources believed to offer the best coverage of hard-to-enumerate subpopulations. As in the past, administrative records are also likely to be one of the inputs to the demographic analyses that have traditionally played a role, along with postenumeration surveys, in evaluation of decennial census coverage. To the extent that administrative records are not used to improve content, they could be used to evaluate content.
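
As a rough illustration of the second level described under item 2 above (using administrative data to replace responses that are missing or that failed edit tests), the following Python sketch fills gaps in a census record from a linked administrative record and notes where each value came from. The function name, the field names, and the convention that None marks a missing or rejected response are all hypothetical; the report does not specify any such procedure, and a real system would also need the record linkage step that is assumed here to have already been done.

```python
def fill_from_admin(census_record, admin_record, fields):
    """Return (completed_record, source_by_field).

    Hypothetical convention: None marks a response that is missing
    or that failed an edit test. Census responses are kept when
    present; administrative values are used only to fill gaps.
    """
    completed = dict(census_record)
    source = {}
    for field in fields:
        if completed.get(field) is not None:
            source[field] = "census"
        elif admin_record.get(field) is not None:
            completed[field] = admin_record[field]
            source[field] = "admin"
        else:
            # Neither source has the item: leave it for follow-up.
            completed[field] = None
            source[field] = "follow-up"
    return completed, source
```

Keeping a per-field source flag is one way to let later evaluation distinguish reported values from substituted ones; whether that is needed would depend on the design eventually adopted.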
Uses of administrative records for coverage and content improvement and evaluation can occur in three phases of the census: in the conduct of the main census operations (those activities that are carried out everywhere, i.e., without any use of sampling, and in sample nonresponse follow-up); in the acquisition and use of data for an integrated coverage measurement sample; and in evaluation activities. Development of the final design for the 2000 census will require decisions about what uses of administrative records (and other special techniques) are appropriate for each of these phases. Such decisions will depend partly on the relative costs of specific procedures and partly on complex technical considerations. Especially important among the latter are the nature of the sample designs and estimation procedures to be used for integrated coverage measurement and the potential effects of lack of independence on the estimation procedures.

Successful implementation of the Census Bureau's plan for construction of an administrative records database for the 1995 census test sites is a necessary prerequisite for obtaining the information needed to make well-informed design decisions about uses of administrative records in the 2000 census. In addition, significant resources will be needed to undertake simulation studies and other
kinds of analyses of the role administrative records may play in the 2000 census program and elsewhere. We hope that the Census Bureau will have the will and the resources to take full advantage of this unique opportunity.

As noted earlier in this chapter, new health record systems that are expected to emerge over the next several years will be automated and are likely to have close to universal coverage. It is unlikely that such systems will exist for all states in time for significant general use in the 2000 census. However, it is expected that some states will have their reformed health care systems in place fairly soon, presenting opportunities to explore the characteristics and potential uses of health care enrollment records for demographic data programs. One possibility would be to simulate an administrative records census in one or more of these states in 2000.

Recommendation 5.9: In maintaining and updating its Administrative Records Information System, the Census Bureau should give high priority to the acquisition of detailed information about record systems that are being developed to support health care reform at the state level. The Census Bureau should seek early opportunities to obtain and use health enrollment records in one or more states and should plan for experimental uses of these records as part of the 2000 census.

USE OF ADMINISTRATIVE RECORDS IN OTHER DEMOGRAPHIC PROGRAMS

Administrative records already play a major role in various demographic programs of the federal statistical system. In fact, the system is highly dependent on the outputs of different record systems, whether regulatory or administrative in nature. A glance at any edition of the Statistical Abstract of the United States shows the extent and wealth of information now available. Statistical information based on aggregated data has always been forthcoming as a natural by-product of administrative record reporting systems.
These aggregated data are an important ingredient in federal, state, and local information systems.

Even at the aggregate level one can distinguish between the mere summarization of data on a particular subject or program and the extension to secondary analysis or indirect measures of change in other social, economic, and demographic variables. For example, small-area population estimates for postcensal periods are heavily dependent on aggregated administrative record data that are deemed to be symptomatic of population change. Examples of uses abound at all levels of government at which programs are in place for producing current small-area population estimates carried forward from the last decennial census. Some examples: the Census Bureau uses school enrollment, housing units (building permits), vital records (births and deaths), immigration statistics, federal income tax returns and exemptions, and Medicare data in various ways in its population estimates programs; states and localities use data on utility hookups, employment (from the federal-state unemployment insurance program), and drivers' licenses. At the local level the most widely used indicators of population change are building (and demolition) permits and similar types of information (e.g., certificates of occupancy and utility connections) (Bureau of the Census, 1993g). Extensive programs exist in other areas, but we need not belabor the point.

The main purpose of the above litany is to clearly separate and distinguish such uses of aggregated administrative records from our current focus on the more intensive use of administrative records centered on individual records with the potential for matching, merging, linking, and geographic coding. It is at the micro level that the uses of administrative records discussed in this chapter are most effective for census-taking purposes and where they can have the greatest impact on current and future programs of population and other demographic estimates.

Uses in Current Population Estimates

The Census Bureau's Population Division has an extensive program of population estimates producing figures for the nation, states, counties, and places (cities, towns, and townships). These estimates represent updates from the last decennial census. Estimated population counts are provided for all levels of geography and counts for a few demographic subgroups at the higher levels only. The estimates serve many important uses, a major one being the allocation of federal and state funds. Examples of other uses are as denominators for vital rates and as benchmarks for survey estimates. Furthermore, many states also use the postcensal subnational estimates (produced by the states cooperatively with the Census Bureau) to allocate state funds to counties, townships, and incorporated places within states (Long, 1993).
The history of the development, preparation, and publication of population estimates for all places in the country and the role of administrative records in that process demonstrates how administrative records can be used to address major data needs and methodological challenges created by the demands of public policy and legislation. In the 1970s, general revenue-sharing legislation created a requirement for population and income estimates for all general-purpose units of government at a time when existing methodologies and available data sets were inadequate for such an undertaking. Fortunately, research in progress at about that time into possible statistical uses of individual income tax returns suggested the viability, validity, and credibility of their use as a primary input to estimates of population and income change. Considering the large amounts of money being allocated and distributed on the basis of the estimates, it was extremely important for the methodology to have a reasonably sound statistical basis and face validity, that elusive quality that says the system sounds right.
The IRS individual income tax records contained the necessary ingredients for a viable and sophisticated estimation process, one that permitted a separate estimate of population change through net migration, the major unknown component in the population change estimation process. The important characteristics of the records were: (1) a unique identifier, the Social Security number, that permitted matching records (within the system) over time and (2) a residential address whereby each record could be coded to an appropriate level of geography. Another important feature was the relatively high and consistent coverage: at that time about 80 to 85 percent of the population was regularly covered on tax returns, with some geographic differential. Most important, the legislation provided for Census Bureau access to the individual records (actually an extract of each record with selected information required by the Census Bureau for the estimation process) on a continuing, annual basis and permission to request periodically a modification of the basic individual tax form to obtain information on specific place of residence to supplement the address information. This latter change was needed to provide a means of geographically assigning each record to the appropriate local jurisdiction, one of some 39,000 governmental units to which funds were to be distributed. This type of accommodation through modification of administrative records to meet program needs suggests a precedent for possible modification in tax returns or other administrative records to accommodate specific statistical needs of other programs. The feasibility of a census conducted primarily through administrative records could be substantially increased by adjustments in both the content and format of the source records.
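
The two record characteristics named above (an identifier that allows records to be matched over time, and an address that can be coded to geography) are what make a migration estimate possible. The Python sketch below shows the idea in its simplest form: records for two years are matched on an identifier, and a change in geographic code is counted as a move. It is only an illustration under stated assumptions. The identifiers and county codes are made up, each person is assumed to appear at most once per year, and the actual Census Bureau procedure (which works from tax return extracts and exemptions) is not described in enough detail here to reproduce.

```python
from collections import Counter


def migration_flows(year1, year2):
    """Count moves between areas for records matched across two years.

    year1, year2: dicts mapping a (hypothetical) person identifier to
    the area code of residence in that year. Records present in only
    one year cannot be matched and are ignored.
    Returns (inflow, outflow, net), each keyed by area code.
    """
    inflow, outflow = Counter(), Counter()
    for person_id, origin in year1.items():
        destination = year2.get(person_id)
        if destination is None or destination == origin:
            continue  # unmatched record or non-mover
        outflow[origin] += 1
        inflow[destination] += 1
    areas = set(inflow) | set(outflow)
    net = {area: inflow[area] - outflow[area] for area in areas}
    return inflow, outflow, net
```

Because only matched records contribute, flows computed this way describe the population covered by the record system in both years, which is why coverage and its geographic differentials matter for the quality of the resulting estimates.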
At present the Census Bureau regularly uses micro-level administrative records in its demographic programs only to generate estimates of the total population for all areas of the country. (With the ending of revenue sharing, estimates of per capita income are no longer produced.) However, a useful by-product of the methodology is estimates of gross migration flows for geographic areas for the population covered on tax returns. These are important proxies of population migration patterns and provide states and localities with important insights into the characteristics of in- and out-migrants.

Data Enhanced Through Linkages

In recent years the Census Bureau has researched and experimented with estimation of population by age, race, and Spanish origin for selected large geographic areas by linking a 20 percent sample of the Social Security Administration's NUMIDENT file, which contains such demographic characteristics as age, sex, race, and Spanish origin, to the basic IRS record extract. Estimates of population classified by these characteristics are generated using the same methodology as for the total population. The extension of the estimation process to age, race, and Spanish origin illustrates how a significant enhancement to an
existing program can be accomplished through linkages with only a modest use of additional resources and time.

However, the 20 percent sample used for assigning these characteristics imposes a lower limit on the size of areas for which sufficiently reliable estimates can be produced. Consideration might be given to expanding the sample or obtaining a complete file to permit more detailed estimation. In addition, although methodologically there are no significant problems in generating such estimates, there has been significant erosion of the completeness and quality of the race/ethnicity data in the Social Security file. Under the recently initiated enumeration at birth program, Social Security numbers and birth certificates are now being issued simultaneously for nearly all newborn infants. The Social Security Administration continues to capture the age and sex information for inclusion in the NUMIDENT file, but the standard race and ethnicity items from the Social Security number application form are not being asked. Federal and state laws and policies prevent the Social Security Administration from obtaining the birth certificate information on the race and ethnicity of the mother.

If no action is taken to resolve this problem, the proportion of people in the NUMIDENT file lacking race and ethnic data will continue to grow, with negative consequences not only for uses of the NUMIDENT file by the Census Bureau but also for the Social Security Administration's ability to determine how its own programs affect different racial and ethnic groups. Until there is wider recognition of the benefits of using administrative records for statistical uses beyond those directly related to the administrative programs they serve, statistical programs that depend heavily on administrative records will continue to be at risk to changes beyond their control.
The foregoing illustrates only one type of expansion in program output through file linkage. Recent research by the Internal Revenue Service also suggests that significant improvement in population estimates may be achieved by merging and unduplicating files that now exist as part of the income tax collection system. As mentioned above, the basic 1040 individual income tax return file extract now used to generate population estimates covers somewhat less than 90 percent of the population, with considerable variance by state and county. Increasing overall coverage would have a significant positive effect on the accuracy of the estimate. Internal Revenue Service research with informational documents (Forms 1099, W-2, etc.) suggests that matching, unduplicating, and merging these records with the individual tax return files could raise coverage to the high 90s (98 percent according to one study) with concomitant increases in coverage for all geographic areas and reduction in geographic differentials. The first stage of the research has been fully documented and evaluated. As noted in Recommendation 5.5, the panel supports and encourages additional research aimed at resolving a number of technical and administrative issues involved in matching, unduplicating, and merging the various files so that the Census
Bureau's current population (and eventually income) estimates could be based on data covering a substantially greater proportion of the total U.S. population.

Estimates of Income and Poverty

For the past several years the Census Bureau has been considering ways to provide income and poverty estimates for counties, cities, and places. Although still in the research stage, the methodology would depend heavily on the availability of and access to data from the IRS Individual Master File, the same file extract the Census Bureau now uses to prepare its population estimates (described above). Another set of files that would be used in methods development consists of those that can be created by linking the Individual Master File extract with sample survey data from the March Income Supplement of the Current Population Survey and the Survey of Income and Program Participation. Linking the sample survey individual and household records to the specific tax return would bring together the information on tax returns with the detailed demographic and socioeconomic data collected in those surveys for a substantial sample of the population. These enriched files for a sample of the population (essentially Census Bureau data and tax return data) would provide a basis for modeling small-area estimates (Bureau of the Census, 1993c).

New impetus for the development of small-area income and poverty estimates would be provided by the enactment of proposed legislation, the Poverty Data Improvement Act of 1993, calling on the Census Bureau to provide estimates of the number of children in poverty for states, counties, local units of general purpose government, and school districts.
The legislation does not specifically call for access to additional administrative records (access to the IRS files mentioned above is assumed), but the Census Bureau's prospects of preparing reliable estimates for the poverty universe would certainly improve if access to other administrative records were possible. Specifically, access to files of programs directed toward people and families at the lowest end of the income scale, e.g., Aid to Families With Dependent Children and Food Stamps, would be an important addition to the administrative record armamentarium useful for small-area poverty estimation. Poverty estimates based on a combination of IRS and Aid to Families With Dependent Children (or Food Stamp) files may provide more realistic and credible estimates than those based solely on IRS files linked to national sample survey data.

Use of Administrative Records in Surveys: The Survey of Income and Program Participation

The Survey of Income and Program Participation (SIPP) is a Census Bureau panel survey that provides detailed information on the economic situation of people and families in the United States and how public transfer and tax programs
affect their financial circumstances. SIPP was designed to obtain improved and more comprehensive information on the distribution of household and personal income in the United States. Hitherto, the main source of such information, the March Income Supplement of the Current Population Survey, had limitations that could be overcome only by making substantial changes in the survey instrument and procedures (Bowie and Kasprzyk, 1987). From the beginning, the SIPP was conceived as an instrument that would combine household survey data and administrative records through linkages based on Social Security numbers obtained from survey respondents. Some of the major goals anticipated for the use of administrative records in SIPP included:

1. A supplemental sampling frame to increase the reliability of estimates for selected subgroups (e.g., old age, survivors, and disability insurance recipients, supplemental security income recipients);
2. A means of evaluating the quality of the survey data by comparing items collected in the survey with comparable items available from administrative records; and
3. A supplemental source for items difficult to obtain by a survey (e.g., earning and program benefit histories).

Other potential gains include data enhancements through linking demographic data from the survey with economic data sets from establishment and enterprise reporting in the economic census and other data files maintained by the Census Bureau (Herriot et al., 1989a).
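
The second goal in the list above, checking survey items against comparable administrative items, can be pictured with a short Python sketch. It links two sets of records on a shared identifier and reports how many survey records found a partner and how far, on average, the survey value differs from the administrative value for one item. Everything specific in it is an assumption made for illustration: the function name, the dictionary layout, the item name, and the choice of a simple mean difference as the summary. It is not a description of how SIPP evaluations were actually carried out.

```python
def compare_linked_item(survey, admin, item):
    """Compare one numeric item across linked survey and admin records.

    survey, admin: dicts mapping a (hypothetical) identifier to a dict
    of item values. Returns (link_rate, mean_difference), where the
    difference is survey value minus administrative value and is
    computed over linked records only. mean_difference is None when
    no records link.
    """
    linked_ids = [key for key in survey if key in admin]
    link_rate = len(linked_ids) / len(survey) if survey else 0.0
    if not linked_ids:
        return link_rate, None
    differences = [survey[k][item] - admin[k][item] for k in linked_ids]
    return link_rate, sum(differences) / len(differences)
```

A link rate well below 1 would itself be informative, since records that fail to link (for example, because an identifier was misreported) may differ systematically from those that do.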
During the development phases of SIPP, the Income Survey Development Program experimented with a large number of administrative records sources, including the Aid to Families With Dependent Children master file maintained by the Texas State Department of Welfare, the Supplementary Security Record, the Master Beneficiary Record, the Basic Educational Opportunity Grant applicant file, the Veterans Administration's Pension and Compensation file, the Internal Revenue Service's Individual Master File, and state record files for Unemployment Insurance and Workers Compensation. A planned match to the Summary Earnings File, which contains a history of covered earnings for each worker, was never implemented.

The SIPP program has a continuing commitment to the use of administrative records for statistical purposes, and the ability to match survey and records information with a minimum of error is of prime importance. The Social Security number, a unique key identifier, is collected and its quality benefits from the fact that special measures are taken by the Census Bureau to ensure that each number reported in SIPP is complete and valid. The wealth of other data (last name, first name, house number, street name, apartment designation, city, zip code, state, and date of birth) adds to the quality of any matching and linking of SIPP to other administrative records. In summary, SIPP is an example of a survey program designed to take advantage of administrative records to provide the highest
quality, comprehensive data on a specific subject. In addition, the potential exists for developing a substantially enhanced database by supplementing costly survey data with information already existing in administrative records systems. A recent Committee on National Statistics panel report on the Survey of Income and Program Participation (Citro and Kalton, 1993:90) said "we strongly support an increased role for administrative records in the SIPP program. However, there are many operational and technical problems, in addition to concerns about confidentiality, that impede ready use .... Nonetheless, we urge the Census Bureau to seek innovative ways for SIPP to benefit from the extensive information that is available on income and programs from administrative record sources." This last suggestion, that the Census Bureau seek innovative ways for SIPP to benefit from administrative records, need not be limited to SIPP. The postcensal demographic/economics estimates program activities should also be encouraged to continue research into the use of administrative records files for program expansion and enhancement. Indeed, the federal statistical system in general could benefit by devoting some effort to looking for innovative ways to take advantage of information already available from administrative records.

Postcensal Estimates: State Programs

The foregoing discussion dealt only with the use of administrative records available at the national level across all areas of the country. There are large numbers of files maintained by and under the auspices of state authority. In at least one state, California, an administrative record file is used to generate population estimates at the substate (county) level. Data from the driver's license file maintained by the state's Department of Motor Vehicles are used to estimate gross intercounty (and interstate) migration of the population 18-64 years of age.
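Gross migration flows of this kind can be tabulated from address-change records once each record carries an origin and a destination county. The sketch below is only an illustration under assumed inputs: the field names `from_county` and `to_county` and the function name are hypothetical, and the code does not represent the California agencies' actual processing.

```python
from collections import Counter


def gross_moves_by_county(address_changes):
    """Tabulate gross in-moves, out-moves, and net moves for each county.

    Each address change is a dict with hypothetical keys 'from_county'
    and 'to_county'. Moves within a single county are ignored because
    they do not represent intercounty migration.
    """
    in_moves, out_moves = Counter(), Counter()
    for change in address_changes:
        origin, destination = change["from_county"], change["to_county"]
        if origin == destination:
            continue
        out_moves[origin] += 1
        in_moves[destination] += 1

    counties = sorted(set(in_moves) | set(out_moves))
    return {
        county: {
            "in": in_moves[county],
            "out": out_moves[county],
            "net": in_moves[county] - out_moves[county],
        }
        for county in counties
    }
```

Only the aggregated table, not the individual records, would need to leave the program agency under an arrangement of this type.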
The basic ingredient in the estimation process is the availability of driver's license change-of-address forms maintained by the Department of Motor Vehicles. Holders of driver's licenses are required to report any change of address to the department, and there are a number of incentives to do so within a reasonable time. These change-of-address forms are processed by the department and given to the Demographic Research Unit, California Department of Finance, which is charged with the responsibility of preparing postcensal population estimates. The change-of-address forms are coded to counties, and aggregated data showing gross in and out moves by county are provided to the Demographic Research Unit for its population estimates program (Hoag, 1984). In this particular application, the statistical agency (Demographic Research Unit) does not obtain the actual micro-record but depends on the program agency, the Department of Motor Vehicles, to provide a special tabulation of its files. This model of interagency cooperation is instructive in that it permits program agencies that may be reluctant to transfer, or legally prohibited from transferring, individually identifiable administrative records to other agencies to still expand their use
by carrying through special operations or tabulations to meet other agencies' statistical needs. One problem with this procedure is the total dependence on the originating agency to provide the material in a timely fashion. There may be legitimate reasons for delays and interruptions to occur. Another procedure would be for the originating agency to provide the basic file (extract or mini-record with confidentiality protection) on a regular basis and leave the processing to the receiving agency. But this may involve a number of legal and technical constraints that would have to be addressed. The California program illustrates an important use of state administrative records for population estimates that could be emulated by other states depending on the nature and scope of their own driver's license files.

Canada's Use of Administrative Records

Statistics Canada has been doing extensive research into the potential use of a variety of administrative records for small-area estimation and at present makes intensive use of the personal income tax records, i.e., Revenue Canada Taxation files (corresponding to our own Internal Revenue Service Individual Income Tax Returns) for this purpose. Statistics Canada had also been doing research on linking various files, on a sample basis, to provide an appreciably enhanced individual tax record for improved population coverage and expanded characteristic variables. However, this research has been set aside for the time being, primarily because changes to the tax system that will cause more low-income Canadians to file returns are expected to make it easier to address the main coverage issues without embarking on further record linkages with their attendant privacy concerns. Canada's program of postcensal population estimates is similar to that of the United States but somewhat more elaborate in the characteristic detail available.
Estimates of total population by age group and sex are prepared for geographic areas down to the census division (or county) level; data on age, sex, marital status, and number of families by type are provided for provinces and territories (Statistics Canada, 1987). The main administrative record file used to generate population estimates is the Revenue Canada taxation file. The Tax Act provides for the transmission of copies of records held by Revenue Canada to Statistics Canada to meet the needs of the latter's estimation program. (Such legal and continuous access to the file is a basic underpinning of any postcensal estimate program dependent on administrative records.) The methodology for measuring net and interarea migration is essentially the same as that for the United States. Individual tax records are matched for successive periods using Social Insurance Number as the main matching key. Migration of tax filers and their dependents to and from geographic areas is determined directly from the addresses on the tax file. The dependents are enumerated from information reported by the tax filers: exemptions for
dependent spouse, exemptions for dependent children, claims for refundable tax credits for children, reporting of child care expenses, and the receipt of family allowance benefits. These imputations are made while creating the T1FF (T1 Family File, see description below) and are taken directly from it. Finally, an adjustment is made to the interarea migration estimates to benchmark them to an estimate of migration for the total population. Another administrative records file used extensively in the estimates program, at least until recently, was the family allowance program file, administered by Health and Welfare Canada. This program provided monthly payments to every eligible child under age 18 (with certain limitations). Until very recently, the program file covered essentially all children under age 15, and well over 90 percent of those ages 15-17. The main use of the file was to measure net interarea migration of the population under age 18 (the adult population was not covered) through reporting of change of address. To continue receiving their family allowance checks, recipients had to notify the regional office of Health and Welfare Canada of any change of address. These notifications, which were assumed to be fairly accurate and comprehensive, became the vehicle for measuring net interarea migration for the universe coverage. Family allowance files were also used as symptomatic indicators in a regression model to generate preliminary estimates of total population for census divisions. Recent legislation on the New Child Tax Benefit System, which took effect on January 1, 1993, eliminated both the Family Allowance Program and the Child Tax Credit program. Extensive research is under way on the use of the new system as an input to migration estimation.
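The tax-file approach to migration described earlier in this section (matching successive years on an identifier, reading moves from the change in address, counting dependents with the filer, and benchmarking the result) can be sketched roughly as follows. The record layout, the function names, the choice to count dependents from the later year's record, and the proportional scaling used for the benchmark step are all assumptions made for illustration; the text does not specify how Statistics Canada performs the adjustment.

```python
def migration_flows(year1, year2):
    """Count persons moving between areas across two annual files.

    year1 and year2 map a filer identifier to a record with the
    hypothetical keys 'area' and 'dependents'. A filer matched in both
    years whose area differs is counted as a mover, together with the
    dependents reported in the later year (an assumption of this sketch).
    """
    flows = {}
    for filer_id, earlier in year1.items():
        later = year2.get(filer_id)
        if later is None or later["area"] == earlier["area"]:
            continue  # unmatched filers and non-movers contribute nothing
        key = (earlier["area"], later["area"])
        flows[key] = flows.get(key, 0) + 1 + later["dependents"]
    return flows


def benchmark_flows(flows, control_total):
    """Scale all flows proportionally so they sum to a control total.

    Proportional scaling is an assumed stand-in for the benchmarking
    adjustment; other adjustment methods are possible.
    """
    total = sum(flows.values())
    if total == 0:
        return dict(flows)
    factor = control_total / total
    return {key: count * factor for key, count in flows.items()}
```

Because unmatched filers drop out of the tabulation, the raw flows understate migration for the total population, which is one reason a benchmarking step of some kind is needed.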
An important lesson is that a program that depends on administrative records is always at risk of new legislation and other program changes that can affect consistency with the past, adequacy of coverage, and continued relevance of the new file to statistical program needs. The income tax system file is the backbone and main underpinning of Statistics Canada's administrative record work. Some other research projects under way and proposals for expanding uses of the basic individual income tax record file have included: (1) Expanding the individual tax record file to a tax filer family file, thereby creating families from the individual tax file. This is accomplished by a six-step process of matching and imputation using information from each record within the tax file system, such as Social Insurance number, postal code, and surname, to name a few. (2) A proposed pilot study to compile an administrative record consolidation file by linking a number of records to the individual tax record or the tax filer family file. The files to be linked (20 percent sample) included Old Age Security, Social Assistance, Unemployment Insurance, and Family Allowance. The linking might improve population coverage as well as expand and improve characteristics variables. However, the project is not being pursued because the statistical benefits were not seen to outweigh privacy concerns. (3) Development of a longitudinal administrative database. In early 1988, discussions were undertaken to assess the conceptual feasibility of building
a longitudinal database from administrative records. Plans were developed for a pilot study to determine whether it would be feasible to use administrative records to build a longitudinal database for social research and policy analysis. Initial plans called for use of a 10 percent sample created from tax filer families plus a 10 percent sample of Social Assistance recipients in two provinces, Quebec and Nova Scotia, that were not on the tax filer database. The longitudinal administrative database had been created for a 10 percent sample of those two sources for a 5-year period, 1982-1986. However, again because of privacy concerns about linking files from more than one source, this project was shelved in favor of a 1 percent longitudinal administrative database created entirely from the tax file with no other linkages (Leyes, 1990). In summary, the Canadian experience illustrates that the unidimensional use of administrative records is extremely useful for postcensal estimates programs. The further potential benefits that can be achieved by matching, merging, and linking files of administrative records have to be weighed against privacy concerns associated with such activities and can be realized only if the privacy issues are satisfactorily resolved and public concerns alleviated.

Matching and Informed Consent in Canada

As stated, the value of administrative records for programmatic purposes can be increased significantly if different program records containing supplementary and complementary data can be linked to form larger databases. At the moment, such linkages are cause for concern by those who perceive such actions by the government as an invasion of privacy. As noted, linkage to administrative records can also be used to improve data quality and reduce respondent burden in censuses and surveys.
In this context, Statistics Canada is considering the feasibility of obtaining income information by linking Revenue Canada tax records with records of respondents to Statistics Canada income surveys (Greenberg, 1993). As part of this investigation, a question about permission was asked of a subsample of respondents to the August 1993 Labor Force Survey. Respondents were asked whether they would give permission for Statistics Canada to get their income information directly from Revenue Canada, if they were asked to participate in a Statistics Canada income survey. The results showed that 55 to 62 percent of respondents would be willing to allow access to their files under the stated conditions. The permission rate varied little by geographic area or demographic group. An analysis of the results suggested that there would be benefits if a mixture of interviewing and linkages were used, particularly since many nonrespondents (to the Survey of Consumer Finances) said they would give permission to access their income tax records (Greenberg, 1993). More research is warranted into this area of permission and informed consent for linking individual records.
Summary

In this section, we have identified several new ways in which administrative records might be used for statistical purposes not directly related to the decennial census. A vigorous effort to explore and develop some of these uses would serve two important purposes. First, such an effort would bring knowledge and experience that would greatly assist the Census Bureau in future decisions about a greater role for administrative records in the decennial census. Second, it could add substantial value to the Census Bureau's current demographic data programs by providing data more frequently and with greater geographic and subject matter detail. These two purposes were described in a convincing manner by the Panel on Census Requirements in the Year 2000 and Beyond (Committee on National Statistics, 1993a) in its interim report. Concerning the first goal, the panel said (pp. 25-26):

We recommend (see above) that an important first step in examining administrative records is to begin working with them now. If administrative records are to have an expanded use in the decennial census, then there is an urgent need to start to exploit them more heavily for intercensal estimates.... Although the Bureau of the Census has been using administrative records for years, their expanded use for intercensal estimates would provide the necessary experience that is needed for assessing their potential for the decennial census.

Concerning the second goal, the panel's view was expressed (p. 26):

Another rationale exists for using administrative records for intercensal estimates. Census data are available only every 10 years.... On average ... U.S. small-area estimates are approximately 8 years old over the decade of their use [taking account of the 3-year lag between the census date and initial availability].
Administrative records have the potential to provide much more frequent information for small geographic areas, on important variables such as population and housing counts, poverty, and income.

In the final section of this chapter, this panel advocates adoption, by the Census Bureau, of a proactive policy for increased statistical use of administrative records. Uses not directly related to the decennial census should be an important component of that effort.

Recommendation 5.10: The Census Bureau should substantially increase the scope of its efforts to use administrative records to produce intercensal small-area tabulations, either through stand-alone tabulations of data from one or more administrative record sources or by combining such data with data from current surveys.

One step that could bring immediate benefits would be to begin supplementing the individual tax return extract data, which the Internal Revenue Service has
been providing annually to the Census Bureau, with data from informational documents submitted to the Internal Revenue Service. Although tabulations of merged files of tax returns and informational documents prepared by the Internal Revenue Service would be of some value, transmission of extract files to the Census Bureau would be preferable, for the reasons described above. Important benefits would also result from any steps that can be taken to add race and ethnic information to the Social Security Administration's NUMIDENT file for all newborn infants and for other persons for whom it is missing or incomplete. As opportunities present themselves, new health care enrollment records should be brought into the picture. The timetable for development of standardized national health care enrollment records is uncertain at this writing, but it is quite possible that some states will lead the way by developing their own automated health care information systems. Experience working with state record systems will give the Census Bureau an opportunity to judge their suitability for statistical uses and to develop recommendations for national standards that will facilitate statistical uses.

SUMMARY AND CONCLUSIONS

The mission of statistical agencies is to meet the information needs of government and society as effectively and efficiently as possible, using all available sources of information. The primary sources of statistical information about people are censuses, statistical surveys, and a large and diverse set of administrative record systems that have been created primarily for nonstatistical purposes. There are several ways in which administrative records have been or might be used to help satisfy needs for small-area demographic data:
1. As an adjunct to a conventional decennial census of population, to improve coverage, reduce costs, make collection operations more efficient, and evaluate census coverage and quality.
2. As the primary source of census information, to be supplemented by other sources of data only to the extent necessary.
3. To produce stand-alone tabulations of small-area data from a single administrative records system or a combination of systems, as discussed by the Census Requirements Panel in Appendix B of its interim report and in the preceding section of this chapter.
4. As inputs to a system of current population estimates designed to provide periodic counts of people classified by age, sex, and race/ethnic status for geographic units defined in as much detail as possible. To the extent that the necessary data are available from administrative records, other variables, like income, might be included.
5. As part of a continuous measurement system of surveys designed to provide estimates of census long-form data on a continuing basis for areas down to the tract or block-group level. Small-area counts based on administrative
records could be used in the sample design and estimation process to reduce the sample sizes needed for the surveys to provide small-area data at acceptable levels of reliability.

For the first category, using administrative records as an adjunct to a conventional census, we have identified and discussed several possible uses in the 2000 census and plans for testing and evaluating them in the 1995 census test. The second possibility, a census based primarily on administrative records, has been realized in a few western European countries, but is clearly not feasible for the 2000 census in the United States. We believe, however, that it should be seriously considered as an option for the 2010 census and that the Census Bureau should continue to explore and evaluate it in a systematic way. In this context, the Census Bureau should give special attention to new health care records that may be developed to implement the reform of the health care system in the United States. However, simply shifting from a twentieth century conventional decennial census to a twenty-first century model based on administrative records is not the only possible and perhaps not the most promising paradigm for counting people in the information age. It is unlikely that an administrative records census could provide all of the content items that have been included on the long-form questionnaire in recent censuses. Alternative paradigms should be identified and evaluated. One long-range goal might be to establish an integrated demographic data system consisting of three elements:
1. Annual small-area population counts based entirely on administrative records. People would be classified by age, sex, and the official race/ethnic categories.
2. Continuous measurement surveys to provide long-form data for areas down to the census tract, block group, and school district levels. Data would be available annually for areas with more than 250,000 population and moving multiyear averages would be provided for smaller areas.
3. Integrated coverage measurement surveys, similar to those that have been proposed for the 2000 census, at least once every 10 years, possibly more often. The results of these surveys would be used to adjust the annual population counts and the estimates based on the continuous measurement surveys.

The development of such a system is a pipe dream only if we limit ourselves to looking for reasons why it can't be done, rather than asking ourselves what steps would be necessary to accomplish it. To start moving in the direction of such a goal, it may be desirable to give increased attention and priority, for the time being, to uses of administrative records in the third, fourth, and fifth categories enumerated at the beginning of this section, all of which aim at enhancing the scope and content of intercensal demographic data programs. The potential benefits for current data programs are substantial, and such initiatives would provide much-needed experience in using
existing and new administrative records systems for statistical purposes. The costs for uses of administrative records in categories 3 and 4 would be relatively low compared with the cost of a decennial census or a continuous measurement system with a large survey component. To the extent that continuous measurement and other programs are successful in better meeting user needs for small-area data and providing data more frequently, the pressures for long-form data on the decennial census should abate, thereby opening up the possibility for a relatively smooth transition to the kind of integrated demographic data system we have just outlined. We believe it is a mistake to think of administrative records only in terms of how they might be used to replace a traditional version of the decennial census, duplicating all of its major design features. This kind of thinking can be a straitjacket that inhibits creativity about how to develop an integrated demographic data program that makes effective use of all available sources of data, taking into account relevance to user needs, cost, accuracy, frequency, and timeliness. It is somewhat akin to the frequently observed phenomenon that new technologies are often used at first only to replicate the uses of the old technologies. We have discussed some major federal and state administrative records systems, existing and in prospect, that appear to have the greatest potential for demographic statistical uses. Internal Revenue Service and Social Security Administration record systems have been used for statistical purposes for many years, but now have substantial potential for enhanced use, as demonstrated by recent research on the coverage of merged files of tax returns and informational documents.
The movement for health care reform brings the prospect of a new national system of records that may have close to universal coverage and may include some of the basic variables, especially race and ethnicity, that are not adequately represented in other systems of administrative records. Reforms to welfare programs may lead to better coverage, in the associated record systems, of people who tend to be more difficult to enumerate by conventional methods. Finally, the continuously updated Master Address File/TIGER system, to which the Census Bureau is now committed, will provide a hitherto unavailable capacity for using administrative data from different sources and assigning units to their correct geographic locations. The future holds attractive prospects for using administrative records as the keystone in developing a greatly improved small-area demographic data system that can provide data more frequently at no increase and possibly a significant reduction in costs over the decade. However, these prospects can only be realized if the Census Bureau, with support from the Office of Management and Budget's Statistical Policy Office, other federal agencies, and the Congress, adopts a proactive policy to explore expanded uses of administrative records. To maximize the likelihood of success, a proactive policy should include the following elements:
· A determination, not just to make use of existing administrative records
systems, but to play an active part, in cooperation with program agencies, in the development of new systems and in modification of existing systems to improve their utility for statistical uses.
· A suitable organizational unit and adequate resources for research and development activities not tied directly to ongoing census and survey programs. However, some of the exploratory research should be directed at existing programs, especially at the use of administrative records to improve current population estimates and small-area data for use in funds allocation. Hands-on experience is an essential part of learning to use administrative records effectively. Resources devoted to this effort should support work done by contractors and census fellows or under other external arrangements as well as in-house research and development activities.
· A determination to take advantage of every possible occasion to explore the long-range as well as immediate potential for using administrative records. For example, the 1995 census test should be seen as an opportunity, not only to test uses that are being considered for the 2000 census, but also to acquire information about administrative records that will be of value for uses in current population estimate programs or censuses after 2000. Similarly, forward-looking experiments with the use of administrative records should be part of the research program associated with the 2000 census.
· Access to a national integrated, continuously updated MAF/TIGER system.
· Full and continuing awareness of the concerns of individuals whose information is contained in administrative records that are being used for statistical purposes and recognition of the importance of their views about what kinds of uses are acceptable.
Since its beginnings, the Census Bureau has been in the forefront of many important advances in information technology, including the development and application of sampling theory in censuses and surveys, automation of data collection and processing operations, development of response error models, and application of tools from cognitive psychology and social anthropology to understand and improve data collection procedures. A proactive policy to develop enhanced uses of administrative records would be in keeping with the Census Bureau's tradition of innovation and adaptation to the technical and social environment in which it carries out its mission as fact-finder for the nation.

Recommendation 5.11: The panel urges the Census Bureau to adopt a proactive policy to expand its uses of administrative records, and it urges other executive branch agencies and Congress to give their support to such a policy.

Any proactive policy has some risks associated with it, but the panel believes that the risks are justified by the potential benefits. Clinging to the twentieth century census model in a twenty-first century data collection environment could well prove to be even more risky.