The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

Reengineering the 2010 Census: Risks and Challenges

CHAPTER 4
American Community Survey

THE CONSTITUTIONAL MANDATE for the decennial census is to provide a basic head count for purposes of apportionment, but the nation’s need for accurate measures of its civic health has led the census to develop well beyond a simple count tabulated by age, race, and sex. Over time, the roster of questions included in the census expanded to cover a wide array of socioeconomic and demographic characteristics. The emergence of the statistical theory of survey sampling in the early 20th century brought with it the potential to collect detailed characteristics information without unduly burdening the entire American public. The practice of asking detailed characteristics questions of only a sample of the populace began in the 1940 census, when six questions on socioeconomic status were asked of only 5 percent of respondents. In 1960, the concept took its next evolutionary step when two separate census forms began to be used, a design feature that continued through the 2000 census. The short form covers the basic information items to be asked of all residents; the long form, administered only to a sample of the public, includes the complete battery of characteristics questions. In 2000, for example, the short form contained queries for six basic items (age, sex, Hispanic origin, race, relationship to census respondent, and housing tenure as renter or owner), while the long form administered to roughly one-sixth of households added about 62 items, 36 of them pertinent to demographic and economic characteristics and 26 related to housing.

Fifty years after the development of separate short and long forms in the census, the Census Bureau proposes to make another change in the collection of population characteristics data by introducing the American Community Survey (ACS). A major household survey intended to include 250,000 housing units each month, the ACS would replace the decennial census long-form sample and permit continuous measurement of the same data items currently collected only every 10 years on the census long form. The 2010 census would therefore include only the short form, which would enable easier (and potentially more accurate) data collection in the census and save costs on data capture from completed paper questionnaires. At the same time, the data on characteristics currently collected on the census long form would be produced on a more timely basis, offering annual assessments rather than a static once-a-decade snapshot.

The potential rewards of the ACS are great, but so too are its inherent risks. The survey’s success is contingent on sustained long-term funding, and year-to-year fluctuations in allocated spending levels could cause severe data quality problems, particularly for small population groups. Estimation based on continuous measurement such as the ACS, most likely making use of moving averages of several years of data, also raises conceptual and feasibility issues that must be addressed in order for the survey to win support. These risks, and others, are significant, but perhaps the most important risk associated with the ACS is simply one of timing. A final decision on the methodology for the 2000 census was reached dangerously close to Census Day; extended delay in reaching agreement at all levels (the Census Bureau, the administration, and Congress) about the role of the ACS could similarly raise the risk of having to revamp census design very late in the cycle. The decision on whether the ACS will proceed in full, and with it the fate of the census long form, is the single most important element in defining the general shape, structure, and design of the 2010 census.

In this chapter, we discuss the background of the ACS and describe the current plans for a fully operating ACS in Section 4–A. We then begin our assessment by identifying key questions (4–B); these major questions generally center around the challenges of estimation using the ACS (4–C) and the basic quality of ACS data (4–D). Our general summary and assessment of the ACS’ proposed role in the 2010 census (4–E) is followed by an outline of major features required in the intensive research and evaluation effort that should complement ACS operations (4–F).

4–A BACKGROUND AND CURRENT PLANS

Work on what is now known as the American Community Survey commenced after Alexander (1993) revisited the idea of a continuous measurement survey for gathering long-form data as a complement to a short-form-only census (for historical context, see also Hauser, 1942; Kish, 1981, 1990). Two previous National Research Council panels supported the general principle of a continuous measurement survey and urged further research (National Research Council, 1994, 1995); however, National Research Council (1995) concluded that a proposal to implement the survey to replace the census long form in the 2000 census was infeasible, given inadequate lead time and unresolved conceptual problems. The ACS was also the focus of a 1998 National Research Council workshop to discuss research priorities (National Research Council, 2001b).

4–A.1 Test Sites and the Census 2000 Supplementary Survey

Though the ACS was ruled out as a replacement for the long form in 2000, the mid-1990s burst of research and writing about the prospects of continuous measurement launched a wider research and evaluation effort. Pilot data collection for the ACS began in four test sites in 1996. By 1999, data collection in this demonstration phase had grown to include thirty-one sites across thirty-six counties (U.S. Census Bureau, 2003c). During the initial pilot phase in 1996–1998, residents were sampled at a markedly higher rate (15 percent, increased to 30 percent in some communities) than is planned for the full-scale ACS.

More significantly, as the ACS began to be adopted as part of the developing 2010 census plan, an experiment was developed in conjunction with the 2000 census to address the basic question of operational feasibility: whether the Census Bureau can conduct the decennial census and an ongoing survey containing the usual long-form items at the same time, both operationally and in terms of burden on respondents. Accordingly, the Census 2000 Supplementary Survey (C2SS) began in January 2000 and continued data collection through December 2000. This prototype ACS sampled from 1,203 counties and covered approximately 700,000 households over the course of the year. Data collection continued at these levels in 2001–2003. A report prepared as part of the 2000 census evaluation program concluded that operating a large continuous measurement survey in parallel with the decennial census was operationally feasible, based on the 2000 census and C2SS experience (Griffin and Obenski, 2001).

Original plans called for the ACS to begin full field implementation in 2003, a schedule that would support publication of small-area estimates in 2008. However, congressional stalemate on the budget for fiscal year 2003 delayed full implementation by at least one year.

4–A.2 Current ACS Implementation Plans

Under the funding levels appropriated for fiscal 2004, questionnaire mailing for a full-scale ACS would begin during the fourth quarter (July–September) of fiscal 2004. Follow-up field work would be deferred until after September 2004, pushing the considerable expense of field interviewing into the fiscal 2005 budget process. Prior to the fourth quarter mailing, data would continue to be collected in the thirty-one test sites and at the C2SS levels (Lowenthal, 2003a).

When the ACS is fully fielded, it will use as its sampling frame the same Master Address File (MAF) used by the decennial census. The annual sample of housing units chosen for participation in the survey will be divided into monthly mailout panels, each of which will be a systematic sample across the complete address list. Thus, it is intended that each month’s sample will be a representative sample (approximately ) of the population of each area of the United States. In practice, this simplified sample selection process will be modified by practices similar to those used for the decennial census long form, including oversampling of small geographic areas.

The ACS is intended to be administered primarily via mailout/mailback. However, the proposed ACS techniques to follow up with households that do not return the mail form differ from decennial census practice. All mail nonrespondents for whom a phone number is available will first be followed up by computer-assisted telephone interviewing (CATI) during the month following questionnaire mailout. After CATI follow-up, a random one-third of the remaining nonrespondents will be designated for follow-up by field enumerators using computer-assisted personal interviewing (CAPI). The precise nature of this sequential follow-up process remains to be determined; there are tentative plans to sample nonrespondents in areas with low mail and telephone response rates at a higher rate than the strict one-third random sample.[1] This oversampling may help to make sample variances more comparable across areas.

The stagewise nature of ACS follow-up leads to another important design feature, which is that all of the information collected in a given month will be used as inputs for that month’s estimates. That is, a particular month’s estimates may include mailback responses from the present month’s systematic sample of housing units as well as completed telephone and personal interviews for the samples mailed out one and two months earlier, respectively. This design choice is advantageous in that it simplifies data processing and production load: there is no need to wait until month t+2 for final resolution of all the housing units chosen in month t before processing responses already submitted.
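The follow-up sequence just described can be traced numerically. The following is a minimal sketch in Python, not the Census Bureau's procedure: the mail and telephone response rates are hypothetical, the function names are ours, every case assigned to CAPI is assumed to be completed, and the factor-of-three weight for CAPI cases is an assumption (the inverse of the one-third subsampling fraction) rather than a documented ACS weighting rule. The panel size of 250,000 housing units per month is the figure cited earlier in the chapter.

```python
from fractions import Fraction

# From the text: a random one-third of the nonrespondents remaining after
# CATI are designated for CAPI follow-up.
CAPI_SUBSAMPLE = Fraction(1, 3)


def follow_up_counts(panel_size, mail_rate, cati_rate):
    """Expected case counts by stage for one monthly mailout panel.

    mail_rate and cati_rate are hypothetical inputs: the share of the panel
    that mails back a form, and the share of mail nonrespondents that are
    completed by telephone.
    """
    mail = panel_size * mail_rate
    after_mail = panel_size - mail
    cati = after_mail * cati_rate
    after_cati = after_mail - cati
    capi = after_cati * CAPI_SUBSAMPLE  # assumes all assigned cases are completed
    return {
        "mail": mail,
        "cati": cati,
        "capi": capi,
        "not_followed_up": after_cati - capi,
    }


def weighted_total(counts):
    """Weight CAPI cases by the inverse subsampling fraction (an assumption)."""
    capi_weight = 1 / CAPI_SUBSAMPLE
    return counts["mail"] + counts["cati"] + counts["capi"] * capi_weight


def contributing_panels(month):
    """Mailout panels whose responses are collected in a given month index."""
    return {"mail": month, "cati": month - 1, "capi": month - 2}


counts = follow_up_counts(250_000, Fraction(1, 2), Fraction(1, 5))
print(counts)
print(weighted_total(counts))
print(contributing_panels(10))
```

Under these assumed rates the weighted total returns to the panel size, which is the purpose of an inverse-fraction weight. The sketch does not address the harder question of how to weight for unit nonresponse when cases from three different panels are pooled in one month.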
The stagewise design does, however, raise complex methodological challenges, including the choice of weighting methods to address unit nonresponse.

While the size of this survey will make possible some direct small-area estimates, the estimates for areas with a population of less than 65,000 typically will be produced by aggregating information over either 3 or 5 years, depending on the size of the area. At this time, moving averages are planned to be used for these aggregate-year estimates, though other possibilities could be considered in the future.

The need for a 5-year window to produce detailed small-area estimates puts a firm constraint on the date of full ACS deployment. The initial plans for full deployment in 2003 would have produced small-area estimates in 2008, allowing some time for the new ACS figures to gain acceptance as a long-form replacement. To match the long-form data production schedule of the 2000 census, the absolute deadline for full (and sustained) implementation of the ACS is 2007, which would permit the publishing in 2012 of national estimates analogous to those from the long form.

[1] Due to budget constraints, the Bureau may be required to reduce the sampling rate in higher mail and telephone response areas to accommodate this oversampling. The implications of such a shift need to be researched ahead of time before plans are finalized.

4–B ASSESSING THE ACS

In simplest terms, the root practical question that must be answered in justifying the ACS is whether the information generated by the survey is an adequate replacement for the data currently collected on the census long form. Parsed at the most basic and literal level (whether the ACS and the long form are substitutable in content), the answer is simple. By design, the ACS covers the same topics and data items as the census long form; exact question wording and ordering may vary, but in general terms the content matches. Thus, in the simple sense of topical content, the ACS is an obvious substitute for the long form.

The more challenging question is whether the ACS can replace the census long-form sample in terms of performance and function. This basic question can be further subdivided into key subquestions, the answers to which are vital to bolstering the case for the ACS.

For all but the largest population or geographic groups, ACS estimates will be based on averages across multiple years of data.
Is the ACS able to satisfy all of the needs currently addressed by long-form data, or are there applications based on the census long form for which substitution of a moving average-type estimate from the ACS would be inappropriate?

How well will ACS estimates match other estimates of the same phenomena? That is, how will ACS measures compare in level or trend to traditional long-form estimates or to other survey measures?

What is the quality of ACS estimates and data relative to the census long form? Specifically, what can be said about error (both bias and variance) and undercoverage in data collected through the ACS, and how do they compare with those incurred through the census long form?

These and other questions involve concerns about methods of estimation based on the ACS and about the inherent quality of ACS data and estimates; we offer more detailed comment on these concerns in the following two sections.

4–C ESTIMATION USING THE ACS

4–C.1 Adequacy of Moving Averages as Point Estimates

A basic concern about the American Community Survey as a replacement for the census long form is whether ACS estimates, which particularly for small areas or groups would be moving averages of multiple years’ data points, can effectively replace fixed-point-in-time estimates. Specifically, the concern is whether fund allocation formulas or other public and private planning needs for demographic data can be addressed using a combination of data from multiple years. The Census Bureau has issued a draft report that attempts to address users’ concerns about this shift (Alexander, 2002), and Zaslavsky and Schirm (1998, 2002) outline the advantages and disadvantages a locality may experience through use of either a moving average or a direct (census) estimate.

The crux of the debate on this point is that a moving average is a smoothed estimate; by averaging a particular data observation with other observations within a particular time window, the resulting estimate is meant to follow the general trend of the series while being less extreme than the most extreme individual points. The ramifications of this method emerge when moving average estimates are used in sensitive allocation formulas or compared against strict eligibility cutoffs. A smoothed estimate may mask an individual-year drop in level of need, thus keeping the locality eligible for benefits; conversely, it may also mask individual-year spikes in need and thus disqualify an area from benefits. It is clear that the use of smoothed estimates is neither uniformly advantageous nor uniformly disadvantageous to a locality; what is not clear is how often major discrepancies may occur in practice.

One answer to this conundrum is to use sample-based estimates from individual years instead of moving averages. These estimates would be unbiased in the statistical sense but could be highly variable, which would affect aspects of formula grants such as “hold-harmless” provisions.[2] A related worry that has been expressed about moving averages is that, by incorporating estimates from other time periods, the estimates for a given period could be substantially biased and not truly reflect the conditions for that period. The empirical challenge is to assess the bias that may result from averaging over 3 years of data compared to 5, and to weigh the magnitude of that bias against the bias associated with using a long-form estimate that may be up to 12 years old. Intuitively, it is sensible that, when examining data series in which change is substantial between decennial census years, moving average estimates would be preferable to seriously outdated estimates. When there is little change through the decade, there should be little difference between the two estimates. However, since this is an empirical question, the Census Bureau should carry out research that helps to evaluate this trade-off.
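The two-sided effect of smoothing on an eligibility cutoff can be seen in a small constructed example. The sketch below is illustrative only: the two series, the cutoff of 18, and the use of a trailing 3-year window are assumptions chosen for clarity, not ACS data and not necessarily the Census Bureau's averaging method.

```python
def trailing_average(series, window):
    """Trailing moving average; None until a full window is available."""
    averages = []
    for i in range(len(series)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(series[i + 1 - window:i + 1]) / window)
    return averages


def eligible(values, cutoff):
    """True where a value is available and meets or exceeds the cutoff."""
    return [v is not None and v >= cutoff for v in values]


CUTOFF = 18  # hypothetical eligibility threshold (a need rate in percent)

# Hypothetical locality with a one-year spike in need.
spike = [10, 10, 10, 25, 10, 10, 10]
# Hypothetical locality with a one-year drop in need.
drop = [25, 25, 25, 10, 25, 25, 25]

for name, series in (("spike", spike), ("drop", drop)):
    smoothed = trailing_average(series, 3)
    print(name, "single-year:", eligible(series, CUTOFF))
    print(name, "3-year avg: ", eligible(smoothed, CUTOFF))
```

In the spike series the single-year value qualifies in the fourth year but no 3-year average does; in the drop series the single-year value fails in the fourth year but every available average still qualifies. How often such reversals arise with real data is the empirical question raised above.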

The continuous measurement properties of the ACS give it unique advantages over the decennial snapshots available from the census long form, but they also raise another, related point of concern regarding moving averages: assessment of year-to-year change in a data series. Annual estimates based on moving averages over several years should not be compared directly to assess change, because consecutive averages draw on overlapping time periods and therefore share identical data. At the least, treating such estimates as independent will yield incorrect estimates of the variance of the estimated change. Therefore, users should be cautioned about this aspect of the use of moving averages. Along the same lines, moving averages present the same types of problems when they are used as dependent variables in various statistical models, in particular time-series models, and in some regression models. Therefore, the Census Bureau could bolster the case for the ACS and potentially help relieve users’ concerns if it produced a user’s guide that details the statistical uses for which moving averages are and are not intended, the problems they pose to users, and the means to overcome them.

[2] A “hold-harmless” provision in a funding formula is one that limits the amount by which an allocation can change from one year to another; for instance, under a 70 percent hold-harmless level, a unit’s allocation may only decrease by up to 30 percent. In a hold-harmless situation, an unusually volatile observation one year due to increased variability could mean that the unit’s allocation may remain out of true alignment for several cycles due to the amount of allocation automatically carried over.

4–C.2 Comparing ACS/C2SS to the Census Long Form

Thus far, we have outlined from conceptual and theoretical perspectives the issues surrounding the adequacy of ACS estimates to replace the long form. It is also natural to address the question from a more pragmatic point of view: the ACS and the census long form purport to measure the same basic phenomena, but do the resulting data from both series actually tell the same story?

Comparisons of how the ACS or C2SS estimates match census long-form estimates implicitly treat the census long-form data as an effective “gold standard,” a questionable assumption at best, given that it discounts the various (and sometimes substantial) sources of error to which the long form is subject. First, the long-form data for small areas are subject to substantial sampling error.
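To give a sense of scale for small-area sampling error, the sketch below computes the standard error of an estimated proportion under simple random sampling without replacement. This is a textbook approximation and an assumption on our part: the actual long-form design was not a simple random sample, and the area sizes and the 10 percent proportion are hypothetical. Only the roughly one-in-six sampling rate comes from the text.

```python
import math


def srs_proportion_se(p, n, N):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = (N - n) / (N - 1)
    return math.sqrt(p * (1 - p) / n * fpc)


P = 0.10  # hypothetical true proportion of households with some characteristic

# Hypothetical small area: 600 households, one-sixth sampled.
small_area_se = srs_proportion_se(P, n=100, N=600)
# Hypothetical large area: 60,000 households, one-sixth sampled.
large_area_se = srs_proportion_se(P, n=10_000, N=60_000)

print(f"small area: SE = {small_area_se:.4f}, relative SE = {small_area_se / P:.0%}")
print(f"large area: SE = {large_area_se:.4f}, relative SE = {large_area_se / P:.0%}")
```

At the same sampling rate, the standard error for the small area is roughly ten times that for the large area, which is why the long form is a noisy benchmark precisely where ACS comparisons matter most.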
In addition to sampling error, the long form is, as mentioned above, particularly subject to nonresponse; for some sample items the amount of nonresponse for the long form in the 2000 census was extremely high (National Research Council, 2001a, 2004).

Love (2002) has identified a number of sources of differences between the ACS (or C2SS) and long-form census estimates that complicate any direct comparison. These include differences in reference dates; modes of nonresponse follow-up; criteria used to decide if a response is acceptable; edit and imputation techniques; methods for data capture and processing; the use of proxy interviews (accepted for the decennial census but not by the ACS); definition of respondent eligibility; and weighting procedures used to address nonresponse and sampling (e.g., the weighting of the long-form estimates to the basic complete-count data). The reference period associated with a question item is of particular interest for ACS estimates, since annual averages will be the average of responses corresponding to twelve different reference periods, depending on when the questionnaire was administered. There are also differences in the target population; for example, the ACS does not currently include group quarters in its survey, but the census does.

Work on comparing the ACS (test sites) and C2SS estimates to census long-form estimates has been initiated by the Census Bureau. To date, what is known is that there are some substantial differences. Generally, these differences can be explained by the amount of sampling error in the two surveys (U.S. General Accounting Office, 2002a); however, examination of C2SS data suggests significant differences for the number of housing units lacking complete plumbing facilities and for the number of unpaid workers in a family, for instance. At the state level, a large number of C2SS estimates differed from the long-form estimates by at least 10 percent, including the number of workers that commute using public transportation, the number of households with income above $200,000, the number of housing units that lack complete plumbing facilities, and the number of renter-occupied units with gross monthly rent of $1,000 to $1,499.
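Whether a gap of 10 percent or more is notable depends on the sampling error of both estimates. The sketch below shows one conventional way to express that, assuming the two estimates come from independent samples and that their standard errors are known; the figures are hypothetical and are not C2SS or census results.

```python
import math


def relative_difference(survey_estimate, long_form_estimate):
    """Difference expressed as a share of the long-form estimate."""
    return (survey_estimate - long_form_estimate) / long_form_estimate


def z_statistic(est_a, se_a, est_b, se_b):
    """Difference between two estimates in units of its standard error,
    assuming the two samples are independent."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical state-level counts and standard errors.
survey, survey_se = 112_000, 6_000
long_form, long_form_se = 100_000, 3_000

print(f"relative difference: {relative_difference(survey, long_form):.1%}")
print(f"z statistic: {z_statistic(survey, survey_se, long_form, long_form_se):.2f}")
```

With these assumed standard errors, a 12 percent gap falls short of the usual 1.96 threshold for a two-sided test at the 5 percent level, so a large relative difference is not by itself evidence of a nonsampling discrepancy.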
The Census Bureau needs to complete this comparison analysis, including the contribution of sampling variance, for all years of data collection, and attempt to identify the sources of differences other than sampling error. A priority of this analysis should be responses related to residency, but all responses should be examined.

4–D QUALITY OF ACS ESTIMATES

The error associated with ACS data may be decomposed into sampling error (sample variance) and nonsampling error, the latter of which can be further separated into error due to nonresponse and measurement error due to various causes.

At the most basic level, sampling error in the ACS will be slightly larger than that for the long-form sample because the total ACS sample size over a 5-year period will be slightly smaller than that for the census long form. On its own, this difference is unlikely to have a substantial impact on users. However, sampling error due to initial mail and CATI nonresponse is widely variable and could be appreciable in some small areas.[3] As a result, the Census Bureau is considering raising the sampling rate for CAPI follow-up for areas with high mail and telephone nonresponse to make this source of sampling error more comparable across areas.

It should be noted as we review these issues that, generally, these concerns are generic to all surveys, including the census long form; that is, the concerns are not raised as specific flaws of the ACS. They are, nonetheless, features of the ACS that must be measured and weighed in deciding how best to use the data.

4–D.1 Estimating Nonresponse

Unit Nonresponse

One part of nonresponse in a survey program like the ACS is unit nonresponse, that is, failure to obtain questionnaires and data from households selected for inclusion in the sample. A common combined measure of unit nonresponse and survey undercoverage is the sample completeness ratio, which is the sample-weighted estimate of the population count for a certain area divided by the census count for the area. The sample completeness ratio nationally for C2SS was 90.2 percent, while the comparable figure for the 1990 long-form sample was 89.7 percent (U.S. Census Bureau, 2002b). These figures may appear close, but some care must be taken in interpreting them.
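The definition of the sample completeness ratio translates directly into a one-line calculation. The sketch below is only an illustration of that definition: the weights and the census count are hypothetical values chosen so that the ratio lands near the national C2SS figure, and the function name is ours.

```python
def sample_completeness_ratio(weights, census_count):
    """Sample-weighted population estimate divided by the census count."""
    return sum(weights) / census_count


# Hypothetical area: 2,255 sampled persons, each carrying a weight of 40,
# in an area where the census counted 100,000 people.
weights = [40.0] * 2_255
ratio = sample_completeness_ratio(weights, census_count=100_000)
print(f"{ratio:.1%}")
```

A ratio below 100 percent reflects unit nonresponse and undercoverage together; the single number cannot separate the two, which is one reason the figures call for careful interpretation.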
For example, the long form accepts proxy responses from landlords or neighbors while proxies are not permitted in the ACS or C2SS, and it is generally accepted that proxy responses are of lower

[3] See Salvo and Lobo (2002) for relevant discussion on this point.

to collecting characteristics information should funding for the full ACS not be forthcoming. These costs and benefits should be presented for review so that decisions on the ACS and its alternatives can be fully informed.

4–F TOPICS FOR FURTHER RESEARCH AND DESIGN CONSIDERATION

A substantial agenda of outstanding operational and methodological issues should be addressed in order to ensure a fully operational ACS. Some of these issues should be tackled in the near future in order to generate the maximum benefits from use of the ACS as part of an integrated framework of estimates.

In addition to the research and design issues we raise here, other issues are described in other sections of this report. In particular, reconciliation of the census and ACS definitions of what constitutes residence at a particular location deserves prompt consideration (Section 5–B.3). Likewise, the effects on response of the mode in which the ACS is administered (Section 5–D.2) merit further examination.

4–F.1 Group Quarters

The intent of the census long form is to provide information on characteristics of the entire population. This means not only the population residing in housing units but also those living in group quarters, such as college dormitories, military barracks, prisons, and medical and nursing facilities. Nonresponse to the census long form and the need to impute for nonresponse may detract somewhat from the overall reliability of census long-form data, but those data do at least allow users to make some inferences about the group quarters population. Accordingly, the complete elimination of the census long form (and the possible loss of data on the group quarters population) is an obvious concern of some census stakeholders.

In its draft operational plan, the Census Bureau has indicated that the ACS will be administered to a 2.5 percent sample each year from the Bureau’s group quarters roster (U.S. Census Bureau, 2003c). It remains to be determined how adequate this may be for monitoring this important population group, especially for small geographic areas and small demographic population groups. In Section 5–B.2, we recommend a complete reexamination of the Census Bureau’s approach to enumerating the group quarters population. Continuing research and planning to ensure that this population is adequately covered in the ACS would not only contribute to a better enumeration but also bolster the case for the ACS’ unique role relative to other federal household surveys.

4–F.2 Voluntary versus Mandatory Response

The law governing conduct of the census imposes penalties on “whoever, being over eighteen years of age, refuses or willfully neglects … to answer, to the best of his knowledge, any of the questions on any schedule submitted to him in connection with any census or survey” authorized in other parts of the census code (13 USC § 221(a)).[13] In addition, it is an offense to willfully give false answers to such censuses or surveys (13 USC § 221(b)). Accordingly, census mailings in 2000, as in previous years, prominently featured notices that “your response is required by law.”

The Census Bureau has argued that because the ACS is intended to replace the mandatory census long form it should be conducted on the same mandatory basis as the census. The General Accounting Office has concurred that the Bureau has statutory authority to conduct the ACS and to require responses (U.S. General Accounting Office, 2002b). The distinction between voluntary and mandatory completion is significant because it is believed that the words “required by law” on the census forms are effective in raising response rates. However, early congressional discussion of the nature and content of the ACS led individual members of Congress to suggest that the ACS be conducted on a voluntary basis.
Accordingly, the Census Bureau conducted part of the 2003 Supplementary Survey (the prototype ACS) on a voluntary basis; this test included replacing the phrase “required by law” with a more generic appeal (U.S. Census Bureau, 2003c). (Rather than alter the instruments and scripts used in telephone or personal visit follow-up on a case-by-case basis, the Census Bureau conducted both types of follow-up using the voluntary participation language.) The response rates, including item nonresponse rates, on the voluntary surveys were compared with those obtained one year earlier in the 2002 Supplementary Survey.

Preliminary test results were publicly released by the Census Bureau in December 2003 (U.S. Census Bureau, 2003f); a fuller report and analysis are indicated as pending. The Bureau found that mail response dropped by over 20 percentage points when response was changed from mandatory to voluntary; based on the decline, the Bureau projects that a voluntary ACS would increase the annual cost of the survey by at least $59.2 million (U.S. Census Bureau, 2003f:vi). The Bureau also found evidence that participation in a voluntary-response survey was worse in areas that had low mail response to the 2000 census, leading the Bureau to conclude that voluntary methods might “compromise [its] ability to produce reliable data for these areas and for small population groups such as Blacks, Hispanics, Asians, American Indians, and Alaska Natives” (U.S. Census Bureau, 2003f:vi). Among respondents who did fill out the questionnaire, the survey results indicated that the voluntary designation did not degrade responses to individual items; voluntary and mandatory methods generally resulted in comparable levels of item nonresponse.

The mandatory versus voluntary distinction is an important one to resolve. The Census Bureau should continue work to assess the impact on nonresponse follow-up costs of the change (likely a decrease) in mail response if the full ACS is labeled voluntary rather than mandatory.

[13] However, the census code does provide that respondents cannot be compelled to disclose their religious beliefs or affiliation (13 USC § 221(c)).
4–F.3 ACS as Both a Census Process and a Federal Survey
When fully implemented, the ACS will occupy a unique niche among the statistical data series collected by the federal government. Because it is intended to replace the census long form, the ACS should properly be viewed as a parallel component of the census process. It will be charged with producing the small-area and small-demographic-group data required for many legal and regulatory purposes and used in many research applications. In its sweep, the ACS will require development of a technical infrastructure on par with that for the decennial census itself (see Chapter 6). Pending resolution of the debate described in the previous section, it may also bear the notice that responses to the survey are required under the law. All that said, the ACS could also be properly viewed as one of many surveys fielded by the federal government on a number of topics.
This dual role of the ACS—census component and federal survey—raises concerns that will require attention in coming years. Primary among these concerns is the substantive overlap between the ACS and other federal surveys such as the Survey of Income and Program Participation, the American Housing Survey, and—especially—the Current Population Survey (CPS). As the Census Bureau works with Congress to secure ACS funding, the panel recognizes that it is virtually inevitable that the question will be asked as to whether other surveys might be cut back or eliminated to help pay for the ACS (or vice versa). In our assessment, a fully operational ACS is not immediately exchangeable with other surveys. For instance, as the potential basis for an estimate of the poverty rate, the ACS has the advantage of larger sample size but does not cover socioeconomic and poverty-specific questions with the same depth as the CPS. The CPS has the further advantage of years of experience in soliciting detailed economic information; face-to-face interviewers acquire fuller knowledge of the survey content area and may be able to assist CPS respondents in interpreting survey questions in ways that the broader-focus ACS interviewers may not be able to match. It is decidedly premature to offer any sort of guidance on whether the ACS or another federal survey should be preferred in given situations.
The panel suggests further evaluation and exploration of relative data quality in topic areas where the ACS overlaps with other federal surveys. Research should also consider ways in which the ACS could support or supplement other federal surveys, including possibilities for using recently collected ACS characteristics data to refine the sampling frames from which other surveys are drawn (for instance, targeting surveys to low- or high-income areas).

The second major concern regarding the dual role of the ACS is how the ACS will be treated within the census hierarchy. For cost savings, the ACS and short-form-only census plans should be coordinated in order to avoid redundant effort and to “piggyback” on existing structures when possible (e.g., to perform data capture using the same optical character recognition technology and equipment). Since the panel issued its second interim report, the Census Bureau transferred ACS authority and activities from its Demographic Programs directorate to the Decennial Census directorate, the same division that plans and operates the decennial census process (see Box 2.2). The full implications of this organizational move remain to be seen. The panel suggests, however, that the Census Bureau not lose sight of the inherent sample-survey nature of the ACS. While its weighting, editing, and imputation techniques may be similar to those used in census operations (and, in particular, to past long-form implementations), they should also differ when appropriate and not be constrained to treat census and ACS returns in exactly the same manner.
It may also be useful, in the future, for the ACS to leave open the possibility for experimental components such as occasionally occur in federal surveys. These experimental components could include one-shot (or periodic) modules of questions on particular topics such as crime victimization or health care or on items of interest to a particular state or region. Experimental components might also include more general tests of proposed survey practices, such as was done in the test of voluntary versus mandatory response.
4–F.4 Revisiting Sampling Strategies
The basic ACS sampling strategy is simple: each month a systematic sample of approximately of the addresses on the Master Address File is taken, with one-third of mail and telephone nonrespondents randomly chosen for in-person follow-up.
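The two stages of the basic strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the address list, the sampling interval of 100, the 40 percent nonresponse probability, and both function names are hypothetical placeholders and are not Census Bureau parameters.

```python
import random

def systematic_sample(addresses, interval, rng):
    """Take every `interval`-th address after a random start."""
    start = rng.randrange(interval)
    return addresses[start::interval]

def followup_subsample(nonrespondents, fraction, rng):
    """Randomly choose a fraction of nonrespondents for in-person follow-up."""
    k = round(len(nonrespondents) * fraction)
    return rng.sample(nonrespondents, k)

rng = random.Random(2010)

# Stand-in for address identifiers on an address list (hypothetical size).
master_address_file = list(range(12_000))

# Stage 1: systematic sample for one month (hypothetical interval).
monthly_sample = systematic_sample(master_address_file, interval=100, rng=rng)

# Hypothetical assumption: each sampled address fails to respond by mail
# or telephone with probability 0.4.
nonrespondents = [a for a in monthly_sample if rng.random() < 0.4]

# Stage 2: one-third of nonrespondents are chosen for in-person follow-up.
followup = followup_subsample(nonrespondents, fraction=1 / 3, rng=rng)
```

Because only a third of nonrespondents are followed up, each completed follow-up case would need to carry roughly three times the weight of a mail or telephone respondent in any estimate built from such a sample.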
A number of variations on this basic strategy are either currently designed or under consideration for later implementation in the ACS by the Census Bureau. These additional possibilities are: (1) oversampling of governmental units with small populations, such as small towns, (2) oversampling of minority areas, and (3) differential sampling of areas with poor initial mail and telephone response. We briefly comment on each of these possibilities and some additional methods not currently contemplated for implementation.
With respect to oversampling of small areas, the Census Bureau intends to use some version of the decennial census long-form design. In the 2000 census, sampling rates were 1 in 2 for governmental areas (counties, towns, townships, and school districts) with fewer than 800 occupied housing units (fewer than about 2,100 people); 1 in 4 for governmental areas with 800–1,200 occupied housing units (about 2,100–3,100 people); 1 in 6 for census tracts with fewer than 2,000 occupied housing units (fewer than about 5,200 people); and 1 in 8 for larger census tracts. The justification for this plan in the decennial census was originally to support reliable estimates of per capita income for small governmental units for use in fund allocation as part of general revenue sharing. However, this oversampling has been retained past the elimination of general revenue sharing because it tends to make coefficients of variation more equal across areas with different population sizes. Undoubtedly, that is the current justification for oversampling in the ACS. However, a new set of sampling rates may serve that purpose more effectively, and therefore, after the ACS has been in operation for a short while, it would be useful to compute the coefficients of variation for all responses on the ACS questionnaire for areas with different population sizes, to determine whether a different strategy might prove to be superior with respect to this objective.
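A small calculation can illustrate why tiered rates tend to equalize coefficients of variation. The sketch below encodes the 2000 long-form rates quoted above and compares them with a flat 1-in-6 rate. Several simplifications are assumptions of the example: housing units are treated as a simple random sample, nonresponse and design effects are ignored, each area is treated as both a governmental unit and a tract of the same size, the 800 and 1,200 boundaries are handled as shown in the code, and the proportion 0.15 is a made-up value.

```python
import math

def long_form_rate(gov_unit_units, tract_units):
    """Sampling rate by occupied housing units, per the 2000 rates quoted above.

    Boundary handling (800 and 1,200 fall in the 1-in-4 tier) is an assumption.
    """
    if gov_unit_units < 800:
        return 1 / 2
    if gov_unit_units <= 1200:
        return 1 / 4
    if tract_units < 2000:
        return 1 / 6
    return 1 / 8

def cv_of_proportion(population_units, rate, p):
    """CV of an estimated proportion p under simple random sampling
    without replacement (with finite population correction)."""
    n = population_units * rate
    fpc = 1 - rate
    return math.sqrt(fpc * (1 - p) / (n * p))

p = 0.15  # hypothetical proportion with some characteristic
comparison = {}
for units in (500, 1000, 1500, 4000):
    tiered_rate = long_form_rate(units, units)
    comparison[units] = (
        cv_of_proportion(units, tiered_rate, p),  # tiered rates
        cv_of_proportion(units, 1 / 6, p),        # flat 1-in-6 rate
    )

tiered_cvs = [pair[0] for pair in comparison.values()]
flat_cvs = [pair[1] for pair in comparison.values()]
tiered_spread = max(tiered_cvs) - min(tiered_cvs)
flat_spread = max(flat_cvs) - min(flat_cvs)
```

Under these assumptions the spread of coefficients of variation across the four area sizes is narrower with the tiered rates than with the flat rate. The same kind of tabulation, run on actual ACS responses, is what the text suggests for judging whether a different set of rates would do better.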
For oversampling of minority areas, the Census Bureau has mentioned an interest in increasing the ACS sampling rate in areas with a high percentage of minority residents in order to provide estimates with lower coefficients of variation for important statistics historically related to racial and ethnic disparities. While the panel believes that this is justifiable, it should be understood that the historically lower mail return rates for minority populations could result in additional nonresponse follow-up costs for the ACS.
In terms of the differential sampling of areas with poor initial mail and telephone response, it is true that without it these areas will have much larger coefficients of variation than areas with high mail and telephone rates. Therefore, efforts to balance these coefficients of variation are justified.
Clearly, the ACS could be modified in many ways to better satisfy various purposes. While most of the above possibilities have various disadvantages that might argue against their implementation, it would be very helpful for the Census Bureau to provide arguments to help justify the current design. These are topics that need little or no additional data collection or field work to further develop. Rather, what is needed is summary information that is already available from the ACS in the test sites. We encourage the Census Bureau to provide some analysis along these lines.
4–F.5 Interaction with Intercensal Population Estimates and Demographic Analysis Programs
One high-priority research area should be the development of models that combine information from other sources—such as household surveys, administrative records, census data, and the like—with ACS information. One prominent example of this is the interplay of ACS estimates and the Census Bureau’s population estimates program. At this point, it is planned that estimates from the ACS are to be controlled to postcensal population estimates at the county level and some degree of demographic aggregation. However, this should not be considered a one-way street. It is also possible for the ACS to be used to provide the population estimates program with improved estimates of internal and external migration, fertility, household size, and vacancy status. The resulting improved population estimates could then be used as improved marginal totals to which to control ACS estimates. Because the ACS also provides direct information on population size, a joint estimate from population estimates and from the ACS is conceivable.
The Census Bureau should (1) conduct research on how the ACS can be used to improve intercensal population estimates, and (2) examine how existing household surveys could change their poststratification practices (controlling totals by age, race, and sex) given the collection of ACS data.
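To make the notion of controlling survey estimates to independent totals concrete, the following sketch applies a simple ratio adjustment within cells, which is one common form of poststratification. It is an illustration only and is not the Bureau's production weighting: the function name, the age-group cells, the weights, and the control totals are all hypothetical.

```python
from collections import defaultdict

def control_to_totals(records, control_totals):
    """Rescale weights so each cell's weighted count equals its control total.

    records: list of (cell, weight) pairs
    control_totals: dict mapping cell -> independent population total
    """
    weighted_counts = defaultdict(float)
    for cell, weight in records:
        weighted_counts[cell] += weight
    return [
        (cell, weight * control_totals[cell] / weighted_counts[cell])
        for cell, weight in records
    ]

# Hypothetical survey weights by age group (illustrative numbers only).
survey_records = [
    ("under 18", 40.0), ("under 18", 50.0),
    ("18 to 64", 45.0), ("18 to 64", 55.0),
    ("65 and over", 60.0),
]

# Hypothetical independent population estimates used as controls.
controls = {"under 18": 100.0, "18 to 64": 120.0, "65 and over": 50.0}

controlled = control_to_totals(survey_records, controls)
```

After the adjustment, the weights in each cell sum to that cell's control total. The quality of the result therefore depends directly on the quality of the controls, which is why the direction of information flow between the ACS and the population estimates matters.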

The potential for the ACS to provide improved estimates of internal and external migration also suggests the importance of exploring possible interactions between the ACS and population estimates derived by demographic analysis. Demographic analysis uses aggregate data on birth, death, immigration, and emigration to produce population estimates by age, sex, and race. It was a key benchmark used to evaluate coverage in the 2000 census, but it has significant limitations. First, estimates of immigration and emigration—particularly those of undocumented immigration—are inherently difficult to produce with precision. Second, existing administrative records used to generate demographic analysis counts facilitate only the most basic racial comparisons—white and black—but do not permit direct estimation of Hispanics and other groups. The Census Bureau should consider ways in which the ACS might inform demographic analysis estimates, including more refined estimators of the size of the foreign-born population and of internal migration. We discuss further possible improvements for demographic analysis in 2010 in Section 7-B.¹⁴
Other possibilities—for instance, using ACS and household survey information jointly in regression models to provide improved estimates of the frequency of crime or unemployment—could also be fruitfully addressed as a research topic.¹⁵
Another high-priority research area should be identification of better procedures for weighting and imputation, to address nonresponse and undercoverage in the ACS; the hope would be to develop procedures that are, in a sense, optimized for ACS survey data, and not simply borrowed from procedures used on the decennial census long form.
14   The methods by which the ACS data could be used to improve demographic analysis could also be applicable to improvements of intercensal population estimates for the nation as a whole (National Research Council, 2000b; Citro, 2000).
15   The use of models that combine information from other sources has implications for the sample designs of the major household surveys and is a future research topic of great potential interest. Use of these models and connections to external programs such as the ACS may permit other household surveys to reallocate sample to areas in which estimates are less reliable.
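The simplest version of a joint estimate from two sources is an inverse-variance weighted average, sketched below. This is far short of the model-based combination discussed in this section, and it rests on assumptions stated here rather than drawn from the text: both estimates are taken to be unbiased and independent with known variances, and all numbers and names are hypothetical.

```python
def combine_independent(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted combination of two independent estimates.

    Returns the combined estimate and its variance. Assumes both inputs
    are unbiased and that their variances are known.
    """
    weight_a = var_b / (var_a + var_b)  # more weight to the less variable source
    combined = weight_a * est_a + (1 - weight_a) * est_b
    combined_var = var_a * var_b / (var_a + var_b)
    return combined, combined_var

# Hypothetical county population figures (illustrative only).
postcensal_est, postcensal_var = 10_000.0, 400.0 ** 2
survey_est, survey_var = 10_900.0, 900.0 ** 2

combined, combined_var = combine_independent(
    postcensal_est, postcensal_var, survey_est, survey_var
)
```

Under these assumptions the combined variance is smaller than either input variance, which is the sense in which even a noisy independent estimate can still contribute. If one source is biased, or if the two are not independent, this simple weighting no longer applies as written.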

4–F.6 Research on General Estimation Issues
The challenges of implementing the data collection for the ACS have understandably been given the highest priority at the Census Bureau. As a result, relatively straightforward estimation methods have been proposed for use in the short term, deferring estimation improvements for later. Unfortunately, this has meant that little research has been done on alternative approaches to estimation. We mention here some issues that should be examined by the Census Bureau once data collection is under control:
Alternatives to moving averages. Moving averages are easy to implement and have well-understood properties, including variance reduction. However, they will dampen large deviations that persist for less time than the smoothing window. There are methods for mitigating this feature of moving averages that still retain much of the variance reduction benefit.¹⁶
Controlling versus combination. Current plans are to control ACS population estimates at the county and major demographic group level to postcensal population estimates. For initial implementation, this is a reasonable approach to take, since it will likely improve the quality of the ACS population estimates. However, the use of the ACS in combination with information from various data sources—including census data, data from household surveys, and data from administrative records—needs to be a two-way street, as the ACS will provide independent information on population size and various characteristics information formerly obtained from the long form. Specifically with respect to population size, the ACS will produce estimates at the county and major demographic group level that will have relatively large variances for most smaller counties, but because they are independent, they could still be used to improve postcensal population estimates.
This will be more certain the further one moves away from a census year, as postcensal population estimates are increasingly variable as one moves further into each decade. In addition, the ACS data on population would not need to be used directly. Instead, data from the ACS could be applied to components of the postcensal estimates program, in particular estimates of interstate mobility, fertility, and household occupancy.
Finally, there is the much more demanding vision of the ACS underlying a small-area estimates program, whereby information from the above sources is used in conjunction with the ACS to produce a wide variety of small-area information of higher quality than could be provided by any individual data source. Given the varying quality of data from ACS and other sources, ACS data should not simply be controlled to data from these other sources; instead, hierarchical models should be used that will let the data from the various sources determine the degree to which estimates are combined. This latter vision in totality is certainly beyond the current research literature in terms of complexity of application, especially since many of the proposed data sources might be inconsistent. However, initial efforts should be undertaken since the methods to carry out simple versions of this possibility currently exist and are regularly used in other applications.
Weighting and imputation methods. The Census Bureau currently intends to use ten or so different weighting methods to accommodate: (1) the sample design of the ACS, (2) the use of data from different months (and modes) of response to compute the estimates for a given month and area, (3) whole household nonresponse, (4) individual unit nonresponse, (5) individual item nonresponse, and (6) undercoverage. These methods were adopted because of their current use (when relevant) in processing the decennial census short and long forms, and because of their resulting recognized benefits and ease of implementation in that very similar setting. Some of these weighting approaches are entirely appropriate for the ACS, and some are unique to the ACS as they are meant to address differential mode effects and the more complex sample design of the ACS relative to the long form. However, the current use of sequential hot-deck imputation for the treatment of individual item nonresponse, and the use of variance estimates that ignore the contribution of item nonresponse, are methods that are no longer representative of the current state of the art. Furthermore, it is not clear that nonresponse and undercoverage for the ACS will be sufficiently similar to these problems for the long form that these various long-form weighting methods should be utilized in the ACS without additional supporting research. The particular problem of the treatment of item nonresponse is becoming increasingly important given the degree of nonresponse experienced in the 2000 census.
16   Two possibilities that could be examined are state-space time-series models and spline smoothers.
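For readers unfamiliar with sequential hot-deck imputation, the sketch below shows the basic mechanism in a generic form: records are processed in order, and a missing item is filled with the most recently reported value from the same imputation cell. It is not the Census Bureau's implementation. The function name, the choice of age group as the cell variable, the starting values, and the income figures are all hypothetical.

```python
def sequential_hot_deck(records, item, cell_key, start_values):
    """Fill missing `item` values with the last reported value in the same cell.

    records: list of dicts processed in file order
    start_values: dict cell -> value used before any donor has been seen
    """
    last_seen = dict(start_values)
    completed = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left unchanged
        cell = rec[cell_key]
        if rec[item] is None:
            rec[item] = last_seen[cell]  # borrow from the most recent donor
            rec["imputed"] = True
        else:
            last_seen[cell] = rec[item]  # this record becomes the current donor
            rec["imputed"] = False
        completed.append(rec)
    return completed

# Hypothetical records; None marks item nonresponse.
sample_records = [
    {"age_group": "18 to 64", "income": 42_000},
    {"age_group": "65 and over", "income": None},
    {"age_group": "18 to 64", "income": None},
    {"age_group": "65 and over", "income": 18_000},
    {"age_group": "65 and over", "income": None},
]
starting_values = {"18 to 64": 30_000, "65 and over": 20_000}

completed = sequential_hot_deck(
    sample_records, item="income", cell_key="age_group",
    start_values=starting_values,
)
```

The `imputed` flag is kept because a variance computed from the completed file as if every value had been reported would leave out the contribution of item nonresponse, which is the concern raised above.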