4
Sampling and Statistical Estimation

This chapter discusses potential uses of sampling and statistical estimation to address the two main challenges of the 2000 census: reducing differential coverage and controlling operational costs. Why should the Census Bureau consider the use of sampling and estimation? Sampling and subsequent estimation offer two advantages over enumerating or surveying an entire population. The first, more obvious one, is cost savings. Trying to obtain data from everyone in a large population is usually prohibitively expensive. Drawing a sample can dramatically reduce resource requirements and often yields adequately precise estimates for the population and major subgroups. Only when estimates are required for fine levels of detail, as in the U.S. census, does it make sense to even consider trying to obtain data from everyone in a large population. The second advantage of sampling is that it enables enhancements in data quality that would be too expensive or intrusive to apply to the entire population. A well-conducted sample survey will usually provide more accurate information than a program that attempts to collect data from an entire population but suffers from high nonresponse or biased responses. Indeed, the Census Bureau has traditionally used a sample survey to evaluate census coverage.

This chapter focuses on two major innovations that the Census Bureau is considering for producing population counts in the 2000 census. The first innovation is sampling for nonresponse follow-up. Instead of trying to enumerate all housing units for which there is no response during mailout-mailback operations, the Census Bureau would follow up only a sample of such housing units (most likely between 10 and 33 percent). Data from housing units sampled for nonresponse follow-up would allow estimation of counts and characteristics of mailback nonrespondents who are not sampled.
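
The estimation implied by the last sentence can be illustrated with a short sketch. The Python fragment below shows a simple inverse-probability (Horvitz-Thompson) estimate of the kind that could be built from an NRFU sample; the one-in-three sampling rate and the unit counts are illustrative assumptions, not the Census Bureau's actual design or data.

```python
# Minimal sketch: estimating the total persons in ALL nonresponding
# housing units from a sample of them. Rate and counts are hypothetical.

def estimate_nonrespondent_total(sampled_counts, sampling_rate):
    """Horvitz-Thompson estimate: weight each sampled nonresponding
    housing unit by the inverse of its selection probability."""
    return sum(count / sampling_rate for count in sampled_counts)

# Persons found in a hypothetical NRFU sample of nonresponding units,
# drawn at a 1-in-3 rate (within the 10-33 percent range discussed).
persons_in_sampled_units = [2, 4, 1, 3, 0, 2]
estimate = estimate_nonrespondent_total(persons_in_sampled_units, 1 / 3)
print(f"Estimated persons in all nonresponding units: {estimate:.0f}")
```

Actual estimators would stratify by block and poststratum and borrow information across similar blocks, as discussed later in the chapter.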

The second proposal, called integrated coverage measurement (ICM), is designed to measure and correct the differential undercount. In July 1990, the Census Bureau conducted the Post-Enumeration Survey (PES) in a sample of 165,000 housing units to allow measurement of the coverage achieved by the main census operation. Although the survey identified a net undercount of about 1.6 percent and substantial differential undercount by geography and demographic characteristics, the official 1990 census counts did not use the information obtained as part of the PES. During the 1995 census test, the Census Bureau plans to evaluate a new integrated coverage measurement method, CensusPlus, designed to run concurrently with the main census operations and thereby to facilitate production of official counts by the legal deadlines.

The Census Bureau decided not to use sampling during the initial mailout-mailback phase of the census because of concerns about the legality of that strategy and the adverse impact that it would have on the accuracy of counts (Isaki et al., 1993). We concur with that decision. Both nonresponse follow-up sampling and integrated coverage measurement use sampling to try to obtain more accurate responses than could be achieved in a census. Combined with statistical estimation, these techniques should improve the absolute counts and reduce differentials in census coverage across states, other large political divisions, and major demographic categories. At the same time, initial attempts to enumerate everyone should produce acceptable accuracy for smaller areas like minor civil divisions and census tracts. The likely combination of nonresponse follow-up sampling and integrated coverage measurement clarifies the need for statistical estimation in the 2000 census. Consequently, as Chapter 1 explains, the Census Bureau is planning for a "one-number census" that combines the use of enumeration, assignment, and estimation for production of the census counts.

The next three sections of this chapter discuss sampling for nonresponse follow-up, integrated coverage measurement, and statistical estimation, respectively. Although we discuss them separately, a recurring theme of the chapter is that decisions about each of these topics should be considered in light of the other two. Estimation methods clearly cannot be determined without knowledge about sampling methods. And, for example, ultimate evaluation of a design for integrated coverage measurement must refer to specifications of the plans for nonresponse follow-up sampling and estimation procedures.

NONRESPONSE FOLLOW-UP

Background

The 1990 census was substantially more expensive than the 1980 census, even after accounting for inflation and population growth. The largest single part of the expense was follow-up of housing units that had not responded during the mailout-mailback portion of the census.

Estimates of the total cost of nonresponse follow-up operations in the 1990 census range from $490 to $560 million, roughly 20 percent of the $2.6 billion 10-year cycle cost of the census (Bureau of the Census, 1992b; U.S. General Accounting Office, 1992). Each 1 percent of nonresponse to the mailed questionnaire is estimated to have added approximately $17 million to the cost of the census.

Perhaps just as important, nonresponse follow-up (NRFU) took much longer than anticipated in some sites (in particular, New York City), pushing back the schedule for completion of the census. In turn, NRFU operations pushed back the beginning of coverage measurement by the Post-Enumeration Survey. The long delay between Census Day and the beginning of coverage measurement compromised the ability of the PES to operate accurately and was one of several factors making it impossible for the Census Bureau to incorporate the PES results into official counts released by the legal deadlines.

Even without delays in schedule, the latter stages of census operations typically suffer degradation of data quality. Ericksen et al. (1991) report that, for the 1990 census, the rate of erroneous enumeration on mailout-mailback was 3.1 percent. On nonresponse follow-up, the rate was 11.3 percent; on field follow-up, the rate was 19.4 percent.

Much of the problem in 1990 resulted from mailback response rates that were lower than expected. Item nonresponse also contributed to the follow-up work because additional contacts were required to complete missing items. Questionnaire simplification, reminder postcards, replacement questionnaires, and other innovations are expected to improve mailback rates, and the use of telephone interviews may speed NRFU operations. Even so, a 100-percent NRFU operation would certainly be very expensive. Thus, the Census Bureau has focused substantial efforts on ways to reduce the scope of nonresponse follow-up without undue sacrifices in the accuracy of the count or the content. The Census Bureau has studied three major innovations for nonresponse follow-up in the 2000 census: truncating NRFU early, following up only a sample of mailback nonrespondents, and using administrative records to replace or supplement traditional NRFU. In addition, it has considered combinations of these strategies: for example, a two-stage NRFU consisting of a truncated operation aimed at all mailback nonrespondents, followed by continued nonresponse follow-up for only a sample of households.

The Census Bureau's cost models estimated very large cost savings with either a truncated NRFU or with sampling for NRFU. Estimated cost savings from truncation compared with the 1990 10-year cycle costs (in 1992 dollars) ranged from about $127 to $160 million (depending on assumptions) for truncation on June 30 up to $740 to $894 million for truncation on April 21 (no follow-up) (Keller and Van Horn, 1993). For NRFU sampling rates of 50 percent down to 10 percent, the models estimated cost savings compared with the 1990 10-year cycle costs ranging from approximately $300 to $750 million, even after increasing the sample size for ICM measurement (Bureau of the Census, 1993d).

However, those estimates do include some savings that could probably be achieved even with 100 percent NRFU. We have not seen any estimates for cost savings associated with the use of administrative records, presumably because no detailed plans have been proposed for their use in NRFU.

Either of these innovations would also offer timing benefits compared with the 1990 scenario. Either truncation or sampling for NRFU would accelerate the completion of ICM. Because one of the potential problems with the planned ICM method is difficulty with retrospective identification of Census Day residency, moving up the last cases could be an important benefit. Earlier completion of ICM would also make it easier for the Census Bureau to produce final counts in time to meet legal deadlines. However, we note that these potential benefits would be more important for a 1990-style PES than for the currently planned ICM survey, which would run concurrently with the main census operations.

In contrast to these cost and operational advantages, both truncation and sampling have negative implications for the precision of counts and other results, especially for small areas. Counts and attributes of persons in nonsampled, nonresponding housing units would need to be estimated, producing sampling variability roughly proportional to the number of cases being estimated (although the exact relationship would depend on the sample design and estimation method). As results are aggregated to larger geographic areas, the errors diminish in size relative to the population of the area.

The Census Bureau ran simulations with 1990 census data to evaluate the adverse impact on the accuracy of various counts from exclusively using either early truncation of NRFU or sampling for NRFU. Unfortunately, the simulation studies did not produce estimates that allow for direct comparison of the two methods. Even so, the Census Bureau concluded that sampling for NRFU seems the more promising option at this point. Studies of NRFU truncation indicated that, to achieve savings of $300 million (in 1992 dollars), truncation would have had to occur so early in the 1990 census that the residual nonresponse rate would have been 11 percent of all housing units. More troubling, the nonresponse cases would have been spread very nonuniformly across district offices and demographic groups. As a result, truncation would have greatly increased the differential undercount in the census enumeration, placing further burden on integrated coverage measurement.

Plans for the 1995 Census Test

On the basis of these conclusions, the Census Bureau decided to focus on evaluating sampling for NRFU in the 1995 census test. Households that do not respond to the mail questionnaire by 6 weeks after the initial mailout (14 days after mailing of a replacement questionnaire) will be considered mailback nonrespondents, and one-third of these households will be sampled for NRFU. Current plans call for the collection of only short-form data during NRFU.

No attempt will be made to obtain information from the other two-thirds of mailback nonresponding households. An attempt will be made to identify vacant housing units before selection of the nonresponse sample. Interviewers will visit units for which a postmaster returned the prenotice to the first mailing. Confirmed vacancies will not be included in the NRFU sample.

A major purpose of testing sampling for NRFU in the 1995 census test is to learn more about the relative merits of sampling individual housing units (a unit sample) versus whole blocks (a block sample); in the test, the NRFU sample will be split evenly between the two types of samples. (Census Bureau documents refer to the former as a case sample design, but we prefer to describe it as a unit sample design.) In a random sample of one-half of the blocks not involved in ICM, the Census Bureau will sample 33 percent (one-third) of nonresponding housing units. In the other non-ICM blocks, block sampling will be used. That is, all mailback nonrespondents will be followed up in one-third of the block-sample blocks, and no NRFU activities will be conducted in the remainder of the block-sample blocks. Complete nonresponse follow-up will be conducted in all ICM blocks.

Decisions for the 2000 Census

The Census Bureau faces several important decisions in connection with sampling for NRFU in the 2000 census:

Should sampling for nonresponse follow-up be used at all?
Is a unit or a block sample preferable?
What proportion of units or blocks should be sampled?
Should the sampling probability be uniform across blocks (for a unit sample) or across areas (for a block sample)?
How should the Census Bureau treat mail returns received after the beginning of NRFU?
Should any nonresponse follow-up operations be conducted for all households before (or concurrently with) the sampling for nonresponse follow-up?

We discuss these questions in the sections that follow.

Should Sampling for Nonresponse Follow-up be Used?

Whether to use sampling for NRFU in the 2000 census is mainly a policy decision about whether the expected cost savings from the use of sampling outweigh the likely decreases in the accuracy of counts and other data, particularly for small areas. The 1995 census test will provide valuable data to inform that decision: more current inputs to the NRFU components of the Census Bureau's cost model and data on the relationship between NRFU and ICM. In particular, it will be important to identify all fixed components of the cost of NRFU sampling in order to obtain accurate estimates of the cost savings during the 2000 census.

However, the most complete information about the effects of sampling for NRFU on the accuracy of the census would be gained from additional simulations with 1990 data, especially to the extent that these effects vary across geographic areas.

Ultimately, resolving whether to sample for nonresponse follow-up is likely to involve answering the question: How accurate does the 2000 census need to be for small areas? Although that question is more central to the charge of the Panel on Census Requirements in the Year 2000 and Beyond, we offer a pair of comments. First, counts and other tabulations are needed at the block level primarily to allow flexibility for redistricting and for aggregating results to various political jurisdictions and other territories. Thus, the success of the 2000 census should be measured by the accuracy of these aggregate statistics rather than by the accuracy of block-level data. Even so, we note that there will be no single answer to the question of accuracy because sampling for NRFU would affect various levels of aggregation in different ways. Second, sampling variability is not the only source of error in census results. Incomplete counts and erroneous enumerations occur during both the mailback stage and the NRFU operation (even with 100 percent follow-up). Although sampling for NRFU would certainly contribute most to the error in block- and tract-level data, sampling error may be small compared with systematic error at larger levels of aggregation. Systematic errors have contributed most to the differential undercount in past censuses. If sampling for NRFU frees resources for taking steps to reduce other sources of error in the final results, it may produce a more accurate census by some measures.

Another concern associated with the use of NRFU sampling is that publicity about it may reduce the mailback response rate. If NRFU sampling is used in the 2000 census, that fact would certainly become public knowledge, which might dilute any positive effect that the mandatory nature of the census has on the mailback response rate. It is also conceivable that Census Bureau staff might be less committed to their enumeration efforts in the belief that sampling will take care of nonresponse. Unfortunately, there is no way to learn from census tests whether concerns about such reactions are warranted.

Whether sampling for nonresponse follow-up is used in the 2000 census will also depend on obtaining adequate answers to the other questions posed above.

Is a Unit or Block Sample Preferable?

The choice between a unit sample and a block sample for NRFU involves mainly a trade-off between the greater statistical efficiency of a unit sample and the operational and cost advantages of a block sample. An additional consideration is that a block sample would be easier to combine with the planned version of ICM.

The 1995 census test will provide much of the information needed to compare the relative advantages of the two options.

Sampling for NRFU necessitates estimating the attributes of nonsampled (and nonresponding) housing units in a block from the information obtained (during mailback or NRFU sampling) about responding units in that block and in blocks judged similar in terms of geography and demographic characteristics. There is reason to expect that a unit sample would generally produce more accurate estimates than a block sample of the same size, because there is probably some within-block correlation in household size and other attributes of mailback nonresponse housing units, even within carefully selected strata.

Suppose, for illustrative purposes only, that information from sampled housing units in a 100-block area (roughly 1,000 nonresponding housing units) is used in estimating the characteristics of nonsampled mailback nonrespondents in the same blocks. To the extent that there is within-block correlation in the 100 blocks, data on a sample of nonrespondents spread evenly among the 100 blocks would be more valuable, by a ratio known as the design effect, than data from the same number of housing units concentrated in a smaller number of blocks (see the sketch at the end of this discussion). A unit sample would also provide the opportunity to use information from sampled mailback nonrespondents in the same block to improve the estimates for nonsampled housing units in that block.

Certainly, heterogeneity among blocks can be expected for such characteristics as race and ethnicity. However, the critical quantities to estimate may be differences in mailback response rates among groups cross-classified by race, ethnicity, and age; such differences may be relatively homogeneous among blocks. Initial Census Bureau simulations with 1990 census data have found advantages to both unit and block sampling under various circumstances (Fuller et al., 1994), but further investigation is needed to separate the possible effects of the estimation procedures from those of the design. Also, these simulations have been limited to a few district offices. More comprehensive simulations with more fully developed estimators are needed to precisely determine the size of the unit sample advantage.

Another potential advantage of unit sampling is that it would spread imprecision due to sampling and estimation among all blocks, thereby reducing the maximum amount of block-level error. However, because block sampling would eliminate the need for estimation in sampled blocks, the two methods would not differ in the total number of housing units where estimation is needed. Consequently, the relative accuracy of aggregate estimates based on unit sampling would not necessarily increase beyond the amount attributable to within-block correlation.
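
To make the design-effect ratio mentioned above concrete, the following sketch applies the standard cluster-sampling approximation deff = 1 + (m - 1) * rho, where m is the average number of sampled nonrespondents per block and rho is the within-block (intraclass) correlation. Both values here are assumptions for illustration only, not measured quantities.

```python
# Illustrative design-effect calculation for block (cluster) sampling.
# deff > 1 means a block sample of the same size has larger variance
# than a unit sample spread evenly across blocks.

def design_effect(avg_cluster_size, intraclass_corr):
    """Variance inflation of a block sample relative to a unit sample
    with the same total number of sampled housing units."""
    return 1 + (avg_cluster_size - 1) * intraclass_corr

# Hypothetical: ~10 nonresponding units per block and a modest
# within-block correlation in household characteristics.
deff = design_effect(avg_cluster_size=10, intraclass_corr=0.05)
print(f"Design effect: {deff:.2f}")  # 1.45: ~45% more variance per unit
```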

In contrast, block sampling might offer certain operational advantages. Enumerators would need to spend less time traveling between blocks. They might also be able to use their time in each block more effectively. For example, while visiting a complete sample of mailback nonrespondents in a block, enumerators might frequently observe occupants entering or leaving other units on the NRFU list. With a unit sample instead, enumerators might tend to finish and proceed to the next block too quickly for such contacts to occur. On the basis of very preliminary assumptions, the Census Bureau has estimated that, compared with a unit sample of the same size, a block sample would save from $14 million (for a 10 percent sample) to $42 million (for a 50 percent sample) more than the corresponding amounts saved by the unit sample. Therefore, it is not obvious in advance whether the unit sample or the block sample is more efficient in terms of accuracy for equal costs. Operational data from the 1995 census test should allow the Census Bureau to estimate the relative cost advantage more accurately.

Block sampling would fit better with any likely method of ICM, because 100 percent NRFU would be required in the ICM blocks (and, perhaps, in surrounding blocks). Complete NRFU is needed so that the block total from the ICM operation can be validly compared with the total from preceding census operations. In effect, ICM blocks would also be NRFU block-sample blocks. Thus, even if unit sampling is the primary strategy for NRFU, it may need to be mixed with some block sampling for ICM purposes.

A related consideration is whether the choice of sampling design affects coverage in NRFU housing units. For example, with the more concentrated effort involved in following up a block sample, enumerators might be more likely to discover housing units that had been omitted from the frame (e.g., garage apartments). And if they do, it will be easier to use the results, because such housing units will automatically be part of a block sample. Enumerators may also be able to collect better proxy information for difficult-to-complete cases under block sampling. The Census Bureau plans to perform statistical tests of whether the average household size differs systematically between unit and block sampling in the 1995 census test (Bureau of the Census, 1994c). However, the size and design of the planned test are such that it could easily miss a coverage difference of 0.05 person per housing unit (about 2 percent of people in sampled units) between the block-sampling and unit-sampling designs; a difference of this magnitude would be important to the decision on which sampling plan to use. If coverage differs under block sampling and unit sampling, then the viability of unit sampling for NRFU operations would be compromised, because ICM would measure coverage in block-sample NRFU and there would not be an adequate corresponding measure for unit-sample NRFU. Consequently, the Census Bureau should investigate other ways to compare the validity of the two methods, such as comparing the numbers of added housing units.

What Proportion of Units or Blocks Should be Sampled?

The Census Bureau appears to be considering sampling proportions in the range of 10 to 33 percent.

Like the question of whether to sample for nonresponse follow-up at all, the choice of sampling proportion rests mainly on a trade-off between cost savings and accuracy of small-area data. Updated estimates of the cost savings available from various sampling proportions will be one critical input to the decision. The other critical input will be detailed simulation studies of the effect of the sampling proportion on the accuracy of various estimates. This decision should be made jointly with the decision about how large a sample to use for ICM (assuming that element is included). Thus, simulations need to account for the trade-offs between these two procedures.

Should Sampling Proportions be Uniform?

The Census Bureau will need to decide whether to sample all units or blocks with equal probability. Factors that might influence the sampling probability include the mailback response rate in the block, the size of the estimation poststratum of the block, and the cost of sending an enumerator to the block (or housing unit).

How Should Late Mail Returns be Treated?

Inevitably, mail returns will continue to trickle in after selection of the NRFU sample. Because these returns will not come from a random sample of all housing units that failed to respond prior to sampling, use of data from these returns could bias estimates. However, ignoring the results might add unnecessary variance and be a public relations problem. Research is needed about the best use of such data from either sampled or nonsampled housing units. (In a later section, we discuss a similar issue and possible solution in the context of ICM.) The 1995 census test will provide information about the likely frequency of late mail returns for various cutoff dates. That information may suggest moving the date for beginning NRFU.

Operations to Supplement Sampling for Nonresponse Follow-up

In the panel's interim report, we recommended that the Census Bureau consider a two-stage strategy that combines a truncated NRFU with subsequent sampling of first-stage nonrespondents. Although the Census Bureau chose not to directly try out a two-stage strategy in the 1995 census test, data collected as part of the test could provide valuable information about this option. If the response rate can be increased substantially during a brief effort directed at 100 percent of mailback nonrespondents, this strategy might reduce the magnitude of estimation required while retaining large cost savings. Although the 1990 census results are discouraging on this score, the computer-assisted telephone interview (CATI) system offers some hope for yielding a large number of responses in a cost-effective manner.

We also recommended that the Census Bureau investigate the value of administrative records as background information to make possible more accurate estimation of people in blocks not sampled for nonresponse follow-up. The idea (which could be applied equally well to a unit sample) is to use administrative records information to help estimate the count and characteristics of people in housing units about which there is no other direct information. The Census Bureau would neither accept the administrative records data at face value (too unreliable) nor require direct verification (too expensive). Instead, the same administrative records data would be compiled for housing units in the NRFU sample. The combination of administrative records that best predicted counts and characteristics in the NRFU sample would be used to estimate those same quantities in nonsampled households. Evaluating estimators on the ability to predict in sampled housing units would serve as an aggregate verification process for any administrative record. If some combination of administrative records is fairly accurate, then using such records in the estimation could substantially improve the accuracy of small-area estimates at relatively little increase in costs. Because administrative records data will be collected and processed independently from NRFU operations, the Census Bureau can evaluate the ability of these records to improve estimates for nonsampled housing units.

Recommendation 4.1: Sampling for nonresponse follow-up could produce major cost savings in 2000. The Census Bureau should test nonresponse follow-up sampling in 1995 and collect data that allow evaluation of (1) follow-up of all nonrespondents during a truncated period of time, combined with the use of sampling during a subsequent period of follow-up of the remaining nonrespondents, and (2) the use of administrative records to improve estimates for nonsampled housing units.

INTEGRATED COVERAGE MEASUREMENT

In addition to the use of sampling and estimation for nonresponse follow-up as described above, current census design plans call for a separate data collection effort in a smaller sample of blocks to measure the coverage of all census operations that precede it. The preceding census operations include address list development, mailout-mailback of census questionnaires, special enumeration methods, and nonresponse follow-up. In a one-number census, the coverage measurement survey and the estimation and modeling associated with it are conceived as an integral component of census-taking, not as a separate postcensal evaluation activity. Hence, this phase of census-taking is called integrated coverage measurement. In this section we focus on the ICM data collection methodology; we discuss coverage measurement methods used in the past, the data collection procedures planned for the 1995 census test, our concerns about those methods, and suggestions for evaluation.

We turn our attention to the associated estimation in a later section.

Previous Coverage Evaluation Programs

The Census Bureau has evaluated coverage of census enumerations since 1950 (Coale, 1955; Himes and Clogg, 1992). Two methods of coverage evaluation have been used: demographic analysis (DA) and dual-system estimation (DSE).

Demographic analysis combines data from previous censuses, vital statistics on births and deaths, and other administrative records, such as Medicare data, to obtain national population estimates by age, race or ethnicity, and sex. DA relies on what is called the demographic accounting equation:

population = previous population + births - deaths + inmigrants - outmigrants.

DA has been useful in determining broad patterns of census coverage over time. Because of the lack of detailed information on internal migration and other state-level components of this accounting method, however, DA is regarded as reliable only for national-level estimates of population by demographic group and cannot provide estimates for subnational areas such as states. Uses and extensions of DA are discussed further in a subsequent section.

Dual-system estimation as used in recent censuses is based on data collected for a stratified sample of households in a coverage measurement survey. (DSE more broadly construed has taken many forms in problems of human and animal population estimation; see, e.g., Marks et al., 1974; Seber, 1982; Chandrasekhar and Deming, 1949.) In short, people "caught" in the survey are matched against the census enumeration in order to estimate the fraction of the population that was included in the census. Similarly, a sample of people enumerated in the census is followed up to determine whether these people should in fact have been included or whether they were erroneously enumerated. The DSE method allows estimation of census coverage (undercount or overcount) by combinations of demographic group, geographic area, and other variables available on the census form (such as owner/renter status); the degree of stratification is limited only by the sample size. The coverage measurement survey may be conducted as a postenumeration survey, following the census and temporally and operationally separated from it, as in the 1990 census, but this is only one of several possible alternatives. In 1980, two panels of the Current Population Survey were used as the coverage measurement survey. Pre-enumeration surveys have also been proposed.

The 1980 Post-Enumeration Program was designed purely as an evaluation of the 1980 census enumeration. The possibility of using the 1980 coverage ...
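
A minimal sketch of the dual-system (capture-recapture) estimator described above, in its classical two-list form. The counts are hypothetical, and a real application would stratify by demographic and geographic poststrata and would first remove erroneous enumerations from the census list.

```python
# Minimal sketch of dual-system estimation: the fraction of survey
# people matched to census records estimates census coverage, so the
# total is the census count scaled up by the inverse of that fraction.

def dual_system_estimate(census_count, survey_count, matched_count):
    """Classical two-list population estimate:
    N_hat = census_count * survey_count / matched_count."""
    return census_count * survey_count / matched_count

# Hypothetical stratum: 950 census enumerations, 900 survey persons,
# 870 of whom matched a census record.
n_hat = dual_system_estimate(950, 900, 870)
print(f"Estimated population: {n_hat:.0f}")            # ~983
print(f"Estimated net undercount: {n_hat - 950:.0f}")  # ~33
```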

Synthetic estimates for small areas were formed by multiplying the census count in each cell by the corresponding adjustment factor, and then summing the adjusted counts. The validity of these synthetic estimates was criticized on the grounds that undercount is not in fact uniform across each adjustment cell. The question of how this lack of uniformity affects the accuracy of synthetic estimates of population (in absolute terms, and in comparison to unadjusted enumerations) has been a subject of lively debate (Freedman and Navidi, 1986; Schirm and Preston, 1987, 1992; Freedman and Wachter, 1994; Wolter and Causey, 1991; Fay and Thompson, 1993). This question has been approached through theoretical investigation and through simulations based on coverage measurement data and census data for other variables than the undercount. Further research in this area before the 2000 census would make a useful contribution to design and validation of the estimation methodology.

Other methods have been proposed for carrying estimates down to small areas. One alternative, for example, is a simple regression methodology, with proportions from different adjustment cells as covariates (Causey, 1994) or with other variables as covariates (Ericksen and Kadane, 1985). Like the synthetic approach, these methods should be subjected to testing through simulations before a final decision is made.
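
A minimal sketch of the multiply-and-sum step of synthetic estimation described above. The poststrata and adjustment factors are hypothetical, and the sketch does not show how the factors themselves would be estimated from coverage measurement data.

```python
# Minimal sketch of synthetic estimation: census counts for a small
# area, broken out by estimation cell (poststratum), are scaled by
# cell-level adjustment factors and summed. All numbers hypothetical.

def synthetic_estimate(cell_counts, adjustment_factors):
    """Small-area population estimate from cell counts and factors."""
    return sum(count * adjustment_factors[cell]
               for cell, count in cell_counts.items())

# Enumerated counts for one tract, by illustrative poststrata.
tract_counts = {"owner": 600, "renter": 300}
# Adjustment factors estimated over much larger estimation cells.
factors = {"owner": 1.005, "renter": 1.04}
print(f"Synthetic estimate: "
      f"{synthetic_estimate(tract_counts, factors):.0f}")  # 915
```

The criticism noted above is visible in this sketch: the tract inherits its cells' average coverage, whether or not coverage in this particular tract matches the cell-wide factors.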

Direct and Indirect Estimates

Direct estimates are defined as estimates based entirely on data from the domain for which the estimates are calculated. Indirect estimates make use of data from outside the domain. Simple survey estimates are direct estimates, whereas model-based estimates may be indirect. In particular, synthetic estimates are indirect because they apply factors calculated over an estimation cell to geographic areas that are smaller than the geographic extent of that estimation cell. Empirical Bayes smoothing models are also indirect because the model component of the estimates is estimated across a large number of cells.

In the 1990 undercount estimation program, almost all cells cut across state lines. Consequently, estimates of population for states and subdivisions of states were synthetic and therefore indirect. This fact was grounds for some controversy over whether it was accurate and fair to estimate one state's population using data from another state, if conditions might in fact have differed among states. Evaluation studies directed at this question were inconclusive (Kim et al., 1991).

The Census Bureau is now considering requiring that direct estimates be obtained for all states. This would have major implications for design of the ICM sample, because even the states with the smallest populations would need substantial ICM sample sizes to obtain direct estimates of acceptable accuracy. The calculations of sample sizes could vary greatly, however, depending on the criterion of accuracy that is adopted. At one extreme, a criterion of equal coefficient of variation of direct population estimates in every state (equal standard error of estimated ICM adjustment factors) would imply roughly equal sample sizes in every state, despite the 100-fold ratio of populations between the most and least populous states. Such a design might be drastically inefficient for estimation of adjustment factors for domains other than states. At the other extreme, a criterion of equal variance of direct population estimates for every state would imply larger sampling rates (and therefore disproportionately larger sample sizes) in larger states. Considerations based on minimization of expected loss (Spencer, 1980; Zaslavsky, 1993) would lead to other sample allocations with higher sampling rates but smaller absolute sample sizes in small states compared with large states.

A decision to require direct state estimates must be considered in light of its implications for accuracy of estimated population shares for other domains, such as urban versus suburban areas within states and white, black, and Hispanic populations in a region. We recommend that alternative ICM designs should be prepared with and without direct state estimates and that the added costs or loss of accuracy for other domains of interest should be made clear at a policy-making level before it is decided whether this feature is essential. Compromises may also deserve consideration, such as requiring direct estimation only for states and cities larger than some minimum size.

Use of direct estimation for some or all states would not preclude calculation of indirect estimates for subdomains within states, down to a level of detail similar to the poststratification in the 1990 PES. For example, the total population of Idaho might be obtained from a direct estimate, but the estimate for urban Idaho (compared with suburban Idaho) might be based in part on adjustment factors calculated from data for Idaho, Wyoming, and Montana, and the estimate for Native American reservations in Idaho might use nationally estimated adjustment factors for Native American reservations. This could be accomplished, for example, by calculating synthetic estimates for all domains and then ratio adjusting them or raking them to match direct state estimates (a minimal sketch appears below). Selection of estimators must be guided by an awareness that, although for some purposes state estimates are the most important product of the census, for other purposes the distribution of population within states, for example by race and urbanicity, is paramount.

Recommendation 4.5: The Census Bureau should prepare alternative sample designs for integrated coverage measurement with varying levels of support for direct state estimation. The provision of direct state estimates should be evaluated in terms of the relative costs and the consequent loss of accuracy in population estimates for other geographic areas or subpopulations of interest.
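
As an illustration of the ratio adjustment mentioned above (the simplest one-margin case of raking), the following sketch scales synthetic within-state domain estimates so they sum to a direct state estimate. The domains, estimates, and state total are hypothetical.

```python
# Minimal sketch of ratio adjustment: synthetic estimates for domains
# within a state are scaled to match the direct state estimate, so the
# state total is direct while the within-state shares stay synthetic.

def ratio_adjust(domain_estimates, state_direct_estimate):
    """Scale within-state domain estimates to match the state total."""
    scale = state_direct_estimate / sum(domain_estimates.values())
    return {domain: est * scale
            for domain, est in domain_estimates.items()}

synthetic = {"urban": 410_000, "suburban": 350_000, "reservation": 45_000}
adjusted = ratio_adjust(synthetic, state_direct_estimate=820_000)
for domain, est in adjusted.items():
    print(f"{domain:12s} {est:,.0f}")  # each scaled by 820000/805000
```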

What Form Will Final Population Counts Take?

The Census Bureau has placed a high priority on making all census products consistent internally and with each other. Consistency requires that, when several published cross-tabulations share a common stub or margin, or when such a margin can be calculated by summing numbers from several tables, the margins of various tables should agree with each other. For example, the total of the number of white males over age 18 in a state should be the same regardless of whether it is read off the state totals or calculated from tables by county, by tract, or by block. Similarly, published means or proportions should agree with quantities that might be calculated from tables.

A wide variety of census products are produced, and it is impossible to foresee all tabulations that will be generated as part of regular data series or special tabulations. Therefore, the surest way of guaranteeing this consistency is to produce a microdata file that looks like a simple enumeration of people and households but includes records for people estimated through NRFU sampling and ICM as well as those directly enumerated. Then tabulations and public-use microdata samples may be produced from this microdata file. There are two difficulties inherent in this approach. First, simple estimation methods such as those described above for ICM estimation produce adjustment factors that can be used to calculate expected numbers of people. They do not, however, describe the full detail required to create a roster of households that includes the estimated number of people in each estimation cell. Therefore, procedures must be developed that predict household structure for the "estimated" part of the population. These procedures could take various forms, ranging from arbitrary grouping of people to full probability modeling.

Second, the rounding inherent in imputation of complete households adds noise to the estimates. For example, if calculations based on the adjustment factors indicate that 0.15 person should be estimated in a block (beyond the enumerated roster) in each of 9 cells, then the total number of people added will have to be rounded to either 1 or 2, and the number in each cell to either 0 or 1 (see the sketch below). The requirements of creating realistic households may require even more rounding, since it may not be realistic to assume that the estimated people were necessarily in households of only one or two people, even if that is the average number estimated per block. A further complication is that there would be a stochastic element in this calculation. Both the rounding and stochastic components of error would be most noticeable at the most detailed levels of geography. At more aggregated levels, rounding could be controlled to area totals, and the stochastic component of error would tend to average out. This problem is another research area that will have to be considered in the years before 2000.
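
A minimal sketch of the rounding problem just described, using a simple largest-remainder rule that preserves the rounded block total. This is one possible scheme, not the Census Bureau's method, and it omits the stochastic element discussed above.

```python
# Minimal sketch: fractional expected counts (e.g., 0.15 person in
# each of 9 cells of a block) must become whole persons while roughly
# preserving the block total. Largest-remainder controlled rounding.
import math

def controlled_round(fractions):
    """Round cell values to integers that sum to the rounded total."""
    floors = [math.floor(x) for x in fractions]
    leftover = round(sum(fractions)) - sum(floors)
    # Give the leftover units to the cells with the largest remainders.
    order = sorted(range(len(fractions)),
                   key=lambda i: fractions[i] - floors[i], reverse=True)
    for i in order[:leftover]:
        floors[i] += 1
    return floors

cells = [0.15] * 9              # expected additions, summing to 1.35
print(controlled_round(cells))  # one cell gets 1 person, the rest 0
```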

Weighting presents an alternative to imputation that avoids the above difficulties. But weighting possesses the disadvantage that it produces noninteger counts. Rounding must therefore be performed after the weighted tables are created, thus complicating the task of maintaining consistency in all census products.

A minor, but possibly sensitive, issue is the treatment in estimation of counts associated with census forms that are collected late, after the implicit reference point of ICM. In principle, these should not be counted because the factors derived from ICM do not include these forms in their base. Simply ignoring these forms might be a public relations disaster, however. One appealing solution would be to substitute these forms for households that would have been imputed for ICM without changing the total number estimated for any geographical area. Similar issues may arise with respect to late mailback returns that come in after NRFU sampling and data collection have taken place.

Acceptable Accuracy for Estimates

The design of NRFU and ICM samples is motivated by considerations of desired accuracy at various levels of geographic aggregation. Under all designs within the current range of consideration, the NRFU sample will be much larger than the ICM sample (10-33 percent versus less than 1 percent), but the portion of the population that will be estimated and imputed on the basis of NRFU sampling (on the order of the mail nonresponse rate, about 30 percent in 1990) is also much larger than the portion estimated through ICM (on the order of 1-2 percent of pre-ICM totals, on the average). Therefore, issues about the accuracy of estimates under NRFU sampling are concerned primarily with small levels of aggregation, such as blocks, tracts, and minor political divisions, and issues about ICM accuracy are concerned primarily with larger levels of aggregation, such as states, demographic groups in broad geographic regions, and cities.

Early research (Fuller et al., 1994) suggests that coefficients of variation for block population estimates under NRFU sampling will be high. Research may lead to improved estimators that reduce variance for small areas, but it is unlikely that any estimator can make major gains in precision at this level. Precise block-level counts are rarely if ever needed, however, except as a means for building up estimates for larger levels of aggregation. Therefore, evaluation of NRFU sampling should focus on accuracy at the level of minor civil divisions, state legislative districts, and similar-sized units.

Conversely, evaluation of ICM should focus on accuracy at broader levels of aggregation. There are important interactions between ICM sample design, estimation methods, and the units for which ICM accuracy is measured. The ICM sample must be designed to give acceptable accuracy for important units such as those listed above. It would be desirable to attain a level of accuracy such that error for these units is smaller than the differential coverage of the pre-ICM phase of the census, but this may not be attainable within acceptable limits for the scale of ICM.
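
The aggregation argument above can be made concrete with a small sketch: if block-level errors were independent with a common coefficient of variation, the CV of an aggregate of k blocks would shrink roughly as one over the square root of k. The 30 percent block-level CV below is an assumption for illustration, not a figure from the research cited above.

```python
# Minimal sketch of why sampling error under NRFU sampling matters
# less at higher levels of aggregation, assuming independent block
# errors with a common (hypothetical) coefficient of variation.
import math

block_cv = 0.30               # assumed 30% CV for a single block
for k in (1, 100, 10_000):    # block, tract-scale, city-scale sums
    print(f"{k:>6} blocks: CV ~ {block_cv / math.sqrt(k):.3f}")
```

In practice, correlated errors and unequal block sizes would slow this decay, which is why the panel emphasizes evaluation at the level of minor civil divisions and similar units rather than blocks.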

The Role of Demographic Analysis

Demographic analysis refers to the estimation of population using the basic demographic identities relating population to births, deaths, immigration, and emigration (e.g., Robinson et al., 1993, 1994). In practice, demographic analysis can be used to obtain population estimates at high levels of aggregation. Traditional methods for demographic analysis yield estimates of the national population cross-classified by age, sex, and race.

In the 1990 census, demographic estimates were used as a check on the aggregate accuracy of dual-system estimates of undercount by age x sex x race group (Robinson et al., 1993). They are particularly suited to this purpose because most of the components of demographic estimates are quite accurately determined, although estimates of undocumented immigration remain controversial (see, e.g., Bean et al., 1990). In addition, demographic estimates of sex ratios have been used to check the internal consistency of dual-system estimates from the 1990 PES (Bell, 1993). For this purpose, the DA estimates of sex ratios were regarded as more reliable than DA estimates of population totals, and the results suggested a substantial undercount of black males, even after dual-system estimation. Demographic estimates were also used to evaluate the face validity of subnational dual-system estimates in the decision on adjustment of the 1992 postcensal estimates base.

More recent and ongoing efforts have attempted to develop improved demographic estimates at subnational geographic areas for the youngest and oldest segments of the population. Estimates for the population age 65 and over have been produced using state-level Medicare records, with allowance for state-by-state variability in Medicare enrollment rates and in percentages of eligible members of the population age 65 and over (Robinson et al., 1993). State enrollment and eligibility rates are affected by such variables as citizenship status and the proportion of retirees who have held federal government jobs. The Census Bureau has also worked to develop subnational estimates for the population under age 10, using vital statistics and estimates of interstate migration (Robinson et al., 1994). These estimates require assumptions about the state-to-state variability in migration patterns, the completeness of birth records, and the use of valid residence definitions during hospital birth registration.

These research programs are breaking important new ground, but it would be premature to judge the credibility of these estimates. As noted above, the estimates are based on a number of assumptions that require further evaluation. Also, a major limitation of demographic analysis is that the uncertainties in estimates provided by the method are largely unknown (see Clogg and Himes, 1993; Robinson et al., 1993).

Demographic analysis possesses a number of potential strengths as a method for coverage evaluation in the 2000 census:

operational feasibility, timeliness (estimates could be available early in the census year), low cost, independence of ICM, and comparability to historical series (Robinson et al., 1993, 1994; Clogg and Himes, 1993). But, in addition to the problem of measuring uncertainty, there remain significant difficulties associated with the use of demographic analysis that currently limit its role in the decennial census: the lack of reliable data on international migration, particularly for emigration and undocumented immigration; questionable measures of interstate migration, which limit the accuracy of subnational estimates; and problems with racial classification in birth and death records and their congruency with self-identification in the census, which affect estimates for all groups (especially Hispanics, Asians and Pacific Islanders, and Native Americans).

Because of the limitations of foreseeable progress on these methodological problems, we expect that demographic analysis will be useful primarily as an evaluation tool for ICM in the 2000 census. We assume that, as in recent censuses, national estimates of the population cross-classified by age, sex, and race will be produced for this purpose. It may also be possible and worthwhile to incorporate some demographic information (e.g., estimates of sex ratios) into the estimation procedure for integrated coverage measurement. However, based on the current state of research, we doubt that demographic analysis could be used in the 2000 census to adjust (or benchmark) final population estimates as part of integrated coverage measurement.

Nonetheless, the panel believes further research on demographic methods is a cost-effective investment that could pay long-term dividends beyond the contributions to census coverage and evaluation. An exciting new development is the convergence of demographic analysis, the postcensal estimates program, and the Program for Integrated Estimates in connection with the proposed continuous measurement system (see Chapter 6). The common ground of these programs is that each of them uses a variety of data sources to improve estimation of population counts and characteristics without relying on the census itself. Demographic analysis traditionally has depended primarily on demographic data, as described above, in order to obtain very aggregated estimates; administrative records such as Medicare registrations have also begun to be used in this program. The postcensal estimates program (Long, 1993) combines census-year population counts with a variety of indicators of change of the population at local levels, such as school enrollments and changes in housing stock, to obtain estimates of the population at a fairly detailed level. The Program for Integrated Estimates (Alexander, 1994) would make use of a wide variety of sources, including data from a new survey and a variety of administrative records with coverage for people, households, or housing units, to produce detailed estimates of counts and characteristics down to the block and tract levels (see Chapter 5 for a discussion of the potential of health care records in this regard).

The three programs described above can be seen in a progression according to time of development (from several decades ago to the present and future), cost (from least to most expensive), operational difficulty (from easiest to most challenging), use of localized sources (from purely aggregate analysis to integration of microdata sources), and degree of small-area precision (from least to most detailed).

Each step in this progression has been justified by contemporary needs, but each step also brings us closer to having the technical capabilities and experience required to be able to obtain adequate information from administrative records without having to mount a full-scale census.

Recommendation 4.6: The panel endorses the continued use of demographic analysis as an evaluation tool in the decennial census. However, the present state of development does not support a prominent role for demographic methods in the production of official population totals as part of integrated coverage measurement in the 2000 census. The Census Bureau should continue research to develop subnational demographic estimates, with particular attention to potential links between demographic analysis and further development of the continuous measurement prototype and the administrative records census option.

Other Uses of Estimation

As discussed in Chapter 3, current proposals (Kalton et al., 1994) for enumeration of the homeless population call for service-based enumeration. Under these plans, persons making use of services, such as shelters and soup kitchens, would be enumerated on several different occasions. The lists from these enumerations will then be matched so that the degree of overlap in the service population from day to day can be determined. These data will be the basis for estimation of the total homeless population that makes use of these services. These estimation methods are related to dual-system estimation, but there are special complications because one site might be enumerated on several different occasions and because the same person could appear for services at more than one site.

Statistical modeling and estimation may also play a role in the use of quality assurance (QA) data to monitor and evaluate the coverage measurement survey (Biemer, 1994). Evaluation of this survey is difficult, expensive, and controversial, because it requires replication and reconsideration of judgmental decisions made during the original coverage measurement operation. For example, the most skilled and experienced matching staff may reanalyze data and obtain additional data from the field long after the census to check the accuracy of matching determinations made during the CensusPlus operation. Because these studies are difficult and depend on skills that are in short supply, they will almost certainly be small compared with the coverage measurement sample itself.

If research over the next few years can identify QA measures of the census and coverage measurement process that are correlated with evaluation outcomes, the QA data will provide useful auxiliary variables for evaluation of the distribution and consequences of errors in the coverage measurement operation.

Prespecification and Documentation of Procedures

In the year 2000 census, increased reliance on estimation makes it essential that the Census Bureau's choice of estimation methodologies should inspire general confidence. It is unrealistic to hope that there will be total unanimity in support of any full set of methodologies. Census methods have always received criticism insofar as they have had a discernible effect on an identifiable geographic area or a particular group. For example, the local review process during the 1990 census led to inclusion of units that were determined to have been omitted from address lists and to recanvassing of some areas. Many procedural decisions, however, are invisible to those outside the Census Bureau and have consequences that are obscure. Decisions about estimation methods have been, and in all likelihood will continue to be, especially controversial (1) because they are open and explicit and (2) because their effects on totals for identifiable areas can be determined, at least after the fact. We address each of these considerations, taking the second first.

Past experience (for example, with adjustment of the 1990 census and with adjustment of the 1992 base for postcensal estimates) has shown that the majority of the public comments received on an estimation methodology are motivated by concerns for its effects on particular political jurisdictions (Bryant, 1993). Another common concern was that confusion would be created by having two sets of results, those before and after adjustment. By prespecifying estimation methods as much as possible, the Census Bureau makes it clear that decisions have been made on good statistical principles and judgment, rather than being motivated by any consideration of how they will affect particular areas. Adopting such a policy avoids the 1990 experience of placing estimation choices before decision makers who could perceive the political consequences of different procedures. In this way, the Census Bureau obtains some protection against criticisms directed at the particular effects of methods.

A proper balance must be struck that gains these benefits of prespecification without committing the Census Bureau to a rigid set of procedures that permit no leeway for handling unforeseen circumstances in the conduct of the census or adapting estimators to unanticipated features of the data. Therefore, an appropriate level of prespecification includes a positive statement of areas in which judgment may be exercised as well as areas in which decisions are made before the census. For example, the general form of the estimators to be used and the flow of information in processing should be prespecified, but it may be recognized in advance that judgment will be exercised in deletion or downweighting of outliers, variable selection in regression models, or splitting of poststrata.

The openness and explicitness of estimation methodologies make it possible to begin to build consensus in support of them, both technically and politically, before the census even begins. To make this possible, the Census Bureau should release a publication describing the main steps in data collection and processing and the estimation methods that will be used in the census, including a description of estimators, of evaluations that will be applied to these estimators, and of points at which the use of professional judgment is foreseen.

The process of consensus-building continues after the census through the release of suitable technical documentation. This documentation, as well as describing in general terms the estimators that were used, should present in aggregate terms the calculations that produced population totals reported for major geographic areas (states and large cities), as well as for major demographic groupings. It must be emphasized that this documentation would be released after the publication of census population figures used for apportionment and redistricting, and that the intermediate totals in these calculations should not be interpreted as competing estimates of population. The procedures should be regarded as an integrated whole, not a menu of options from which various parties can pick and choose to find the treatment most favorable to their local area. The postcensal documentation should also contain a summary of evaluation results. Summary measures of accuracy for various levels of aggregation, such as those calculated through the total error model in the evaluation of the 1990 PES, may be a suitable format for summarizing these evaluation results.

Recommendation 4.7: Before the census, the Census Bureau should produce detailed documentation of statistical methodology to be used for estimation and modeling. After the census, the Census Bureau should document how the methodology was applied empirically and should provide evaluation of the methodology.

Reporting of Uncertainty

Official statistics have progressed over the century from a narrow focus on simple tabulations of population characteristics to provision of a range of census products, including complex tabulations and sample microdata files. Analytical uses of these data require availability of both point estimates and measures of uncertainty. When complex statistical methods, such as complex sampling schemes, indirect estimation, and imputation, are used in creating census products, users will not be able to derive valid measures of uncertainty by elementary methods, and they may not have adequate information in the published or available products to derive these measures. It therefore becomes the responsibility of the data producers to facilitate estimation of uncertainty.

Total error models have been used by the Census Bureau to measure uncertainty in the outcomes of the census and the contributions of the various sources of error to this uncertainty (Hansen et al., 1961). More recently, a total error model was developed for estimation of uncertainty in adjusted estimates based on the 1990 census and PES (Mulry and Spencer, 1993). Such models take into account both sampling errors in the estimates and potential biases stemming from the regular census and from coverage estimation. Bias can arise, for example, from use of several response modes or from differences among response times. Similar models may be a useful tool for evaluating uncertainty in integrated estimates from a complex census in the year 2000.
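
A heavily simplified sketch of the accounting such a total error model performs, combining a sampling standard error with total bias into a root mean squared error. Real total error models decompose many more components and estimate each from evaluation data; all values below are hypothetical.

```python
# Minimal sketch of total error accounting:
# MSE = sampling variance + (sum of bias components)^2.
import math

def total_rmse(sampling_se, bias_components):
    """Root mean squared error from a sampling standard error and a
    list of signed bias terms from nonsampling error sources."""
    total_bias = sum(bias_components)
    return math.sqrt(sampling_se ** 2 + total_bias ** 2)

# Hypothetical errors, in percentage points of a coverage estimate:
# matching bias, response-mode bias, and late-response bias.
rmse = total_rmse(sampling_se=0.4, bias_components=[0.2, -0.1, 0.15])
print(f"Total RMSE: {rmse:.2f} percentage points")
```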

After uncertainties have been estimated, they should be described in a manner that allows users to incorporate them into their data analyses. A variety of methods for representing uncertainty are familiar from the world of survey sampling. Summary measures of uncertainty (such as average coefficients of variation or variance functions) may be published as a supplement to published tabulations, or standard errors may be published for quantities of particular interest. A number of imputation methodologies are available (Rubin, 1987; Clogg et al., 1991) that enable users of public-use microdata samples to estimate the effects of sampling and nonsampling variability on their analyses.

Recommendation 4.8: The Census Bureau should develop methods for measuring and modeling all sources of error in the census and for showing uncertainty in published tabulations or otherwise enabling users to estimate uncertainty.

Research Program on Estimation

Necessary research on statistical estimation divides roughly into three phases. In the first phase, which is now under way and continues until the major design decisions have been made for the 1995 census test, estimation research focuses on broadening the range of possibilities for the use of sampling and other statistically based techniques. In this phase, preliminary assessments can be obtained of the expected precision for various designs.

In the second phase, roughly coinciding with the planning, execution, and processing of the 1995 census test, the emphasis shifts to developing methods needed for the selected designs and methodological features. Although it is not necessary during this phase to decide on all the estimators that will be used, it is critical that enough progress be made on NRFU sampling and ICM estimators to avoid making decisions about design based on estimators that will later be replaced.

In the final phase, beginning with assessment of the 1995 census test and continuing through the decade, the selected estimation methods will have to be consolidated, optimized, validated, and made both theoretically and operationally robust. This last process will ensure that they can stand up to critical scrutiny and to problems that may arise in the course of the 2000 census.

In this phase, work will also continue on selecting estimation procedures required for the production of all census products, including measures of uncertainty, and on more complex procedures that will be used in evaluation of the census estimates.

Recommendation 4.9: The Census Bureau should vigorously pursue research on statistical estimation now and throughout the decade. Topics should include nonresponse follow-up sampling, coverage estimation, incorporation of varied information sources (including administrative records), and indirect estimation for small areas.