4 Sampling and Statistical Estimation

This chapter discusses potential uses of sampling and statistical estimation to address the two main challenges of the 2000 census: reducing differential coverage and controlling operational costs. Why should the Census Bureau consider the use of sampling and estimation? Sampling and subsequent estimation offer two advantages over enumerating or surveying an entire population. The first, more obvious one, is cost savings. Trying to obtain data from everyone in a large population is usually prohibitively expensive. Drawing a sample can dramatically slash resource requirements and often yields adequately precise estimates for the population and major subgroups. Only when estimates are required for fine levels of detail, as in the U.S. census, does it make sense to even consider trying to obtain data from everyone in a large population. The second advantage of sampling is that it enables enhancements in data quality that would be too expensive or intrusive to apply to the entire population. A well-conducted sample survey will usually provide more accurate information than a program that attempts to collect data from an entire population but suffers from high nonresponse or biased responses. Indeed, the Census Bureau has traditionally used a sample survey to evaluate census coverage.

This chapter focuses on two major innovations that the Census Bureau is considering for producing population counts in the 2000 census. The first innovation is sampling for nonresponse follow-up. Instead of trying to enumerate all housing units for which there is no response during mailout-mailback operations, the Census Bureau would follow up only a sample of such housing units (most likely between 10 and 33 percent). Data from housing units sampled for nonresponse follow-up would allow estimation of counts and characteristics of mailback nonrespondents who are not sampled.
The second proposal, called integrated coverage measurement (ICM), is designed to measure and correct the differential undercount. In July 1990, the Census Bureau conducted the Post-Enumeration Survey (PES) in a sample of 165,000 housing units to allow measurement of the coverage achieved by the main census operation. Although the survey identified a net undercount of about 1.6 percent and substantial differential undercount by geography and demographic characteristics, the official 1990 census counts did not use the information obtained as part of the PES. During the 1995 census test, the Census Bureau plans to evaluate a new integrated coverage measurement method, CensusPlus, designed to run concurrently with the main census operations and thereby to facilitate production of official counts by the legal deadlines.

The Census Bureau decided not to use sampling during the initial mailout-mailback phase of the census because of concerns about the legality of that strategy and the adverse impact that it would have on the accuracy of counts (Isaki et al., 1993). We concur with that decision. Both nonresponse follow-up sampling and integrated coverage measurement use sampling to try to obtain more accurate responses than could be achieved in a census. Combined with statistical estimation, these techniques should improve the absolute counts and reduce differentials in census coverage across states, other large political divisions, and major demographic categories. At the same time, initial attempts to enumerate everyone should produce acceptable accuracy for smaller areas like minor civil divisions and census tracts. The likely combination of nonresponse follow-up sampling and integrated coverage measurement clarifies the need for statistical estimation in the 2000 census.
Consequently, as Chapter 1 explains, the Census Bureau is planning for a "one-number census" that combines the use of enumeration, assignment, and estimation for production of the census counts.

The next three sections of this chapter discuss sampling for nonresponse follow-up, integrated coverage measurement, and statistical estimation, respectively. Although we discuss them separately, a recurring theme of the chapter is that decisions about each of these topics should be considered in light of the other two. Estimation methods clearly cannot be determined without knowledge about sampling methods. And, for example, ultimate evaluation of a design for integrated coverage measurement must refer to specifications of the plans for nonresponse follow-up sampling and estimation procedures.

NONRESPONSE FOLLOW-UP

Background

The 1990 census was substantially more expensive than the 1980 census, even after accounting for inflation and population growth. The largest single part of the expense was follow-up of housing units that had not responded during the mailout-mailback portion of the census. Estimates of the total cost of nonresponse
follow-up operations in the 1990 census range from $490 to $560 million, roughly 20 percent of the $2.6 billion 10-year cycle cost of the census (Bureau of the Census, 1992b; U.S. General Accounting Office, 1992). Each 1 percent of nonresponse to the mailed questionnaire is estimated to have added approximately $17 million to the cost of the census.

Perhaps just as important, nonresponse follow-up (NRFU) took much longer than anticipated in some sites (in particular, New York City), pushing back the schedule for completion of the census. In turn, NRFU operations pushed back the beginning of coverage measurement by the Post-Enumeration Survey. The long delay between Census Day and the beginning of coverage measurement compromised the ability of the PES to operate accurately and was one of several factors making it impossible for the Census Bureau to incorporate the PES results into official counts released by the legal deadlines.

Even without delays in schedule, the latter stages of census operations typically suffer degradation of data quality. Ericksen et al. (1991) report that, for the 1990 census, the rate of erroneous enumeration on mailout-mailback was 3.1 percent. On nonresponse follow-up, the rate was 11.3 percent; on field follow-up, the rate was 19.4 percent.

Much of the problem in 1990 resulted from mailback response rates that were lower than expected. Item nonresponse also contributed to the follow-up work because additional contacts were required to complete missing items. Questionnaire simplification, reminder postcards, replacement questionnaires, and other innovations are expected to improve mailback rates, and the use of telephone interviews may speed NRFU operations. Even so, a 100-percent NRFU operation would certainly be very expensive.
Thus, the Census Bureau has focused substantial efforts on ways to reduce the scope of nonresponse follow-up without undue sacrifices in the accuracy of the count or the content. The Census Bureau has studied three major innovations for nonresponse follow-up in the 2000 census: truncating NRFU early, following up only a sample of mailback nonrespondents, and using administrative records to replace or supplement traditional NRFU. In addition, it has considered combinations of these strategies, e.g., a two-stage NRFU consisting of a truncated operation aimed at all mailback nonrespondents, followed by continued nonresponse follow-up for only a sample of households.

The Census Bureau's cost models estimated very large cost savings with either a truncated NRFU or with sampling for NRFU. Estimated cost savings from truncation compared with the 1990 10-year cycle costs (in 1992 dollars) ranged from about $127 to $160 million (depending on assumptions) for truncation on June 30 up to $740 to $894 million for truncation on April 21 (no follow-up) (Keller and Van Horn, 1993). For NRFU sampling rates of 50 percent down to 10 percent, the models estimated cost savings compared with the 1990 10-year cycle costs ranging from approximately $300 to $750 million, even after increasing the sample size for ICM measurement (Bureau of the Census, 1993d).
However, those estimates do include some savings that could probably be achieved even with 100 percent NRFU. We have not seen any estimates for cost savings associated with the use of administrative records, presumably because no detailed plans have been proposed for their use in NRFU.

Either of these innovations would also offer timing benefits compared with the 1990 scenario. Either truncation or sampling for NRFU would accelerate the completion of ICM. Because one of the potential problems with the planned ICM method is difficulty with retrospective identification of Census Day residency, moving up the last cases could be an important benefit. Earlier completion of ICM would also make it easier for the Census Bureau to produce final counts in time to meet legal deadlines. However, we note that these potential benefits would be more important for a 1990-style PES than for the currently planned ICM survey, which would run concurrently with the main census operations.

In contrast to these cost and operational advantages, both truncation and sampling have negative implications for the precision of counts and other results, especially for small areas. Counts and attributes of persons in nonsampled, nonresponding housing units would need to be estimated, producing sampling variability roughly proportional to the number of cases being estimated (although the exact relationship would depend on the sample design and estimation method). As results are aggregated to larger geographic areas, the errors diminish in size relative to the population of the area.

The Census Bureau ran simulations with 1990 census data to evaluate the adverse impact on the accuracy of various counts from exclusively using either early truncation of NRFU or sampling for NRFU. Unfortunately, the simulation studies did not produce estimates that allow for direct comparison of the two methods.
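The way such sampling variability diminishes under aggregation can be illustrated with a small simulation. This is only a sketch: the block counts, household sizes, and 1-in-3 sampling rate below are invented for illustration, not census figures.

```python
import random
import statistics

random.seed(1)

# Hypothetical data: 1,000 blocks, each with 5-15 mailback nonrespondents
# whose household sizes (1-5 persons) must be estimated from a NRFU sample.
blocks = [[random.choice([1, 2, 3, 4, 5]) for _ in range(random.randint(5, 15))]
          for _ in range(1000)]

def estimate_total(block, rate=1/3):
    """Weight each sampled nonrespondent by 1/rate (a Horvitz-Thompson-style
    estimate of the block's nonrespondent population)."""
    sampled = [size for size in block if random.random() < rate]
    return sum(sampled) / rate

def rel_se(n_blocks, reps=200):
    """Relative standard error of the estimated total over the first
    n_blocks, across repeated draws of the NRFU sample."""
    true_total = sum(sum(b) for b in blocks[:n_blocks])
    estimates = [sum(estimate_total(b) for b in blocks[:n_blocks])
                 for _ in range(reps)]
    return statistics.stdev(estimates) / true_total

# The relative error shrinks as estimates are aggregated over more blocks:
for n in (1, 10, 100, 1000):
    print(n, round(rel_se(n), 3))
```

Relative error falls roughly with the square root of the number of blocks aggregated, which is the pattern described above: severe imprecision at the block level, modest imprecision for large areas.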
Even so, the Census Bureau concluded that sampling for NRFU seems the more promising option at this point. Studies of NRFU truncation indicated that, to achieve savings of $300 million (in 1992 dollars), truncation would have had to occur so early in the 1990 census that the residual nonresponse rate would have been 11 percent of all housing units. More troubling, the nonresponse cases would have been spread very nonuniformly across district offices and demographic groups. As a result, truncation would have greatly increased the differential undercount in the census enumeration, placing further burden on integrated coverage measurement.

Plans for the 1995 Census Test

On the basis of these conclusions, the Census Bureau decided to focus on evaluating sampling for NRFU in the 1995 census test. Households that do not respond to the mail questionnaire by 6 weeks after the initial mailout (14 days after mailing of a replacement questionnaire) will be considered mailback nonrespondents, and one-third of these households will be sampled for NRFU. Current plans call for the collection of only short-form data during NRFU. No
attempt will be made to obtain information from the other two-thirds of mailback nonresponding households. An attempt will be made to identify vacant housing units before selection of the nonresponse sample. Interviewers will visit units for which a postmaster returned the prenotice to the first mailing. Confirmed vacancies will not be included in the NRFU sample.

A major purpose of testing sampling for NRFU in the 1995 census test is to learn more about the relative merits of sampling individual housing units (a unit sample) versus whole blocks (a block sample); in the test, the NRFU sample will be split evenly between the two types of samples. (Census Bureau documents refer to the former as a case sample design, but we prefer to describe it as a unit sample design.) In a random sample of one-half of the blocks not involved in ICM, the Census Bureau will sample 33 percent (one-third) of nonresponding housing units. In the other non-ICM blocks, block sampling will be used. That is, all mailback nonrespondents will be followed up in one-third of the block-sample blocks, and no NRFU activities will be conducted in the remainder of the block-sample blocks. Complete nonresponse follow-up will be conducted in all ICM blocks.

Decisions for the 2000 Census

The Census Bureau faces several important decisions in connection with sampling for NRFU in the 2000 census.

· Should sampling for nonresponse follow-up be used at all?
· Is a unit or a block sample preferable?
· What proportion of units or blocks should be sampled?
· Should the sampling probability be uniform across blocks (for a unit sample) or across areas (for a block sample)?
· How should the Census Bureau treat mail returns received after the beginning of NRFU?
· Should any nonresponse follow-up operations be conducted for all households before (or concurrent with) the sampling for nonresponse follow-up?

We discuss these questions in the sections that follow.
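Before turning to those questions, the contrast between the two designs planned for the 1995 test can be sketched in a few lines of code. The frame below is entirely hypothetical; the only point is the structural difference between sampling units everywhere and sampling whole blocks.

```python
import random

random.seed(4)

# Hypothetical frame: block id -> ids of that block's mailback nonrespondents.
frame = {b: [f"unit-{b}-{u}" for u in range(random.randint(4, 12))]
         for b in range(100)}

def unit_sample(frame, rate=1/3):
    """Unit ('case') design: sample nonresponding housing units
    individually, so follow-up work is spread across every block."""
    return [u for units in frame.values() for u in units
            if random.random() < rate]

def block_sample(frame, rate=1/3):
    """Block design: follow up ALL nonrespondents in a sample of blocks
    and none in the remaining blocks."""
    chosen = {b for b in frame if random.random() < rate}
    return [u for b in chosen for u in frame[b]]

# Both designs visit roughly one-third of nonrespondents on average, but
# the block design concentrates the workload geographically.
print(len(unit_sample(frame)), len(block_sample(frame)))
```

The expected workload is the same under either design; the trade-offs between them (statistical efficiency versus travel and operational costs) are taken up below.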
Should Sampling for Nonresponse Follow-up be Used?

Whether to use sampling for NRFU in the 2000 census is mainly a policy decision about whether the expected cost savings from the use of sampling outweigh the likely decreases in the accuracy of counts and other data, particularly for small areas. The 1995 census test will provide valuable data to inform that decision: more current inputs to the NRFU components of the Census Bureau's cost model and data on the relationship between NRFU and ICM. In particular, it will be important to identify all fixed components of the cost of NRFU sampling in order to obtain accurate estimates of the cost savings during the 2000 census. However, the most complete information about the effects of sampling for NRFU on the accuracy of the census would be gained from additional simulations with 1990 data, especially to the extent that these effects vary across geographic areas.

Ultimately, resolving whether to sample for nonresponse follow-up is likely to involve answering the question: How accurate does the 2000 census need to be for small areas? Although that question is more central to the charge of the Panel on Census Requirements in the Year 2000 and Beyond, we offer a pair of comments. First, counts and other tabulations are needed at the block level primarily to allow flexibility for redistricting and for aggregating results to various political jurisdictions and other territories. Thus, the success of the 2000 census should be measured by the accuracy of these aggregate statistics rather than by the accuracy of block-level data. Even so, we note that there will be no single answer to the question of accuracy because sampling for NRFU would affect various levels of aggregation in different ways.

Second, sampling variability is not the only source of error in census results. Incomplete counts and erroneous enumerations occur during both the mailback stage and the NRFU operation (even with 100 percent follow-up). Although sampling for NRFU would certainly contribute most to the error in block- and tract-level data, sampling error may be small compared with systematic error at larger levels of aggregation. Systematic errors have contributed most to the differential undercount in past censuses. If sampling for NRFU frees resources for taking steps to reduce other sources of error in the final results, it may produce a more accurate census by some measures.
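The interplay between sampling error and systematic error can be made concrete with the usual root-mean-squared-error decomposition. The percentages below are invented for illustration: a fixed 1-percent systematic undercount at every level, against sampling error that shrinks as counts are aggregated.

```python
def rmse_pct(bias_pct, sampling_se_pct):
    """Total error combines systematic and sampling components:
    rmse = sqrt(bias^2 + se^2), here expressed in percent of the count."""
    return (bias_pct**2 + sampling_se_pct**2) ** 0.5

# Assumed figures only: 1% bias everywhere; sampling error of 10% for a
# block, 2% for a tract, 0.1% for a county.
for area, se_pct in [("block", 10.0), ("tract", 2.0), ("county", 0.1)]:
    print(area, round(rmse_pct(1.0, se_pct), 3))
```

Under these assumptions, sampling error dominates total error at the block level, while at the county level nearly all of the error is the systematic bias, which sampling for NRFU does not worsen and which freed resources might help reduce.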
Another concern associated with the use of NRFU sampling is that publicity about it may reduce the mailback response rate. If NRFU sampling is used in the 2000 census, that fact would certainly become public knowledge, which might dilute any positive effect that the mandatory nature of the census has on the mailback response rate. It is also conceivable that Census Bureau staff might be less committed to their enumeration efforts in the belief that sampling will take care of nonresponse. Unfortunately, there is no way to learn from census tests whether concerns about such reactions are warranted. Whether sampling for nonresponse follow-up is used in the 2000 census will also depend on obtaining adequate answers to the other questions posed above.

Is a Unit or Block Sample Preferable?

The choice between a unit sample and a block sample for NRFU involves mainly a trade-off between the greater statistical efficiency of a unit sample and the operational and cost advantages of a block sample. An additional consideration is that a block sample would be easier to combine with the planned version
of ICM. The 1995 census test will provide much of the information needed to compare the relative advantages of the two options.

Sampling for NRFU necessitates estimating the attributes of nonsampled (and nonresponding) housing units in a block from the information obtained (during mailback or NRFU sampling) about responding units in that block and in blocks judged similar in terms of geography and demographic characteristics. There is reason to expect that a unit sample would generally produce more accurate estimates than a block sample of the same size, because there is probably some within-block correlation in household size and other attributes of mailback nonresponse housing units, even within carefully selected strata.

Suppose, for illustrative purposes only, that information from sampled housing units in a 100-block area (roughly 1,000 nonresponding housing units) is used in estimating the characteristics of nonsampled mailback nonrespondents in the same blocks. To the extent that there is within-block correlation in the 100 blocks, data on a sample of nonrespondents spread evenly among the 100 blocks would be more valuable, by a ratio known as the design effect, than data from the same number of housing units concentrated in a smaller number of blocks. A unit sample would also provide the opportunity to use information from sampled mailback nonrespondents in the same block to improve the estimates for nonsampled housing units in that block.

Certainly, heterogeneity among blocks can be expected for such characteristics as race and ethnicity. However, the critical quantities to estimate may be differences in mailback response rates among groups cross-classified by race, ethnicity, and age; such differences may be relatively homogeneous among blocks.
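The design effect mentioned above has a standard closed form for cluster (block) sampling, which makes the cost of within-block correlation easy to quantify. The cluster size and correlation used below are assumptions chosen only to illustrate the formula, not estimates for the census.

```python
def design_effect(avg_units_per_block, rho):
    """Classic design effect for cluster sampling:
    deff = 1 + (m - 1) * rho, where m is the average number of sampled
    nonrespondents per block and rho is the within-block (intraclass)
    correlation of the characteristic being estimated."""
    return 1 + (avg_units_per_block - 1) * rho

def effective_sample_size(n, avg_units_per_block, rho):
    """A clustered sample of n units is 'worth' n / deff independent units."""
    return n / design_effect(avg_units_per_block, rho)

# Assumed values: 10 sampled nonrespondents per block and a modest
# within-block correlation of household size.
print(design_effect(10, 0.05))                        # 1.45
print(round(effective_sample_size(1000, 10, 0.05)))   # 690
```

Under these assumptions, a block sample of 1,000 nonrespondents carries about as much information as 690 independently sampled units, which is the sense in which a unit sample of equal size would be more statistically efficient.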
Initial Census Bureau simulations with 1990 census data have found advantages to both unit and block sampling under various circumstances (Fuller et al., 1994), but further investigation is needed to separate the possible effects of the estimation procedures from those of the design. Also, these simulations have been limited to a few district offices. More comprehensive simulations with more fully developed estimators are needed to precisely determine the size of the unit sample advantage.

Another potential advantage of unit sampling is that it would spread imprecision due to sampling and estimation among all blocks, thereby reducing the maximum amount of block-level error. However, because block sampling would eliminate the need for estimation in sampled blocks, the two methods would not differ in the total number of housing units where estimation is needed. Consequently, the relative accuracy of aggregate estimates based on unit sampling would not necessarily increase beyond the amount attributable to within-block correlation.

In contrast, block sampling might offer certain operational advantages. Enumerators would need to spend less time traveling between blocks. They might also be able to use their time in each block more effectively. For example, while visiting a complete sample of mailback nonrespondents in a block, enumerators
might frequently observe occupants entering or leaving other units on the NRFU list. With a unit sample instead, enumerators might tend to finish and proceed to the next block too quickly for such contacts to occur. On the basis of very preliminary assumptions, the Census Bureau has estimated that, compared with a unit sample of the same size, a block sample would save from $14 million (for a 10 percent sample) to $42 million (for a 50 percent sample) more than the corresponding amounts saved by the unit sample. Therefore, it is not obvious in advance whether the unit sample or the block sample is more efficient in terms of accuracy for equal costs. Operational data from the 1995 census test should allow the Census Bureau to estimate the relative cost advantage more accurately.

Block sampling would fit better with any likely method of ICM, because 100 percent NRFU would be required in the ICM blocks (and, perhaps, in surrounding blocks). Complete NRFU is needed so that the block total from the ICM operation can be validly compared with the total from preceding census operations. In effect, ICM blocks would also be NRFU block-sample blocks. Thus, even if unit sampling is the primary strategy for NRFU, it may need to be mixed with some block sampling for ICM purposes.

A related consideration is whether the choice of sampling design affects coverage in NRFU housing units. For example, with the more concentrated effort involved in following up a block sample, enumerators might be more likely to discover housing units that had been omitted from the frame (e.g., garage apartments). And if they do, it will be easier to use the results, because such housing units will automatically be part of a block sample. Enumerators may also be able to collect better proxy information for difficult-to-complete cases under block sampling.
The Census Bureau plans to perform statistical tests of whether the average household size differs systematically between unit and block sampling in the 1995 census test (Bureau of the Census, 1994c). However, the size and design of the planned test are such that it could easily miss a coverage difference of 0.05 person per housing unit (about 2 percent of people in sampled units) between the block-sampling and unit-sampling designs; a difference of this magnitude would be important to the decision on which sampling plan to use. If coverage differs under block sampling and unit sampling, then the viability of unit sampling for NRFU operations would be compromised, because ICM would measure coverage in block-sample NRFU and there would not be an adequate corresponding measure for unit-sample NRFU. Consequently, the Census Bureau should investigate other ways to compare the validity of the two methods, such as comparing the numbers of added housing units.

What Proportion of Units or Blocks Should be Sampled?

The Census Bureau appears to be considering sampling proportions in the range of 10 to 33 percent. Like the question of whether to sample for nonresponse
follow-up at all, the choice of sampling proportion rests mainly on a trade-off between cost savings and accuracy of small-area data. Updated estimates of the cost savings available from various sampling proportions will be one critical input to the decision. The other critical input will be detailed simulation studies of the effect of the sampling proportion on the accuracy of various estimates. This decision should be made jointly with the decision about how large a sample to use for ICM (assuming that element is included). Thus, simulations need to account for the trade-offs between these two procedures.

Should Sampling Proportions be Uniform?

The Census Bureau will need to decide whether to sample all units or blocks with equal probability. Factors that might influence the sampling probability include the mailback response rate in the block, the size of the estimation poststratum of the block, and the cost of sending an enumerator to the block (or housing unit).

How Should Late Mail Returns be Treated?

Inevitably, mail returns will continue to trickle in after selection of the NRFU sample. Because these returns will not come from a random sample of all housing units that failed to respond prior to sampling, use of data from these returns could bias estimates. However, ignoring the results might add unnecessary variance and be a public relations problem. Research is needed about the best use of such data from either sampled or nonsampled housing units. (In a later section, we discuss a similar issue and possible solution in the context of ICM.) The 1995 census test will provide information about the likely frequency of late mail returns for various cutoff dates. That information may suggest moving the date for beginning NRFU.
Operations to Supplement Sampling for Nonresponse Follow-up

In the panel's interim report, we recommended that the Census Bureau consider a two-stage strategy that combines a truncated NRFU with subsequent sampling of first-stage nonrespondents. Although the Census Bureau chose not to directly try out a two-stage strategy in the 1995 census test, data collected as part of the test could provide valuable information about this option. If the response rate can be increased substantially during a brief effort directed at 100 percent of mailback nonrespondents, this strategy might reduce the magnitude of estimation required while retaining large cost savings. Although the 1990 census results are discouraging on this score, the computer-assisted telephone interview (CATI) system
offers some hope for yielding a large number of responses in a cost-effective manner.

We also recommended that the Census Bureau investigate the value of administrative records as background information to make possible more accurate estimation of people in blocks not sampled for nonresponse follow-up. The idea (which could be applied equally well to a unit sample) is to use administrative records information to help estimate the count and characteristics of people in housing units about which there is no other direct information. The Census Bureau would neither accept the administrative records data at face value (too unreliable) nor require direct verification (too expensive). Instead, the same administrative records data would be compiled for housing units in the NRFU sample. The combination of administrative records that best predicted counts and characteristics in the NRFU sample would be used to estimate those same quantities in nonsampled households. Evaluating estimators on the ability to predict in sampled housing units would serve as an aggregate verification process for any administrative record. If some combination of administrative records is fairly accurate, then using such records in the estimation could substantially improve the accuracy of small-area estimates at relatively little increase in costs. Because administrative records data will be collected and processed independently from NRFU operations, the Census Bureau can evaluate the ability of these records to improve estimates for nonsampled housing units.

Recommendation 4.1: Sampling for nonresponse follow-up could produce major cost savings in 2000.
The Census Bureau should test nonresponse follow-up sampling in 1995 and collect data that allow evaluation of (1) follow-up of all nonrespondents during a truncated period of time, combined with the use of sampling during a subsequent period of follow-up of the remaining nonrespondents, and (2) the use of administrative records to improve estimates for nonsampled housing units.

INTEGRATED COVERAGE MEASUREMENT

In addition to the use of sampling and estimation for nonresponse follow-up as described above, current census design plans call for a separate data collection effort in a smaller sample of blocks to measure the coverage of all census operations that precede it. The preceding census operations include address list development, mailout-mailback of census questionnaires, special enumeration methods, and nonresponse follow-up. In a one-number census, the coverage measurement survey and the estimation and modeling associated with it are conceived as an integral component of census-taking, not as a separate postcensal evaluation activity. Hence, this phase of census-taking is called integrated coverage measurement. In this section we focus on the ICM data collection methodology; we discuss coverage measurement methods used in the past, the data collection procedures planned for the 1995 census test, our concerns about those methods, and suggestions for evaluation. We turn our attention to the associated estimation in a later section.

Previous Coverage Evaluation Programs

The Census Bureau has evaluated coverage of census enumerations since 1950 (Coale, 1955; Himes and Clogg, 1992). Two methods of coverage evaluation have been used: demographic analysis (DA) and dual-system estimation (DSE).

Demographic analysis combines data from previous censuses, vital statistics on births and deaths, and other administrative records, such as Medicare data, to obtain national population estimates by age, race or ethnicity, and sex. DA relies on what is called the demographic accounting equation:

population = previous population + births - deaths + inmigrants - outmigrants.

DA has been useful in determining broad patterns of census coverage over time. Because of the lack of detailed information on internal migration and other state-level components of this accounting method, however, DA is regarded as reliable only for national-level estimates of population by demographic group and cannot provide estimates for subnational areas such as states. Uses and extensions of DA are discussed further in a subsequent section.

Dual-system estimation as used in recent censuses is based on data collected for a stratified sample of households in a coverage measurement survey. (DSE more broadly construed has taken many forms in problems of human and animal population estimation; see, e.g., Marks et al., 1974; Seber, 1982; Chandrasekhar and Deming, 1949.) In short, people "caught" in the survey are matched against the census enumeration in order to estimate the fraction of the population that was included in the census.
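Both the demographic accounting equation and the basic dual-system (capture-recapture) estimator can be sketched numerically. The figures below are invented for illustration; in practice the DSE calculation is carried out within poststrata and with corrections for erroneous enumerations and matching error.

```python
def demographic_accounting(prev_pop, births, deaths, inmigrants, outmigrants):
    """The demographic accounting equation behind demographic analysis."""
    return prev_pop + births - deaths + inmigrants - outmigrants

def dual_system_estimate(census_count, survey_count, matched):
    """Lincoln-Petersen dual-system estimator.  If the coverage survey
    catches people independently of the census, matched / survey_count
    estimates the census coverage rate, so the total population is
    estimated by census_count * survey_count / matched."""
    return census_count * survey_count / matched

# Assumed numbers: a census counts 950 people in a stratum; an independent
# coverage survey finds 400, of whom 380 match census records, implying
# 95 percent coverage and hence an estimated 950 / 0.95 = 1,000 people.
print(dual_system_estimate(950, 400, 380))  # 1000.0
```

The estimated undercount for the stratum is the gap between the dual-system estimate and the census count, here 50 people, or 5 percent.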
Similarly, a sample of people enumerated in the census is followed up to determine whether these people should in fact have been included or whether they were erroneously enumerated. The DSE method allows estimation of census coverage, undercount or overcount, by combinations of demographic group, geographic area, and other variables available on the census form (such as owner/renter status); the degree of stratification is limited only by the sample size.

The coverage measurement survey may be conducted as a postenumeration survey, following the census and temporally and operationally separated from it, as in the 1990 census, but this is only one of several possible alternatives. In 1980, two panels of the Current Population Survey were used as the coverage measurement survey. Pre-enumeration surveys have also been proposed.

The 1980 Post-Enumeration Program was designed purely as an evaluation of the 1980 census enumeration. The possibility of using the 1980 coverage
estimates for adjustment was proposed, debated, and litigated (Ericksen and Kadane, 1985; Freedman and Navidi, 1986, 1992). The 1990 Post-Enumeration Survey (PES) was designed not only to measure coverage of the census enumeration, but also to allow adjustment of the 1990 census counts if it was judged that PES data could be used to improve their accuracy. Eventually, however, a decision was made by the Secretary of Commerce not to carry out the adjustment (U.S. Department of Commerce, 1991; Fienberg, 1992; Bryant, 1993; Choldin, 1994).

All previous coverage evaluation programs (both DSE and DA) have demonstrated the existence of an overall undercount. These programs have also found that there is a differential undercount; i.e., certain groups, such as black males, and certain areas, such as inner cities, are systematically undercounted relative to other groups and areas in conventional census enumerations. Despite improved overall census coverage, the black-white coverage differentials have remained remarkably constant at about 4 percent since the 1950 census, when evaluation programs of this kind began.

The Census Bureau's research program since 1990 has studied methods that might lead to higher response rates to the census mail questionnaire. The available evidence indicates that mail response rates can be increased with the use of reminder cards, precensus notices, and user-friendly census forms, but there is no evidence that these techniques will reduce differential response rates. Other programs that target special, hard-to-enumerate subpopulations might reduce differential coverage to some extent, but it is unlikely that these innovations will close the gap completely, at least at acceptable costs. Rather, the effect of these programs may be primarily to prevent the coverage gap from widening and to maintain public confidence in the face of increasingly difficult census-taking conditions.
(See Citro and Cohen, 1985:Ch. 5, for similar predictions concerning the 1990 census, which were borne out by the 1990 experience; see U.S. General Accounting Office, 1992, for similar evaluations and predictions pertaining to future censuses.) Therefore, the need for coverage measurement in 2000 promises to be at least as great as in previous censuses.

Current plans for the 2000 census are predicated on the use of integrated coverage measurement as an essential part of census-taking and not just as an evaluation of other census operations, although ICM would also produce valuable evaluative data. ICM is therefore not regarded as a method of producing a second set of population estimates that competes with population estimates obtained without the use of ICM. Instead, ICM would integrate coverage measurement (which includes the use of samples, statistical estimation based on these samples, and statistical modeling) with the other census-taking operations. This new use of ICM as an essential component of census-taking defines the one-number census concept referred to throughout this report.

In our judgment, the one-number census concept utilizing ICM in this fashion has several advantages. First, ICM can be used to remove or at least decrease
108 COUNTING PEOPLE IN THE INFORMATION AGE

differential coverage by estimating the magnitude of such differentials, making possible the use of estimators designed to correct for the resulting biases. Second, the design of all aspects of the census can be optimized using the knowledge that ICM is part of the design. Third, decisions on methodology for ICM data collection and estimation would be governed by statistical and scientific criteria such as bias, mean squared error, and risk with respect to various loss functions (see later section on statistical estimation). Therefore, the proper use of ICM would minimize political concerns about which group is positively or negatively affected by the estimation. Finally, it should be possible to garner wide support for a one-number census based in part on ICM, given the overriding importance of reducing census cost and reducing differential undercount.

Recommendation 4.2: Differential undercount cannot be reduced to acceptable levels at acceptable costs without the use of integrated coverage measurement and the statistical methods associated with it. We endorse the use of integrated coverage measurement as an essential part of census-taking in the 2000 census.

Major Criteria for Selection of an ICM Method

An ICM method should be chosen with the following set of criteria in mind.

1. The method must control bias. That is, the method must produce estimates that remove or greatly reduce coverage differentials by demographic group, by relevant socioeconomic factors, and by pertinent levels of geography. No method can control bias at all levels of geography without using very large samples, however, so the bias to be removed or reduced must be defined in terms of pertinent levels of detail, considering such geographic areas as major political jurisdictions.

2. The method must give precise estimates of population for various pertinent levels of geography.
Sample size is the most important factor determining precision (or variance), and sample design is another important factor. These first two criteria cannot be considered in isolation for several reasons. First, neither the bias nor the variance of an ICM method can be judged without reference to the estimation methods, including the use of models for combining information from "similar" blocks. Second, bias and variance reduction often conflict, making it necessary to strike a balance between these first two criteria (see also discussion in the statistical estimation section). Mean squared error provides a credible measure that balances the two objectives; it will be important to be able to estimate such measures of accuracy. Third, these criteria are not intended to suggest that an ICM method must improve the counts simultaneously for every area at a particular level of geography. A method should be judged by how well it improves overall accuracy for areas of a given size, because any kind
of estimation is almost sure to make counts less accurate in a small fraction of areas.

3. The method must be operationally feasible. How well an ICM method could theoretically meet criteria 1 and 2 is irrelevant if operational or cost problems mean that it will not be conducted as designed. Operations such as computer matching and elimination of duplicate records, intensive telephone or personal interviewing, utilization of administrative records and other address lists, computer and clerical processing and checking of information, preparation of estimates, and so on, must be feasible given cost constraints.

4. The method must produce estimates by legal deadlines. Under current law, population totals for states are due by December 31 of the census year. Detailed files with counts for blocks by age, race, and Hispanic origin must be available by March 31 of the following year for use in legislative redistricting. The ICM method must be operationally feasible within these constraints. (In Chapter 2, however, we argue that a superior method should not be disqualified by this criterion alone.)

5. Finally, it must be possible to demonstrate that the method can meet the above criteria when implemented on a large scale. In practice, this means that the ICM method chosen must be thoroughly evaluated prior to going into the field in 2000.

Alternative Methods for Integrated Coverage Measurement

Since 1990 three basic designs for integrated coverage measurement have received serious attention: (1) a modified PES, modeled after the 1990 methodology but taking account of experience gained in 1990 and subsequent analyses; (2) a new method called CensusPlus; and (3) another new method called SuperCensus. All three methods estimate census coverage using information collected by a coverage measurement survey conducted in a sample of blocks.
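Before turning to the specific methods, the bias-variance balance invoked in criteria 1 and 2 can be made concrete with a toy mean-squared-error comparison. The numbers below are invented purely for illustration: an unadjusted count with a large systematic undercount but no sampling error, versus a hypothetical ICM-adjusted estimate with small residual bias but added sampling variance.

```python
def mean_squared_error(bias, variance):
    # MSE decomposes into squared bias plus variance
    return bias ** 2 + variance

# Hypothetical comparison for one area's estimate (units: persons):
mse_unadjusted = mean_squared_error(bias=40.0, variance=0.0)   # 1600.0
mse_adjusted = mean_squared_error(bias=5.0, variance=400.0)    # 425.0
```

Even though the adjusted estimate carries sampling variance, its mean squared error is far smaller here, illustrating why neither bias nor variance alone is an adequate criterion.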
The 1990 PES consisted of two surveys, one to measure census omissions and one to measure erroneous enumerations, conducted in identical samples of 5,000 block clusters. Methods were developed for adjusting census data for all subnational geographic units and for demographic groups. The 1990 PES has been extensively documented (Hogan, 1992, 1993; Mulry and Spencer, 1991, 1993; Belin et al., 1993).

The DSE methodology used in 1990 is valid under the statistical assumption of independence between capture in the census enumeration and capture in the PES sample within each poststratum. Great efforts were taken to make the 1990 PES operationally independent of other census operations. Much of the statistical controversy over the use of DSE in 1990 focused on the accuracy of the assumption of independence and on the effect of lack of independence on the validity of the DSE (e.g., Freedman et al., 1993, 1994; Freedman and Wachter, 1994).
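The dual-system (capture-recapture) logic behind the DSE can be sketched in a few lines. This is a minimal illustration for a single poststratum with hypothetical counts; the actual 1990 estimation also involved survey weights, treatment of unresolved match cases, and smoothing across strata, all omitted here.

```python
def dual_system_estimate(matched, census_only, survey_only):
    """Lincoln-Petersen estimate of a poststratum's true population size.

    matched:     people found by both the census and the coverage survey
    census_only: people counted in the census but missed by the survey
    survey_only: people found by the survey but missed by the census
    Valid under independence of capture in the two systems (the
    assumption discussed in the text).
    """
    census_total = matched + census_only   # N1: census count
    survey_total = matched + survey_only   # N2: survey count
    return census_total * survey_total / matched  # N1 * N2 / M

# Hypothetical poststratum: 900 matched, 60 census-only, 90 survey-only
n_hat = dual_system_estimate(900, 60, 90)   # 1056.0
coverage = (900 + 60) / n_hat               # census coverage rate, about 0.91
```

Correlation between the two systems (people hard to reach in both the census and the survey) violates the independence assumption and biases such an estimate downward, which is the core of the controversy the text describes.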
The assumptions and estimation methods associated with the PES operation have been subjected to much scrutiny. It is important to note that, in 1990, coverage evaluation using PES data and the DSE was studied more than any other coverage evaluation method applied to any other census in the world. The extensive program of research evaluating the 1990 PES methodology, including criticisms of that methodology, has played an important role in developing plans for integrated coverage measurement for the 2000 census. We believe that, with modifications, DSE based on a PES could be used in the 2000 census, satisfying the above criteria, if other methods are judged to be unreliable or infeasible. We also believe that the controversies that occurred in the 1990 PES program could be resolved if a larger sample were used and if past experience and criticisms of the 1990 PES were taken into account.

CensusPlus uses intensive enumeration methods and highly trained interviewers with the objective of obtaining a complete enumeration of the true population in a sample of blocks. As with the PES, regular census operations, including the mailout of census forms and NRFU, also take place in the blocks sampled for ICM.

CensusPlus is properly understood as an ICM field operation, and at least two estimation strategies could be applied to the data collected during CensusPlus. A ratio estimator (described below) is perhaps most natural, although a variant of dual-system estimation could also be used. Under ratio estimation, capture in the ICM blocks chosen for intensive enumeration is assumed to occur with probability one. That is, the ratio estimation methodology assumes that a complete enumeration can be obtained for the blocks included in the ICM sample. In subsequent sections, we consider possible problems and discuss evaluations related to this assumption.
The assumption of complete coverage replaces the independence assumption implicit in use of the DSE after a PES, although it would still be important that the CensusPlus operations be conducted in a way that does not make the sample blocks atypical with respect to the conduct of primary census operations. The ICM enumeration involves adding people found in ICM who were omitted from the census and deleting people who were included in the census but found by ICM to have been erroneously enumerated. The coverage rate for each estimation stratum may then be estimated as the ratio of the count obtained by pre-ICM operations in sample blocks to the corresponding count after completion of ICM.

The logic of the method can be illustrated as follows. We compare the CensusPlus count for a given stratum (e.g., black males ages 20-34 in urban areas in the Northeast who rent instead of own their home) in the ICM sample blocks with the count obtained from the census enumeration, assignment, and NRFU in those blocks. The ratio of these numbers is a coverage factor that can be applied to counts from non-ICM blocks. CensusPlus will be tested in the 1995 census test. Details of the implementation of CensusPlus are discussed below, and estimation methods are considered in a later section.
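The coverage-factor arithmetic just described can be sketched as follows; the stratum counts are hypothetical.

```python
def coverage_factor(census_count_in_sample, resolved_icm_count_in_sample):
    """CensusPlus coverage rate for a stratum: pre-ICM census count in the
    ICM sample blocks divided by the resolved CensusPlus count, assuming
    the resolved roster is complete (capture probability one)."""
    return census_count_in_sample / resolved_icm_count_in_sample

# Hypothetical stratum: regular operations counted 4,500 people in the
# sample blocks; the resolved CensusPlus roster contains 5,000.
factor = coverage_factor(4500, 5000)   # 0.90
# Applying the factor to a non-ICM census count of 90,000 for the stratum:
adjusted = 90000 / factor              # 100,000.0 estimated persons
```

If the resolved roster is itself incomplete, the factor is too large and the adjusted count too small, which is why the completeness assumption is singled out for evaluation below.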
SAMPLING AND STATISTICAL ESTIMATION 111 SuperCensus, like CensusPlus, would involve selecting a sample of blocks or block clusters and striving for a complete enumeration of all housing units and persons in those blocks. In the SuperCensus method, however, implementation of the special enumeration methods would begin on or before Census Day; no regular census-taking would be done in those blocks and no counts would be obtained there for non-ICM census methods. Coverage factors would therefore have to be based on a count that is available before the census specifically, the number of housing units. The number of people per housing unit in SuperCensus sample blocks would be calculated, and this estimate would be applied to housing unit counts for other blocks in the corresponding area to estimate population in the nonsampled blocks. Little evaluation of other census operations would be possible because cen- sus mailout would not take place in the SuperCensus sample blocks. For this reason and because of the possibility that ratios of people to housing units would be too variable to permit accurate estimates, the panel expressed reservations in its interim report about use of the SuperCensus method. The Census- Bureau has subsequently rejected SuperCensus as a method for ICM in either the 1995 cen- sus test or the 2000 census. CensusPlus in the 1995 Census Test The Census Bureau has decided to test a version of the CensusPlus method in the 1995 census test. The proposed implementation involves conducting CensusPlus operations concurrently with regular census operations in the ICM sample block in order to facilitate identification of residence on Census Day and to improve the ability to produce final census results by legal reporting deadlines. The procedure has also been designed to distinguish housing unit coverage errors in the Master Address File (MAP) from coverage errors that occur during census enumeration of housing units. 
The PES will not be tested in 1995 because much more is known about it than about CensusPlus in terms of implementation, operational feasibility, the empirical validity of the assumptions on which it is based, the degree to which the assumptions can be checked, whether bias and precision can be controlled at acceptable levels of cost, and whether it can produce results within time constraints.

Current plans for implementing the CensusPlus ICM method in the 1995 census test are described briefly here; the proposed implementation for the 2000 census would be similar except in the extent of the operations. First, the sample of ICM blocks or block clusters will be selected. Current plans call for a sample of 100 to 200 ICM sample blocks at each of the four 1995 census test sites. Early in the year, prior to the census mailout, interviewers will canvass the ICM sample blocks to construct an independent listing of housing units (and addresses). This list will then be matched to the MAF, the frame for enumeration
and NRFU in non-ICM census operations, generating two lists: (1) housing units that were found by the ICM canvass but missed in the MAF and (2) housing units that were included in the MAF. The two lists of units will be followed up in the housing unit coverage and within-housing-unit coverage portions of ICM, respectively.

The housing unit coverage operation is designed to check the completeness of the MAF and estimate: (1) the number of housing units that were omitted from the MAF (and therefore from the frame for mailout and NRFU) and (2) the number of people omitted because they were in these housing units. ICM interviewers will go into the field immediately after Census Day to check on the accuracy of the listing of these units in the ICM canvass and to enumerate the households living there or determine that the units are vacant. Housing units that appear in the MAF but not in the independent listing will also be flagged for attention as part of this operation.

Housing units in the within-housing-unit sample will be followed up as their census returns come in, whether by mailback, by other enumeration methods, or by NRFU interviews. (In order to avoid confusion over which households were included in the NRFU sample, NRFU will be carried out on a 100 percent basis in these blocks regardless of whether a block or unit sample is used in other blocks.) Computer-assisted telephone interviews will be attempted first, and computer-assisted personal interviews will be used to follow up households not contacted by telephone. Each responding household will be given a two-part reinterview by an ICM interviewer. First, the interviewer will collect information similar to that on the census form (but possibly including more detailed and probing questions) to construct a roster of persons living in that household.
The computer will then reveal to the interviewer the roster from the original census response, showing discrepancies from the reinterview. In the reconciliation phase of the reinterview, the interviewer will attempt to resolve these discrepancies in order to come up with an accurate roster, using information from both the original response and the reinterview. The use of the computer makes it possible to conduct both an independent reinterview and the reconciliation in the same telephone call or personal visit.

Some housing units will be resolved as vacant by NRFU; these will be rechecked by ICM interviewers in order to verify that the units are in fact vacant or, when a unit is not vacant, to conduct an interview in order to obtain information on the household living there. Conversely, ICM interviewers may determine that some households enumerated by mailback or NRFU were erroneously enumerated and should be removed from the roster. The end product of these operations is a corrected or resolved roster of both housing units and people in the ICM sample blocks, from which resolved counts of units and of people by age, sex, race, and other variables would be calculated.

It should be clear from the above description that there are many general similarities between the CensusPlus methodology as outlined above and the PES.
The specifics of the implementation of CensusPlus, however, include several important differences from the PES as conducted in 1990:

1. ICM operations would be conducted simultaneously with other census operations, although any given housing unit should not be contacted for ICM until after other census operations for that unit have been completed. This would permit ICM to begin much earlier than the PES did (as early as Census Day), spreading out the workload and moving up the anticipated completion date for ICM. This is one of the most important potential benefits of the CensusPlus design. The temporal interpenetration of mailout-mailback, NRFU, and ICM is illustrated in Table 4.1. Control of these simultaneous operations is predicated on improved management capabilities and, in particular, on very good control of addresses using the TIGER geographic information system (see Chapter 2) in order to separate distinct operations that may be going on at the same time in adjacent housing units.

2. The ICM interview will not be completely independent of other census information, because names from the previous response will be available for matching and reconciliation on the spot, eliminating in most cases the need for an additional contact to resolve discrepancies as in the 1990 PES. The independence of the first phase of the reinterview, however, provides a useful check on the accuracy of both the reinterview and the original response. The same comment applies to the independence of the ICM housing unit listing relative to the MAF.

3. The ICM interview will attempt to establish an accurate roster for the household as of Census Day. The 1990 PES was defined to include people at the sample address at the time of the PES; this roster could be different from the Census Day roster because households moved in the intervening months.
CensusPlus must be able to follow up people who move out of the ICM sample blocks after Census Day; this may be facilitated by the relatively short interval between Census Day and the time of the CensusPlus interview.

4. The 1995 CensusPlus (and any ICM methodology used in 2000) may benefit from research that has been conducted on enumeration methods and administrative records since the last census. In hard-to-enumerate areas, special methods from the tool kit (see Chapter 3), especially those judged to be effective but too expensive to use except in a sample of blocks, can be employed in order to improve the coverage of the ICM effort. Information from other sources, such as administrative lists, could also be matched with the census file and made available for the reconciliation phase.

TABLE 4.1 Temporal overlap of mailout-mailback, NRFU, and ICM operations [table content not legible in the scanned source]

Issues for Evaluation of CensusPlus Methodology

The CensusPlus procedures proposed for the 1995 census test have some very attractive new features. We congratulate the Census Bureau for developing this new design, which represents a many-sided rethinking of coverage measurement methodology. At the same time, however, we believe that two critical issues about CensusPlus methodology must be evaluated in 1995 before it can be adopted for use in 2000:

1. Can CensusPlus be conducted without affecting the results of the regular enumeration in the CensusPlus sample blocks?

2. Can CensusPlus attain near-perfect coverage in the sample blocks?

The answers to these questions will determine the accuracy of the inputs from CensusPlus to the estimation phase of ICM (in particular, to calculation of the numerators and denominators of estimated coverage rates) and consequently the validity of final population estimates.

Some of the particular issues to be addressed by evaluations have been identified by Singh (1993) and Thompson (1993) and have been discussed by Census Bureau staff and panel members. Evaluations of these issues in the 1995 census test have been given high priority by the Census Bureau. These evaluations will also guide the formulation of the final plans for 2000; they may lead to incremental modifications (such as changing the scheduling of particular operations) or major revisions. We focus attention first on the two key issues that must be addressed by these evaluations. These are of particular importance because they are fundamental to validation of the assumptions that underlie the CensusPlus methodology.

Can CensusPlus be conducted without affecting the results of the regular enumeration in the CensusPlus sample blocks? The census coverage rates measured in the ICM sample blocks can be regarded as valid estimates of the coverage rates in other blocks only if the conduct of the census is essentially indistinguishable in the sample and nonsample blocks. As noted above and in Table 4.1, the ICM operations will overlap in time with other operations.
It is therefore possible that the conduct of the ICM operations will have some direct effect on the pre-ICM census counts obtained in the sample blocks, causing them to be systematically either larger or smaller than what they would have been if ICM had not occurred there. This effect is referred to as contamination of the census by ICM. Contamination can lead to bias in CensusPlus estimates because, if there is contamination, coverage rates measured in sample blocks will differ systematically from coverage rates in other blocks. A number of potential forms of contamination have been identified.

· The precensus canvass for housing units conducted in ICM blocks may affect awareness of the census and consequently response to the regular census in those blocks, particularly if census personnel knock on doors to verify the existence of housing units.

· The responses of residents in sample blocks may be affected by their awareness of the presence of ICM interviewers, particularly if an ICM interviewer accidentally approaches a household that has not yet made its mailback or NRFU response.

· It may be difficult to determine that a household that has not responded to the census at a particular time will probably never respond and can therefore be approached for an ICM interview; if the treatment of the cutoff date for response is different in ICM blocks than in other blocks, then ICM contaminates the census.

· NRFU interviewers may become aware of the presence of ICM interviewers and make special efforts in ICM sample blocks to obtain more complete follow-up results than in other blocks.

· We expect that some housing units missed by the MAF in non-ICM blocks will still get counted (census adds) due to fortuitous contacts by census interviewers or respondent-initiated responses (possible through unaddressed questionnaires). Early ICM interviews could forestall some of these responses in ICM blocks.

These issues are particularly salient if more intrusive and public ICM tool-kit methodologies, such as team or blitz enumeration, are employed.

Several evaluations may be directed at this issue in 1995. The Census Bureau should compare the results from the enumerative part of the census in ICM blocks with those from non-ICM blocks. These comparisons should be disaggregated to focus attention on effects on distinct aspects of primary data collection, such as census adds or late mail returns, that are particularly likely to be affected by ICM. Because of the small ICM sample size in the 1995 census test and the lack of direct measures of some of the effects, it will be hard to conclude that there are no significant biases due to contamination on the basis of purely quantitative comparisons. Further evaluations could look for direct evidence of contamination by debriefing enumerators and ICM interviewers and reinterviewing respondents.
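One form the suggested quantitative comparison could take is a pooled two-proportion z test on, say, mail return rates in ICM and non-ICM blocks. The counts below are hypothetical, and, as noted above, such a test will have limited power at 1995 test sample sizes (and ignores the clustering of households within blocks, which would further widen the true standard error).

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for comparing a rate (e.g., mail return rate)
    between ICM sample blocks and non-ICM comparison blocks."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                    # pooled rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical: 1,150 of 1,800 ICM-block households mailed back (63.9%)
# versus 6,100 of 10,000 households in comparison blocks (61.0%)
z = two_proportion_z(1150, 1800, 6100, 10000)   # roughly 2.3
```

A large positive z here would suggest that ICM activity raised mail response in the sample blocks, i.e., contamination; a nonsignificant z, however, would not rule out contamination, for the power reasons noted in the text.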
If these evaluations identify problems with contamination, they may be addressed through changes in the timing of ICM operations or other operational changes to keep them more strictly separated from enumerations.

Can CensusPlus attain near-perfect coverage in the sample blocks? The proposed ratio estimator, described above, is based on the assumption that the resolved roster in ICM blocks can be treated as the truth for those blocks. First and foremost, there is the problem that many individuals in our society are difficult to count. Comparisons of 1990 coverage measurement results to DA suggested that, at least in some groups, a substantial number of people were missed by both the PES and the regular census, and that the number of these people was underestimated by DSE (Bell, 1993). CensusPlus may be no more successful at finding the very toughest households and individuals; hence, the resolved roster will probably be incomplete. However, we caution against using this concern by itself to dismiss any ICM method. Even if no method can totally eliminate the
differential undercount, which is probably the case, ICM may well lead to substantially better estimates. We see several other challenges to this coverage assumption that apply more specifically to CensusPlus.

· Does the reconciliation phase of the reinterview give an accurate roster? The two-phase reinterview and reconciliation envisaged for CensusPlus provides important operational advantages, but it is also a new and as yet untested methodology. The technical aspects of the computer-aided reconciliation will challenge the capabilities of matching technology to support integration of two or more lists on the fly, but the deepest issues may lie in the dynamics of the reinterview process itself. Previous research on reinterview methodologies (Biemer and Forsman, 1992) suggests that a respondent confronted with discrepancies between two interviews may tend to seek a resolution that is consistent with the most recent responses; the interviewer may also prefer to obtain an outcome at reconciliation that is consistent with information recorded just before by the same interviewer.

· Can the CensusPlus methodology identify erroneous enumerations? In particular, one form of erroneous enumeration is duplication, i.e., the listing of one person in more than one place. Detection of duplications must involve at least some searching in blocks adjacent to the blocks that were sampled. How should the search area be defined? What information should be used to define matches or mismatches? There is a trade-off between improved accuracy and increasing cost as the size of the search area is increased; it may be possible to reduce bias at minimal cost by using an extended search area for a sample of the cases.
· Is it possible to resolve place-of-residence ambiguities, for example, when the true residence is close to, but not in, a given CensusPlus sample block, or when a person could plausibly be regarded as resident at any of several addresses?

· Can the ICM instrument find the people who lived in the sample blocks on Census Day, even if they have moved since then? And can it distinguish them from people who moved in after Census Day? Because in ICM blocks the ICM sample is accumulated on a flow basis as the census enumerations and NRFU are accumulated, it becomes important to determine whether information added actually pertains to residence on Census Day. If, for example, enumerations in the ICM samples add people who moved into the blocks in question after Census Day, then this would create an upward bias in the population total, with an obvious effect on the validity of estimates.

· At least two modes and instruments will be used in the ICM interview: the CATI and CAPI versions. The possibility of mode effects (systematic differences in responses to the different interview modalities) in enumerating people in ICM blocks should be considered and their effects on estimates analyzed.
Recommendation 4.3: The Census Bureau should investigate during the 1995 census test whether the CensusPlus field operation can attain excellent coverage in CensusPlus blocks without contaminating the regular enumeration in those blocks. If substantial problems are identified, CensusPlus should not be selected as the field methodology for integrated coverage measurement in the 2000 census unless clearly effective corrective measures can be implemented within the research and development schedule.

Evaluations should be designed to respond to these issues. The following suggestions address particular concerns in CensusPlus evaluation and are among the ideas that could be implemented in 1995 or later tests.

· It will be quite difficult to demonstrate perfect coverage of the resolved CensusPlus roster, but some evaluations are possible that would help to assess the quality of its coverage. One would be to find a third source of names and perform triple-system estimation (Marks et al., 1974; Zaslavsky and Wolfgang, 1993) to evaluate the number missed in both of the original lists. Possible sources would be an administrative list that was not used either in the original enumeration or in constructing the ICM roster, or a list from a particularly intensive form of enumeration, such as observation by a resident ethnographer. The objective would be to seek out people who are particularly hard to count.

· It would be useful to conduct some experiments to evaluate the effect of the design of the ICM reinterview. For example, the reconciliation phase for some fraction of the cases could be carried out by experts different from the original ICM interviewers, and the results compared with those obtained in similar households when the interview and reconciliation are carried out in the same session. A careful study of these dynamics under cognitive laboratory conditions may also be helpful.
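Triple-system estimation can be implemented in several ways; one standard closed form, from the log-linear model with no three-way interaction among the lists (Bishop, Fienberg, and Holland), estimates the number of people missed by all three sources. The cell counts below are hypothetical and the sketch omits the weighting and stratification a real application would need.

```python
def missed_by_all_three(n111, n110, n101, n011, n100, n010, n001):
    """Estimate of the unobserved cell n000 under the no-three-way-
    interaction log-linear model.  In n_abc, each index is 1 if the
    person appears on that list (census, ICM roster, third list) and
    0 otherwise."""
    return (n111 * n100 * n010 * n001) / (n110 * n101 * n011)

# Hypothetical cell counts for one stratum
n000_hat = missed_by_all_three(n111=800, n110=120, n101=60, n011=90,
                               n100=100, n010=70, n001=50)
# Estimated stratum total: all seven observed cells plus the estimate
total_hat = 800 + 120 + 60 + 90 + 100 + 70 + 50 + n000_hat
```

Unlike dual-system estimation, this model tolerates pairwise dependence between any two lists, which is why a third source can help detect people missed by both the census and the ICM roster.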
Some information will be gained by carefully monitoring particular ICM operations, such as follow-up of movers.

· It may be possible to deliberately salt some of the data with information that is incorrect (enumerations or deliberate omissions) but plausible, in order to measure the success of the ICM reinterview in detecting and correcting these cases.

Other Issues for ICM Methodology

In addition to the two major issues addressed above, there are many others to be considered in design and evaluation of the new ICM methodology. We mention several of them here.

· The Census Bureau intends to experiment with use of administrative records as part of integrated coverage measurement (see Chapter 5). Administrative records can be used to provide additional addresses for the independent address listing and also as a source of additional names to be checked in the ICM reinterview. The Census Bureau should study whether administrative records can contribute to the accuracy and completeness of ICM.

· Other innovations in census-taking, including the use of forms received from nontraditional sources and returns received by phone (reverse CATI), need to be examined in relation to their effects on assumptions used in ICM.

· How will the use of tool-kit methods be integrated with ICM? The use of special tool-kit methods for hard-to-count populations has been endorsed by the panel as a way to increase response rates and improve the reliability of data. These methods also hold promise for improving coverage of the ICM survey. Some decisions must be reached as to which tool-kit methods should be used in primary enumeration and which in ICM. Some consideration must also be given to the effect of the former on the conduct of the ICM survey, since tool-kit methods may involve very different schedules of census activities from those in most areas.

· What sample size is required to obtain acceptable ICM estimates? Simulations based on the 1990 PES, together with information from field tests, can be used to determine optimal design and sample size. We anticipate that a block sample for ICM that includes at least 300,000 households (double the number of households included in the 1990 PES) would be required for ICM to succeed in 2000. Considerations in selection of sample size are taken up again in the next section in the context of estimation.

Recommendation 4.4: Whatever method for integrated coverage measurement is used in 2000, the Census Bureau should ensure that a sufficiently large sample is taken so that the single set of counts provides the accuracy needed by data users at pertinent levels of geography.
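The relation between ICM sample size and precision can be roughed out with a simple binomial calculation. This is only a sketch: it treats captures as independent Bernoulli trials, and the persons-per-household figure, the coverage rate, and the design effect are all hypothetical round numbers; clustering in a real block sample would inflate the variance well beyond the independent case.

```python
import math

def coverage_rate_se(p, n_households, persons_per_household=2.6, deff=1.0):
    """Approximate standard error of an estimated coverage rate p from a
    household sample, with a multiplicative design effect for clustering."""
    n_persons = n_households * persons_per_household
    return math.sqrt(deff * p * (1.0 - p) / n_persons)

# 300,000 households (the sample size suggested above), coverage rate 0.98:
se_simple = coverage_rate_se(0.98, 300_000)            # about 0.00016
# A design effect of 4 doubles the standard error:
se_clustered = coverage_rate_se(0.98, 300_000, deff=4.0)
```

Even these rough numbers show why national-level coverage rates are easy to estimate precisely while rates for small poststrata or subnational areas, which divide the same sample many ways, drive the required sample size.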
STATISTICAL ESTIMATION

Estimation and the One-Number Census

Census design features being considered for 2000 will create new demands for statistical estimation methods. Each of the methods described previously in this chapter (sampling for nonresponse follow-up and integrated coverage measurement) requires a corresponding estimation strategy and research on particular aspects of implementation, as do some uses of administrative records and other additional sources of information. We briefly summarize these areas here and provide further details below.

· Nonresponse follow-up sampling. Estimates must be obtained of numbers and characteristics of people and households who would have been found in each block during nonresponse follow-up had all households been included in the
nonresponse follow-up sample. The information that can be used in this estimation process includes the number and characteristics of people found during nonresponse follow-up in sample blocks or households, the number of unresolved nonresponse addresses in the nonsample blocks or households, and the number and characteristics of people found during unsampled census operations (mailback and other presampling responses).
· Coverage estimation. Methods such as models or poststratification schemes must be developed for summarizing patterns of undercoverage. Formulas must be developed for estimating net coverage rates from ICM data.
· Additional information sources. Inclusion of new information sources into the census, such as administrative records and multiple response modes, may create new demands on estimation methodologies.
· Population estimates for small areas. Results from NRFU sampling and ICM must be combined to produce estimates of population for small areas, down to the level of individual blocks. At the most detailed level, methods will be required for incorporating estimated persons and households into individual blocks, creating units with realistic characteristics in such a way that additivity is maintained across levels of geography (i.e., the total of counts for a collection of smaller areas equals the count for the larger area that they constitute).

The fundamental principle of the one-number census, as described above, is that a single set of population numbers will be produced, incorporating estimation as well as counting and assignment. Although in a sense the reported census counts have always been estimates, in the 2000 census for the first time the role of estimation as part of the census process will be made explicit.
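As a toy illustration of the nonresponse follow-up bullet above, a simple expansion (inverse-probability) estimator can scale counts observed in the followed-up sample up to the full set of nonrespondents. The function name and all numbers below are hypothetical; the Census Bureau's actual estimators would be considerably more elaborate.

```python
# Hypothetical illustration of estimating nonrespondents from a NRFU sample.
# All numbers are invented for exposition.

def expansion_estimate(sampled_counts, sampling_rate):
    """Estimate the total count among all nonresponding households from
    the counts observed in the followed-up sample, by inverse-probability
    (expansion) weighting."""
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling rate must be in (0, 1]")
    return sum(sampled_counts) / sampling_rate

# Suppose 1,000 addresses did not mail back, a 1-in-3 sample is followed up,
# and the sampled addresses yield 600 persons in total.
persons_in_sample = [600]                    # aggregated for brevity
estimated_total = expansion_estimate(persons_in_sample, 1 / 3)
print(round(estimated_total))                # 1800
```

More refined estimators of the kind the text goes on to discuss would borrow strength from mailback responses and block-level covariates rather than relying on expansion alone.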
Fundamental Criteria for Estimation Methodologies

A number of criteria should be considered in the choice of estimation methodologies; in considering these, estimation is inextricable from sample design.

1. Reduction of variance for given cost. The variance of estimates depends on the sample design, the sample size, and the estimation procedure. Efficiencies gained in estimation may be used to reduce variance (improve precision), to reduce cost, or some combination of the two.
2. Reduction of bias. The term bias is used here in its strict statistical sense, meaning that an estimate does not, even on the average over hypothetical repeated surveys, give exactly the value that would be obtained if the entire population were observed. This does not imply intent or unfairness. Some biases will inevitably exist in a survey like this one, which is directed at a complex and heterogeneous population and subject to nonresponse.
Estimation decisions require trade-offs between bias and variance. Procedures that reduce variance through modeling may create biases at some level, even if the models are very simple. For example, if an attempt were made to
calculate unbiased ICM estimates for very fine levels of detail, the variances of the estimates would be very large because each area might have only a few sample blocks. However, estimators that reduce variance by sharing information across wider areas risk creating bias for small areas if some parts of the larger area actually had poorer coverage during the census enumeration than did other parts. This example illustrates that it is important to consider different levels of geography: variance tends to predominate for small areas but, since variance is inversely proportional to sample size, bias becomes relatively more prominent in larger areas. Thus, different estimators may look best depending on the level of geography considered important. "Loss measures" such as weighted mean squared error are credible ways of combining estimated bias and variance into a single measure at any selected level of geography.
3. Simplicity and explainability. Other factors being equal, a method that is simple and can be explained is preferable to a more complicated method. This objective may conflict with the first two criteria, because complex statistical methods may be required to optimize the bias-variance trade-off. Note, however, that methods that are not used for estimation because they are regarded as too complex may still be used for evaluation of simpler estimators. One aspect of simplicity is directness, the use of data from an area to produce estimates for the same area; some issues about directness are discussed below.

Specific Issues in Estimation Methodology

It appears likely that estimation will be divided into two stages. One stage is concerned with estimating the population in nonsample nonresponding households, i.e., those that did not respond by mail and were not included in the NRFU sample.
The other stage uses ICM data to estimate the difference between the numbers and characteristics of households and people estimated by all stages preceding ICM and the true population and characteristics. ICM estimation may be further divided into two aspects. The first aspect concerns the way in which ratios or factors are estimated based on data in sample blocks. The second concerns how these estimates are applied to pre-ICM census counts (i.e., those based on direct enumeration, assignment, and estimation from the NRFU sample).

Estimation Methods for NRFU Sampling

If NRFU is conducted in a sample of households, it will be necessary to calculate estimates of numbers and characteristics of households at addresses that were not included in the NRFU sample. In other words, estimates must be obtained for numbers of households with various characteristics and for the number of addresses that are vacant. Under a block sampling design, if households are added through NRFU field operations that are not at addresses on the precensus address list (census adds), it will also be necessary to estimate the number
of census adds that would have appeared in the NRFU nonsample blocks. Several sources of information are available for use in estimation of these numbers for each target block for which estimation is required: (1) numbers and characteristics of households that responded to the mailback census in the target block and every other block, (2) numbers and characteristics of households at addresses that did not respond by mail but were included in the NRFU sample, in other blocks, and (under a unit-sampling design only) in the target block, (3) administrative list or other auxiliary information for the target block, and (4) other block-level covariates. The Census Bureau has used relatively simple estimators so far in research to contrast unit and block designs (Fuller et al., 1994). More sophisticated estimators that try to optimize use of the various information sources may lead to improved estimation. Some of the relevant concepts and considerations involved in use of these more complex estimators are discussed below in the context of ICM.
The main basis for considering estimation methods and accuracy will be simulations using 1990 data, even after 1995 census test results become available. Only simulations using data with complete follow-up can provide data for evaluating the accuracy of various estimation schemes under various sample designs. The 1995 census test results will primarily answer questions about operational and cost issues associated with NRFU sampling.

Estimation Methods for ICM: Estimating Factors

In the 1990 undercount estimation program, adjustment factors were calculated for domains or cells defined by geography and population characteristics that cut across state lines. These factors were initially estimated using dual-system estimation.
With this estimator, the estimated adjustment factor for a domain takes the following form:

    A = [(C0 − I0)/C0] × [(C − E)/C] × (P/M)

where C0 is the census total including imputations, I0 is the count of imputations and other unresolvable cases, C is the estimated number of persons actually enumerated in the census, E is the estimated number of erroneous enumerations (so C − E is the estimated number correctly enumerated), P is the estimated number of persons actually in the area (the P-sample count), and M is the estimated number of persons correctly enumerated by the census (the matched part of the P-sample). The product of these three factors, by cancellation, is the estimated ratio of people actually in the domain to the census totals including imputations. The first factor is the fraction of the census counts (C0) that represent people actually enumerated, excluding imputations and other cases for which matching and follow-up are impossible (I0). The second factor is an estimate of the fraction of the people enumerated in the census (C) who were not erroneous enumerations (E).
The third factor is the inverse of the estimated fraction of people in the whole population (represented by P, the P-sample or PES count) who were included in the census (M, or matched, part of the P-sample). The first factor is calculated from the complete census (up to but not including coverage measurement), while the second and third factors are based on the coverage measurement survey (hence the distinction between C and C0).
This estimator is valid under the assumption that (within any given estimation cell) people in the P-sample are included in the census at the same rate as those who are not in the P-sample (the assumption of independence). It therefore implicitly includes an estimate of the number of people who would be missed by both census and PES even if their block were included in the sample, the so-called fourth cell (in the two-by-two table of inclusion in the census by inclusion in the ICM survey). Some research conducted prior to and around the 1990 census suggests that the number of people in this cell may be larger than the number predicted under independence, at least for some demographic groups, causing the estimated adjustment factors to be biased downward compared with the correct adjustment factors (the ratio of true to census population) (Bell, 1993; Zaslavsky and Wolfgang, 1993). This bias may be explained by heterogeneity in the probability of enumeration among people in the same adjustment cell, the problem of the hard-to-enumerate population (Alho et al., 1993; Darroch et al., 1993), but such effects can only be estimated using supplementary information of some kind.
The estimation method that would be used with the proposed ICM methodology has not been determined. One possibility would be an estimator similar to the DSE used in 1990, treating the census (including NRFU) as the first source and the ICM interview (before reconciliation) as the second source.
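The dual-system adjustment factor described above can be sketched in a few lines. The function and the illustrative counts are invented for exposition; the notation follows the text (C0, I0, C, E, P, M).

```python
# Sketch of the dual-system adjustment factor, using the text's notation:
# C0 = census total including imputations, I0 = imputations and other
# unresolvable cases, C = census persons subject to matching, E = erroneous
# enumerations, P = P-sample count, M = matched P-sample persons.
# All counts below are hypothetical.

def dse_adjustment_factor(C0, I0, C, E, P, M):
    data_defined = (C0 - I0) / C0     # fraction of census counts actually enumerated
    correct_rate = (C - E) / C        # fraction not erroneously enumerated
    inverse_match = P / M             # inverse of the estimated match rate
    return data_defined * correct_rate * inverse_match

A = dse_adjustment_factor(C0=10_000, I0=200, C=9_800, E=300, P=9_900, M=9_400)
print(round(A, 4))                    # 1.0005
```

With these invented counts the factor barely exceeds 1: the upward correction from unmatched P-sample persons roughly offsets the removal of imputations and erroneous enumerations.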
Another estimator, which we call the resolved population ratio estimator, would treat as the truth the counts in ICM blocks after enumeration of missed addresses and reconciliation of discrepancies between census and ICM enumerations (Wright, 1993). (This is the estimator assumed in the descriptions in the preceding section.) Adjustment factors would then be estimated as the ratio of these resolved counts to pre-ICM census counts. The resolved population ratio estimator is valid under the assumption that the reinterview and reconciliation process attains near-complete coverage. In the 1995 census test, the statistical robustness of the ratio estimator should be assessed by determining the effects of modest departures from complete coverage on the accuracy of ICM estimates. This estimator would tend to produce estimated factors that are smaller than those obtained from the DSE, but the differences should be small except for poststratum cells with large omission rates.
An important issue is the choice of cells or poststrata for which adjustment factors are calculated. In order to make the final stage of estimation as accurate as possible, the cells should be internally homogeneous but different from each other with respect to coverage rates. Cells for the 1990 PES were defined by age,
sex, race, type of place, tenure (owner/renter), and geographical region. Analyses of 1990 census data led to some changes in cell definitions for the 1992 decision on adjustment of postcensal estimates, with greater emphasis on the owner/renter distinction. With improved data processing in 2000, it may be possible to define cells using census process data, such as mailback rates and NRFU completion rates, which have been found to be strongly related to omission and erroneous enumeration rates.
The development of a new ICM methodology may provide an opportunity to consider variations on the estimation procedure other than alternative methods of calculating adjustment factors for cells. The fact that omitted addresses will be located in an operation distinct from that which identifies missed persons at listed addresses suggests the possibility of calculating an adjustment factor for the number of addresses or for the number of households. It would also be possible to distinguish between people omitted from the pre-ICM census in enumerated households and people omitted in households that were omitted from the census. Finally, models could be used to describe the omissions and erroneous enumerations in ways that are more complicated than simple estimates of ratios of true to enumerated counts in a cell, such as models with covariates for individual or household omissions.
Another feature of the estimation procedure in the 1990 effort was "smoothing" of the estimates through an empirical Bayes model (Hogan, 1993). In effect, this model combines the direct estimates based on sample data for each cell with estimates from a regression model fitted to data from all cells, weighting the direct estimates more heavily when they are more precise.
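The core of such smoothing is precision weighting: a minimal sketch of combining a direct cell estimate with a model prediction by inverse-variance weighting (all values hypothetical; the actual empirical Bayes model is far richer):

```python
# Precision-weighted combination of a direct cell estimate with a regression
# (model) prediction, in the spirit of empirical Bayes smoothing.
# Estimates and variances below are invented for illustration.

def smooth(direct, var_direct, model, var_model):
    """Weight each component inversely to its variance, so the direct
    estimate dominates where the sample data for the cell are strong."""
    w = var_model / (var_direct + var_model)   # weight on the direct estimate
    return w * direct + (1 - w) * model

# A noisy direct adjustment factor is pulled toward the model fit.
print(round(smooth(direct=1.08, var_direct=0.004, model=1.03, var_model=0.001), 4))  # 1.04
```

Here the direct estimate is four times as variable as the model prediction, so it receives only one-fifth of the weight.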
This methodology holds the promise of giving smoothed estimates with smaller variance than the direct estimates, but it also was a source of much controversy in evaluation of the 1990 estimates. This methodology was not used in calculation of the proposed 1992 adjustment of postcensal estimates. The smoothing methodology bears further consideration, especially if an attempt is made to calculate estimates for a refined set of adjustment cells. Further research will be required to make it sufficiently robust to stand up against possible criticisms of model bias or arbitrariness.

Estimation Methods for ICM: Carrying Estimates Down to Lower Levels

Let us assume for the moment that adjustment factors for cells have been calculated. Because the geographic detail for cells will almost certainly be much coarser than many of the units for which estimates are required (such as census tracts or small political subdivisions), there must be a procedure for carrying estimates down to lower levels. A synthetic estimation procedure was used for this purpose in the 1990 undercount estimation program, meaning that the population for each area was estimated by disaggregating the enumerated population for that area into the parts falling into each adjustment cell, multiplying the part in
each cell by the corresponding adjustment factor, and then summing the adjusted counts.
The validity of these synthetic estimates was criticized on the grounds that undercount is not in fact uniform across each adjustment cell. The question of how this lack of uniformity affects the accuracy of synthetic estimates of population (in absolute terms and in comparison to unadjusted enumerations) has been a subject of lively debate (Freedman and Navidi, 1986; Schirm and Preston, 1987, 1992; Freedman and Wachter, 1994; Wolter and Causey, 1991; Fay and Thompson, 1993). This question has been approached through theoretical investigation and through simulations based on coverage measurement data and census data for variables other than the undercount. Further research in this area before the 2000 census would make a useful contribution to design and validation of the estimation methodology.
Other methods have been proposed for carrying estimates down to small areas. One alternative, for example, is a simple regression methodology, with proportions from different adjustment cells as covariates (Causey, 1994) or with other variables as covariates (Ericksen and Kadane, 1985). Like the synthetic approach, these methods should be subjected to testing through simulations before a final decision is made.

Direct and Indirect Estimates

Direct estimates are defined as estimates based entirely on data from the domain for which the estimates are calculated. Indirect estimates make use of data from outside the domain. Simple survey estimates are direct estimates, whereas model-based estimates may be indirect. In particular, synthetic estimates are indirect because they apply factors calculated over an estimation cell to geographic areas that are smaller than the geographic extent of that estimation cell.
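The synthetic procedure itself is mechanically simple: disaggregate an area's enumerated population into adjustment cells, multiply by each cell's factor, and sum. A minimal sketch, with hypothetical cells and factors:

```python
# A minimal sketch of synthetic estimation: apply cell-level adjustment
# factors to an area's enumerated counts by cell, then sum.
# Cells, counts, and factors below are invented for illustration.

def synthetic_estimate(enumerated_by_cell, factors):
    return sum(enumerated_by_cell[cell] * factors[cell]
               for cell in enumerated_by_cell)

factors = {"owner": 1.01, "renter": 1.05}     # adjustment factors for two cells
tract = {"owner": 3_000, "renter": 1_000}     # enumerated persons in one tract
print(round(synthetic_estimate(tract, factors), 1))   # 4080.0
```

The criticism summarized above is directed precisely at the implicit assumption here: that the same factor applies uniformly to every small area within a cell.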
Empirical Bayes smoothing models are also indirect because the model component of the estimates is estimated across a large number of cells.
In the 1990 undercount estimation program, almost all cells cut across state lines. Consequently, estimates of population for states and subdivisions of states were synthetic and therefore indirect. This fact was grounds for some controversy over whether it was accurate and fair to estimate one state's population using data from another state, if conditions might in fact have differed among states. Evaluation studies directed at this question were inconclusive (Kim et al., 1991).
The Census Bureau is now considering requiring that direct estimates be obtained for all states. This would have major implications for design of the ICM sample, because even the states with the smallest populations would need substantial ICM sample sizes to obtain direct estimates of acceptable accuracy. The calculations of sample sizes could vary greatly, however, depending on the criterion of accuracy that is adopted. At one extreme, a criterion of equal coefficient
of variation of direct population estimates in every state (equal standard error of estimated ICM adjustment factors) would imply roughly equal sample sizes in every state, despite the 100-fold ratio of populations between the most and least populous states. Such a design might be drastically inefficient for estimation of adjustment factors for domains other than states. At the other extreme, a criterion of equal variance of direct population estimates for every state would imply larger sampling rates (and therefore disproportionately larger sample sizes) in larger states. Considerations based on minimization of expected loss (Spencer, 1980; Zaslavsky, 1993) would lead to other sample allocations with higher sampling rates but smaller absolute sample sizes in small states compared with large states.
A decision to require direct state estimates must be considered in light of its implications for accuracy of estimated population shares for other domains, such as urban versus suburban areas within states and white, black, and Hispanic populations in a region. We recommend that alternative ICM designs should be prepared with and without direct state estimates and that the added costs or loss of accuracy for other domains of interest should be made clear at a policy-making level before it is decided whether this feature is essential. Compromises may also deserve consideration, such as requiring direct estimation only for states and cities larger than some minimum size.
Use of direct estimation for some or all states would not preclude calculation of indirect estimates for subdomains within states, down to a level of detail similar to the poststratification in the 1990 PES.
For example, the total population of Idaho might be obtained from a direct estimate, but the estimate for urban Idaho (compared with suburban Idaho) might be based in part on adjustment factors calculated from data for Idaho, Wyoming, and Montana, and the estimate for Native American reservations in Idaho might use nationally estimated adjustment factors for Native American reservations. This could be accomplished, for example, by calculating synthetic estimates for all domains and then ratio adjusting them or raking them to match direct state estimates. Selection of estimators must be guided by an awareness that, although for some purposes state estimates are the most important product of the census, for other purposes the distribution of population within states, for example by race and urbanicity, is paramount.

Recommendation 4.5: The Census Bureau should prepare alternative sample designs for integrated coverage measurement with varying levels of support for direct state estimation. The provision of direct state estimates should be evaluated in terms of the relative costs and the consequent loss of accuracy in population estimates for other geographic areas or subpopulations of interest.
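The ratio adjustment of synthetic subdomain estimates to a direct state estimate, mentioned just before Recommendation 4.5, can be sketched as follows; all figures are hypothetical:

```python
# Sketch of ratio adjustment ("raking") of synthetic subdomain estimates so
# that they sum to a direct state estimate.  Figures are invented.

def rake_to_state_total(subdomain_estimates, direct_state_total):
    scale = direct_state_total / sum(subdomain_estimates.values())
    return {name: est * scale for name, est in subdomain_estimates.items()}

synthetic = {"urban": 600_000.0, "rural": 400_000.0}   # synthetic subdomain estimates
adjusted = rake_to_state_total(synthetic, direct_state_total=1_050_000.0)
print(adjusted)   # subdomains now sum to the direct state estimate
```

The within-state shares implied by the synthetic estimates are preserved; only the state total is changed to match the direct estimate.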
What Form Will Final Population Counts Take?

The Census Bureau has placed a high priority on making all census products consistent internally and with each other. Consistency requires that, when several published cross-tabulations share a common stub or margin, or when such a margin can be calculated by summing numbers from several tables, the margins of the various tables should agree with each other. For example, the total number of white males over age 18 in a state should be the same regardless of whether it is read off the state totals or calculated from tables by county, by tract, or by block. Similarly, published means or proportions should agree with quantities that might be calculated from tables.
A wide variety of census products are produced, and it is impossible to foresee all tabulations that will be generated as part of regular data series or special tabulations. Therefore, the surest way of guaranteeing this consistency is to produce a microdata file that looks like a simple enumeration of people and households but includes records for people estimated through NRFU sampling and ICM as well as those directly enumerated. Then tabulations and public-use microdata samples may be produced from this microdata file. There are two difficulties inherent to this approach. First, simple estimation methods such as those described above for ICM estimation produce adjustment factors that can be used to calculate expected numbers of people. They do not, however, describe the full detail required to create a roster of households that includes the estimated number of people in each estimation cell. Therefore, procedures must be developed that predict household structure for the "estimated" part of the population. These procedures could take various forms, ranging from arbitrary grouping of people to full probability modeling. Second, the rounding inherent in imputation of complete households adds noise to the estimates.
For example, if calculations based on the adjustment factors indicate that 0.15 person should be estimated in a block (beyond the enumerated roster) in each of 9 cells, then the total number of people added will have to be rounded to either 1 or 2, and the number in each cell to either 0 or 1. The requirements of creating realistic households may require even more rounding, since it may not be realistic to assume that the estimated people were necessarily in households of only one or two people, even if that is the average number estimated per block. A further complication is that there would be a stochastic element in this calculation. Both the rounding and stochastic components of error would be most noticeable at the most detailed levels of geography. At more aggregated levels, rounding could be controlled to area totals, and the stochastic component of error would tend to average out. This problem is another research area that will have to be considered in the years before 2000.
Weighting presents an alternative to imputation that avoids the above difficulties. But weighting possesses the disadvantage that it produces noninteger counts. Rounding must therefore be performed after the weighted tables are
created, thus complicating the task of maintaining consistency in all census products.
A minor, but possibly sensitive, issue is the treatment in estimation of counts associated with census forms that are collected late, after the implicit reference point of ICM. In principle, these should not be counted, because the factors derived from ICM do not include these forms in their base. Simply ignoring these forms might be a public relations disaster, however. One appealing solution would be to substitute these forms for households that would have been imputed for ICM, without changing the total number estimated for any geographical area. Similar issues may arise with respect to late mailback returns that come in after NRFU sampling and data collection have taken place.

Acceptable Accuracy for Estimates

The design of NRFU and ICM samples is motivated by considerations of desired accuracy at various levels of geographic aggregation. Under all designs within the current range of consideration, the NRFU sample will be much larger than the ICM sample (10-33 percent versus less than 1 percent), but the portion of the population that will be estimated and imputed on the basis of NRFU sampling (on the order of the mail nonresponse rate, about 30 percent in 1990) is also much larger than the portion estimated through ICM (on the order of 1-2 percent of pre-ICM totals, on average). Therefore, issues about the accuracy of estimates under NRFU sampling are concerned primarily with small levels of aggregation, such as blocks, tracts, and minor political divisions, and issues about ICM accuracy are concerned primarily with larger levels of aggregation, such as states, demographic groups in broad geographic regions, and cities. Early research (Fuller et al., 1994) suggests that coefficients of variation for block population estimates under NRFU sampling will be high.
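To see why block-level coefficients of variation are high under NRFU sampling, consider a toy simulation; the household sizes, sampling rate, and all other parameters here are invented and are not drawn from the research cited above:

```python
# Toy simulation of block-level noise under NRFU sampling: a block has 30
# nonresponding households, a 1-in-3 sample is followed up, and the sample
# sum is expanded.  All parameters are hypothetical.

import random
import statistics

def simulate_block_cv(n_households=30, rate=1/3, n_reps=5_000, seed=7):
    random.seed(seed)
    sizes = [random.choice([1, 2, 3, 4, 5]) for _ in range(n_households)]
    true_total = sum(sizes)
    estimates = []
    for _ in range(n_reps):
        sample = random.sample(sizes, int(n_households * rate))
        estimates.append(sum(sample) / rate)     # expansion estimate
    return statistics.pstdev(estimates) / true_total

print(f"block-level CV ≈ {simulate_block_cv():.2f}")
```

Even in this idealized setting the coefficient of variation for a single block is on the order of 10 percent, which is consistent with the point that precise block-level counts cannot be expected from a sampled follow-up.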
Research may lead to improved estimators that reduce variance for small areas, but it is unlikely that any estimator can make major gains in precision at this level. Precise block-level counts are rarely if ever needed, however, except as a means for building up estimates for larger levels of aggregation. Therefore, evaluation of NRFU sampling should focus on accuracy at the level of minor civil divisions, state legislative districts, and similar-sized units.
Conversely, evaluation of ICM should focus on accuracy at broader levels of aggregation. There are important interactions between ICM sample design, estimation methods, and the units for which ICM accuracy is measured. The ICM sample must be designed to give acceptable accuracy for important units such as those listed above. It would be desirable to attain a level of accuracy such that error for these units is smaller than the differential coverage of the pre-ICM phase of the census, but this may not be attainable within acceptable limits for the scale of ICM.
The Role of Demographic Analysis

Demographic analysis refers to the estimation of population using the basic demographic identities relating population to births, deaths, immigration, and emigration (e.g., Robinson et al., 1993, 1994). In practice, demographic analysis can be used to obtain population estimates at high levels of aggregation. Traditional methods for demographic analysis yield estimates of the national population cross-classified by age, sex, and race.
In the 1990 census, demographic estimates were used as a check on the aggregate accuracy of dual-system estimates of undercount by age x sex x race group (Robinson et al., 1993). They are particularly suited to this purpose because most of the components of demographic estimates are quite accurately determined, although estimates of undocumented immigration remain controversial (see, e.g., Bean et al., 1990). In addition, demographic estimates of sex ratios have been used to check the internal consistency of dual-system estimates from the 1990 PES (Bell, 1993). For this purpose, the demographic analysis estimates of sex ratios were regarded as more reliable than the demographic analysis estimates of population totals, and the results suggested a substantial undercount of black males, even after dual-system estimation. Demographic estimates were also used to evaluate the face validity of subnational dual-system estimates in the decision on adjustment of the 1992 postcensal estimates base.
More recent and ongoing efforts have attempted to develop improved demographic estimates for subnational geographic areas for the youngest and oldest segments of the population. Estimates for the population age 65 and over have been produced using state-level Medicare records, with allowance for state-by-state variability in Medicare enrollment rates and in percentages of eligible members of the population age 65 and over (Robinson et al., 1993).
State enrollment and eligibility rates are affected by such variables as citizenship status and the proportion of retirees who have held federal government jobs. The Census Bureau has also worked to develop subnational estimates for the population under age 10, using vital statistics and estimates of interstate migration (Robinson et al., 1994). These estimates require assumptions about the state-to-state variability in migration patterns, the completeness of birth records, and the use of valid residence definitions during hospital birth registration.
These research programs are breaking important new ground, but it would be premature to judge the credibility of these estimates. As noted above, the estimates are based on a number of assumptions that require further evaluation. Also, a major limitation of demographic analysis is that the uncertainties in estimates provided by the method are largely unknown (see Clogg and Himes, 1993; Robinson et al., 1993).
Demographic analysis possesses a number of potential strengths as a method for coverage evaluation in the 2000 census: operational feasibility, timeliness (estimates could be available early in the census year), low cost, independence of
ICM, and comparability to historical series (Robinson et al., 1993, 1994; Clogg and Himes, 1993). But, in addition to the problem of measuring uncertainty, there remain significant difficulties associated with the use of demographic analysis that currently limit its role in the decennial census: the lack of reliable data on international migration, particularly for emigration and undocumented immigration; questionable measures of interstate migration, which limit the accuracy of subnational estimates; and problems with racial classification in birth and death records and their congruency with self-identification in the census, which affect estimates for all groups (especially Hispanics, Asians and Pacific Islanders, and Native Americans).
Because of the limitations of foreseeable progress on these methodological problems, we expect that demographic analysis will be useful primarily as an evaluation tool for ICM in the 2000 census. We assume that, as in recent censuses, national estimates of the population cross-classified by age, sex, and race will be produced for this purpose. It may also be possible and worthwhile to incorporate some demographic information (e.g., estimates of sex ratios) into the estimation procedure for integrated coverage measurement. However, based on the current state of research, we doubt that demographic analysis could be used in the 2000 census to adjust (or benchmark) final population estimates as part of integrated coverage measurement.
Nonetheless, the panel believes further research on demographic methods is a cost-effective investment that could pay long-term dividends beyond the contributions to census coverage and evaluation. An exciting new development is the convergence of demographic analysis, the postcensal estimates program, and the Program for Integrated Estimates in connection with the proposed continuous measurement system (see Chapter 6).
The common ground of these programs is that each of them uses a variety of data sources to improve estimation of population counts and characteristics without relying on the census itself. Demographic analysis traditionally has depended primarily on demographic data, as described above, in order to obtain very aggregated estimates; administrative records such as Medicare registrations have also begun to be used in this program. The postcensal estimates program (Long, 1993) combines census-year population counts with a variety of indicators of change in the population at local levels, such as school enrollments and changes in housing stock, to obtain estimates of the population at a fairly detailed level. The Program for Integrated Estimates (Alexander, 1994) would make use of a wide variety of sources, including data from a new survey and a variety of administrative records with coverage for people, households, or housing units, to produce detailed estimates of counts and characteristics down to the block and tract levels (see Chapter 5 for a discussion of the potential of health care records in this regard).

The three programs described above can be seen in a progression according to time of development (from several decades ago to the present and future), cost (from least to most expensive), operational difficulty (from easiest to most challenging), use of localized sources (from purely aggregate analysis to integration of microdata sources), and degree of small-area precision (from least to most detailed). Each step in this progression has been justified by contemporary needs, but each step also brings us closer to having the technical capabilities and experience required to obtain adequate information from administrative records without having to mount a full-scale census.

Recommendation 4.6: The panel endorses the continued use of demographic analysis as an evaluation tool in the decennial census. However, the present state of development does not support a prominent role for demographic methods in the production of official population totals as part of integrated coverage measurement in the 2000 census. The Census Bureau should continue research to develop subnational demographic estimates, with particular attention to potential links between demographic analysis and further development of the continuous measurement prototype and the administrative records census option.

Other Uses of Estimation

As discussed in Chapter 3, current proposals (Kalton et al., 1994) for enumeration of the homeless population call for service-based enumeration. Under these plans, persons making use of services, such as shelters and soup kitchens, would be enumerated on several different occasions. The lists from these enumerations will then be matched so that the degree of overlap in the service population from day to day can be determined. These data will be the basis for estimation of the total homeless population that makes use of these services. These estimation methods are related to dual-system estimation, but there are special complications because one site might be enumerated on several different occasions and because the same person could appear for services at more than one site.
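The dual-system logic that the service-based estimates build on can be shown in its simplest two-list form, the classic Lincoln-Petersen estimator. The sketch assumes independent lists, perfect matching, and a closed population, assumptions the multi-site, multi-day service data would strain; the function and all counts are hypothetical.

```python
def dual_system_estimate(n1, n2, m):
    """Lincoln-Petersen dual-system estimate of a total population.

    n1: persons counted on the first enumeration occasion
    n2: persons counted on the second occasion
    m:  matched persons appearing on both lists
    Assumes the two lists are independent and matching is correct.
    """
    if m == 0:
        raise ValueError("no matches: estimator undefined")
    return n1 * n2 / m

# Illustrative shelter/soup-kitchen counts on two days (hypothetical):
print(dual_system_estimate(n1=400, n2=350, m=175))  # 800.0
```

The complications the text raises, repeat visits to a single site and use of multiple sites by one person, are precisely the ways real service data violate the independence and matching assumptions, which is why the actual estimators must be more elaborate than this sketch.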
Statistical modeling and estimation may also play a role in the use of quality assurance (QA) data to monitor and evaluate the coverage measurement survey (Biemer, 1994). Evaluation of this survey is difficult, expensive, and controversial, because it requires replication and reconsideration of judgmental decisions made during the original coverage measurement operation. For example, the most skilled and experienced matching staff may reanalyze data and obtain additional data from the field long after the census to check the accuracy of matching determinations made during the CensusPlus operation. Because these studies are difficult and depend on skills that are in short supply, they will almost certainly be small compared with the coverage measurement sample itself. If research over the next few years can identify QA measures of the census and coverage measurement process that are correlated with evaluation outcomes, the QA data
will provide useful auxiliary variables for evaluation of the distribution and consequences of errors in the coverage measurement operation.

Prespecification and Documentation of Procedures

In the 2000 census, increased reliance on estimation makes it essential that the Census Bureau's choice of estimation methodologies inspire general confidence. It is unrealistic to hope that there will be total unanimity in support of any full set of methodologies. Census methods have always received criticism insofar as they have had a discernible effect on an identifiable geographic area or a particular group. For example, the local review process during the 1990 census led to inclusion of units that were determined to have been omitted from address lists and to recanvassing of some areas. Many procedural decisions, however, are invisible to those outside the Census Bureau and have consequences that are obscure. Decisions about estimation methods have been, and in all likelihood will continue to be, especially controversial (1) because they are open and explicit and (2) because their effects on totals for identifiable areas can be determined, at least after the fact. We address each of these considerations, taking the second first.

Past experience (for example, with adjustment of the 1990 census and with adjustment of the 1992 base for postcensal estimates) has shown that the majority of the public comments received on an estimation methodology are motivated by concerns for its effects on particular political jurisdictions (Bryant, 1993). Another common concern was that confusion would be created by having two sets of results, those before and after adjustment. By prespecifying estimation methods as much as possible, the Census Bureau makes it clear that decisions have been made on good statistical principles and judgment, rather than being motivated by any consideration of how they will affect particular areas.
Adopting such a policy avoids the 1990 experience of placing estimation choices before decision makers who could perceive the political consequences of different procedures. In this way, the Census Bureau obtains some protection against criticisms directed at the particular effects of methods.

A proper balance must be struck that gains these benefits of prespecification without committing the Census Bureau to a rigid set of procedures that permit no leeway for handling unforeseen circumstances in the conduct of the census or for adapting estimators to unanticipated features of the data. Therefore, an appropriate level of prespecification includes a positive statement of the areas in which judgment may be exercised as well as the areas in which decisions are made before the census. For example, the general form of the estimators to be used and the flow of information in processing should be prespecified, but it may be recognized in advance that judgment will be exercised in the deletion or downweighting of outliers, variable selection in regression models, or splitting of poststrata.

The openness and explicitness of estimation methodologies make it possible
to begin to build consensus in support of them, both technically and politically, before the census even begins. To make this possible, the Census Bureau should release a publication describing the main steps in data collection and processing and the estimation methods that will be used in the census, including a description of the estimators, of the evaluations that will be applied to them, and of the points at which the use of professional judgment is foreseen.

The process of consensus-building continues after the census through the release of suitable technical documentation. This documentation, as well as describing in general terms the estimators that were used, should present in aggregate terms the calculations that produced the population totals reported for major geographic areas (states and large cities), as well as for major demographic groupings. It must be emphasized that this documentation would be released after the publication of the census population figures used for apportionment and redistricting, and that the intermediate totals in these calculations should not be interpreted as competing estimates of population. The procedures should be regarded as an integrated whole, not a menu of options from which various parties can pick and choose to find the treatment most favorable to their local area.

The postcensal documentation should also contain a summary of evaluation results. Summary measures of accuracy for various levels of aggregation, such as those calculated through the total error model in the evaluation of the 1990 PES, may be a suitable format for summarizing these evaluation results.

Recommendation 4.7: Before the census, the Census Bureau should produce detailed documentation of the statistical methodology to be used for estimation and modeling. After the census, the Census Bureau should document how the methodology was applied empirically and should provide evaluation of the methodology.
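The summary accuracy measures mentioned above, of the kind a total error model produces, can be sketched simply: sampling variance and estimated biases from individual error sources are combined into a root mean squared error, and a coefficient of variation expresses that error relative to the estimate. The function and every figure below are hypothetical, and the sketch assumes the component biases are additive, which a real total error model would need to justify.

```python
import math

def total_error_summary(estimate, sampling_variance, bias_components):
    """Combine sampling variance and component biases into summary
    accuracy measures, in the spirit of a total error model.

    bias_components: estimated biases from individual error sources
    (e.g., matching error, fabrication), assumed additive here.
    Returns (net bias, root mean squared error, coefficient of variation).
    """
    bias = sum(bias_components)
    rmse = math.sqrt(sampling_variance + bias ** 2)
    cv = rmse / estimate
    return bias, rmse, cv

# Hypothetical state total: 5,000,000 persons, sampling variance 9e8
# (standard error 30,000), biases of +20,000 and -10,000 persons.
bias, rmse, cv = total_error_summary(5_000_000, 9e8, [20_000, -10_000])
print(bias, round(rmse), round(cv, 5))  # 10000 31623 0.00632
```

A table of such coefficients of variation by level of aggregation is one plausible form for the postcensal summary of evaluation results described in the text.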
Reporting of Uncertainty

Official statistics have progressed over the century from a narrow focus on simple tabulations of population characteristics to the provision of a range of census products, including complex tabulations and sample microdata files. Analytical uses of these data require the availability of both point estimates and measures of uncertainty. When complex statistical methods, such as complex sampling schemes, indirect estimation, and imputation, are used in creating census products, users will not be able to derive valid measures of uncertainty by elementary methods, and they may not have adequate information in the published or available products to derive these measures. It therefore becomes the responsibility of the data producers to facilitate estimation of uncertainty.

Total error models have been used by the Census Bureau to measure uncertainty in the outcomes of the census and the contributions of the various sources of error to this uncertainty (Hansen et al., 1961). More recently, a total error
model was developed for estimation of uncertainty in adjusted estimates based on the 1990 census and PES (Mulry and Spencer, 1993). Such models take into account both sampling errors in the estimates and potential biases stemming from the regular census and from coverage estimation. Bias can arise, for example, from the use of several response modes or from differences among response times. Similar models may be a useful tool for evaluating uncertainty in integrated estimates from a complex census in the year 2000.

After uncertainties have been estimated, they should be described in a manner that allows users to incorporate them into their data analyses. A variety of methods for representing uncertainty are familiar from the world of survey sampling. Summary measures of uncertainty (such as average coefficients of variation or variance functions) may be published as a supplement to published tabulations, or standard errors may be published for quantities of particular interest. A number of imputation methodologies are available (Rubin, 1987; Clogg et al., 1991) that enable users of public-use microdata samples to estimate the effects of sampling and nonsampling variability on their analyses.

Recommendation 4.8: The Census Bureau should develop methods for measuring and modeling all sources of error in the census and for showing uncertainty in published tabulations or otherwise enabling users to estimate uncertainty.

Research Program on Estimation

Necessary research on statistical estimation divides roughly into three phases. In the first phase, which is now under way and continues until the major design decisions have been made for the 1995 census test, estimation research focuses on broadening the range of possibilities for the use of sampling and other statistically based techniques. In this phase, preliminary assessments can be obtained of the expected precision for various designs.
In the second phase, roughly coinciding with the planning, execution, and processing of the 1995 census test, the emphasis shifts to developing the methods needed for the selected designs and methodological features. Although it is not necessary during this phase to decide on all the estimators that will be used, it is critical that enough progress be made on NRFU sampling and ICM estimators to avoid making decisions about design based on estimators that will later be replaced.

In the final phase, beginning with assessment of the 1995 census test and continuing through the decade, the selected estimation methods will have to be consolidated, optimized, validated, and made both theoretically and operationally robust. This last process will ensure that they can stand up to critical scrutiny and to problems that may arise in the course of the 2000 census. In this phase, work will also continue on selecting the estimation procedures required for the production
of all census products, including measures of uncertainty, and on more complex procedures that will be used in evaluation of the census estimates.

Recommendation 4.9: The Census Bureau should vigorously pursue research on statistical estimation now and throughout the decade. Topics should include nonresponse follow-up sampling, coverage estimation, incorporation of varied information sources (including administrative records), and indirect estimation for small areas.
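To make the first of these topics concrete, the basic weighting logic of nonresponse follow-up sampling can be sketched as follows. The function, the 1-in-4 sampling rate (within the 10 to 33 percent range under discussion), and all counts are illustrative assumptions, not the Bureau's design.

```python
def nrfu_estimate(mail_count, sample_count, sampling_rate):
    """Estimate a total population when only a fraction of
    nonresponding housing units receives follow-up.

    mail_count:    persons enumerated via mailback returns
    sample_count:  persons found in the follow-up sample of nonrespondents
    sampling_rate: fraction of nonresponding units sampled (e.g., 0.25)

    Each sampled person is weighted by 1/sampling_rate to represent the
    unsampled nonrespondents. Assumes simple random sampling of units;
    the designs under study would stratify and refine these weights.
    """
    return mail_count + sample_count / sampling_rate

# Illustrative district: 60,000 persons from mail returns; a 1-in-4
# sample of nonresponding units yields 7,500 persons.
print(nrfu_estimate(60_000, 7_500, 0.25))  # 90000.0
```

The precision of such an estimate degrades as the sampling rate falls, which is the trade-off between cost savings and accuracy that the design decisions described above must resolve.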