Suggested Citation: "6 Integrated Coverage Measurement: Tackling the Differential Undercount." National Research Council. 1997. Preparing For the 2000 Census: Interim Report II. Washington, DC: The National Academies Press. doi: 10.17226/5886.

6
Integrated Coverage Measurement: Tackling the Differential Undercount

The problem of undercoverage by the census has been known and studied during the last five censuses. Differential undercoverage has always been considered particularly important because it affects the distributive accuracy of the census--how benefits such as congressional representation and funds are divided among areas.

Much of what we know about historic trends in undercoverage is based on demographic analysis, which is a useful tool for obtaining aggregated coverage estimates (Robinson et al., 1993). In such analysis, population estimates for age cohorts, by sex and by race, are produced by the basic accounting equation:

current population = births - deaths + immigration - emigration.⁹
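
The accounting equation can be sketched in a few lines. This is a minimal illustration rather than the Census Bureau's procedure: the function names and all cohort figures are hypothetical, and the undercount rate is expressed here as a share of the demographic-analysis estimate.

```python
def cohort_population(births, deaths, immigration, emigration):
    """Expected current size of a cohort followed from its birth year."""
    return births - deaths + immigration - emigration


def net_undercount_rate(expected, census_count):
    """Net undercount as a share of the demographic-analysis estimate."""
    return (expected - census_count) / expected


# Hypothetical cohort figures, for illustration only.
expected = cohort_population(
    births=100_000, deaths=4_000, immigration=6_000, emigration=2_000
)
rate = net_undercount_rate(expected, census_count=95_000)
print(expected, rate)
```

With these made-up inputs the cohort is expected to number 100,000, so a census count of 95,000 would correspond to a net undercount of 5 percent.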

Historical series of demographic analyses show that overall census coverage increased from 1940 to 1980 and then decreased in 1990. But there has been persistent and systematic undercoverage of certain groups, including blacks in general and, particularly, black males from about 20 to 40 years old. In addition, the coverage gap between blacks and whites has not closed over time (Passel, 1991; Fay, Passel, and Robinson, 1988; Citro and Cohen, 1985).

Demographic analysis has only a limited ability to produce subnational population estimates because little is known about internal migration patterns. Therefore, survey-based coverage measurement is essential to estimate undercoverage in smaller areas, such as states and cities. A Post-Enumeration Survey (PES) was conducted after the 1950 census (Marks, Mauldin, and Nisselson, 1953; Bureau of the Census, 1960), and it is remarkable how many of the design features and issues that are important today were confronted in that early work, although in a different social, organizational, and technological environment.¹⁰ The 1980 Post-Enumeration Program (Fay, Passel, and Robinson, 1988) was primarily based on the Current Population Survey, while the 1990 coverage measurement effort was based on a separate Post-Enumeration Survey (Hogan, 1992, 1993).

The intention of the 1990 effort was to measure coverage and then decide after the census whether to use the results of the postcensal analysis in calculating the final population counts. This procedure placed the Census Bureau in the position of having to decide whether or not to use the analysis when its effects were already known. The director of the Census Bureau recommended using that analysis to adjust the 1990 census, based on a technical assessment that this would improve the overall accuracy of the census counts, but the Secretary of Commerce did not accept the recommendation. Much valuable experience was gained from this first full-scale attempt to implement coverage measurement and adjustment in the census.

⁹	For demographic projections from a census to a postcensal year, the base-year population would also appear on the right-hand side of the equation; in this application, however, each cohort is followed from its birth year so the base population is zero.
¹⁰	Similarities between the 1950 PES and the 1995 Census-Plus, an alternative integrated coverage measurement methodology, are discussed below.

INCLUSION OF INTEGRATED COVERAGE MEASUREMENT IN THE FINAL CENSUS ENUMERATION

Current plans for the 2000 census call for full integration of coverage measurement into the census process (Bureau of the Census, 1996a). Major decisions about methodology for measuring coverage of the initial census operations (mail-out/mail-back returns, nonresponse follow-up, "Be Counted" forms, etc.) and for including the results of coverage measurement in the final counts will be made before the census. Therefore, there will be a single set of census numbers (a "one-number census") rather than a decision on which of two sets of numbers to use after the census. To emphasize the interdependence of all aspects of the census process, the coverage measurement component of the procedure is called "integrated coverage measurement." The Census Bureau's intention is to do an adequate evaluation of the methodology before the census so that the quality of the integrated coverage measurement can be determined before the census and the decision can be made in advance on whether to incorporate its results into the final enumeration. The advantages of this approach include better timing (no need to allow time in the schedule for making a decision between the initial census results and the release of the final counts), more efficient and focused census operations (no need to pursue two tracks simultaneously), and greater acceptability (because the main decisions on methods are made before their consequences for particular groups or areas are known).

Integrated coverage measurement subsumes both the data collection operations that measure coverage and the estimation procedures that bring the results into the enumeration. Integrated coverage measurement is intended to measure and correct differences in census undercoverage across fairly broad domains. These domains could, for example, be major geographic regions or states; urban, suburban, and rural parts of a geographic region; or subgroups by race and Hispanic origin. These are the levels at which systematic differences in coverage have been found to occur and to persist over time. Integrated coverage measurement is not intended to measure and correct relatively local differences in coverage, such as those occurring in particular localities (except, perhaps, in a few very large cities that are comparable in size to states); the costs and limitations of technical resources do not permit extremely detailed local corrections. Allocation of population to these very small domains depends on the mail-out/mail-back responses together with the nonresponse follow-up. However, integrated coverage measurement domains can be much more detailed geographically than those for which reliable demographic analysis estimates can be obtained. A general discussion of
integrated coverage measurement and its relationship to nonresponse follow-up can be found in a previous National Research Council report (Steffey and Bradburn, 1994).

The purpose of the decennial census is to obtain as accurate a count of the U.S. population as possible. The existence and persistence of systematic biases in conventional census methodology cannot be ignored. Although we cannot interpret the precise meaning of the term "enumeration" in the Constitution, as statisticians and demographers we interpret the problem of "enumeration" as one of determining the numbers of people in various areas, and we endorse the application of methods that do as well as possible at this, including the correction of long-established biases. Of course, the accuracy of the new census methods must be subject to evaluation, as was the accuracy of the old methods.

If the traditional method of the census tends to overcount one group or area relative to another, the overcounted areas do not have a "right" to additional benefits, and they are not being penalized if integrated coverage measurement corrects the enumeration to bring it closer to the truth. It should be noted, however, that even with integrated coverage measurement, there is still an incentive for local efforts to improve the count because integrated coverage measurement measures coverage (and corrects bias) only over broad areas; within those areas, the smaller areas that work for a better count will benefit from it. Integrated coverage measurement can be expected to reduce enumeration errors for most small areas.

Recommendation: The Census Bureau should continue on the path toward incorporating integrated coverage measurement in the 2000 census.

ALTERNATIVE STRATEGIES

Any method of integrated coverage measurement will have to accomplish three goals. First, it will have to estimate the ratio of the true count to the initial count across various domains. Second, the measurement process will have to allow for the fact that the initial count is subject both to underenumerations, or omissions (people who should have been counted at a particular place but were not), and to overenumerations, or erroneous enumerations (people who should not have been counted at a particular place but were). Third, if the estimates are based on some kind of field operation (which seems to be necessary), then the measurement process will have to project from a sample of cases to the entire population in the domain.

The Census Bureau is currently considering two integrated coverage measurement strategies: a Post-Enumeration Survey (PES) with dual-system estimation (DSE) and Census-Plus. (See Steffey and Bradburn, 1994, for a more detailed discussion of these methodologies and the associated evaluation issues and estimation procedures.) Coverage measurement has a fairly long history at the Census Bureau, although the current effort is more ambitious than those in previous censuses. The PES/DSE strategy was used in the 1990 census, and a dual-system estimation procedure (using the Current Population Survey as the second survey source, rather than a specially conducted postenumeration
survey) was used to evaluate the 1980 census. A version of Census-Plus was used in evaluation of the 1950 census; it was unsuccessful in obtaining the desired level of coverage. An approach considered under the name "Census-Plus" during the 1980s is actually more similar to the "Super Census" idea that was a design option earlier in the current planning cycle (see National Research Council, 1993). Hogan (1989a, 1989b) and Mulry (1993) discuss a number of coverage measurement methodologies including these, with historical references. Hogan discusses, in particular, the conclusions of evaluations of some of these methods after past censuses.

Each of these strategies, PES/DSE or Census-Plus, incorporates both an approach to field data collection and an estimation procedure. The two strategies include a number of common features, including: (1) conducting operations in a sample of blocks in which nonresponse follow-up is also conducted for 100 percent of the nonresponding units; (2) creating an independent listing of housing units for those blocks (i.e., one that does not depend on the address lists used for the initial census); (3) using an interview that is run separately from the main census data collection to collect information about households in those blocks; (4) comparing the results of the interview to the initial census to identify errors in the initial census; and (5) using the results to estimate a ratio of true-to-initial census population.
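
Feature (5), together with the goal of projecting from sample blocks to a whole domain, can be sketched as a simple ratio estimator. This is a minimal illustration, assuming an unweighted sample of blocks; the function name `domain_estimate` and all block counts are hypothetical, and an operational estimator would use sampling weights and poststrata.

```python
def domain_estimate(sample_blocks, domain_initial_count):
    """Project a true-to-initial ratio from sample blocks to a whole domain.

    sample_blocks: list of (initial_count, measured_true_count) pairs.
    Assumes an unweighted sample; a real design would apply sampling weights.
    """
    initial_total = sum(initial for initial, _ in sample_blocks)
    true_total = sum(true for _, true in sample_blocks)
    ratio = true_total / initial_total
    return ratio, ratio * domain_initial_count


# Hypothetical counts for three sample blocks: (initial census, measured true).
blocks = [(100, 104), (200, 210), (100, 106)]
ratio, estimate = domain_estimate(blocks, domain_initial_count=50_000)
print(f"ratio={ratio:.3f}, domain estimate={estimate:,.0f}")
```

Pooling the block totals before dividing, rather than averaging per-block ratios, keeps small blocks from dominating the result.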

The main difference in the field operation has to do with the way that features (3) and (4) are implemented. In the PES/DSE strategy (which is similar in broad outlines to the approach used in 1990, with one important difference, noted below), the second (PES) interview is conducted after the census and is made as independent of the census as possible. The results of the interviews are compared ("matched") to the census roster, and all discrepancies are noted. It is then necessary to do follow-up interviews to "resolve" cases for which the census and PES disagree. People in the census who were not in the PES are resolved as either correct enumerations or erroneous enumerations. People who were in the PES but not in the census are resolved as either census omissions or erroneous counts in the PES: this will not necessarily require a follow-up interview, since the PES interview can often collect enough additional information to make resolution possible without further follow-up.
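
The matching step described above can be sketched with set operations. This is only an illustration under a strong assumption, namely that people can be matched on an exact identifier; real matching must cope with misspellings, nicknames, and address errors. The function name and the rosters are hypothetical.

```python
def match_rosters(census, pes):
    """Split two rosters of person identifiers into matches and discrepancies."""
    census, pes = set(census), set(pes)
    return {
        "matched": census & pes,
        # In the census but not the PES: resolve as correct or erroneous enumerations.
        "census_only": census - pes,
        # In the PES but not the census: resolve as census omissions or PES errors.
        "pes_only": pes - census,
    }


# Hypothetical rosters for one housing unit.
census_roster = ["A. Lee", "B. Lee", "C. Ortiz"]
pes_roster = ["A. Lee", "C. Ortiz", "D. Ortiz"]
result = match_rosters(census_roster, pes_roster)
for status, people in result.items():
    print(status, sorted(people))
```

Only the two discrepancy sets need follow-up; the matched set is taken as confirmed by both sources.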

A key difference between the PES plan tested in 1995 and the 1990 plan is that the survey in 1995 was directed at finding out who was in the sample blocks on census day (what has been called "PES-A") rather than those resident at the time of the PES ("PES-B"). This approach requires improved abilities of the Census Bureau to trace people who move (Anolik, 1996). An important motivation for trying this approach is that it is more consistent with what is required for Census-Plus, which was tested simultaneously with PES in 1995 and 1996. Furthermore, the PES-B approach cannot be applied exactly as in 1990 if there is also sampling for nonresponse follow-up. Because most in-movers will have come from blocks with incomplete nonresponse follow-up, it will be impossible to directly check the completeness of their enumeration for mail nonrespondents, who are not included in the nonresponse follow-up sample. This will make it necessary either to use PES-A or develop some new strategy for movers.

In the Census-Plus strategy, the first part of the reinterview is independent, much like the PES interview; the second part is intended to carry out the resolution of
discrepancies between the two interviews ("reconciliation"). The integrated coverage measurement interviewer uses a computer-aided personal interview (CAPI) instrument, a laptop computer that guides the interviewer through the sequence of questions and records the responses. The CAPI instrument is loaded with the census roster for the household that is being interviewed. (It is interesting that the 1950 PES also involved providing the interviewer with a census roster, which was hidden during the first part of the interview and uncovered in the second part, although obviously at that time it involved paper forms rather than a computer.) Matching is carried out in the field by software in the machine, and the interviewer continues with a series of questions that produces a final "resolved roster." No further follow-up is required. A major operational advantage of Census-Plus, if it can be successfully implemented, is that field work can be completed relatively quickly because only one reinterview is required for each household.

Both the Census-Plus and PES methodologies were tested in the 1995 census test, using the same set of integrated coverage measurement interviews. This strategy was possible because the first, independent, part of the Census-Plus interview is essentially equivalent to the PES interview. Other PES field operations (follow-up) could be added later, without interfering with the reconciliation part of the integrated coverage measurement interview.

The estimation methods for the two approaches differ slightly (see Steffey and Bradburn, 1994:122-124). In principle, the statistical assumption underlying the dual-system estimation is that the PES is statistically independent of the census, in the sense that within defined population groups, the probability of being included in the PES is the same whether or not a person was included in the initial census; perfect coverage by the PES is not required. The statistical assumption underlying Census-Plus is that the resolved roster is the truth for the sample blocks, that is, that it has perfect coverage. In practice, neither of these theoretical assumptions is true, but the mathematical differences between the two estimators are not as great as their differences in field procedure.
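
The two estimators can be contrasted in a short sketch. The dual-system formula below is the standard capture-recapture form (correct census enumerations times PES total, divided by matches); the Census-Plus function simply takes the resolved roster as the truth, as the text describes. Function names and all counts are hypothetical, and the sketch ignores weighting, erroneous enumerations, and poststratification.

```python
def dual_system_estimate(census_correct, pes_total, matched):
    """Dual-system (capture-recapture) estimate of the true population.

    Relies on the PES being independent of the census within the group;
    neither source needs complete coverage.
    """
    return census_correct * pes_total / matched


def census_plus_estimate(resolved_roster_total):
    """Census-Plus treats the resolved roster as the truth for the sample blocks."""
    return float(resolved_roster_total)


# Hypothetical sample-block counts for one population group.
census_count = 950      # initial census count, all assumed correct enumerations
pes_count = 900         # people found by the independent PES interview
matched_count = 855     # people found by both sources
resolved_roster = 980   # people on the Census-Plus resolved roster

dse = dual_system_estimate(census_count, pes_count, matched_count)
cp = census_plus_estimate(resolved_roster)
print(dse / census_count, cp / census_count)  # true-to-initial ratios
```

In this made-up case the PES finds 95 percent of census people (855 of 900 PES people are matched), so the dual-system estimate scales the census count up to 1,000 even though neither source reached everyone.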

The real issue is not whether one or the other of the assumptions is perfectly correct, but the effect on the quality of the estimates when the assumptions deviate from the truth. Research conducted after the 1990 census (Bell, 1993) and still under way at the Census Bureau used demographic analysis to estimate how far the independence assumption of the dual-system estimation in the PES was from the truth and the possible effect of these deviations on estimates. In fact, the assumptions of the Census-Plus estimator are stronger than in the PES, since complete coverage by one source implies statistical independence. Dual-system estimation, in assuming only statistical independence, does not in principle require that the PES have perfect coverage or even that it have better coverage than the census. Moreover, even if the independence assumption fails, there is a good basis for believing that dual-system estimation at least moves the estimates for each population group in the right direction. This reasoning suggests that PES/DSE estimates may be more robust against failures of their assumptions than Census-Plus estimates, and this argument was made by Marks (in Krotki, 1978) in support of a dual-system estimation rather than a Census-Plus strategy. (Marks was writing in part in reaction to the experience of the 1950 PES, which used an estimator similar to Census-Plus.) Viewed another way, Census-Plus attempts to improve on dual-system estimation by using field procedures that force the independence assumptions to hold--by having perfect coverage in the PES. If Census-Plus fails to achieve this objective, there is a strong likelihood that its results will be more in error than those from the PES/DSE.
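
The robustness argument can be illustrated with expected values for a made-up population. The sketch assumes two latent groups with different capture rates, so that independence holds within each group but fails when they are pooled; every number, including the coverage assumed for the resolved roster, is hypothetical and chosen only to show the direction of the errors.

```python
def expected_dse(groups):
    """Expected census count and dual-system estimate under unequal capture rates.

    groups: list of (size, p_census, p_pes); the two sources are independent
    within each group, but pooling the groups breaks overall independence.
    """
    census = sum(n * pc for n, pc, _ in groups)
    pes = sum(n * pp for n, _, pp in groups)
    matched = sum(n * pc * pp for n, pc, pp in groups)
    return census, census * pes / matched


# Hypothetical population of 1,000: 800 easy-to-count, 200 hard-to-count people.
groups = [(800, 0.95, 0.90), (200, 0.70, 0.60)]
truth = sum(n for n, _, _ in groups)
census, dse = expected_dse(groups)

# Census-Plus passes any shortfall in the resolved roster straight through.
roster_coverage = 0.88  # hypothetical coverage of the resolved roster
census_plus = roster_coverage * truth

print(round(census), round(dse, 1), round(census_plus))
```

Here the dual-system estimate lands between the census count and the truth, too low but corrected in the right direction, while a resolved roster with weaker coverage than the census would wrongly suggest an overcount.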

RESULTS FROM THE 1995 CENSUS TEST

At this writing, evaluation of the integrated coverage measurement methodologies must be based on experience in the 1995 census test, the only comparative study to date. Integrated coverage measurement was carried out in each of the three test sites: in general, all integrated coverage measurement operations were carried out as planned and on schedule, and noninterview rates were not excessive. The CAPI instruments worked as intended, although there were some difficulties with the details of the CAPI interview (discussed below). The operations provided a wealth of information on the interview process, and the operational successes give grounds for general optimism about the integrated coverage measurement process.

Thirteen evaluation studies were conducted; an excellent summary of their conclusions is given in Vacca, Mulry, and Killion (1996). The Census Bureau should be congratulated for completion of these evaluations, which have been extremely helpful in assessing the integrated coverage measurement methodologies and developing improved approaches. In the remainder of this section we consider some of the results of these studies, focusing on those that address the critical decisions and issues noted in earlier National Research Council reports.

Face Validity of the Methodologies

The numerical results of integrated coverage measurement in the 1995 test were examined for their face validity in comparison with patterns that have been consistently observed in the past. The Census-Plus results did not seem reasonable, while the PES/DSE results were consistent with past patterns. For example, Census-Plus estimated that there was an overcount in 11 of 14 poststrata for blacks in Oakland and in 13 of 14 poststrata for blacks in Paterson; the PES estimated that there was an undercount in all of these poststrata (Mulry and Griffiths, 1996). Thus, the Census-Plus results do not agree with past results from both demographic estimation and survey data indicating that blacks are among the most undercounted groups.

Research comparing the 1995 Census-Plus and PES/DSE results with benchmarks from demographic analysis indicates that the latter was more successful in estimating coverage error. For example, Robinson (1996) reported that the demographic analysis estimates of undercoverage for ages 0-14, based on birth, death, and migration records, were 16.7 percent for Oakland and 18.7 percent for Paterson. The respective dual-system estimates for ages 0-17 were 12.8 percent and 13.8 percent, and the respective Census-Plus estimates were 3.8 percent and 3.4 percent. In Louisiana, the demographic
analysis undercoverage estimate was 2.2 percent, which was very close to the Census-Plus estimate of 2.1 percent (dual-system estimation was not performed in this area). Robinson (1996) also examined sex ratios (number of males divided by number of females) by two age groups for blacks in Oakland and Paterson: Census-Plus lowered the estimated sex ratios from those obtained before integrated coverage measurement, while dual-system estimation raised them. The latter results were again more credible, since undercoverage is higher for black males than it is for black females. Kohn (1996) also examined sex ratios and found that the dual-system estimation results were much more consistent with demographically plausible sex ratios than were the Census-Plus results (although the dual-system estimation results were still too low).
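
The benchmark comparison for the two urban sites can be tabulated directly from the figures quoted above. The sketch below is only a convenience calculation: the age ranges differ (0-14 for demographic analysis, 0-17 for the survey-based estimates), so the gaps are rough indications rather than like-for-like differences.

```python
# Undercoverage estimates in percent, as reported by Robinson (1996).
estimates = {
    "Oakland": {"demographic": 16.7, "dual_system": 12.8, "census_plus": 3.8},
    "Paterson": {"demographic": 18.7, "dual_system": 13.8, "census_plus": 3.4},
}

# Shortfall of each survey-based estimate relative to the demographic benchmark.
gaps = {
    site: {
        method: round(values["demographic"] - values[method], 1)
        for method in ("dual_system", "census_plus")
    }
    for site, values in estimates.items()
}

for site, gap in gaps.items():
    print(site, gap)
```

In both sites the Census-Plus estimate falls short of the benchmark by roughly three times as many percentage points as the dual-system estimate does.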

Operational Problems: Incomplete Rosters

We have considered possible reasons for the shortfall of the Census-Plus estimator. An operational problem that appears to have contributed substantially to the implausible results for Census-Plus is that many of the integrated coverage measurement interviews were conducted either with an incomplete initial census roster in the CAPI instrument or without a census roster at all. This deficiency was a result of late nonresponse follow-up interviews and other late-arriving information (such as the "Be Counted" forms) and processing difficulties that kept the final census roster from being ready on time for the integrated coverage measurement interview. Approximately 22 percent of integrated coverage measurement interviews in Oakland were conducted without any census roster in the CAPI instrument; this is a rough estimate based on a restricted sample used in the evaluation interview, described below (West and Griffiths, 1996).

The census roster was unavailable in some other cases, even if it had been loaded into the CAPI instrument, because the interviewer might go to the wrong unit, especially in a multi-unit structure. This might have occurred in structures without clearly marked separate mailboxes, because the original census questionnaire was delivered to a unit other than the one coded in the address or because definitions of the units were unclear, making it difficult to choose the correct one. The design of the CAPI instrument did not allow the interviewer, after the first part of the interview, to match the household to the census household to which it actually corresponded. Instead, the interviewer had to attempt a match of the household being interviewed to the census roster for the other household that had been mistakenly called up, making the census roster worse than useless. In Oakland, it appears that approximately 1.6 percent of the census cases were not matched to the integrated coverage measurement interview at the address given, but were known to be at a nearby address; another 2.7 percent were completely unknown at the given address and might or might not have been nearby (D. Childers, Census Bureau, private communication).

An apparent consequence of not loading rosters and loading incomplete rosters in the CAPI instruments was that a large number of people who were in the census but not in the independent integrated coverage measurement interview were ultimately not
included in the integrated coverage measurement resolved roster; thus, they were in effect considered to have been erroneously enumerated by the census. This problem resulted in Census-Plus estimates that were too low. Presumably, had complete census rosters been in the interviewers' machines, many people not named in the independent integrated coverage measurement interview would have been identified as correctly enumerated by the census.

Treat (1996b) reported on an evaluation study that used matching between data used in Census-Plus and data used in dual-system estimation for Oakland to estimate how many people were missed in Census-Plus who were ultimately included in the E sample (census enumerations) for the dual-system estimate. It was estimated that 32,365 would be missed by Census-Plus in housing units where there was a complete or partial integrated coverage measurement interview. (This estimate is considered an upper bound because a nonsystematic clerical review indicated that there were some false nonmatches.) To put this in perspective, the Census-Plus estimate for the total population was 334,493 (Vacca, Mulry, and Killion, 1996). Analyses of the missed people by length of residence, sex, race or Hispanic origin, and age suggest that renters, males, blacks, and young people were missed in higher proportions by Census-Plus than by the census. This finding may explain the poor results observed for Census-Plus in traditionally undercounted groups. Further analysis by Treat (1996b) of the people estimated to have been missed by Census-Plus indicates that 60 percent were from housing units for which census rosters had been loaded into the integrated coverage measurement CAPI instruments; 40 percent were from housing units for which rosters had not been loaded. Thus, large numbers of missed people in Census-Plus are estimated to occur in both of these situations.
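
The scale of these figures is easier to see as a short calculation. The sketch uses only the numbers quoted above; because the 32,365 figure is described as an upper bound, the derived share and counts are upper bounds as well.

```python
missed_upper_bound = 32_365      # people estimated missed by Census-Plus (Oakland)
census_plus_total = 334_493      # Census-Plus estimate of the total population

# Missed people relative to the Census-Plus population estimate.
missed_share = missed_upper_bound / census_plus_total

# Split by whether a census roster had been loaded into the CAPI instrument.
missed_roster_loaded = round(0.60 * missed_upper_bound)
missed_no_roster = round(0.40 * missed_upper_bound)

print(f"{missed_share:.1%}", missed_roster_loaded, missed_no_roster)
```

The missed people amount to a little under 10 percent of the Census-Plus total, with on the order of 19,000 in units where a roster was loaded and 13,000 in units where it was not.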

Insight into these numbers is given by the integrated coverage measurement evaluation interview (West and Griffiths, 1996). A sample of integrated coverage measurement housing units in Oakland where there had been complete or partial interviews was reinterviewed by especially skilled interviewers using techniques similar to the original integrated coverage measurement interview, but with the input rosters in the CAPI instruments supplemented with the nonmatches from the original integrated coverage measurement interview. In housing units with at least one match between the initial census roster and the integrated coverage measurement independent roster, 3 percent of the people who were determined by the evaluation interview to be residents were not included on the integrated coverage measurement resolved roster. In housing units for which no census rosters were loaded into the integrated coverage measurement CAPI instruments, the corresponding figure was 12 percent; for those where census rosters were loaded but there were no matches, the figure was 11 percent. These findings suggest that the chance of a person being missed by Census-Plus is much lower in a housing unit with a census roster that corresponds to that person's household. Nevertheless, a large percentage of the estimated missed people were from this type of housing unit because such units occur more frequently than units with whole-household nonmatches or with no census rosters. In this study, the weighted number of people in housing units with at least one match is 74 percent of the total study population.
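
Why so many missed people come from units with a matching roster, despite the low miss rate there, can be checked with a rough calculation. The text does not say how the remaining 26 percent of the study population divides between units with no roster and units with whole-household nonmatches, so the sketch brackets the answer by applying each of the two reported rates to the whole remainder; that bracketing is an assumption, and the result is based on the evaluation sample, so it need not reproduce the 60/40 split reported by Treat (1996b).

```python
matched_pop_share = 0.74     # weighted share of people in units with at least one match
matched_miss_rate = 0.03     # residents missing from the resolved roster in those units
other_miss_rates = (0.11, 0.12)   # rosters loaded but no match; no roster loaded


def matched_share_of_missed(other_rate):
    """Share of all missed residents who live in units with at least one match."""
    missed_matched = matched_pop_share * matched_miss_rate
    missed_other = (1 - matched_pop_share) * other_rate
    return missed_matched / (missed_matched + missed_other)


high = matched_share_of_missed(min(other_miss_rates))
low = matched_share_of_missed(max(other_miss_rates))
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions, somewhat more than two-fifths of the missed residents would come from units with at least one match, even though the miss rate there is about a quarter of the rate elsewhere.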

These evaluations illustrate the absolute necessity of having the census rosters in
the CAPI instruments prior to the integrated coverage measurement interview and making these rosters as complete as possible. Further analyses of data from the 1995 tests and from future tests would enhance these evaluations, perhaps making it possible to predict how much the coverage of Census-Plus could be improved if the roster problem is resolved. For example, it might be feasible to compare the probabilities that a person is included in the integrated coverage measurement resolved roster under three different conditions: that there is no census roster (or a roster for a different household), that there is a roster but the person is not included, and that the person is in the roster.
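
The suggested comparison could be organized along the following lines. This is a sketch of the tabulation only: the condition labels, the function name, and the records are hypothetical, and it assumes each record is a person confirmed as a resident by the evaluation interview.

```python
from collections import defaultdict

CONDITIONS = ("no_roster", "roster_without_person", "roster_with_person")


def inclusion_rates(records):
    """Share of confirmed residents on the resolved roster, by roster condition.

    records: iterable of (condition, included) pairs, one per confirmed resident.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [included, total]
    for condition, included in records:
        counts[condition][0] += int(included)
        counts[condition][1] += 1
    return {c: counts[c][0] / counts[c][1] for c in CONDITIONS if counts[c][1]}


# Hypothetical records, for illustration only.
records = (
    [("no_roster", True)] * 22 + [("no_roster", False)] * 3
    + [("roster_without_person", True)] * 18 + [("roster_without_person", False)] * 2
    + [("roster_with_person", True)] * 97 + [("roster_with_person", False)] * 3
)
rates = inclusion_rates(records)
for condition, rate in rates.items():
    print(condition, f"{rate:.2f}")
```

Differences among the three rates would indicate how much of the Census-Plus shortfall could be recovered by loading complete rosters.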

Operational Problems: The Integrated Coverage Measurement Interview

An important issue for the validity of Census-Plus is whether there was a tendency to bias the results of the interview in such a way that the resolved roster agrees with the initial (independent) part of the integrated coverage measurement interview. This question implies that names that appeared on the census roster but not on the integrated coverage measurement independent roster would tend to be resolved as nonresident and therefore be left off the resolved roster, reducing the Census-Plus estimates. This phenomenon is called "reconciliation bias." Studies of a nonrandom set of interviews that had been tape recorded did not reveal evidence of overt bias, such as interviewers trying to convince respondents not to include a census person or deliberately going back and changing the independent integrated coverage measurement results to conform to the census (Bates and Good, 1996). However, the absence of overt bias does not exclude the possibility that there is a more subtle tendency, involving both the interviewer and the respondent, to make the results consistent with the initial integrated coverage measurement roster. Evaluation of the extent and effect of reconciliation bias is related to the issues raised above about the consequences of incomplete or missing rosters. Further analyses might show whether there is statistical dependence between inclusion of an individual in the census and the integrated coverage measurement resolved roster, given that the person is determined in the evaluation interview to be a correctly enumerated resident.

A study by Biemer and Treat (1996) used latent class models applied to Oakland data to estimate the rates at which the original census, the integrated coverage measurement interview, and the evaluation interview correctly or incorrectly classified cases as resident or nonresident on census day. (Because the methodology is highly model dependent, the results must be treated with some caution.) The best-fitting models selected in this exercise had to allow for an interaction between each of the pairs of sources: even within a race stratum, inclusion in the original census enumeration was related to inclusion in the Census-Plus resolved roster. It is not possible to determine from this study, however, to what extent this relationship was due to the problems with rosters described above. An implication of this model is that the evaluation interview appeared to be virtually error-free in classification of cases as resident or not (excluding cases classified as unresolved), but the original roster and the integrated coverage measurement resolved roster both missed substantial numbers of residents. According to this model, 7.8 percent of actual residents (among blacks, Hispanics, and Asians and Pacific Islanders) were omitted, or 7.3 percent of all cases considered, while 2.6 percent of all cases considered were erroneously included in the resolved roster; again, this supports the view that the resolved roster substantially underestimates population.
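The omission and erroneous-inclusion rates quoted above can be combined in a short calculation. The sketch below is illustrative arithmetic only and is not part of the Biemer and Treat analysis. It assumes that both "all cases considered" percentages share one denominator and that every erroneous inclusion is a nonresident; the function and variable names are hypothetical.

```python
# Rates quoted in the text from the latent class model (Oakland).
OMITTED_PER_RESIDENT = 0.078   # omitted residents / actual residents
OMITTED_PER_CASE = 0.073       # omitted residents / all cases considered
ERRONEOUS_PER_CASE = 0.026     # erroneous inclusions / all cases considered


def implied_resident_share(omitted_per_resident, omitted_per_case):
    """Share of all cases that are actual residents, implied by the two omission rates."""
    return omitted_per_case / omitted_per_resident


def roster_to_resident_ratio(resident_share, omitted_per_case, erroneous_per_case):
    """Resolved-roster count divided by the count of actual residents.

    Assumption (for illustration): every erroneous inclusion is a nonresident.
    """
    roster_share = resident_share - omitted_per_case + erroneous_per_case
    return roster_share / resident_share


resident_share = implied_resident_share(OMITTED_PER_RESIDENT, OMITTED_PER_CASE)
ratio = roster_to_resident_ratio(resident_share, OMITTED_PER_CASE, ERRONEOUS_PER_CASE)
print(f"implied resident share of cases: {resident_share:.3f}")
print(f"resolved roster / actual residents: {ratio:.3f}")
```

Under these assumptions, the resolved roster would fall roughly 5 percent short of the actual resident count, because erroneous inclusions offset only part of the omissions.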

Another feature of the integrated coverage measurement interview process is that during the reconciliation phase, the interviewer "probed" the respondent to obtain information to determine the residency status of people on either the census roster or the integrated coverage measurement roster (but not both), as well as to confirm the status of matching people. Qualitative study of this part of the integrated coverage measurement instrument revealed that the process for checking the roster was awkward and slow, tending to wear out respondents' patience. Furthermore, the probes were slow and repetitious. This caused problems in the interview, in which the interviewer tended to skip over the details of the probes, or the respondent rushed an answer without listening to all the alternatives. Moreover, in the 1995 test, the CAPI instrument had no provision to accommodate the absence of a census roster. Instead, the respondent had to explain for each household member individually that they had been included on a form that was not available to the interviewer (Bates and Good, 1996).

Interviewers had a choice of codes to describe the reason for each nonmatch. West (1996) reported on a descriptive study of the frequencies with which the various codes were chosen in the three census sites. A few interviewers tended to be "outliers" in that each chose a specific code much more frequently than a large majority of interviewers. The most striking result was the frequency with which most interviewers chose the code "Other," in which cases the CAPI instruments prompted the interviewers to include notes. For people on the integrated coverage measurement roster who did not match anyone on the census roster, the percentage of cases in which this code was used was 88 in Oakland, 94 in Paterson, and 76 in Louisiana. For people on the census roster who did not match anyone on the integrated coverage measurement roster, the corresponding percentages were 30, 29, and 36. Use of the "Other" code resulted in a case being reviewed clerically to determine residency status. The clerks, when debriefed, remarked that information in the accompanying notes suggested the interviewer could have selected one of the other codes. They also noted that "Other" was apparently used to explain whole-household nonmatches and housing units for which there was no census roster. Some other possible concerns with respect to the frequency with which various codes were used are discussed in West (1996) and Vacca, Mulry, and Killion (1996). Some suggest errors that could lead to classifying residents as unresolved; others seem to indicate confusion among respondents regarding census residency rules.

More positively, evaluation studies of matching for dual-system estimation indicated that the information in the integrated coverage measurement interview was adequate to code P-sample cases (i.e., people who were listed in the initial integrated coverage measurement roster) as resident or nonresident, with a high degree of accuracy (especially for those who were resolved, given a definite status as either resident or nonresident). The information from the integrated coverage measurement interview was also very useful in determining whether E-sample cases (those listed in the census roster) were in fact resident, as long as an integrated coverage measurement interview was obtained with the same household (Childers, 1996).

Weighting Procedures

Another possible factor contributing to error in the integrated coverage measurement was the weighting system used for households that were in the sample but not interviewed. This system did not make use of information from the census: it assumed that the noninterview households had the same proportional composition as all integrated coverage measurement households for which an integrated coverage measurement interview was obtained; it did not attempt to match them to integrated coverage measurement households whose census data were similar to those of the missed households. A follow-up study (Gbur, 1996) of a sample of noninterviewed households from Oakland suggests that these households tended to be smaller than those contacted during integrated coverage measurement and that the match rates of people in these households to people in the corresponding census rosters tended to be lower than the dual-system estimation match rates for the block clusters in which the households were located. Thus, addition of the noninterview follow-up data to the data used for Census-Plus and the PES/DSE lowered the Census-Plus adjustment factor from 1.005 to 0.978 but raised the PES/DSE adjustment factor from 1.0866 to 1.1081 (Vacca, Mulry, and Killion, 1996).

We do not know what the effect on estimates would have been of a more conditional weighting scheme--one that made use of more information about the census characteristics of households that were not contacted. Assuming, however, that the census would have indicated that such households were smaller, we expect that the effect of a more conditional weighting scheme would also have been to lower the Census-Plus estimates of the undercount. Part of the significance of this finding is that it suggests that the Census-Plus field procedure did even worse than the estimates show because the estimates were helped by the fact that the nonresponse weighting procedure raised the Census-Plus estimates.
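The difference between the weighting system actually used and a more conditional scheme can be illustrated with a small sketch. The household data, the two size classes, and the function names below are all hypothetical and chosen only to show the direction of the effect; this is not the Census Bureau's weighting procedure. The unconditional version inflates all interviewed households by one factor, while the conditional version inflates them within classes defined by a census characteristic (here, a small or large household).

```python
from collections import Counter


def unconditional_estimate(interviewed, n_noninterviewed):
    """Inflate every interviewed household by the same nonresponse factor."""
    n_interviewed = len(interviewed)
    weight = (n_interviewed + n_noninterviewed) / n_interviewed
    return weight * sum(persons for _, persons in interviewed)


def conditional_estimate(interviewed, noninterviewed_classes):
    """Inflate interviewed households within classes based on census data."""
    interviewed_counts = Counter(cls for cls, _ in interviewed)
    missing_counts = Counter(noninterviewed_classes)
    unmatched = set(missing_counts) - set(interviewed_counts)
    if unmatched:
        raise ValueError(f"no interviewed households in classes: {sorted(unmatched)}")
    total = 0.0
    for cls, persons in interviewed:
        # Each interviewed household also represents its share of the
        # noninterviewed households in the same class.
        weight = (interviewed_counts[cls] + missing_counts[cls]) / interviewed_counts[cls]
        total += weight * persons
    return total


# Hypothetical data: (census size class, persons found in the interview).
interviewed = [("small", 1), ("small", 2), ("small", 1),
               ("large", 4), ("large", 5), ("large", 6)]
# Hypothetical: all three noninterviewed households are small per the census.
noninterviewed_classes = ["small", "small", "small"]

uncond = unconditional_estimate(interviewed, len(noninterviewed_classes))
cond = conditional_estimate(interviewed, noninterviewed_classes)
print(uncond, cond)
```

In this toy case the unconditional estimate exceeds the conditional one, which mirrors the expectation stated above that conditioning on census household size would lower the Census-Plus estimates.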

Comparability of Blocks

Because integrated coverage measurement is conducted only in a sample of blocks, one must be able to generalize from the sample blocks to the remaining blocks in the census. For this generalization to be valid, there should be no systematic differences in coverage between blocks in and not in the integrated coverage measurement.

There are two reasons that the blocks could differ. First, the integrated coverage measurement blocks are a block sample for nonresponse follow-up, while it appears likely that outside of those blocks there will be a unit sample. Every mail-back nonrespondent address in the integrated coverage measurement sample blocks is followed up in the field during nonresponse follow-up. This procedure is necessary because the integrated coverage measurement interviewers work independently of the nonresponse follow-up process and cannot distinguish between nonresponse follow-up sample and nonsample households in the same block. Second, the integrated coverage measurement process itself could have some effect on the census response rates. For example, the independent listing operation for integrated coverage measurement could sensitize residents of integrated coverage measurement blocks to the census, or the nonresponse follow-up and integrated coverage measurement interviewers could run into each other in the field. These interactions are referred to as "contamination" of the census by integrated coverage measurement and would only be a problem if integrated coverage measurement and nonresponse follow-up overlap in the same blocks.

Both of these issues can be quantitatively assessed by comparing indicators of coverage in blocks from which a nonresponse follow-up unit sample was drawn, nonresponse follow-up blocks that were included in a block sample but not in integrated coverage measurement, and integrated coverage measurement blocks (which are also in the nonresponse follow-up block sample). Comparisons among groups of integrated coverage measurement blocks are difficult because their numbers are small, although it would be possible to perform such comparisons in an evaluation in the 1998 dress rehearsal and in the 2000 census itself, in which the integrated coverage measurement sample will be much larger than in the 1995 test. Comparisons involving nonresponse follow-up sample blocks that are not in the integrated coverage measurement will not be possible if the unit sample design is adopted.

Griffiths (1996) compared both kinds of blocks in Oakland, paired by predicted nonresponse rates, and tested differences within four integrated coverage measurement sampling strata: blocks with high concentrations of blacks, of Hispanics, of Asians and Pacific Islanders, and all other blocks. Several outcomes were considered that described the census process, of which the mail response rate was of primary interest, as it is believed to have a strong relationship with inclusion probability. The results for mail response were inconsistent across strata (with two strata showing higher and two showing lower mail response rates for the integrated coverage measurement blocks), and in only one stratum were they even marginally statistically significant. No differences could be detected for other variables. Since one might expect the effects of contamination at least to go in the same direction in most strata, we conclude that this study found no evidence for such effects, although with the caveat that the data are consistent with a small effect on the order of a few percentage points.
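The argument that contamination effects should point in the same direction in most strata can be framed as a sign test. The sketch below is one hedged way to express that reasoning; it is not the analysis performed by Griffiths (1996), and the function name is hypothetical. It treats each stratum as an independent trial in which either direction is equally likely when there is no contamination.

```python
from math import comb


def sign_test_p_value(n_positive, n_total):
    """Two-sided exact sign test p-value when each direction is equally likely."""
    if n_total <= 0 or not 0 <= n_positive <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_positive <= n_total")
    k = min(n_positive, n_total - n_positive)
    # Probability of a split at least as lopsided in one direction.
    tail = sum(comb(n_total, i) for i in range(k + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


observed = sign_test_p_value(2, 4)   # two strata in each direction
unanimous = sign_test_p_value(4, 4)  # all four strata in one direction
print(observed, unanimous)
```

With only four strata, even a unanimous direction would not reach conventional significance on this test, which is consistent with the caveat that a small effect cannot be ruled out.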

Treat (1996b) also compared the two kinds of blocks in Oakland, as well as the nonresponse follow-up block and unit sample panels. As with the other study, most differences were not statistically significant, and few consistent patterns were found, but the standard errors were large enough so that small but important effects could have eluded detection.

Census managers who are familiar with the field operation have expressed a strong view that census results are affected neither by contamination nor by differential coverage by nonresponse follow-up sample design. Their view is based on an assessment that the design of the nonresponse follow-up makes it very unlikely that the interviewers would be able to distinguish whether their assignments come from a unit or a block sample.

Comparability of integrated coverage measurement blocks with other blocks is one of the critical issues for the validity of integrated coverage measurement, and we regard the currently available evidence as favorable but not conclusive. The worst possible outcome would be if the Census Bureau proceeded on the assumption that there are no contamination effects or differences between blocks in or not in integrated coverage measurement and was later taken by surprise when evidence appeared for these effects in the census.

Recommendation: Quantitative evaluation of the comparability of integrated coverage measurement blocks with other blocks should be built into the 1998 dress rehearsal and the 2000 census, and plans for the census should include a statistical correction for any important differences found.

NEXT STEPS

The Census Bureau plans two changes in the integrated coverage measurement process to correct the problems with the interview experienced in the 1995 census test, which most strongly affected the Census-Plus results. These changes are being implemented in the 1996 test, which is under way as this report is being written. First, the Bureau intends to speed up processing of census returns (from mailed questionnaires and nonresponse follow-up interviews) so that a more complete census roster will be available for the integrated coverage measurement interviews.

Second, the Bureau has redesigned the CAPI instrument for the integrated coverage measurement interview so that it is less dependent on the judgment and skills of the interviewer and better adapted to situations that commonly arise. For example, it is now easier for a respondent to confirm the residency of the household on census day without going through each household member individually. The questions used for resolution of residency status are more structured and do not require open-ended coding by the interviewer. Each type of discrepancy between the initial integrated coverage measurement interview and the census roster is handled with a distinct set of questions tailored to that type of case, and there is also a set of questions designed for whole household nonmatches (arising when the interviewer has the wrong roster or no roster or when a family moved). Preliminary tests of this new CAPI instrument show that the interviewers followed the desired question sequence much more consistently than in the 1995 test, especially prior to the final section of the interview in which the status of unmatched people in the census roster was probed. In addition, undesirable responses were recorded at much lower rates (Bates, 1996).

As this report was being prepared, the Census Bureau decided to drop the Census-Plus option from consideration and use PES/DSE methods for integrated coverage measurement in 2000. The panel congratulates the Census Bureau on making this timely decision. It is essential to select methodology that is fairly certain to work well enough in the 1998 dress rehearsal so that only minor refinements would be necessary for the 2000 census. (In fact, it is unlikely that the Census-Plus methodology could have been conclusively supported by the results of the small 1996 test.)

This Census-Plus versus PES/DSE decision should not prevent consideration of strategies that combine some of the best features of the two approaches. For example, the PES/DSE procedure could be combined with aspects of the 1995 interview structure. In this plan, the initial (independent) phase of the integrated coverage measurement interview would be regarded as a PES interview. The reconciliation phase of the integrated coverage measurement interview would be used to collect as much information as possible about the discrepant cases, but because a dual-system estimation procedure is used, estimation does not depend on the assumption that a correct and complete resolved roster can be created in the field. In particular, the integrated coverage measurement interviewer might attempt to determine whether individuals who were listed on the census roster but not in the independent reinterview were actually census day residents, and likewise whether individuals who were listed in the independent roster but not in the census were residents on census day. Although some additional field follow-up might be necessary, it should involve fewer cases than with traditional PES methodology and therefore be completed more quickly. However, it is not clear whether the benefits of this combined interview would outweigh the delays and processing costs associated with loading census information in the CAPI instrument or whether the quality of the information obtained for these cases would be adequate to make a further interview unnecessary for a large fraction of them. For example, one issue is that the integrated coverage measurement interview only searches for matches to the census within the same housing unit, while PES procedures (at least as implemented in 1990) typically search over a larger area, such as the entire block or the block together with surrounding blocks. The 1996 census test should shed further light on these questions.
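For reference, the dual-system estimator that underlies the PES/DSE approach can be sketched as follows. This is the textbook capture-recapture (Lincoln-Petersen) form, which assumes that inclusion in the census and inclusion in the independent interview are statistically independent within a stratum; a production estimator would involve further adjustments that are not shown here. The function names and the counts in the example are hypothetical.

```python
def dual_system_estimate(census_correct, p_sample_residents, matched):
    """Capture-recapture estimate of the true population of a stratum.

    census_correct: correctly enumerated census persons (E-sample side)
    p_sample_residents: residents found by the independent interview
    matched: persons found in both sources
    """
    if matched <= 0:
        raise ValueError("at least one matched person is required")
    if matched > min(census_correct, p_sample_residents):
        raise ValueError("matches cannot exceed the size of either list")
    return census_correct * p_sample_residents / matched


def adjustment_factor(estimate, census_count):
    """Ratio applied to the census count to reach the estimated population."""
    return estimate / census_count


# Hypothetical counts for one stratum.
estimate = dual_system_estimate(census_correct=920, p_sample_residents=900, matched=850)
factor = adjustment_factor(estimate, census_count=940)
print(round(estimate, 1), round(factor, 4))
```

Because the estimate depends only on the two lists and their overlap, it does not require that a complete resolved roster be built in the field, which is the property noted in the paragraph above.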

Recommendation: The Census Bureau should consider integrated coverage measurement strategies that combine features of the current Census-Plus and PES dual-system estimation approaches, rather than restricting its options to using one approach or the other in its present form.

OTHER ISSUES AND RESEARCH

The Census Bureau is developing a detailed research agenda (Killion, 1996a) for the 2000 census. We applaud this effort and believe that relatively small investments in research now can have a big payoff in a more efficient, accurate, and operationally smooth census in 2000. At this time we note a few issues that we regard as especially important in light of our current understanding of the integrated coverage measurement methodology.

Total Error Model

Much of the discussion above deals with some specific issues with the integrated coverage measurement process in the 1995 test. A more general issue concerns the difficulties that we encountered in summarizing the effects of the various problems on the estimates. The calculations that lead from the initial census and integrated coverage measurement data to the final population estimates are quite complex, and it was often difficult to determine the quantitative effect of a particular type of error (measured or hypothesized) on the estimates.

One of the most useful products of the evaluation of the 1990 Post-Enumeration Survey was a total error analysis (Mulry and Spencer, 1993). This analysis listed all of the various types of error that could occur in the census and PES, summarized what was known about their magnitude, and then summarized their contribution to the overall bias and uncertainty of the estimates. We recognize that this approach is a difficult undertaking, but we believe that it would be useful in interpreting the census tests. Most important, it would be an essential part of the presentation of the results of the 2000 census, including the integrated coverage measurement. This type of systematic presentation will be important to provide convincing evidence of the overall accuracy of the one-number census.
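The bookkeeping behind a total error analysis can be sketched in a few lines. The example below is a simplified illustration, not the Mulry and Spencer (1993) method: it assumes the error components are independent, so that biases and variances each add, and it summarizes them as a root mean squared error. The component names and all numbers are hypothetical.

```python
import math
from typing import NamedTuple


class ErrorComponent(NamedTuple):
    name: str
    bias: float      # signed expected error, in persons
    variance: float  # in persons squared


def total_error(components):
    """Combine components, assuming they are independent of one another."""
    bias = sum(c.bias for c in components)
    variance = sum(c.variance for c in components)
    rmse = math.sqrt(bias ** 2 + variance)
    return bias, variance, rmse


# Hypothetical components and magnitudes, for illustration only.
components = [
    ErrorComponent("matching error", -300.0, 40_000.0),
    ErrorComponent("missing data", 100.0, 10_000.0),
    ErrorComponent("sampling error", 0.0, 90_000.0),
]
bias, variance, rmse = total_error(components)
print(bias, variance, round(rmse, 1))
```

A table built this way makes it possible to see which source of error dominates the overall uncertainty of an estimate.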

Recommendation: The Census Bureau should prepare to carry out a total error analysis for the 2000 census.

Use of Administrative Records

One approach that has been considered for improving the completeness of the integrated coverage measurement roster is to collect names from administrative records pertaining to each integrated coverage measurement address, load them in the CAPI instrument, and then check in the field on whether they correspond to census day residents at that address. Preliminary studies suggest that a substantial number of names could be added by this method, but it raises serious and possibly insuperable concerns about confidentiality. A discussion of administrative records is in Chapter 7.

Sample Design and Estimation Methods

We have focused on issues concerning the operational aspects of integrated coverage measurement because the 1995 and 1996 census tests were small-scale efforts that resemble the decennial census at the micro level but not at the macro level of sample design and estimation procedures. Nonetheless, choices made at the macro level will have an important effect on the usefulness of integrated coverage measurement results. A number of these issues are discussed in a previous National Research Council report (Steffey and Bradburn, 1994), and we anticipate consideration of these issues in our final report. At this time we summarize briefly some of the important questions that must be dealt with in order to plan the decennial design.

Sample design and estimation procedures are intimately related to each other because the overall size and allocation of the sample in part determines what kinds of estimates will be feasible, while the estimation procedures conversely determine what the requirements are for the sample. One important issue is whether state population estimates will all be direct estimates, that is, estimates that use only data collected within that state. If this constraint is placed on the estimation procedures, then the sample must be designed so that every state has a large enough sample size to support a direct estimate of acceptable accuracy.

Current plans call for a sample size of approximately 750,000 households for integrated coverage measurement. This figure is calculated (Navarro, 1994) to make possible direct estimates with coefficients of variation of no more than 0.5 percent for states and for some important substate areas, such as major cities. The calculations appear in a series of internal Census Bureau memoranda from 1994-1995, which consider several different sample allocations with differing tradeoffs between the criteria of controlling the maximum standard deviation of the estimated population (which requires larger samples in the larger states) and controlling the maximum coefficient of variation (which requires roughly equal sample sizes in all states). We have not yet reviewed these issues closely, but we hope that the Census Bureau will prepare a more detailed discussion of sample allocation in time for the panel to give this closer attention in our final report.
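The tradeoff between the two allocation criteria can be illustrated with a deliberately simplified model. This is not the calculation in Navarro (1994): it assumes simple random sampling of households, ignores clustering, design effects, and finite population corrections, and uses a hypothetical per-household relative variability (`unit_cv`). Under those assumptions the coefficient of variation of a state estimate is `unit_cv / sqrt(n)` and its standard deviation is the state population times that quantity.

```python
import math


def households_for_cv(unit_cv, target_cv):
    """Sample size needed to hold the coefficient of variation at target_cv.

    Under the simplified model this does not depend on state size.
    """
    return math.ceil((unit_cv / target_cv) ** 2)


def households_for_sd(population, unit_cv, target_sd):
    """Sample size needed to hold the standard deviation (in persons) at target_sd."""
    return math.ceil((population * unit_cv / target_sd) ** 2)


UNIT_CV = 0.5  # hypothetical value, for illustration only

n_cv = households_for_cv(UNIT_CV, target_cv=0.005)
n_sd_small = households_for_sd(1_000_000, UNIT_CV, target_sd=5_000)
n_sd_large = households_for_sd(10_000_000, UNIT_CV, target_sd=5_000)
print(n_cv, n_sd_small, n_sd_large)
```

In this model a fixed coefficient of variation calls for the same sample size in every state, while a fixed standard deviation calls for a sample that grows with the square of the state population, which is why the two criteria pull the allocation in different directions.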

The decision to allocate the sample size to support direct estimates for every state implies that the sampling rate--the ratio of sample size to state population--will be much lower in large states than in small ones, leaving little sample available for differential adjustment for substate domains (e.g., geographical regions in large states or urban compared with suburban and rural areas). This decision should be based on appropriate cost estimates. We also look forward to research on other features of the estimation procedure, such as the use of indirect estimates for substate domains.

Recommendation: The Census Bureau should perform the calculations necessary to clarify the effect of using direct state estimates on the sample sizes required for state estimates for the 2000 census and the consequences of these requirements for the accuracy of other estimates affected by integrated coverage measurement.

Fielding a large survey as part of the census will be a major challenge for the Census Bureau, but there is reason to believe that it will be possible. First, the management structure for the survey would be similar to that for the 1990 PES, which was successfully implemented by the regional offices. Second, the number of temporary staff in 2000 will be relatively lower than was required to implement the 1995 census test integrated coverage measurement (since the test had a much denser sample than is planned for the 2000 census), and the training and CAPI instrument will be better developed than in 1995. Since it was possible to do the necessary recruitment in 1995 even in the areas that are typical of those in which it is hard to recruit skilled personnel, it should be possible to do so in 2000.