NATIONAL RESEARCH COUNCIL
COMMISSION ON BEHAVIORAL AND SOCIAL SCIENCES AND EDUCATION
2101 Constitution Avenue Washington, D.C. 20418
COMMITTEE ON NATIONAL STATISTICS
Panel to Review the 2000 Census
Telephone: 202-334-3096 Facsimile: 202-334-3751
May 3, 1999
Dr. Kenneth Prewitt
Director
U.S. Bureau of the Census
Room 2049, Building 3 Washington, DC 20233
Dear Dr. Prewitt:
As part of its charge, the new Panel to Review the 2000 Census offers this letter report on the Census Bureau’s plans for the design of the Accuracy and Coverage Evaluation (ACE) survey, a new post-enumeration survey. This survey is needed in light of the recent U.S. Supreme Court ruling regarding the use of the census for reapportionment.
In general, the panel concludes that the ACE design work to date is well considered. It represents good, current practice in both sample design and post-stratification design, as well as in the interrelationships between the two. In this letter the panel offers observations and suggestions for the Census Bureau’s consideration as the work proceeds to complete the ACE design.
Background
Because it is not possible to count everyone in a census, a post-enumeration survey is an important element of census planning. The survey results are combined with census data to yield an alternative set of estimated counts that are used to evaluate the basic census enumeration and that can be used for other purposes. For 2000, an Integrated Coverage Measurement (ICM) survey had been planned for evaluation and to produce adjusted counts for all uses of the census.1 The recent U.S. Supreme Court ruling against the use of sampling for reapportionment among the states eliminates the need for a post-enumeration survey that supports direct state estimates, as was originally planned for the ICM survey. (The state allocations of the ICM sample design deviated markedly from a proportional-to-size allocation in order to support direct state estimation. Specifically, the ICM design required a minimum of 300 block clusters in each state.) Alternative approaches are now possible for both sample and post-stratification designs for the 2000 ACE survey. As a result, the planned ACE post-enumeration survey will differ in several important respects from the previously planned ICM survey.
1 See National Research Council (1999), Measuring a Changing Nation: Modern Methods for the 2000 Census. Michael L. Cohen, Andrew A. White, and Keith F. Rust, eds., Panel to Evaluate Alternative Census Methodologies, Committee on National Statistics, National Research Council. Washington, D.C.: National Academy Press.
The National Research Council is the principal operating agency of the National Academy of Sciences and the National Academy of Engineering to serve government and other organizations.
Plans for ACE Sample and Post-Stratification Design
Our understanding of the current plans for the ACE survey is based on information from Census Bureau staff.2 Building on its work for the previously planned ICM, the Census Bureau will first identify a sample of block clusters containing approximately 2 million housing units and then will independently develop a new list of addresses for those blocks.3 In a second stage, a sample of block clusters will be drawn from the initial sample to obtain approximately 750,000 housing units, which was the number originally planned for the ICM. (Larger block clusters will not be drawn in their entirety; they will first be subsampled to obtain sampling units of 30-50 housing units. Because the costs of interviewing are so much greater than the costs of listing addresses, this subsampling approach allows the interviewed housing units to be allocated in a more effective manner.) Finally, in a third stage, a sample of block clusters will be drawn from the second-stage sample to obtain the approximately 300,000 housing units required for the ACE sample. The target of 300,000 housing units for the ACE, which may be modified somewhat, will be based on a new set of criteria that are not yet final.
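The three selection stages described above imply simple average retention rates from one stage to the next. The sketch below works through that arithmetic using only the approximate housing-unit targets quoted in this letter. The function and variable names are illustrative, and the rates are averages over housing units, not the selection probabilities of individual block clusters, which the letter does not give.

```python
# Approximate housing-unit targets quoted in the letter (not final figures).
STAGE_TARGETS = [
    ("initial listing sample", 2_000_000),
    ("second-stage (ICM-sized) sample", 750_000),
    ("third-stage ACE sample", 300_000),
]


def retention_rates(targets):
    """Return (stage, share kept from previous stage, share of first stage).

    These are average housing-unit rates only; actual selection is by
    block cluster, with larger clusters subsampled.
    """
    first = targets[0][1]
    previous = first
    rows = []
    for name, units in targets:
        rows.append((name, units / previous, units / first))
        previous = units
    return rows


if __name__ == "__main__":
    for name, step, overall in retention_rates(STAGE_TARGETS):
        print(f"{name}: {step:.3f} of previous stage, {overall:.3f} of listed units")
```

On these targets, the second stage keeps 0.375 of the listed housing units, the third stage keeps 0.4 of the second-stage units, and the ACE sample is 0.15 of the units originally listed.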
The Census Bureau is considering three strategies for selection of the 300,000 ACE subsample from the 750,000 sample: (1) reducing the sample proportionately in terms of state and other block characteristics from 750,000 to 300,000; (2) reducing the sample by using varying proportions by state; or (3) differentially reducing the sample by retaining a higher proportion of blocks in areas with higher percentages of minorities (based on the 1990 census).4 These options for selection of the 300,000 ACE housing units from the 750,000 units first selected will be carefully evaluated. The plans include three evaluation criteria for assessing the options: (a) to reduce the estimated coefficients of variation for 51 post-stratum groups (related to the 357-cell post-stratification design discussed below); (b) to reduce the differences in coefficients of variation for race/ethnicity and tenure groups; and (c) to reduce the coefficients of variation for estimated state totals. (Option (3) above is motivated by criterion (b).)
2 See Kostanich, Donna, Richard Griffin, and Deborah Fenstermaker (1999), Accuracy and Coverage Evaluation Survey: Plans for Census 2000. Unpublished paper prepared for the March 19, 1999, meeting of the Panel to Review the 2000 Census. U.S. Bureau of the Census, Department of Commerce, Washington, D.C.
3 The use of the term block cluster refers to the adjoining of one or more very small blocks to an adjacent block for the purpose of the ACE sample design. Large blocks often form their own block clusters.
4 The Census Bureau is aware that mixtures of strategies (2) and (3) are also possible, although such mixtures are not currently being considered.
Without going into detail, it is also useful to mention that the Census Bureau has instituted a number of design changes from the 1990 post-enumeration survey for the ACE that will reduce the variation in sampling weights for blocks, which will reduce the sensitivity of the final estimates to results for individual blocks. This represents a key improvement in comparison with the 1990 design.
The current plan to produce post-strata involves modification of the 357-cell post-stratification design suggested for use in 1990-based intercensal estimation. Current modifications under consideration by the Census Bureau include expansion of the geographic stratification for non-Hispanic whites from four regions to nine census divisions, adding a race/ethnicity group, changing the definition of the urbanicity variable, and adding new post-stratification factors, such as mail return rate at the block level. Logistic regression, modeling inclusion in the 1990 census, is being used to help identify new variables that might be useful, as well as to provide a hierarchy of the current post-stratification factors that will be used to guide collapsing of cells if that is needed. (In comparison, the analysis that generated the 357-cell post-stratification was based on indirect measures of census undercoverage, such as the census substitution rate.)
The Census Bureau plan demonstrates awareness of the interaction of its modification of the 750,000 housing unit sample design with its modification of the 357 post-strata design. (On the most basic level, the sample size allocated to each post-stratum determines the variance of its estimate.)
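As a rough illustration of the parenthetical remark above, the following sketch shows how the coefficient of variation of an estimated rate shrinks as the sample allocated to a post-stratum grows. It assumes a simple binomial model with an optional design-effect multiplier. Both are simplifying assumptions introduced here for illustration and are not the Census Bureau's variance estimation method. The rate, the sample sizes, and the function name are hypothetical.

```python
import math


def cv_of_rate(p, n, design_effect=1.0):
    """Coefficient of variation of an estimated rate under a binomial model.

    p             -- hypothetical true rate in the post-stratum (0 < p <= 1)
    n             -- number of sampled units allocated to the post-stratum
    design_effect -- assumed variance inflation for clustering (1.0 = none)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n <= 0 or design_effect <= 0:
        raise ValueError("n and design_effect must be positive")
    variance = design_effect * p * (1.0 - p) / n
    return math.sqrt(variance) / p


if __name__ == "__main__":
    # Hypothetical allocations for one post-stratum with an assumed rate of 0.9.
    for n in (400, 1600, 6400):
        print(f"n = {n}: CV = {cv_of_rate(0.9, n):.4f}")
```

Under this model the CV falls with the square root of the sample size, so quadrupling the allocation to a post-stratum halves its CV. This is why shifting sample between post-strata trades precision in one group against precision in another.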
The plan also makes clear that even though much of the information used to support this modification process must be based on the 1990 census, it is important that the ultimate design for the ACE survey (and any associated estimation) allows for plausible departures from the 1990 findings. For example, significant differences between the 1990 and 2000 censuses could stem from the change in the surrounding block search for matches, the planned change in the treatment of ACE movers, or changes in patterns and overall levels of household response.
Observations and Comments
Sample design to select the 300,000 housing units
Because of the need to keep the ACE on schedule by initiating resource allocations that support the independent listing of the 2 million addresses relatively soon, as well as the need to avoid development and testing of new computer software, the Census Bureau has decided to subsample the 300,000 ACE housing units from the 750,000 housing units of the previously planned ICM design. The panel agrees that operational considerations support this decision.
The cost of the constraint of selecting the 300,000 ACE housing units from the 750,000 ICM housing units, in comparison with an unconstrained selection of 300,000 housing units, is modest. While the constrained selection will likely result in estimates with somewhat higher variances, the panel believes that careful selection of the subsample can limit the increase in variance so that it will not be consequential. (By careful selection, the panel means use of the suggested approaches of the Census Bureau, or new or hybrid techniques, to identify a method that best satisfies the criteria listed above.) This judgment by the panel, although not based on a specific analysis by itself or the Census Bureau, takes into account the fact that a large fraction of the 750,000 housing units of the ICM design are selected according to criteria very similar to those proposed for the ACE design. In addition, the panel notes that the removal of the requirement for direct state estimates permits a substantial reduction in sample size from the 750,000 ICM design in sparsely populated states, for which ACE estimates can now pool information across states. As a result, the ACE design could result in estimates with comparable reliability to that of the previously planned, much larger ICM design.
Given the freedom to use estimates that borrow strength across states, the final ACE sample should reduce the amount of sampling within less populous states from that for the preliminary sample of 750,000 housing units. However, there is a statistical basis either for retaining a minimum ACE sample in each state, or what is nearly equivalent, for retaining a sample to support an ACE estimate with a minimum coefficient of variation. The estimation now planned for the ACE survey assumes that there will be no important state effects on post-stratum undercoverage factors. In evaluating the quality of ACE estimates, it will be important to validate this assumption, which can only be done for each state if the direct state estimates are of sufficient quality to support the comparison, acknowledging that for some of these analyses one might pool data for similar, neighboring states. (Identification of significant state effects would not necessarily invalidate use of the ACE estimates for various purposes but would be used as part of an overall assessment of their quality.) This validation could take many forms, and it is, therefore, difficult to specify the precise sample size or coefficient of variation needed. We offer one approach the Census Bureau should examine for assessing the adequacy of either type of standard.
Using the criteria for evaluating alternative subsample designs (i.e., the estimated coefficients of variation for 51 post-stratum groups, the differences in coefficients of variation for race/ethnicity and tenure groups, and the coefficients of variation for state totals), the Census Bureau should try out various state minima sample sizes to determine their effects on the outputs. It is possible that a moderately sized state minimum sample can be obtained without affecting the above coefficients of variation to any important extent. There are a variety of ways in which the assumption of the lack of residual state effects after accounting for post-stratum differences could be assessed, including regression methods. We encourage the Census Bureau to consider this important analytic issue early and provide plans for addressing it before the survey design is final.
The panel makes one additional point on state minima. The state minima will support direct state estimates that will be fairly reliable for many states. The Census Bureau should consider using the direct state estimates not only for validation, but also in estimation, in case of a failure of the assumption that there will be no important state effects on undercoverage factors. Specifically, the Census Bureau should examine the feasibility of combining the currently planned ACE estimates at the state level with the direct state estimates, using estimated mean-squared error to evaluate the performance of such a combined estimate in comparison with the currently planned estimates. We understand that the necessity of prespecification of census procedures requires that the Census Bureau formulate an estimation strategy prior to the census, which adds urgency to this issue.
Finally, the panel has two suggestions with respect to the criteria used for assessing the ACE sample design. First, there should be an assessment of the quality of the estimates for geographic areas at some level of aggregation below that of states, as deemed appropriate by the Census Bureau. (This criterion is also important for evaluating the ACE post-stratification design, discussed below.) Second, the importance of equalizing the coefficients of variation for different post-strata depends on how estimates for specific post-strata with higher coefficients of variation adversely affect the variance of estimated counts for certain areas. Coefficients of variation for post-strata that do not have much effect have less need to be controlled, assuming that the estimates for these post-strata do not have other uses.
Post-stratification plans
The 1990 census adjusted counts used 1,392 post-strata, but post-production analysis for calculating adjusted counts for intercensal purposes resulted in the use of 357 post-strata. The panel believes that the use of these 357 post-strata (and the hierarchy for collapsing post-stratification cells) was a reasonable design for 1990, and that, in turn, the 1990 design is a good starting point in determining the post-strata to be used in the 2000 ACE.
The Census Bureau is considering four types of modifications to the 357 post-strata design, although it has not yet set the criteria for evaluating various post-stratification designs. Logistic regression will be used to identify new variables and interactions of existing variables that might be added to the post-stratification. Finer post-strata have the advantage of greater within-cell homogeneity, potentially producing better estimates when carried down to lower levels of geographic aggregation. Some gains with respect to the important problem of correlation bias might also occur. However, stratifying on factors that are not related to the undercount will generally decrease the precision of undercount adjustments.
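The balance between within-cell homogeneity and precision can be made concrete with a small mean-squared-error comparison. The sketch below compares estimating a rate for one cell from its own sample against estimating it from a cell collapsed with a neighbor. It assumes independent binomial samples in the two cells; the rates, sample sizes, and function names are hypothetical and are not taken from the Census Bureau's plans.

```python
def mse_separate(p_target, n_target):
    """MSE of the target cell's own estimated rate (unbiased, so MSE = variance)."""
    return p_target * (1.0 - p_target) / n_target


def mse_collapsed(p_target, n_target, p_other, n_other):
    """MSE, for the target cell, of the pooled rate from two collapsed cells.

    The pooled estimate has lower variance but is biased for the target
    cell whenever the two cells' true rates differ.
    """
    total = n_target + n_other
    expected = (n_target * p_target + n_other * p_other) / total
    variance = (
        n_target * p_target * (1.0 - p_target)
        + n_other * p_other * (1.0 - p_other)
    ) / total**2
    bias = expected - p_target
    return variance + bias**2


if __name__ == "__main__":
    # Splitting on a factor unrelated to the rate: collapsing is more precise.
    print(mse_separate(0.9, 200), mse_collapsed(0.9, 200, 0.9, 200))
    # Splitting on a factor strongly related to the rate: separate cells win.
    print(mse_separate(0.8, 200), mse_collapsed(0.8, 200, 0.95, 200))
```

When the two hypothetical cells share the same rate, keeping them separate only halves the effective sample and doubles the variance. When their rates differ substantially, the squared bias from collapsing dominates and the finer stratification has the smaller error.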
The tradeoff between within-cell homogeneity and precision needs to be assessed to determine whether certain cells should be collapsed and whether additional variables should be used. It is also important to examine the effects of various attempts at post-stratification on the quality of substate estimates, especially since certain demographic groups are more subject to undercoverage, and so substate areas with a high percentage of these groups will have estimates with higher variances. (This argument is based on the fact that, as in the binomial situation, the mean and the variance of estimated undercounts are typically positively related.)
We believe it is extremely important that analyses at substate levels of aggregation be conducted to inform both the sample design and the post-stratification scheme. Furthermore, this issue needs to be studied simultaneously with that of the effect of the design and post-stratification on the post-stratum estimates. The fact that analysis of substate areas appears in both sample design and post-stratification design is an indication of the important interaction between these two design elements and justifies the need for studies of them to be carried out simultaneously. The panel encourages the Census Bureau to work on them at the same time.
The panel notes that the decision to use a modification of the 357-strata system from 1990 for the ACE post-stratification design will probably not permit many checks against estimates from demographic analysis that use direct estimates from ACE. This limitation may increase the difficulty of identifying the precise source of large discrepancies in these comparisons. However, the panel does not view this as a reason not to proceed, since the precision of direct estimates at the finest level of detail of post-stratification (using 1,392 strata in this context) could make such comparisons more difficult to interpret, and the estimates from demographic analysis are not extremely useful for this purpose (except for blacks, and then only nationally).
As work on both the sample design and post-stratification design progresses, the Census Bureau should not rely entirely on information from the 1990 census: substantial differences might occur between the 1990 and the 2000 censuses that would lead to either a sample design or a post-stratification design that was optimized for 1990 but that might not perform as well in 2000. Instead, the Census Bureau should use a sample design that moves toward a more equal probability design than 1990 information would suggest. Similarly, the Census Bureau, using whatever information is available since 1990 on factors related to census undercoverage, should develop a post-stratification design that will perform well for modest departures from 1990.
Finally, when considering criteria for both sample design and post-strata, it is important to keep in mind that the goal of the census is to provide estimated counts for geographic areas as well as for demographic groups. Since the use of equal coefficients of variation for post-strata will not adequately balance these competing demands, the Census Bureau will need to give further attention to this difficult issue. The balancing of competing goals is not only a post-stratification issue, but also a sample design issue. For example, if block clusters that contain large proportions of a specific demographic group are substantially underrepresented in the ACE sample, the performance of the estimates for some areas could be affected.
Documentation
Given the importance of key decisions and input values for the ACE design, it is important that they be documented.
In particular, the Census Bureau should produce an accessible document in print or in electronic form that (1) gives the planning values for state-level, substate-level, and post-stratum-level variances resulting from the decisions for the sample and post-stratification designs and (2) provides the sampling weights used in the ACE selection of block clusters.
Summary
From its review of the Census Bureau’s current plans for design of the ACE survey, the panel offers three general comments:
The panel concludes that the general nature of the Census Bureau’s work on the ACE design represents good, current practice in sample design and post-stratification design and their interactions.
The panel recognizes that operational constraints make it necessary for the Census Bureau to subsample the ACE from the previously planned ICM sample. The subsampling, if done properly, should not affect the quality of the resulting design if compared with one that sampled 300,000 housing units that were not a subset of the 750,000 housing units previously planned for the ICM.
The panel believes that removal of the constraint to produce direct state estimates justifies the substantial reduction in the ACE sample size from the ICM sample size. The planned ACE could result in estimates with comparable reliability to that of the larger ICM design.
The panel offers three suggestions for the Census Bureau as it works to finalize the ACE design, some of which the Census Bureau is already considering: (1) a method for examining how large a state minimum sample to retain; (2) some modifications in the criteria used to evaluate the ACE sample design and post-stratification, namely, lower priority for coefficients of variation for excessively detailed post-strata and more attention to coefficients of variation for substate areas; and (3) a possible change in the ACE estimation procedure, involving use of direct state estimates in combination with the currently planned estimates. In addition, the Census Bureau should fully document key decisions for the ACE design.
The panel looks forward to continuing to review the ACE design and estimation as the Census Bureau’s plans are further developed. The panel is especially interested in the evolving plans for post-stratification design, including the use of logistic regression to identify additional post-stratification factors; plans for the treatment of movers in ACE; and the treatment of nonresponse as it relates to unresolved matches in ACE estimation. In addition, after data have been collected, the panel is interested in the assessment of the effect of nonsampling error on ACE estimation and the overall evaluation criteria used to assess the quality of ACE estimates.
We conclude by commending you and your staff for the openness you have shown and your willingness to discuss the ACE survey and other aspects of the planning for the 2000 census.
Sincerely,
Janet L. Norwood, Chair
Panel to Review the 2000 Census
Attachment: Panel Roster