Initial Views on 2010 Census Evaluations
SUGGESTIONS FOR THE 2010 CENSUS EVALUATIONS
The panel’s first priority is to provide input to the selection of experiments to be implemented in 2010, since the design of these experiments needs to begin very soon to allow for the development of associated materials and protocols. In addition, the panel has some suggestions relative to the evaluations to be carried out in conjunction with the 2010 census. There is also a time pressure for them since, as stated previously, much of the data collection in support of the 2010 census evaluations needs to be specified relatively early, in particular so that the contractors involved in many of the census processes can make plans for the collection and structuring of data extracts that relate to the functioning of those processes.
Address List Improvement
For the 2000 census, the Census Bureau departed from past practice of building the address list for the census from scratch. Instead, it pursued a strategy of building a Master Address File (MAF), using the 1990 address list as a base and seeking ways to “refresh” the database during the intercensal period. Legislation enacted in 1994 created two major tools for address list improvement. First, the new law authorized the Census Bureau to use the U.S. Postal Service’s Delivery Sequence File (DSF; as the name suggests, a master list of mail delivery addresses and locations used to plan postal routes) as an input source. Second, it permitted limited sharing of extracts of the Master Address File (which is confidential information under Title 13 of the U.S. Code) with local and tribal governments. Specifically, this provision led to the creation of the Local Update of Census Addresses (LUCA) program, first conducted in several phases in 1998 and 1999 (see National Research Council, 2004a:62–65).
The Master Address File used to support the American Community Survey during the intercensal period is essentially an update of the 2000 census MAF, revised to include edits to the Postal Service’s Delivery Sequence File and new construction. Through these actions, the MAF, heading into the 2010 census, will certainly be more than 90 percent complete but probably not 99 percent complete. (There will almost certainly be a substantial amount of duplication as well.)
The Census Bureau will utilize two operations to increase the degree of completeness of the MAF from its status in 2008 in preparation for its use in the decennial census in 2010. First, it will again use the LUCA program, in which local governments will be asked to review preliminary versions of the MAF for completeness and to provide addresses that may have been missed (or added in error). However, even granting that LUCA will be improved over the 2000 version, it is likely that the participation will be uneven and that a substantial amount of incompleteness will remain after these addresses are added to the MAF. In anticipation of that, the Census Bureau will carry out a national block canvass, visiting each census block, and adding any missed housing units to the MAF (while collecting information from global positioning systems for all housing units).
It may be the case that for many well-established blocks in the United States a 100 percent block canvass is wasteful, given that there is little possibility in these blocks of addition or deletion of housing units over time. It would be useful to identify such blocks in advance, since then the block canvass could be restricted to the subset of blocks in need of MAF updating (this is consistent with item C.3 in Appendix A). Given the costs of a 100 percent block canvass, identifying a targeting methodology that does an excellent job of discriminating between those blocks that are very stable over time and those blocks that are likely to have recent additions or deletions (or both) would provide substantial cost savings with possibly only a negligible increase in the number of omissions (or erroneous inclusions) in the MAF. It is likely that administrative records, especially building permit records, commercial geographic information systems, and the ACS could provide useful predictors in discriminating between stable and nonstable blocks. Such targeting is already used in the Canadian census; it uses an address register that is updated intercensally, and field verification is restricted to areas where building permit data indicate the presence of significant new construction (Swain et al., 1992).
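As an illustration of how a candidate targeting rule might be assessed, the sketch below compares the share of blocks that would be skipped with the share of housing unit changes that would be missed as a result. The block records, the `score` field (a predicted likelihood of change, for instance derived from building permit counts), and the threshold are all hypothetical; nothing here reflects an actual Census Bureau method or actual data.

```python
def evaluate_targeting(blocks, threshold):
    """Canvass only blocks whose predicted change score is at or above threshold.

    Returns the share of blocks skipped and the share of actual housing
    unit changes (additions or deletions) that fall in the skipped blocks.
    """
    skipped = [b for b in blocks if b["score"] < threshold]
    total_changes = sum(b["actual_changes"] for b in blocks)
    missed = sum(b["actual_changes"] for b in skipped)
    return {
        "share_blocks_skipped": len(skipped) / len(blocks),
        "share_changes_missed": missed / total_changes if total_changes else 0.0,
    }


# Hypothetical blocks: a predicted score and the changes a full canvass found.
blocks = [
    {"block": "B1", "score": 0.05, "actual_changes": 0},
    {"block": "B2", "score": 0.10, "actual_changes": 0},
    {"block": "B3", "score": 0.20, "actual_changes": 1},
    {"block": "B4", "score": 0.70, "actual_changes": 6},
    {"block": "B5", "score": 0.90, "actual_changes": 13},
]

result = evaluate_targeting(blocks, threshold=0.5)
print(result)
```

An evaluation of this kind is possible only if the results of the 100 percent canvass are retained alongside the candidate predictors for each block.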
To support the determination as to whether any targeting methods might satisfy this need—and, indeed, to facilitate a richer evaluation of MAF accuracy than was possible in 2000—the Census Bureau should ensure that the complete source code history of every MAF address is recoverable. In 2000, the MAF was not structured so that it was possible to fully track the procedural history of addresses—that is, which operations added, deleted, or modified the address at different points of time. Therefore, it was not possible to accurately determine the unique contributions of an operation like LUCA or the block canvass; nor was it possible to assess the degree to which various operations overlapped each other in listing the same addresses. Census Bureau staff ultimately derived an approximate “original source code” for MAF addresses, albeit with great difficulty; see National Research Council (2004b: 146–147). Redesign of the MAF database structure was included in the plans to enhance MAF and TIGER during this decade; the Census Bureau should assess whether the new structure will adequately track the steps in construction of the 2010 (and future) MAF.
Recommendation 4: The Census Bureau should design its Master Address File so that the complete operational history—when list-building operations have added, deleted, modified, or simply replicated a particular address record—can be reconstructed. This information will support a comprehensive evaluation of the Local Update of Census Addresses and address canvassing. In addition, sufficient information should be retained, including relevant information from administrative records and the American Community Survey, to support evaluations of methods for targeting blocks that may not benefit from block canvassing. Finally, efforts should be made to obtain addresses from commercial mailing lists to determine whether they also might be able to reduce the need for block canvassing.
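A minimal sketch of the kind of record structure that would make such a history recoverable is given below. It assumes an append-only log in which every list-building operation writes one event per address it touches; the field names, operation labels, and sample records are hypothetical and are not drawn from the actual MAF design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressEvent:
    address_id: str  # hypothetical MAF record identifier
    operation: str   # e.g. "DSF", "LUCA", "BLOCK_CANVASS"
    action: str      # "add", "delete", "modify", or "replicate"
    date: str        # ISO date, so string order equals time order


def history_by_address(events):
    """Group events by address and sort each group chronologically."""
    grouped = defaultdict(list)
    for ev in events:
        grouped[ev.address_id].append(ev)
    return {aid: sorted(evs, key=lambda e: e.date) for aid, evs in grouped.items()}


def listing_operations(history):
    """Map each address to the set of operations that added or replicated it."""
    return {
        aid: {e.operation for e in evs if e.action in ("add", "replicate")}
        for aid, evs in history.items()
    }


def unique_contribution(history, operation):
    """Addresses that only the given operation listed."""
    ops_by_address = listing_operations(history)
    return {aid for aid, ops in ops_by_address.items() if ops == {operation}}


def overlap(history, op_a, op_b):
    """Addresses listed by both operations."""
    ops_by_address = listing_operations(history)
    return {aid for aid, ops in ops_by_address.items() if {op_a, op_b} <= ops}


# Hypothetical events, deliberately out of chronological order.
events = [
    AddressEvent("A2", "BLOCK_CANVASS", "modify", "2009-04-12"),
    AddressEvent("A1", "DSF", "add", "2008-01-15"),
    AddressEvent("A1", "LUCA", "replicate", "2008-06-01"),
    AddressEvent("A2", "LUCA", "add", "2008-06-01"),
    AddressEvent("A3", "BLOCK_CANVASS", "add", "2009-04-10"),
]

history = history_by_address(events)
print(sorted(unique_contribution(history, "LUCA")))
print(sorted(overlap(history, "DSF", "LUCA")))
```

Because events are never overwritten, both the unique contribution of an operation and the overlap between operations can be computed after the fact, which is what could not be done reliably with the 2000 MAF.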
Master Trace Sample
The idea of creating a master trace sample, namely designating a sample of households in, say, census blocks, for which the full history of relevant census operations is retained in an accessible manner for subsequent analysis, is extremely important. In each decennial census, there are unanticipated problems that need to be fully understood in order to make modifications to the census design, to partially or completely eliminate their chance of occurring in the subsequent decennial census. A master trace sample provides an omnibus tool for investigating the source of any of a large variety of potential deficiencies that can arise in such a complicated undertaking as the decennial census. Otherwise, the Census Bureau is usually left with evaluation studies that, due to the limited information available, are often univariate or bivariate summaries that cannot inform about even relatively simple interactions between the individuals, the housing unit, and the enumeration techniques that resulted in a higher frequency of coverage (or content) errors.
The value of a master trace sample database or system has been advocated by several National Research Council panels, including the Panel on Decennial Census Methodology (National Research Council, 1985:Rec. 6.3), the second phase of the Panel on Decennial Census Methodology (National Research Council, 1988), the Panel on Alternative Census Methodologies (National Research Council, 1999:Rec. 5.1), and the Panel on Research on Future Census Methods (National Research Council, 2004b:Rec. 8.4, 8.5, 8.6, 8.7). The last cited report contains a useful history of the development of this idea and includes the following recommendation: “The Census Bureau should carry out its future development in this area of tracing all aspects of census operations with the ultimate aim of creating a Master Trace System, developing a capacity for real-time evaluation by linking census operational databases as currently done by the Master Trace Sample. Emerging 21st century technology should make it feasible to know almost instantaneously the status of various census activities and how they interact. Such a system should be seriously pursued by the Census Bureau, whether or not it can be attained by 2010 (or even by 2020).” Such a proposal is a straightforward generalization of item A.3 of the Census Bureau’s list, though expanding from a focus on the coverage measurement survey to the full set of census operations.
Such a database could be used to evaluate many things, including determining what percentage of census omissions are in partially enumerated households and what percentage of omissions are found on the merged administrative records database. A master trace sample database would be extremely useful in addressing the needs described in the previous section, including understanding the source of duplicates in the Master Address File and evaluating the benefits of LUCA and the block canvass operation. An overall assessment of the workings of the coverage follow-up interview would be feasible if the master trace sample database collected sufficient data so that it was known for each housing unit in the CFU interview what triggered the CFU interview and what the result of the interview was—that is, what changes were made and what information precipitated the change. As indicated, inclusion of the merged administrative records file and relevant data from the American Community Survey in such a database would provide additional information at the individual and local area levels.
Creation of a master trace sample presents a number of challenges. First, there is the retention of the data from the census and affiliated activities. Some modest planning is needed here, especially given the necessity of collecting data from various contractors who are likely not to have planned in advance to provide for such data extracts. In addition, it is necessary to find an effective way of linking the information retained about the enumerators, the housing units, the residents, the census processes, the type of census coverage error made, and contextual information in a way that facilitates a broad range of potential analyses, especially those that examine interactions among these various aspects of the census process. Also, it is crucial to decide early on the minimum set of data to be included in the master trace sample database. This is because, while the addition of various sets of variables from different parts of the census and the census management information system provides broader capabilities for investigating various aspects of census-taking, the inclusion of each additional set of variables complicates the formation of the database. This is a hard database management problem, and the Census Bureau should enter into such a project recognizing the need for considerable expertise in database management to ensure success. (We think that the relative lack of use of the 2000 Master Trace Sample was due in part to its inability to facilitate many types of analysis.)
An additional concern is that the sampled blocks included have to be kept confidential so that the behavior in these blocks is representative of the entire census. Finally, we do not think the size of the master trace sample database is a major concern. A smaller but somewhat analogous database was constructed by the Census Bureau in 2000 and, as noted above, there have been substantial advances in computing memory and speed since then.
Recommendation 5: The Census Bureau should initiate efforts now for planning the general design of a master trace sample database and should plan for retention of the necessary information to support its creation.
Reverse Record Check
The Canadian Census has successfully employed a reverse record check for the last eight censuses to measure net coverage error. Briefly, four samples are collected: (1) a sample of enumerations from the previous census, (2) a sample of births in the intercensal period, (3) a sample of immigrants in the intercensal period, and (4) a sample of those missed in the previous census. The fourth sample is clearly the most difficult, but by matching those contained in the four samples for the previous reverse record check to the census to determine omissions and continuing this process over several censuses, a relatively useful sample of omissions can be formed over time. Once the four samples are formed, current addresses are determined, and the sample is matched to the census using name, addresses, and other characteristics. In a separate operation, the census is matched against itself to generate an estimate of the overcount, and, using both, an estimate of the net undercount is derived. Characteristics for both the omissions and overcounts support tabulations by age, sex, race, geography, etc.
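The arithmetic behind the final step can be sketched as follows. This is a deliberately simplified illustration, not the estimator used by Statistics Canada: it assumes each traced person carries a sampling weight, that persons found to be out of scope on Census Day (for example, those who have died) are excluded, and that the overcount estimate from the separate census-to-census match is supplied as a single number. All field names and values are hypothetical.

```python
def estimate_omissions(sample):
    """Weighted count of in-scope sampled persons not found in the census."""
    return sum(
        p["weight"]
        for p in sample
        if p["in_scope"] and not p["found_in_census"]
    )


def net_undercount(sample, overcount_estimate):
    """Estimated omissions minus estimated overcount."""
    return estimate_omissions(sample) - overcount_estimate


# One hypothetical traced person from each frame, plus one out-of-scope case.
sample = [
    {"frame": "previous_census", "weight": 1000, "in_scope": True, "found_in_census": True},
    {"frame": "previous_census", "weight": 1000, "in_scope": False, "found_in_census": False},
    {"frame": "births", "weight": 500, "in_scope": True, "found_in_census": False},
    {"frame": "immigrants", "weight": 400, "in_scope": True, "found_in_census": True},
    {"frame": "previous_missed", "weight": 300, "in_scope": True, "found_in_census": False},
]

omissions = estimate_omissions(sample)
net = net_undercount(sample, overcount_estimate=250)
print(omissions, net)
```

Because each sampled record keeps its characteristics, the same sums can be restricted to an age, sex, race, or geographic subgroup to produce the tabulations described above.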
To date, this procedure has not been used to evaluate the U.S. decennial census, mainly due to the 10-year period between censuses (as opposed to the 5 years between Canadian censuses), which complicates the need to trace people’s addresses from one census to the next. This issue was specifically examined in the Forward Trace Study (Mulry-Liggan, 1986). However, with administrative records systems improving each year, and given the emergence of the American Community Survey, tracing people over a 10-year period is likely to be much more feasible now in comparison to 1984. Furthermore, a reverse record check has an important advantage over the use of a postenumeration survey with dual-systems estimation in that there is no need to rely on assumptions of independence or homogeneity to avoid correlation bias, a type of bias that occurs in estimating those missed by both the census and the postenumeration survey. There are also more opportunities for validating the reliability of the estimates provided. For example, a reverse record check provides an estimate of the death rate. The key issue concerning feasibility remains tracing, and a useful test of this would be to take the 2006-2007 ACS and match that forward to see how many addresses could be found over the 3.5-year period. In such a test, the ACS would serve as a surrogate for the sample from the previous census enumerations. Either relating this back to a sample of census enumerations and a sample of census omissions, or developing a sample of ACS omissions, remains to be worked out. But certainly, successful tracing of nearly 100 percent of the ACS would be an encouraging first step.
Recommendation 6: The Census Bureau, through the use of an experiment in the 2010 census (or an evaluation of the 2010 census) should determine the extent to which the American Community Survey could be used as a means for evaluating the coverage of the decennial census through use of a reverse record check.
Edit Protocols
Edit protocols are decisions about enumerations or the associated characteristics for a housing unit that are made based on information already collected, hence avoiding additional fieldwork. For example, an edit protocol might be that, when an individual between ages 18 and 21 is enumerated both away at college and at their parents’ home, the enumeration at the parents’ home is deleted. (Note that census residence rules are to enumerate college students where they are living the majority of the time, which is typically at the college residence.) This would avoid sending enumerators either to the parents’ home or to the college residence, but it would occasionally make this decision in error. The Census Bureau has made widespread use of edit protocols in the past to deal with inconsistent data. For example, there are rules to deal with inconsistent ages and dates of birth. Furthermore, early in 2000, when it became apparent that the MAF had a large number of duplicate addresses, the Census Bureau developed an edit protocol to identify the final count for households with more than one submitted questionnaire (see Nash, 2000).
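The college student example can be written as a simple rule. The sketch below assumes that duplicate enumerations of the same person have already been linked under a common `person_id` (in practice a substantial matching problem in its own right) and treats the age range 18 to 21 as inclusive; the field names and records are hypothetical.

```python
def apply_college_edit(enumerations):
    """Delete the parental-home record for persons aged 18-21 also counted at college.

    Returns (kept, deleted) so that the deleted records remain available
    for later evaluation of how often the rule decided in error.
    """
    at_college = {
        e["person_id"]
        for e in enumerations
        if e["location_type"] == "college" and 18 <= e["age"] <= 21
    }
    kept, deleted = [], []
    for e in enumerations:
        if e["person_id"] in at_college and e["location_type"] == "parental_home":
            deleted.append(e)
        else:
            kept.append(e)
    return kept, deleted


# Hypothetical linked enumerations.
enumerations = [
    {"person_id": "P1", "age": 19, "location_type": "college"},
    {"person_id": "P1", "age": 19, "location_type": "parental_home"},
    {"person_id": "P2", "age": 25, "location_type": "college"},
    {"person_id": "P2", "age": 25, "location_type": "parental_home"},
    {"person_id": "P3", "age": 20, "location_type": "parental_home"},
]

kept, deleted = apply_college_edit(enumerations)
print(len(kept), len(deleted))
```

Returning the deleted records rather than discarding them is a design choice made here so that the error rate of the protocol could be assessed afterward.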
More generally, edit protocols might be useful in resolving duplicate residences, as well as in situations in which the household count does not equal the number of people who are listed as residents. Again, as with targeting, edit protocols avoid field costs but do have the potential of increased census error. However, given the increasing costs of the decennial census, understanding precisely what the trade-offs are for various potential edit protocols would give the Census Bureau a better idea of which of these ideas are more or less promising to use in the 2020 census. The panel therefore suggests that the Census Bureau prioritize evaluations that assess the promise of various forms of edit protocols and therefore retain sufficient data to ensure that such evaluations can be carried out. Creation of a master trace sample is likely to satisfy this data need.
Coverage Assessment of Group Quarters
The census coverage measurement program in 2010 will not assess some aspects of the coverage error for individuals living in group quarters. Through use of a national match, as in the 2000 census evaluation, the Census Bureau will be able to estimate the number of duplicates both between those in the group quarters population and those in the nongroup quarters population and the number of duplicates entirely within the group quarters population (see Mule, 2002, for the rate of duplication for various types of group quarters in the 2000 census). However, the number of omissions for group quarters residents will not be measured in 2010, nor will the number of group quarters and their residents who are counted in the wrong place.
Given the variety of ways that group quarters are enumerated, and given the various types of group quarters, coverage evaluation methods will probably need to be tailored to the specific type. We are unclear about the best way to proceed, but it is crucial that the Census Bureau find a reliable way to measure the coverage error for this group, which has been unmeasured for two censuses, going on a third. It is likely that there are sources of information, which if retained, could be used to help evaluate various proposals for measuring coverage error for group quarters residents in 2020.
What is needed is that the list of residents as of Census Day for a sample of group quarters be retained, and for this sample to be drawn independently of the Census Bureau’s list of group quarters. Creating such a list probably differs depending on the type of group quarters. One would take the list of residents as the ground truth, and determine whether the residents had been included in the census and at which location. These ideas are very preliminary, and we hope to revisit this issue prior to issuing our final report. (This general topic was item A.4 on the Census Bureau’s list.)
Recommendation 7: The Census Bureau should collect sufficient data in 2010 to support the evaluation of potential methods for assessing the omission rate of group quarters residents and the rate of locating group quarters in the wrong census geography. This is a step toward the goal of improving the accuracy of group quarters data.
Training of Field Enumerators
The 2010 census will be the first in which handheld computing devices are used. They will be used in the national block canvass to collect information on addresses to improve the MAF, and they will also be used for nonresponse follow-up and for coverage follow-up. While the implementation of handheld computing devices was tested in the 2006 census test and will be tested further in the 2008 dress rehearsal, there remain concerns as to how successful training will be and whether some enumerators will find the devices too difficult to comfortably learn to use in the five days allotted to training. Given that it will be extremely likely that such devices will again be used to collect information in 2020 (and in other household surveys intercensally), it would be useful to collect information on who quit, and why they quit, during the training for field enumeration work, who quit and why they quit during fieldwork, and the effectiveness of the remaining enumerators using the devices. In addition, any characteristics information that would be available from their employment applications should be retained as potential predictors for the above. Finally, the Census Bureau should undertake some exit interviews of those leaving training early and those quitting fieldwork early to determine whether their actions were due to discomfort with the handheld devices. This might provide some information either about training that would be useful in adjusting the training used in 2020, or about the ease of use of the devices or about hiring criteria. (This issue is consistent with item D.3 on the Census Bureau’s list.)
A GENERAL APPROACH TO CENSUS EVALUATION
The panel also has some general advice on selecting and structuring census evaluations. As mentioned above, the evaluations in 2000 were not as useful as they could have been in providing detailed assessments as to the types of individuals, housing units, households, and areas for which various census processes performed more or less effectively. This is not to say that an assessment of general functioning is not important, since processes that experienced delays or other problems are certainly candidates for improvement. However, evaluations focused on general functioning do not usually provide as much help in pointing the way toward improving census processes as analyses for subdomains or analyses that examine the interactions of various factors. Since the costs of such analyses are modest, we strongly support the use of evaluations for this purpose. This issue was addressed in The 2000 Census: Counting Under Adversity, which makes the following recommendation, which this panel supports (National Research Council, 2004a:Rec. 9.2):
The Census Bureau should materially strengthen the evaluation [including experimentation] component of the 2010 census, including the ongoing testing program for 2010. Plans for census evaluation studies should include clear articulation of each study’s relevance to overall census goals and objectives; connections between research findings and operational decisions should be made clear. The evaluation studies must be less focused on documentation and accounting of processes and more on exploratory and confirmatory research while still clearly documenting data quality.
To this end, the 2010 census evaluation program should:
(1) identify important areas for evaluations (in terms of both 2010 census operations and 2020 census planning) to meet the needs of users and census planners and set evaluation priorities accordingly;
(2) design and document data collection and processing systems so that information can be readily extracted to support timely, useful evaluation studies;
(3) focus on analysis, including use of graphical and other exploratory data analysis tools to identify patterns (e.g., mail return rates, imputation rates) for geographic areas and population groups that may suggest reasons for variations in data quality and ways to improve quality (such tools could also be useful in managing census operations);
(4) consider ways to incorporate real-time evaluation during the conduct of the census;
(5) give priority to development of technical staff resources for research, testing, and evaluation; and
(6) share preliminary analyses with outside researchers for critical assessment and feedback.
Item (3) is particularly important, in stressing the need for analysis, not just summaries of the (national) functioning of various census processes.
We think that evaluations should attempt to answer two types of questions. First, evaluations should be used to support or reject leading hypotheses about the effects on census costs or data quality of various census processes. Some of these hypotheses would be related to the list of topics and questions that were provided to the panel, but more quantitatively expressed. For example, such a hypothesis might be that bilingual questionnaire delivery will increase mail response rates in the areas in which it is currently provided in comparison with not using this technique. To address this question, assuming that targeting of mail questionnaires to all areas with a large primarily Spanish-speaking population is used, one might compare the mail response for areas just above the threshold that initiates this process to those just below. While certainly not as reliable or useful as a true experiment, analyses such as these could provide useful evidence for the assessment of various component processes without any impact on the functioning of the 2010 census.
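The threshold comparison described above might be sketched as follows. The targeting variable (`pct_spanish`, the percentage of an area's population that is primarily Spanish-speaking), the threshold of 20, the bandwidth of 5 percentage points, and the response rates are all hypothetical. A real analysis would also need to account for sampling variability and for any trend in response across the threshold.

```python
from statistics import mean


def compare_near_threshold(areas, threshold, bandwidth):
    """Mean mail response just above the threshold minus the mean just below it.

    Only areas within `bandwidth` of the threshold are used, so that the two
    groups are as similar as possible apart from receiving the treatment.
    """
    above = [
        a["mail_response"]
        for a in areas
        if threshold <= a["pct_spanish"] < threshold + bandwidth
    ]
    below = [
        a["mail_response"]
        for a in areas
        if threshold - bandwidth <= a["pct_spanish"] < threshold
    ]
    if not above or not below:
        raise ValueError("no areas on one side of the threshold within the bandwidth")
    return mean(above) - mean(below)


# Hypothetical areas; the first and last fall outside the comparison band.
areas = [
    {"area": "T1", "pct_spanish": 5, "mail_response": 0.75},
    {"area": "T2", "pct_spanish": 16, "mail_response": 0.60},
    {"area": "T3", "pct_spanish": 18, "mail_response": 0.62},
    {"area": "T4", "pct_spanish": 21, "mail_response": 0.66},
    {"area": "T5", "pct_spanish": 23, "mail_response": 0.68},
    {"area": "T6", "pct_spanish": 40, "mail_response": 0.50},
]

difference = compare_near_threshold(areas, threshold=20, bandwidth=5)
print(round(difference, 3))
```

A narrower bandwidth makes the two groups more comparable but leaves fewer areas in each, which is one reason such a comparison is less reliable than a true experiment.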
Second, comprehensive data from the 2010 census, its management information systems, the 2010 census coverage measurement program, and contextual data from the American Community Survey and from administrative records need to be saved in an accessible form to support more exploratory analysis of census processes, including graphical displays. Each census surprises analysts with unforeseen problems, such as the large number of duplicate addresses in the 2000 census, and it is important to look for such unanticipated patterns so that their causes can be investigated. Standard exploratory models should be helpful in identifying these unanticipated patterns. Of course, any findings would need to be corroborated with additional testing and evaluation.
INITIAL CONSIDERATIONS REGARDING A GENERAL APPROACH TO CENSUS RESEARCH
The Census Bureau has a long and justifiably proud history of producing important research findings in areas relevant to decennial census methodology. However, the panel is concerned that in more recent times research has not played as important a role in census redesign as it has in the past. Furthermore, there is the related concern that research is not receiving the priority and support it needs to provide the results needed to help guide census redesign. We give four examples to explain this concern.
First, research in areas in which the results were relatively clear has been unnecessarily repeated. An example is the testing of the benefits from the use of a targeted replacement questionnaire, which was examined during the 1990s and also in 2003. The increased response resulting from the use of a targeted replacement questionnaire was relatively clear based on research carried out in the 1970s by Dillman (1978). In 1992 the Census Bureau carried out the Simplified Questionnaire Test (SQT), which examined the use of a blanket replacement questionnaire. Dillman et al. (1993a,b) describe the Implementation Test (IT), also carried out in 1992, which attempted to determine the contribution of each part of the mailing strategy toward improving response. As a result of the SQT and the IT, Dillman et al. (1993a,b) estimated that the second mailing would increase response by 10.4 percent. Subsequently, the Census Bureau also carried out two studies investigating the impact of a second mailing in hard-to-count areas. Dillman et al. (1994) showed that a second mailing added 10.5 percent to the response rate. Given the findings of this research, it is unclear why there was a need to examine the benefits from the use of a replacement questionnaire in the 2003 census test (National Research Council, 2003).
Second, areas in which research has demonstrated clear preferences have been ignored in subsequent research projects, when, for example, the previously preferred alternative was not included as a control (see National Research Council, 2006:Box 5-3). Furthermore, there are some basic questions that never get sufficient priority because they are by their nature long-term questions. The best way to represent residence rules is an obvious example. Finally, the analysis of a test census is often not completed in time for the design of the next test census, therefore preventing the continuous development of research questions.
The Census Bureau needs to develop a long-term plan for obtaining knowledge about census methodology in which the research undertaken at each point in time fully reflects what has already been learned so that the research program is truly cumulative. This research should be firmly grounded in the priorities of improving data quality and reducing census costs. Research continuity is important not only to reduce redundancy and to ensure that findings are known and utilized, but also because there are a number of issues that come up repeatedly over many censuses that are inherently complex and therefore benefit from testing in a variety of circumstances in an organized way, as unaffected as possible by the census cycle. These issues therefore need a program of sustained research that extends over more than a single decennial cycle. Also, giving people more freedom to pursue research issues may reduce turnover in talented staff.
Finally, given the fielding of the American Community Survey, there is now a real opportunity for research on census and survey methodology to be more continuous. These preliminary considerations will be greatly amplified by the panel in its subsequent activities. In the meantime, we make the following recommendation as an indication of the overall theme for which the panel anticipates developing a more refined and detailed message in later reports.
Recommendation 8: The Census Bureau should support a dedicated research program in census and survey methodology, whose work is relatively unaffected by the cycle of the decennial census. In that way, a body of research findings can be generated that will be relevant to more than one census and to other household surveys.
For example, the Census Bureau can determine what is the best way to improve response to a mailed questionnaire through use of mailing materials and reminders, or what is the best way using a paper questionnaire or the Internet to query people as to their race and ethnicity, or what is the best way using a paper questionnaire or the Internet to query people as to the residents of a household. The objective will be to learn things whose truth could be applied in many survey settings and to create an environment of continual learning, and then document that learning, to create the best state-of-the-art information on which to base future decisions. When an answer to some issue is determined, that information can be applied to a variety of censuses and surveys, possibly with modest adaptations for the situation at hand. This is preferable to a situation in which every survey and census instrument is viewed as idiosyncratic and therefore in need of its own research projects. However, one complication of developing a continuous research program on censuses and surveys is the different environments that censuses and surveys of various kinds represent. We hope to have more to say on how to deal with this in our final report.
As pointed out by the Panel on Residence Rules in the Decennial Census, “Sustained research needs to attain a place of prominence in the Bureau’s priorities. The Bureau needs to view a steady stream of research as an investment in its own infrastructure that—in due course—will permit more accurate counting, improve the quality of census operations, and otherwise improve its products for the country” (National Research Council, 2006:271). A major objective of the remainder of the panel’s work will be to provide more specifics on how such a research group could develop and carry out a research program in various areas and overall, and how they would make use of the various venues and techniques for research, testing, experimentation, and evaluation.