RECENT DECENNIAL CENSUSES HAVE INCLUDED a program of planned evaluation studies that focus on the quality of census operations and the data that result from them. In some respects, the 2000 census could be interpreted as having two such programs. One was the series of detailed reports from the Accuracy and Coverage Evaluation (ACE) program. Numerous supporting reports accompanied each of the decision documents on census adjustment issued by the Bureau’s Executive Steering Committee for ACE Policy (ESCAP) in March and October 2001, and another series of supporting reports accompanied the final estimates from the 2000 coverage evaluation program (called ACE Revision II) when those estimates were released at a joint meeting of our panel and the Panel to Review the 2000 Census in March 2003.
The second evaluation program of the 2000 census was the planned slate of evaluations of census operations and of the quality of the census content, which was administered by the Census Bureau’s Planning, Research, and Evaluation Division (PRED). The original plan of PRED evaluations for 2000 was very ambitious, including 149 separate studies; 18 of these were “cancelled” only in the sense that they were instead expedited and completed as part of the ESCAP evaluation series. But subsequently, in at least two major waves (in early and late 2002), the evaluation program “was refined and priorities reassessed due to resource constraints at the Census Bureau” (U.S. Census Bureau, 2003d). In an “[attempt] to obtain the best balance of resources” between “completing and releasing Census 2000 data products” and “conducting key Census 2000 evaluations” (U.S. Census Bureau, 2003d), the Census Bureau ultimately reduced the list of studies from 149 to a still-formidable 91. In addition, a series of 15 topic reports was developed based on groupings of evaluation reports; the Bureau released the individual evaluation reports only after the relevant topic report was publicly released.
In this chapter, we discuss suggestions for developing the evaluation program for the 2010 census. In Section 8-A, we outline major challenges that we perceive in defining evaluation studies for the 2010 census and, more broadly, redefining the research and evaluation program of the Census Bureau. In Section 8-B, we describe the Master Trace Sample, an evaluation tool that we—like other National Research Council panels—believe may be particularly critical to learning about census operations and guiding future practice.
8–A STRENGTHENING THE EVALUATION PROGRAM OF THE 2010 CENSUS
The staff of both our panel and the Panel to Review the 2000 Census received access to the Census Bureau’s PRED-series evaluations and topic reports on an advance release basis, for which we thank the Census Bureau (the General Accounting Office, Department of Commerce Inspector General, and Congressional staff received the evaluation reports on the same basis). While our arguments in this report reflect observations from the evaluation reports, we do not here attempt a comprehensive review of the entire slate of Census Bureau 2000 census evaluations. This decision reflects a change in the panel’s charge described in Chapter 1 to a more forward-looking study of the developing plans for the 2010 census.
Of the Census Bureau’s 2000 census PRED-series evaluations, the Panel to Review the 2000 Census commented (National Research Council, 2004:Sec. 9–B):
We applaud the effort the Census Bureau devoted to evaluation of the 2000 census. Yet we must note the serious deficiencies of many (but by no means all) of the evaluation studies released to date. Too often, the evaluations do not clearly answer the needs of relevant audiences, which include 2000 census data users who are interested in data quality issues that bear on their analyses and Census Bureau staff and others who are concerned with the lessons from the 2000 experience that can inform 2010 planning. No single evaluation will necessarily speak to both audiences, but every evaluation should clearly speak to at least one of them.
Yet many of the completed evaluations are accounting-type documents rather than full-fledged evaluations. They provide authoritative information on such aspects as number of mail returns by day, complete-count item nonresponse and imputation rates by type of form and data collection mode, and enumerations completed in various types of special operations (e.g., urban update/leave, list/enumerate). This information is valuable but limited. Many reports have no analysis as such, other than simple one-way and two-way tabulations. Reports sometimes use different definitions of variables, such as type of form (self or enumerator), and obtain data from files at different stages of processing, with little attempt to reconcile such differences. Almost no reports provide tables or other analyses that look at operations and data quality for geographic areas.
Based on our reading of many of the evaluation reports, we concur. Like the Panel to Review the 2000 Census, we believe that there is merit in the completed evaluation studies of the 2000 census as rough documentation of operational processes. We also appreciate the difficulty faced by evaluators in marshaling data resources that could be daunting in scale or, worse, simply not amenable to necessary evaluation work (the dominant example being the MAF Extract described in Chapter 3, in which the logical flags associated with different address updating operations were not conducive to tracking the complete history or original source of an address).
That said, we believe that the 2000 census evaluation work is suggestive of two broader problems. We hope that by addressing these problems the Bureau is able not only to strengthen the evaluation program for 2010 but also to redefine the way it thinks about research.
8–A.1 Correcting the Disconnect Between Research and Operations
The first broad problem is, as we perceive it, a real disconnect between the Bureau’s research objectives and its operational and planning groups. In our first interim report (National Research Council, 2000a:28–29), we recommended that the Census Bureau “develop a detailed plan for each evaluation study on how to analyze the data collected and how to use the results in decision making.” Specifically, we suggested that these plans “include detailed information on how the data will be analyzed, how the results obtained will inform decisions about the 2010 census design, and what resources, in terms of data collection costs and staff expertise, are required.” We continue to believe that a strong connection between a research base and operational decisions is vital. However, we see signs in many aspects of the emerging 2010 census plan that research is not strongly connected to planning needs and goals.
The clearest such sign is the articulation of the basic “three-legged stool” approach to the 2010 census itself. As noted in Section 2-C.2, the key initiatives of the 2010 census plans were developed as the 2000 census was being collected and tabulated. As such, they were developed without the benefit of completed evaluation studies and research. Thus, the MAF/TIGER Enhancements Program was pursued as an objective without reference to either the unique contribution of address update sources to the Master Address File or detailed assessment of geocoding and enumerator difficulties associated with TIGER database inaccuracy. Likewise, the American Community Survey was proposed before all information was known about nonresponse, in full and item by item, to the 2000 census long form. As elsewhere in this report, we emphasize our agreement with the basic goals of the 2010 census plan and, as we said in Section 2-C.2, it was appropriate that major initiatives for the 2010 census were proposed and initiated as early as they were. Still, it is significant that these initiatives did not derive naturally from the results of 2000 census evaluations or the problems detected and measured by those evaluations.
In other areas, the Census Bureau’s planning and research entities operate at either a high level of focus (for decision making) or at a microlevel that tends toward detailed accounting of individual census processes (as in the series of 2000 census evaluations), with little to bridge the gap. Examples of decisions not well informed by research include:
the decision to implement a complete block canvass for 2010 address list updating, without full research into alternative methods or definitive knowledge of the unique contributions of address update operations to the 2000 MAF;
the decision to favor imputation methods over extended nonresponse follow-up operations in the 2000 census, without complete research into the effect of imputation on resulting data series or development of modern imputation tools; and
the decision to reduce the number of persons who can be reported on the basic census return form from 7 to 6 in 2000, which served the need to shorten the form but may have had unintended consequences for the reporting of large households.
As it designs its research and evaluation program for 2010, the Census Bureau should work to bridge the gap between research and operations in the census process; evaluations should be forward-looking and designed to inform and satisfy specific planning objectives. The goal should be research studies that produce real data that lead to actionable results.
Accordingly, we offer a general recommendation for the development of a research and evaluation plan for the 2010 census. This recommendation represents an endorsement and restatement of a similar recommendation from the Panel to Review the 2000 Census (National Research Council, 2004:Rec. 9.2).
Recommendation 8.1: The Census Bureau should materially strengthen the evaluation component of the 2010 census, including the ongoing testing program for 2010. Plans for census evaluation studies should include clear articulation of each study’s relevance to overall census goals and objectives; connections between research findings and operational decisions should be made clear. The evaluation studies must be less focused on documentation and accounting of processes and more on exploratory and confirmatory research while still clearly documenting data quality. To this end, the 2010 census evaluation program should:
identify important areas for evaluations (in terms of both 2010 census operations and 2020 census planning) to meet the needs of users and census planners and set evaluation priorities accordingly;
design and document data collection and processing systems so that information can be readily extracted to support timely, useful evaluation studies;
focus on analysis, including use of graphical and other exploratory data analysis tools to identify patterns (e.g., mail return rates, imputation rates) for geographic areas and population groups that may suggest reasons for variations in data quality and ways to improve quality (such tools could also be useful in managing census operations);
consider ways to incorporate real-time evaluation during the conduct of the census;
give priority to development of technical staff resources for research, testing, and evaluation; and
share preliminary analyses with outside researchers for critical assessment and feedback.
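As one illustration of the kind of exploratory, disaggregated analysis this recommendation calls for, the following minimal sketch computes mail return rates by geographic area and flags areas that fall well below the overall rate. The area identifiers, the counts, and the 10-point threshold are invented for illustration; they are not Census Bureau data or definitions.

```python
from dataclasses import dataclass


@dataclass
class AreaCounts:
    area_id: str   # hypothetical area identifier
    mailed: int    # questionnaires mailed out
    returned: int  # questionnaires returned by mail


def return_rate(mailed: int, returned: int) -> float:
    """Mail return rate as a percentage of questionnaires mailed."""
    return 100.0 * returned / mailed if mailed else 0.0


def flag_low_areas(areas, gap=10.0):
    """Return (overall_rate, flagged), where flagged lists the areas whose
    rate is more than `gap` percentage points below the overall rate."""
    total_mailed = sum(a.mailed for a in areas)
    total_returned = sum(a.returned for a in areas)
    overall = return_rate(total_mailed, total_returned)
    flagged = [
        (a.area_id, round(return_rate(a.mailed, a.returned), 1))
        for a in areas
        if return_rate(a.mailed, a.returned) < overall - gap
    ]
    return overall, flagged


# Invented counts for three hypothetical areas.
AREAS = [
    AreaCounts("A", mailed=1000, returned=800),
    AreaCounts("B", mailed=1000, returned=750),
    AreaCounts("C", mailed=500, returned=250),
]

if __name__ == "__main__":
    overall, flagged = flag_low_areas(AREAS)
    print(f"overall rate: {overall:.1f}%")
    for area_id, rate in flagged:
        print(f"area {area_id}: {rate}% (well below overall)")
```

A single national figure would hide the low-rate area in this toy data set; the same pattern extends to imputation rates or to breakdowns by population group.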
8–A.2 Pursuing New Research Directions
The second broad problem that we believe is suggested by the evaluations of the 2000 census concerns the relative lack of attention to some types of research, key among them the investigation of opportunities for targeting and efficiency. The Census Bureau’s 2000 census evaluations are, apparently by design, national in scope and focus; there is very little disaggregation by geography or (in some cases) demographic group that might shed light on important local variation in census operations. Even in those cases where disaggregation is attempted, the results can be confusing and unhelpful for planning. Perhaps most notably, evaluation reports on the Local Update of Census Addresses (LUCA) program offer tabulations of additions and deletions broken down by state, even though it was smaller-area governments (counties, minor civil divisions, places, and reservations) that participated in LUCA; the breakdown by state provides some insight but, generally, misses the important story of participation and nonparticipation in the LUCA process (Owens, 2003, 2002).
Block canvassing, group quarters enumeration, small multiunit structures, and rural areas, as well as other topics raised throughout this report, are cases in which 2010 census planning would benefit from departing from the “one-size-fits-all” approach that often characterizes census operations. Just as canvassing for addresses in selected areas is likely to be effective relative to a blanket block canvass, so too is the accuracy of the count of special hard-to-count populations likely to be improved by tailoring questionnaires and enumeration methodologies to reach them. Accordingly, we recommend:
Recommendation 8.2: A major focus of the Census Bureau’s ongoing research and evaluation program must be opportunities for targeting and efficiency—tailoring approaches to key population groups and areas rather than pursuing a “one-size-fits-all” approach.
Our discussion in this report also suggests areas where increased focus on cognitive testing and questionnaire design would be beneficial. Better articulation and presentation of the census residence rules could help identify or deter person duplication in the census. Moreover, the establishment of a parallel data system in the ACS highlights the importance of maintaining appropriate consistency in questionnaire content and design; the divergent residence rules for the ACS and the census stand as an open question that should be resolved, and wording and structuring of race and Hispanic origin questions should be consistent between the two questionnaires. On a related matter, human factors and usability testing should become increasingly important in the Bureau’s research and evaluation programs, due to the plans to deploy portable computing devices among the large corps of temporary census enumerators and the wider availability of the self-response Internet option.
8–A.3 Exploiting Existing Data Resources
While it may be tempting to look at the completed 2000 census evaluation reports and topic reports and conclude that evaluation of the 2000 census is complete, the panel argues that much remains to be learned from the extant operational data from the 2000 census. Further disaggregation and mining of these data should be an informative and relatively inexpensive way to formulate a stronger research base for the 2010 census and its constituent programs. We recommend the following (see also National Research Council, 2004:Rec. 9.1):
Recommendation 8.3: The Census Bureau must mine and fully exploit data resources currently available in order to build a research base for the 2010 census and to further evaluate the 2000 census. These resources include:
microdata from the 2000 Accuracy and Coverage Evaluation and its related Person Duplication Studies;
extracts from the Master Address File;
the Local Census Office Profile dataset;1
a match of census records and the March 2000 Current Population Survey; and
the Master Trace Sample.
We address one of these as-yet-untapped data resources—the Master Trace Sample—in detail in the following section.
8–B MASTER TRACE SAMPLE
The idea for what has come to be known as the Master Trace Sample (MTS) can be traced to a recommendation by one of our predecessor National Research Council panels on the decennial census. That panel suggested (National Research Council, 1985:Rec. 6.3) “that the Census Bureau keep machine-readable records on the follow-up history of individual households in the upcoming pretests, and for a sample of areas in the 1990 census, so that information for detailed analysis of the cost and error structures of conducting census follow-up operations on a sample basis will be available.” Three years later, the idea had developed into a fuller proposal; in a brief report evaluating the projects for the REX (research, evaluation, and experimentation) program of the 1990 census, the panel commented on the idea and used the name that the project has since retained (National Research Council, 1988):
The panel supports the concept of a master trace sample (MTS) that will facilitate a wide range of detailed studies of the quality of the 1990 census content. … The MTS will comprise a sample of census records that include not only the final values for each questionnaire item, but also the values for these items at each step in the processing, along with additional information such as whether the respondent mailed back a filled-in questionnaire or responded to telephone or personal follow-up. The MTS sample could well overlap other samples of interest, including the Current Population Survey (CPS), the Survey of Income and Program Participation (SIPP), the census reinterview sample, and others, and could have pertinent administrative records data appended to it…. We applaud the objectives of the MTS and support having as much of the file content as possible available in a public-use format.
The panel further noted that the sample “would greatly facilitate error analyses of the census” and would permit detailed examination of errors introduced in such processes as geocoding and imputation.
For various reasons—among them the overwhelming task of preparing for dual-systems estimation and subsequent coverage evaluation—the Bureau did not put the master trace sample idea into practice in 1990. But the concept of maintaining, pulling together, and analyzing detailed records of field operations took root. It was revisited, elaborated upon, and given much greater emphasis by the Panel on Alternative Census Methodologies as the 2000 census drew very close (National Research Council, 1999:93):
The panel strongly supports a renewal and modest expansion of the suggestion by the Panel on Decennial Census Methodology of 10 years ago … for the collection of a master trace sample. With the various innovations in the 2000 census such as the possibility of sampling for nonresponse follow-up and alternative methods for enumeration (e.g., “Be Counted” forms), it would be very useful if the planned data management system could collect a trace sample in, say, 100 census tracts around the country. (Sampling tracts would facilitate study of the effects at the block or interviewer level.) The trace sample would provide information as to what happened in all phases of data collection, which will be instrumental in guiding methodological advances to be used in 2010 and beyond. Specific variables that could be included in the trace sample collection are as follows:
where the address came from (original master address list, local update, casing check, etc.);
the type of questionnaire (long or short form), whether and when it was returned, whether it was the first or a replacement questionnaire (or both), whether respondent-friendly enumeration was (also) used, if the household was a nonrespondent and a member of nonresponse follow-up sample, then how many approaches for field enumeration were made, which mode was used, whether they were ultimately successful, whether data capture required proxy enumeration and, if so, what type of proxy enumeration, edit failures, and finally whether there were any data differences among duplicate responses for households or individuals; and
the identification number of the enumerator, to facilitate evaluation of interviewer effects.
Of course, any of the above information that could easily be collected on a 100 percent basis should be.
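To make the variables listed in this passage concrete, the following is a minimal sketch of how a single trace record might be represented. All class names, field names, and example values are hypothetical and chosen only for illustration; they do not describe any actual Census Bureau file layout.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ContactAttempt:
    attempt_date: date
    mode: str          # e.g. "personal visit" or "telephone"
    successful: bool


@dataclass
class TraceRecord:
    address_id: str                      # hypothetical identifier
    address_source: str                  # e.g. "original master address list"
    form_type: str                       # "long" or "short"
    return_date: Optional[date] = None   # None if no mail return was received
    replacement_form: bool = False
    in_nrfu_sample: bool = False
    attempts: List[ContactAttempt] = field(default_factory=list)
    proxy_type: Optional[str] = None     # None if no proxy enumeration
    edit_failures: int = 0
    enumerator_id: Optional[str] = None  # supports study of interviewer effects

    def attempts_until_success(self) -> Optional[int]:
        """Attempts up to and including the first success, or None."""
        for number, attempt in enumerate(self.attempts, start=1):
            if attempt.successful:
                return number
        return None


# Invented example: a nonresponding household reached on the second visit.
EXAMPLE = TraceRecord(
    address_id="X1",
    address_source="local update",
    form_type="short",
    in_nrfu_sample=True,
    attempts=[
        ContactAttempt(date(2000, 5, 1), "personal visit", False),
        ContactAttempt(date(2000, 5, 8), "personal visit", True),
    ],
    enumerator_id="E42",
)

if __name__ == "__main__":
    print(EXAMPLE.attempts_until_success())
```

Keeping the full list of contact attempts, rather than only a final outcome, is what would allow later questions about the number and timing of attempts to be answered.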
National Research Council (1999) continues with additional suggestions for data sources for inclusion in the Master Trace Sample, including measures of interviewer quality, results of unduplication programs, and information from the then-planned Integrated Coverage Measurement (later replaced by the Accuracy and Coverage Evaluation, or ACE, Program). The panel formalized its thoughts in a recommendation (National Research Council, 1999:Rec. 5.1):
The panel recommends that a trace sample be collected in roughly 100 tracts throughout the United States and saved for research purposes. The trace sample would collect detailed process data on individual enumerations. In addition, similar information on integrated coverage measurement should be collected, on a sample basis if needed. It would be very useful if information could be collected, again on a sample basis, to support complete analysis of the census costs model, all aspects of the amount of duplication and efforts to unduplicate, and information needed to support total error modeling of the 2000 census.
Picking up where earlier panels left off, our Panel on Research on Future Census Methods has considered the Master Trace Sample to be central to its charge. Indeed, the MTS was a major topic of our first interim report (National Research Council, 2000a:1–2):
We believe that the master trace sample database has the potential to be the single most useful source of information for assessing alternative designs for the 2010 census…. The current plans for the master trace sample database should be augmented so that data for all key steps in the process—starting with address assignment and ending with a final disposition for each case—are included in the master trace sample database.
We made a number of other suggestions to the Census Bureau relative to the construction of the database (National Research Council, 2000a:15–18):
use a two-stage sample design;
oversample ACE blocks, list/enumerate and update/leave households and households that are hard to enumerate;
improve the quality of information on the number and dates of attempts at enumeration;
set priorities for the retention of master trace sample input files;
provide for the accessibility and availability of the databases;
increase the resources for developing the database; and
collect sufficient information to support a model of total census error.
In March 2003, the panel was briefed on Census Bureau plans to implement a Master Trace Sample based on the 2000 census, including the proposed contents of the database and its intended uses. At that briefing, the issue of potential research questions—about which the panel was already somewhat aware—was spelled out with greater specificity. We were told that a total of fifteen “requirements” had guided Master Trace Sample research and development and that ten supplementary research questions fell into the category of acceptance testing. The requirements were mostly stated in the form of questions about simple relationships between two variables of interest; for example,
What is the correlation between the date of completion of NRFU cases and the rate of item nonresponse?
What is the correlation between the history of address sources and the need for applying the Primary Selection Algorithm (PSA) because of multiple responses for the same address?
The supplementary questions, for the most part, involved similar bivariate relationships.
Developed as part of the formal program of 2000 census evaluations, the Master Trace Sample final report was issued on September 29, 2003 (Hill and Machowski, 2003). Consisting of only eight text pages, the report confirmed the structural requirements and related questions noted at the March briefing. According to the report, “the MTS database links micro-level data from various stages of the Census 2000 project such as address frame development, data collection, data capture, data processing, and enumeration contact records” (Hill and Machowski, 2003:4). These data are linked at the following levels:
local census office (LCO),
return (that is, census questionnaire),
enumeration contact (that is, personal visit), and
Moreover, “the MTS database is intended to address a wide variety of research requests that link decennial census response, data collection, and processing information with enumeration characteristics” (Hill and Machowski, 2003:4). The database contains the following types of data:
census response data at various stages of processing;
enumeration characteristics (related to operations and enumerators);
record of contact information from the nonresponse follow-up (NRFU) and coverage improvement follow-up (CIFU) operations;
data capture system evaluation information from a reconciled keyed-from-image data set;
geocoding error results from one of the Census 2000 evaluations; and
housing unit status (i.e., occupied/vacant/delete/unresolved) from NRFU, CIFU, and ACE.
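The idea of linking records across levels can be sketched with a small relational example. The schema below is an assumption made purely for illustration (the table names, column names, and rows are invented and are not the MTS layout); it links local census offices, returns, and enumeration contacts, and then counts contacts per return.

```python
import sqlite3

# Hypothetical three-level schema: office -> return -> contact.
SCHEMA = """
CREATE TABLE lco (lco_id TEXT PRIMARY KEY);
CREATE TABLE census_return (
    return_id     TEXT PRIMARY KEY,
    lco_id        TEXT NOT NULL REFERENCES lco(lco_id),
    items_missing INTEGER NOT NULL
);
CREATE TABLE contact (
    contact_id  INTEGER PRIMARY KEY,
    return_id   TEXT NOT NULL REFERENCES census_return(return_id),
    contact_day INTEGER NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database filled with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO lco VALUES (?)", [("LCO1",), ("LCO2",)])
    conn.executemany(
        "INSERT INTO census_return VALUES (?, ?, ?)",
        [("R1", "LCO1", 0), ("R2", "LCO1", 3), ("R3", "LCO2", 1)],
    )
    conn.executemany(
        "INSERT INTO contact (return_id, contact_day) VALUES (?, ?)",
        [("R1", 1), ("R2", 1), ("R2", 5), ("R2", 9), ("R3", 2)],
    )
    return conn


def contacts_per_return(conn: sqlite3.Connection):
    """For each return: its office, number of contacts, and missing items."""
    query = """
        SELECT r.lco_id, r.return_id,
               COUNT(c.contact_id) AS n_contacts,
               r.items_missing
        FROM census_return AS r
        LEFT JOIN contact AS c ON c.return_id = r.return_id
        GROUP BY r.lco_id, r.return_id, r.items_missing
        ORDER BY r.return_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in contacts_per_return(build_demo_db()):
        print(row)
```

Because the levels are joined by keys rather than pre-tabulated, the same tables can answer follow-up questions (for example, grouping by office instead of by return) without rebuilding the database.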
Among its limitations, the Master Trace Sample report notes that the database does not have Census 2000 person or housing unit coverage data from ACE; it excludes special places and group quarters; it does not include “the various Local Update of Census Addresses (LUCA) files or the bulk of the MAF extract files used to update the DMAF” (Hill and Machowski, 2003:6).
Under the heading “Intended Uses/Targeted Users,” Hill and Machowski (2003:2) note that there is great potential for research in the following areas:
modeling to identify and measure associations and relationships;
tracing items, such as population count, through census processes; and
investigating how to develop improved trace databases in future censuses.
Especially worth noting, under the same heading, Hill and Machowski (2003:3) state that
the MTS database is limited to internal Census Bureau use. Census Bureau researchers interested in pursuing studies that will help guide the planning of the 2010 short form census will develop research proposals for review and approval by senior staff as well as planning groups guiding 2010 Census research.
We are greatly pleased to learn that the prototype of a Master Trace Sample was implemented in the 2000 census. We commend the Bureau for taking seriously the recommendations of our predecessor panels on census methodology; database construction has required a substantial commitment of Bureau personnel and resources. However, based on the information that has been given to us, we have some serious concerns about the direction that the MTS appears to be taking. Our concerns are rooted in a perceived divergence between the panel’s vision of the MTS and its use and that of the Bureau, as we understand its position. The differences in these views fall under the headings of research, access, and plans.
Because our definition of research implies free-ranging and diligent inquiry, we are unconvinced of the wisdom of building the MTS on a set of preidentified research questions. Each of the fifteen database developmental requirements is reasonable in its own right, but, when taken as a set, the resulting structure is too narrowly focused. It is difficult to see how the inevitable questions that follow from initial queries can be pursued using the resulting database. Clearly, some crucial issues cannot be investigated at all given the data source limitations noted earlier, such as the extent to which duplication problems in the 2000 census may be traced to group quarters, or the characteristics of cases where the ACE failed to recognize and correct for person duplication.
Unfortunately, we are in the position of not knowing whether the MTS can contribute to a satisfactory understanding of any truly substantial design issue. In our view, a relational database contains a set of variables and their measures and permits the user to answer queries based not only on simple bivariate relationships but also on a broad range of joint and conditional associations. A menu-driven rather than query-based approach to analysis seems to us to be antithetical to good research.
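The difference between a simple bivariate relationship and a conditional association can be illustrated with a small sketch. The data below are entirely invented: each case has a hypothetical group label (say, type of form), a completion day, and a count of missing items. The pooled correlation is computed first and then recomputed within each group.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented cases: (group, completion_day, items_missing).
CASES = [
    ("self", 1, 8), ("self", 2, 9), ("self", 3, 10),
    ("enumerator", 6, 1), ("enumerator", 7, 2), ("enumerator", 8, 3),
]


def pooled_and_conditional(cases):
    """Return (pooled correlation, {group: within-group correlation})."""
    pooled = pearson([c[1] for c in cases], [c[2] for c in cases])
    by_group = {}
    for group, x, y in cases:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    conditional = {g: pearson(xs, ys) for g, (xs, ys) in by_group.items()}
    return pooled, conditional


if __name__ == "__main__":
    pooled, conditional = pooled_and_conditional(CASES)
    print(f"pooled: {pooled:.2f}")
    for group, r in conditional.items():
        print(f"within {group}: {r:.2f}")
```

In this toy data set the pooled correlation is negative while the correlation within each group is positive, which is the kind of reversal that an analysis limited to preidentified bivariate questions could not detect.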
To promote use of the MTS and expand its usefulness, we recommend the following:
Recommendation 8.4: The Census Bureau should develop a list of studies important to 2010 census planning that can exploit the richness of the Master Trace Sample. These studies should be prioritized and then conducted as resources permit.
Recommendation 8.5: The Master Trace Sample from the 2000 census should be expanded to include data from group quarters enumeration, the Accuracy and Coverage Evaluation, and the Local Update of Census Addresses Program.
With regard to MTS access, we are greatly troubled by the progression from the 1988 National Research Council panel’s support for making as much of the file content as possible available in a public-use format, to the Census Bureau’s March 2003 briefing document that allowed “internal research with indirect research opportunities for external customers such as the National Academy of Sciences,” and finally to the final report’s statement that “the MTS database access is limited to internal Census Bureau use.” We are sensitive to confidentiality considerations in this regard but if, as stated in our first interim report, the MTS has the potential to be the single most useful source of information for assessing alternative designs for the 2010 census, a great deal of this potential is lost to the Bureau by restricting its use.
In the language of the final MTS report, Census Bureau researchers will seek “approval by senior staff as well as planning groups guiding 2010 Census research” to investigate “hypotheses that involve relationships of various Census 2000 operations or systems” (Hill and Machowski, 2003:3). We have not been made aware of any specific projects that are now being pursued in this way. Moreover, in the absence of concrete knowledge of the database capabilities, we are unable to propose relevant and feasible studies for Bureau personnel or prioritize important areas for research. In brief, the extent to which the MTS can be properly mined remains unclear to us. We have not had access to the MTS, but we hope that the Bureau will modify its stance on access to permit broader use of the MTS.
Recommendation 8.6: The Census Bureau should explore ways to allow the broader research community to perform analyses using the 2000 Master Trace Sample, subject to confidentiality limitations.
Our final recommendation is as follows:
Recommendation 8.7: The Census Bureau should carry out its future development in this area of tracing all aspects of census operations with the ultimate aim of creating a Master Trace System, developing a capacity for real-time evaluation by linking census operational databases as currently done by the Mas-