AT THE REQUEST OF THE U.S. CENSUS BUREAU, the Panel on Research on Future Census Methods was organized to review the early planning process for the 2010 census. Its work includes observing the operation of the 2000 census, deriving lessons for 2010, and advising on effective evaluations and tests. The panel has previously issued two interim reports (National Research Council, 2000a, 2003a) and a letter report (National Research Council, 2001c), and this is our final report.
EMERGING STRUCTURE OF THE 2010 CENSUS
The Census Bureau’s current plans for the 2010 census are predicated on the completion of three major initiatives, which the Bureau has described as a “three-legged stool”:
MAF/TIGER Enhancements Program. A specific set of improvements has been proposed to the Census Bureau’s address list (Master Address File, or MAF) and geographic database (Topologically Integrated Geographic Encoding and Referencing System, or TIGER).
American Community Survey (ACS). The decennial census long form will be replaced by a continuous survey, thus permitting a short-form-only census in 2010. The ACS covers the same social, economic, and demographic data as the census long form but will provide estimates in a more timely manner.
Early Integrated Planning. The Census Bureau hopes that early attention to planning will make census tests leading up to 2010 more informative and useful.
The Census Bureau’s emerging 2010 census plan also includes the development of portable computing devices (PCDs) for use in nonresponse follow-up work and the increased use of multiple response modes (mail, Internet, and telephone).
CENSUS REENGINEERING: A PROCESS AT RISK
The Census Bureau has advanced an ambitious vision for the 2010 decennial census and—as our previous reports and the balance of this report suggest—the panel strongly supports the major aims of the plan. The implementation of the ACS and, with it, the separation of the long form from the census process are very good concepts; the Bureau’s address and geographic databases are in dire need of comprehensive update; and the implementation of new technologies in census-taking is crucial to improving the accuracy of the count. There is much to like about the emerging plans for the 2010 census, and we strongly support these efforts toward a modernized and improved census in 2010. To this end, the Census Bureau’s focus on planning early in the decennial cycle is highly commendable.
However, based on the information available, the panel finds that the reengineering of the 2010 census is a process at high risk. The major initiatives of the 2010 census plan—the MAF/TIGER Enhancements Program and the American Community Survey—are intended to reduce operational risk in the census in the long term. However, their implementation in the short term necessarily creates unique risks and challenges. In addition, adoption of new technology is inherently risky, particularly when done on the tight schedule and large scale of the decennial census.
To be clear, our conclusion that the reengineering of the 2010 census is a process at risk should not be interpreted as a conclusion that the 2010 census is irrevocably headed for serious problems. It is not an argument that Census Bureau resources are being focused on plans that are wrong and, therefore, that spending should be trimmed. Quite to the contrary, our argument is meant to underscore the importance of a rigorous and amply funded planning and testing cycle for the 2010 census.
The reengineering of the 2010 census faces two paramount risks that are, to a large extent, out of the control of the Census Bureau. Those risks are, first, that funding will not be available for major components of the census plan or will be available at unpredictable levels and, second, that the decision on the final design of the census (particularly the role of the ACS or a census long form) will not be finalized or will shift with time. Decisions on funding and overall design direction are ultimately up to Congress and the administration. However, though the Census Bureau may not have control over funding decisions, the panel believes that the Bureau must take a more active role in informing the funding and decision process in at least two ways.
First, as we emphasize throughout this report, the Bureau must develop a sound research and evidentiary base for its 2010 census plan, thus making a stronger and more compelling case for sustained long-term funding. Building this research base should include carefully examining operational data from the 2000 census to guide planned practice for 2010 and fully exploring the potential of new tools for evaluation, among them the Master Trace Sample containing results of all census operations for a limited national subset. Much work also remains in integrating and mapping the logical and technical infrastructures of the entire census process, and in developing a rigorous and timely testing and evaluation program for new census systems and techniques. The consequences of failing to develop a strong research base for the 2010 census are significant: repeating past census processes that may be inefficient or suboptimal, conducting a census with methods that are out of step with the dynamics of the population it is intended to count, making limited technological innovations that may not match real needs, and marking a flawed beginning for the 2020 census.
Second, the Bureau should be explicit in identifying, articulating, and quantifying the consequences associated with risks in the census process—for instance, the impact of reduced funding on the quality of ACS estimates for small geographic areas and population groups. Failure to reach consensus on the role of the ACS in the census process raises the undesirable prospect of a reversion to the long form, possibly late in the census process and therefore implemented in a rushed manner, thus incurring the same nonresponse and data quality problems as were experienced with the 2000 long form. Such failure would impair other parts of the census plan, including effective use of PCDs. More significantly, failure to reach closure on census design leaves open the possibility that the detailed socioeconomic and demographic characteristics measured by the current census long form may not be estimated at all in 2010, an unacceptable outcome for many reasons.
SPECIFIC RISK CATEGORIES AND MITIGATION STRATEGIES
Beyond the broad risks of funding and design selection, the 2010 census planning process faces many risks of a more specific nature. Some of them are acknowledged in the Bureau’s draft risk management plan, but many are not. Based on the information known to us, we find that the 2010 census reengineering process may be jeopardized in the following areas, among others.
Modernizing Geographic Resources
The panel believes that the process by which the Master Address File is updated and improved is severely at risk. The Census Bureau’s current approach relies principally on updates from U.S. Postal Service files, effectively treating MAF updating as “routine maintenance.” Moreover, the Bureau appears set to rely on a complete block canvass of mailing addresses, a costly operation just before the census. Absent a strong focus on enhancing the MAF in its own right, throughout the decade and independent of presumed benefits from linkage to a realigned TIGER database, the 2010 census may be conducted with an address source that has unacceptable levels of housing unit duplication in some areas and coverage gaps in others.
In the panel’s assessment, it is particularly critical that the Census Bureau develop a comprehensive plan for updating and improving the Master Address File (Recommendation 3.1). A centralized staff position to coordinate housing unit definition and listing (Recommendation 3.2) would help create a quality address list for 2010. The panel also suggests research and analysis of various possible sources of address updates: work with the Postal Service on assessing the quality of the Delivery Sequence File (Recommendation 3.3), analysis of the possible contribution of the Community Address Updating System (3.4), and justification of the Bureau’s plan to implement a complete block canvass (3.6). Further analysis of MAF data from the 2000 census is a crucial learning tool (3.7).
The Bureau’s current MAF/TIGER Enhancements Program focuses on the realignment of TIGER features and modernization of the TIGER database structure; each of these tasks has considerable associated risk. First, the initial realignment of TIGER geographic features to be consistent with GPS coordinates may not be completed in time, or change detection for new features after the initial realignment may not be properly performed. These outcomes would have a negative impact on plans for the use of portable computing devices by field enumerators and would lead to continued errors in the geocoding of addresses in the census and in nonresponse follow-up operations. Second, the conversion of the MAF/TIGER database from its current homegrown format to a modern, object-oriented computing environment may be slower or more difficult than anticipated. The transition will be riskier if the Census Bureau attempts the conversion en masse, rather than via a more carefully designed software reengineering process with ample testing. Careful planning will be essential to keeping TIGER modernization on track. We also suggest that the development of MAF/TIGER support software could be an opportunity to build stronger ties with software developers outside the Bureau (Recommendation 6.3).
The Census Bureau’s draft risk management plan discusses the possibility of alienating key stakeholders, including local and tribal governments. Alienation of local authorities is a risk, to be certain, but a more fundamental risk is failure to fully involve them in census design and operations. We urge the Census Bureau to develop a complete plan for its partnerships with local and tribal governments, with particular regard to address list updating (Recommendation 3.5) and more generally. In our assessment, cultivation of strong local and tribal partnerships will also help redefine enumeration techniques for group quarters and other special populations, tailor enumeration techniques to specific areas within localities, and foster the acceptance and use of the ACS.
American Community Survey
As discussed previously, the introduction of the ACS and elimination of the census long form are the most fundamental factors in determining the final design of the 2010 census, and delay in finalizing that design is one of the most fundamental risks. Accordingly, the panel emphasizes the need for a clear and early decision (Recommendation 4.4) and for contingency plans for the collection of traditional long-form data should full ACS funding not be forthcoming (4.5).
The panel believes that development of a strong research and evaluation program for the ACS is important in several respects. Resolution of issues regarding estimation techniques based on a continuous survey like the ACS and further exploration of the relationship between the ACS and other federal surveys are essential to winning support for the ACS and to its adoption by data users. Lack of a full ACS research agenda may also pose longer-term risks to the quality and usefulness of the survey, hindering potential ties between the ACS and programs for producing postcensal population and demographic analysis estimates. The panel recommends continued research on the relative quality of ACS and census long-form-sample estimates (Recommendation 4.1), development of a “user’s guide” to ACS data (4.3), and sharing of detailed ACS data with local data analysts and the broader research community (4.2).
Enumeration and Data-Processing Methods
The Census Bureau’s plans for the use of portable computing devices (PCDs) in the 2010 census are a particularly exciting part of a reengineered census, but the plans also entail risks associated with the implementation of new technologies. Perhaps most significant is the risk that the Bureau may fail to fully understand the direction in which the technology is moving and thus may spend its resources testing devices that are inferior to those that will be available in 2010, in terms of both size and computing capacity. A consequence of this error is that wrong and misleading conclusions would be drawn about the real potential for portable computing devices to improve census data collection. Accordingly, we recommend that the Bureau conduct a rigorous test of PCDs for data collection, including tests using current high-end devices that may be closer to what will be widely available at time of procurement (Recommendation 5.1).
A second risk inherent in the PCD technology lies in making the decision to purchase too early and without fully specified requirements, resulting in the possible selection of obsolete or inadequate devices. Third, and related, is the risk that the Census Bureau may not use the sheer size of its order (perhaps on the order of 500,000 devices) to obtain devices tailored to census needs, as opposed to buying only what is commercially available off the shelf. Finally, given that the principal users of the devices will be the large corps of temporary enumerators (with limited training), there is the risk that Census Bureau PCD development will not take human factors into sufficient consideration. These risks are significant, and there are no set guidelines we can offer regarding the optimal time to buy. In our assessment, the best way to mitigate these risks is to focus on the detailed specifications for the devices—defining exactly what the devices must be able to do—and try to tailor the final devices to those specifications (and not vice versa; Recommendation 5.2). Particular attention must also be paid to designing a complete testing protocol for PCD software and hardware components (Recommendation 5.3).
It is a basic truth that some people, households, and areas are inherently more difficult to count in the census than others. The experience of the 2000 census suggests several risks for 2010 planning related to hard-to-count populations. Significant among these is the population living in group quarters (places such as hospitals, dormitories, and prisons). In 2000, as in previous censuses, procedures for enumerating group quarters were conducted separately from the rest of the census process. Group quarters listings were not reconciled with the MAF for households, and little effort was devoted to the challenges of enumerating different types of group quarters. The enumeration processes for group quarters were not well controlled. Continuing with this approach incurs the risk of duplication and other enumeration errors and ineffective coverage of this small but important population group. We strongly recommend comprehensive reexamination of the definition, listing, and enumeration procedures for special places and group quarters (Recommendation 5.4).
Group quarters are not the only populations that have traditionally posed difficulties; others include immigrant communities, irregular multiunit housing structures, gated communities, colonias along the U.S.-Mexico border, and the homeless. Enumeration efforts for these populations may be compromised by failure to clarify the definition and presentation to respondents of the residence rules for the decennial census. Consequences associated with such a failure include poor-quality data, failure to meet consumer needs, and continued differential undercount. There is no definitive advice we can offer about the best way to count these groups; what we do suggest is that dialogue and plans for counting them begin early in the census planning cycle rather than being saved as last-minute considerations (Recommendation 5.5).
Though the 2010 census is still expected to be conducted largely by mail, the Census Bureau will likely promote other modes by which respondents can return their forms and also introduce different contact strategies to reach respondents. In particular, use of the Internet to reply to the short-form-only census will likely be encouraged; interactive voice response through an automated telephone system may also be used, though that technology has experienced difficulty in early testing. Additional response modes and other programs such as a repeat of the 2000 “Be Counted” campaign (by which people who believed they were missed in the census could pick up a form in public locations) may increase public cooperation in the census. However, they also raise the risk of higher levels of duplication, of both persons and housing units. The Census Bureau had to implement ad hoc unduplication techniques as the 2000 census was processed, and some of these activities may be formalized in 2010. The panel notes the need for a research plan for unduplication techniques and the need to test proposed techniques in the 2006 census test (Recommendation 5.7). One respondent contact strategy of particular merit is the sending of replacement questionnaires to nonresponding households; plans to do this in 2000 had to be abandoned when it became apparent that it could not be done in a timely fashion. The Census Bureau must proceed quickly to find ways to effectively operationalize a targeted replacement questionnaire in 2010 (Recommendation 5.6).
For the 2000 census, the Census Bureau chose to complete nonresponse follow-up activities very quickly and rely on longstanding imputation techniques to fill in missing questionnaire items and, in some cases, to impute household size when no information was available for a presumed-occupied unit. These techniques came under scrutiny following the census when the state of Utah challenged the inclusion of some types of imputations in apportionment totals. Although the U.S. Supreme Court ultimately upheld the use of existing Bureau imputation practices, the debate suggests the need to revisit the techniques, including the “hot-deck” methodology that has been used by the Census Bureau for several decades. Specifically, the panel strongly urges the Census Bureau to investigate further the costs and benefits of the basic trade-off between continuing field nonresponse follow-up work versus imputation for nonresponse (Recommendation 5.8) and to further study the effect of imputation techniques on the distribution of census data items (5.9).
Conduct of the decennial census requires a sound technical infrastructure—the amalgam of people, computer hardware, software programs, and telecommunication networks that facilitate the flow and processing of information from beginning to end. In 2000, many of the systems that the Census Bureau used were ultimately successful but were developed at great risk, often hastily and without opportunity for full testing. The Census Bureau has begun efforts toward modeling the logical infrastructure—the information blueprint that diagrams all the informational dependencies between pieces of the census process—and using that logical infrastructure to guide the development of the physical technical architecture. The panel strongly endorses these efforts (Recommendation 6.1) and notes the need for strong institutional commitment and “championship” of architecture redesign. To this end, the panel advocates the creation of the position of system architect of the decennial census to coordinate this effort, and further recommends that subsystem architects for MAF/TIGER and field operations (PCDs) be recruited (Recommendation 6.2). The Census Bureau’s draft architecture documents suggest that the Bureau’s efforts have not yet reached the stage at which real reengineering can take place, but this will hopefully be resolved quickly with further experience with the modeling techniques. Failure to achieve the full potential of architecture modeling would incur severe risks: systems may be ill-suited to handle 2010 census process needs, may fail during actual census operations due to lack of proper testing, and may not properly interoperate with each other.
Generally, the Census Bureau has expressed the desire to improve its capabilities in software engineering, motivated in particular by the need to redesign the database structure underlying the MAF/TIGER system from a homegrown environment to one based on commercial products. While noting that improving software engineering practices is difficult in its own right, let alone on the tight schedule and amid the other demands of the decennial census, the panel supports the Bureau’s effort to improve its software standards (Recommendation 6.4) and, in particular, urges greater attention to the Bureau’s protocols for computer hardware and software testing (6.5).
Disputes over the role of sampling methods and the use of dual-systems estimation (based on matching an independent postenumeration survey to census returns) were the dominant force in planning the 2000 census. From all indications, it is the role of the American Community Survey and the prospective replacement of the census long form that will be the major force in deciding the overall shape of the 2010 census, rather than considerations of coverage measurement and evaluation. After the statistical adjustment battles preceding and following the 2000 census, the Census Bureau may be understandably reluctant to take up active debate on coverage techniques for 2010. However, this reluctance incurs the risk that a comprehensive plan for the measurement and assessment of census coverage in 2010 will be deferred until late in the census process. It is essential that the Census Bureau have the means to determine the accuracy of its count, and a late-course fallback to the same Accuracy and Coverage Evaluation methodology used in 2000 could be unfortunate, particularly if the problems raised by the 2000 census experience are not addressed through research in the intervening decade.
In the panel’s assessment, the coverage measurement program in 2010 need not take the same exact shape as that of 2000; what is important is that plans for the program are developed early, and that techniques are tested in 2006 and in the 2008 dress rehearsal. The Panel to Review the 2000 Census has comprehensively reviewed the 2000 Accuracy and Coverage Evaluation research and suggested changes and improvements; these should be implemented, to the extent that a postenumeration survey is part of the 2010 coverage plan. The panel encourages further research on the data and assumptions that support demographic analysis estimates (Recommendation 7.1), which have served as an important coverage benchmark in recent censuses. The panel also encourages further work on methods based on administrative records; whether or not there is a role for such methods in the conduct of the 2010 census, administrative records work should at least be a major experiment in the 2010 census as it was in 2000.
General Research, Evaluation, and Testing
As evidenced by the common theme in many of our recommendations, the panel believes that research and evaluation are essential not only to the diagnosis of risks inherent in the census process but also to their mitigation. Accordingly, the Census Bureau should materially strengthen and extend its program of evaluations (Recommendation 8.1). Evaluation should play a central role in operations rather than being relegated to a peripheral, post hoc role. In the past, the Census Bureau’s planning and research entities have operated at either a high level of focus (e.g., articulation of broad objectives such as the “three-legged stool” components without laying out a clear base in empirical evidence) or at a microlevel that tends toward detailed accounting of individual census processes. As it designs its research and evaluation program, the Census Bureau should work to bridge the gap between research and operations in the census process; evaluations should be forward-looking and designed to inform and satisfy specific planning objectives.
The panel strongly encourages the further mining and reanalysis of operational data from the 2000 census to build a strong base for 2010 census planning (Recommendation 8.3). In particular, the panel urges the Bureau to make use of the Master Trace Sample, a compilation of data from many census operations for a sample of the population, which the panel has advocated in its previous reports (Recommendation 8.4). In addition to expanding the sample’s scope to include key data such as results from the Accuracy and Coverage Evaluation (Recommendation 8.5), the panel strongly urges the Bureau to reconsider its current decision to limit access to the Master Trace Sample to internal Bureau users. Instead, the sample—with appropriate safeguards on confidentiality—should be accessible to the broader research community (Recommendation 8.6). As it designs its technical infrastructure and, hopefully, makes research a strong central focus, the Bureau should have as a long-term objective the maintenance of a Master Trace System—through which real-time evaluation could inform census operations even as the census is being fielded—rather than merely a Sample for which data are assembled after operations are completed (Recommendation 8.7).
A major focus of the Bureau’s ongoing research and evaluation program should be the development of targeted methods for address list development and enumeration (Recommendation 8.2). Examples of these methods include targeting the block canvass, which verifies address list entries, to particular (e.g., high-growth) areas and expanding update/leave enumeration (where a census enumerator drops questionnaires at housing units, which are then expected to mail them back) in areas where mail delivery may not be effective. Failure to implement these methods appropriately may result in increased costs and continued problems of enumeration in high-density areas with structures containing multiple (and not well-listed or easily differentiated) housing units.
The 2010 census is more imminent than many lay observers might expect; in particular, the number of opportunities for major census tests between now and 2010 is very limited. The 2004 test is currently being conducted, leaving only the 2006 census test and 2008 dress rehearsal as the major anticipated testing opportunities. The panel strongly encourages the Census Bureau to pursue smaller-scale testing as resources and timing permit; not all census tests need to take the form of the general, omnibus test census that is the usual shape of the Bureau’s major test opportunities. The Census Bureau has taken as a major goal the performance of a true dress rehearsal in 2008, in contrast to the 1998 dress rehearsal, which was fundamentally an experimental test of competing census designs. The panel believes that this goal makes the 2006 census test all the more crucial to a successful 2010 census. The 2006 census test should therefore be cast as a proof of concept, not a technical test; it must provide the basis for answering any remaining experimental questions in order to make the 2008 test a truly preoperational rehearsal, and it must be funded accordingly.