6
Census Content

An important part of the panel's charge was to consider the needs for census data and whether those needs are better met by the census itself or by some other data collection system. In responding to this part of our charge, we did not attempt to determine the merits of each and every item in the census. We believe it would be inappropriate for us to substitute our judgment on specific items for that of federal government agencies and others who use the data. Rather, we sought to determine in broad terms whether the kinds of data now collected in the census, beyond those items required for the constitutionally mandated purposes of reapportionment and redistricting, serve important public purposes.

In our review (see Chapter 1 and Appendices D-H and M), we determined that federal agencies require census-type data on small areas and small population groups for legally mandated purposes (e.g., for allocation of funds and enforcement of antidiscrimination laws) and, more broadly, to implement and evaluate government programs and policies. We also determined that other data users—such as state and local governments, researchers, and business organizations—require census-type data for purposes that directly or indirectly serve the public interest. Our conclusion, therefore, is one of unequivocal support for the importance of census-type information: the nation needs to have the breadth of data for small areas and small population groups that the census now provides.

Conclusion 6.1 The panel concludes that, in addition to data to satisfy constitutional requirements, there are essential public needs for small-area data and data on small population groups of the type and breadth now collected in the decennial census.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
Modernizing the U.S. Census

On the basis of this conclusion, we undertook the following tasks: to review in broad terms the process by which the census content is decided; to assess whether the content beyond that required for constitutionally mandated purposes of reapportionment and redistricting (largely the information now collected from the census long form) adversely affects census costs and coverage; and to evaluate the merits of a continuous measurement data collection system (i.e., a large, continuing monthly mail household survey) as a possible replacement for the census long form.

In considering census content, we also reviewed the difficult issues posed by the collection of data on race and ethnicity. These data are vitally needed not only for redistricting under the Voting Rights Act and court decisions, but also for many other program, policy, planning, and research purposes. The results of this analysis are in Chapter 7.

THE PROCESS FOR DETERMINING CENSUS CONTENT

In recent decades, the process of determining the questions to include in the census has involved a council or committee of federal agency representatives, coordinated by the Statistical Policy Office of the Office of Management and Budget (OMB). The Census Bureau has generally determined the dimensions of the "envelope," that is, the overall length of the short and long forms, and has weighed in on the desirability of including questions for such purposes as coverage improvement and historical continuity. It has also evaluated questions in terms of their fitness and feasibility in a census context (thus, some questions, such as religion, are considered inappropriate for the census, and others are determined to be too complex or too subject to misinterpretation to provide reliable responses from a mail questionnaire). Within this framework, federal agencies have argued for items to serve their program and policy needs and have made trade-offs as needed.
In recent censuses the Census Bureau has also sought input from states and cities and the public, through such mechanisms as public meetings. These meetings have generated a long list of potential new items to include in the census. Ultimately, however, federal agency data needs take precedence, and not even all of the agencies' proposed items can be accommodated because of the limits set by the Census Bureau on feasibility and questionnaire length.

OMB has a formal role in approving the questionnaire under the terms of the Paperwork Reduction Act. Indeed, in 1988, OMB disapproved the questionnaire for the census dress rehearsal and, by extension, for the 1990 census. OMB requested that seven housing items be moved from the short to the long form, that three housing questions on the long form be deleted entirely, and that the size of the long-form sample be reduced from about 16 million households (about 1 in 6) to 10 million households, with possible use of a variable-rate sampling plan. After appeals and analysis by data users and statistical work by the Census Bureau and others on the sample size, OMB rescinded most of the changes it had requested. A few housing items were moved from the short to the long form, and a variable-rate sampling plan was adopted (see Choldin, 1994).

Congress also plays a role because the secretary of commerce is required to provide to Congress the list of topics proposed for inclusion in the census no later than 3 years before Census Day and to provide the proposed list of specific items no later than 2 years before Census Day. For the 1990 census, Congress altered the question on race from the format originally proposed by the Census Bureau.

Congress has already been involved in the questionnaire for the 2000 census. Some members have argued that the questionnaire should be limited to basic items that are needed for key federal purposes. In response, the Census Bureau proposed to include in the 1995 census test only those items that are required by law to be collected in the census. From a review of legislation and agency documents, the Census Bureau classified items into three categories: (1) items mandated by law to come from the census; (2) items required by law although not necessarily from the census, but for which there is no reasonable alternative source; and (3) items not mandated but needed for agency program purposes. Restricting the questionnaire to the items in the first two categories would have little effect on the content of either the long form or the short form compared with 1990, because most items are required by law. Restricting the questionnaire to the items in the first category, however, would considerably shorten both the long and short forms because many laws do not specifically name the census as the data source (see Bureau of the Census, 1993a, and Appendix M).

The prospect that the test census content in 1995 (and, consequently, the 2000 census) might be restricted to items mandated in a strict sense led several agencies (notably, the Department of Transportation) to explore obtaining legal mandates for needed items they believed could be obtained in a cost-effective manner only from the census. Subsequently, the Census Bureau proposed to include virtually all of the 1990 content in the 1995 test, and agencies have put in abeyance, at least for the time being, efforts to mandate specific items. We support the Census Bureau's decision for the test: information is needed on the mail return rates and costs of the long form in the context of the important changes in census methodology that will be tested in 1995 (see the panel's letter report: Schultze, 1993). That information is needed whether or not the 2000 long form is ultimately reduced in length or replaced by another data collection method, such as the continuous measurement system currently under consideration at the Census Bureau (see below).

In the process of determining the census content, there is a need to balance many factors, including program and policy data requirements, limitations on questionnaire space, and feasibility considerations and costs, as well as a need to make some provision for historical continuity. Inevitably, trade-offs must be made. That such trade-offs result from a full consideration of the range of agency needs and other relevant factors in a broad context is a positive benefit of the process as it has operated to date. The process would likely be impaired if a decision were made that all items must have explicit legislative mandates to be included in the census. Such a requirement could lead to individual initiatives by agencies and congressional committees to mandate items that could make it harder to control the size of the questionnaire or balance data needs across agencies.

We support the continuation of a process for determining census content that involves the relevant federal agencies and data users. However, we believe that the integrity of the process would be enhanced by strengthening the oversight role of the chief statistician in OMB. For example, the Office of the Chief Statistician could take a lead in organizing agency meetings to consider such issues as the trade-offs between collecting data in the census itself or in other ways. That office brings a useful breadth of perspective, which encompasses the needs of program agencies for census data as well as of other users, including statistical agencies. That office also brings a concern for such issues as the feasibility of asking questions in a census context.

Conclusion 6.2 The panel concludes that the process of determining the census content by involving federal agencies and eliciting the views of other users has worked well in the past and should be continued. In our judgment, the process would be strengthened by increasing the oversight and coordination role of the chief statistician in the Office of Management and Budget.

THE LONG FORM

Given the importance of the broad range of data for small areas and small population groups that the census currently collects, the main question is whether those data should be collected as part of the census or by some other means. Is the census the right vehicle to collect additional questions beyond the minimum required for reapportionment and redistricting? Do the added questions (principally those on the census long form, which is sent to a sample of households) increase the costs of the census and impair the quality of the data?

The argument for the view that the long form is a problem for the census can be formulated as follows: respondents find the long form unduly burdensome because of its length and complexity; this burden lowers the overall mail return rate and also lowers item response rates for people who do send back a form; both of these effects, particularly the former, increase census costs; and the lower mail return rate may also contribute to population undercoverage.

Questions about the long form are of intense interest to almost every user of census data and elicit strong opinions on all sides. Many data users are impassioned in their defense of the long form, implicitly rejecting the idea that it hurts the basic census enumeration and arguing for the need for the rich multivariate data that it provides for small areas and small population groups. Others see the long form as a threat to the cost and quality of the census data that are needed to serve the basic constitutional purposes. In light of this controversy, in this section we review the evidence about the effects of the long form on census costs, mail return rates, and coverage.

Costs

The long form adds costs to the census in a number of ways, including: extra printing costs, extra postage, additional follow-up for every percentage point that the mail return rate for the long form is less than that for the short form, additional editing and follow-up for item nonresponse, coding of such items as industry and occupation, and additional data processing and publication costs. However, the long form, which is essentially a large sample survey on top of the massive effort undertaken for the complete (short-form) census, represents a marginal addition to total census costs.

Moreover, the costs associated with the long form do not explain the escalation in census costs that has occurred. Over the period from 1960 to 1990, the long form became less rather than more of a burden on the population and a smaller component of the census. The sample size of the long form was reduced from 25 percent of the households in the 1960 census to about 17 percent of the households in the 1990 census, i.e., 1 in 6.
(In the variable-rate sampling design used, in small places the sample size was 1 in 2 of households, and in large census tracts the sample size was 1 in 8 of households.) Over this period, the number of questions remained about the same (the number increased somewhat from 1960 to 1980 but declined from 1980 to 1990; see Appendix A).

According to Miskura (1992), the total costs of the long form, including follow-up and all other costs, may have contributed 9 to 10 percent ($230 to $250 million) to the $2.6 billion costs of the census in 1990. More recent estimates that were provided to the panel from the Census Bureau's cost model suggest that the marginal cost of the long form in the context of the 1990 census methodology may range from $300 to $500 million, or 11 to 19 percent of total 1990 census costs.

The added cost due to the somewhat lower mail return rate for long forms compared with short forms in 1990 (see below) is a relatively minor component of the total. Because relatively few households were sent the long form, the lower long-form mail return rate reduced the combined short- and long-form mail return rate (based on occupied housing units) by less than 1 percentage point: 74.1 percent for short and long forms combined and 74.9 percent for the short form alone. Using the Census Bureau's estimate that each percentage point drop in the mail return rate contributed 0.6 percent to total costs, this percentage point difference increased census costs by about $16 million.

The changes that are contemplated for the redesigned 2000 census, such as truncation and sampling for nonresponse follow-up, would reduce the cost of the long form. (We discuss in greater detail the likely costs of the long form in the context of a new census design in the section below on continuous measurement.) Also, just as the long form itself represents a marginal addition to census costs, so, too, marginal reductions in the content of the long form would likely have very limited effects on census costs. It is certainly important to scrutinize all proposed content items to determine their usefulness and appropriateness for inclusion in the census, but eliminating a few items will not produce much cost saving.

Conclusion 6.3 The panel concludes that the marginal cost of incremental data on the decennial census is low. In particular, we conclude that the extra cost of the census long form, once the census has been designed to collect limited data for every resident, is relatively low.

Mail Return Rates

The long form increases the burden of the census on the population, which may in turn reduce response rates relative to the short form. We noted above that the burden in terms of the proportion of households receiving the long form has declined progressively from 1960 to 1990; also, the number of questions has changed relatively little over the period.
In this section, we review the evidence on mail return rates in the 1980 and 1990 censuses and the results of experiments to improve response.[1]

Effects in 1980 and 1990

The effect of the long form on mail return rates in 1980 was minimal (see Bureau of the Census, 1986): the mail return rate (which covers occupied housing units) was 81.6 percent for the short form and 80.1 percent for the long form, a difference of 1.5 percentage points.[2] Return rates were considerably higher in decentralized, easy-to-enumerate suburban district offices than in centralized, hard-to-enumerate urban offices, but the disparity in return rates differed little by type of form. Return rates varied by census regional office, from 77 percent overall in the Dallas and New York City regional offices to 85 percent in Detroit and Chicago, but the largest difference in return rates between short and long forms was 3.5 to 3.6 percentage points (in Dallas and Los Angeles).

The long form had a somewhat greater effect on mail return rates in 1990 than it had in 1980. The long-form mail return rate was 4.5 percentage points below the short-form mail return rate (70.4 percent versus 74.9 percent), compared with the difference of 1.5 percentage points in 1980. There is some evidence that the long-form/short-form differential in return rates was greater in hard-to-enumerate areas: district offices in central cities had the largest short-form/long-form difference (6.9 percentage points) compared with the national average of 4.5 (see Thompson, 1992). Since only one-sixth of all households received the long form, however, the difference in return rates reduced the overall mail return rate by less than 1 percentage point.

Indeed, what stands out about mail return rates in the 1990 census is not the relatively minor difference between the short-form and long-form rates, but the overall decline in the mail return rate. A 1990 survey to find out why people did not send back their questionnaires showed that most of the reasons cited apply to either form (Kulka et al., 1991): three-fourths of nonrespondents said they never received a form, or never opened it, or filled it out but never mailed it back; the remainder said they opened the form but did not start to fill it out or did not complete filling it out.

Experiments to Improve Response

Because declines in the mail return rate affect census costs, the Census Bureau moved quickly after the 1990 census to conduct experiments with both the short and the long forms to determine ways to increase response.
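The link between return rates and costs can be illustrated with a rough calculation from figures quoted in this chapter. The sketch below is an illustration only, not the Census Bureau's cost model: it assumes that exactly one-sixth of households received the long form and weights the two 1990 return rates accordingly, then applies the Bureau's estimate of 0.6 percent of total cost per percentage point of mail return. The function and variable names are hypothetical.

```python
# Figures quoted in this chapter for the 1990 census.
SHORT_FORM_RETURN_RATE = 74.9   # percent
LONG_FORM_RETURN_RATE = 70.4    # percent
LONG_FORM_SHARE = 1 / 6         # assumption: exactly 1 in 6 households got the long form
TOTAL_CENSUS_COST = 2.6e9       # dollars
COST_SHARE_PER_POINT = 0.006    # 0.6 percent of total cost per percentage point


def combined_return_rate(short_rate, long_rate, long_share):
    """Household-weighted average of the short- and long-form return rates."""
    return (1 - long_share) * short_rate + long_share * long_rate


def added_follow_up_cost(point_drop, total_cost, cost_share_per_point):
    """Extra cost implied by a drop of `point_drop` percentage points in mail return."""
    return point_drop * cost_share_per_point * total_cost


combined = combined_return_rate(
    SHORT_FORM_RETURN_RATE, LONG_FORM_RETURN_RATE, LONG_FORM_SHARE
)
drop = SHORT_FORM_RETURN_RATE - combined

print(f"Combined return rate: {combined:.2f} percent")
print(f"Drop relative to the short form alone: {drop:.2f} points")
for points in (drop, 1.0):
    cost = added_follow_up_cost(points, TOTAL_CENSUS_COST, COST_SHARE_PER_POINT)
    print(f"Cost of a {points:.2f}-point drop: ${cost / 1e6:.1f} million")
```

Under these assumptions the combined rate comes out at about 74.15 percent, close to the 74.1 percent reported, and a drop of one full percentage point corresponds to roughly $15.6 million, in line with the figure of about $16 million cited above. The 0.75-point drop implied by the one-sixth assumption gives a somewhat smaller amount.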
In 1992 the Census Bureau conducted the Simplified Questionnaire Test (SQT) to assess the impact on mail return rates of reducing the number of questions on the short form and of making the form more user-friendly (see Dillman et al., 1993). The SQT tested five different short forms: the form used in 1990 (with wording updated to 1992) as a control; the "booklet" form, which contained all of the 1990 content but in a user-friendly format; the "micro" form, which contained no housing items and asked only for name, age, sex, race, and ethnicity; the micro form with a request for Social Security number added; and the "roster" form, which asked only for name and age (birth date). Every form except the control was in a user-friendly format.

The appropriate comparison for evaluating the effect of questionnaire length on response to the SQT is to look at the booklet, micro, and roster forms. There was virtually no difference in response between the micro and the roster forms. Both the micro and the roster forms achieved higher return rates overall (by 4.1 to 4.6 percentage points) than the booklet form.[3] However, these improvements were largely in areas that were relatively easy to enumerate in 1990; the improvements in hard-to-enumerate areas were much less impressive: 1.9 to 2.4 percentage points compared with 4.4 to 4.8 percentage points in the easy-to-enumerate areas.

The effects of a user-friendly format were just the opposite: overall, the user-friendly booklet form achieved a higher return rate than the control form by 3.4 percentage points. The effect was most pronounced in hard-to-enumerate areas: the difference was 7.6 percentage points in these areas, compared with 2.9 percentage points in easy-to-enumerate areas.

The SQT and a subsequent experiment, the Implementation Test (IT), also provided evidence on the effects of such strategies to improve response as the use of a prenotice letter, a reminder postcard, a stamped return envelope, and a replacement questionnaire (see Census Data Quality Branch, no date). The IT, which used the micro short form for all treatments, found that the use of a prenotice letter increased response rates by 6 percentage points, and the use of a reminder card increased response rates by 8 percentage points (the effects were larger in areas that were relatively easier to enumerate in 1990). There were no significant effects from the use of a stamped return envelope. From comparing the IT treatment that combined the use of a prenotice letter and reminder card with the results of the SQT for the micro form (the SQT used a prenotice letter, a reminder card, and a replacement questionnaire for all treatments), the Census Bureau estimated that the use of a second or replacement questionnaire increased response rates by 10 to 11 percentage points.

The Census Bureau conducted a third experiment in 1993 with user-friendly long forms and appeals to increase response, the Appeals and Long-Form Experiment (ALFE) (see Treat, 1993).
For the long-form component, ALFE tested four different forms: the same 20-page long form used in 1990 (with wording updated to 1993) as a control; a 28-page user-friendly long form in a booklet format (with all of the questions for each household member preceding the questions for the next member); a 20-page user-friendly row-and-column long form; and the same 8-page booklet short form used in the SQT. The format of the booklet long form was expected to be more conducive to response than the format of the row-and-column long form, but the increased page length of the booklet long form was expected to be an impediment to response. None of the forms included any type of motivational appeal.

Overall, the booklet long form increased return rates by 4.1 percentage points compared with the 1990 long form. In contrast, there were no statistically significant effects of the user-friendly row-and-column format. Whether the booklet long form might have increased return rates by an even higher percentage if its page length had been the same as the 1990 long form cannot be determined from the experiment. By type of response area, the booklet long form increased the return rate in areas that were easy to enumerate in 1990 but had no effect in hard-to-enumerate areas.

All of the ALFE long forms achieved substantially higher return rates than the 1990-type long form that was used in the 1986 National Content Test (NCT): the difference in return rates between the ALFE and the NCT 1990-type long forms was 10.3 percentage points. Based on evidence from experiments with mailing strategies to increase response, Treat (1993) surmises that the difference in return rates was due to the use of a prenotice letter and a reminder postcard in the ALFE, which were not used in the NCT.

The ALFE booklet short form achieved substantially higher return rates than did the booklet long form: the difference was 11.3 percentage points overall, 10.8 percentage points in easy-to-enumerate areas, and 15.3 percentage points in hard-to-enumerate areas. However, one cannot conclude that such a wide differential would occur under census conditions of extensive publicity and outreach.

As a separate part of the ALFE, the use of three different types of motivational appeals was tested, emphasizing, respectively, the benefits of the census, the confidentiality of the data, and the mandatory nature of responding. The first two types of appeals had very little effect; however, a strong emphasis on the mandatory aspect increased the return rate by 10 to 11 percentage points. Unfortunately, the appeals portion of the experiment was conducted only with the short form, so that no information is available about possible effects on response to the long form.

In general, the ALFE leaves many questions unanswered about the long form, such as the effects on return rates of different page lengths or the difference between short- and long-form return rates with a mandatory appeal, particularly when carried out in the context of a national census (with its attendant publicity and legitimacy).

Conclusions

Overall, the evidence is clear that the problems with mail return rates experienced in the 1990 census characterized the short form almost to the same degree as the long form.
The Census Bureau's experiments with different form types and lengths and with other aspects of the mailout process (the use of a prenotice letter, a reminder postcard, a replacement questionnaire, motivational appeal, etc.) have identified some promising ways to increase mail returns by households for both the short and long forms. Features of the mailout not related to the forms as such (e.g., motivational appeal, use of replacement questionnaire) seem to be particularly effective in increasing mail return rates. With regard to the forms themselves, there is evidence that making both the short form and the long form more user-friendly improves responses somewhat, particularly (in the case of the short form) in hard-to-enumerate areas. There is also evidence that reducing the length of the short form (the form that most households receive) helps response to a limited extent.

It may be that implementing the various improvements to the mailout process, including making the short form shorter and more user-friendly, will widen the differential between the short-form and long-form mail return rates. One must assess the evidence cautiously, however, because none of the tests of improvements to the short and long forms and the comparative effects on mail return rates has yet been conducted in anything approaching a census environment. (The decision to include the long form in the 1995 census test is critically important in this regard.) Also, a wider differential in the context of a considerably higher overall mail return rate has very different implications from a wider differential at the level of the rate in 1990. Finally, it is important to remember that the long form, however it is designed, goes to relatively few households.

Coverage

Coverage errors in the 1990 census, specifically people missed within households, were higher for forms obtained by enumerators compared with forms that households filled out and mailed back themselves. The reason, presumably, is not that the enumerators did a poor job, but that people who do not mail back their questionnaires also do not respond well to follow-up. Thus, the percentage of people in the 1990 Post-Enumeration Survey (PES) who were not matched to the census although their housing unit was matched (within-household misses) was 11.6 percent for enumerator-filled returns compared with 1.8 percent for mail returns (Siegel, 1993; see also Keeley, 1993).[4]

This difference means that the somewhat lower mail return rates for long forms in 1990 could have had the effect of increasing coverage errors. Overall, however, the effect of the long form on within-household misses of people in 1990 was trivial (Siegel, 1993) because most people (5 out of 6) did not receive the long form; the difference between short-form and long-form mail return rates was not large (4.5 percentage points); and rates of within-household misses were virtually the same for short and long forms within type of return: 1.9 and 1.8 percent for short-form and long-form mail returns and 11.7 and 11.3 percent for short-form and long-form enumerator-filled returns.
Thus, the nonmatch rate for people in enumerated households, combining mail and enumerator-filled forms, was 4.2 percent for short forms and 4.4 percent for long forms, not a statistically significant difference.[5]

The contemplated change in methodology for the 2000 census to take account of coverage errors and complete the count by means of statistical estimation essentially renders the effect of the long form on coverage moot. To the extent that there is an effect, it will be taken care of in the estimation process (see Chapter 5). Other changes, such as the use of sampling for nonresponse follow-up, would also minimize the effects of lower long-form mail return rates on coverage.

Matrix Sampling

Although the long form had relatively little effect on mail return rates and almost no effect on coverage in 1990, it does represent a burden on households in the census sample. One proposal to reduce the burden on these households, and thereby perhaps improve response, is to employ matrix sampling. This approach divides the long-form sample into subgroups, each of which receives a different, shorter version of the long-form questionnaire with a subset of the content. Matrix sampling was used in the 1970 census, which had two intermediate-length forms that included some questions in common. It was also used to a lesser extent in the 1960 census for the housing items (see Appendix A).

We understand that the Census Bureau currently plans to include a full long form and an intermediate-length form in the 1995 test, which should provide valuable information about optimal form length in the context of a user-friendly design and other improvements to census methodology.[6] We support this decision, which recognizes the need to test content in the context of design, rather than trying to separate the two.

There are many issues about matrix sampling that need to be addressed before making a decision about its use in the 2000 census. All other things being equal, data users are most likely to prefer a single long form for the reason that every variable can be cross-tabulated with every other variable. Also, a single long form provides the maximum sample size (for a given overall sampling rate) and facilitates data analysis. In addition, a single long form is likely to be easier to control in the field. However, because each version of the long form in a matrix sampling scheme is reduced in length, there may be a positive effect on mail return rates and item nonresponse rates, which may in turn reduce the costs associated with follow-up. Also, there may be ways to minimize operational and data processing and analysis problems associated with matrix sampling. Information is needed on the balance of the positive and negative implications of this approach for reducing respondent burden.
We encourage the Census Bureau to conduct a comprehensive analysis, with the results from the 1995 test and other information, of the cost-effectiveness of matrix sampling in comparison with a single long-form design.

Recommendation 6.1 The panel recommends that the Census Bureau evaluate the merits of a matrix sampling approach that uses several intermediate forms in place of a single long form to reduce respondent burden. The Census Bureau should examine the effects on: satisfying data users' needs; mail return rates; sampling and nonsampling errors (including item nonresponse rates); operational problems; and data processing and estimation problems that could affect the usefulness of the information, particularly for multivariate analysis.
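
The trade-off between a single long form and a matrix design can be illustrated with a small sketch. The item names, the split of items across forms, and the equal allocation of households to forms are hypothetical and chosen only for illustration; they do not describe any Census Bureau design. The sketch computes, for each pair of items, the share of the sample in which both items are asked, which determines whether the pair can be cross-tabulated and with what sample size.

```python
from itertools import combinations

# Hypothetical long-form items (illustrative only, not actual census content).
ITEMS = ["income", "education", "occupation", "commute", "housing_cost"]

# Hypothetical designs: each entry is (items on the form, share of the
# sample that receives that form). Shares sum to 1 within a design.
SINGLE_LONG_FORM = [(set(ITEMS), 1.0)]
MATRIX_DESIGN = [
    ({"income", "education", "occupation"}, 0.5),
    ({"income", "commute", "housing_cost"}, 0.5),
]


def pair_coverage(design):
    """Share of the sample in which both items of each pair are asked."""
    coverage = {}
    for a, b in combinations(ITEMS, 2):
        coverage[(a, b)] = sum(
            share for items, share in design if a in items and b in items
        )
    return coverage


def average_form_length(design):
    """Expected number of items asked of a sampled household."""
    return sum(len(items) * share for items, share in design)


if __name__ == "__main__":
    for name, design in [("single long form", SINGLE_LONG_FORM),
                         ("matrix design", MATRIX_DESIGN)]:
        coverage = pair_coverage(design)
        lost = [pair for pair, share in coverage.items() if share == 0]
        print(name)
        print("  average items per household:", average_form_length(design))
        print("  pairs that cannot be cross-tabulated:", lost)
```

Under these assumed forms, each household answers fewer items, but any pair of items that never appears together on a form cannot be cross-tabulated, and pairs that do appear together are observed for only part of the sample.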

respond to the telephone follow-up, 6 percent are contacted by personal visit, and the remaining 12 percent are not contacted. After all of the follow-up is carried out as specified, there are still some 27,200 households of the 225,000 selected (or 12 percent) that are left as nonrespondents by survey design (i.e., of the 39,600 households that are nonrespondents after the telephone follow-up, only 12,400 are sent for personal interview follow-up). The total monthly cost for continuous measurement, under the scenario just outlined, is $5.14 million. Over 10 years, the continuous measurement operation would cost about $615 million (rounding to the estimate provided by Alexander to the panel).

The intensity of operations under this proposal corresponds to one of the contemplated truncated census designs with sampling for nonresponse follow-up for which the marginal cost of the long form is estimated at $200 to $400 million. Hence, the estimated costs for continuous measurement are 1.5 to 3 times the estimated marginal cost of the long form under a similar design. Furthermore, as noted above and discussed below, we believe there are missing elements for which cost information is needed. Overall, we believe that a realistic estimate of the costs of the continuous measurement is likely to be higher than the estimates developed to date.

Missing Cost Data

To prepare a more complete estimate of the costs of continuous measurement would require more careful consideration of several questions and issues:

What cost components does the assumed cost of $11 per completed mail interview include?
This estimate, which seems unrealistically low, must cover paper costs, printing the questionnaires, printing envelopes, logistics (getting the questionnaires to the right places both on their way out and on their way back), postal costs for mailout and mailback, checking in the questionnaires, training staff to edit the questionnaires, data entry, computer editing, and handling of edit rejects.

What is the cost to maintain the master address file, block by block, for every quarter for a 10-year period at a level of coverage completeness commensurate with that achieved by the long form embedded in the census? Also, what is the cost to deal with rural addresses that are not yet in city-style format?

What is the net cost increase required to maintain a continuously updated geographic referencing system, linked to the master address file, for a 10-year period compared to the one-time updating work involved in the census?

What cost components are included in the estimate of fixed costs, which amount to 16 percent of total costs? Specifically, what is the cost of maintaining a headquarters staff for continuous measurement over a 10-year period? Other than headquarters staff, what are the other fixed costs for continuous measurement for a 10-year period?

What is the cost if initial mail response rates are different from those projected? The Census Bureau estimates assume a 60 percent mail return rate from occupied housing units, which compares with a 70 percent mail return rate for the census long form. The latter rate was achieved with all the intense publicity and sense of legitimacy that can only derive from a national census. What is the evidence that a continuing survey would achieve a response as high as 60 percent? (In this regard, it would be useful to compare the response rates with other mail surveys conducted by the Census Bureau.) According to Alexander (1994b), the assumed 60 percent mail return rate is based in part on results from the ALFE experiment, which suggest that such a rate could be obtained by improving the questionnaire package and the mailing process (e.g., having a prenotice letter, a reminder card, and a user-friendly questionnaire). Such improvements, however, would apply to the long form as well as to a continuing survey. In other words, an appropriate cost comparison must build in realistic assumptions about the response to a continuing survey that is not conducted under census conditions vis-à-vis the response that is obtained from a comparably designed long form as part of the census.

What is the cost associated with the operation of finding telephone numbers for 115,000 households each month for a 10-year period in order to contact households that did not return their questionnaire? Telephone numbers are also needed for households that mailed back but did not fully fill out their questionnaire; presumably, the questionnaire will ask for a telephone number for follow-up purposes. If the costs to locate telephone numbers are built into the unit cost of completed telephone interviews, what are those built-in costs?

What is the cost associated with attempted telephone calls that result in a telephone nonresponse?
In the Census Bureau's calculations, telephone costs are computed as a unit telephone interview cost multiplied by the number of completed telephone interviews.

What is the cost of extra calls or other costs incurred due to edit failures (missing or inconsistent responses to questionnaire items), other than those built into the unit costs per completed mail return, telephone, or personal interview?

Finally, what are the costs associated with data processing and dissemination?

Other Cost Savings

Taking the Census Bureau cost estimates at face value, the costs of a continuous measurement system with a sample size of 250,000 housing units per month, when compared with the marginal cost of the long form executed with the use of truncation and sampling for nonresponse follow-up, leave a difference of $215 to $415 million to be made up through other savings. (This range is $615 million for the full-cycle costs of continuous measurement minus the estimated marginal cost of $200 to $400 million for the long form under a comparable design.)

One suggestion (see Alexander, 1994b) is that continuous measurement would reduce the costs of compiling the master address file for the census. The marginal costs to the continuous measurement scheme for updating the master address file throughout the decade are unknown. It is the panel's understanding that these costs are not currently included in the continuous measurement proposal. Moreover, little is known about the improved quality of the master address file that would be gained by using continuous measurement interviewing staff to check and update local-area address lists. Estimates of cost savings from the continuous updating of the master address file are speculative at this moment.

The data obtained by continuous measurement could also prove useful for existing surveys and estimates programs in a number of ways that cut costs. For example, continuous measurement data could provide a more efficient means of designing surveys of rare populations, by reducing the need for expensive screening surveys to identify eligible respondents. All of these potential benefits of continuous measurement for the statistical system should be investigated carefully. However, at this stage, estimates of savings are highly speculative, and the likelihood that such savings could make up all or a significant fraction of the difference between the full-cycle costs of continuous measurement and the savings from dropping the long form must be viewed with skepticism.

Data Quality

Another important aspect of evaluating a continuous measurement system in comparison to the long form concerns data quality.
In terms of sampling errors, the plan for continuous measurement that we have just reviewed will produce estimates for small geographic areas (when data are cumulated) that have sampling errors about 25 percent higher than the long form in the context of the 1990 census methodology. The sampling errors may be about the same as the long form in a redesigned census that uses truncation and sampling for nonresponse follow-up. Some analysts anticipate that data quality in terms of nonsampling error (e.g., item nonresponse, reporting errors of various kinds) would be improved in a continuous measurement system over the long form and hence that the mean squared error (combining sampling and nonsampling error) would be no worse or even better. The assumption is that a continuous measurement system would achieve quality improvements because of the use of experienced interviewers and supervisors and generally the ability to monitor data collection in a more effective manner than can be achieved in the compressed and hectic schedule of the census.
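
The quantitative comparisons in the cost and data-quality discussion above can be retraced with a short sketch. The dollar figures are those quoted in the text. The sampling-error part rests on an assumption not stated in the text, namely that sampling error is proportional to one over the square root of the sample size (as under simple random sampling), and the variance and bias values passed to the mean squared error function are invented purely to show how a reduction in nonsampling error could offset a larger sampling error.

```python
# Figures quoted in the text (dollars in millions).
MONTHLY_COST = 5.14                   # continuous measurement, per month
FULL_CYCLE_COST = 615.0               # 10-year cost as rounded in the text
LONG_FORM_MARGINAL = (200.0, 400.0)   # marginal cost range of the long form


def ten_year_cost(monthly_cost, years=10):
    """Unrounded full-cycle cost: monthly cost times the number of months."""
    return monthly_cost * 12 * years


def cost_gap(full_cycle, long_form_range):
    """(low, high) difference to be made up through other savings."""
    low, high = long_form_range
    return full_cycle - high, full_cycle - low


def cost_ratio(full_cycle, long_form_range):
    """(low, high) ratio of full-cycle cost to long-form marginal cost."""
    low, high = long_form_range
    return full_cycle / high, full_cycle / low


def relative_sample_size(se_ratio):
    """Relative sample size implied by a sampling-error ratio.

    Assumption (not from the text): sampling error scales as 1/sqrt(n).
    """
    return 1.0 / se_ratio ** 2


def mean_squared_error(sampling_variance, bias):
    """Mean squared error as sampling variance plus squared bias."""
    return sampling_variance + bias ** 2


if __name__ == "__main__":
    print("10-year cost:", round(ten_year_cost(MONTHLY_COST), 1))
    print("gap to make up:", cost_gap(FULL_CYCLE_COST, LONG_FORM_MARGINAL))
    print("cost ratio:", cost_ratio(FULL_CYCLE_COST, LONG_FORM_MARGINAL))
    print("relative sample size:", relative_sample_size(1.25))
    # Hypothetical values: variance rises by 1.25 squared, bias is halved.
    print("long form MSE:", mean_squared_error(1.0, 1.0))
    print("continuous measurement MSE:", mean_squared_error(1.25 ** 2, 0.5))
```

Under the stated proportionality assumption, a sampling error 25 percent higher corresponds to an effective sample about 64 percent as large. Whether nonsampling error would in fact fall enough to offset this is exactly the open question raised in the text.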

It may well be that more experienced interviewers will improve the quality of the data obtained by telephone and personal follow-up. Such improvements could be important in light of evidence about data-quality problems with long forms (and short forms) that are obtained by census enumerators (see Appendix L).

On one important dimension of quality, however, completeness of within-household coverage, continuous measurement is likely to perform worse than the census long form. It is well known that household surveys rarely cover the population as well as the decennial census (see, e.g., Shapiro and Kostanich, 1988). For example, even after adjustment for nonresponse, the March Current Population Survey and the Survey of Income and Program Participation typically cover only 80 to 85 percent of black men and 90 to 95 percent of other people when compared with unadjusted census-based population estimates (i.e., estimates that have not been adjusted for the undercount in the census itself; see Citro and Kalton, 1993: Table 3-12).12

Alexander (1993, 1994b) indicates that continuous measurement might do better than the Current Population Survey because it will include some features of the census, such as questions designed to improve within-household coverage. However, we believe that a major reason for improved coverage in the census relative to household surveys is the widespread and intense publicity and sense of legitimacy associated with the census, which cannot be replicated for continuous measurement (or other) surveys.13 Because of the much lower intensity of planned follow-up of continuous measurement, its much larger scale of operation, and its incomparably lower expected unit costs compared with other household surveys, we see no reason to expect that continuous measurement would have improved coverage.
Moreover, a lower intensity of follow-up would disproportionately affect those with traditionally high coverage errors: minorities, poor people, and mobile and transient populations. These are groups for which census long-form data are heavily used.14

Conceptual Issues with Cumulated Data

If a continuous measurement system is adequately funded throughout the decade, it promises users the benefits of more frequent estimates for both small and large geographic areas than are available from the census long form. At the same time, the use of cumulated data, collected on a continuing basis, to produce small-area estimates poses a number of analytical problems that must be addressed and resolved.

Reference Periods and Recall. One issue with continuous measurement is the reference period that is used for each month's survey. For annual income, for example, one could ask respondents each month to report their income for the preceding 12 months or to report their income for the previous calendar year.

The former option reduces the length of recall but specifies a reporting period (the preceding 12 months) that is not natural for respondents (except for interviews that fall in January of each year). In addition, the data would need to be adjusted in various ways to produce calendar-year estimates. The latter option lengthens the recall for most respondents considerably: people will be asked as late as December about their income in the preceding calendar year. Yet it is well known that it is best to ask about income fairly soon after the end of the year when people are preparing to file their income tax returns. This is what happens in the current census, in which questionnaires are mailed out in late March. Similar problems of reference period and recall can apply to employment status, occupation, and industry, and other important variables on the long form.

Analysis Problems. Data that refer to 5-year annual averages raise different analytic issues than data for a single year. How does one compare the 5-year average of two municipalities, one of which was growing strongly, the other declining strongly, during the period? What is the meaning of median income when some of the income data refer to the beginning and some to the end of the 5-year averaging period?

Residence Rules. Given the mobility of the U.S. population, there is a problem of how one assigns population consistently to different areas in different months and years.

Changes in Content or Design. Surveys (and censuses) need to be modified from time to time, with regard to question wording and aspects of their design. Such changes always pose a problem for continuity of time series. In the case of continuous measurement, the problem is more acute because the key estimates for small areas require cumulated data and hence depend on stability in all aspects of the design, questionnaire, and operation of the survey.
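
The analysis problem with 5-year averages noted above can be made concrete with a minimal sketch. The two series are invented annual values of some characteristic for two hypothetical municipalities, one growing and one declining; they are not data from any survey. The sketch shows that a cumulated 5-year average can be identical for areas whose trends run in opposite directions.

```python
from statistics import mean

# Hypothetical annual values of some characteristic over a 5-year period
# (invented for illustration only).
GROWING = [20.0, 25.0, 30.0, 35.0, 40.0]
DECLINING = [40.0, 35.0, 30.0, 25.0, 20.0]


def five_year_average(series):
    """Simple unweighted average of the annual values."""
    return mean(series)


def net_change(series):
    """Change from the first to the last year of the period."""
    return series[-1] - series[0]


if __name__ == "__main__":
    for name, series in [("growing", GROWING), ("declining", DECLINING)]:
        print(name,
              "average:", five_year_average(series),
              "net change:", net_change(series))
```

With these assumed values both areas have the same 5-year average even though their end-of-period levels differ by a factor of two, which is why a cumulated estimate alone cannot distinguish them.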
Relation to Other Household Surveys

Much of the nation's most important social and economic information is collected in household surveys conducted by the Census Bureau for other agencies in the federal statistical system. The sponsoring agencies and the Census Bureau have for years conducted extensive research to improve the quality of such surveys as the Current Population Survey, the Health Interview Survey, and the National Crime Survey, as well as the Survey of Income and Program Participation, which is sponsored by the Census Bureau itself. Substantial sums are spent on these surveys to obtain detailed high-quality information on such topics as employment and unemployment, income, and health conditions.

The system of continuous measurement currently planned by the Census Bureau will overlap in content with many of those surveys. A continuous measurement survey would not provide the same refined measures or the detailed subject content of these other surveys, but it would provide the small-area detail that they do not. For key summary indicators, such as the unemployment rate or the poverty rate, less refined estimates would also be available from continuous measurement for larger geographic areas: the nation, regions, states, and larger metropolitan areas. These estimates would inevitably be compared with estimates from the other major federal household surveys.

The very large sample size for continuous measurement and the fact that estimates for large areas could be provided on an annual or even more frequent basis would, at first glance, make continuous measurement an attractive source for key estimates.15 The much lower unit costs of continuous measurement compared with existing surveys could lead to pressures to cut back the scope of existing surveys in order to reduce the overall costs of the federal statistical system.

But there are a number of reasons to continue to require information from current surveys. First, estimates from continuous measurement are likely to have much larger nonsampling errors than are estimates from the other surveys, given the crudeness of the mail questionnaire that must be used for continuous measurement. Moreover, some surveys like the Current Population Survey, which provide data widely used in macroeconomic analysis, are specifically designed to capture month-to-month fluctuations, a characteristic that would not be possessed by the continuous measurement survey. Finally, little is known about the quality of the data to be collected through continuous measurement and the manner in which those data compare to estimates from the existing household surveys. The collection procedures for continuous measurement, based on a mail survey with telephone and some personal interview follow-up for nonresponse, differ markedly from the personal interview and telephone collection with structured questionnaires used in the existing household surveys.
Past research suggests that the nonsampling errors associated with self-enumeration in mail surveys are likely to be much larger than those in surveys conducted by personal visit or telephone.

We believe that the Census Bureau should develop methods for evaluating the quality of continuous measurement data compared with the other household surveys and for integrating estimates from continuous measurement with the other surveys before making any decision to proceed with implementation of a continuous measurement system. Otherwise, there is the prospect of competing estimates for key statistics without knowledge of how to interpret or reconcile them. The work that is needed on methods of integrating continuous measurement with other surveys in effect involves a complete redesign of the nation's household survey system; this work will take time.

Alternative Ways to Provide Small-Area Data

In conducting research on the costs and benefits of continuous measurement, it is important to broaden that research to examine the feasibility and cost-effectiveness of other means of obtaining more frequent small-area estimates.

Possible alternatives include expanding existing surveys, conducting a mid-decade census or large-scale survey, exploiting data from administrative records, or some combination of these approaches (see Chapter 8).

More broadly, it is important to consider competing uses within the federal statistical system for the additional funding that would otherwise be allotted to continuous measurement. In other words, if more complete estimates indicate, as it seems to us, that continuous measurement is likely to cost more than the savings from dropping the long form and other savings from integration with existing household surveys, then careful consideration should be given to alternative uses of that extra funding. The costs and benefits of a wider range of investments in the federal statistical system (whether to obtain more frequent small-area estimates by continuous measurement or by some other means or to meet other kinds of data needs) should be considered before deciding that continuous measurement is preferred.

Conclusions

For the short term, we conclude that it is not feasible for a continuous measurement system to replace the census as a means of collecting long-form type data. There are too many unanswered questions for which research is needed. Thus, credible estimates must be developed of the savings from dropping the long form (which we believe have been overestimated) and the costs of continuous measurement (which we believe have been underestimated). Also, many other issues remain to be addressed, such as data quality, conceptual issues of using cumulated data, the relationship of continuous measurement to existing household surveys, and the costs and benefits of continuous measurement compared with other methods for obtaining more frequent small-area data. We do not believe that the needed research can be completed by the time decisions must be made about the content of the 2000 census.
The Census Bureau has proposed a schedule for the development of continuous measurement that would lead to a dress rehearsal in 1997, with a decision at the end of that year on whether to drop the long form and proceed to implement continuous measurement beginning in 1998. Given all of the unresolved issues, we believe that this schedule is wholly unrealistic and that continuous measurement should therefore be ruled out as a replacement for the long form in 2000.

Conclusion 6.4 The panel concludes that the work to date on continuous measurement has overestimated the savings from dropping the long form, understated the cost of a continuous measurement system, and not sufficiently examined feasible alternatives for meeting the nation's needs for more timely long-form-type data at reasonable overall cost. We conclude that it will not be possible to complete the needed research in time to make the critical decisions regarding the format of the 2000 census. We therefore do not recommend substituting continuous measurement for the long form in the 2000 census.

For the longer term, we support research on continuous measurement in the context of a broader program to evaluate alternatives for more frequent small-area data and how these alternatives could be integrated with existing household surveys. We encourage the Census Bureau, in cooperation with the other agencies of the federal statistical system, to undertake a comprehensive research program to evaluate the quality of the data that would be collected in continuous measurement; to design a new, modern, integrated system of household surveys; and to consider other sources of small-area estimates. This research, which has hardly begun, must be carried out before it is possible to effectively evaluate plans for a continuous measurement system.

Recommendation 6.2 The panel recommends that the Census Bureau broaden its research on alternatives for more frequent small-area data to encompass a wider range than continuous measurement, as currently envisaged. In that context, the Census Bureau should examine the cost-effectiveness of alternatives, the ways in which they meet user needs, and the manner in which continuous measurement or other alternatives could be integrated into the nation's system of household surveys. The research program should be carried out in cooperation with the federal statistical agencies that sponsor household surveys and should include evaluation of the quality of important data elements, the frequency and modes of data collection, and the manner in which results would be presented, as well as methods for introducing change over time.

CONCLUSIONS: CONTENT IN THE 2000 CENSUS

We have concluded that the long form is a cost-effective means to collect needed data for small areas and small population groups.
The long form provides valuable information at reasonable marginal costs (which are likely to decrease with the redesign of the census process) and with little adverse effect on completeness of census coverage. We have also concluded that there is no feasible alternative to including the long form in the 2000 census. This conclusion does not mean that there needs to be a long form as it existed in 1990: it may be possible, for example, to use matrix sampling to reduce the length of the form that is sent to any single household. We see no prospect that a continuous measurement system can reasonably substitute for the long form in 2000. Such a system may ultimately prove advantageous, particularly for improving the frequency with which long-form content is obtained. But considerable added research and development must be undertaken before its use to replace the long form can be seriously considered. There is no likelihood in our view that the necessary research can be completed in time to make decisions about the 2000 census. Another possible alternative to the long form, administrative records, is not satisfactory at this time because no single record system (or feasible combination of systems) contains the needed information.

Hence, we recommend that the 2000 census include a sample survey that obtains the content associated with the census long form. The data can be obtained without jeopardizing the basic census operation and are vitally needed for important public policy purposes.

Recommendation 6.3 The panel recommends that the 2000 census include a large sample survey that obtains the data historically gathered through a long form.

NOTES

1   The reason to use mail return rates (the proportion of forms mailed back from occupied housing units) rather than mail response rates (the proportion of forms mailed back from all housing units, including vacant units) is that the focus is on the effects of the long form on household behavior. Including vacant units in the denominator would overstate the extent to which the perceived burden of the long form induces households not to mail back their questionnaire. In contrast, the discussion of overall census costs and design in Chapter 5 used mail response rates because all units that need to be followed up, whether they turn out to be vacant or not, add costs.

2   In the 1970 census the mail return rate was 87.8 percent for the short form and 85.5 percent for the long form, a difference of 2.3 percentage points.
However, 1970 mail return rates are not strictly comparable with 1980 and 1990 rates because the 1970 mailout/mailback areas covered only 60 percent of the population compared with over 90 percent in 1980 and 1990 (see Appendix A).

3   The micro form with a request for Social Security number had lower response rates than the micro form without this request, especially in areas that had low response rates in the 1990 census.

4   The within-household nonmatch rates cited in the text are not the same as the within-household net undercount (because the nonmatch rates do not account for erroneous within-household enumerations, such as duplications).

5   Also, coverage errors in the census that involve missing whole households or structures because they are not in the address list cannot be attributed to the type of form.

6   The proposal is to test "nested" rather than matrix sampling as such; that is, the test will include a short form, an intermediate-length form with additional items, and a long form with all of the items on the intermediate-length form and some added items. However, it should be possible to use the results to simulate a matrix design that has two (or more) intermediate-length forms, each with some unique items as well as some items in common.

7   Such a design is distinct from a rolling census, which would collect both short-form and long-form information over the course of the decade and not include a contemporaneous once-a-decade enumeration of the entire population; see Chapter 4.

8   The Census Bureau is currently undertaking a program, at the request of Congress, to develop small-area intercensal poverty estimates; see Chapter 8.

9   Many of these issues are raised in Alexander's papers (1993, 1994b), and a number of important issues are reviewed in documents prepared by the Bureau of the Census (1988a, 1988b) that evaluate two somewhat different versions of a proposed integrated system of area statistics that share some features with the proposed continuous measurement system.

10   Alexander provided this estimate in communication with the panel. We were able to closely reproduce the estimate for the current proposed sample size of 250,000 housing units per month (see text) by using cost factors from Alexander (1994a: Attachment B), which provides detailed estimates for a continuous measurement survey with a somewhat smaller sample size of 233,000 housing units per month.

11   The cost calculation shifts from completed interviews (in the case of mail and telephone responses) to attempted interviews in the case of personal visits. The assumption is that attempted but unsuccessful personal interviews do not save money since most of the cost is associated with making contact.

12   Recent work by Shapiro et al. (1993) suggests that the difference between survey and census coverage is somewhat less pronounced when census overcounts are excluded from the comparison. Nonetheless, undercoverage in household surveys is significantly worse than in the census.

13   One interesting question in this regard is whether response to the continuous measurement survey could be made mandatory like the census. Such a requirement could improve coverage, although it would represent a departure from current practice, in which household surveys are voluntary.

14   Alexander (1994b:7) suggests that continuous measurement could make it possible to introduce corrective actions for such problems as poorer response in some areas (e.g., by assigning more effective interviewers or increasing the sampling rates in those areas).

15   Some existing surveys (e.g., the Current Population Survey) publish monthly estimates; others publish quarterly or annual estimates. Current estimates from major federal household surveys are limited, however, to states and major metropolitan areas. The Census Bureau's plans for the continuous measurement system call for quarterly processing of the data with estimates released 6 months after the end of a quarter. In addition, the Census Bureau plans to release annual data for all urban areas with 250,000 or greater population. However, user interest could lead to pressures to increase the frequency and timeliness of publication.