National Academies Press: OpenBook

The 2000 Census: Counting Under Adversity (2004)

Chapter: 7 Assessment of Basic and Long-Form-Sample Data

Suggested Citation:"7 Assessment of Basic and Long-Form-Sample Data." National Research Council. 2004. The 2000 Census: Counting Under Adversity. Washington, DC: The National Academies Press. doi: 10.17226/10907.

CHAPTER 7
Assessment of Basic and Long-Form-Sample Data

THE CONTENT OF THE 2000 CENSUS, as in past censuses, included basic demographic items plus a wide range of social and economic characteristics. The basic items (or complete-count items) were asked of everyone, whether they received the short or the long form; the additional items (or sample items) were asked of people selected for the long-form sample of about one-sixth of the population (see Appendix B for the list of items). The demographic items have widespread use, particularly as they form the basis of small-area population estimates that the Census Bureau develops for years following each census (see Section 2-C). The additional long-form-sample items on such topics as income, employment, education, occupation and industry, transportation to work, disabilities, housing costs, and others are used extensively by federal, state, and local government agencies, the private sector, academic researchers, the media, and the public (see Section 2-D).

Users need to understand the quality of the basic and the sample data to interpret census results appropriately. The Census Bureau needs to understand data quality to determine how best to improve census processes to produce high-quality information and how to inform users about its strengths and weaknesses. Past censuses provided a rich array of basic and long-form-sample data quality

measures from studies of nonresponse, exact matches with surveys and administrative records, content reinterviews with samples of respondents, and experiments to determine response effects of alternative questionnaire formats and wording (see, e.g., Bureau of the Census, 1964, 1970, 1975a,b, 1982a,b, 1983a,b, 1984). To date, data quality measures are somewhat sparser for the 2000 census.

The panel requested and received detailed tabulations of basic and long-form-sample item imputation rates for the 2000 and 1990 censuses and more limited information on item nonresponse in the Census 2000 Supplementary Survey (C2SS).1 (An analysis commissioned by the Census Bureau used these tabulations; see Schneider, 2003.) The panel also compared the consistency of basic characteristics for people in the census-based E-sample who matched cases in the independent P-sample of the 2000 Accuracy and Coverage Evaluation (A.C.E.) Program.2 A Content Reinterview Survey, conducted primarily by telephone in June–November 2000 of 20,000 long-form recipient households, provided indexes of inconsistency between census and survey responses for most questionnaire items for one randomly chosen member of each household. Unlike previous censuses, the 2000 Content Reinterview Survey did not try to measure systematic response biases by including probing questions to determine the most accurate response (see Singer and Ennis, 2003). A set of questionnaire experiments in 2000 examined forms design, listing of household members, and race and ethnicity questions (see Martin et al., 2003). At a later date, information on response variance and bias will be available from an exact match of long-form census records and the April 2000 Current Population Survey.

In this chapter, we briefly discuss the usefulness of three types of available 2000 census data quality measures: imputation rates, consistency measures, and variability and sample loss for the long-form sample (7-A). We then review available data quality measures

1 The C2SS surveyed 700,000 households, or 1.8 million people, by mail with computer-assisted telephone and in-person follow-up; it is a precursor to the planned American Community Survey (see Appendix I.3).

2 The 2000 P-sample surveyed 0.3 million households, or 0.7 million people, in about 11,000 block clusters, using computer-assisted telephone and in-person interviewing; the E-sample contained about the same number of households and people as the P-sample, drawn from the 2000 census records in the same block clusters (see Chapters 5 and 6).

for three population groups: all household members (7-B); household members in the long-form sample (7-C); and group quarters residents in the long-form sample (7-D). Each part concludes with a summary of findings and recommendations for 2010. Appendixes G and H describe the imputation and other data processing procedures that affect the basic and long-form-sample items, respectively. Appendix F reviews alternative item imputation procedures that may be more accurate than the current “hot-deck” procedures.

7–A AVAILABLE QUALITY MEASURES

7–A.1 Imputation Rates

The census enumeration will always have nonresponse: some households may not want to be found or are overlooked; respondents for participating households may not answer every question because they do not want to answer a particular item or do not know the answer; some responses are unintelligible and are voided in data processing. Since 1960, data processing for the complete-count census has used computer-based imputation for whole-household and item nonresponse (see Appendix C.5.d).3 The long-form sample, like other household surveys, has used imputation for missing items; it accounts for household nonresponse by weighting respondent cases. Imputation makes census data more useful because analysts do not have to discard cases with missing values.4 Imputation by the Census Bureau is also more efficient, and does more to promote consistent use of the data, than if each analyst were to develop his or her own imputation procedure.

However, imputation is a source of error. Because imputation commonly uses reported values, the distribution of values after imputation will be inaccurate to the extent that cases requiring imputation differ from cases for which there are responses, in ways that are not or cannot be made part of the imputation procedure. Furthermore, the relationships of two or more variables may be distorted

3 By “complete-count census,” we mean the 100 percent enumeration, including short forms and the basic-item responses on long forms.

4 One long-form-sample item first asked in 1980, namely ancestry, is not imputed; instead, a “not reported” category is tabulated, as was common practice for census items prior to 1960.

if imputation levels are high and imputation techniques do not take account of these relationships. Consequently, except when cases with reported and missing values are similar in their characteristics or auxiliary information is available with which to improve the accuracy of the imputations, higher missing data rates will indicate poorer data quality.
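As a rough illustration of the donor-based approach that this chapter calls “hot-deck” imputation, the following minimal Python sketch fills a missing value with the most recently seen reported value from a record in the same imputation class. The record layout, the choice of class variable, and the function name are hypothetical and chosen only for illustration; the Census Bureau's actual procedures (see Appendixes G and H) are considerably more elaborate.

```python
def sequential_hot_deck(records, item, class_key):
    """Fill missing values of `item` from the last reported value in the same class.

    `records` is a list of dicts in processing order; None means missing.
    Returns copies of the records plus a parallel list of flags that mark
    which values were imputed. (Hypothetical layout, for illustration only.)
    """
    last_donor = {}  # most recent reported value seen for each class
    filled, flags = [], []
    for rec in records:
        rec = dict(rec)  # copy so the input is not modified
        cls = rec[class_key]
        if rec[item] is not None:
            last_donor[cls] = rec[item]  # reported value becomes the new donor
            flags.append(False)
        elif cls in last_donor:
            rec[item] = last_donor[cls]  # borrow the donor's value
            flags.append(True)
        else:
            flags.append(False)  # no donor seen yet: leave the value missing
        filled.append(rec)
    return filled, flags


records = [
    {"block": "A", "tenure": "own"},
    {"block": "A", "tenure": None},
    {"block": "B", "tenure": "rent"},
    {"block": "B", "tenure": None},
]
filled, flags = sequential_hot_deck(records, "tenure", "block")
```

Because every imputed value is copied from a respondent in the same class, the sketch makes the concern in the text concrete: if nonrespondents differ from respondents in ways the class variable does not capture, the filled-in distribution simply inherits the respondents' pattern.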

For the 2000 census and the C2SS, codes on the data records distinguish imputations based on the use of another person’s or household’s information (“allocations”) from “assignments” based on known information for the specific record. For example, first name was used to assign values to a large fraction of the records with missing sex;5 answers to the race question were used to assign values to some cases with missing Hispanic origin; and answers to questions on housing costs were used to assign values to a large fraction of long-form cases missing housing tenure. Codes on the 1990 census records did not distinguish between imputations and assignments, so our tables for the 2000 census sometimes show both types of rates—the imputation/assignment rate is comparable to 1990 and indicates nonresponse;6 the imputation rate per se indicates the fraction of cases that required a donor record to supply missing values.
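The distinction between the two rates can be sketched in a few lines of Python. The flag values below are hypothetical stand-ins for the codes carried on the data records; only the arithmetic follows the text: the combined imputation/assignment rate counts both allocations and assignments, while the imputation rate per se counts allocations only.

```python
from collections import Counter

# Hypothetical edit-flag values for one item on each person record.
REPORTED, ASSIGNED, ALLOCATED = "reported", "assigned", "allocated"


def item_rates(flags):
    """Return (imputation/assignment rate, imputation rate) in percent."""
    counts = Counter(flags)
    n = len(flags)
    # Combined rate: comparable to 1990, an indicator of nonresponse.
    combined_rate = 100.0 * (counts[ALLOCATED] + counts[ASSIGNED]) / n
    # Allocation-only rate: cases that needed a donor record.
    allocation_rate = 100.0 * counts[ALLOCATED] / n
    return combined_rate, allocation_rate


flags = [REPORTED] * 95 + [ASSIGNED] * 2 + [ALLOCATED] * 3
combined, allocation_only = item_rates(flags)
```

In this made-up example, 5 percent of records lacked a usable response, but only 3 percent required a donor.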

7–A.2 Consistency Measures

Comparing the consistency of responses to the same question in two or more data sources can help identify possible reporting biases, although it is often not possible to say which source is more accurate. Consistency measures can also indicate response variability if responses tend to differ according to such factors as data collection mode, question format, and who answered for the household.

7–A.3 Sample Loss and Variability (Long Form)

Estimates from the long-form sample, like other surveys, are subject to variability from sampling and also from unit nonresponse

5 This type of assignment was not possible in the 1990 census, which did not capture names except for records in the PES E-sample.

6 The imputation/assignment rate is not exactly the same as a nonresponse rate because reported values that are inconsistent with other reported values may be blanked and another value imputed or assigned.

that further reduces the effective sample size. The long-form records for respondents are weighted to agree with complete-count totals (see Appendix H.2). This weighting effectively adjusts for sampling rates, instances of whole-household nonresponse, and additional sample loss due to some households having provided only minimal data. The 1990 and 2000 long-form sampling probabilities varied, by design, from 1 in 2 households to 1 in 8 households, depending on the population size and type of geographic area. To receive a nonzero weight, households had to include at least one member who reported at least two basic items and two long-form items. The measures of variability, or variance, that are constructed for the long-form sample take account of the weighting (see Appendix H.2), but not of the variability introduced by item imputation.
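The eligibility rule and the idea of weighting to complete-count totals can be sketched as follows. This is a deliberately simplified illustration: it applies a single ratio adjustment to one hypothetical area, whereas the actual long-form weighting (see Appendix H.2) is more elaborate. The field names `basic_reported` and `sample_reported` and both function names are assumptions made for the example.

```python
def is_sample_data_defined(household, min_basic=2, min_sample=2):
    """True if at least one member reported >= 2 basic and >= 2 long-form items."""
    return any(
        m["basic_reported"] >= min_basic and m["sample_reported"] >= min_sample
        for m in household["members"]
    )


def ratio_weights(households, complete_count_persons):
    """Give each qualifying household one common weight so that weighted
    persons sum to the complete-count total; other households get zero."""
    defined = [is_sample_data_defined(h) for h in households]
    sample_persons = sum(
        len(h["members"]) for h, ok in zip(households, defined) if ok
    )
    if sample_persons == 0:
        raise ValueError("no sample-data-defined households to carry the weight")
    weight = complete_count_persons / sample_persons
    return [weight if ok else 0.0 for ok in defined]


households = [
    {"members": [{"basic_reported": 5, "sample_reported": 20},
                 {"basic_reported": 1, "sample_reported": 0}]},
    {"members": [{"basic_reported": 5, "sample_reported": 1}]},  # minimal data only
    {"members": [{"basic_reported": 2, "sample_reported": 2},
                 {"basic_reported": 4, "sample_reported": 9}]},
]
weights = ratio_weights(households, complete_count_persons=24)
```

Here the second household supplies too little long-form data to qualify, so the four people in the two qualifying households carry the full complete-count total of 24. This is how unit nonresponse and minimal-data sample loss reduce the effective sample size.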

7–B QUALITY OF BASIC DEMOGRAPHIC CHARACTERISTICS

The basic demographic characteristics in 2000, asked on both the short and long forms, were age; month, day, and year of birth; sex; ethnicity (Hispanic origin); race; relationship to household reference person (first person listed on the questionnaire); and housing tenure (own or rent). The 1990 census included additional basic items (see Appendix B).

7–B.1 Imputation Rates for Complete-Count Basic Items

Table 7.1 provides imputation rates for the basic demographic items from the 2000 and 1990 census complete counts (separately for short and long forms) for total household members, people in households headed by blacks, and people in households headed by Hispanics.7 These rates include cases of whole-household imputation in addition to individual item imputation.

In 2000 the combined whole-household and item imputation rates for all household members ranged from 2.3 percent for sex to 5.4 percent for ethnicity (Hispanic origin). The rates for long-form recipients were higher, ranging from 3 percent for sex to 9.3 percent for housing tenure. The rates for members of minority households

7 Complete-count data were made available in the P.L. 94-171 file for redistricting and in Summary Files 1 and 2 (see Box 2.1).

Table 7.1 Basic Item Imputation Rates, 2000 and 1990 Complete-Count Census, by Type of Form and Race/Ethnicity, Household Population

Population Group and Form Type  Age(a)   Sex    Race   Ethnicity (Hispanic Origin)   Relationship to Head   Tenure

2000 Census
Total Household Population      4.8      2.3    5.1    5.4                           3.4                    4.8
  Short-Form Recipients         4.6      2.1    5.1    5.3                           3.2                    3.9
  Long-Form Recipients          6.0      3.0    5.5    6.1                           4.4                    9.3
Black Non-Hispanic              7.6      3.9    4.9    10.5                          5.6                    7.3
  Short-Form Recipients         7.1      3.5    4.6    10.3                          5.2                    6.0
  Long-Form Recipients          10.2     5.8    7.0    11.6                          7.9                    14.5
Hispanic                        7.5      4.2    17.2   6.5                           6.3                    5.9
  Short-Form Recipients         7.4      4.1    17.4   6.4                           6.0                    5.2
  Long-Form Recipients          8.4      4.9    15.9   7.6                           7.5                    10.3

1990 Census
Total Household Population      3.1      1.9    2.6    10.5                          3.3                    3.1
  Short-Form Recipients         3.0      1.9    2.8    11.7                          3.4                    3.1
  Long-Form Recipients          3.5      1.7    2.1    4.1                           2.9                    2.7
Black Non-Hispanic              5.6      3.6    3.5    18.4                          6.1                    5.4
  Short-Form Recipients         5.4      3.6    3.5    20.3                          6.2                    5.6
  Long-Form Recipients          6.6      3.1    3.1    7.8                           5.5                    4.2
Hispanic                        3.9      2.6    9.3    7.8                           5.9                    4.3
  Short-Form Recipients         3.8      2.7    10.0   8.3                           6.1                    4.5
  Long-Form Recipients          4.6      2.4    5.5    4.8                           5.0                    3.1

NOTES: Rates include whole-household imputations (types 2–5; see Box 4.2); wholly imputed persons (type 1); and item imputations. Race and ethnicity population groups are defined by the response of the household reference person.

Household population totals for 2000: 273.6 million total (83.4 percent on short forms, 16.6 percent on long forms); 32.4 million people in households headed by non-Hispanic blacks (84.8 percent on short forms, 15.2 percent on long forms); 33.4 million people in households headed by Hispanics (85.4 percent on short forms, 14.6 percent on long forms).

Household population totals for 1990: 242 million total (83.1 percent on short forms, 16.9 percent on long forms); 28.0 million people in households headed by non-Hispanic blacks (84.8 percent on short forms, 15.2 percent on long forms); 21.2 million people in households headed by Hispanics (85.4 percent on short forms, 14.6 percent on long forms).

a Excludes imputation of age from date of birth.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Household Census Edited Files (HCEF), provided to the panel spring 2003.

were also higher. Of particular note are the response patterns to the ethnicity and race questions: while 5 percent of total household members did not respond to these items, 11 percent of members of households headed by blacks did not respond to the ethnicity question, and 17 percent of members of households headed by Hispanics did not respond to the race question, perhaps believing it did not apply. For the 10 percent of census tract neighborhoods with the highest percentages of basic item imputations, the imputation rates were one-half to three times higher than the total population rates shown (data not shown).

In 1990 the combined whole-household and item imputation rates for household members were generally below the corresponding 2000 rates (Table 7.1). A reason for generally lower basic item imputation rates in 1990 compared with 2000 was that the 1990 census used telephone and field follow-up for missing or inconsistent data content, but the 2000 census did not. The exception was the ethnicity item on the short form, for which imputation rates in 1990 were twice as high as the 2000 rates (except for Hispanics). A reason for higher short-form imputation rates for ethnicity in 1990 than in 2000 was that the content review and follow-up procedures for mailed-back short forms in 1990 were trimmed back for budgetary reasons, so that only a one-tenth sample of short forms were reviewed and sent to follow-up if necessary (see Appendix C.3.c). Reordering the race and ethnicity items so that ethnicity came before race in 2000 (and not after as in 1990) also contributed to lower item imputation rates for ethnicity in 2000 compared with 1990.

In 2000 about 1.3 percentage points of the total household population imputation rates shown in Table 7.1 were due to whole-household imputations (types 2–5—see Box 4.2). For these cases data from neighboring households were used to supply complete records of basic information for members of households for which information on members’ characteristics, and sometimes household size, was missing. Whole-household imputations contributed 1.2 percentage points to the imputation rates for short-form records and 1.8 percentage points to the imputation rates for long-form records. Whole-household imputation rates were highest for enumerator long forms (5.2 percent), followed by enumerator short forms (4.7 percent), with self-response forms by mail, telephone, Internet, or the Be Counted program including very few whole-household

imputations (less than 0.1 percent).8 In 1990 whole-household imputations contributed 0.7 percentage point to the total household population imputation rates shown.

In 2000 about 0.9 percentage point of the total household population imputation rates shown in Table 7.1 was due to wholly imputed persons (type 1—see Box 4.2 and Table 4.1), which occurred when there was not room to report basic characteristics for all household members on the questionnaire. The corresponding figure for 1990 was 0.2 percentage point. Imputations for missing members of enumerated households were made item by item, using information about the other household members to construct a reasonable household composition (e.g., imputing race and ethnicity to be consistent with other household members).

Imputation rates for some basic items would have been higher in 2000 than shown in Table 7.1 if the 2000 imputation procedures had not been able to take advantage of names and other information for assigning rather than imputing missing values (see Section 7-A.1). For example, first names were used to assign sex for about 1 percent of household members (data not shown); if this procedure had not been feasible, the imputation rate for sex for total household members in 2000 would have been 3.3 percent, not 2.3 percent as in Table 7.1.

Basic item imputation rates from the complete count for the nation as a whole and for large population groups were reasonably low for the most part, but some small geographic areas and population groups required much more imputation, which users should consider in their analyses. For example, imputation rates for race at the county level reached as high as 17 percent, while imputation rates for ethnicity at the county level reached as high as 35 percent (see Section 8-C.2; see also Appendix H.3.b).

7–B.2 Missing Data Patterns for Basic Items

An analysis by Zajac (2003) provides information on patterns of missing responses in 2000 census records; that is, percentages of person records that are missing one, two, or three or more of the

8 From tabulations by U.S. Census Bureau staff of the 2000 and 1990 Household Census Edited Files, provided to the panel in spring 2003.

five basic person items. We also computed these statistics for the 2000 census records in the A.C.E. E-sample and for the records in the independent P-sample. Table 7.2 provides these data from all three sources; the E-sample percentages exclude whole-household and whole-person imputations, as well as reinstated cases from the special Master Address File (MAF) unduplication operation, which could not be matched to P-sample cases.

More P-sample persons answered all five items—95 percent—than did census persons—87 percent (the corresponding figure for the E-sample—data not shown—is 89 percent). This result is not surprising because the interviewing for the P-sample was more carefully controlled than was the census enumeration. However, the census and the P-sample were much closer in the percentage of respondents who answered at least four items (97.6 percent P-sample, 96.1 percent 2000 census). The most commonly omitted basic items in the census were age and ethnicity (data not shown).

Rates of answering all five basic person items in the census varied by whether the household responded for itself or answered to an enumerator (Table 7.2, panel A). Members of self-responding households (mail, Internet, telephone, Be Counted) were more likely to answer all five items (90 percent) than were members of households visited by enumerators (79 percent). By the race/ethnicity and housing composition of the A.C.E. block cluster (Table 7.2, panel B), household members living in white and some other race owner and renter block clusters were most likely to answer all five questions (92 and 89 percent, respectively); household members living in Hispanic renter block clusters were least likely to answer all five questions (77 percent). These data are from the A.C.E. E-sample and underestimate the extent of nonresponse in the census; in contrast, the P-sample achieved a high level of reporting of all five basic person items for all neighborhood types—92 to 95 percent.
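A tabulation of this kind can be sketched in Python as follows. The sketch assumes that the five basic person items are age, sex, race, ethnicity, and household relationship (housing tenure being a household-level item), and it uses a hypothetical record layout in which a missing response is stored as None; neither detail is taken from Zajac (2003).

```python
# Assumed list of the five basic person items (tenure is a housing item).
BASIC_PERSON_ITEMS = ("age", "sex", "race", "ethnicity", "relationship")


def reporting_distribution(person_records):
    """Percentage of person records by number of basic items reported."""
    labels = {5: "all five", 4: "four", 3: "three"}
    counts = {"all five": 0, "four": 0, "three": 0, "two or fewer": 0}
    for rec in person_records:
        # Count the items with a non-missing value on this record.
        n_reported = sum(rec.get(item) is not None for item in BASIC_PERSON_ITEMS)
        counts[labels.get(n_reported, "two or fewer")] += 1
    total = len(person_records)
    return {label: 100.0 * n / total for label, n in counts.items()}


complete = {"age": 34, "sex": "F", "race": "white",
            "ethnicity": "not Hispanic", "relationship": "reference person"}
missing_one = dict(complete, ethnicity=None)
mostly_missing = {"sex": "M"}

records = [complete] * 7 + [missing_one] * 2 + [mostly_missing]
distribution = reporting_distribution(records)
```

Running the same function separately on self-enumerated and enumerator-completed records, or by block-cluster type, would give breakdowns of the kind shown in the two panels of Table 7.2.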

7–B.3 Consistency of Responses to Basic Items

Comparing census cases in the E-sample that matched P-sample cases revealed low rates of inconsistent reporting of basic items for the household population as a whole. Thus, 4.7 percent of matched cases (unweighted) had conflicting values for housing tenure; 5.1

Table 7.2 Percentage of Household Members Reporting Basic Items, 2000 Census, 2000 A.C.E. E-Sample and Independent P-Sample (weighted)

PANEL A: Number of Items Reported

Sample                      All Five   Four    Three   Two or Less
Census Total(a)             87.0       9.1     1.2     2.8
  Self Enumerations         89.8       7.9     1.1     1.1
  Interviewer Enumerations  79.4       11.9    1.6     7.1
P-Sample Total              94.9       2.7     0.8     1.6

PANEL B: Household Members Reporting All Five Items by Neighborhood Type(b)

Neighborhood Type                             E-Sample(c)   P-Sample
American Indian and Alaska Native
  Owner                                       91.7          94.9
  Renter                                      85.4          93.2
Hispanic
  Owner                                       80.6          94.5
  Renter                                      77.2          94.1
Black
  Owner                                       85.0          94.5
  Renter                                      81.7          93.8
Native Hawaiian and Other Pacific Islander
  Owner                                       84.7          94.0
  Renter                                      85.4          92.2
Asian
  Owner                                       86.7          93.9
  Renter                                      80.4          93.9
White and Other
  Owner                                       91.6          95.4
  Renter                                      88.9          94.2

a Census percentages reporting two or fewer items include people in wholly imputed households (imputation types 2–5, see Box 4.2) and wholly imputed people (type 1); census percentages for interviewer enumerations include all people in households that were contacted in the coverage edit and follow-up operation of mail returns to obtain basic characteristics for missing household members.

b Neighborhood (A.C.E. block cluster) type determined by Census Bureau staff from 1990 characteristics (A.C.E. block clusters were defined as one or more contiguous blocks, intended to contain about 30 housing units on average—see Appendix E.1.a).

c The E-sample excludes whole-household and whole-person imputations and reinstated records due to the special summer 2000 unduplication operation (see Section 4-E).

SOURCE: For 2000 census, Zajac (2003:Table 34), adjusted to include people in wholly-imputed households; for E-sample and P-sample, tabulations by panel staff of P-Sample and E-Sample Dual-System Estimation Output Files (U.S. Census Bureau, 2001b), provided to the panel February 16, 2001, weighted using TESFINWT.

percent had conflicting values for age and sex group; and 3.9 percent had conflicting values for race/ethnicity domain (Farber, 2001a:Table 1). Reasons for conflicting values could include reporting error in one or both samples, differences in question format and mode and time of collection, different household respondents for the census enumeration and P-sample interview, different imputation methods, errors in imputation, and errors in matching.

Rates of inconsistency were higher for matched cases for which the characteristic in question was imputed in one or both samples than for nonimputed cases: thus, 12 percent of cases with imputed race or ethnicity, 22 percent of cases with imputed housing tenure, and 36 percent of cases with imputed age or sex were inconsistent between the E-sample and P-sample, compared with 3, 4, and 3 percent, respectively, of nonimputed cases (Farber, 2001a:Table 1). This result indicates that, at an individual level, imputations were often not accurate, which could be consequential for analyses of public use microdata samples and for small geographic areas. However, because overall imputation rates were low, the effects of imputation error were not large for the household population as a whole. Overall distributions by age and sex group, housing tenure, and race/ethnicity domain remained very similar in both the E-sample and P-sample (Farber, 2001a:Tables A-1, A-2, A-3).
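The comparison described above amounts to a disagreement rate computed separately for imputed and nonimputed matched cases. A minimal Python sketch follows, using made-up data and a hypothetical tuple layout (census value, survey value, and whether the item was imputed in either source); it does not reproduce the tabulations in Farber (2001a).

```python
def inconsistency_rates(matched_pairs):
    """Percent of matched cases whose two values conflict, by imputation status.

    Each element of `matched_pairs` is (census_value, survey_value, imputed),
    where `imputed` is True if the item was imputed in either source.
    """
    conflicts = {True: 0, False: 0}
    cases = {True: 0, False: 0}
    for census_value, survey_value, imputed in matched_pairs:
        cases[imputed] += 1
        if census_value != survey_value:
            conflicts[imputed] += 1

    def pct(num, den):
        return 100.0 * num / den if den else None

    return {
        "imputed": pct(conflicts[True], cases[True]),
        "nonimputed": pct(conflicts[False], cases[False]),
        "overall": pct(sum(conflicts.values()), sum(cases.values())),
    }


# Made-up example: 4 imputed cases (1 conflict), 16 nonimputed cases (1 conflict).
pairs = (
    [("own", "rent", True)] + [("own", "own", True)] * 3
    + [("rent", "own", False)] + [("rent", "rent", False)] * 15
)
rates = inconsistency_rates(pairs)
```

In the made-up data the imputed cases disagree four times as often as the nonimputed ones, yet the overall rate stays close to the nonimputed rate because imputed cases are a small share of the total. This is the same pattern the text reports for the household population as a whole.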

When population groups were defined by multiple characteristics, instead of a single variable, then very high rates of inconsistency often occurred, particularly for imputed cases (see Farber, 2001a:Tables E-1 through E-64). As a best-case example, people who were classified in one or both samples as non-Hispanic white owners in medium-sized mailout/mailback areas with high response rates in the Midwest were classified inconsistently in another poststratum 10 percent of the time overall, 6 percent for nonimputed cases, and 41 percent for imputed cases, which accounted for only 11 percent of this group. As a worst-case example, people who were classified in one or both samples as American Indians and Alaska Natives off reservations were classified inconsistently in another poststratum 59 and 57 percent of the time overall for owners and renters, respectively; 55 and 52 percent for nonimputed owner and renter cases, respectively; and 74 and 77 percent for imputed owner and renter cases, respectively. Imputations accounted for 21 percent of each of these two groups (owners and renters).

7–B.4 Basic Item Imputation and Inconsistency Rates: Summary of Findings

In general the 2000 census complete count exhibited reasonably good data quality as measured by low imputation rates and low rates of inconsistency for individual items (less than 5 percent). However, there was considerable variability in the completeness of reporting (answering all or most items) by type of neighborhood (in contrast to the P-sample, which uniformly obtained a high level of complete reporting). There was also evidence of reporting problems for particular items, such as high rates of nonresponse to the race item by members of Hispanic-headed households and high rates of nonresponse to the ethnicity item by members of black non-Hispanic-headed households. In general, as we discuss in Chapter 8, responses to race and ethnicity items are sensitive to minor variations in question placement, format, and wording.

Comparisons of matched E-sample and P-sample cases indicated that rates of inconsistency were much higher for imputed cases than for nonimputed cases. Rates of inconsistency were also much higher for some groups than others. However, aggregate distributions were very much the same. Not answerable by any of the available evidence is whether there were systematic reporting errors for any of the basic items or whether imputation introduced bias, although further analysis of matched E-sample and P-sample cases could be helpful in this regard.

The census had higher missing data rates for basic items in 2000 than in 1990. Not known is whether the field follow-up to reduce missing data rates in 1990 obtained less accurate responses than the heavier reliance on computer-based imputation in 2000. We repeat that the Census Bureau’s 2010 testing program should include tests of the trade-offs in costs and accuracy between computer imputation and additional field work for missing data (see Recommendation 4.2).

Finding 7.1: Rates of missing data in 2000 were low at the national level for the basic demographic items asked of everyone (complete-count items)—age, sex, race, ethnicity, household relationship, and housing tenure. Missing data rates for these items ranged from 2 to 5 percent (including records for people with one or more missing

items and people who were wholly imputed). Rates of inconsistent reporting for the basic items (as measured by comparing responses for census enumerations and matching households in the independent Accuracy and Coverage Evaluation survey) were also low. However, some population groups and geographic areas exhibited high rates of missing data and inconsistent reporting for one or more of the basic items. No assessments have yet been made of reporting errors for such items as age, nor of the effects of imputation on the distributions of basic characteristics or the relationships among them.

7–C QUALITY OF BASIC AND ADDITIONAL LONG-FORM DATA

Data from the census long-form sample are eagerly awaited and widely used throughout government, academia, and the private sector. They are made available in Summary Files 3 and 4 and in public use microdata samples, which contain individual records for a sample of long-form cases, processed to protect individual confidentiality (see Box 2.1). For 2010 the current design calls for a short-form-only census, with long-form-type data collected in the new, continuously fielded American Community Survey (see Section 3-D).

7–C.1 Imputation Rates for Basic Items in the Long-Form Sample

Table 7.3 shows household member weighted imputation rates for the basic demographic items from the 2000 census long-form sample and three other surveys: (1) the Census 2000 Supplementary Survey; (2) the 2000 P-sample survey; and (3) the 1990 census long-form sample. Note that the 2000 and 1990 census long-form-sample rates are lower than the complete-count rates shown in Table 7.1. The reason is that whole-household imputations and members of nonsample-data-defined households were not given a weight in the long-form sample; instead, weighting was used to adjust for their nonresponse. (Sample-data-defined households in 2000 and 1990 were households with at least one member who reported at least 2

basic items and 2 sample items.) Consequently, a full understanding of the completeness of response in the long-form sample requires examining not only item imputation rates but also weighting adjustments (see Section 7-C.5).

Generally, imputation rates for basic items were lowest for the P-sample and for the 1990 census long-form sample, whether on self returns or enumerator returns. Imputation rates, even excluding assignments, were highest (although still below 5 percent, except for housing tenure) for the 2000 census long-form sample; again, there was not much difference between self and enumerator returns. Housing tenure had a much higher rate of nonresponse in the 2000 long-form sample (8 percent) than in the 1990 long-form sample (1 percent), which may have been due to the placement of the question (see Section 7-C.2). Imputation rates in the 2000 long-form sample for basic items by race and ethnicity and by type of geographic area (e.g., central city, rural) showed relatively little variation; imputation rates for census tracts with the highest percentage of basic item imputations were about twice the overall rates (see Appendix H).

7–C.2 Imputation Rates for Additional Long-Form Items

Table 7.4 compares nonresponse rates for the 2000 and 1990 long-form samples for 15 person items and 12 housing items that were asked solely of the long-form sample, by type of return (self, enumerator).9 The items are listed in order of their appearance on the 2000 long-form questionnaire. With the exception of the items for English-speaking ability and year structure built, the 2000 long-form sample exhibited higher nonresponse rates than occurred in 1990. For some items, the differences were dramatic: for example, the 2000 and 1990 nonresponse rates were 16 and 9 percent, respectively, for occupation and 16 and 1 percent, respectively, for monthly rent (which was a complete-count item in 1990). For income, 30 percent of long-form cases in 2000 failed to report some or all income items, compared with only 13 percent of long-form cases in 1990.10 The much higher nonresponse rates for most housing items in 2000 compared with 1990 could in part have resulted from the different placement of the housing items on the two long forms: in 2000, the housing items followed all of the basic and additional items for the first person; in 1990, the housing items followed all of the basic items for all household members and were followed in turn by the additional items for all household members.

9 See Appendix H (Tables H.6 and H.7) for comparable tabulations for all person and housing long-form items.

10 Denominators of imputation rates for additional long-form items use the appropriate universe for the question.

Table 7.3 Basic Item Imputation Rates, 2000 and 1990 Census Long-Form Sample, Census 2000 Supplementary Survey, and 2000 P-Sample, by Type of Rate and Form, Household Population (weighted)

| Population Group and Form Type | Age (a) | Sex | Race | Ethnicity (Hispanic Origin) | Relationship to Head | Housing Tenure |
|---|---|---|---|---|---|---|
| 2000 Census Long-Form Sample | | | | | | |
| Imputations/Assignments | 2.6 | 1.6 | 3.2 | 4.0 | 2.7 | 8.0 |
| Self Responses | 1.9 | 1.5 | 3.7 | 4.7 | 2.6 | 7.6 |
| Enumerator Responses | 4.3 | 2.0 | 2.0 | 2.4 | 3.0 | 8.9 |
| Imputations Only | 2.6 | 0.9 | 3.2 | 3.6 | 2.3 | 4.3 |
| Census 2000 Supplementary Survey | | | | | | |
| Imputations Only | 2.4 | 0.5 | 2.4 | 3.6 | 1.6 | 1.4 |
| Self Responses | 3.6 | 0.7 | 3.4 | 5.7 | 0.7 | 1.6 |
| Enumerator Responses | 0.8 | 0.2 | 1.1 | 0.7 | 1.2 | 1.0 |
| 2000 Accuracy and Coverage Evaluation P-Sample | | | | | | |
| Imputations (b) | 2.3 | 1.6 | 1.2 | 2.2 | 1.7 (c) | 1.9 |
| 1990 Census Long-Form Sample | | | | | | |
| Imputations (b) | 0.9 | 0.8 | 1.1 | 3.4 | 1.9 | 1.4 |
| Self Responses | 0.8 | 0.9 | 1.0 | 4.0 | 1.7 | 1.4 |
| Enumerator Responses | 1.2 | 0.7 | 1.6 | 1.7 | 2.4 | 1.4 |

NOTES: Imputation/assignment rates (percents) are included for comparability with 1990 imputation rates; imputation/assignment rates include item imputations plus assignments of values for missing or inconsistent responses based on known information for the specific record. —, not available. Census self responses include mail, telephone, Internet (2000 only), and Be Counted returns; enumerator responses include forms obtained in nonresponse follow-up, list/enumerate, and other field operations.

(a) Excludes imputation of age from date of birth.

(b) Includes “assignments.”

(c) Relationship was not imputed in the P-sample; the rate reported is the missing data rate.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Sample Census Edited Files (SCEF), and the C2SS file, provided to the panel spring 2003. P-sample tabulations by panel staff from P-Sample Dual-System Estimation Output File (U.S. Census Bureau, 2001b), provided to the panel February 16, 2001, weighted using TESFINWT.

By type of return, nonresponse rates for most long-form items were higher for enumerator returns than for self-returns.11 For some items, the differences were minor, but for others they were large. For example, in 1990, 21 percent of people on enumerator returns failed to answer some or all of the income items, compared with 11 percent of people on self returns; in 2000, 40 percent of people on enumerator returns failed to answer some or all of the income items, compared with 26 percent of people on self returns—a disturbingly high percentage in itself. By race and ethnicity (defined for the household reference person; see Appendix H, Table H.2), 40 percent of non-Hispanic blacks and 34 percent of Hispanics failed to answer some or all of the income questions in 2000, compared with 25 percent of Asians and 28 percent of non-Hispanic whites.

Table 7.5 shows imputation rates (excluding assignments) for the same 15 person items and 12 housing items for the 2000 census long-form sample and the Census 2000 Supplementary Survey, by type of return. For every item shown except year structure built, imputation rates were higher for the 2000 long-form sample compared with the C2SS. The differences in favor of the C2SS ranged from 0.4 percentage points for marital status to 11.2 percentage points for property taxes. For the 2000 long-form sample, for most person items, imputation rates for enumerator returns exceeded those for self returns, sometimes by large margins. In contrast, for the C2SS, most person-item imputation rates were lower for enumerator returns than for self returns. The exception for the person items shown was income, for which enumerator returns had higher imputation rates than self returns for both the 2000 long-form sample and the C2SS, although the difference was smaller in the C2SS. For most housing items, imputation rates for enumerator returns were higher than the rates for self returns for both surveys.

11 Percentages of self returns in the 2000 and 1990 long-form samples (76 and 73 percent, respectively) are not the same as final mail return rates, which were 71 percent in both years for the long form. One reason is that proportionately more enumerator returns than mail returns were dropped from the long-form sample because they were not sample-data-defined, so their nonresponse was handled by weighting instead of imputation (see Section 7-C.5).

Table 7.4 Imputation/Assignment Rates for Selected Long-Form Items, 2000 and 1990 Census Long-Form Samples, by Type of Response, Household Population (weighted)

| Item | 2000 Long Form: Total | 2000 Long Form: Self | 2000 Long Form: Enumerator | 1990 Long Form: Total | 1990 Long Form: Self | 1990 Long Form: Enumerator |
|---|---|---|---|---|---|---|
| Marital Status | 3.4 | 2.3 | 6.2 | 0.9 | 0.8 | 0.9 |
| Educational Attainment | 7.2 | 5.2 | 12.0 | 4.5 | 3.8 | 6.1 |
| English-Speaking Ability | 7.6 | 7.3 | 7.9 | 8.5 | 8.3 | 9.0 |
| Place of Birth | 9.2 | 7.8 | 12.5 | 5.1 | 4.3 | 7.1 |
| Residence 5 Years Ago | 8.6 | 7.4 | 11.6 | 5.2 | 3.6 | 9.3 |
| Mobility Disability | 10.0 | 10.5 | 8.6 | 5.1 | 4.7 | 6.4 |
| Work Disability (a) | 11.4 | 12.2 | 9.3 | 7.4 | 7.4 | 7.6 |
| Veteran Status | 8.2 | 6.8 | 11.8 | 4.8 | 4.0 | 7.0 |
| Employment Status Recode | 11.1 | 10.2 | 13.4 | 3.8 | 3.0 | 6.2 |
| Place of Work (State) | 9.7 | 7.3 | 15.5 | 7.2 | 6.2 | 9.7 |
| Transportation to Work | 8.2 | 6.0 | 13.3 | 4.6 | 3.7 | 7.3 |
| Occupation Last Year | 16.1 | 14.3 | 20.4 | 9.1 | 7.9 | 12.5 |
| Weeks Worked Last Year | 20.2 | 19.7 | 21.3 | 14.7 | 14.6 | 15.0 |
| Wage and Salary Income | 20.0 | 15.0 | 32.6 | 10.0 | 7.7 | 16.3 |
| Income, All Sources | | | | | | |
| 100 Percent Imputed | 24.5 | 18.9 | 38.5 | 11.7 | 9.1 | 19.1 |
| Some Imputed (b) | 29.7 | 25.5 | 40.3 | 13.4 | 10.9 | 20.5 |
| Units in Structure (c) | 4.4 | 4.9 | 3.0 | 1.6 | 1.8 | 1.2 |
| Year Structure Built | 11.7 | 9.3 | 18.0 | 23.0 | 20.2 | 30.7 |
| Number of Rooms (c) | 6.2 | 6.2 | 6.4 | 0.4 | 0.4 | 0.5 |
| Complete Plumbing | 3.4 | 3.5 | 3.1 | 1.7 | 1.7 | 1.8 |
| Complete Kitchen | 3.4 | 3.5 | 3.1 | 1.8 | 1.8 | 1.8 |
| Fuel Used for Heating | 7.4 | 6.3 | 10.1 | 2.9 | 2.7 | 3.4 |
| Annual Electric Cost | 18.5 | 15.3 | 26.9 | 5.5 | 4.4 | 8.5 |
| Monthly Rent (c) | 15.6 | 13.2 | 19.2 | 1.3 | 1.1 | 1.6 |
| Property Taxes | 32.0 | 27.0 | 49.6 | 12.2 | 10.3 | 19.4 |
| Value of Property (c) | 13.3 | 12.3 | 16.6 | 3.3 | 3.3 | 3.4 |

NOTES: 2000 rates (percents) are imputation/assignment rates for comparability, as a measure of nonresponse, with 1990 imputation rates (see notes to Table 7.3). Self responses (76 percent of 2000 long-form sample; 73 percent of 1990 long-form sample) included mail, telephone, Internet (2000 only), and Be Counted returns; enumerator responses included forms obtained in nonresponse follow-up, list/enumerate, and other field operations. —, not available.

(a) In 1990, “work disability” refers to a disability that prevents working; in 2000, the term refers to a disability that makes it difficult to work.

(b) Includes 100 percent of income imputed.

(c) Basic (complete-count) item in 1990.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Sample Census Edited Files (SCEF), provided to the panel spring 2003.

At least three factors help explain why the C2SS could be expected to achieve lower imputation rates than the long-form sample and, in particular, why the C2SS interviewers could be expected to outperform the census enumerators. First, the C2SS interviewers were more experienced and much better trained than the temporary census staff, which meant that they were better able to obtain responses from reluctant respondents. Second, the C2SS interviews used computer-assisted telephone and in-person interviewing. (By design the C2SS used sampling for households that did not mail back a return and did not respond during the initial telephone-follow-up phase; the personal interview sampling rate was 1 in 3 of nonresponding households.) Third, the goal for the C2SS was to collect all of the information. In contrast, the essential goal for the 2000 census was to obtain a complete count: if household respondents balked at answering the additional long-form-only questions, the enumerators were not as likely to press hard for a response as the C2SS interviewers would have been. Yet these factors were apparently not sufficient to overcome respondents’ reluctance or inability to answer questions on income and housing costs in the C2SS as well as the census.

7–C.3 Missing Data Patterns for Additional Items

No analysis has been conducted to date of patterns of response and nonresponse to the additional long-form items; that is, whether people tended to omit single items or clusters of items or most items. The panel made a limited set of tabulations of nonresponse patterns of cases in the A.C.E. E-sample who fell into the 2000 long-form sample and whose records were augmented at the panel’s request by the additional long-form items. These tables focused on the income and employment items, which exhibited high nonresponse rates.

A tabulation of people age 16 and over in the E-sample long-form records found that 71 percent answered all 9 income items, 11 percent answered 6–8 items, 2 percent answered only 3–5 items, 8 percent answered only 1 item, and 8 percent did not answer any of the 9 items.12 Of those cases that did not respond to the entire bank of income questions, 99 percent also failed to answer the questions on whether the person worked last week or last year; in contrast, 97 percent of complete responders to the income questions, and even 91 percent of those who omitted some of the income questions, did answer both the worked last week and worked last year questions. These few tabulations suggest that the distribution of respondent behavior could be bimodal, at least for the most intrusive and difficult questions on income and employment. A majority responded to all or most items, a smaller but still sizeable group responded to very few, and the smallest group was in between these two extremes.

Table 7.5 Imputation Rates for Selected Long-Form Items, 2000 Long-Form Sample and Census 2000 Supplementary Survey, by Type of Response, Household Population (weighted)

| Item | 2000 Long Form: Total | 2000 Long Form: Self | 2000 Long Form: Enumerator | C2SS: Total | C2SS: Self | C2SS: Enumerator |
|---|---|---|---|---|---|---|
| Marital Status | 2.2 | 1.4 | 4.3 | 1.8 | 2.4 | 1.0 |
| Educational Attainment | 7.2 | 5.2 | 12.0 | 4.8 | 4.9 | 4.7 |
| English-Speaking Ability | 7.6 | 7.3 | 7.9 | 6.0 | 10.5 | 2.3 |
| Place of Birth | 9.2 | 7.8 | 12.5 | 6.4 | 8.1 | 4.1 |
| Residence 5 Years Ago | 5.8 | 4.3 | 9.6 | 4.0 | 5.6 | 1.8 |
| Physical Disability | 7.6 | 7.1 | 8.9 | 5.2 | 7.4 | 2.1 |
| Work Disability | 11.4 | 12.2 | 9.3 | 5.9 | 8.3 | 2.2 |
| Veteran Status | 7.5 | 6.1 | 11.0 | 4.7 | 6.1 | 2.5 |
| Employment Status Recode | 11.1 | 10.2 | 13.4 | 6.0 | 8.2 | 2.6 |
| Place of Work (State) | 9.7 | 7.3 | 15.5 | 5.8 | 6.5 | 4.8 |
| Transportation to Work | 7.6 | 5.4 | 13.0 | 4.6 | 5.5 | 3.3 |
| Occupation Last Year | 14.9 | 13.2 | 19.2 | 9.5 | 11.1 | 7.1 |
| Weeks Worked (a) | 19.3 | 18.6 | 20.9 | 9.6 | 11.1 | 7.3 |
| Wage and Salary Income | 20.0 | 15.0 | 32.6 | 16.4 | 13.0 | 21.4 |
| Income, All Sources (a) | | | | | | |
| 100 Percent Imputed | 24.5 | 18.9 | 38.5 | 20.0 | 16.1 | 25.7 |
| Some Imputed (b) | 29.7 | 25.5 | 40.3 | 23.9 | 20.7 | 28.6 |
| Units in Structure | 4.4 | 4.9 | 3.0 | 1.4 | 1.6 | 1.0 |
| Year Structure Built | 11.7 | 9.3 | 18.0 | 13.4 | 7.4 | 22.8 |
| Number of Rooms | 6.2 | 6.2 | 6.4 | 2.6 | 3.4 | 1.4 |
| Complete Plumbing | 3.4 | 3.5 | 3.1 | 1.0 | 1.4 | 0.3 |
| Complete Kitchen | 3.4 | 3.5 | 3.1 | 0.9 | 1.3 | 0.3 |
| Fuel Used for Heating | 7.4 | 6.3 | 10.1 | 2.1 | 1.6 | 2.8 |
| Electric Cost (c) | 17.1 | 13.6 | 26.1 | 6.9 | 4.3 | 11.0 |
| Monthly Rent | 15.6 | 13.2 | 19.2 | 5.3 | 4.2 | 6.3 |
| Property Taxes | 32.0 | 27.0 | 49.6 | 20.8 | 13.7 | 35.4 |
| Value of Property | 13.3 | 12.3 | 16.6 | 9.7 | 6.0 | 17.4 |

NOTES: Rates (percents) exclude assignments. In 2000, self responses included mail, telephone, Internet, and Be Counted returns; enumerator responses included forms obtained in nonresponse follow-up, list/enumerate, and other field operations. In the C2SS, self responses included mail; enumerator responses included forms obtained in telephone and in-person follow-up.

(a) For 1999 in the 2000 census long-form sample; for last 12 months in the C2SS.

(b) Includes 100 percent of income imputed.

(c) Annual cost in the 2000 census long-form sample; last month’s cost in the C2SS.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 Sample Census Edited File (SCEF) and the Census 2000 Supplementary Survey edited file, provided to the panel spring 2003.

7–C.4 Consistency of Responses to Long-Form Items

The 2000 Content Reinterview Survey provides an index of inconsistency for most of the long-form questionnaire items, comparing responses to the census with responses to the Content Reinterview Survey for 20,000 recipients of the 2000 long form. The index that was computed for each item is the ratio of simple response variance to total variance (times 100). Historically, index values below 20 have been considered to be low, values between 20 and 50 have been considered to be moderately problematic, and values above 50 have been considered to be very problematic (Singer and Ennis, 2003:7–9). The Content Reinterview Survey computations included only individuals for whom the items were reported in both the census and the survey, excluding imputed and edited values.

Table 7.6 provides values of the index of inconsistency from the 2000 Content Reinterview Survey and its 1990 counterpart for 18 person items and 11 housing items. Schneider (2004:Tables 1, 2) provides index values for additional 2000 long-form-sample items for which there were no corresponding 1990 values. The items exhibit a wide range of inconsistency, ranging from 3.2 percent in 2000 and 4.9 percent in 1990 for place of birth to 80.5 percent in 2000 for work disability and 73.6 percent in 1990 for self-care disability. Overall, of the long-form-sample items shown in Table 7.6, eight of the 2000 items and eight of the 1990 items have low values of the index of inconsistency (below 20), indicating that the data are measured reliably in the census (although whether the reporting is unbiased is not known). Thirteen of the 2000 items and fourteen of the 1990 items have index values between 20 and 50, which makes their reliability of moderate concern. The remaining eight of the 2000 items and seven of the 1990 items have high values of the index of inconsistency (50 or above), indicating that the data are not measured reliably but, rather, are subject to considerable reporting variance for what could be a number of reasons, such as different respondents, different data collection modes, the difficulty of answering the question, or poor question design. Income amounts in 2000 also tend to have high index values (data not shown). Some of the items with high values are measuring rare occurrences (e.g., few households lack plumbing facilities and few households have mobile home costs); for such items, discrepancies in reporting between the census and the Content Reinterview Survey tend to inflate the value of the index more than would be true for items that are more evenly distributed.

12 The E-sample did not include the worst filled-out long-form cases, so the percentage of E-sample cases that did not answer any of the income questions was lower than the corresponding long-form-sample percentage.
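The effect of rarity on the index can be illustrated with a small sketch. It assumes the common two-category estimator, in which the gross difference rate (the share of cases whose census and reinterview answers disagree) is divided by the disagreement expected if the two sets of answers were independent; the chapter itself gives only the ratio definition, and the counts below are invented for illustration, not taken from the Content Reinterview Survey.

```python
def index_of_inconsistency(a: int, b: int, c: int, d: int) -> float:
    """Index of inconsistency (times 100) for a yes/no item.

    a: "yes" in both the census and the reinterview
    b: "yes" in the census only
    c: "yes" in the reinterview only
    d: "no" in both
    Assumes the two-category estimator described in the lead-in.
    """
    n = a + b + c + d
    gross_difference_rate = (b + c) / n
    p_census = (a + b) / n
    p_reinterview = (a + c) / n
    # Disagreement expected if the two responses were independent
    expected_disagreement = (
        p_census * (1 - p_reinterview) + p_reinterview * (1 - p_census)
    )
    return 100 * gross_difference_rate / expected_disagreement


# Hypothetical counts: both items have 20 discrepant cases out of 1,000,
# but the first item is evenly split and the second is rare (about 2 percent "yes").
common_item = index_of_inconsistency(a=490, b=10, c=10, d=490)
rare_item = index_of_inconsistency(a=10, b=10, c=10, d=970)

print(f"evenly distributed item: {common_item:.1f}")
print(f"rare item:               {rare_item:.1f}")
```

With the same 2 percent of discrepant cases, the evenly distributed item falls in the low range while the rare item lands above 50, which is the inflation the paragraph describes.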

A number of items shown in Table 7.6 were less consistently reported in 2000 than in 1990, while others were more consistently reported. Users should be aware of these changes, as they may affect the reliability of trend analysis. Specifically, nine items had index values in 2000 that were 5 or more percentage points higher than the corresponding values in 1990. These items were Hispanic origin (5-point increase), race (7-point increase, perhaps due to the new option to indicate more than one race), mobility disability (17-point increase), work disability (35-point increase), veteran status (10-point increase), complete plumbing facilities (31-point increase), vehicles available (5-point increase), business on property (16-point increase), and agricultural sales (10-point increase). Conversely, eight items had index values in 2000 that were lower than the corresponding 1990 values by 5 or more percentage points. These items were self-care disability (22-point decrease), length of military service (17-point decrease), work last year (22-point decrease), usual hours worked (6-point decrease), year structure built (11-point decrease), lot size (7-point decrease), monthly rent (12-point decrease), and meals in rent (33-point decrease).


Table 7.6 Index of Inconsistency for Selected Long-Form-Sample Items, 2000 and 1990 Content Reinterview Surveys (weighted)

| Item | Content Reinterview Survey: 2000 Census | Content Reinterview Survey: 1990 Census |
|---|---|---|
| Hispanic Origin (a) | 17.2 | 12.2 |
| Race (a) | 23.1 | 16.3 |
| School Enrollment | 13.5 | 17.3 |
| Educational Attainment | 36.5 | 32.3 |
| Ancestry | 30.7 | 26.5 |
| Speaks Non-English Language | 22.7 | 26.9 |
| English-Speaking Ability | 59.5 | 60.3 |
| Place of Birth | 3.2 | 4.9 |
| Citizenship | 9.8 | 10.9 |
| Year of Entry | 18.9 | 23.0 |
| Self-Care Disability | 51.7 | 73.6 |
| Mobility Disability | 64.5 | 47.1 |
| Work Disability | 80.5 | 45.7 |
| Veteran Status | 18.7 | 8.5 |
| Length of Service | 41.6 | 58.8 |
| Work Last Year | 24.3 | 45.9 |
| Weeks Worked | 57.5 | 56.8 |
| Usual Hours Worked | 34.3 | 40.1 |
| Housing Tenure (a) | 19.4 | 13.3 |
| Units in Structure (b) | 20.8 | 21.9 |
| Year Structure Built | 29.3 | 40.6 |
| Complete Plumbing Facilities | 85.2 | 53.8 |
| Heating Fuel | 17.7 | 14.0 |
| Vehicles Available | 37.1 | 32.1 |
| Business on Property (b) | 65.8 | 50.0 |
| Lot Size | 20.9 | 27.8 |
| Agricultural Sales | 52.0 | 41.7 |
| Monthly Rent (b) | 23.2 | 34.7 |
| Meals in Rent | 38.2 | 71.6 |

NOTE: The index of inconsistency is the ratio of the simple response variance to the total variance for an item times 100; for items with three or more categories, it is the aggregate index for the whole item; it ranges from 0 to 100 (see Singer and Ennis, 2003:7–9).

(a) Basic (complete-count) item in 2000 and 1990.

(b) Basic (complete-count) item in 1990.

SOURCE: Schneider (2004:Tables 1, 2).
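The counts of low, moderate, and high items quoted in the text can be reproduced from the Table 7.6 values. The sketch below transcribes the table and applies the cutoffs as the text states them (below 20 is low, 20 up to 50 is moderate, 50 or above is high); the function and variable names are illustrative only.

```python
# (2000 index, 1990 index) for each item, transcribed from Table 7.6
TABLE_7_6 = {
    "Hispanic Origin": (17.2, 12.2),
    "Race": (23.1, 16.3),
    "School Enrollment": (13.5, 17.3),
    "Educational Attainment": (36.5, 32.3),
    "Ancestry": (30.7, 26.5),
    "Speaks Non-English Language": (22.7, 26.9),
    "English-Speaking Ability": (59.5, 60.3),
    "Place of Birth": (3.2, 4.9),
    "Citizenship": (9.8, 10.9),
    "Year of Entry": (18.9, 23.0),
    "Self-Care Disability": (51.7, 73.6),
    "Mobility Disability": (64.5, 47.1),
    "Work Disability": (80.5, 45.7),
    "Veteran Status": (18.7, 8.5),
    "Length of Service": (41.6, 58.8),
    "Work Last Year": (24.3, 45.9),
    "Weeks Worked": (57.5, 56.8),
    "Usual Hours Worked": (34.3, 40.1),
    "Housing Tenure": (19.4, 13.3),
    "Units in Structure": (20.8, 21.9),
    "Year Structure Built": (29.3, 40.6),
    "Complete Plumbing Facilities": (85.2, 53.8),
    "Heating Fuel": (17.7, 14.0),
    "Vehicles Available": (37.1, 32.1),
    "Business on Property": (65.8, 50.0),
    "Lot Size": (20.9, 27.8),
    "Agricultural Sales": (52.0, 41.7),
    "Monthly Rent": (23.2, 34.7),
    "Meals in Rent": (38.2, 71.6),
}


def band(index_value: float) -> str:
    """Classify an index value using the cutoffs stated in the text."""
    if index_value < 20:
        return "low"
    if index_value < 50:
        return "moderate"
    return "high"


def tally(position: int) -> dict:
    """Count items per band; position 0 is the 2000 column, 1 is the 1990 column."""
    counts = {"low": 0, "moderate": 0, "high": 0}
    for values in TABLE_7_6.values():
        counts[band(values[position])] += 1
    return counts


counts_2000 = tally(0)
counts_1990 = tally(1)
print("2000:", counts_2000)
print("1990:", counts_1990)
```

Note that the 1990 value of 50.0 for business on property counts as high only because the text treats "50 or above" as the high band.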


7–C.5 Weighting and Variance Estimation

Because the long form is a sample survey, there is another option for treating missing data in addition to imputation—namely, adjustments can be made to the survey weights of respondents so that they represent nonrespondents as well. Such weighting is a replacement for whole-household imputation, which represented a considerably higher percentage of long-form-sample households than of the complete count in 2000 and 1990. The reason is that only households in which at least one member reported at least two basic items and two additional items were given a nonzero weight in the final edited long-form sample.
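The general idea of letting respondents' weights stand in for nonrespondents can be sketched with a simple weighting-class adjustment. This is a generic textbook technique shown for illustration only; it is not the Census Bureau's actual long-form estimation procedure, and the cells, base weights, and field names below are hypothetical.

```python
from collections import defaultdict


def adjust_weights(households):
    """Weighting-class nonresponse adjustment.

    Each household is a dict with keys "cell", "weight", and "responded".
    Within each cell, respondents' weights are inflated so that they sum to
    the total weight of all sampled households in the cell; nonrespondents
    receive a weight of zero. Assumes every cell has at least one respondent.
    """
    total_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for h in households:
        total_weight[h["cell"]] += h["weight"]
        if h["responded"]:
            respondent_weight[h["cell"]] += h["weight"]

    adjusted = []
    for h in households:
        if h["responded"]:
            factor = total_weight[h["cell"]] / respondent_weight[h["cell"]]
            adjusted.append({**h, "weight": h["weight"] * factor})
        else:
            adjusted.append({**h, "weight": 0.0})
    return adjusted


# Hypothetical sample: base weight 6 reflects a 1-in-6 sampling rate.
sample = [
    {"cell": "tract A", "weight": 6.0, "responded": True},
    {"cell": "tract A", "weight": 6.0, "responded": True},
    {"cell": "tract A", "weight": 6.0, "responded": False},
    {"cell": "tract B", "weight": 6.0, "responded": True},
]
adjusted = adjust_weights(sample)
for h in adjusted:
    print(h["cell"], h["responded"], h["weight"])
```

The weighted total is preserved, but it now rests on fewer responding households, which is why such adjustments add to the variability of estimates.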

Table 7.7 provides a measure of household sample loss in the 2000 and 1990 long-form surveys. The 2000 long-form sample experienced somewhat less sample loss than the 1990 long-form sample: 99 percent of expected long forms were received, and 93 percent of households received were retained in the sample (i.e., given nonzero weights). The corresponding figures for the 1990 long-form sample were 98 percent of expected long forms received and 91 percent of received household long forms retained in the sample. In the 10 percent of census tracts with the highest rates of basic item imputations, only 84 and 79 percent of received household long forms were retained in the 2000 long-form sample and 1990 long-form sample, respectively.

Of the 1.2 million long-form-sample household returns in 2000 that were not sample-data-defined and were dropped from the 2000 Sample Census Edited File, 54 percent were proxy returns for which a neighbor or landlord provided information for the household. Proxy returns in 2000 accounted for 6.2 percent of long forms received (they were 19.4 percent of enumerator long forms received). Because three-fifths of long-form proxy returns were not sample-data-defined, the percentage of proxy returns in the 2000 Sample Census Edited File was only 2.7 percent.13
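The proxy-return figures in this paragraph can be checked against one another with a little arithmetic. The sketch below assumes that the percentages in the text apply to the household counts implied by Table 7.7 (17.9 million long forms expected, 98.5 percent received, 93.2 percent of those retained); the variable names are illustrative.

```python
# Inputs from Table 7.7 and the text (2000 long-form sample)
expected_forms = 17.9e6            # long forms expected from households
received_share = 0.985             # share of expected forms received
retained_share = 0.932             # share of received forms retained
proxy_share_of_received = 0.062    # proxy returns as a share of forms received
proxy_share_of_dropped = 0.54      # proxy returns as a share of dropped forms

received = expected_forms * received_share
dropped = received * (1 - retained_share)
retained = received - dropped

proxy_received = received * proxy_share_of_received
proxy_dropped = dropped * proxy_share_of_dropped
proxy_retained = proxy_received - proxy_dropped

share_of_proxies_dropped = proxy_dropped / proxy_received
proxy_share_of_retained = proxy_retained / retained

print(f"forms dropped: {dropped / 1e6:.1f} million")
print(f"share of proxy returns dropped: {share_of_proxies_dropped:.0%}")
print(f"proxy share of retained file: {proxy_share_of_retained:.1%}")
```

Under these assumptions the derived values line up with the text: about 1.2 million forms dropped, roughly three-fifths of proxy returns dropped, and proxies making up about 2.7 percent of the retained file.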

A reason for the somewhat higher rates of sample retention in 2000 than in 1990 could have been the layout of the questionnaire. As we described above, the 2000 long form led off with all of the person items for the first person, whereas the 1990 long form asked all of the basic items for each household member, followed by the housing items, followed by the sample person items. It was easier in 2000 to meet the criterion for being “sample-data-defined,” so long as the first person answered the basic items and, say, marital status and education level, which came first among the additional person items.

13 From tabulations by U.S. Census Bureau staff from the 2000 Sample Census Edited File provided to the panel in spring 2003. Proxy returns were 4.8 percent of 2000 short forms (20.9 percent of enumerator short forms).

Table 7.7 Whole-Household Nonresponse in the 2000 and 1990 Census Long-Form Samples

| Measure | 2000 Long-Form Sample: Total Households | 2000 Long-Form Sample: Households in Worst 10% of Tracts | 1990 Long-Form Sample: Total Households | 1990 Long-Form Sample: Households in Worst 10% of Tracts |
|---|---|---|---|---|
| Percent, Long Forms Received of Number Expected | 98.5 | 96.6 | 97.8 | 94.0 |
| Percent, Households Retained in Edited Long-Form Sample of Number of Forms Received | 93.2 | 84.3 | 91.2 | 78.8 |
| No. Long Forms Expected from Households (millions) | 17.9 | 1.3 | 15.9 | 1.1 |

NOTES: Households not retained in the edited long-form sample include wholly imputed households from the complete-count processing (types 2–5; see Box 4.2) and households in which no person had at least two basic and two long-form items reported (i.e., they were not sample-data-defined). Worst 10 percent census tracts were defined as those with the highest rates of basic item imputations.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Sample Census Edited Files (SCEF), provided to the panel spring 2003.

Table 7.8 provides another measure of loss for the 2000 long-form sample; it shows the percentages of non-sample-data-defined persons by race and ethnicity of the household reference person and type of return. For the total long-form population, the rates of non-sample-data-defined persons range from 1.6 percent of white self returns to 25.6 percent of black enumerator returns. In the 10 percent of census tracts with the highest percentage of basic item imputations, the rates range from 2.5 percent of white self returns to 39.3 percent of black enumerator returns.

Table 7.8 Whole-Person Nonresponse in the 2000 Long-Form Sample, by Race of Reference Person

| Measure | Total Persons | Hispanic | Non-Hispanic Black | Non-Hispanic White |
|---|---|---|---|---|
| Percent Non-Sample-Data-Defined Persons of: | | | | |
| Total Persons on Long Forms | 8.4 | 11.0 | 14.2 | 7.1 |
| Self Returns | 2.1 | 4.7 | 3.7 | 1.6 |
| Enumerator Returns | 20.9 | 18.5 | 25.6 | 20.7 |
| Persons on Long Forms in Worst 10% Census Tracts | 18.5 | 15.4 | 23.6 | 17.3 |
| Self Returns | 4.8 | 6.8 | 4.8 | 2.5 |
| Enumerator Returns | 31.5 | 24.1 | 39.3 | 33.9 |
| No. Persons on Long Forms Received (millions) | | | | |
| Total | 45.4 | 4.9 | 4.9 | 33.1 |
| Worst 10% Census Tracts | 3.7 | 1.3 | 1.0 | 1.1 |

NOTES: Non-sample-data-defined persons include persons in wholly imputed households and other non-sample-data-defined households (see text), plus wholly imputed persons (type 1) in enumerated households. Wholly imputed persons did receive a sample weight. Worst 10 percent census tracts were defined as those with the highest rates of basic item imputations.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Sample Census Edited Files (SCEF), provided to the panel spring 2003.

For some groups and geographic areas, the levels of sample loss are large, adding to the variability, and, hence, uncertainty, of long-form-sample estimates. Moreover, because the Census Bureau provides variability estimates that do not account for item imputation and, separately, provides item imputation rates based on the people who were retained in the sample, it is easy to overlook the fact that weighting and item imputation are both methods of dealing with sample loss. Ideally, the two kinds of sample loss should be considered together. For example, if income imputation rates are combined with sample loss due to households with no or minimal response, the effective sample size for a poverty estimate for the total household population in 2000 could be only about 60 percent of the original sample size (8 percent sample loss of persons from Table 7.8 plus 30 percent nonresponse to one or more income items from Table 7.5).
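The arithmetic behind the "60 percent" figure can be laid out explicitly. The sketch below uses the unrounded rates from Tables 7.8 and 7.5 and shows two readings: the simple sum used in the text, and a multiplicative version that assumes the income imputation rate applies only to persons retained in the sample. Both are rough illustrations, not a formal effective-sample-size calculation.

```python
# Rates for the total household population in 2000
person_sample_loss = 0.084      # Table 7.8: non-sample-data-defined persons
income_imputation_rate = 0.297  # Table 7.5: some or all income imputed

# Reading 1: add the two losses, as in the text (8 percent plus 30 percent)
additive_share = 1 - (person_sample_loss + income_imputation_rate)

# Reading 2: apply the imputation rate to retained persons only
# (an assumption, since imputation rates are computed on retained persons)
multiplicative_share = (1 - person_sample_loss) * (1 - income_imputation_rate)

print(f"additive:       {additive_share:.1%} of original sample")
print(f"multiplicative: {multiplicative_share:.1%} of original sample")
```

Either way, roughly three-fifths of the original sample supplies fully reported income, consistent with the rounded figure in the text.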

7–C.6 Long-Form-Sample Data Quality: Summary of Findings and Recommendations

The additional long-form-sample items exhibited higher item imputation rates in 2000 than the basic items. More serious, the long-form-sample data quality, as measured by nonresponse, deteriorated in 2000 compared with 1990. While sample loss was not quite as high as in 1990, imputation rates for many items were considerably higher, reaching levels as great as 32 percent for property taxes and 30 percent for some or all income items (compared with 12 and 13 percent, respectively, for these two items in 1990). A major reason for the disparities in rates was the effort devoted to telephone and field follow-up for missing data items in 1990; such effort was almost nonexistent in 2000.

When sample loss is considered together with item imputation, the variability in the 2000 long-form-sample estimates could be much more than expected from the original sample selection probabilities. However, the Census Bureau’s variance estimates for the long-form sample took account of sample selection rates and whole-household sample loss, but not item imputation. What we do not know is the extent of bias in the 2000 long-form-sample estimates that might be attributable to the high rates of imputation.

With regard to measures of response variance or consistency of reporting for the long-form items, the 2000 Content Reinterview Survey showed a wide range of values for an index of inconsistency, as did a similar survey in 1990. A total of 8 of the 2000 items and 7 of the 1990 items had index values greater than 50, indicating that the data were not reliably measured; another 13 items in 2000 and 14 items in 1990 had index values between 20 and 50, indicating that the data were only moderately reliable.

Little information is available with regard to possible response bias for the long-form-sample items (e.g., systematic overreporting or underreporting of income). Because the 2000 Content Reinterview Survey did not ask probing questions to try to obtain an accurate response, it did not provide measures of response bias. Comparisons of long-form-sample estimates with other sources have been performed for only a few variables thus far. Aggregate comparisons with the 2000 April Current Population Survey (CPS) and the C2SS found relatively consistent estimates of the percent poor in 1999: the estimated poverty rates were 12.4 percent for the long-form sample, 11.9 percent for the CPS, and 12.2 percent for the C2SS (Schneider, 2004:16). However, aggregate comparisons with the 2000 April CPS found sizeable discrepancies in estimates of employed and unemployed people (Clark et al., 2003). Thus, the census estimate of the number of employed people was 5 percent lower than the 2000 April CPS estimate, while the census estimate of the unemployment rate was 2.1 percentage points higher than the CPS rate, representing a 50 percent larger number of unemployed people (7.9 million in the census compared with 5.2 million in the CPS). The differences were much more pronounced for blacks and Hispanics than for other groups and much larger than those found in similar comparisons for 1990. They appear to be due in part to changes in question wording and imputation procedures from those used in 1990.

Finding 7.2: For the household population, missing data rates were at least moderately high (10 percent or more) for over one-half of the 2000 census long-form-sample items and very high (20 percent or more) for one-sixth of the long-form-sample items. Missing data rates also varied widely among population groups and geographic areas. By comparison with 1990, missing data rates were higher in 2000 for most long-form-sample items asked in both years and substantially higher—by 5 or more percentage points—for one-half of the items asked in both years. In addition, close to 10 percent of long-form-sample households in 2000 (similar to 1990) provided too little information for inclusion in the sample data file. When dropped households and individually missing data are considered together, the effective sample size that is available for analysis for some characteristics is 60 percent or less of the original long-form-sample size.

Many long-form-sample items had moderate to high rates of inconsistent reporting, as measured in a content reinterview survey. Few assessments have yet been made of systematic reporting errors for the long-form-sample items, although aggregate comparisons of employment data between the 2000 census and the Current Population Survey (CPS) found sizeable discrepancies in estimates of employed and unemployed people, much larger than the discrepancies found in similar comparisons for 1990. No analysis of the effects of item imputation and weighting on the distributions of characteristics or the relationships among them has yet been undertaken, although analysis determined that changes in imputation procedures contributed to the 50 percent higher unemployment rate estimate in the 2000 census compared with the April 2000 CPS.
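One simple way to see how an effective sample size of 60 percent or less can arise is to multiply the share of households retained in the sample file by the share of retained records with a reported (nonimputed) value for an item. This is a back-of-the-envelope sketch that assumes the two losses combine multiplicatively; the function name and the example item missing data rates are illustrative, and only the household drop rate of about 10 percent comes from the finding.

```python
def effective_sample_fraction(dropped_household_rate, item_missing_rate):
    """Share of the original sample with a reported value for one item.

    Both rates are proportions between 0 and 1. The calculation assumes
    that item nonresponse among retained households is unrelated to
    which households were dropped.
    """
    for rate in (dropped_household_rate, item_missing_rate):
        if not 0 <= rate <= 1:
            raise ValueError("rates must be proportions between 0 and 1")
    return (1 - dropped_household_rate) * (1 - item_missing_rate)


if __name__ == "__main__":
    DROPPED = 0.10  # close to 10 percent of households, per the finding
    for item_missing in (0.10, 0.20, 0.33):  # illustrative item rates
        share = effective_sample_fraction(DROPPED, item_missing)
        print(f"item missing {item_missing:.0%}: "
              f"effective sample {share:.0%} of original")
```

Under these assumptions, an item with about one-third of its values missing leaves roughly 60 percent of the original sample with reported data.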

Recommendation 7.1: Given the high rates of imputation for many 2000 long-form-sample items, the Census Bureau should develop procedures to quantify and report the variability of the 2000 long-form estimates due to imputation, in addition to the variability due to sampling and whole-household weight adjustments. The Bureau should also study the effects of imputation on the distributions of characteristics and the relationships among them and conduct research on improved imputation methods for use in the American Community Survey (or the 2010 census if it includes a long-form sample).

Recommendation 7.2: The Census Bureau should make users aware of the high missing data rates and measures of inconsistent reporting for many long-form-sample items, and inform users of the 2000 census long-form-sample data products (Summary Files 3 and 4 and the Public Use Microdata Samples) about the need for caution in analyzing and interpreting those data.

In particular, users should review Census Bureau documentation of imputation and weighting procedures, examine imputation rates and estimates of standard errors provided by the Bureau, be alert for User Notes from the Bureau about data errors and other reports on data quality, and inform Census Bureau staff of data anomalies for investigation.

7–D QUALITY OF GROUP QUARTERS DATA

Residents of group quarters accounted for 7.8 million people in the 2000 census, up from 6.7 million in 1990. The census is the only source at present of detailed information for residents of all types of group quarters, including prisons, juvenile institutions, nursing homes, hospitals and schools for the handicapped, military quarters, shelters, group homes, and other group quarters.

The quality of the data for group quarters residents in the 2000 census long-form sample was poor in comparison with the data for household residents and also in comparison with the group quarters data in 1990. Table 7.9 shows 2000 imputation/assignment rates and comparable 1990 imputation rates for selected person data items for the total group quarters population, prison inmates, and students in college dormitories.14 In 2000 missing data rates for the items shown reached as high as 50 percent for all group quarters residents and as high as 75 percent for prison inmates. Generally, missing data rates were highest for inmates of institutions (prisons, juvenile institutions, nursing homes, hospitals and schools for the handicapped) and lowest for college students and the military. The particularly high rates for institutional residents were probably due in part to the high rates of use of administrative records to provide information instead of enumeration of residents. The Census Bureau was not prepared for the widespread resort to administrative records by institutions (see Section 4-F.2), and, very often, the available records did not contain long-form-type information, or institutions were unwilling to provide such information.

The great extent of missing data among group quarters residents in 2000 raises a question as to whether the Census Bureau should have published long-form-sample estimates for some or all types of group quarters. Given the decision to publish, it was unfortunate that many tabulations in census data products (e.g., age, employment, income) combined group quarters and household residents.

14 See Appendix H (Table H.8) for rates for all of the basic and additional items for group quarters residents in nine categories of type of facility.

Table 7.9 Imputation/Assignment Rates (percent) for Selected Person Items, 2000 and 1990 Census Long-Form Samples, by Type of Residence, Group Quarters Population (weighted)

| Item | 2000: Total Group Quarters | 2000: Inmates of Prisons | 2000: Students in College Dormitories | 1990: Total Group Quarters | 1990: Inmates of Prisons | 1990: Students in College Dormitories |
|---|---|---|---|---|---|---|
| Age (a) | 3.8 | 5.5 | 3.4 | 1.5 | 2.1 | 1.3 |
| Sex | 3.0 | 2.7 | 1.9 | 0.6 | 1.1 | 0.2 |
| Race | 4.5 | 5.4 | 5.4 | 1.8 | 2.7 | 1.4 |
| Ethnicity | 8.0 | 11.8 | 7.1 | 7.6 | 16.8 | 3.4 |
| Marital Status | 18.0 | 30.9 | 8.1 | 4.2 | 11.1 | 1.2 |
| Educational Attainment | 39.3 | 53.8 | 19.2 | 17.9 | 24.6 | 2.8 |
| English-Speaking Ability | 33.9 | 56.8 | 16.3 | 22.1 | 29.8 | 11.0 |
| Place of Birth | 40.2 | 54.0 | 22.2 | 19.2 | 31.7 | 6.7 |
| Citizenship | 36.5 | 53.0 | 19.9 | 14.0 | 24.7 | 3.9 |
| Residence 5 Years Ago | 44.9 | 70.6 | 23.7 | 18.1 | 33.5 | 4.7 |
| Mobility Disability | 46.9 | 66.2 | 22.3 | 16.7 | 31.5 | 6.3 |
| Work Disability (b) | 47.7 | 66.7 | 22.7 | 18.1 | 34.2 | 6.0 |
| Grandchildren at Home | 30.0 | 36.5 | 0.5 | — | — | — |
| Veteran Status | 39.6 | 57.5 | 21.6 | 18.0 | 29.2 | 5.7 |
| Occupation Last Year | 46.9 | 75.4 | 30.7 | 21.3 | 44.2 | 11.1 |
| Weeks Worked Last Year | 42.8 | 72.5 | 29.1 | 21.4 | 40.2 | 13.2 |
| Wages Last Year | 50.1 | 74.3 | 34.7 | 27.4 | 49.7 | 15.0 |
| Population (millions) | 7.78 | 1.98 | 2.06 | 6.66 | — | — |

NOTES: —, not available.

(a) Excludes imputation of age from date of birth.

(b) In 1990, “work disability” refers to a disability that prevents working; in 2000, the term refers to a disability that makes it difficult to work.

SOURCE: Tabulations by U.S. Census Bureau staff from the 2000 and 1990 Sample Census Edited Files (SCEF), provided to the panel spring 2003.
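To make the deterioration between censuses easier to see, the sketch below computes the percentage-point change from 1990 to 2000 for a handful of items, using the Total Group Quarters columns of Table 7.9. The selection of items and the names used in the code are illustrative choices; the rates themselves are copied from the table.

```python
# (2000 rate, 1990 rate) in percent for the total group quarters
# population, copied from Table 7.9 for a few selected items.
TOTAL_GQ_RATES = {
    "Age": (3.8, 1.5),
    "Marital Status": (18.0, 4.2),
    "Educational Attainment": (39.3, 17.9),
    "Residence 5 Years Ago": (44.9, 18.1),
    "Wages Last Year": (50.1, 27.4),
}


def change_in_points(rates):
    """Return the 2000-minus-1990 change, in percentage points, per item."""
    return {
        item: round(rate_2000 - rate_1990, 1)
        for item, (rate_2000, rate_1990) in rates.items()
    }


if __name__ == "__main__":
    changes = change_in_points(TOTAL_GQ_RATES)
    # List the items with the largest increases first.
    for item, change in sorted(changes.items(), key=lambda kv: -kv[1]):
        print(f"{item:<24} +{change:.1f} points")
```

For the items selected here, the basic item (age) rose by only a few points, while the long-form-sample items rose by roughly 14 to 27 points.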


Combining the data makes it harder for users to compare census results for such statistics as the poverty rate and unemployment rate with other household surveys, which typically do not include the military or institutional residents. Combining the data also obscures the differences in data quality between group quarters residents and household members. The difference is further obscured because published item imputation (allocation) rates that accompany the long-form-sample data products combined group quarters and household member rates.
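The way a combined rate hides the group quarters problem can be seen with a population-weighted average. The sketch below is illustrative only. The group quarters population (7.78 million) and wage imputation rate (50.1 percent) come from Table 7.9. The household imputation rate of 20 percent is an assumed placeholder, not a figure from this section, and the household population is approximated as the 2000 resident population (about 281.42 million) minus the group quarters population. The sketch also ignores the fact that the wage item applies only to part of each population.

```python
def combined_rate(groups):
    """Population-weighted average of group rates.

    `groups` is a list of (population, rate) pairs; the populations act
    as weights and the rates are in percent.
    """
    total_population = sum(population for population, _ in groups)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(population * rate for population, rate in groups)
    return weighted_sum / total_population


GQ_POPULATION = 7.78                 # millions, from Table 7.9
GQ_WAGE_RATE = 50.1                  # percent, from Table 7.9
HOUSEHOLD_POPULATION = 281.42 - GQ_POPULATION  # millions, approximation
ASSUMED_HOUSEHOLD_RATE = 20.0        # percent, placeholder assumption

if __name__ == "__main__":
    rate = combined_rate([
        (HOUSEHOLD_POPULATION, ASSUMED_HOUSEHOLD_RATE),
        (GQ_POPULATION, GQ_WAGE_RATE),
    ])
    print(f"Combined rate: {rate:.1f} percent")
    print(f"Group quarters rate: {GQ_WAGE_RATE:.1f} percent")
```

Because group quarters residents are a small share of the total population, the combined rate stays within about a point of the assumed household rate, and a user who sees only the combined figure has no indication that the group quarters rate is more than twice as high.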

No systematic investigation has yet been undertaken of the effects on distributions of characteristics of the high rates of missing data and the imputation procedures used. However, the discovery by users of very high unemployment rates in some communities, such as college towns, led to a determination by the Census Bureau that a particular combination of missing responses to some questions and reports of availability for work by residents of group quarters resulted in an inappropriate imputation of unemployed status to many such residents. The problem affected residents of noninstitutional group quarters, such as students in college dormitories, people living in group homes, and others. Residents of institutions showed similar reporting patterns, but the imputation did not allow an unemployed status for them. The problem is described in User Note 4 for Summary Tape File 3 (U.S. Census Bureau, 2003d:Data Note 4),15 and the magnitude is such that the Census Bureau reissued employment status tabulations for states, counties, and places in late 2003 to exclude group quarters residents, limiting the tabulations to household residents only.16

15 The note is also available at http://www.census.gov/prod/cen2000/doc/sf3.pdf [2/25/04].

16 Available at http://www.census.gov/population/www/census2000/phc-28.html [2/25/04].

Finding 7.3: For group quarters residents, missing data rates for most long-form-sample items were very high in 2000 (20 percent or more for four-fifths of the items and 40 percent or more for one-half of the items). The 2000 rates were much higher than missing data rates for household members and considerably higher than missing data rates for group quarters residents in 1990. The 2000 missing data rates were particularly high for prisoners, residents of nursing homes, and residents of long-term-care hospitals, perhaps because of heavy reliance on administrative records for enumerating them. Few assessments have yet been made of systematic reporting errors for group quarters residents for long-form-sample items, nor of the effects of imputations on the distributions of characteristics or the relationships among them. However, a systematic error was found in the imputation of employment status for people living in noninstitutional group quarters because of a particular pattern of missing data. The result was a substantial overestimate of unemployment rates for these people, so much so that the Census Bureau reissued employment status tabulations for household members only, excluding group quarters residents.

Earlier we called for a complete redesign of the enumeration procedures for group quarters residents in the 2010 census (see Recommendation 4.4 in Section 4-F.3). That redesign should include consideration of changes to questionnaire content as well. If the American Community Survey is fully funded as a replacement for the census long-form sample, then it is likely to, and should, provide detailed data for group quarters residents. (At present, the C2SS and other precursors to the ACS include only household residents.) The designers of the ACS should consider how best to obtain long-form-type data for different types of group quarters. For institutions, the use of administrative records may make the most sense, provided the cooperation of the facility staff can be obtained. It may also be that accurate responses to some of the long-form-sample questions (e.g., income and employment last year) are too difficult to obtain in institutional settings, either from records or from the residents themselves, at least without special training of interviewers and other measures to elicit responses. For other types of group quarters, a household-type questionnaire may work best. This small, but important, population merits dedication of sufficient resources for research and testing on questionnaire design and content and enumeration procedures that can produce useful, high-quality information for policy making, program planning, and other purposes.


Recommendation 7.3: The Census Bureau should publish distributions of characteristics and item imputation rates, for the 2010 census and the American Community Survey (when it includes group quarters residents), that distinguish household residents from the group quarters population (at least the institutionalized component). Such separation would make it easier for data users to compare census and ACS estimates with household surveys and would facilitate comparative assessments of data quality for these two populations by the Census Bureau and others.
