housing unit record of the same size in the nearby area to provide characteristics for people in the household (see Griffin, 2001).

  1. 2.269 million people (0.8% of the household population) were substituted because the number of persons was known for their household but no other information was available. For these households, the computer duplicated another housing unit record in the nearby area of the same household size.

  2. 2.333 million people (0.9% of the household population) were substituted because no information was provided for them, although other members of their households had data reported. This situation could occur, for example, when a large household listed more than six people and the telephone follow-up was not successful in reaching the household to obtain information for the additional members. For these people, the computer duplicated a person from a nearby housing unit with the same characteristics as the unit with person(s) requiring substitution.13
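The whole-household case in item 1 can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the source: housing unit records sit in a list ordered by geographic proximity, "nearby" is approximated by distance in that list, and the field names (`size`, `persons`, `substituted`) are hypothetical. It is not the Census Bureau's production procedure.

```python
from copy import deepcopy

def substitute_households(units):
    """Fill units whose household size is known but whose person data are missing.

    Assumption: `units` is a list of dicts in geographic processing order;
    each has 'size' (int) and 'persons' (list of dicts, or None if missing).
    """
    # Only units with reported person data can serve as donors.
    donors = [i for i, u in enumerate(units) if u["persons"] is not None]
    result = deepcopy(units)
    for i, unit in enumerate(units):
        if unit["persons"] is not None:
            continue
        same_size = [j for j in donors if units[j]["size"] == unit["size"]]
        if not same_size:
            continue  # no donor of the same size; leave the unit unresolved
        # "Nearby" is approximated here by distance in list position.
        nearest = min(same_size, key=lambda j: abs(j - i))
        result[i]["persons"] = deepcopy(units[nearest]["persons"])
        result[i]["substituted"] = True
    return result

# Hypothetical records for illustration only.
units = [
    {"size": 2, "persons": [{"age": 30}, {"age": 28}]},
    {"size": 1, "persons": [{"age": 70}]},
    {"size": 2, "persons": None},  # size known, no other information
    {"size": 3, "persons": None},  # no same-size donor in this toy list
]
filled = substitute_households(units)
```

In this toy list the third unit receives a copy of the first unit's persons, while the fourth stays unresolved because no three-person donor exists. Flagging substituted records keeps them distinguishable from reported data in later tabulations.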

Content Editing and Imputation

For short-form content items, editing and imputation rates for missing values were low: 1.1 percent for sex, 4.3 percent for age, 3.2 percent for race, 3.8 percent for Hispanic origin, 1.6 percent for household relationship, and 4.2 percent for housing tenure.14 These rates are for people who were missing one or more but not every short-form item (i.e., they exclude substituted people). In many instances, it was possible to fill in an answer from other information for the person or household, so that rates of hot deck imputation for short-form items were lower: 0.2 percent for sex, 2.9 percent for age, 3.2 percent for race, 3.4 percent for Hispanic origin, 1.3 percent for household relationship, and 3.6 percent for housing tenure. Information about editing and imputation rates for long-form content items is not yet available.
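As a rough illustration of hot deck imputation for a single item, the sketch below implements a generic sequential hot deck: a missing value is filled with the most recently reported value from a record in the same imputation class. The function name, field names, and the choice of household relationship as the class variable are assumptions made for the example and do not describe the Census Bureau's actual edit and imputation specifications.

```python
def hot_deck_impute(records, item, class_key, default=None):
    """Sequential hot deck: fill a missing `item` with the most recently
    seen reported value from a record in the same imputation class."""
    last_seen = {}  # imputation class -> most recent reported value
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are not modified
        cls = rec[class_key]
        if rec.get(item) is None:
            rec[item] = last_seen.get(cls, default)
            rec[item + "_imputed"] = True
        else:
            last_seen[cls] = rec[item]
            rec[item + "_imputed"] = False
        out.append(rec)
    return out

# Hypothetical person records for illustration only.
people = [
    {"relationship": "householder", "age": 45},
    {"relationship": "child", "age": 12},
    {"relationship": "householder", "age": None},
    {"relationship": "child", "age": None},
    {"relationship": "householder", "age": 61},
]
imputed = hot_deck_impute(people, item="age", class_key="relationship")
ages = [p["age"] for p in imputed]
rate = sum(p["age_imputed"] for p in imputed) / len(imputed)
```

An item imputation rate such as those quoted above would then be the share of records flagged as imputed for that item. A record whose value could be filled in from other information for the person or household would be resolved before this step and would not count toward the hot deck rate.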

Other Data Processing

A number of other data processing steps were carried out, or are still in process, to generate data files and publications from the 2000 census records. Such steps for the short-form records include tabulating the data on various dimensions and modifying the data appropriately on files that are to be released


13 Terminology has not been consistent across censuses for the process of imputation. “Substitution” most often refers to cases when an entire household is imputed. When individual people are imputed into a household with other respondents, they are often referred to as “totally allocated persons,” as distinct from allocations for one or a few missing items.


14 Item edit and imputation rates are from tabulations by panel staff from U.S. Census Bureau, E-Sample Person Dual-System Estimation Output File, February 16, 2001 (weighted using TESFINWT). The rate for age excludes cases in which it was possible to estimate age from date of birth and vice versa. See also Chapter 6.
