a household does not provide any information on a questionnaire.1 Item nonresponse is often amenable to more sophisticated missing data methods because the responses that are available may be used to help predict the missing values. In contrast, for unit nonresponse, the only information available in the decennial census is the geographic location of the residence. Although geographic information can be used in somewhat the same way as responses to other items, it is generally less valuable than information that is more specific to the household. The lack of information limits the techniques available to the Census Bureau, and as a result censuses have addressed unit nonresponse in the long-form-sample weighting process. This appendix focuses primarily on techniques that address item nonresponse, though some of the discussion also applies to techniques for addressing unit nonresponse.2
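
As a rough illustration of how available responses can help predict a missing item, the following Python sketch simulates households with one item that is always reported (an age group) and one item that is sometimes missing (income, in thousands of dollars). Everything in it is a hypothetical assumption made for illustration: the group labels, the income distributions, the response rates, and the function names. It uses simple class-mean imputation, which is only one of many possible methods and is not presented here as the Census Bureau's procedure.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Return hypothetical records as (group, income, responded) tuples."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        group = rng.choice(["young", "old"])
        # Assumption: older households have higher income on average.
        income = rng.gauss(60.0 if group == "old" else 40.0, 10.0)
        # Assumption: older households answer the income item less often.
        p_respond = 0.5 if group == "old" else 0.9
        responded = rng.random() < p_respond
        records.append((group, income, responded))
    return records


def complete_case_mean(records):
    """Mean income using only households that reported income."""
    return mean(inc for _, inc, resp in records if resp)


def class_mean_imputed_mean(records):
    """Mean income after filling each missing value with the mean
    reported income of households in the same age group."""
    reported = {}
    for group, inc, resp in records:
        if resp:
            reported.setdefault(group, []).append(inc)
    group_means = {g: mean(v) for g, v in reported.items()}
    return mean(inc if resp else group_means[g] for g, inc, resp in records)


records = simulate()
# Known only because the data are simulated; unavailable in practice.
true_mean = mean(inc for _, inc, _ in records)
cc = complete_case_mean(records)
imp = class_mean_imputed_mean(records)
print(f"true mean:          {true_mean:.2f}")
print(f"complete-case mean: {cc:.2f}")
print(f"imputed mean:       {imp:.2f}")
```

In this simulation the imputed mean lands much closer to the true mean than the complete-case mean does, but only because nonresponse was constructed to depend on the age group alone. If nonresponse also depended on income within an age group, class-mean imputation would not remove the discrepancy.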

Left untreated, nonresponse can cause two problems. First, nonresponse can result in statistical bias. (Statistical bias is the difference between the expected value of a statistic and the true value of the quantity it estimates.) The data received from respondents generally differ in distribution from the data that nonrespondents would have provided. This is why so-called complete-case analysis is problematic: restricting the analysis to cases with a complete response fails to adequately represent the contribution of cases with missing data. These distributional differences may be present not just unconditionally but (often) also conditionally, given responses to certain items. For example, data for nonrespondents may differ from data for respondents because the two groups have different demographic or socioeconomic profiles. Even within demographic or socioeconomic groups, or within classes defined by other conditioning variables that one might choose, data for nonrespondents may still have a different distribution from that for respondents. Methods that do not take these differences into account can introduce a statistical bias, which can be appreciable

1 In many applications the unit is defined as a person, but in this application, the unit is defined as a household. Thus, a person may have no data provided for them, but as long as other members of the household provided information, that would still be referred to in this application as item nonresponse.

2 The process of weighting the long-form sample to the complete-count totals relies on the assumption that cases of unit nonresponse are missing at random (defined below).


