NATIONAL AGRICULTURAL STATISTICS SERVICE
The National Agricultural Statistics Service (NASS) surveys farms, which are both establishments and, in surveys such as the Agricultural Resource Management Survey, households. Jaki McCarthy of NASS reported at the panel's workshop that NASS has studied its respondents and nonrespondents to test whether knowledge of and attitudes toward NASS as a survey sponsor affect response. The agency found that cooperators have greater knowledge and better opinions of NASS statistics. Other studies of the relationship between burden and response found no consistent relationship between nonresponse and burden as measured by the number and complexity of questions; in fact, the highest-burden sample units tend to be more cooperative than low-burden units.
Other NASS studies of the impact of incentives on survey response found that $20 ATM cards increased mail response (although not in-person interview response), were cost-effective, and did not increase bias. Studies of calibration weighting found that it decreased bias in many key survey statistics.
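Calibration weighting adjusts the design weights of respondents so that weighted totals of auxiliary variables reproduce known population totals, which can reduce nonresponse bias when the auxiliaries are related to both response propensity and the survey outcomes. A minimal sketch of linear (GREG-style) calibration is below; the data, variables, and function name are invented for illustration and do not reflect NASS's actual procedure.

```python
import numpy as np

def calibrate(d, X, totals):
    """Return calibrated weights w = d * (1 + X @ lam) such that w @ X == totals.

    d      : design weights for the respondents
    X      : matrix of auxiliary variables (one row per respondent)
    totals : known population totals of the auxiliary variables
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    # Solve (X' D X) lam = totals - X' d for the Lagrange multipliers lam
    A = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(A, totals - X.T @ d)
    return d * (1.0 + X @ lam)

# Toy example: 5 responding farms; auxiliaries = (1, acres operated)
d = np.array([2.0, 2.0, 2.0, 2.0, 2.0])                      # design weights
X = np.array([[1, 100], [1, 250], [1, 80], [1, 300], [1, 120]], dtype=float)
totals = np.array([12.0, 2000.0])   # known farm count and total acres (made up)

w = calibrate(d, X, totals)
print(w @ X)   # weighted totals now match the known population totals
```

Because the calibrated weights reproduce known totals exactly, estimates for outcomes correlated with the auxiliaries inherit some of that correction, which is the mechanism by which calibration can reduce nonresponse bias.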
NASS is currently exploring the use of data mining to help predict survey nonresponse and to determine whether observed patterns offer explanatory power or are instead most useful for purely predictive purposes. Preliminary findings suggest that, in large datasets, many variables differ significantly among cooperators, refusals, and noncontacts; although these differences are statistically significant, however, they are usually small in practical terms. Many of the variables are correlated with one another, and using them alone is not useful for predicting individual nonresponse or for managing data collection.
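The distinction between statistical and practical significance is easy to demonstrate: with very large samples, even a trivially small difference between groups yields a large test statistic. The sketch below uses entirely invented data (it is not NASS's analysis) to show a highly "significant" difference whose standardized effect size is negligible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # large samples, as in a big survey frame

# Two invented groups (say, "cooperators" and "refusals") whose means differ
# by only 0.05 of a standard deviation -- tiny in practical terms.
a = rng.normal(10.00, 1.0, n)
b = rng.normal(10.05, 1.0, n)

diff = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / n + b.var(ddof=1) / n)
t = diff / se                                              # very large t-statistic
d = diff / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)    # Cohen's d: tiny effect
print(f"t = {t:.1f}, Cohen's d = {d:.3f}")
```

The t-statistic is far beyond any conventional significance threshold, yet the effect size is well below the usual "small" benchmark, mirroring the finding that significant group differences can be of little practical use for managing data collection.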
A breakthrough procedure is to use classification trees, in which the dataset is split using simple rules: all variables and all possible breakpoints are examined, the variable that maximizes the difference between subgroups is selected, and a rule is generated that splits the dataset at the optimal breakpoint. This process is then repeated for each resulting subgroup. The classification trees are used to manage data collection and, in the process, provide an indication of nonresponse bias, making it possible to identify likely nonrespondent groups that would bias estimates.
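The splitting step described above can be sketched in a few lines: scan every variable and every candidate breakpoint, and keep the split that best separates the response outcomes (here scored with Gini impurity, a common choice; NASS's actual trees and data may differ). The variables and data below are invented for illustration.

```python
import numpy as np

def gini(y):
    """Gini impurity of a vector of class labels (0 = pure subgroup)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Examine all variables and all possible breakpoints; return the
    (variable index, breakpoint) whose split best separates the classes,
    i.e., minimizes the weighted impurity of the two resulting subgroups."""
    n, n_vars = X.shape
    best = (None, None, np.inf)
    for j in range(n_vars):
        for t in np.unique(X[:, j])[:-1]:          # candidate breakpoints
            left = X[:, j] <= t
            score = (left.sum() * gini(y[left]) +
                     (~left).sum() * gini(y[~left])) / n
            if score < best[2]:
                best = (j, t, score)
    return best[0], best[1]

# Toy data: columns = (farm size, prior contacts); y = 1 if the unit responded
X = np.array([[50, 1], [60, 2], [400, 1], [500, 3], [450, 2], [70, 1]], dtype=float)
y = np.array([1, 1, 0, 0, 0, 1])
print(best_split(X, y))   # the rule "farm size <= breakpoint" separates the groups
```

Growing a full tree simply repeats `best_split` on each resulting subgroup until a stopping rule is met; the terminal nodes then define groups with distinct response propensities that can be targeted during data collection.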
Despite this research, a number of important and foundational "unknowns" remain, which McCarthy summarized as follows: Is nonresponse affecting estimates? Is there bias after nonresponse adjustment? What are the important predictors of nonresponse? Can these predictors be used to increase response? And who are the "important" nonrespondents?