3

Mitigating the Consequences of Nonresponse

Survey nonresponse has consequences, most notably the potential for nonresponse bias. The techniques and procedures for dealing with nonresponse bias depend on how one approaches the problem. Singer (2006) points out that statisticians have been concerned mainly with imputation and weighting as ways of adjusting for the bias introduced by nonresponse, while social scientists and survey methodologists have tended to focus on measuring, understanding, and reducing the nonresponse rates themselves.

Because of the negative effect that nonresponse can have on survey quality, in recent years survey researchers and managers have been responding very aggressively to the problem of growing nonresponse in surveys. However, as Couper (2011a) observed in the panel’s workshop, the approaches have become more selective, and survey researchers are rejecting the older approach that aimed simply to maximize the overall response rate. The newer approaches target interventions at subgroups, at domains of interest, and at maximizing the response from specific cases based on their perceived special contribution to the quality of key statistics. Survey researchers are focusing less attention on increasing overall rates and are increasingly focusing on understanding the causes and correlates of nonresponse and making adjustments based on that understanding.

In this chapter, we explore some of the ways in which survey methodologists and managers are responding to the growing problem of survey nonresponse. We outline the results of some very good work that has gone into the development of weighting adjustments and adjustment models, and we document the increased use of paradata in nonresponse adjustment.

NONRESPONSE IN SOCIAL SCIENCE SURVEYS
Some of this work is in early stages, and other work is more advanced. We
make several recommendations for research to solidify and further advance
these lines of development.
NONRESPONSE WEIGHTING ADJUSTMENT METHODS[1]
The need for nonresponse adjustment arises because probability samples, in which all units have a known, positive probability of selection, require complete responses. Without other non-sampling errors, estimators for probability samples are approximately design-unbiased, consistent, and measurable. Base weights, or inverse probability selection weights, can be used to implement standard estimators.
One possible simple estimator is the ratio mean, which is approximately unbiased and consistent for the population mean:

$$\hat{\bar Y}_\pi = \frac{\hat Y}{\hat N} = \frac{\sum d_i y_i}{\sum d_i},$$

with $d_i$ equal to the inverse of the probability of selection. If some of the sample units do not respond (unit nonresponse), and the estimator is unchanged, then the estimator may be biased:

$$\hat{\bar Y}_0 = \frac{\sum_r d_i y_i}{\sum_r d_i}.$$
The bias can be expressed in two ways:
(1) A deterministic framework assumes the population contains a stratum of respondents and a stratum of nonrespondents. Let the population means in the two strata be $\bar Y_r$ and $\bar Y_{nr}$, respectively. The respondent stratum comprises a proportion $R$ of the population of size $N$, and the bias of the unadjusted estimator is

$$\mathrm{bias}\left(\hat{\bar Y}_0\right) = (1 - R)\left(\bar Y_r - \bar Y_{nr}\right).$$

In the deterministic view, the bias arises when the means of the respondents and of the nonrespondents differ.
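The deterministic formula can be illustrated with a quick calculation; the numbers below are hypothetical, chosen only to show the mechanics:

```python
# Hypothetical illustration of the deterministic bias formula:
# bias(Ybar_0) = (1 - R) * (Ybar_r - Ybar_nr)
R = 0.70        # respondent stratum as a proportion of the population
ybar_r = 52.0   # population mean in the respondent stratum
ybar_nr = 60.0  # population mean in the nonrespondent stratum

bias = (1 - R) * (ybar_r - ybar_nr)
print(round(bias, 2))  # -2.4: the unadjusted respondent mean understates the true mean
```

With 30 percent nonresponse and an 8-point gap between strata, the respondent mean is off by 2.4 points regardless of sample size.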
(2) A stochastic framework assumes every unit in the population has
some non-zero probability of responding (its response propensity). The bias
of the unadjusted estimator is
[1] The discussion of nonresponse weighting adjustment methods is abstracted from the presentation by Michael Brick at the panel's workshop (Brick, 2011).

$$\mathrm{bias}\left(\hat{\bar Y}_0\right) \approx \frac{1}{N\bar\varphi}\sum_{i=1}^{N}\left(Y_i - \bar Y\right)\left(\varphi_i - \bar\varphi\right),$$

where $\varphi_i$ is an individual response propensity and $\bar\varphi$ is the mean of the response propensities. Thus, in the stochastic view, the bias arises when the characteristic and response propensity covary.
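The stochastic approximation can be checked by simulation. The sketch below uses an entirely hypothetical population in which the response propensity rises with the survey variable, and compares the bias predicted by the covariance formula with the empirical bias of the unadjusted respondent mean:

```python
import random

# Hypothetical population in which y and the response propensity covary.
random.seed(0)
N = 100_000
y = [random.gauss(50.0, 10.0) for _ in range(N)]
# Propensity rises with y (clipped to a plausible range).
phi = [min(max(0.3 + 0.004 * (yi - 50.0), 0.05), 0.95) for yi in y]

ybar = sum(y) / N
phibar = sum(phi) / N

# Predicted bias: (1 / (N * phibar)) * sum((Y_i - Ybar)(phi_i - phibar))
predicted = sum((yi - ybar) * (pi - phibar)
                for yi, pi in zip(y, phi)) / (N * phibar)

# Empirical bias of the unadjusted respondent mean under those propensities.
resp = [yi for yi, pi in zip(y, phi) if random.random() < pi]
empirical = sum(resp) / len(resp) - ybar
print(round(predicted, 2), round(empirical, 2))  # the two agree closely
```

If the propensities were constant (the MCAR case discussed below), the covariance term, and hence the predicted bias, would be zero.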
A natural adjusted estimator is then

$$\hat{\bar Y}' = \frac{\sum_r d_i \hat\varphi_i^{-1} y_i}{\sum_r d_i \hat\varphi_i^{-1}},$$

where $\hat\varphi_i$ is an estimate of the response propensity and $d_i \hat\varphi_i^{-1}$ represents the adjusted weight $w_i'$.
The selection of the weighting framework, deterministic or stochastic, depends then on the theoretical model of the response mechanism. In other words, the underlying model is the rationale for the selection of the adjustment scheme.
Most adjustments now in use assume that the missing data are missing completely at random (MCAR) or missing at random (MAR). The MCAR assumption holds if all of the units in the population have the same probability of responding, that is, if the respondents are just a smaller random sample. MCAR means that the distribution of the missingness (an indicator for whether the unit responds or not) is independent of the y-variable and all auxiliary (or x) variables.
MAR is a more realistic assumption than MCAR. MAR implies that the probability of response does not depend on the y-variable once we control for a vector of known x-variables. Weighting class adjustment schemes that define subgroups using the auxiliary data, assuming that the sample units within the subgroups ($h = 1, \ldots, H$) have the same response propensity, are consistent with the MAR assumption. These methods adjust the weights for respondents in the group with $\hat\varphi_i = \hat\varphi_h \;\forall\; i \in h$.
This type of estimator is either a weighting-class estimator or a post-stratified estimator, depending on the type of data available for computing the adjustment. If the data are at the sample level (known for sampled units but not for the entire population), it is a weighting-class estimator; if the data are at the population level, then it is a post-stratified estimator. The adjustment requires that sample members can be divided into cells using the vector of observable characteristics.

In the weighting-class approach, the adjusted weight is calculated in four stages:

1. A base weight is calculated that is the reciprocal of the probability of selection of the case under the sample design.
2. When there is nonresponse and the eligibility of the nonrespondents cannot be determined, the base weights for these nonrespondents are distributed into the eligible nonresponse category based on the proportion of the weights that are eligible in the respondent set.
3. These weights are adjusted to compensate for eligible nonrespondents.
4. A final weight for the eligible respondent cases is computed as the product of the base weight, the eligibility adjustment factor, and the nonresponse adjustment factor (Yang and Wang, 2008).
In choosing weighting classes for the adjustment in stage 3, bias is eliminated when the variables and classes are such that either:

(1) $\hat\varphi_i = \hat\varphi_h \;\forall\; i \in h$, or
(2) $Y_i = \bar Y_h \;\forall\; i \in h$.

That is, the bias from nonresponse is eliminated if, within the weighting cells, all cases have the same response propensity or the same value for the survey variable. As noted in the stochastic model of nonresponse, nonresponse bias only exists when the response propensities and the outcomes are correlated. However, since most surveys have a multitude of survey outcomes, the idea of using classes that are related only to the response propensities is commonly adopted.
Models constructed to meet condition (1) are called response propensity stratification, and those designed to meet condition (2) are referred to as predicted mean stratification. The classes themselves are sometimes formed by subject matter experts, based on information on the key survey outcomes.
An empirical method that is often used with categorical data is to form weighting classes by using classification software such as CART, CHAID, or SEARCH. Often the dependent variable is the response indicator (respondent or not), and sometimes the survey outcomes are used as dependent variables, depending on the criteria being used. In either case, this approach may result in a very large number of weighting classes. Eltinge and Yansaneh (1997) suggested methods to test whether appropriate classes are formed.

Several alternative methods of making these adjustments are also in use. We describe some of these alternatives below.

Propensity Model Approach
The propensity model approach uses multiple logistic regression analysis (or some similar approach) to examine the nonresponse mechanism and calculate a nonresponse adjustment. In this method a response indicator is regressed on a set of independent variables, such as those used to define weighting class cells. A predicted value derived from the regression equation is called the propensity score, which is simply an estimated response probability (Rosenbaum and Rubin, 1983). Survey population members with the same observable characteristics are assigned the same propensity score. The response propensity can be used to adjust directly by using the inverse of the estimated response propensity to adjust the weights for the respondents. This is response propensity stratification, although in many cases the propensity scores are used to divide the sample into propensity classes based on quintiles of the distribution, and the average propensity within the class is the adjustment.
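The quintile-class variant can be sketched end to end. Everything below is hypothetical: a single auxiliary variable x known for all sampled units, a response indicator generated from an assumed logistic model, a simple gradient-ascent fit standing in for a production logistic regression routine, and base weights of 1:

```python
import math
import random

# Hypothetical sample: auxiliary x known for every sampled unit; the true
# (unknown) response probability rises with x.
random.seed(1)
n = 2000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
true_p = [1.0 / (1.0 + math.exp(-(0.2 + 0.8 * xi))) for xi in x]
r = [1 if random.random() < p else 0 for p in true_p]

# Fit logistic regression of the response indicator on x by gradient ascent
# (a stand-in for any standard logistic regression routine).
b0 = b1 = 0.0
for _ in range(500):
    g0 = g1 = 0.0
    for xi, ri in zip(x, r):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        g0 += ri - p
        g1 += (ri - p) * xi
    b0 += 2.0 * g0 / n
    b1 += 2.0 * g1 / n

scores = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]

# Divide the sample into propensity quintiles; the adjustment applied to a
# respondent is the inverse of the average score in its class.
order = sorted(range(n), key=lambda i: scores[i])
classes = {i: min(4, 5 * rank // n) for rank, i in enumerate(order)}
class_mean = {
    c: sum(scores[i] for i in range(n) if classes[i] == c)
       / sum(1 for i in range(n) if classes[i] == c)
    for c in range(5)
}

# Base weights of 1; adjusted weights for respondents only.
adjusted = [1.0 / class_mean[classes[i]] for i in range(n) if r[i] == 1]
print(round(sum(adjusted)))  # close to n, the full sample size
```

Using the class-average propensity rather than each unit's own inverse score is exactly the stabilization described above: it caps the size of any individual weight adjustment.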
Advantages of propensity-weighting methods over the traditional weighting-class methods are that continuous variables can be used to define cells, the models can accommodate a large number of variables, and the technique is simple to apply (Hazelwood et al., 2007). If large adjustments are avoided by using classes rather than the inverse of the estimated response propensities, then the methods can be as stable as other weighting-class methods. Like other response probability adjustments, this approach also implicitly assumes that one weighting adjustment is sufficient to address nonresponse bias in all estimates.
Selection Models
Heckman (1979) first proposed the sample selection model for regressions. The model is based on the observation that sample members self-select out of a survey, either explicitly by refusing to participate or implicitly through inability to answer or be contacted.
Selection models are the conventional method among empirical economists for modeling samples with nonresponse (or other types of selectivity). Sampling statisticians have viewed this approach with skepticism, mainly because most selection models make strong assumptions about the nonresponse mechanism that may not hold in practice. Selection models are also typically variable-specific solutions, in the sense that the model is constructed for one particular estimate and cannot be used for a wide variety of statistics. While this feature can be of benefit because selection models may improve the quality of individual estimates, survey users are typically interested in producing many statistics. The goal is most often a consistency among estimates (e.g., the sum of the estimates for males and females should equal the total) that the selection models do not confer.
The most popular form of selection model requires an explicit distributional assumption. In principle, different selectivity corrections could be made from a given set of data, depending on the model to be estimated.
Raking Ratio Adjustment Approach
Raking ratio adjustments are used to benchmark sampling weights to known control totals and can be considered a form of multidimensional post-stratification. This approach reduces sampling error through the use of auxiliary variables correlated with survey response and has been used to reduce nonresponse bias (Brick et al., 2008). The advantage of raking is that more variables that are correlated with response propensities and outcome variables can be included in the weighting process without creating large weight adjustments. As with post-stratification and weighting-class methods, careful review of the weights is required to make sure large weight adjustments are not introduced by the raking process.
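Raking is computed by iterative proportional fitting: weights are repeatedly ratio-adjusted to each margin in turn until all margins match their controls. A minimal sketch on a hypothetical 2x2 table (the cell counts and control totals are invented for illustration):

```python
# Minimal raking (iterative proportional fitting) sketch on a hypothetical
# 2x2 table of respondent weighted counts, with known control totals for
# each margin (e.g., sex by age group).
weights = {(0, 0): 20.0, (0, 1): 30.0, (1, 0): 25.0, (1, 1): 25.0}
row_controls = [60.0, 60.0]  # known population totals for the row variable
col_controls = [55.0, 65.0]  # known population totals for the column variable

for _ in range(100):  # alternate ratio adjustments until both margins match
    for rix, target in enumerate(row_controls):
        s = sum(w for (ri, _), w in weights.items() if ri == rix)
        for key in [k for k in weights if k[0] == rix]:
            weights[key] *= target / s
    for cix, target in enumerate(col_controls):
        s = sum(w for (_, ci), w in weights.items() if ci == cix)
        for key in [k for k in weights if k[1] == cix]:
            weights[key] *= target / s

row_sums = [sum(w for (ri, _), w in weights.items() if ri == rr)
            for rr in range(2)]
col_sums = [sum(w for (_, ci), w in weights.items() if ci == cc)
            for cc in range(2)]
print(row_sums, col_sums)  # both margins now match the controls
```

Only the margins are controlled; the interior cells are free, which is what lets raking handle many margins at once without the empty-cell problem of full cross-classified post-stratification.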
Calibration
Post-stratification and raking are two specific methods of calibration, as described by Särndal (2007). Calibration is a method of computing weights in a manner that equates the sum of the calibrated weights to totals defined by auxiliary information. Calibrated weights can then be used to produce estimates of totals and other finite population parameters that are internally consistent, as discussed above for raking.
Calibration is used to correct for survey nonresponse (as well as for
coverage error resulting from frame undercoverage or unit duplication).
Kott and Chang (2010) showed that calibration weighting treats response
as an additional phase of random sampling. This method is particularly
valuable when many important auxiliary variables are related either to
response propensity or to the key survey outcomes. As a result, it has been
heavily studied for use in countries with population registers or when the
sampling frame is rich in auxiliary data, such as in establishment surveys.
Mixture Models
Selection models can be thought of as expressing the joint distribution of the outcome and "missingness" as the product of the distribution of the missing data mechanism conditional on the outcome variable and the marginal distribution of the outcome variable. An alternative approach is to write the joint distribution as the product of the distribution of the outcome conditional on the missingness mechanism and the marginal distribution of the missingness mechanism. In some situations the two approaches do not result in the same estimates. Little (1993) described the difference in the two approaches and discussed when pattern mixture models might be preferred.
Observations About Weighting Adjustment Approaches
All of these weighting adjustment schemes depend very heavily on the availability of auxiliary data that are highly correlated with either the response propensities or the key outcomes. Without these types of data, the adjustments are ineffective in reducing nonresponse bias. As response rates decline, these weighting adjustments may become even more important tools for producing high-quality survey estimates.
Recommendation 3-1: More research is needed on the use of auxiliary
data for weighting adjustments, including whether weighting can make
estimates worse (i.e., increase bias) and whether traditional weighting
approaches inflate the variance of the estimates.
In his summary, Brick (2011) makes the case for the development and
refinement of survey theory, suggesting that empirical adjustment methods
may work in many cases, but pointing out that they are unsatisfying in
several ways. Some possible paths to a solution would be to develop a more
comprehensive theory relating response mechanisms to nonresponse bias,
and a more comprehensive statistical theory of adjustment to deal with
different types of statistics.
Recommendation 3-2: Research is needed to assist in understanding
the impact of adjustment procedures on estimates other than means,
proportions, and totals.
USE OF PARADATA IN REDUCING NONRESPONSE AND NONRESPONSE BIAS
There is a growing interest in paradata, that is, data about the process by which the survey data were collected, obtained in the course of conducting the survey. Paradata encompass information about the interviews (times of day interviews were conducted and how long the interviews took); about the contacts (how many times contact was made with each sample person or how many attempts to contact the sample person were made, and the apparent reluctance of the sample person); as well as about survey modes (such as phone, Web, e-mail, or in person). These data have many uses. They help in managing the survey operation (scheduling and evaluating interviewers) and assessing its costs. They are also important for understanding the findings of a survey and making inferences about nonrespondents. Indeed, there is a long history in the research literature of collecting additional data (what has become known as paradata) to deal with nonresponse.
Hansen and Hurwitz (1946) suggested two-phase sampling, with the second phase of sampling looking at nonrespondents by using an intensive follow-up of units selected for the second phase. If data can be collected from all the sampled second-phase nonrespondents, then standard two-phase sampling weights can be developed to eliminate nonresponse bias. Even with an incomplete response at the second phase, the potential for bias can be reduced by using information from the additional second-phase sample.
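The two-phase idea can be sketched with a simulation. All quantities below are hypothetical: respondents drawn with mean 52, nonrespondents (about 30 percent of the sample) with mean 60, so the full-sample mean is about 54.4 and the respondent-only mean is biased; a 1-in-k subsample of nonrespondents is assumed to be obtained completely through intensive follow-up:

```python
import random

# Hypothetical sample with a respondent stratum (mean 52) and a
# nonrespondent stratum (mean 60, ~30% of cases); full mean is about 54.4.
random.seed(2)
sample = [(random.gauss(52.0, 10.0), True) if random.random() < 0.7
          else (random.gauss(60.0, 10.0), False) for _ in range(20_000)]
respondents = [y for y, responded in sample if responded]
nonrespondents = [y for y, responded in sample if not responded]

# Phase 2: subsample nonrespondents at rate 1/k; intensive follow-up is
# assumed to obtain data from all of them, so their weights are scaled by k.
k = 4
followup = [y for y in nonrespondents if random.random() < 1.0 / k]

naive = sum(respondents) / len(respondents)
two_phase = ((sum(respondents) + k * sum(followup))
             / (len(respondents) + k * len(followup)))
print(round(naive, 1), round(two_phase, 1))  # two-phase recovers ~54.4
```

The naive respondent mean stays near 52 no matter how large the sample, while the two-phase estimator recovers the full-sample mean from a follow-up of only a quarter of the nonrespondents.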
Significant advances have been made in the state of the science for using paradata for reducing nonresponse bias (Olson, 2013). Today there are two main options for reducing such bias. One is to use paradata to introduce new design features to recruit uncontacted or uncooperative sample members (and, one hopes, respondents with different characteristics) into the respondent pool. The new design features rely on the use of paradata in a "responsive" design. The second option is to use paradata in adjusting the base weights of the respondents. A third use of paradata is to better understand the survey participation phenomenon so that future surveys may reduce nonresponse, but this use does not reduce nonresponse bias for the survey at hand.
The initial focus of research on paradata was to explore nonresponse
rates. The types of paradata that were considered as predictors of response
were respondent-voiced concerns, the presence of a locked entrance or other
safety measures, a multiunit building, and an urban setting (Campanelli et
al., 1997; Groves and Couper, 1998).
More recently the work on paradata has taken a new direction and has focused more on the reduction of nonresponse bias by using paradata in responsive designs or in weighting adjustments. Adjustments that are effective in reducing nonresponse bias must be based on data that are predictive of the likelihood of participating in a survey, of the key survey outcomes, or both (Little, 1986; Kalton and Flores-Cervantes, 2003; Little and Vartivarian, 2005; Groves, 2006; Kreuter et al., 2010).
The current challenges for paradata research are to enhance the underlying theory (e.g., what paradata are correlated with both response propensities and outcome measures); to better understand measurement error in the paradata and what effect these errors have on the utility of the paradata for reducing nonresponse bias; to operationally assign new tasks for interviewers that are feasible and do not detract from their ability to conduct the interviews; and to better understand the environment for the interview (such as doorstep interactions, reasons for non-participation, and the quality of interviewer observations of the neighborhoods and the housing unit) so that better paradata measures can be developed.
Generally, the quality of paradata is relatively good if the data are automatically generated. When interviewers are asked to collect additional data that are not a byproduct of the data collection process, there is often a drop in quality. Additional data collection requirements often lead to substantial missing data rates. Likewise, when interviewer judgment is required, the data are of varying quality (see Casas-Cordero, 2010; Kreuter and Casas-Cordero, 2010; McCulloch et al., 2010; and West and Olson, 2010).
Kreuter holds that paradata carry a compelling theoretical potential
for nonresponse adjustment. With paradata, the development of proxy
variables is possible, and it is also possible to identify large variations in
correlations across outcome variables. However, research has shown that
although interviewers are good at making observations that are relevant for
the primary act of data collection, they can have difficulty in collecting the
additional proxy variables (Ys).
According to Kreuter, a case can be made for further collaboration with subject-matter experts, statisticians, psychologists, and fieldwork staff in the refinement of paradata; these are examples of areas in which further investigation may be fruitful. Such collaboration would be useful, for instance, with substantive researchers to develop interviewer observation measures for labor force surveys (at-home pattern), health surveys (too ill to participate), housing surveys (condition of the dwelling), crime surveys (bars on windows), and educational surveys (literacy). Collaboration with statisticians could help improve statistical models, providing answers to such questions as how to balance multiple predictors of response and Ys, how to handle large and messy data, how to model and cluster sequences of unequal length, and how to address issues of discrete times and mixed processes. Psychologists, in collaboration with survey methodologists, could aid in understanding the factors that drive errors in interviewer observation, how training could improve ratings, how much error can be tolerated, and what the quality is relative to other sources. Finally, collaboration with fieldwork staff could help identify the costs associated with paradata collection as well as cheaper alternatives, the risks in interviewer multitasking, the appropriate level of observation, and ethical and legal issues that need to be resolved.
CONCLUDING OBSERVATION
A focus on reducing nonresponse by using paradata in responsive designs, or through other means, should not lead to neglect of other sources of error in survey estimates. In particular, measurement error (overreporting, underreporting, or misreporting) should be addressed, particularly in terms of the possible interaction with nonresponse.
Recommendation 3-3: Research is needed on the impact that reduction of survey nonresponse would have on other error sources, such as measurement error.