5
Principles and Methods of Sensitivity Analyses
This chapter concerns principles and methods for sensitivity analyses that quantify the robustness of inferences to departures from underlying assumptions. Unlike the well-developed literature on drawing inferences from incomplete data, the literature on the assessment of sensitivity to various assumptions is relatively new. Because it is an active area of research, it is more difficult to identify a clear consensus about how sensitivity analyses should be conducted. However, in this chapter we articulate a consensus set of principles and describe methods that respect those principles.

We begin by describing in some detail the difficulties posed by reliance on untestable assumptions. We then demonstrate how sensitivity to these assumptions can be represented and investigated in the context of two popular models, selection and pattern mixture models. We also provide case study illustrations to suggest a format for conducting sensitivity analyses, recognizing that these case studies cannot cover the broad range of types and designs of clinical trials. Because the literature on sensitivity analysis is evolving, the primary objective of this chapter is to assert the importance of conducting some form of sensitivity analysis and to illustrate principles in some simple cases. We close the chapter with recommendations for further research on specific aspects of sensitivity analysis methodology.
BACKGROUND
There are fundamental issues involved with selecting a model and assessing its fit to incomplete data that do not apply to inference from complete data. Such issues occur even in the missing at random (MAR) case, but they are compounded under missing not at random (MNAR). We believe that, especially when the primary analysis assumes MAR, the fit of an MAR model can often be addressed by standard model-checking diagnostics, leaving the sensitivity analysis to MNAR models that deviate from MAR. This approach is suggested in order not to overburden the primary analysis. The discussion in Chapter 4 provides some references for model-checking of MAR models. In addition, with MAR missingness mechanisms that deviate markedly from missing completely at random (MCAR), as in the hypertension example in Chapter 4, analyses with incomplete data are potentially less robust to violations of parametric assumptions than analyses with complete data, so checking them is even more critical.
The data can never rule out an MNAR mechanism, and when the data are potentially MNAR, issues of sensitivity to modeling assumptions are even more serious than under MAR. One approach could be to estimate from the available data the parameters of a model representing an MNAR mechanism. However, the data typically do not contain information on the parameters of the particular model chosen (Jansen et al., 2006).

In fact, different MNAR models may fit the observed data equally well but have quite different implications for the unobserved measurements and hence for the conclusions to be drawn from the respective analyses. Without additional information, one cannot usefully distinguish between such MNAR models based solely on their fit to the observed data, and so goodness-of-fit tools alone do not provide a relevant means of choosing between such models.
These considerations point to the necessity of sensitivity analysis. In a broad sense, one can define a sensitivity analysis as one in which several statistical models are considered simultaneously or in which a statistical model is further scrutinized using specialized tools, such as diagnostic measures. This rather loose and very general definition encompasses a wide variety of useful approaches.

A simple procedure is to fit a selected number of (MNAR) models, all of which are deemed plausible and have equivalent or nearly equivalent fit to the observed data; alternatively, a preferred (primary) analysis can be supplemented with a number of modifications. The degree to which conclusions (inferences) are stable across such analyses provides an indication of the confidence that can be placed in them.

Modifications to a basic model can be constructed in different ways. One obvious strategy is to consider various dependencies of the missing data process on the outcomes or the covariates. One can choose to supplement an analysis within the selection modeling framework, say, with one or several in the pattern mixture modeling framework, which explicitly models the missing responses at any given time given the previously observed responses. Alternatively, the distributional assumptions of the models can be altered.

The vast range of models and methods for handling missing data highlights the need for sensitivity analysis. Indeed, research on methodology has shifted from formulation of ever more complex models to methods for assessing sensitivity of specific models and their underlying assumptions.

The paradigm shift to sensitivity analysis is, therefore, welcome. Prior to focused research on sensitivity, many methods used in practice were potentially useful but ad hoc (e.g., comparing several incompatible MNAR models to each other). Although informal sensitivity analyses are an indispensable step in the analysis of incomplete longitudinal data, it is desirable to have more formal frameworks within which to develop such analyses.
It is possible to assess model sensitivities of several different types, including sensitivity to: (a) distributional assumptions for the full data, (b) outlying or influential observations, and (c) assumptions about the missing data mechanism. Assessment of (a) can be partially carried out to the extent that one can compare observed and fitted values for the observables under the model specified for the full data. However, distributional assumptions for the missing data cannot be checked. Assessment of (b) can be used to identify observations that are outliers in the observed-data distribution or that may be driving weakly identified parts of an MNAR model (Molenberghs and Kenward, 2007). This chapter focuses on (c), sensitivity to assumptions about the missing data mechanism.
FRAMEWORK
To focus ideas, we restrict consideration to follow-up randomized study designs with repeated measures. We consider the case in which interest is focused on treatment comparisons of visit-specific means of the repeated measures. With incomplete data, inference about the treatment arm means requires two types of assumptions: (i) untestable assumptions about the distribution of missing outcomes data, and (ii) testable assumptions about the distribution of observed outcomes. Recall that the full-data distribution, described in Chapter 4, can be factored as

[Yobs, Ymis, M | X] = [Yobs, M | X] × [Ymis | Yobs, M, X]. (1)

Type (i) assumptions are needed to estimate the distribution [Ymis | Yobs, M, X], while type (ii) assumptions are used, if necessary, to model the observables [Yobs, M | X] in a parsimonious way.
Type (i) assumptions are necessary to identify the treatment-specific means. Informally, a parameter is identified if one can write its estimator as a function that depends only on the observed data. When a parameter is not identified, it would not be possible to obtain a point estimate even if the sample size were infinite. It is therefore essential to conduct a sensitivity analysis, whereby the data analysis is repeated under different type (i) assumptions, in order to clarify the extent to which the conclusions of the trial are dependent on unverifiable assumptions. The usefulness of a sensitivity analysis ultimately depends on the transparency and plausibility of the unverifiable assumptions. It is key that any sensitivity analysis methodology allow the formulation of these assumptions in a transparent and easy-to-communicate manner.
Ultimately, type (i) assumptions describe how missing outcomes are being "imputed" under a given model. A reasonable way to formulate these assumptions is in terms of the connection (or link) between the distributions of those having missing and those having observed outcomes but similar covariate profiles. Making this link explicit is a feature of pattern mixture models. Examples discussed in this chapter illustrate both pattern mixture and selection modeling approaches.
In general, it is also necessary to impose type (ii) assumptions. An important consideration is that modeling assumptions of type (ii), which apply to the distribution of observed data, can be supported and scrutinized with standard model-checking techniques.

Broadly speaking, there are two approaches for combining type (i) and type (ii) assumptions to draw inferences about the treatment-specific means: pattern mixture and selection modeling. To illustrate these approaches, the next four sections present four example designs of increasing complexity. The first two examples involve a single outcome, without and then with auxiliary data. These examples are meant to illustrate when and why the assumptions of types (i) and (ii) are needed. The third and fourth examples extend the designs to those with repeated measures, with monotone and non-monotone missing data, respectively, and with and without auxiliary data.
Our examples are not meant to be prescriptive as to how every sensitivity analysis should be conducted, but rather to illustrate principles that can guide practice. Type (i) assumptions can only be justified on substantive grounds. As the clinical contexts vary between studies, so too will the specific form of the sensitivity analysis.
EXAMPLE: SINGLE OUTCOME, NO AUXILIARY DATA
We start with the simple case in which the trial records no baseline covariate data, and the only measurement to be obtained in the study is that of the outcome Y, taken at a specified time after randomization. We assume that the treatment-arm-specific means of Y form the basis for treatment comparisons and that in each arm there are some study participants on whom Y is missing. We let R = 1 if Y is observed and R = 0 otherwise.

Because estimation of each treatment arm mean relies solely on data from subjects assigned to that arm, the problem reduces to estimation of a mean E(Y) based on a random sample with Y missing in some units. Thus, formally, the problem is to estimate μ = E(Y) from the observed data, which comprise the list of indicators R and the values of Y for those having R = 1.
The MAR assumption described in Chapter 4 is a type (i) assumption. In this setting, MAR means that, within each treatment arm, the distribution of Y among respondents (i.e., those with R = 1) is the same as that for nonrespondents (i.e., those with R = 0).

This example illustrates several key ideas. First, it vividly illustrates the meaning of an untestable assumption. Let μ1 = E(Y | R = 1) denote the mean among respondents, μ0 = E(Y | R = 0) the mean among nonrespondents, and π = P(R = 1) the proportion of those responding. The full-data mean μ is a weighted average,

μ = πμ1 + (1 − π)μ0, (2)

but there is no information in the data about the value of μ0. Hence, any assumption one makes about the distribution for the nonrespondents will be untestable from the data available. In particular, the MAR assumption, that μ1 = μ0, is untestable.
Second, this example also illustrates the identifiability (or lack thereof) of a parameter. Without making assumptions about μ0, the full-data mean μ cannot be identified (estimated) from the observed data. However, if one is prepared to adopt an untestable assumption, μ will be identified. For example, assuming MAR is equivalent to setting μ1 = μ0. From (2), MAR implies that μ = μ1, that is, the full-data mean equals the mean among those with observed Y. Hence, under MAR, a valid estimate of μ1 is also valid for μ. A natural choice is the sample mean among those with observed data, namely,

μ̂1 = Σi RiYi / Σi Ri.
Third, this example is the simplest version of a pattern mixture model: the full-data distribution is written as a mixture, or weighted average, of the observed and missing data distributions. Under MAR, their means are equal. However, it is more typical to use pattern mixture models when the means are not assumed to be equal (MNAR).
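As a small numeric illustration of equation (2), with hypothetical numbers rather than data from any trial, the observed data determine π and μ1 but leave μ0, and hence μ, entirely unconstrained:

```python
# Hypothetical illustration of equation (2): mu = pi*mu1 + (1 - pi)*mu0.
# The observed data identify pi and mu1; mu0 is completely unconstrained.
pi_hat, mu1 = 0.8, 10.0  # response proportion and respondent mean (estimable)

def full_data_mean(mu0):
    """Full-data mean implied by an assumed nonrespondent mean mu0."""
    return pi_hat * mu1 + (1 - pi_hat) * mu0

print(full_data_mean(10.0))  # under MAR (mu0 = mu1)
print(full_data_mean(5.0))   # under one MNAR assumption
```

Every choice of mu0 yields a different, equally data-compatible value of the full-data mean, which is exactly what "untestable" means here.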
By contrast, in the selection model approach, type (i) assumptions are made in terms of how the probability of nonresponse relates to the possibly unobserved outcome. The full-data mean can be estimated using a weighted average of the observed outcomes, where the weights are individual-specific and correspond to the inverse of the conditional probability of being observed given the outcome value. The reweighting serves to create a "pseudo-population" of individuals who are representative of the intended full-data sample of outcomes.
Importantly, there is a one-to-one relationship between the specification of a selection model and the specification of a pattern mixture model. The key distinction ultimately arises in how type (ii) assumptions are imposed. As it turns out, the two approaches generate equivalent estimators in this simple example, but for more complex models that rely on type (ii) assumptions to model the observed data, that is not the case.
Pattern Mixture Model Approach
Because we are only interested in the mean of Y, it suffices to make assumptions about how the mean of Y among nonrespondents links to the mean of Y among respondents. A simple way to accomplish this is by introducing a sensitivity parameter Δ that satisfies μ0 = μ1 + Δ, or

E(Y | R = 0) = E(Y | R = 1) + Δ. (3)

It is easy to see that Δ = μ0 − μ1, the difference in means between nonrespondents and respondents. To accommodate general measurement scales, the model should be parameterized so that the sensitivity parameter satisfies an identity such as

μ0 = g⁻¹{g(μ1) + Δ}, (4)

where g( ) is a function, specified by the data analyst, that is strictly increasing and maps values from the range of Y to the real line. The function g determines the investigator's choice of scale for comparisons between the respondents' and nonrespondents' means and is often guided by the nature of the outcome.
For a continuous outcome, one might choose g(u) = u, which reduces to the simple contrast in means given by (3), where Δ represents the difference in mean between nonrespondents and respondents.

For binary outcomes, a convenient choice is g(u) = log{u/(1 − u)}, which ensures that μ0 lies between 0 and 1. Here, Δ is the log odds ratio comparing the odds of Y = 1 between nonrespondents and respondents.
Each value of Δ corresponds to a different unverifiable assumption about the mean of Y in the nonrespondents. Any specific value of Δ corresponds to an estimate of μ because μ can be written as the weighted average

μ = πμ1 + (1 − π) g⁻¹{g(μ1) + Δ}. (5)

After fixing Δ, one can estimate μ by replacing μ1 and π with their sample estimators μ̂1 and π̂. Formulas for standard error estimators can be derived from standard Taylor expansions (the delta method), or one can use the bootstrap.

To examine how inferences concerning μ depend on unverifiable assumptions about the missing data distribution, notice that μ is actually a function of Δ in (5). Hence, one can proceed by generating an estimate of μ for each value of Δ that is thought to be plausible. In this model, Δ = 0 corresponds to MAR; hence, examining inferences about μ over a set or range for Δ that includes Δ = 0 will summarize the effects of departures from MAR on inferences about μ.
For fixed Δ, assumption (4) is of type (i). In this simple setting, type (ii) assumptions are not needed because μ1 and π can be estimated with sample means, and no modeling is needed.
Finally, to test for treatment effects between two arms, one adopts a value Δ0 for the first arm and a value Δ1 for the second arm. One then estimates each mean separately under the adopted values of Δ and conducts a Wald test that their difference is zero. To investigate how the conclusions depend on the adopted values of Δ, one repeats the testing over a range of plausible values for the pair (Δ0, Δ1).
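A minimal sketch of this procedure in Python (simulated single-arm data, with the shift applied on the identity scale, g(u) = u):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-arm data: Y is only usable where R == 1.
n = 200
R = rng.binomial(1, 0.75, n)     # response indicators
Y = rng.normal(10.0, 2.0, n)     # outcomes; Y[R == 0] would be unseen

pi_hat = R.mean()                # estimate of pi
mu1_hat = Y[R == 1].mean()       # estimate of mu1

def mu_hat(delta):
    """Plug-in estimate of mu under shift delta, equation (5) with g(u) = u."""
    return pi_hat * mu1_hat + (1 - pi_hat) * (mu1_hat + delta)

# Repeat the estimation over a plausible range of delta, including 0 (MAR).
for delta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"delta = {delta:+.1f}: mu_hat = {mu_hat(delta):.3f}")
```

Setting delta to 0 reproduces the respondents' sample mean, and each nonzero delta traces out the estimate under one MNAR departure; in practice standard errors at each delta would come from the delta method or the bootstrap.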
Selection Model Approach
A second option for conducting sensitivity analysis is to assume that one knows how the odds of nonresponse change with the value of the outcome Y. For example, one can assume that the log odds of nonresponse differ by α for those who differ by one unit on Y. This is equivalent to assuming that one knows the value of α (but not h) in the logistic regression model

logit{P(R = 0 | Y = y)} = h + αy. (6)

Models like (6) are called selection models because they model the probability of nonresponse (or selection) as a function of the outcome. Each unique value of α corresponds to a different unverifiable assumption about how the probability of nonresponse changes with the outcome.
The model in (6) is also equivalent to assuming that

p(y | R = 0) = p(y | R = 1) × exp(αy) × const. (7)

Adopting a value of α is equivalent to adopting a known link between the distribution of the respondents and that of the nonrespondents, because one cannot use the data to learn anything about the nonrespondent distribution or to check the value of α. Moreover, one cannot check two other important assumptions: that the log odds of nonresponse is linear in y, and that the support of the distribution of Y among nonrespondents is the same as that among respondents (as implied by (7)).
Although not immediately apparent, once a value of α is adopted, one can estimate μ = E(Y) consistently. A sensitivity analysis consists of repeating the estimation of μ at different plausible values of α so as to assess the sensitivity of inferences about μ to assumptions about the missing data mechanism, as encoded by α and model (6).
Estimation of μ relies on the identity

E(Y) = E[ RY / P(R = 1 | Y) ], (8)

which suggests estimation of μ through inverse probability weighting (see below); in this case, the weights can depend on missing values of Y. The inverse probability weighting estimator is

μ̂IPW = (1/n) Σi RiYi / {1 − expit(ĥ + αYi)}, (9)

where expit(u) = logit⁻¹(u) = exp(u)/{1 + exp(u)}. To compute ĥ, one solves the unbiased estimating equation

Σi [ Ri / {1 − expit(h + αYi)} − 1 ] = 0 (10)

for h.¹ Analytic formulas for consistent standard error estimators are available (e.g., Rotnitzky et al., 1998), but bootstrap resampling can be used.
Sensitivity analysis for tests of treatment effects proceeds by repeating the test over a set of plausible values for α, where different values of α can be chosen for each arm.

With the selection model approach described here, we can conduct sensitivity analysis not just about the mean but about any other component of the distribution of Y, for example, the median of Y. Just as in the preceding pattern mixture approach, the data structure in this setting is so simple that we need not worry about postulating type (ii) assumptions.
¹ Estimation of h by standard logistic regression of R on Y is not feasible because Y is missing when R = 0; the estimator ĥ exploits the identity E[ R / P(R = 1 | Y) ] = 1.
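The following sketch (simulated data and hypothetical parameter values) solves the estimating equation (10) for h at a fixed α by bisection and then evaluates the inverse probability weighting estimator (9). Note that each term with Ri = 0 reduces to −1, so the missing Y values never enter the computation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: full outcomes simulated, then made MNAR-missing.
n = 2000
Y = rng.normal(10.0, 2.0, n)
expit = lambda u: 1.0 / (1.0 + np.exp(-u))
R = rng.binomial(1, 1.0 - expit(-4.0 + 0.3 * Y))  # P(R=0|Y) = expit(-4 + 0.3*Y)

def solve_h(alpha):
    """Solve equation (10): sum_i [ R_i / (1 - expit(h + alpha*Y_i)) - 1 ] = 0.
    Where R_i = 0 the summand equals -1, so unobserved Y_i never matter."""
    def ee(h):
        return np.sum(R / (1.0 - expit(h + alpha * Y)) - 1.0)
    lo, hi = -30.0, 0.0        # ee is increasing in h; bracket the root
    for _ in range(200):       # plain bisection
        mid = 0.5 * (lo + hi)
        if ee(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def mu_ipw(alpha):
    """Equation (9): IPW estimate of E(Y) at a fixed sensitivity value alpha."""
    h = solve_h(alpha)
    return np.sum(R * Y / (1.0 - expit(h + alpha * Y))) / n

# Sensitivity analysis: repeat the estimation over plausible alpha values.
for alpha in (0.0, 0.15, 0.3):
    print(f"alpha = {alpha:.2f}: mu_ipw = {mu_ipw(alpha):.3f}")
```

At alpha = 0 the estimator collapses to the respondents' sample mean (the MAR answer), and varying alpha shows how the estimated mean moves as stronger outcome-dependent missingness is assumed.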

EXAMPLE: SINGLE OUTCOME WITH AUXILIARY DATA
We next consider a setting in which individuals are scheduled to have a measurement Y0 at baseline, which we assume is never missing (this constitutes the auxiliary data), and a second measurement Y1 at some specified follow-up time, which is missing in some subjects. We let R1 = 1 if Y1 is observed and R1 = 0 otherwise. As in the preceding example, we limit our discussion to estimation of the arm-specific mean of Y1, denoted now by μ = E(Y1).

In this example, the type (i) MAR assumption states that, within each treatment group and within levels of Y0, the distribution of Y1 among nonrespondents is the same as the distribution of Y1 among respondents. That is,

[Y1 | Y0, X, R1 = 1] = [Y1 | Y0, X, R1 = 0]. (11)
Pattern Mixture Model Approach
In this and the next section, we demonstrate sensitivity analysis under MNAR. Under the pattern mixture approach one specifies a link between the distribution of Y1 in the nonrespondents and that in the respondents who share the same value of Y0. One can specify, for example, that

E(Y1 | Y0, R1 = 0) = g⁻¹[g{η(Y0)} + Δ], (12)

where η(Y0) = E(Y1 | Y0, R1 = 1) and g is defined as in the example above.

Example: Continuous Values of Y

Suppose Y1 is continuous. One needs a specification of both the sensitivity analysis function g and the relationship between Y1 and Y0, represented by η(Y0). A simple version of η is a regression of Y1 on Y0,

η(Y0) = E(Y1 | Y0, R1 = 1) (13)
      = β0 + β1Y0. (14)

Now let g(u) = u as in the first example above. In this case, using (12), the means of the missing Y1 are imputed as regression predictions of Y1 plus a shift Δ,

E(Y1 | Y0, R1 = 0) = β0 + β1Y0 + Δ. (15)

Hence, at a fixed value of Δ, an estimator of E(Y1 | R1 = 0) can be derived as the sample mean of the regression predictions Ŷ1 = β̂0 + β̂1Y0 + Δ among those with R1 = 0. The estimators β̂0 and β̂1 come from a regression of Y1 on Y0 among those with R1 = 1.

In this case, Δ represents the baseline-adjusted difference in the mean of Y1 between nonrespondents and respondents. If Δ > 0 (Δ < 0), then for any fixed value of Y0, the mean of Y1 among nonrespondents is Δ units higher (lower) than the mean of Y1 among respondents.
A few comments are in order for this example:

• Model (12) assumes that mean differences do not depend on Y0. If one believes that they do, then one may choose a more complex version of the g function, such as

E(Y1 | Y0, R1 = 0) = g⁻¹[g{η(Y0)} + Δ0 + Δ1Y0]. (16)

If this version is coupled with a linear regression for η(Y0), then both the slope and the intercept of that regression will differ for respondents and nonrespondents.

• In general, any user-specified sensitivity function d(Y0, Δ) can be posited, including the simple versions d(Y0, Δ) = Δ and d(Y0, Δ) = Δ0 + Δ1Y0. Importantly, no version of d(Y0, Δ) can be checked using the observed data. The choice of the d function is a type (i) assumption.

• Likewise, more general choices can be made for the form of η(Y0), including versions that are nonlinear in Y0. The choice of η is a type (ii) assumption; it can be critiqued by standard goodness-of-fit procedures using the observed data.
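A sketch of the continuous-outcome computation (simulated data; ordinary least squares stands in for the working regression η(Y0) = β0 + β1Y0):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: baseline Y0 always observed; follow-up Y1 sometimes missing.
n = 500
Y0 = rng.normal(5.0, 1.0, n)
Y1 = 1.0 + 0.8 * Y0 + rng.normal(0.0, 1.0, n)
R1 = rng.binomial(1, 0.7, n)          # response indicators for Y1

# Type (ii) step: fit eta(Y0) = b0 + b1*Y0 among respondents only.
b1_hat, b0_hat = np.polyfit(Y0[R1 == 1], Y1[R1 == 1], 1)

def mu_hat(delta):
    """Estimate of E(Y1): observed outcomes for respondents, shifted
    regression predictions (equation (15)) for nonrespondents."""
    imputed = b0_hat + b1_hat * Y0[R1 == 0] + delta
    return (Y1[R1 == 1].sum() + imputed.sum()) / n

for delta in (-1.0, 0.0, 1.0):
    print(f"delta = {delta:+.1f}: mu_hat = {mu_hat(delta):.3f}")
```

This version averages the observed outcomes directly for respondents; the inference procedure below instead averages the fitted η̂(Y0) over respondents, and both are legitimate plug-ins when the working regression is correct.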
Example: Binary Outcome Y

If Y is binary, the functional forms of g and η will need to be different than in the continuous case. Choosing g(u) = log{u/(1 − u)} implies that Δ is the log odds ratio comparing the odds of Y1 = 1 between nonrespondents and respondents, conditional on Y0. As with the continuous case, Δ > 0 (Δ < 0) implies that, for every level of Y0, nonrespondents are more (less) likely to have Y1 = 1 than respondents.

The function η(Y0), which describes E(Y1 | Y0, R1 = 1), should be specified in terms of a model that is appropriate for binary outcomes. For example, a simple logistic specification is

logit{η(Y0)} = λ0 + λ1Y0, (17)

which is equivalent to writing

η(Y0) = exp(λ0 + λ1Y0) / {1 + exp(λ0 + λ1Y0)}. (18)

When Y0 is binary, this model is saturated. But when Y0 is continuous, or includes other auxiliary covariates, model choice for η will take on added importance.
Inference

A sensitivity analysis to examine how inferences are affected by the choice of Δ consists of repeating the inference over a set or range of values of Δ deemed to be plausible. It can proceed in the following manner:

Step 1. Specify models for η(Y0) and d(Y0, Δ).

Step 2. Fit the model η(Y0) to those with R1 = 1, and obtain the estimated function η̂(Y0).

Step 3. The full-data mean μ = E(Y1) is

μ = π E{η(Y0) | R1 = 1} + (1 − π) E( g⁻¹[ g{η(Y0)} + d(Y0, Δ) ] | R1 = 0 ), (19)

where the expectations are taken over the distribution of Y0 | R1. Although the general formula looks complex, it is easily computed for a fixed value of Δ once the model for η has been fit to data. Specifically:

Step 3a. The estimate of E{η(Y0) | R1 = 1} is the sample mean Σi R1i η̂(Y0i) / Σi R1i.

Step 3b. The estimate of E( g⁻¹[ g{η(Y0)} + d(Y0, Δ) ] | R1 = 0 ) is also computed as a sample mean,

Σi (1 − R1i) g⁻¹[ g{η̂(Y0i)} + d(Y0i, Δ) ] / Σi (1 − R1i). (20)

Step 3c. The estimate of π is π̂ = (1/n) Σi R1i.

Step 3d. The estimate μ̂ of μ is computed by replacing the parameters in (19) by the estimators described in the previous steps.

Step 4. Standard errors are computed using bootstrap resampling.

Step 5. Inferences about μ are carried out for a plausible set or range of values of Δ. Because each unique value of Δ yields an estimator μ̂Δ, it is possible to construct a contour plot of Z-scores, p-values, or confidence intervals.
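The steps above can be sketched as follows (simulated continuous data; identity g, constant shift d(Y0, Δ) = Δ, and a linear working model for η, as in the continuous worked example):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data set.
n = 400
Y0 = rng.normal(0.0, 1.0, n)
Y1 = 2.0 + 0.5 * Y0 + rng.normal(0.0, 1.0, n)
R1 = rng.binomial(1, 0.8, n)

def fit_eta(y0, y1, r):
    """Step 2: fit eta(Y0) by least squares among respondents (type (ii))."""
    b1, b0 = np.polyfit(y0[r == 1], y1[r == 1], 1)
    return lambda v: b0 + b1 * v

def mu_hat(y0, y1, r, delta):
    """Steps 3a-3d: plug-in version of equation (19), with g the identity."""
    eta = fit_eta(y0, y1, r)
    pi_hat = r.mean()                                  # Step 3c
    resp_term = eta(y0[r == 1]).mean()                 # Step 3a
    nonresp_term = (eta(y0[r == 0]) + delta).mean()    # Step 3b
    return pi_hat * resp_term + (1 - pi_hat) * nonresp_term

def boot_se(delta, B=200):
    """Step 4: bootstrap standard error at a fixed delta."""
    reps = []
    for _ in range(B):
        idx = rng.integers(0, n, n)
        reps.append(mu_hat(Y0[idx], Y1[idx], R1[idx], delta))
    return np.std(reps, ddof=1)

# Step 5: repeat over plausible delta values; delta = 0 corresponds to MAR.
for delta in (-1.0, 0.0, 1.0):
    print(f"delta = {delta:+.1f}: mu = {mu_hat(Y0, Y1, R1, delta):.3f}"
          f", se = {boot_se(delta):.3f}")
```

Refitting η inside each bootstrap replicate, as done here, propagates the type (ii) modeling uncertainty into the standard errors.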

Selection Model Approach

In the selection model approach with auxiliary data, the inverse probability weighting estimator analogous to (9) is

μ̂IPW = (1/n) Σi R1iY1i / [1 − expit{ĥ(Y0i) + αY1i}], (25)

where ĥ(Y0) is an estimator of h(Y0).

Unless Y0 is discrete with a few levels, estimation of h(Y0) requires the assumption that h(Y0) takes a known form, such as h(Y0; γ) = γ0 + γ1Y0. (Note that if one adopts this model, one is assuming that the probability of response follows a logistic regression model on Y0 and Y1, with a given specified value for the coefficient α of Y1.) Specifying h(Y0) is a type (ii) assumption that is technically not needed to identify μ but is needed in practical situations involving finite samples.

One can compute an estimator γ̂ of γ by solving a set of estimating equations⁴ for γ,

Σi [ R1i / (1 − expit{h(Y0i; γ) + αY1i}) − 1 ] ∂h(Y0i; γ)/∂γ = 0. (26)

Formulas for sandwich-type standard error estimators are available, but the bootstrap can also be used to compute standard error estimates. Hypothesis-testing sensitivity analysis is conducted in a manner similar to the one described in the example above with no auxiliary data.
As with the pattern mixture models, by repeating the estimation of μ at a set or interval of known α values, one can examine how different degrees of residual association between nonresponse and the outcome Y1 affect inferences concerning E(Y1). A plot similar to the one constructed for the pattern mixture model is given in Figure 5-2.
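A sketch of this computation (simulated data; h(Y0; γ) = γ0 + γ1Y0, with the estimating equations (26) solved by scipy's general root finder, though any root solver would do):

```python
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(4)

# Hypothetical data: Y0 always observed; Y1 missing with probability
# expit(h(Y0) + alpha*Y1), an MNAR mechanism.
n = 2000
Y0 = rng.normal(0.0, 1.0, n)
Y1 = 1.0 + 0.6 * Y0 + rng.normal(0.0, 1.0, n)
expit = lambda u: 1.0 / (1.0 + np.exp(-u))
R1 = rng.binomial(1, 1.0 - expit(-2.0 + 0.5 * Y0 + 0.3 * Y1))

def estimating_eq(gamma, alpha):
    """Equation (26) with h(Y0; g) = g0 + g1*Y0 and dh/dg = (1, Y0).
    Where R1 = 0 the bracketed term equals -1, so missing Y1 never matter."""
    g0, g1 = gamma
    prob_obs = 1.0 - expit(g0 + g1 * Y0 + alpha * Y1)  # P(R1 = 1 | Y0, Y1)
    resid = R1 / prob_obs - 1.0
    return np.array([resid.sum(), (resid * Y0).sum()])

def mu_ipw(alpha):
    """Equation (25): IPW estimate of E(Y1) for a fixed alpha."""
    g0, g1 = fsolve(estimating_eq, x0=[0.0, 0.0], args=(alpha,))
    prob_obs = 1.0 - expit(g0 + g1 * Y0 + alpha * Y1)
    return np.sum(R1 * Y1 / prob_obs) / n

for alpha in (0.0, 0.3, 0.6):
    print(f"alpha = {alpha:.1f}: mu_ipw = {mu_ipw(alpha):.3f}")
```

Repeating mu_ipw over a grid of alpha values, separately by treatment arm, yields the kind of sensitivity display shown in Figure 5-2.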
EXAMPLE: GENERAL REPEATED MEASURES SETTING
As the number of planned measurement occasions increases, the complexity of the sensitivity analysis grows because the number of missing data patterns grows. As a result, there can be limitless ways of specifying models.
Consider a study with K scheduled postbaseline visits. In the special case of monotone missing data, there are (K + 1) patterns representing each of the visits at which a subject might last be seen, that is, 0, ..., K.

⁴ As with the selection approach of the example with no auxiliary data, to estimate γ one cannot fit a logistic regression model, because Y1 is missing when R1 = 0. The estimator γ̂ exploits the identity E[ R1 / P(R1 = 1 | Y0, Y1) | Y0 ] = 1.

FIGURE 5-2 Selection model sensitivity analysis (bitmapped image not reproduced). Left panel: plot of the mean outcome among nonrespondents as a function of the sensitivity parameter α, where α = 0 corresponds to MAR. Center panel: plot of the full-data mean as a function of α. Right panel: contour of the Z statistic for comparing placebo to active treatment (200 mg), where α is varied separately by treatment arm.

The (K + 1)st pattern represents subjects with complete data, while the other K patterns represent those with varying degrees of missing data. In the general setting, there are many ways to specify pattern models (the models that link the distribution of missing outcomes to the distribution of observed outcomes within specified strata), and it is generally necessary to look for simplifications of the model structure.
For example, one could link the conditional (on a shared history of observed outcomes through visit k − 1) distribution of missing outcomes at visit k among those who were last seen at visit k − 1 to (a) the distribution of outcomes at visit k among those who complete the study, (b) the distribution of outcomes at visit k among those who are in the study through visit k, or (c) the distribution of outcomes at visit k among those who are last seen at visit k.

Let Yk denote the outcome scheduled to be measured at visit k, with visit 0 denoting the baseline measure. We use the notation Yk− = (Y0, ..., Yk) to denote the history of the outcomes through visit k and Yk+ = (Yk+1, ..., YK) to denote the future outcomes after visit k. We let Rk denote the indicator that Yk is observed, so that Rk = 1 if Yk is observed and Rk = 0 otherwise. We assume that Y0 is observed on all individuals, so that R0 = 1. As above, we focus on inference about the mean μ = E(YK) of the intended outcome at the last visit K.
Monotone Missing Data
Under monotone missingness, if the outcome at visit k is missing, then the outcome at visit k + 1 is missing. If we let L be the last visit at which a subject has a measurement observed, then the observed data for a subject are YL− = (Y0, ..., YL), where L ≤ K.
A Pattern Mixture Model Approach
As noted above, there are many pattern models that can be specified.
Here, we discuss inference in one such model. Recall that both type (i) and
type (ii) assumptions are needed. We first address type (i) specification, illus-
trating a way to link distributions with those having missing observations
to those with observed data.
The general strategy is illustrated for the case K = 3, which relies on an assumption known as "nonfuture dependence" (Kenward et al., 2003). In simple terms, the nonfuture dependence assumption states that the probability of dropout at time L can depend only on the observed data up to L and the possibly missing value of Y_L, but not on later outcomes.

PRINCIPLES AND METHODS OF SENSITIVITY ANALYSES
In the model used here, we assume there is a link between [Y_k | Y_{k-1}^-, L = k − 1] and [Y_k | Y_{k-1}^-, L > k − 1], which are, respectively, the distributions of Y_k among those who do and do not drop out at time k − 1. The idea is to use the distribution of those still in the study at time k − 1 to identify the distribution of those who drop out at k − 1.
It can be shown that μ = E(Y_K) can be estimated by a recursion algorithm, provided the following observed-data distributions are estimated:

    [Y_3 | Y_0, Y_1, Y_2, L = 4], [Y_2 | Y_0, Y_1, L ≥ 3], [Y_1 | Y_0, L ≥ 2], [Y_0 | L ≥ 1],   (27)

and the following dropout probabilities

    P(L = 4 | L ≥ 3, Y_0, Y_1, Y_2, Y_3), P(L = 3 | L ≥ 2, Y_0, Y_1, Y_2),
    P(L = 2 | L ≥ 1, Y_0, Y_1), P(L = 1 | Y_0)   (28)

can also be estimated. Each is identified from observed data when missingness is monotone.
What is needed to implement the estimation of μ = E(Y_K) is a model that links the distributions with observed data (27) to the distributions having missing observations. One simple way to do this is to assume that the distribution of Y_k among recent dropouts, [Y_k | Y_{k-1}^-, L = k − 1], follows the same parametric model as the distribution of Y_k among respondents, [Y_k | Y_{k-1}^-, L > k − 1], but with a different—or shifted—parameter value. This assumption cannot be verified and may not be realistic in all studies; we use it here simply as an illustration.
To be more concrete, suppose that the outcomes Y_0, …, Y_3 are continuous. One can assume regression models for each of (27) as follows:

    E(Y_0 | L ≥ 1) = μ_0,   (29)
    E(Y_1 | Y_0, L ≥ 2) = μ_1 + β_1 Y_0,   (30)
    E(Y_2 | Y_0, Y_1, L ≥ 3) = μ_2 + β_2^T (Y_0, Y_1),   (31)
    E(Y_3 | Y_0, Y_1, Y_2, L = 4) = μ_3 + β_3^T (Y_0, Y_1, Y_2).   (32)
This modeling of the observed data distribution comprises our type (i)
assumptions. These can (and must) be checked using the observables.
Using type (ii) assumptions, the distributions of missing Y can be linked in a way similar to those for the first example above. For example, those with L = 1 are missing Y_1. One can link the observed-data regression E(Y_1 | Y_0, L ≥ 2) to the missing-data regression E(Y_1 | Y_0, L = 1) through

    E(Y_1 | Y_0, L = 1) = μ_1* + β_1* Y_0,   (33)

where, say, μ_1* = μ_1 + Δμ_1 and β_1* = β_1 + Δβ_1. Models for missing Y_2 and Y_3 can be specified similarly.
As with the previous cases, (33) is a type (ii) assumption and cannot be checked with data. Moreover, even using a simple structure like (33), the number of sensitivity parameters grows large very quickly with the number of repeated measures. Hence, it is important to consider simplifications, such as setting Δβ = 0, assuming Δμ is equivalent across patterns, or some combination of the two.

Note that under our assumptions, Δμ_k is the difference between the mean of Y_k among those who drop out at k − 1 and those who remain beyond k − 1, conditional on the observed data history up to k − 1. In this example, the assumption of linearity in the regression models, combined with an assumption that Δβ_k = 0 for all k, means that one does not need a model for P(L = k | L ≥ k, Y_k^-) to implement the estimation via the recursion algorithm.

A sensitivity analysis consists of estimating μ and its standard error repeatedly over a range of plausible values of the specified Δ parameters. For this illustration, setting Δ = 0 implies MAR.[5]
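To make the recursion concrete, the following is a minimal numerical sketch of this pattern-mixture sensitivity analysis for K = 3 with continuous outcomes. The function names, the coding of patterns, and the simplifications are ours, not a prescribed method: slope shifts are set to zero (Δβ = 0), and the intercept shift Δμ_k is applied only at a subject's first missing visit. Because the working models are linear, propagating conditional means through the fitted regressions suffices for estimating the mean μ = E(Y_3).

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

def estimate_mu(Y, L, delta):
    """Pattern-mixture sensitivity estimator of mu = E[Y_3] under monotone
    dropout.  Y is (n, 4) with np.nan where missing.  L in {1, 2, 3, 4} is
    the dropout pattern: L = 4 means complete data, and L = k < 4 means
    Y_k is the first missing outcome.  delta is a length-4 array whose
    entry delta[k] is the intercept shift (Delta-mu_k in equation (33)),
    with all slope shifts Delta-beta set to 0."""
    Yw = Y.copy()
    for k in (1, 2, 3):
        obs = L >= k + 1                      # subjects with Y_k observed
        b = fit_ols(Yw[obs, :k], Yw[obs, k])  # type (i) model: checkable
        need = L <= k                         # subjects with Y_k missing
        pred = b[0] + Yw[need, :k] @ b[1:]    # conditional mean given history
        pred[L[need] == k] += delta[k]        # type (ii) shift at first missing visit
        Yw[need, k] = pred
    return Yw[:, 3].mean()
```

Repeating `estimate_mu` over a grid of Δ values, with bootstrap standard errors, yields the sensitivity analysis described above; Δ = 0 reproduces the MAR-type analysis.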
Selection Model
Another way to posit type (i) assumptions in this setting is to postulate a model for how the odds of dropping out between visits k and k + 1 depend on the (possibly missing) future outcomes Y_k^+, given the recorded history Y_k^-. That is,
[5] An attractive feature of the pattern mixture approach we consider here (the one that links the distribution of outcomes between dropouts at a given time and those who remain in the study at that time) is that the special choice of link specifying that these two distributions are the same is tantamount to the MAR assumption (i.e., the assumption that at any given occasion the past recorded data are the only predictors of the future outcomes that are used to decide whether or not to drop out of the study at that time). This feature does not hold with other choices of pattern mixture models. Thus, in our example, exploring how inferences about μ change as Δ_k moves away from Δ_k = 0 is tantamount to exploring the impact of distinct degrees of residual dependence between the missing outcomes and dropping out on our inferences about μ. In more general pattern mixture models, Δ = 0 is only sufficient, but not necessary, for MAR to hold. It is possible to find other combinations of Δ that correspond to MAR.

    odds(L = k | L ≥ k, Y_k^-, Y_k^+) = P(L = k | L ≥ k, Y_k^-, Y_k^+) / P(L > k | L ≥ k, Y_k^-, Y_k^+).
The MAR assumption states that the odds do not depend on the future outcomes Y_k^+. The nonfuture dependence assumption above states that the odds depend on the future only through Y_{k+1}. That is,

    odds(L = k | L ≥ k, Y_k^-, Y_k^+) = odds(L = k | L ≥ k, Y_k^-, Y_{k+1})   (34)

and is equivalent to assuming that, after adjusting for the recorded history, the outcome to be measured at visit k + 1 is the only predictor of all future missing outcomes that is associated with the odds of dropping out between visits k and k + 1.
This last assumption, coupled with an assumption that quantifies the dependence of the odds on the right-hand side on Y_{k+1}, suffices to identify μ = E(Y_K); in fact, it suffices to identify E(Y_k) for any k = 1, …, K. For example, one might assume

    odds(L = k | L ≥ k, Y_k^-, Y_{k+1} + 1) / odds(L = k | L ≥ k, Y_k^-, Y_{k+1}) = exp(α),   (35)

that is, that each unit increase in Y_{k+1} is associated with a constant increase of α in the log odds of nonresponse, the same for all values of Y_k^- and all visits k.
Under (34), α = 0 implies MAR. One would make this choice if it is believed that the recorded history Y_k^- encodes all the predictors of Y_{k+1} that are associated with missingness. Values of α ≠ 0 reflect residual association between dropping out between visits k and k + 1 and the possibly unobserved outcome Y_{k+1}, after adjusting for previous outcomes, and hence the belief that dropping out cannot be entirely explained by the observed recorded history Y_k^-. By repeating estimation of the vector μ for each fixed α, one can examine how different degrees of residual association between dropping out and the outcome at each occasion, after adjusting for the influence of recorded history, affect inferences concerning μ.
Assumptions (34) and (35) together are equivalent to specifying that

    logit{P(L = k | L ≥ k, Y_k^-, Y_k^+)} = h_k(Y_k^-) + α Y_{k+1},   (36)

where h_k(Y_k^-) is an unknown function of Y_k^-. This, in turn, is equivalent to the pattern mixture model

    p(y_{k+1} | L = k, Y_k^-) = p(y_{k+1} | L ≥ k + 1, Y_k^-) × exp(α y_{k+1}) × const.   (37)

In this latter form, one can see that there is no evidence in the data regarding α, since it serves as the link between the conditional (on Y_k^-) distribution of Y_{k+1} among those who drop out between visits k and k + 1 and those who remain through visit k + 1. If one believes that the association between dropping out and future outcomes depends solely on the current outcome but varies according to the recorded history, one can replace α with a known function of Y_k^-.
For instance, replacing the constant α in equation (36) with the function α_0 + α_1 Y_k, with α_0 and α_1 specified, encodes the belief that the residual association between dropping out between k and k + 1 and the outcome Y_{k+1} may be stronger for individuals with, say, higher (if α_1 > 0) values of the outcome at visit k. As an example, if Y_k is a strong predictor of Y_{k+1} and lower values of Y_{k+1} are preferable (e.g., HIV-RNA viral load), then it is reasonable to postulate that subjects with low values of Y_k drop out for reasons unrelated to drug efficacy (and, in particular, to their outcome Y_{k+1}), while subjects with higher values of Y_k drop out for reasons related to drug efficacy and hence to their outcome Y_{k+1}.
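Equation (37) says that, for fixed α, the outcome distribution among dropouts is an exponential tilting of the distribution among those who remain, which is why the data alone carry no information about α. A small numerical sketch, using a hypothetical discrete distribution of our own choosing, makes the role of α visible:

```python
import numpy as np

def tilt(y, p, alpha):
    """Exponential tilting, as in equation (37): the tilted probabilities
    are proportional to p(y) * exp(alpha * y); the normalizing constant
    plays the role of 'const'."""
    w = p * np.exp(alpha * y)
    return w / w.sum()

# Hypothetical discrete distribution for Y_{k+1} among those who remain:
y = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.4, 0.3, 0.2, 0.1])

# Implied dropout distribution for several alpha values: alpha = 0
# reproduces p (the MAR case); alpha > 0 shifts mass toward larger
# outcomes, and the tilted mean increases with alpha.
for alpha in (-1.0, 0.0, 1.0):
    q = tilt(y, p, alpha)
    print(f"alpha={alpha:+.1f}  tilted mean={float(q @ y):.3f}")
```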
Regardless of how the residual dependence is specified, μ can be expressed in terms of the distribution of the observed data; that is, it is identified. Estimation of μ = E(Y_K) relies on the identity

    E(Y_K) = E[ R_K Y_K / π_K(Y_K^-; h_1, …, h_K; α) ],   (38)

where

    π_K(Y_K^-; h_1, …, h_K; α) = ∏_{k=0}^{K−1} [1 − expit{h_k(Y_k^-) + α Y_{k+1}}].
This formula suggests that one can estimate μ with the IPW estimator

    μ̂_IPW = (1/n) Σ_i R_iK Y_iK / π_K(Y_iK^-; ĥ_i1, …, ĥ_iK; α).   (39)

This estimator relies on estimators ĥ_ik = ĥ_k(Y_ik^-). In order to estimate these functions, one needs to impose type (ii) modeling assumptions on h_k(Y_k^-), that is, h_k(Y_k^-) = h_k(Y_k^-; γ_k). For example, one can assume that h_k(Y_k^-) = γ_{0,k} + γ_{1,k} Y_k (adopting such a model would be tantamount to assuming that the probability of dropping out at each time follows a logistic regression model on just the immediately preceding recorded data and on the current outcome).

As with the selection approach of the two preceding examples, to estimate γ_k one cannot fit a logistic regression model, because Y_{k+1} is missing when L = k. However, one can instead estimate γ by solving the estimating equations

    Σ_{i=1}^{n} Σ_{k=0}^{K−1} R_ik [ R_i,k+1 / (1 − expit{h_k(Y_ik^-; γ_k) + α Y_i,k+1}) − 1 ] ∂h_k(Y_ik^-; γ_k)/∂γ_k = 0,   (40)

justified on similar grounds as the estimators of the h functions in the previous examples.
Formulas for sandwich-type standard error estimators are available (see Rotnitzky et al., 1997), but the bootstrap can also be used to compute standard error estimates. Sensitivity analysis with regard to hypothesis testing is conducted in a manner similar to the one described in the first example above.
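The selection-model computations above can be sketched end to end. The code below is an illustration only, not a prescribed implementation: it adopts the working model h_k(Y_k^-; γ_k) = γ_{0,k} + γ_{1,k} Y_k, solves the estimating equation (40) by Newton-Raphson for a fixed α, and then evaluates the IPW estimator (39). All function names are our own.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

def solve_gamma(Y, R, k, alpha, n_iter=25):
    """Solve estimating equation (40) for gamma_k = (g0, g1) in the working
    model h_k(Y_k^-) = g0 + g1 * Y_k, holding alpha fixed.  Subjects who
    drop out at k contribute only the constant -1 term, so their missing
    Y_{k+1} is never needed."""
    at_risk = R[:, k] == 1
    yk = Y[at_risk, k]
    cont = R[at_risk, k + 1] == 1                 # continued past visit k
    yk1 = np.where(cont, np.nan_to_num(Y[at_risk, k + 1]), 0.0)
    X = np.column_stack([np.ones(yk.size), yk])
    gamma = np.zeros(2)
    for _ in range(n_iter):                       # Newton-Raphson iterations
        p = expit(X @ gamma + alpha * yk1)        # modeled dropout hazard
        w = np.where(cont, 1.0 / (1.0 - p), 0.0)  # R_{k+1} / (1 - p)
        U = X.T @ (w - 1.0)                       # estimating function (40)
        J = X.T @ (X * (w * p)[:, None])          # dU/dgamma: p/(1-p) on continuers
        gamma = gamma - np.linalg.solve(J, U)
    return gamma

def mu_ipw(Y, R, alpha):
    """IPW estimator (39) of mu = E[Y_K] for a fixed sensitivity value alpha."""
    n, K1 = Y.shape
    pi = np.ones(n)
    for k in range(K1 - 1):
        g = solve_gamma(Y, R, k, alpha)
        haz = expit(g[0] + g[1] * np.nan_to_num(Y[:, k])
                    + alpha * np.nan_to_num(Y[:, k + 1]))
        pi = np.where(R[:, k + 1] == 1, pi * (1.0 - haz), pi)
    comp = R[:, -1] == 1
    return float(np.mean(np.where(comp, np.nan_to_num(Y[:, -1]) / pi, 0.0)))
```

Repeating `mu_ipw` over a range of α values traces out the sensitivity analysis; α = 0 corresponds to MAR.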
Nonmonotone Missing Data
A typical way to analyze nonmonotone missing data is to treat the
time of dropout as the key missing data variable and then to assume MAR
within dropout pattern (or conditional on dropout time). The advantage
of this approach is purely practical: It interpolates missing data under a
specified model. That said, however, the current literature suggests that
MAR within pattern does not easily correspond to realistic mechanisms
for generating the data. This raises concern among members of the panel
that nonmonotone dropouts may require more specialized methods for
modeling the missing data mechanism, and accounting for departures from
MAR.
This topic has not been deeply studied in the extant statistical literature,
and in particular numerical studies are lacking. We recommend this as a key
area of investigation that will: (a) examine the appropriateness of existing
models and in particular the potential pitfalls of assuming MAR within
missing data pattern; and (b) develop and apply novel, appropriate methods
of model specification and sensitivity analysis to handle nonmonotone miss-
ing data patterns.
COMPARING PATTERN MIXTURE AND
SELECTION APPROACHES
The main appeal of the selection model approach is that, since it models the probability of nonresponse rather than the distribution of outcomes, it can easily accommodate vectors of auxiliary factors with components of all types: discrete, categorical, and continuous.

Two disadvantages of the selection approach as they relate to drawing inferences are (a) the inverse weighting estimation procedure, which can yield relatively inefficient inferences (i.e., large standard errors), and (b) the fact that model checking of the type (ii) assumptions must be conducted for each unique value of the sensitivity analysis parameters. Formal model checking procedures have yet to be developed for this setting. The inefficiencies associated with the inverse weighting procedure are mitigated in settings with a sizable fraction of missing data, as the sampling variability is often of less concern than the range of type (i) assumptions that are entertained. To address (b), one should fit a highly flexible model for the h function in the selection model.
Another potential disadvantage of selection models relates to interpreta-
tion of the sensitivity parameter. Particularly for continuous measures, it may
be difficult to interpret nonresponse rates on the odds scale and to specify
reasonable ranges for the sensitivity parameter. Plots such as those shown in
Figure 5-2 (above) can be helpful in understanding how values of the sensitiv-
ity parameter correspond to imputed means for the missing outcomes.
Advantages of the pattern mixture model include transparent interpretation of the sensitivity parameters and straightforward model checking for the observed-data distribution. The sensitivity parameters are typically specified in terms of differences in means between respondents and nonrespondents, which appeal directly to intuition and contribute to formulating plausible ranges for the parameters. Pattern mixture models can also be specified so that the fit to the observed data is identical across all values of the sensitivity parameters; hence, model checking is straightforward and does not depend on the assumed missing data mechanism.
Disadvantages of pattern mixture modeling include difficulties in includ-
ing auxiliary information, which will generally require additional modeling.
Computation of the weighted averages across patterns for models of large
numbers of repeated measures also can become complex without significant
simplifying assumptions.
TIME-TO-EVENT DATA
A major challenge in the analysis of time-to-event outcomes in ran-
domized trials is to properly account for censoring that may be informa-
tive. Different approaches have been proposed in the research literature to
address this issue. When no auxiliary prognostic factors are available, the
general strategy has been to impose nonidentifiable assumptions concerning
the dependence between failure and censoring times and then vary these
assumptions in order to assess the sensitivity of inferences on the estimated
survivor function. When prognostic factors are recorded, Robins and col-
leagues in a series of papers (Robins and Rotnitzky, 1992; Robins, 1993;

Robins and Finkelstein, 2000) proposed a general estimation strategy under
the assumption that all measured prognostic factors that predict censor-
ing are recorded in the database. Scharfstein and Robins (2002) proposed
a method for conducting sensitivity analysis under the assumption that
some but not all joint prognostic factors for censoring and survival are
available. Their approach is to repeat inference under different values of a
nonidentifiable censoring bias parameter that encodes the magnitude of
the residual association between survival and censoring after adjusting for
measured prognostic factors.
In randomized studies, censoring typically occurs for several reasons,
some noninformative, others informative. For instance, in studies with stag-
gered entry, the administrative end of the follow-up period typically induces
noninformative censoring. However, loss to follow-up due to dropouts
induces a competing censoring mechanism that is likely to be informative.
Treatment discontinuation might induce yet another informative censoring
process.
Under the Scharfstein and Robins methodology, the analyst specifies a range for the parameter encoding the residual dependence of the hazard of the minimum of the competing censoring times on the censored outcome. However, this range might be rather difficult to specify if the reasons for the various kinds of censoring are quite different, more so if some censoring processes are informative and some are not. To ameliorate this problem in studies with staggered entry, one can eliminate the censoring by the administrative end of the follow-up period (typically a source of noninformative censoring) by restricting the follow-up period to a shorter interval in which (with probability one) no subject is administratively censored. However, in doing so, one would lose valuable information on the survival experience of the study patients who remain at risk at the end of the reduced analysis interval.
Rotnitzky et al. (2007) provide estimators of the survival function under
separate models for the competing censoring mechanisms, including both
informative and noninformative censoring. The methods can be used to
exploit the data recorded throughout the entire follow-up period and, in par-
ticular, beyond the end of the reduced analysis interval discussed above.
DECISION MAKING
Even after model fitting and sensitivity analysis, investigators have to
decide about how important the treatment effect is. Unfortunately, there is
no scientific consensus on how to synthesize information from a sensitivity
analysis into a single decision about treatment effect. At least three pos-
sibilities can be considered.
One possibility is to specify a plausible region for the sensitivity parameters
and report estimates of the lower and upper bounds from this range. These

endpoints form bounds on the estimated treatment effect and would be used in place of point estimates. Accompanying these bounds would be a 95 percent confidence region. This procedure can be viewed as accounting for both sampling variability and variability due to model uncertainty (i.e., uncertainty about the sensitivity parameter value); see Molenberghs and Kenward (2007) for more detailed discussion and recommendations for computing a 95 percent confidence region.
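As a sketch of this first possibility, suppose `estimate(delta)` returns a point estimate and standard error of the treatment effect for a given value of the sensitivity parameter (the function is hypothetical; whatever model the analyst has adopted supplies it). One simple, conservative construction takes the extreme point estimates over the plausible region as the bounds, and the union of the pointwise Wald intervals as the accompanying confidence region; this is only one option among those that could be considered.

```python
import numpy as np

def sensitivity_region(estimate, deltas, z=1.96):
    """Bounds and a conservative 95 percent confidence region over a
    plausible set of sensitivity-parameter values.  `estimate` is a
    user-supplied function: estimate(delta) -> (point_estimate, std_error).
    The confidence region is the union of the pointwise Wald intervals."""
    ests, ses = map(np.asarray, zip(*(estimate(d) for d in deltas)))
    bounds = (float(ests.min()), float(ests.max()))   # range of point estimates
    region = (float((ests - z * ses).min()), float((ests + z * ses).max()))
    return bounds, region

# Hypothetical toy example: effect grows linearly in delta, constant SE.
bounds, region = sensitivity_region(lambda d: (1.0 + 0.5 * d, 0.2),
                                    np.linspace(-1.0, 1.0, 21))
```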
A second possibility is to carry out inference under MAR and determine
the set of sensitivity parameter values that would lead to overturning the
conclusion from MAR. Results can be viewed as equivocal if the inference
about treatment effects could be overturned for values of the sensitivity
parameter that are plausible.
The third possibility is to derive a summary inference that averages
over values of the sensitivity parameters in some principled fashion. This
approach could be viewed as appropriate in settings in which reliable prior
information about the sensitivity parameter value is known in advance.
Regardless of the specific approach taken to decision making, the key
issue is weighting the results, either formally or informally, from both the
primary analysis and each alternative analysis by assessing the reason-
ableness of the assumptions made in conjunction with each analysis. The
analyses should be given little weight when the associated assumptions
are viewed as being extreme and should be given substantial weight when
the associated assumptions are viewed as being comparably plausible to
those for the primary analysis. Therefore, in situations in which alternative analyses that are part of the sensitivity analysis support inferences contrary to that of the primary analysis, it would be reasonable to continue to support the inference from the primary analysis if the associated assumptions are viewed as being fairly extreme.
RECOMMENDATION
Recommendation 15: Sensitivity analyses should be part of the primary
reporting of findings from clinical trials. Examining sensitivity to the
assumptions about the missing data mechanism should be a mandatory
component of reporting.
We note that there are some often-used models for the analysis of miss-
ing data in clinical trials for which the form of a sensitivity analysis has
not been fully developed in the literature. Although we have provided prin-
ciples for the broad development of sensitivity analyses, we have not been
prescriptive for many individual models. It is important that additional
research be carried out so that methods to carry out sensitivity analyses for
all of the standard models are available.