5
Principles and Methods of Sensitivity Analyses

This chapter concerns principles and methods for sensitivity analyses that quantify the robustness of inferences to departures from underlying assumptions. Unlike the well-developed literature on drawing inferences from incomplete data, the literature on the assessment of sensitivity to various assumptions is relatively new. Because it is an active area of research, it is more difficult to identify a clear consensus about how sensitivity analyses should be conducted. However, in this chapter we articulate a consensus set of principles and describe methods that respect those principles.

We begin by describing in some detail the difficulties posed by reliance on untestable assumptions. We then demonstrate how sensitivity to these assumptions can be represented and investigated in the context of two popular models, selection and pattern mixture models. We also provide case study illustrations to suggest a format for conducting sensitivity analyses, recognizing that these case studies cannot cover the broad range of types and designs of clinical trials. Because the literature on sensitivity analysis is evolving, the primary objective of this chapter is to assert the importance of conducting some form of sensitivity analysis and to illustrate principles in some simple cases. We close the chapter with recommendations for further research on specific aspects of sensitivity analysis methodology.

BACKGROUND

There are fundamental issues involved with selecting a model and assessing its fit to incomplete data that do not apply to inference from complete data. Such issues occur even in the missing at random (MAR)
case, but they are compounded under missing not at random (MNAR). We believe that, especially when the primary analysis assumes MAR, the fit of an MAR model can often be addressed by standard model-checking diagnostics, leaving the sensitivity analysis to MNAR models that deviate from MAR. This approach is suggested in order not to overburden the primary analysis. The discussion in Chapter 4 provides some references for model-checking of MAR models. In addition, with MAR missingness mechanisms that deviate markedly from missing completely at random (MCAR), as in the hypertension example in Chapter 4, analyses with incomplete data are potentially less robust to violations of parametric assumptions than analyses with complete data, so checking them is even more critical.

The data can never rule out an MNAR mechanism, and when the data are potentially MNAR, issues of sensitivity to modeling assumptions are even more serious than under MAR. One approach could be to estimate from the available data the parameters of a model representing an MNAR mechanism. However, the data typically do not contain information on the parameters of the particular model chosen (Jansen et al., 2006). In fact, different MNAR models may fit the observed data equally well but have quite different implications for the unobserved measurements and hence for the conclusions to be drawn from the respective analyses. Without additional information, one cannot usefully distinguish between such MNAR models based solely on their fit to the observed data, and so goodness-of-fit tools alone do not provide a relevant means of choosing between such models.

These considerations point to the necessity of sensitivity analysis. In a broad sense, one can define a sensitivity analysis as one in which several statistical models are considered simultaneously or in which a statistical model is further scrutinized using specialized tools, such as diagnostic measures.
This rather loose and very general definition encompasses a wide variety of useful approaches. A simple procedure is to fit a selected number of (MNAR) models, all of which are deemed plausible and have equivalent or nearly equivalent fit to the observed data; alternatively, a preferred (primary) analysis can be supplemented with a number of modifications. The degree to which conclusions (inferences) are stable across such analyses provides an indication of the confidence that can be placed in them.

Modifications to a basic model can be constructed in different ways. One obvious strategy is to consider various dependencies of the missing data process on the outcomes or the covariates. One can choose to supplement an analysis within the selection modeling framework, say, with one or several in the pattern mixture modeling framework, which explicitly models the missing responses at any given time given the previously observed responses. Alternatively, the distributional assumptions of the models can be altered.

The vast range of models and methods for handling missing data highlights the need for sensitivity analysis. Indeed, research on methodology has shifted from formulation of ever more complex models to methods for assessing sensitivity of specific models and their underlying assumptions. The paradigm shift to sensitivity analysis is, therefore, welcome. Prior to focused research on sensitivity, many methods used in practice were potentially useful but ad hoc (e.g., comparing several incompatible MNAR models to each other). Although informal sensitivity analyses are an indispensable step in the analysis of incomplete longitudinal data, it is desirable to have more formal frameworks within which to develop such analyses.

It is possible to assess model sensitivities of several different types, including sensitivity to: (a) distributional assumptions for the full data, (b) outlying or influential observations, and (c) assumptions about the missing data mechanism. Assessment of (a) can be partially carried out to the extent that one can compare observed and fitted values for the observables under the model specified for the full data. However, distributional assumptions for the missing data cannot be checked. Assessment of (b) can be used to identify observations that are outliers in the observed-data distribution or that may be driving weakly identified parts of an MNAR model (Molenberghs and Kenward, 2007). This chapter focuses on (c), sensitivity to assumptions about the missing data mechanism.

FRAMEWORK

To focus ideas, we restrict consideration to follow-up randomized study designs with repeated measures. We consider the case in which interest is focused on treatment comparisons of visit-specific means of the repeated measures.
With incomplete data, inference about the treatment arm means requires two types of assumptions: (i) untestable assumptions about the distribution of missing outcomes data, and (ii) testable assumptions about the distribution of observed outcomes. Recall that the full-data distribution, described in Chapter 4, can be factored as

[Yobs, Ymis, M | X] = [Yobs, M | X] × [Ymis | Yobs, M, X]. (1)

Type (i) assumptions are needed to estimate the distribution [Ymis | Yobs, M, X], while type (ii) assumptions are used, if necessary, to model the observables [Yobs, M | X] in a parsimonious way.

Type (i) assumptions are necessary to identify the treatment-specific means. Informally, a parameter is identified if one can write its estimator as a function that depends only on the observed data. When a parameter is not identified, it would not be possible to obtain a point estimate even if the sample size were infinite. It is therefore essential to conduct a sensitivity
analysis, whereby the data analysis is repeated under different type (i) assumptions, in order to clarify the extent to which the conclusions of the trial are dependent on unverifiable assumptions. The usefulness of a sensitivity analysis ultimately depends on the transparency and plausibility of the unverifiable assumptions. It is key that any sensitivity analysis methodology allow the formulation of these assumptions in a transparent and easy-to-communicate manner.

Ultimately, type (i) assumptions describe how missing outcomes are being “imputed” under a given model. A reasonable way to formulate these assumptions is in terms of the connection (or link) between the distributions of those having missing and those having observed outcomes but similar covariate profiles. Making this difference explicit is a feature of pattern mixture models. Examples discussed in this chapter illustrate both pattern mixture and selection modeling approaches. In general, it is also necessary to impose type (ii) assumptions. An important consideration is that modeling assumptions of type (ii), which apply to the distribution of observed data, can be supported and scrutinized with standard model-checking techniques.

Broadly speaking, there are two approaches for combining type (i) and (ii) assumptions to draw inferences about the treatment-specific means: pattern mixture and selection modeling. To illustrate these approaches, the next four sections present four example designs of increasing complexity. The first two examples involve a single outcome, without and then with auxiliary data. These examples are meant to illustrate when and why the assumptions of type (i) and (ii) are needed. The third and fourth examples extend the designs to those with repeated measures, with monotone and non-monotone missing data, respectively, with and without auxiliary data.
Our examples are not meant to be prescriptive as to how every sensitivity analysis should be conducted, but rather to illustrate principles that can guide practice. Type (i) assumptions can only be justified on substantive grounds. As the clinical contexts vary between studies, so too will the specific form of the sensitivity analysis.

EXAMPLE: SINGLE OUTCOME, NO AUXILIARY DATA

We start with the simple case in which the trial records no baseline covariate data, and the only measurement to be obtained in the study is that of the outcome Y, taken at a specified time after randomization. We assume that the treatment-arm-specific means of Y form the basis for treatment comparisons and that in each arm there are some study participants on whom Y is missing. We let R = 1 if Y is observed and R = 0 otherwise.

Because estimation of each treatment arm mean relies solely on data from subjects assigned to that arm, the problem reduces to estimation of a mean E(Y) based on a random sample with Y missing in some units. Thus, formally, the problem is to estimate m = E(Y) from the observed data, which comprise the list of indicators R and the value of Y for those having R = 1.

The MAR assumption described in Chapter 4 is a type (i) assumption. In this setting, MAR means that, within each treatment arm, the distribution of Y among respondents (i.e., those with R = 1) is the same as that for nonrespondents (i.e., those with R = 0).

This example illustrates several key ideas. First, it vividly illustrates the meaning of an untestable assumption. Let m1 = E(Y | R = 1) denote the mean among respondents, m0 = E(Y | R = 0) the mean among nonrespondents, and p = P(R = 1) the proportion of those responding. The full-data mean m is a weighted average

m = p m1 + (1 − p) m0, (2)

but there is no information in the data about the value of m0. Hence, any assumption one makes about the distribution for the nonrespondents will be untestable from the data available. In particular, the MAR assumption—that m1 = m0—is untestable.

Second, this example also illustrates the identifiability (or lack thereof) of a parameter. Without making assumptions about m0, the full-data mean m cannot be identified (estimated) from the observed data. However, if one is prepared to adopt an untestable assumption, m will be identified. For example, one can assume MAR, which is equivalent to setting m1 = m0. From (2), MAR implies that m = m1, or that the full-data mean is equal to the mean among those with observed Y. Hence, under MAR, a valid estimate of m1 is also valid for m. A natural choice is the sample mean among those with observed data, namely,

µ̂1 = Σi RiYi / Σi Ri.

Third, this example is the simplest version of a pattern mixture model: the full-data distribution is written as a mixture—or weighted average—of the observed and missing data distributions. Under MAR, their means are equal. However, it is more typical to use pattern mixture models when the means are not assumed to be equal (MNAR).

By contrast, in the selection model approach, type (ii) assumptions are made in terms of how the probability of nonresponse relates to the possibly unobserved outcome. The full-data mean can be estimated using a weighted average of the observed outcomes, where the weights are individual-specific and correspond to the conditional probability of being observed given the observed outcome value. The reweighting serves to create a “pseudo-population” of individuals who are representative of the intended full-data sample of outcomes.

Importantly, there is a one-to-one relationship between the specification of a selection model and specification of a pattern mixture model. The key distinction ultimately arises in how type (ii) assumptions are imposed. As it turns out, the two approaches generate equivalent estimators in this simple example, but for more complex models that rely on type (ii) assumptions to model the observed data, that is not the case.

Pattern Mixture Model Approach

Because we are only interested in the mean of Y, it suffices to make assumptions about how the mean of Y among nonrespondents links to the mean of Y among respondents. A simple way to accomplish this is by introducing a sensitivity parameter Δ that satisfies m0 = m1 + Δ, or

E(Y | R = 0) = E(Y | R = 1) + Δ. (3)

It is easy to see that Δ = m0 − m1, the difference in means between nonrespondents and respondents.

To accommodate general measurement scales, the model should be parameterized so that the sensitivity parameter satisfies an identity such as

m0 = g⁻¹{g(m1) + Δ}, (4)

where g( ) is a function, specified by the data analyst, that is strictly increasing and maps values from the range of Y to the real line. The function g determines the investigator’s choice of scale for comparisons between the respondents’ and nonrespondents’ means and is often guided by the nature of the outcome. For a continuous outcome, one might choose g(u) = u, which reduces to the simple contrast in means given by (3), where Δ represents the difference in mean between nonrespondents and respondents. For binary outcomes, a convenient choice is g(u) = log(u/(1 − u)), which ensures that m0 lies between 0 and 1. Here, Δ is the log odds ratio comparing the odds of Y = 1 between nonrespondents and respondents.

Each value of Δ corresponds to a different unverifiable assumption about the mean of Y in the nonrespondents.
Any specific value of Δ corresponds to an estimate of m because m can be written as the weighted average

m = p m1 + (1 − p) g⁻¹{g(m1) + Δ}. (5)
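As a concrete illustration, the estimator implied by (5) is simple to compute from data. The sketch below is ours, not the chapter's: the function name, the link choices, and the toy data are invented for the example.

```python
import math

def mu_hat(y, r, delta, g=None, g_inv=None):
    """Pattern mixture estimate of m from equation (5).

    y: outcomes (entries with r == 0 are unused placeholders)
    r: response indicators (1 = observed)
    delta: sensitivity parameter; delta = 0 corresponds to MAR
    g, g_inv: link and inverse link; the identity link is used if omitted
    """
    if g is None:
        g, g_inv = (lambda u: u), (lambda u: u)
    pi = sum(r) / len(r)                                   # estimate of p = P(R = 1)
    mu1 = sum(yi for yi, ri in zip(y, r) if ri) / sum(r)   # respondent mean
    mu0 = g_inv(g(mu1) + delta)                            # assumed nonrespondent mean
    return pi * mu1 + (1 - pi) * mu0

# links for a binary outcome: g(u) = log(u / (1 - u))
logit = lambda u: math.log(u / (1 - u))
expit = lambda u: 1 / (1 + math.exp(-u))
```

With delta = 0 the estimate reduces to the respondent mean, the MAR answer; scanning delta over a range of plausible values traces out the sensitivity of the estimate.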

After fixing Δ, one can estimate m by replacing m1 and p with their sample estimators µ̂1 and π̂. Formulas for standard error estimators can be derived from standard Taylor expansions (delta method), or one can use the bootstrap.

To examine how inferences concerning m depend on unverifiable assumptions about the missing data distribution, notice that m is actually a function of Δ in (5). Hence, one can proceed by generating an estimate of m for each value of Δ that is thought to be plausible. In this model, Δ = 0 corresponds to MAR; hence, examining inferences about m over a set or range for Δ that includes Δ = 0 will summarize the effects of departures from MAR on inferences about m. For fixed Δ, assumption (4) is of type (i). In this simple setting, type (ii) assumptions are not needed because m1 and p can be estimated with sample means, and no modeling is needed.

Finally, to test for treatment effects between two arms, one adopts a value Δ0 for the first arm and a value Δ1 for the second arm. One then estimates each mean separately under the adopted values of Δ and conducts a Wald test that their difference is zero. To investigate how the conclusions depend on the adopted values of Δ, one repeats the testing over a range of plausible values for the pair (Δ0, Δ1).

Selection Model Approach

A second option for conducting sensitivity analysis is to assume that one knows how the odds of nonresponse change with the values of the outcome Y. For example, one can assume that the log odds of nonresponse differs by α for those who differ by one unit on Y. This is equivalent to assuming that one knows the value of α (but not h) in the logistic regression model

logit{P[R = 0 | Y = y]} = h + αy. (6)

Models like (6) are called selection models because they model the probability of nonresponse (or selection) as a function of the outcome.
Each unique value of α corresponds to a different unverifiable assumption about how the probability of nonresponse changes with the outcome. The model in (6) is also equivalent to assuming that

p(y | R = 0) = p(y | R = 1) × exp(αy) × const. (7)

Adopting a value of α is equivalent to adopting a known link between the distribution of the respondents and that of the nonrespondents, because one
cannot use the data to learn anything about the nonrespondent distribution or to check the value of α. Moreover, one cannot check two other important assumptions: that the log odds of nonresponse is linear in y and that the support of the distribution of Y among nonrespondents is the same as that among respondents (as implied by (7)).

Although not immediately apparent, once a value of α is adopted, one can estimate m = E[Y] consistently. A sensitivity analysis consists of repeating the estimation of m at different plausible values of α so as to assess the sensitivity of inferences about m to assumptions about the missing data mechanism as encoded by α and model (6). Estimation of m relies on the identity

E(Y) = E[ RY / P(R = 1 | Y) ], (8)

which suggests estimation of m through inverse probability weighting (see below); in this case, the weights can depend on missing values of Y. The inverse probability weighting estimator is

µ̂IPW = (1/n) Σi RiYi / {1 − expit(ĥ + αYi)}, (9)

where expit(u) = logit⁻¹(u) = exp(u) / {1 + exp(u)}. To compute ĥ, one solves the unbiased estimating equation

Σi [ Ri / {1 − expit(h + αYi)} − 1 ] = 0 (10)

for h.¹ Analytic formulas for consistent standard error estimators are available (e.g., Rotnitzky et al., 1998), but bootstrap resampling can be used. Sensitivity analysis for tests of treatment effects proceeds by repeating the test over a set of plausible values for α, where different values of α can be chosen for each arm.

With the selection model approach described here we can conduct sensitivity analysis not just about the mean but about any other component of the distribution of Y, for example, the median of Y. Just as in the preceding pattern mixture approach, the data structure in this setting is so simple that we need not worry about postulating type (ii) assumptions.
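The equivalence between the selection model (6) and the exponential tilt (7) can be verified numerically. The check below uses a hypothetical three-point distribution for Y and hypothetical parameter values of our choosing; it confirms that p(y | R = 0) / p(y | R = 1) is proportional to exp(αy).

```python
import math

expit = lambda u: 1 / (1 + math.exp(-u))

h, alpha = -0.5, 0.7             # hypothetical selection-model parameters
p_y = {0: 0.3, 1: 0.5, 2: 0.2}   # hypothetical marginal distribution of Y

# joint probabilities of (Y = y, R = 0) and (Y = y, R = 1) under model (6)
p0 = {y: p_y[y] * expit(h + alpha * y) for y in p_y}
p1 = {y: p_y[y] * (1 - expit(h + alpha * y)) for y in p_y}
cond0 = {y: p0[y] / sum(p0.values()) for y in p_y}   # p(y | R = 0)
cond1 = {y: p1[y] / sum(p1.values()) for y in p_y}   # p(y | R = 1)

# equation (7): the ratio divided by exp(alpha*y) should be constant in y
ratios = [cond0[y] / (cond1[y] * math.exp(alpha * y)) for y in p_y]
```

The constancy of the ratios is exactly the statement that, once α is fixed, the whole dependence of nonresponse on y is known.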
¹ Estimation of h by standard logistic regression of R on Y is not feasible because Y is missing when R = 0; the estimator ĥ exploits the identity E[ R / P(R = 1 | Y) ] = 1.
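Computationally, for a fixed α the intercept h is a one-dimensional root of (10), so simple bisection suffices; µ̂IPW then follows from (9). This is a sketch under our own naming and toy data, not code from the chapter.

```python
import math

expit = lambda u: 1 / (1 + math.exp(-u))

def fit_h(y, r, alpha, lo=-20.0, hi=20.0, tol=1e-10):
    # solve the estimating equation (10) for h at fixed alpha;
    # terms with r_i = 0 contribute -1, so missing outcomes are never used
    def score(h):
        return sum(ri / (1 - expit(h + alpha * yi)) - 1 for yi, ri in zip(y, r))
    while hi - lo > tol:            # the score is increasing in h, so bisect
        mid = (lo + hi) / 2
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def mu_ipw(y, r, alpha):
    # inverse probability weighting estimator (9) at a fixed value of alpha
    h = fit_h(y, r, alpha)
    return sum(ri * yi / (1 - expit(h + alpha * yi))
               for yi, ri in zip(y, r)) / len(y)
```

At alpha = 0 the weights all equal n/n_obs and the estimator reduces to the complete-case mean, the MAR answer; repeating the calculation over a grid of alpha values gives the sensitivity analysis.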

EXAMPLE: SINGLE OUTCOME WITH AUXILIARY DATA

We next consider a setting in which individuals are scheduled to have a measurement Y0 at baseline, which we assume is never missing (this constitutes the auxiliary data), and a second measurement Y1 at some specified follow-up time, which is missing in some subjects. We let R1 = 1 if Y1 is observed and R1 = 0 otherwise. As in the preceding example, we limit our discussion to estimation of the arm-specific mean of Y1, denoted now by m = E(Y1). In this example, the type (i) MAR assumption states that, within each treatment group and within levels of Y0, the distribution of Y1 among nonrespondents is the same as the distribution of Y1 among respondents. That is,

[Y1 | Y0, X, R1 = 1] = [Y1 | Y0, X, R1 = 0]. (11)

Pattern Mixture Model Approach

In this and the next section, we demonstrate sensitivity analysis under MNAR. Under the pattern mixture approach one specifies a link between the distribution of Y1 in the nonrespondents and respondents who share the same value of Y0. One can specify, for example, that

E(Y1 | Y0, R1 = 0) = g⁻¹[g{η(Y0)} + Δ], (12)

where η(Y0) = E(Y1 | R1 = 1, Y0) and g is defined as in the example above.

Example: Continuous Values of Y

Suppose Y1 is continuous. One needs a specification of both the sensitivity analysis function g and the relationship between Y1 and Y0, represented by η(Y0). A simple version of η is a regression of Y1 on Y0,

η(Y0) = E(Y1 | Y0, R1 = 1) (13)
      = β0 + β1Y0. (14)

Now let g(u) = u as in the first example above. In this case, using (12), the means of the missing Y1 are imputed as regression predictions of Y1 plus a shift Δ,

E(Y1 | Y0, R1 = 0) = β0 + β1Y0 + Δ. (15)
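The shift model (15) translates directly into a Δ-adjusted regression imputation. The sketch below is illustrative only (our own helper names and toy data): it fits the respondent regression and averages observed outcomes with shifted predictions.

```python
def ols(x, y):
    # least squares fit of y = b0 + b1*x
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    return ybar - b1 * xbar, b1

def mu_hat_aux(y0, y1, r1, delta):
    """Pattern mixture estimate of E(Y1) under (15) with the identity link:
    observed Y1 for respondents, b0 + b1*Y0 + delta for nonrespondents."""
    x_obs = [a for a, ri in zip(y0, r1) if ri]
    y_obs = [b for b, ri in zip(y1, r1) if ri]
    b0, b1 = ols(x_obs, y_obs)           # eta fitted to respondents: type (ii)
    total = sum(y_obs)
    total += sum(b0 + b1 * a + delta     # shifted imputation: type (i)
                 for a, ri in zip(y0, r1) if not ri)
    return total / len(y0)
```

Again, delta = 0 reproduces the MAR (regression imputation) estimate, and the sensitivity analysis repeats the calculation over plausible values of delta.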

Hence, at a fixed value of Δ, an estimator of E(Y1 | R1 = 0) can be derived as the sample mean of the regression predictions Ŷ1 = β̂0 + β̂1Y0 + Δ among those with R1 = 0. The estimators β̂0 and β̂1 come from a regression of Y1 on Y0 among those with R1 = 1. In this case, Δ represents the baseline-adjusted difference in the mean of Y1 between nonrespondents and respondents. If Δ > 0 (< 0), then for any fixed value of Y0, the mean of Y1 among nonrespondents is Δ units higher (lower) than the mean of Y1 among respondents.

A few comments are in order for this example:

• Model (12) assumes that mean differences do not depend on Y0. If one believes that they do, then one may choose a more complex version of the g function, such as

E(Y1 | Y0, R1 = 0) = g⁻¹[g{η(Y0)} + Δ0 + Δ1Y0]. (16)

If this version is coupled with a linear regression for η(Y0), then both the slope and the intercept of that regression will differ for respondents and nonrespondents.

• In general, any user-specified sensitivity function d(Y0, Δ) can be posited, including the simple versions d(Y0, Δ) = Δ and d(Y0, Δ) = Δ0 + Δ1Y0. Importantly, no version of d(Y0, Δ) can be checked using the observed data. The choice of d function is a type (i) assumption.

• Likewise, more general choices can be made for the form of η(Y0), including versions that are nonlinear in Y0. The choice of η is a type (ii) assumption; it can be critiqued by standard goodness-of-fit procedures using the observed data.

Example: Binary Outcome Y

If Y1 is binary, the functional forms of g and η will need to be different than in the continuous case. Choosing g(u) = log(u/(1 − u)) implies that Δ is the log odds ratio comparing the odds of Y1 = 1 between nonrespondents and respondents, conditional on Y0. As with the continuous case, Δ > 0 (Δ < 0) implies that, for every level of Y0, nonrespondents are more (less) likely to have Y1 = 1 than respondents.
The function η(Y0), which describes E(Y1 | Y0, R1 = 1), should be specified in terms of a model that is appropriate for binary outcomes. For example, a simple logistic specification is

logit{η(Y0)} = λ0 + λ1Y0, (17)

which is equivalent to writing

η(Y0) = exp(λ0 + λ1Y0) / {1 + exp(λ0 + λ1Y0)}. (18)

When Y0 is binary, this model is saturated. But when Y0 is continuous, or includes other auxiliary covariates, model choice for η will take on added importance.

Inference

A sensitivity analysis to examine how inferences are impacted by the choice of Δ consists of repeating the inference over a set or range of values of Δ deemed to be plausible. It can proceed in the following manner:

Step 1. Specify models for η(Y0) and d(Y0, Δ).

Step 2. Fit the model η(Y0) to those with R1 = 1, and obtain the estimated function η̂(Y0).

Step 3. The full-data mean m = E(Y1) is

m = π E{η(Y0) | R1 = 1} + (1 − π) E[ g⁻¹[g{η(Y0)} + d(Y0, Δ)] | R1 = 0 ], (19)

where expectations are taken over the distribution of Y0 | R1. Although the general formula looks complex, it is easily computed for a fixed value of Δ once the model for η has been fit to data. Specifically,

Step 3a. The estimate of E{η(Y0) | R1 = 1} is the sample mean Σi R1i η̂(Y0i) / Σi R1i.

Step 3b. The estimate of E[ g⁻¹[g{η(Y0)} + d(Y0, Δ)] | R1 = 0 ] also is computed as a sample mean,

Σi (1 − R1i) g⁻¹[ g{η̂(Y0i)} + d(Y0i, Δ) ] / Σi (1 − R1i). (20)

Step 3c. The estimate of π is π̂ = (1/n) Σi R1i.

Step 3d. The estimate µ̂ of m is computed by replacing parameters in (19) by their estimators described in the previous steps.

Step 4. Standard errors are computed using bootstrap resampling.

Step 5. Inferences about m are carried out for a plausible set or range of values of Δ. Because each unique value of Δ yields an estimator µ̂Δ, it is possible to construct a contour plot of Z-scores, p-values, or confidence intervals.
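For the saturated binary case the five steps reduce to a few sample proportions. The sketch below (our own function name and toy data) carries out Steps 2 and 3 with the simple logit-shift sensitivity function d(Y0, Δ) = Δ.

```python
import math

logit = lambda u: math.log(u / (1 - u))
expit = lambda u: 1 / (1 + math.exp(-u))

def mu_hat_binary(y0, y1, r1, delta):
    # Step 2: saturated eta(v) = P(Y1 = 1 | Y0 = v, R1 = 1), fit to respondents
    eta = {}
    for v in (0, 1):
        num = sum(1 for a, b, ri in zip(y0, y1, r1) if ri and a == v and b == 1)
        den = sum(1 for a, ri in zip(y0, r1) if ri and a == v)
        eta[v] = num / den
    # Step 3a: mean of eta(Y0) among respondents
    resp = [eta[a] for a, ri in zip(y0, r1) if ri]
    # Step 3b: logit-shifted mean among nonrespondents, d(Y0, delta) = delta
    nonresp = [expit(logit(eta[a]) + delta) for a, ri in zip(y0, r1) if not ri]
    # Steps 3c-3d: combine with the response proportion pi
    pi = len(resp) / len(y0)
    return pi * sum(resp) / len(resp) + (1 - pi) * sum(nonresp) / len(nonresp)
```

Steps 4 and 5 would wrap this in bootstrap resampling and repeat the calculation over a grid of delta values, possibly a different grid in each treatment arm.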

µ̂IPW = (1/n) Σi R1iY1i / {1 − expit(ĥ(Y0i) + αY1i)}, (25)

where ĥ(Y0) is an estimator of h(Y0). Unless Y0 is discrete with a few levels, estimation of h(Y0) requires the assumption that h(Y0) takes a known form, such as h(Y0; γ) = γ0 + γ1Y0. (Note that if one adopts this model, one is assuming that the probability of response follows a logistic regression model on Y0 and Y1 with a given specified value for the coefficient α of Y1.) Specifying h(Y0) is a type (ii) assumption that is technically not needed to identify m but is needed in practical situations involving finite samples. One can compute an estimator γ̂ of γ by solving a set of estimating equations⁴ for γ,

Σi [ R1i / {1 − expit(h(Y0i; γ) + αY1i)} − 1 ] ∂h(Y0i; γ)/∂γ = 0. (26)

Formulas for sandwich-type standard error estimators are available, but the bootstrap can also be used to compute standard error estimates. Hypothesis-testing sensitivity analysis is conducted in a manner similar to the one described in the example above with no auxiliary data.

As with the pattern mixture models, by repeating the estimation of m at a set or interval of known α values, one can examine how different degrees of residual association between nonresponse and the outcome Y1 affect inferences concerning E(Y1). A plot similar to the one constructed for the pattern mixture model is given in Figure 5-2.

EXAMPLE: GENERAL REPEATED MEASURES SETTING

As the number of planned measurement occasions increases, the complexity of the sensitivity analysis grows because the number of missing data patterns grows. As a result, there can be limitless ways of specifying models. Consider a study with K scheduled postbaseline visits. In the special case of monotone missing data, there are (K + 1) patterns representing each of the visits at which a subject might last be seen, that is, 0, ..., K.
⁴ As with the selection approach of the example with no auxiliary data, to estimate γ one cannot fit a logistic regression model because Y1 is missing when R1 = 0. The estimator γ̂ exploits the identity E[ R1 / P(R1 = 1 | Y0, Y1) | Y0 ] = 1.
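Solving (26) is a small Newton iteration, and, as the footnote notes, terms with R1 = 0 reduce to −(1, Y0i), so missing Y1 values never enter the computation. The sketch below is a minimal implementation under our own naming, with toy data in the usage check, not the chapter's code.

```python
import math

expit = lambda u: 1 / (1 + math.exp(-u))

def fit_gamma(y0, y1, r1, alpha, iters=100):
    """Newton's method for the estimating equations (26) with
    h(Y0; gamma) = g0 + g1*Y0, at a fixed sensitivity value alpha."""
    g0 = g1 = 0.0
    for _ in range(iters):
        F0 = F1 = J00 = J01 = J11 = 0.0
        for a, b, ri in zip(y0, y1, r1):
            F0 -= 1.0                        # the -x_i term for every subject
            F1 -= a
            if ri:                           # respondents contribute w*x_i
                s = expit(g0 + g1 * a + alpha * b)   # P(R1 = 0 | Y0, Y1)
                w = 1.0 / (1.0 - s)
                F0 += w
                F1 += w * a
                q = s * w                    # derivative factor s/(1-s)
                J00 += q; J01 += q * a; J11 += q * a * a
        det = J00 * J11 - J01 * J01          # 2x2 Newton step: d = -J^{-1} F
        d0 = -(J11 * F0 - J01 * F1) / det
        d1 = -(-J01 * F0 + J00 * F1) / det
        g0 += d0; g1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return g0, g1

def mu_ipw_aux(y0, y1, r1, alpha):
    # equation (25): IPW estimator with estimated h(Y0) = g0 + g1*Y0
    g0, g1 = fit_gamma(y0, y1, r1, alpha)
    return sum(b / (1 - expit(g0 + g1 * a + alpha * b))
               for a, b, ri in zip(y0, y1, r1) if ri) / len(y0)
```

At alpha = 0 this reduces to the MAR answer that reweights respondents within levels of Y0; varying alpha over a plausible range gives the sensitivity analysis of Figure 5-2.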

FIGURE 5-2 Selection model sensitivity analysis. Left panel: plot of mean outcome among nonrespondents as a function of sensitivity parameter α, where α = 0 corresponds to MAR. Center panel: plot of full-data mean as a function of α. Right panel: contour of Z statistic for comparing placebo to active treatment, where α is varied separately by treatment. [Figure not reproduced.]

The (K + 1)st pattern represents subjects with complete data, while the other K patterns represent those with varying degrees of missing data. In the general setting, there are many ways to specify pattern models—the models that link the distribution of missing outcomes to the distribution of observed outcomes within specified strata—and it is generally necessary to look for simplifications of the model structure.

For example, one could link the conditional (on a shared history of observed outcomes through visit k − 1) distribution of missing outcomes at visit k among those who were last seen at visit k − 1 to (a) the distribution of outcomes at visit k among those who complete the study, (b) the distribution of outcomes at visit k among those who are in the study through visit k, or (c) the distribution of outcomes at visit k among those who are last seen at visit k.

Let Yk denote the outcome scheduled to be measured at visit k, with visit 0 denoting the baseline measure. We use the notation Ȳk⁻ = (Y0, …, Yk) to denote the history of the outcomes through visit k and Ȳk⁺ = (Yk+1, …, YK) to denote the future outcomes after visit k. We let Rk denote the indicator that Yk is observed, so that Rk = 1 if Yk is observed and Rk = 0 otherwise. We assume that Y0 is observed on all individuals so that R0 = 1. As above, we focus on inference about the mean m = E(YK) of the intended outcome at the last visit K.

Monotone Missing Data

Under monotone missingness, if the outcome at visit k is missing, then the outcome at visit k + 1 is missing. If we let L be the last visit at which a subject has a measurement observed, then the observed data for a subject are ȲL⁻ = (Y0, …, YL), where L ≤ K.

A Pattern Mixture Model Approach

As noted above, there are many pattern models that can be specified. Here, we discuss inference in one such model. Recall that both type (i) and type (ii) assumptions are needed.
We first address the type (i) specification and then illustrate a way to link the distributions of the missing observations to those of the observed data. The general strategy is illustrated for the case K = 3 and relies on an assumption known as "nonfuture dependence" (Kenward et al., 2003). In simple terms, the nonfuture dependence assumption states that the probability of dropping out at time L can depend only on the observed data up to time L and on the possibly missing value of Y_L, but not on future values of Y.

In the model used here, we assume there is a link between [Y_k | Y_{k-1}^-, L = k – 1] and [Y_k | Y_{k-1}^-, L > k – 1], which are, respectively, the distributions of Y_k among those who do and do not drop out at time k – 1. The idea is to use the distribution of those still in the study at time k – 1 to identify the distribution of those who drop out at k – 1.

It can be shown that μ = E(Y_K) can be estimated by a recursion algorithm, provided the following observed-data distributions can be estimated:

    [Y_3 | Y_0, Y_1, Y_2, L = 4], [Y_2 | Y_0, Y_1, L ≥ 3], [Y_1 | Y_0, L ≥ 2], [Y_0 | L ≥ 1],   (27)

and the following dropout probabilities can also be estimated:

    P(L = 4 | L ≥ 3, Y_0, Y_1, Y_2, Y_3), P(L = 3 | L ≥ 2, Y_0, Y_1, Y_2),
    P(L = 2 | L ≥ 1, Y_0, Y_1), P(L = 1 | Y_0).   (28)

Each is identified from the observed data when missingness is monotone. What is needed to implement the estimation of μ = E(Y_K) is a model that links the distributions with observed data in (27) to the distributions having missing observations. One simple way to do this is to assume that the distribution of Y_k among recent dropouts, [Y_k | Y_{k-1}^-, L = k – 1], follows the same parametric model as the distribution of Y_k among respondents, [Y_k | Y_{k-1}^-, L > k – 1], but with a different, or shifted, parameter value. This assumption cannot be verified and may not be realistic in all studies; we use it here simply as an illustration.

To be more concrete, suppose that the outcomes Y_0, ..., Y_3 are continuous. One can assume regression models for each component of (27) as follows:

    E(Y_0 | L ≥ 1) = μ_0,   (29)
    E(Y_1 | Y_0, L ≥ 2) = μ_1 + β_1 Y_0,   (30)
    E(Y_2 | Y_0, Y_1, L ≥ 3) = μ_2 + β_2^T (Y_0, Y_1),   (31)
    E(Y_3 | Y_0, Y_1, Y_2, L = 4) = μ_3 + β_3^T (Y_0, Y_1, Y_2).   (32)

This modeling of the observed-data distribution comprises our type (i) assumptions. These can (and must) be checked using the observables.
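To make the recursion concrete, the sketch below simulates monotone dropout, fits sequential regressions of the form (29)–(32) among subjects still under observation, and then estimates μ = E(Y_3) by sequentially imputing each missing outcome from its fitted conditional mean plus a shift δ. This is a simplified stand-in for the recursion algorithm described above (a single shift δ applied at every imputation step), not the panel's exact procedure; the simulated data and all names are illustrative, and δ = 0 reproduces an MAR analysis.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n, K = 5000, 3

# Simulate four outcomes Y_0..Y_3 with an autoregressive structure.
Y = np.zeros((n, K + 1))
Y[:, 0] = rng.normal(size=n)
for k in range(1, K + 1):
    Y[:, k] = 0.8 * Y[:, k - 1] + rng.normal(size=n)

# Monotone MAR dropout: L is the first missed visit; L = K + 1 marks completers,
# so Y_k is observed exactly when L > k (matching the patterns in (27)-(32)).
L = np.full(n, K + 1)
for k in range(1, K + 1):
    at_risk = L == K + 1
    drop = rng.random(n) < expit(-2.0 + 0.5 * Y[:, k - 1])
    L[at_risk & drop] = k

def estimate_mu(Y, L, delta):
    """Fit linear regressions like (29)-(32) among subjects observed at each
    visit, then sequentially impute each missing outcome from its fitted
    conditional mean shifted by delta (delta = 0 corresponds to MAR)."""
    Yc = Y.copy()
    for k in range(1, K + 1):
        obs = L > k                              # Y_k observed iff L > k
        X = np.column_stack([np.ones(n), Yc[:, :k]])
        beta, *_ = np.linalg.lstsq(X[obs], Y[obs, k], rcond=None)
        Yc[~obs, k] = X[~obs] @ beta + delta     # type (ii) mean shift
    return Yc[:, K].mean()                       # estimate of mu = E(Y_K)

for delta in (-1.0, 0.0, 1.0):
    print(f"delta = {delta:+.1f}  mu_hat = {estimate_mu(Y, L, delta):+.3f}")
```

Repeating the estimation over a grid of δ values traces out the sensitivity of the estimated mean to the assumed departure from MAR.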
Using type (ii) assumptions, the distributions of the missing Y's can be linked in a manner similar to that of the first example above. For example,

those with L = 1 are missing Y_1. One can link the observed-data regression E(Y_1 | Y_0, L ≥ 2) to the missing-data regression E(Y_1 | Y_0, L = 1) through

    E(Y_1 | Y_0, L = 1) = μ_1* + β_1* Y_0,   (33)

where, say, μ_1* = μ_1 + Δμ_1 and β_1* = β_1 + Δβ_1. Models for missing Y_2 and Y_3 can be specified similarly. As with the previous cases, (33) is a type (ii) assumption and cannot be checked with data. Moreover, even using a simple structure like (33), the number of sensitivity parameters grows very quickly with the number of repeated measures. Hence, it is important to consider simplifications, such as setting Δβ = 0, assuming Δμ is the same across patterns, or some combination of the two.

Note that under our assumptions, Δμ_k is the difference between the mean of Y_k among those who drop out at k – 1 and those who remain beyond k – 1, conditional on the observed data history up to k – 1. In this example, the assumption of linearity in the regression models, combined with the assumption that Δβ_k = 0 for all k, means that one does not need a model for P(L = k | L ≥ k, Y_k^-) to implement the estimation via the recursion algorithm. A sensitivity analysis consists of estimating μ and its standard error repeatedly over a range of plausible values of the specified Δ parameters. For this illustration, setting Δ = 0 implies MAR.5

Selection Model

Another way to posit type (i) assumptions in this setting is to postulate a model for how the odds of dropping out between visits k and k + 1 depend on the (possibly missing) future outcomes Y_k^+, given the recorded history Y_k^-.
5. An attractive feature of the pattern mixture approach we consider here (the one that links the distribution of outcomes between dropouts at a given time and those who remain in the study at that time) is that the special choice of link specifying that these two distributions are the same is tantamount to the MAR assumption (i.e., the assumption that at any given occasion the past recorded data are the only predictors of the future outcomes that are used to decide whether or not to drop out of the study at that time). This feature does not hold with other choices of pattern mixture models. Thus, in our example, exploring how inferences about μ change as Δ_k moves away from Δ_k = 0 is tantamount to exploring the impact of distinct degrees of residual dependence between the missing outcomes and dropping out on our inferences about μ. In more general pattern mixture models, Δ = 0 is only sufficient, but not necessary, for MAR to hold. It is possible to find other combinations of Δ that correspond to MAR.

That is,

    odds(L = k | L ≥ k, Y_k^-, Y_k^+) = P(L = k | L ≥ k, Y_k^-, Y_k^+) / P(L > k | L ≥ k, Y_k^-, Y_k^+).

The MAR assumption states that the odds do not depend on the future outcomes Y_k^+. The nonfuture dependence assumption above states that they depend on the future only through Y_{k+1}. That is,

    odds(L = k | L ≥ k, Y_k^-, Y_k^+) = odds(L = k | L ≥ k, Y_k^-, Y_{k+1}),   (34)

which is equivalent to assuming that, after adjusting for the recorded history, the outcome to be measured at visit k + 1 is the only predictor of all the future missing outcomes that is associated with the odds of dropping out between visits k and k + 1. This last assumption, coupled with an assumption that quantifies the dependence of the odds on the right-hand side on Y_{k+1}, suffices to identify μ = E(Y_K); in fact, it suffices to identify E(Y_k) for any k = 1, ..., K. For example, one might assume

    odds(L = k | L ≥ k, Y_k^-, Y_{k+1} + 1) / odds(L = k | L ≥ k, Y_k^-, Y_{k+1}) = exp(α),   (35)

that is, that each unit increase in Y_{k+1} is associated with a constant increase α in the log odds of nonresponse, the same for all values of Y_k^- and all visits k. Under (34), α = 0 implies MAR. One would make this choice if it is believed that the recorded history Y_k^- encodes all the predictors of Y_{k+1} that are associated with missingness. Values of α ≠ 0 reflect residual association between dropping out between visits k and k + 1 and the possibly unobserved outcome Y_{k+1}, after adjusting for previous outcomes, and hence the belief that dropping out cannot be entirely explained by the observed recorded history Y_k^-. By repeating the estimation of μ for each fixed α, one can examine how different degrees of residual association between dropping out and the outcome at each occasion, after adjusting for the influence of recorded history, affect inferences concerning μ.
Assumptions (34) and (35) together are equivalent to specifying that

    logit{P(L = k | L ≥ k, Y_k^-, Y_k^+)} = h_k(Y_k^-) + α Y_{k+1},   (36)

where h_k(Y_k^-) is an unknown function of Y_k^-. This, in turn, is equivalent to the pattern mixture model

    p(y_{k+1} | L = k, Y_k^-) = p(y_{k+1} | L ≥ k + 1, Y_k^-) × exp(α y_{k+1}) × const.   (37)

In this latter form, one can see that there is no evidence in the data regarding α, since it serves as the link between the conditional (on Y_k^-) distribution of Y_{k+1} among those who drop out between visits k and k + 1 and those who remain through visit k + 1.

If one believes that the association between dropping out and future outcomes depends solely on the current outcome but varies according to the recorded history, one can replace α with a known function of Y_k^-. For instance, replacing the constant α in equation (36) with the function α_0 + α_1 Y_k, with α_0 and α_1 specified, encodes the belief that the residual association between dropping out between k and k + 1 and the outcome Y_{k+1} may be stronger for individuals with, say, higher (if α_1 > 0) values of the outcome at visit k. As an example, if Y_k is a strong predictor of Y_{k+1} and lower values of Y_{k+1} are preferable (e.g., HIV-RNA viral load), then it is reasonable to postulate that subjects with low values of Y_k drop out for reasons unrelated to the drug's efficacy (and, in particular, to their outcome Y_{k+1}), while subjects with higher values of Y_k drop out for reasons related to drug efficacy and hence to their outcome Y_{k+1}.

Regardless of how the residual dependence is specified, μ can be expressed in terms of the distribution of the observed data; that is, it is identified. Estimation of μ = E(Y_K) relies on the identity

    E[Y_K] = E[ R_K Y_K / π_K(Y_K^-; h_0, ..., h_{K-1}; α) ],   (38)

where

    π_K(Y_K^-; h_0, ..., h_{K-1}; α) = ∏_{k=0}^{K-1} [1 − expit{h_k(Y_k^-) + α Y_{k+1}}].

This formula suggests that one can estimate μ with the inverse probability weighting (IPW) estimator

    μ̂_IPW = (1/n) Σ_i R_iK Y_iK / π_K(Y_iK^-; ĥ_i0, ..., ĥ_i,K-1; α).   (39)

This estimator relies on the estimators ĥ_ik = ĥ_k(Y_ik^-).
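Before turning to estimation of the h functions, the equivalence between the selection form (36) and the pattern mixture form (37) can be checked numerically on a toy discrete example (a single occasion, with the history Y_k^- suppressed; all numbers are illustrative):

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Selection model on a discrete outcome y in {0, 1, 2}:
# logit P(drop | y) = h + alpha * y
h, alpha = -1.0, 0.7
y = np.array([0.0, 1.0, 2.0])
p_y = np.array([0.5, 0.3, 0.2])          # marginal distribution of y
p_drop = expit(h + alpha * y)            # dropout probability given y

# Conditional distributions of y among dropouts and among those who remain,
# obtained from Bayes' rule.
p_y_drop = p_y * p_drop / np.sum(p_y * p_drop)
p_y_stay = p_y * (1 - p_drop) / np.sum(p_y * (1 - p_drop))

# Pattern mixture form (37): p(y | drop) proportional to p(y | stay) * exp(alpha*y).
tilted = p_y_stay * np.exp(alpha * y)
tilted /= tilted.sum()

print(np.allclose(p_y_drop, tilted))     # True: the two forms coincide
```

The exponential tilt exp(α y) is exactly the factor by which the dropout odds change per unit of y, which is why the data alone carry no information about α.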
In order to estimate the functions h_k, one needs to impose type (ii) modeling assumptions on h_k(Y_k^-), that is, h_k(Y_k^-) = h_k(Y_k^-; γ_k). For example, one can assume that h_k(Y_k^-) = γ_{0,k} + γ_{1,k} Y_k (adopting such a model would be tantamount to assuming that the probability of dropping out at each time follows a logistic regression model on just the immediately preceding recorded data and on the current outcome).
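The estimation strategy just described can be sketched for the simplest case of a single post-baseline outcome (K = 1). At a fixed α, γ = (γ_0, γ_1) is estimated from moment conditions that remain computable even though Y_1 is missing for dropouts, and the IPW estimator (39) is then evaluated over a grid of α. The simulated data, the choice of a linear h, and all names are illustrative, not a prescribed implementation.

```python
import numpy as np
from scipy.optimize import root

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n = 20000
Y0 = rng.normal(size=n)
Y1 = 0.7 * Y0 + rng.normal(size=n)       # true E(Y_1) = 0

# MNAR dropout between visits 0 and 1: the hazard depends on the possibly
# missing Y_1 through alpha_true, with h(Y_0) = -1.5 + 0.3 * Y_0.
alpha_true = 0.5
R1 = (rng.random(n) > expit(-1.5 + 0.3 * Y0 + alpha_true * Y1)).astype(float)

def gamma_equations(gamma, alpha):
    """Moment conditions for h(Y_0; gamma) = gamma_0 + gamma_1 * Y_0 at a
    fixed alpha.  Each summand is computable from observed data because Y_1
    is needed only where R_1 = 1."""
    pi = 1.0 - expit(gamma[0] + gamma[1] * Y0
                     + alpha * np.where(R1 == 1, Y1, 0.0))
    resid = np.where(R1 == 1, 1.0 / pi, 0.0) - 1.0
    return np.array([resid.mean(), (resid * Y0).mean()])

def mu_ipw(alpha):
    """IPW estimate of E(Y_1), as in (39), at a fixed sensitivity alpha."""
    g0, g1 = root(gamma_equations, x0=[-1.5, 0.0], args=(alpha,)).x
    pi = 1.0 - expit(g0 + g1 * Y0 + alpha * np.where(R1 == 1, Y1, 0.0))
    return np.mean(np.where(R1 == 1, Y1 / pi, 0.0))

for a in (0.0, 0.5, 1.0):                # sensitivity analysis over alpha
    print(f"alpha = {a:.1f}  mu_hat = {mu_ipw(a):+.3f}")
```

In this simulation, dropout truly depends on Y_1 (α = 0.5), so the MAR analysis (α = 0) is biased, while the analysis at the true α approximately recovers E(Y_1) = 0.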

As with the selection approach of the two preceding examples, to estimate γ_k one cannot fit a standard logistic regression model, because Y_{k+1} is missing when L = k. However, one can instead estimate it by solving the estimating equations

    Σ_{i=1}^n Σ_{k=0}^{K-1} R_ik [ R_{i,k+1} / (1 − expit{h_k(Y_ik^-; γ_k) + α Y_{i,k+1}}) − 1 ] ∂h_k(Y_ik^-; γ_k)/∂γ_k = 0   (40)

for γ, justified on grounds similar to those for the estimators of the h functions in the previous examples. Formulas for sandwich-type standard error estimators are available (see Rotnitzky et al., 1997), but the bootstrap can also be used to compute standard error estimates. Sensitivity analysis with regard to hypothesis testing is conducted in a manner similar to the one described in the first example above.

Nonmonotone Missing Data

A typical way to analyze nonmonotone missing data is to treat the time of dropout as the key missing data variable and then to assume MAR within dropout pattern (or conditional on dropout time). The advantage of this approach is purely practical: it interpolates missing data under a specified model. That said, the current literature suggests that MAR within pattern does not easily correspond to realistic mechanisms for generating the data. This raises concern among members of the panel that nonmonotone dropouts may require more specialized methods for modeling the missing data mechanism and accounting for departures from MAR. This topic has not been deeply studied in the extant statistical literature, and numerical studies in particular are lacking. We recommend this as a key area of investigation that will: (a) examine the appropriateness of existing models and in particular the potential pitfalls of assuming MAR within missing data pattern; and (b) develop and apply novel, appropriate methods of model specification and sensitivity analysis to handle nonmonotone missing data patterns.
COMPARING PATTERN MIXTURE AND SELECTION APPROACHES

The main appeal of the selection model approach is that, because it models the probability of nonresponse rather than the distribution of outcomes, it can easily accommodate vectors of auxiliary factors whose components can be of any type: discrete, categorical, or continuous.

Two disadvantages of the selection approach as they relate to drawing inferences are (a) that the inverse weighting estimation procedure can yield relatively inefficient inferences (i.e., large standard errors), and (b) that model checking of the type (ii) assumptions must be conducted for each unique value of the sensitivity analysis parameters. Formal model checking procedures have yet to be developed for this setting. The inefficiencies associated with the inverse weighting procedure are mitigated in settings with a sizable fraction of missing data, because there the sampling variability is often of less concern than the range of type (i) assumptions being entertained. To address (b), one should fit a highly flexible model for the h function in the selection model.

Another potential disadvantage of selection models relates to interpretation of the sensitivity parameter. Particularly for continuous measures, it may be difficult to interpret nonresponse rates on the odds scale and to specify reasonable ranges for the sensitivity parameter. Plots such as those shown in Figure 5-2 (above) can be helpful in understanding how values of the sensitivity parameter correspond to imputed means for the missing outcomes.

Advantages of the pattern mixture model include transparent interpretation of the sensitivity parameters and straightforward model checking for the observed-data distribution. The sensitivity parameters are typically specified in terms of differences in means between respondents and nonrespondents, which appeal directly to intuition and contribute to formulating plausible ranges for the parameters. Pattern mixture models also can be specified so that the fit to the observed data is identical across all values of the sensitivity parameters; hence, model checking is straightforward and does not depend on the assumed missing data mechanism.
Disadvantages of pattern mixture modeling include difficulties in including auxiliary information, which will generally require additional modeling. Computation of the weighted averages across patterns for models with large numbers of repeated measures also can become complex without significant simplifying assumptions.

TIME-TO-EVENT DATA

A major challenge in the analysis of time-to-event outcomes in randomized trials is to properly account for censoring that may be informative. Different approaches have been proposed in the research literature to address this issue. When no auxiliary prognostic factors are available, the general strategy has been to impose nonidentifiable assumptions concerning the dependence between failure and censoring times and then to vary these assumptions in order to assess the sensitivity of inferences about the estimated survivor function. When prognostic factors are recorded, Robins and colleagues in a series of papers (Robins and Rotnitzky, 1992; Robins, 1993;

Robins and Finkelstein, 2000) proposed a general estimation strategy under the assumption that all measured prognostic factors that predict censoring are recorded in the database. Scharfstein and Robins (2002) proposed a method for conducting sensitivity analysis under the assumption that some, but not all, joint prognostic factors for censoring and survival are available. Their approach is to repeat inference under different values of a nonidentifiable censoring bias parameter that encodes the magnitude of the residual association between survival and censoring after adjusting for measured prognostic factors.

In randomized studies, censoring typically occurs for several reasons, some noninformative, others informative. For instance, in studies with staggered entry, the administrative end of the follow-up period typically induces noninformative censoring. However, loss to follow-up due to dropout induces a competing censoring mechanism that is likely to be informative. Treatment discontinuation might induce yet another informative censoring process. Under the Scharfstein and Robins methodology, the analyst specifies a range for the parameter encoding the residual dependence of the hazard of the minimum of the competing censoring times on the censored outcome. However, this range might be rather difficult to specify if the reasons that each type of censoring might occur are quite different, more so if some censoring processes are informative and some are not. To ameliorate this problem in studies with staggered entry, one can eliminate the censoring induced by the administrative end of the follow-up period (typically a source of noninformative censoring) by restricting the follow-up period to a shorter interval in which (with probability one) no subject is administratively censored.
However, in doing so, one would lose valuable information on the survival experience of the study patients who remain at risk at the end of the reduced analysis interval. Rotnitzky et al. (2007) provide estimators of the survival function under separate models for the competing censoring mechanisms, including both informative and noninformative censoring. The methods can be used to exploit the data recorded throughout the entire follow-up period and, in particular, beyond the end of the reduced analysis interval discussed above.

DECISION MAKING

Even after model fitting and sensitivity analysis, investigators have to decide how important the treatment effect is. Unfortunately, there is no scientific consensus on how to synthesize information from a sensitivity analysis into a single decision about a treatment effect. At least three possibilities can be considered.

One possibility is to specify a plausible region for the sensitivity parameters and report estimates of the lower and upper bounds from this range. These

endpoints form bounds on the estimated treatment effect and would be used in place of point estimates. Accompanying these bounds would be a 95 percent confidence region. This procedure can be viewed as accounting for both sampling variability and variability due to model uncertainty (i.e., uncertainty about the sensitivity parameter value); see Molenberghs and Kenward (2007) for more detailed discussion and recommendations for computing a 95 percent confidence region.

A second possibility is to carry out inference under MAR and determine the set of sensitivity parameter values that would lead to overturning the conclusion reached under MAR. Results can be viewed as equivocal if the inference about treatment effects could be overturned for values of the sensitivity parameter that are plausible.

The third possibility is to derive a summary inference that averages over values of the sensitivity parameters in some principled fashion. This approach could be viewed as appropriate in settings in which reliable prior information about the sensitivity parameter value is known in advance.

Regardless of the specific approach taken to decision making, the key issue is weighting the results, either formally or informally, from both the primary analysis and each alternative analysis by assessing the reasonableness of the assumptions made in conjunction with each analysis. The analyses should be given little weight when the associated assumptions are viewed as extreme and substantial weight when the associated assumptions are viewed as comparably plausible to those of the primary analysis. Therefore, in situations in which alternative analyses within the sensitivity analysis support inferences contrary to that of the primary analysis, if the associated assumptions are viewed as fairly extreme, it would be reasonable to continue to support the inference from the primary analysis.
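The first possibility can be sketched in a few lines. The estimates and standard errors below are illustrative numbers only, not output from any trial; the region computed is the conservative union of pointwise 95 percent intervals over the grid of sensitivity parameter values, one simple choice among those discussed by Molenberghs and Kenward (2007).

```python
import numpy as np

# Hypothetical sensitivity-analysis output: treatment-effect estimates and
# standard errors over a plausible grid of the sensitivity parameter delta.
delta = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
est   = np.array([0.80, 1.10, 1.40, 1.70, 2.00])   # illustrative values
se    = np.array([0.40, 0.38, 0.37, 0.38, 0.40])

# Bounds on the estimated effect over the plausible region of delta,
# reported in place of a single point estimate.
lower_bound, upper_bound = est.min(), est.max()

# A conservative 95 percent region: the union of the pointwise intervals,
# accounting for both sampling variability and model uncertainty.
region = (np.min(est - 1.96 * se), np.max(est + 1.96 * se))

print(f"effect bounds over delta grid: [{lower_bound:.2f}, {upper_bound:.2f}]")
print(f"95% uncertainty region:        [{region[0]:.3f}, {region[1]:.3f}]")
```

Because the region contains every pointwise interval, it is wider than the bounds on the estimates themselves; whether the region excludes zero is one way to judge robustness of the treatment-effect conclusion.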
RECOMMENDATION

Recommendation 15: Sensitivity analyses should be part of the primary reporting of findings from clinical trials. Examining sensitivity to the assumptions about the missing data mechanism should be a mandatory component of reporting.

We note that there are some often-used models for the analysis of missing data in clinical trials for which the form of a sensitivity analysis has not been fully developed in the literature. Although we have provided principles for the broad development of sensitivity analyses, we have not been prescriptive for many individual models. It is important that additional research be carried out so that methods to carry out sensitivity analyses for all of the standard models are available.