Methodological Challenges in Biomedical HIV Prevention Trials

9 Interim Monitoring and Analysis of Results

Randomized clinical trials often take several years to complete subject enrollment, or accrual, and follow-up. That means that information about the risks and benefits of the intervention becomes available during the trial, sometimes from the trial itself and sometimes from external sources such as other trials. This information provides a scientific basis for monitoring the interim results of the trial—and indeed the ethical necessity to do so—to assess whether the trial should be modified in some way, or possibly terminated, given those results. During interim reviews of the trial, as well as after it has been completed, investigators must analyze the results in valid ways that reflect the trial’s design and protocol. This chapter explores the challenges entailed in performing interim monitoring and analyzing the results of HIV prevention trials.1

ENSURING EFFECTIVE INTERIM MONITORING

The evolving interim results of phase 3 and some phase 2 randomized trials are typically monitored by a data monitoring committee (DMC) (also known as a data and safety monitoring board, or a data monitoring board). Such a committee is composed of independent experts appointed by the study investigators or sponsor to ensure that the best interests of study participants are met during the trial (Ellenberg et al., 2002).

1 For more on reporting trial results, see the CONSORT guidelines at http://www.consort-statement.org/index.aspx?o=1030.
For example, the DMC monitors whether randomly assigning participants to intervention and control groups is still ethical given interim results, and whether subjects who are already enrolled should continue to receive their assigned interventions. The DMC also tracks a study’s evolving results to determine whether the trial still has the potential to achieve its scientific goals. The DMC may also recommend modifications to the trial design based on interim results, including changing the target period for enrolling or following subjects, or modifying the criteria for enrolling subjects.

Though not the focus of this chapter, another function of the DMC is to evaluate the quality of the study conduct. In particular, the DMC usually reviews investigators’ compliance with data management and operating procedures. For example, the DMC may monitor the accuracy and completeness of the data collected, the trial’s compliance with restrictions on the eligibility of some potential participants, the adequacy of their accrual rates, and the trial’s adherence to drug distribution policies. If it detects problems, the DMC may suggest changes to procedures (Ellenberg et al., 2002).

After reviewing a trial’s interim results, a DMC could recommend terminating the trial for a number of reasons, including the following:

- The intervention and control arms are convincingly different (that is, the intervention is efficacious), or, in the case of a noninferiority trial, the study arms are convincingly similar.
- One or more of the study arms produces unacceptable side effects or toxicity.
- Accrual of participants is so slow that completion of the trial in a reasonable time period is no longer feasible.
- Information from other studies with related goals and similar intervention arms makes continuation of the trial unnecessary or unethical.
This section reviews key aspects of interim monitoring of randomized HIV prevention trials, including the composition of DMCs and the typical format of their meetings; the importance of access to complete information; challenges in monitoring trial assumptions, safety, efficacy, and futility; and the use of information from sources external to the trial.

DMC Composition and Meetings

The DMC for an HIV prevention trial typically includes statisticians and clinicians, and often other scientists such as a virologist or someone with expertise in a key diagnostic test, an ethicist, and a lay participant—all appointed by the study’s investigators or sponsors (Ellenberg et al., 2002; Fleming et al., 2002). Because of the central role of behavior in biomedical
HIV prevention studies, their DMCs should usually also include an individual with expertise in behavioral or social sciences.

HIV prevention trials are often designed and sponsored by investigators and organizations based outside the countries where the trials occur, such as pharmaceutical companies, governments, or nonprofit foundations. If that is the case, including representatives of local communities on the DMC is critical. For example, in the late 1990s, two mother-to-child HIV prevention trials were undertaken in Thailand, supported by the U.S. government and designed primarily by non-Thai scientists (Shaffer et al., 1999; Lallemant et al., 2000). Although both trials demonstrated significant declines in mother-to-child transmission, the trial that compared a shortened AZT (antiretroviral) regimen to no treatment had a DMC with minimal representation from the host country. That fact helped spark considerable ethical debate about the use of a placebo group when the efficacy of AZT had been established elsewhere (Angell, 1997).

Recommendation 9-1: The data monitoring committees of trials with sponsors and scientific leaders from outside the host country should include multiple representatives from the host country. These members—who should compose at least one-third of the committee—should include scientists, ethicists, and lay people familiar with the community and local norms.

DMC Meetings

A key consideration for DMCs is how often they should meet. A trial’s protocol for interim monitoring should include guidelines for determining the frequency of meetings—typically expressed as a measure of information, such as the number of observed HIV infections. For example, a trial design might call for an interim efficacy analysis when 25, 50, and 75 percent of the anticipated number of HIV infections in the control group have occurred.
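The event counts that trigger such interim analyses can be computed directly. In this sketch, the anticipated total of 120 control-group infections is hypothetical; the 25/50/75 percent fractions are those of the example above.

```python
def analysis_triggers(anticipated_infections, fractions=(0.25, 0.50, 0.75)):
    """Return the observed-infection counts at which the DMC would
    hold each interim efficacy analysis, for an event-driven schedule."""
    return [round(anticipated_infections * f) for f in fractions]

# A trial anticipating 120 control-group infections (a hypothetical figure)
# would hold interim analyses after 30, 60, and 90 infections are observed.
print(analysis_triggers(120))  # -> [30, 60, 90]
```

Expressing the schedule in terms of information (observed events) rather than calendar time keeps the analyses tied to the statistical precision actually accumulated, regardless of how quickly infections occur.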
The protocol should also specify how the trial should “spend” the overall type I error (say, 5 percent) among its interim and final analyses. (See more on this below, and, for example, Ellenberg et al., 2002.) As noted, DMCs also meet to monitor participant accrual, HIV incidence rates, attrition, and adherence, and to assess safety—sometimes while also assessing efficacy. It is common, and advisable, to require that a DMC meet at least once a year to perform such monitoring.

The DMC typically holds open sessions at which it discusses the progress of a trial with key investigators, including the sponsors. This information—usually presented in an “open” report—may include the rates at which the trial is enrolling subjects, their baseline characteristics, the completeness of the data that the trial is collecting, and its ability to retain
subjects, all aggregated across study arms. The committee also reviews a “closed” report—typically prepared by the study’s statistician—in a closed session, usually attended only by DMC members and the study statistician. This report usually includes summaries of safety, efficacy, adherence, and attrition by blinded study arm.

Most DMCs have the authority to unblind the study arms—that is, to find out which arm the data are from—if they feel that doing so is important to determining whether to modify or continue a trial. For example, a nonsignificant trend in trial results favoring the control arm could convince the DMC to recommend ending a trial, but a similar trend favoring the experimental arm would typically convince the committee to continue the trial. If the former situation potentially exists, DMC members should unblind themselves to determine whether the trial should end on the grounds that the experimental arm is not helping subjects as much as the control arm.

In some instances, DMCs have operated under criteria that members will remain blinded unless interim analyses comparing efficacy among a trial’s arms demonstrate a significant result. For example, Van Damme and colleagues (2002) report that the DMC for the N-9 microbicide trial had planned to remain blinded unless results from the study arms became significantly different at the P = 0.001 level. The committee fails to see the rationale behind such criteria in trials comparing a new intervention to a control group, because the threshold for stopping a trial due to a higher risk of HIV infection in the intervention group should be lower than the threshold for stopping the trial because of a lower risk of infection in the intervention group.
For example, a nonsignificant trend suggesting increased HIV risk in the intervention group would usually mean that there is a real concern that the participants are being harmed, and also that the trial would be unlikely to demonstrate a significantly lower HIV infection risk in the intervention group if the trial were completed as planned. This was the motivation for the recent termination of the Merck STEP trial (http://www.avac.org/pdf/STEP_data_release.7Nov.pdf). The committee believes that DMCs should always have the option of unblinding study arms if they believe that doing so is in the best interests of the participants.

Recommendation 9-2: The data monitoring committees for HIV prevention trials should always have the option of unblinding interim results if they believe that doing so might lead them to recommend that the trial be modified or terminated, or lead to other actions that are in the best interests of the trial participants. In particular, when the efficacy data show nonsignificant trends favoring one of the blinded
arms, a DMC should unblind itself, as such a trend might reflect an intervention that is harming participants.

Deciding Whether a Trial Remains Feasible

In most randomized clinical trials, the DMC monitors the assumptions used to determine a trial’s sample size and planned duration to ensure that it remains feasible. Ideally, a charter prepared prior to the start of a trial details the tasks the DMC will perform and the criteria it will use. In HIV prevention trials, these assumptions include the following:

- Assumed versus actual rates of subject accrual, and the demographics of enrolled subjects
- Assumed versus actual HIV infection rates
- Assumed versus actual adherence of subjects to study interventions
- Assumed versus actual retention of subjects, including rates of loss to follow-up, and missing data
- Assumed versus actual rates of pregnancy and other reasons for discontinuing the product

Enough information is usually available during a trial to estimate participant accrual, adherence, behavior, and retention rates precisely. (An important consideration is whether adherence and behavior can be measured in an unbiased fashion—see Chapter 5.) However, HIV incidence rates are typically so low that the DMC may have trouble obtaining sufficiently precise estimates to determine whether the incidence rate used to determine the sample size and study duration is accurate. And as Chapter 2 noted, an overly optimistic estimate of HIV incidence in the control group could mean that a trial is underpowered, and thus that it is unable to achieve its goals. For example, if a study has assumed that the annual HIV incidence rate in the control arm will be 4 percent, the width of the 95 percent confidence interval (CI) for the rate estimated from trial results is about 0.8/sqrt(n × f), where n denotes the number of subjects on which the estimate is based, and f denotes their average follow-up time.
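This approximation can be checked numerically. The sketch below assumes, as the chapter’s arithmetic implies, that the incidence estimate comes from the control arm alone (250 subjects); the 0.8 constant is simply 2 × 1.96 × sqrt(0.04) for the assumed 4 percent rate.

```python
import math

def ci_width(rate, n, f):
    # Approximate width of the 95 percent CI for an incidence rate
    # estimated from n subjects with average follow-up f years:
    # width = 2 * 1.96 * sqrt(rate / (n * f))
    return 2 * 1.96 * math.sqrt(rate / (n * f))

# With an assumed annual incidence of 4 percent, the numerator constant
# is 2 * 1.96 * sqrt(0.04), or roughly 0.8 -- hence width ~ 0.8/sqrt(n * f)
print(round(2 * 1.96 * math.sqrt(0.04), 2))   # -> 0.78

# Control-arm estimate at an interim look: 250 subjects, 1 year average
# follow-up, giving a CI about 5 percentage points wide
print(round(ci_width(0.04, 250, 1.0), 3))     # -> 0.05

# An observed rate of 3 percent is then consistent with the assumed
# 4 percent, yet also with rates low enough to leave the trial underpowered
w = ci_width(0.04, 250, 1.0)
print(round(0.03 - w / 2, 3), round(0.03 + w / 2, 3))
```

The interval around an observed 3 percent spans roughly 0.5 to 5.5 percent, which is why such interim data cannot distinguish the design assumption from a rate that would undermine the trial’s power.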
Thus, if investigators conduct an interim analysis after enrolling 500 subjects (250 per arm), with an average follow-up time of 1 year per subject, the width of the 95 percent CI for the control-arm incidence rate is about 5 percentage points. That is, an observed HIV incidence rate of 3 percent would still be consistent with the assumed rate of 4 percent used to power the study, yet it would also be consistent with a rate that would indicate insufficient power. This underscores the need to adequately justify a study’s assumed
HIV incidence rate, and to be conservative in using it to determine the sample size and duration of follow-up of enrolled subjects.

Many randomized trials either do not provide guidelines and criteria that the DMC will use to recommend modifying the sample size or duration of follow-up, or do so only in vague terms. Such modifications do not compromise the statistical validity of a trial, provided they are not based on comparative analyses of the interim data. For example, a DMC could recommend increasing a trial’s sample size based on the HIV incidence rate in a placebo group or across study arms. Such recommendations should be based on specific criteria set forth in the protocol, such as the pooled HIV incidence rate versus the incidence rate in the control group. However, a recommendation to continue accruing subjects because of “interesting trends” in HIV infections across study arms could be problematic, as this will tend to inflate the false positive rate (type I error) in standard analyses of the results.

Recommendation 9-3: Investigators should clearly describe in the study protocol the basis and criteria for any recommendation by the data monitoring committee to modify a trial’s size or duration. If such changes are implemented, the protocol should also specify how investigators should evaluate the trial results.

Monitoring for Safety, Efficacy, and Futility

To determine whether to stop or modify a trial based on its interim results, the DMC monitors emerging data on the safety and efficacy of a study’s interventions. In HIV trials, this information includes safety data, differences in HIV infection rates between study arms, and differences in other measures of efficacy between study arms. Trials usually include more structured rules for modifying or stopping them in two instances: when they demonstrate the efficacy or noninferiority of a new intervention, and when they demonstrate its futility.
The criteria for terminating or modifying a trial may also include unexpected side effects.

Safety

The side effects of interventions could be minor (such as rash or soreness) or more serious (greater susceptibility to other infections). Side effects of products used in HIV prevention studies could also
include behavioral changes. For example, participants could be more likely to engage in risky sexual behavior—this is known as disinhibition, or risk compensation (Cassell et al., 2006)—if they believe the product being tested provides partial or complete protection against HIV infection.

Demonstrating Benefit

Conducting multiple statistical tests comparing a new and control intervention increases the rate of a false positive result (type I error) (Pocock, 1974). Because a DMC usually reviews a study’s interim results on several occasions, statistical analyses need to account for this inflated risk (Turnbull, 2006). That is, the criteria for achieving statistical significance at each interim analysis must be chosen to cap the overall chance of a false positive at some predefined level—typically 5 percent (or 0.05).

Multiple ways of “spending” this type I error among interim analyses are available. Pocock suggested using the same criteria for each analysis, selected to give the desired overall type I error (see, for example, O’Brien and Fleming, 1979). However, most trials employ a more conservative rule, such as the O’Brien-Fleming spending function (O’Brien and Fleming, 1979), which requires early analyses to meet more stringent criteria (that is, smaller P values) for statistical significance, and allows the final analysis to use a less stringent criterion. For example, for a trial with three interim analyses and one final analysis, investigators could achieve an overall type I error rate of 5 percent by using a Pocock spending function requiring a P value of 0.016 or less at each analysis. Or investigators could achieve that error rate by using an O’Brien-Fleming approach requiring P values of 0.000005, 0.013, and 0.0228 at the first, second, and third interim analyses, and 0.0417 at the final analysis.
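Both the inflation from repeated testing and the effect of a constant (Pocock-style) nominal threshold can be illustrated with a small simulation. This is a sketch only: the `overall_type1` helper, the simulation size, and the use of four equally spaced looks are our illustrative assumptions, not part of any trial’s monitoring plan.

```python
import random
from statistics import NormalDist

def overall_type1(nominal_ps, n_sims=100_000, seed=1):
    """Monte Carlo estimate of the overall two-sided type I error when a
    true null hypothesis is tested at len(nominal_ps) equally spaced
    analyses, rejecting at analysis k if the nominal two-sided P value
    falls below nominal_ps[k]."""
    rng = random.Random(seed)
    nd = NormalDist()
    # Convert each nominal two-sided P value to a Z-statistic boundary
    bounds = [nd.inv_cdf(1 - p / 2) for p in nominal_ps]
    rejections = 0
    for _ in range(n_sims):
        s = 0.0
        for i, b in enumerate(bounds):
            s += rng.gauss(0, 1)          # new (null) data accrued this stage
            z = s / ((i + 1) ** 0.5)      # cumulative Z statistic
            if abs(z) > b:
                rejections += 1
                break
    return rejections / n_sims

# Naively testing at 0.05 on every look inflates the overall error well
# above 5 percent; a calibrated constant threshold keeps it near the target
print(overall_type1([0.05] * 4))    # noticeably above 0.05
print(overall_type1([0.016] * 4))   # close to the 5 percent target
```

The cumulative-sum construction reproduces the correlation between successive interim Z statistics, which is what makes naive repeated testing inflate the error rate.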
However, although both approaches would yield an overall type I error rate of 5 percent, the O’Brien-Fleming boundaries are less likely than the Pocock boundaries to end a trial early. If no interim analysis prompts termination, the O’Brien-Fleming approach also leaves a less stringent criterion (a larger allowable P value) for demonstrating that the treatment has made a significant difference at the final analysis. (For an example of early stopping of an HIV treatment trial for efficacy, see Hammer et al., 1997.)

In HIV prevention studies, where subjects’ adherence and behavior are important determinants of an intervention’s effect, investigators must also consider whether the intervention sustains that effect. For example, a microbicide that reduces the risk of HIV infection for 6 months—but not thereafter, because users do not adhere to the regimen—is unlikely to have an important impact in controlling the HIV epidemic. Thus terminating an effectiveness trial of such a microbicide based on a short-term effect at an interim analysis (say, after 6 months) may be unwise. However, an efficacy
trial designed specifically to assess whether the intervention has some protective ability might well use a 6-month effect on HIV infection as a primary endpoint (see Chapter 2).

One way to attempt to incorporate this consideration into the design of a monitoring plan is to use a conservative spending function, as noted. However, a better approach is to define the endpoint used in an interim analysis to reflect a sustained effect, such as the difference between intervention and control arms in cumulative HIV incidence at 2 years. One recently completed trial of the efficacy of male circumcision in preventing HIV infection used such a criterion (Bailey et al., 2007).

Recommendation 9-4: For effectiveness trials, guidelines for stopping HIV prevention trials based on positive interim results should require evidence of a sustained impact on cumulative HIV incidence.

Demonstrating Futility

Interim analyses may suggest stopping a trial because of futility—that is, because the trial is highly unlikely to show that a new intervention is superior, given current evidence and the added information that would become available if the trial continued. In an HIV prevention trial, “futility” need not refer only to evidence of a complete lack of benefit in preventing HIV infection, but also to evidence that the protective efficacy is less than some minimal amount (such as 40 percent), or that the intervention does not produce a sustained drop in HIV infection rates. Or, an effectiveness trial might include a stopping rule for futility if the interim data rule out a short-term effect on HIV infection (say, 6 months after randomization). A trial of an intervention that reaches such a futility criterion would typically prompt the study’s investigators to pursue no further testing. For example, Hall et al.
evaluated the value of intravenous and intrathecal cytarabine for prolonging the survival of HIV-infected people with progressive multifocal leukoencephalopathy (1998). At the time of the second interim analysis, when 57 of the scheduled 90 subjects had enrolled, 14 deaths had occurred in each of the cytarabine arms, as well as in the placebo arm, and cytarabine was associated with significant side effects. The chances that the study would show a significant survival benefit with cytarabine if the trial were completed were exceedingly small, given those results and the fact that only 33 more subjects would be enrolled. Thus the DMC recommended ending the trial for futility.

More recently, in October 2007, the DMC for an HIV vaccine trial recommended terminating the trial based on an interim analysis, concluding that the vaccine could neither prevent HIV infection nor reduce the amount of virus in people who became infected (National Institute of Allergy and
Infectious Diseases, 2007). Soon afterward, the DMC for a companion vaccine trial recommended terminating that trial also. The sponsors of the STEP trial have since announced that participants would be notified whether they received placebo or vaccine (see http://www.hvtn.org/media/pr/STEPStudyOC.pdf).

Terminating a trial for futility can prevent the inefficient use of resources. However, there is also an ethical basis for stopping a trial when the likelihood is low that it will achieve a definitive result. In the cytarabine example, early termination avoided exposing study participants to a therapy that appeared unlikely to help them but that did have serious side effects. In the HIV vaccine examples, the possibility that a vaccine might increase the risk of HIV infection could not be excluded based on the interim data, providing an ethical basis for terminating the trial. However, even if a new intervention does not have side effects and does not seem to increase risk, an ethical case could be made that participants would incur an opportunity cost by remaining in a trial if by doing so they could not seek other options. This underscores the need for a detailed informed-consent process that alerts people to both the risks and benefits of participating in a trial.

Rules that encourage investigators to terminate a study based on a low likelihood that the intervention will show adequate efficacy can play an important role in HIV prevention trials. Designing a phase 3 effectiveness trial with carefully constructed futility criteria could mimic a strategy of following a phase 2B trial with a phase 3 trial only if the 2B results were encouraging. Such a strategy would avoid the ethical problems of pursuing a phase 3 trial after finding promising results in a phase 2B trial.
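The “low likelihood” that futility rules formalize is often quantified as conditional power: the chance of a significant final result given the interim data. The sketch below is our own illustration using the standard Brownian-motion approximation for group-sequential trials, not the method of any specific trial; the drift values and information fraction are hypothetical.

```python
from statistics import NormalDist

_nd = NormalDist()

def conditional_power(z_interim, info_frac, drift, z_final=1.96):
    """Probability that the final Z statistic exceeds z_final, given the
    interim Z statistic at information fraction info_frac, assuming the
    true drift (the expected Z at full information) equals drift."""
    b = z_interim * info_frac ** 0.5      # Brownian-motion value so far
    mean = b + drift * (1 - info_frac)    # expected value at full information
    sd = (1 - info_frac) ** 0.5           # sd of the remaining increment
    return 1 - _nd.cdf((z_final - mean) / sd)

# At 60 percent information with no observed effect (z_interim = 0), even
# an optimistic assumed drift leaves little chance of ultimate success,
# supporting a recommendation to stop for futility
print(round(conditional_power(0.0, 0.6, drift=2.5), 2))
```

A DMC would typically examine conditional power under several assumed drifts, from the originally hypothesized effect down to the observed trend, before recommending termination.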
In some instances, there may be advantages to continuing a trial even when the interim data suggest that an intervention is unlikely to be superior to the control regime. For example, if a trial compares a group that receives a common but unproven intervention with an untreated control group, interim evidence that the intervention is unlikely to produce a better response may not be sufficient grounds for terminating the trial owing to futility, because of the value of showing that the intervention is not very effective.

Using Information from Similar Trials or Other Sources

Information that affects the equipoise between risks and benefits of a trial’s study arms sometimes becomes available from sources outside the trial. For example, after public disclosure of interim results from the Thai PHPT trial on preventing mother-to-child transmission of HIV, the DMC for a Botswana trial recommended terminating one of four study arms that was similar to the terminated arm of the Thai trial (Talawat et al., 2002; Shapiro et al., 2006).
Other examples have occurred with other diseases. For example, a recent meta-analysis (Nissen and Wolski, 2007) that raised questions about the cardiovascular side effects of rosiglitazone in diabetics led the investigators of a randomized trial of the drug’s safety to convene an unscheduled interim analysis (Home et al., 2007). Because of inconsistencies between their results and those of the meta-analysis, the investigators chose to continue their trial.

These examples illustrate that DMCs need to be aware of emerging results from similar trials and other sources. They also suggest that investigators consider including guidelines on whether and how they might use information that becomes available from related trials in interim monitoring.

Dixon and Lagakos (2000) have cautioned against having the DMCs for similar contemporaneous trials share efficacy results, as that would raise serious questions about the appropriate publication of the findings, and detract from the long-standing desire for trials to yield reproducible results. Terminating a trial based on the unplanned pooling of efficacy data from another trial undermines the prespecified study criteria. As such, this approach represents a post hoc analysis, as the methods used to undertake such pooling and interpret the results are not part of the study design. This is a very different matter from terminating an arm of a trial, or ending a trial altogether, based on external data, as occurred when investigators terminated the African Phambili trial (HVTN 503) of a Merck HIV vaccine based on the findings of the international STEP trial (HVTN 502). Different studies may also use different criteria for assessing or defining endpoints, and for including or excluding subjects, and set different schedules for subject visits, further complicating the interpretation of interim information based on post hoc pooling.
Finally, if two trials were stopped based on post hoc pooling of efficacy data, many researchers would insist that the results be published as a single paper, because any conclusion that the intervention had a positive effect would stem from the combined data. This would introduce other complications.

However, because DMCs use less formal criteria to assess the safety of a new product than they use to document efficacy, sharing safety information from concurrent trials is acceptable and can be informative, especially for less frequent adverse events. One recent proposal—perhaps motivated by recent safety problems with microbicides (N-9 and cellulose sulfate), but applicable to any biomedical HIV prevention trial—is to create a “super DMC” that would monitor several microbicide trials with one or more intervention arms in common (Nunn, 2007). The basic idea is that DMCs would agree to share a core set of safety data, and that participating investigators would be notified of any emerging safety problems. Such an idea is intriguing. However, implementing it would require careful planning to avoid arbitrary decisions on when to notify individual DMCs of overall results, and on which results each participating DMC should see. For example, if two three-arm trials had one common experimental arm and a common control arm, would the DMCs share safety information about all three arms? Also important are details about the procedures for capturing safety data, as trials may differ in the method, frequency, and completeness with which they collect safety information. There may also be ethical or regulatory considerations, including whether the informed-consent process must be changed in such circumstances, as well as scientific issues, such as whether and when participating trials should release a single publication on the main trial results, as opposed to separate publications for each trial.

To the committee’s knowledge, very little has been written about how best to share safety information among DMCs, yet the committee sees value in doing so in an appropriate manner. Nor has anyone discussed whether DMCs should share safety information routinely, or only if a possible concern is raised.

Recommendation 9-5: Investigators, donors, and regulatory agencies should encourage research on how to combine safety information from concurrent trials of similar products, including the scientific advantages and disadvantages of sharing information, the timing and logistics of doing so, ethical concerns (such as how such information might affect the informed-consent process), and how to report the results from such trials.

ANALYZING TRIAL RESULTS

Analyzing the results of HIV prevention trials is particularly challenging, for several reasons:

- HIV infection is a “silent” event—that is, it is not directly observable—and the tests used to diagnose infection are imperfect.
- When pregnancies occur during a trial, women are often taken off the study product.
- Participants in such trials may not adhere to the study interventions.
- Investigators need to account for the impact of HIV exposure on trial outcomes—which is determined by both HIV prevalence and the behavior of participants—while also addressing the challenges of obtaining accurate information on behavior.
- Investigators need to assess the relationships among interventions, adherence, exposure, and the risk of HIV infection.
The Silence of HIV Infection and the Imperfections of Diagnostic Tests

In contrast to “time-to-event” endpoints such as mortality and progression of HIV, as measured by a biomarker such as viral load, determining HIV infection requires a diagnostic test such as EIA, RNA-PCR, or a more recently developed rapid test. Repeated testing of subjects in an HIV prevention trial leads to “interval-censored” observations of the time to HIV infection, rather than an exact date. That is, periodic testing brackets an individual’s time of infection between the last negative and first positive diagnostic test.

The situation is further complicated by the fact that the diagnostic tests used to detect HIV infection are not perfect. For example, RNA-PCR can have a low sensitivity when used within 2 weeks of HIV infection (Balasubramanian and Lagakos, 2003), leading to false negatives. Similarly, EIA does not usually detect HIV infection in persons who have not yet developed HIV antibodies (that is, those who have not yet seroconverted). These features imply that some participants who enroll in HIV prevention trials may already be infected, and that some participants who become infected during a trial may not be diagnosed. Investigators need to take those possibilities into account when analyzing trial results.

Excluding Subjects Who Were HIV Infected When Enrolled from Analysis

HIV prevention trials have used different approaches in analyzing results from participants who are later suspected of having been HIV infected at the time of enrollment. One approach has been to use postenrollment diagnostic tests to avoid counting subjects believed to have been infected at the time of randomization.
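The interval censoring described earlier can be sketched in a few lines. This is a minimal illustration with a hypothetical quarterly visit schedule; it ignores the test imperfections discussed above.

```python
def infection_interval(visits):
    """Bracket an unobserved infection time between the last HIV-negative
    test and the first HIV-positive test.  `visits` is a chronological
    list of (weeks_since_enrollment, result) pairs, result "neg" or "pos".
    Returns (left, right); right is None if no positive test was observed."""
    last_neg, first_pos = 0, None
    for week, result in visits:
        if result == "neg":
            last_neg = week
        else:
            first_pos = week
            break
    return (last_neg, first_pos)

# Quarterly testing: the true infection time is known only to lie
# somewhere in the 13-week window between the visits at weeks 26 and 39
print(infection_interval([(13, "neg"), (26, "neg"), (39, "pos"), (52, "pos")]))
# -> (26, 39)
```

Analyses that treat the first positive test date as an exact infection time discard this uncertainty, which is why interval-censored methods are preferred when visit intervals are long.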
If investigators could identify and exclude all subjects already infected when they are randomized, and no others, estimates of the relative efficacy of an intervention, and tests of the null hypothesis, would improve in several ways (Balasubramanian and Lagakos, 2004):

- Estimates of product efficacy or effectiveness would be less biased.
- The type I error of tests comparing study arms with respect to HIV infection rates would remain valid.
- The power of the trial to detect a real difference in efficacy between arms would increase.

However, despite these potential advantages of excluding subjects who are already infected at the time of randomization, the impact of doing so is often minimal because the number of such exclusions is small. On the
other hand, postrandomization exclusions can introduce biases and distort comparisons of the intervention and control arms if they do not exclude all subjects infected at the time of randomization, or if they incorrectly exclude some subjects who were uninfected at enrollment, differentially among the intervention arms. This could occur, for example, if the post hoc testing of baseline samples is triggered by a positive HIV test occurring shortly after randomization (say, at 3 months), since this could be influenced by a differential intervention effect. Biases can also occur if the criteria for determining which participants to exclude are not identical in the intervention and control arms, or if the patterns of participants' clinic visits are not identical in each arm. Thus, investigators must carefully weigh the potential gains from excluding individuals who may have been infected at enrollment against the possibility that doing so will introduce bias into the comparison of study arms. Even if investigators could theoretically justify the post hoc exclusion of subjects, critics might question the face validity of results from trials that exclude more subjects from the intervention arm than from the control arm.

Recommendation 9-6: Investigators should base their primary analysis of the efficacy of an intervention on all randomized subjects. Secondary sensitivity analyses that exclude subjects believed to have been HIV infected when they were randomized can be useful. However, investigators should not substitute such analyses for the primary analysis, unless such exclusions (and nonexclusions) can confidently be made without error.

Recommendation 9-7: Investigators of trials evaluating an intervention that is believed to have a delayed impact may find it efficient to exclude people found to be HIV infected after randomization but before a given follow-up time.
If so, the trial protocol should specify and justify such an approach, and investigators should use it only if follow-up of subjects and assessment and confirmation of HIV infection during this period are identical in all study arms. Investigators should undertake secondary analyses based on all randomized subjects.

Confirming HIV Infections

Subjects who test HIV positive typically undergo confirmatory tests. Some of the initial results turn out to be true positives, while some are false positives. In theory, the imperfect sensitivity of the confirmatory test means that some of the true positives might not be confirmed; that is, these subjects would be considered negative when they are actually positive, thus increasing the number of false negatives.
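The trade-off that confirmation entails can be quantified under simple assumptions. The sketch below combines a screening test with an independent confirmatory test, requiring both to be positive before an endpoint is counted; the sensitivities, specificities, and incidence used are hypothetical illustrative values, not figures from this chapter:

```python
def serial_testing(sens1, spec1, sens2, spec2, prevalence):
    """Operating characteristics when a positive screening test must be
    confirmed by a second, independent test (both must be positive).

    Serial confirmation lowers combined sensitivity (more false negatives)
    but raises combined specificity (fewer false positives).
    """
    sens = sens1 * sens2                   # both tests must detect infection
    spec = 1 - (1 - spec1) * (1 - spec2)   # both must err to yield a false positive
    p_pos = sens * prevalence + (1 - spec) * (1 - prevalence)
    ppv = sens * prevalence / p_pos        # P(infected | confirmed positive)
    return sens, spec, ppv

# Hypothetical tests: screening 99% sens / 99.5% spec, confirmation 98% / 99.9%,
# applied where 2% of subjects become infected during the testing interval.
sens, spec, ppv = serial_testing(0.99, 0.995, 0.98, 0.999, 0.02)
print(f"combined sensitivity {sens:.4f}, specificity {spec:.6f}, PPV {ppv:.4f}")
```

With these illustrative inputs, combined sensitivity falls below that of either test alone, while combined specificity and the positive predictive value rise, which is the sense in which confirmation makes observed endpoints more likely to be "real."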
Given that tests to detect HIV infection are imperfect, a trial protocol should set clear criteria for confirming that subjects are indeed infected. Although such confirmation could increase the number of false negatives, it would decrease the number of false positives and lead to more confidence that the observed endpoints are "real." It is critical that investigators develop criteria for assessing endpoints that are applied equally in the intervention and control arms.

Analyzing Time to Infection

Standard methods for analyzing the amount of time that elapses between randomization and HIV infection assume that investigators know the exact time of infection. These methods include the log-rank test, the Kaplan-Meier estimator, and Cox's model. To account for the interval-censored nature of information on HIV infection in prevention trials, and the imperfection of the tests, investigators could use modified versions of these methods (Richardson and Hughes, 2000; Balasubramanian and Lagakos, 2004; Gupte et al., 2007; Zhang and Lagakos, in press).

The standard methods still provide valid tests of the efficacy of an intervention if subjects are evaluated based on the same schedule of clinic visits in each study arm, and if the sensitivity and specificity of the diagnostic test do not depend on the study arm. In that case, similar periodic results would be expected to occur in the study arms under the null hypothesis of no intervention effect. Under these circumstances the following occurs:

• Standard Kaplan-Meier estimates of cumulative HIV infection rates are valid at the scheduled visit times. However, these cumulative rates cannot be estimated for times between visits, so the curves should not be displayed in the usual way as step functions.

• In practice, participants are often not evaluated according to the exact visit schedule. In such cases, the Kaplan-Meier estimator, log-rank test, and Cox model should use the scheduled visit time rather than the actual time. That is because these methods depend on the magnitudes of the observed times only through their relative ranks (that is, they are "rank invariant"), so small, clinically unimportant differences in the timing of visits can alter the ranking of events and thus have a big impact on the results.

• Investigators should include information from unscheduled visits in the analyses only if they can safely assume that such visits do not depend on subjects' infection status. Otherwise, investigators should base their analyses only on results from scheduled visits.

In some studies, such as in newborns and infants, a nonnegligible proportion of subjects may die before being detected as HIV infected. In such
settings, investigators should use methods for analyzing competing risks, or use HIV-free survival rather than HIV infection as the endpoint (see, for example, Richardson and Hughes, 2000).

Effects of Product Discontinuation and Loss to Follow-Up

Participants may stop using an intervention during a trial for several reasons, most commonly adverse treatment effects, an inability to continue the treatment or a lack of interest in doing so, or, in some HIV prevention studies, pregnancy. In some trials, investigators stop tracking participants who discontinue their randomized intervention prematurely. It is well known that this can lead to distorted statistical inferences about intervention effects (Lagakos et al., 1990) because subjects who discontinue their intervention, including placebo, can have different risks of becoming infected than those who do not. For example, Hughes et al. (1994) noted that HIV patients with more rapidly declining CD4+ cell counts were more likely to discontinue treatment than other patients. Another example is the Coronary Drug Project (Canner et al., 1986). In that trial, death rates in patients randomized to receive clofibrate were 18 percent for compliers compared with 25 percent for noncompliers, suggesting that the drug might be beneficial. However, the corresponding death rates for the placebo group were 15 percent and 28 percent, indicating that something about being noncompliant was associated with a poorer outcome (Snapinn et al., 2004). For both examples, if the rates of discontinuation differed among the intervention groups and analyses were based only on outcome events that occurred prior to discontinuation (sometimes referred to as "as-treated" analyses), the comparisons of outcome events among the interventions would be biased.
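This bias can be illustrated with a small deterministic example. All of the risk and discontinuation probabilities below are hypothetical: there is no true intervention effect, but high-risk subjects discontinue the intervention more often than the control (say, because product side effects are harder to tolerate), so an "as-treated" comparison of compliers falsely favors the intervention while the intention-to-treat comparison correctly shows no difference:

```python
# Hypothetical cohort: 30% high-risk (infection probability 0.30) and
# 70% low-risk (0.05) subjects in each arm; the intervention has NO effect.
RISK = {"high": 0.30, "low": 0.05}
MIX = {"high": 0.30, "low": 0.70}

# Differential discontinuation: high-risk subjects drop the intervention
# more often than the placebo.
DISCONTINUE = {
    "intervention": {"high": 0.60, "low": 0.10},
    "control":      {"high": 0.20, "low": 0.10},
}

def itt_rate():
    """Intention-to-treat: infection rate among ALL randomized subjects.
    Under the null it is the same in both arms, whatever the discontinuation."""
    return sum(MIX[g] * RISK[g] for g in MIX)

def as_treated_rate(arm):
    """'As-treated': infection rate among subjects still using the product."""
    stay = {g: MIX[g] * (1 - DISCONTINUE[arm][g]) for g in MIX}
    return sum(stay[g] * RISK[g] for g in stay) / sum(stay.values())

print("ITT rate in either arm:", round(itt_rate(), 3))
for arm in ("intervention", "control"):
    print(arm, "as-treated rate:", round(as_treated_rate(arm), 3))
```

With these numbers the intention-to-treat rate is 0.125 in both arms, while the as-treated rates (0.090 for the intervention versus about 0.119 for the control) falsely suggest a benefit, mirroring the Coronary Drug Project lesson.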
Thus, the accepted practice is to continue to follow participants for the study endpoint regardless of whether they prematurely discontinue their randomized intervention, and to use all outcome events in the analysis of the data, not just those that precede discontinuation of the intervention; such analyses are called intention-to-treat analyses. As noted in Chapter 2, the power of intention-to-treat analyses will, in general, be reduced by product discontinuation. Chapter 5 discusses several ways in which the effect of persistence and, more generally, adherence and behavior on HIV incidence can be meaningfully analyzed.

Handling Pregnancies During Follow-Up

If a pregnancy that occurs during a trial does not trigger a modification of the intervention, then analyses of time to HIV infection will not change. However, if a product is temporarily or permanently discontinued when a woman is found to be pregnant, the more general discussion of
product discontinuation described in the previous paragraph applies. Thus, it is important that investigators continue to follow pregnant women for HIV infection after they discontinue a product owing to pregnancy and, when analyzing results, use intention-to-treat analyses that utilize outcome events for the duration of follow-up, not just those occurring before the pregnancy.

An alternative method of analysis is to "censor" a woman's time of infection when she is found to be pregnant and discontinues the product. That is, investigators would regard this woman's time of infection as being "at least x," where x is the time from randomization until she is found to be pregnant and taken off the product. This convention is sometimes referred to as an "as-treated" analysis. As with other forms of discontinuation, such analyses could lead to biased estimates of the cumulative risk of HIV infection if pregnancy represented a type of "informative censoring"; that is, if the risk of (subsequent) HIV infection in a pregnant woman is different from that of a nonpregnant woman with equal follow-up. Although the evidence for a differential risk of HIV infection during pregnancy is limited and thus somewhat controversial, there have been reports of increased HIV risk in pregnant women (Taha et al., 1998; Gray et al., 2005; Morrison et al., 2007).

The impact of pregnancy on "as-treated" statistical tests comparing intervention groups is somewhat different. Here, if the rates of pregnancy do not differ among the intervention arms, and if the risk of HIV infection for a pregnant woman does not depend on the product she had been taking, as-treated tests that censor a woman at the time of pregnancy will lead to valid comparisons.
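The bias that informative censoring can introduce into estimates of cumulative risk is easy to see numerically. The figures below are hypothetical, not drawn from the studies cited: if women who become pregnant face a higher subsequent infection risk, censoring them at pregnancy removes exactly the high-risk person-time and understates the cumulative infection rate.

```python
# Hypothetical single follow-up interval after some women become pregnant
# and stop the product; all probabilities are illustrative only.
P_PREGNANT = 0.20          # proportion who become pregnant during follow-up
RISK_PREGNANT = 0.10       # subsequent infection risk if pregnant (higher)
RISK_NOT_PREGNANT = 0.04   # subsequent infection risk otherwise

# True cumulative risk with intention-to-treat follow-up of everyone:
true_risk = P_PREGNANT * RISK_PREGNANT + (1 - P_PREGNANT) * RISK_NOT_PREGNANT

# "As-treated" estimate that censors women at pregnancy: only never-pregnant
# person-time remains in the risk set, so the estimate equals their risk.
censored_estimate = RISK_NOT_PREGNANT

print(f"true cumulative risk {true_risk:.3f}, "
      f"censored estimate {censored_estimate:.3f}")
```

Here the censored estimate (0.040) understates the true cumulative risk (0.052); and if pregnancy rates or pregnancy-associated risk also differed between arms, the bias would differ by arm and distort the comparison itself.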
However, when planning a trial, investigators usually cannot be assured of either of these assumptions; thus, it is prudent to continue to follow women who become pregnant for the study's outcome events and to analyze the resulting data using intention-to-treat methods.

Recommendation 9-8: In all trials, investigators should continue to follow women who become pregnant for HIV infection, regardless of whether they discontinue their study intervention. In addition, intention-to-treat analyses should be the primary basis for comparing intervention groups with respect to HIV infection and other efficacy endpoints. Investigators can include as-treated analyses as secondary analyses, but should interpret them cautiously, because of the possibility that such discontinuations represent a type of informative censoring.

REFERENCES

Angell, M. 1997. The ethics of clinical research in the third world. New England Journal of Medicine 337(12):847-849.
Bailey, R. C., S. Moses, C. B. Parker, K. Agot, I. Maclean, J. N. Krieger, C. F. Williams, R. T. Campbell, and J. O. Ndinya-Achola. 2007. Male circumcision for HIV prevention in young men in Kisumu, Kenya: A randomised controlled trial. Lancet 369(9562):643-656.

Balasubramanian, R., and S. W. Lagakos. 2003. Estimation of a failure time distribution based on imperfect diagnostic tests. Biometrika 90(1):171-182.

Balasubramanian, R., and S. W. Lagakos. 2004. Analyzing time-to-event data in a clinical trial when an unknown proportion of subjects has experienced the event at entry. Biometrics 60(2):335-343.

Canner, P. L., K. G. Berge, N. K. Wenger, J. Stamler, L. Friedman, R. J. Prineas, and W. Friedewald. 1986. Fifteen-year mortality in Coronary Drug Project patients: Long-term benefit with niacin. Journal of the American College of Cardiology 8(6):1245-1255.

Cassell, M. M., D. T. Halperin, J. D. Shelton, and D. Stanton. 2006. Risk compensation: The Achilles heel of innovations in HIV prevention? BMJ 332(7541):605-607.

Dixon, D. O., and S. W. Lagakos. 2000. Should data and safety monitoring boards share confidential interim data? Controlled Clinical Trials 21(1):1-6; discussion 54-55.

Ellenberg, S., T. R. Fleming, and D. L. DeMets. 2002. Data monitoring committees in clinical trials: A practical perspective. Edited by S. Senn and V. Barnett. West Sussex, UK: John Wiley & Sons Ltd.

Fleming, T. R., S. Ellenberg, and D. L. DeMets. 2002. Monitoring clinical trials: Issues and controversies regarding confidentiality. Statistics in Medicine 21(19):2843-2851.

Gray, R. H., X. Li, G. Kigozi, D. Serwadda, H. Brahmbhatt, F. Wabwire-Mangen, F. Nalugoda, M. Kiddugavu, N. Sewankambo, T. C. Quinn, S. J. Reynolds, and M. J. Wawer. 2005. Increased risk of incident HIV during pregnancy in Rakai, Uganda: A prospective study. Lancet 366(9492):1182-1188.

Gupte, N., R. Brookmeyer, R. Bollinger, and G. Gray. 2007. Modeling maternal-infant HIV transmission in the presence of breast-feeding with an imperfect test. Biometrics 63(4):1189-1197.

Hall, C. D., U. Dafni, D. Simpson, D. Clifford, P. E. Wetherill, B. Cohen, J. McArthur, H. Hollander, C. Yiannoutsos, E. Major, L. Millar, and J. Timpone. 1998. Failure of cytarabine in progressive multifocal leukoencephalopathy associated with human immunodeficiency virus infection. AIDS Clinical Trials Group 243 Team. New England Journal of Medicine 338(19):1345-1351.

Hammer, S. M., K. E. Squires, M. D. Hughes, J. M. Grimes, L. M. Demeter, J. S. Currier, J. J. Eron, Jr., J. E. Feinberg, H. H. Balfour, Jr., L. R. Deyton, J. A. Chodakewitz, and M. A. Fischl. 1997. A controlled trial of two nucleoside analogues plus indinavir in persons with human immunodeficiency virus infection and CD4 cell counts of 200 per cubic millimeter or less. AIDS Clinical Trials Group 320 Study Team. New England Journal of Medicine 337(11):725-733.

Home, P. D., S. J. Pocock, H. Beck-Nielsen, R. Gomis, M. Hanefeld, N. P. Jones, M. Komajda, and J. J. McMurray. 2007. Rosiglitazone evaluated for cardiovascular outcomes: An interim analysis. New England Journal of Medicine 357(1):28-38.

Hughes, M. D., D. S. Stein, H. M. Gundacker, F. T. Valentine, J. P. Phair, and P. A. Volberding. 1994. Within-subject variation in CD4 lymphocyte count in asymptomatic human immunodeficiency virus infection: Implications for patient monitoring. Journal of Infectious Diseases 169(1):28-36.

Lagakos, S. W., L. L. Lim, and J. M. Robins. 1990. Adjusting for early treatment termination in comparative clinical trials. Statistics in Medicine 9(12):1417-1424.

Lallemant, M., G. Jourdain, S. Le Coeur, S. Kim, S. Koetsawang, A. M. Comeau, W. Phoolcharoen, M. Essex, K. McIntosh, and V. Vithayasai. 2000. A trial of shortened zidovudine regimens to prevent mother-to-child transmission of human immunodeficiency virus type 1. Perinatal HIV Prevention Trial (Thailand) Investigators. New England Journal of Medicine 343(14):982-991.

Morrison, C. S., J. Wang, B. Van Der Pol, N. Padian, R. A. Salata, and B. A. Richardson. 2007. Pregnancy and the risk of HIV-1 acquisition among women in Uganda and Zimbabwe. AIDS 21(8):1027-1034.

NIAID (National Institute of Allergy and Infectious Diseases). 2007. Statement: Immunizations are discontinued in two HIV vaccine trials. Bethesda, MD: NIAID. http://www3.niaid.nih.gov/news/newsreleases/2007/step_statement.htm (accessed November 2007).

Nissen, S. E., and K. Wolski. 2007. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. New England Journal of Medicine 356(24):2457-2471.

Nunn, A. 2007. Issues in microbicide trial design, monitoring, and analysis. Paper read at the second public meeting for the Committee on Methodological Challenges in HIV Prevention Trials, April 19, London, UK.

O'Brien, P. C., and T. R. Fleming. 1979. A multiple testing procedure for clinical trials. Biometrics 35(3):549-556.

Pocock, S. J. 1983. Clinical trials: A practical approach. Chichester, UK: John Wiley & Sons, Inc.

Richardson, B. A., and J. P. Hughes. 2000. Product limit estimation for infectious disease data when the diagnostic test for the outcome is measured with uncertainty. Biostatistics 1(3):341-354.

Shaffer, N., R. Chuachoowong, P. A. Mock, C. Bhadrakom, W. Siriwasin, N. L. Young, T. Chotpitayasunondh, S. Chearskul, A. Roongpisuthipong, P. Chinayon, J. Karon, T. D. Mastro, and R. J. Simonds. 1999. Short-course zidovudine for perinatal HIV-1 transmission in Bangkok, Thailand: A randomised controlled trial. Bangkok Collaborative Perinatal HIV Transmission Study Group. Lancet 353(9155):773-780.

Shapiro, R. L., I. Thior, S. Gilbert, C. Lockman, C. Wester, L. Smeaton, S. J. Stevens, K. Heymann, K. McIntosh, S. Ndung'u, V. Gaseitsiwe, T. Novitsky, S. Peter, E. Kim, C. Widenfelt, P. Moffat, P. Ndase, P. Arimi, P. Kebaabetswe, P. Mazonde, R. Lee, J. Marlink, J. Makhema, S. Lagakos, and M. Essex. 2006. A randomized comparison of strategies for adding single-dose nevirapine to zidovudine to prevent mother-to-child HIV transmission in Botswana. AIDS 20:1281-1288.

Snapinn, S. M., Q. Jiang, and B. Iglewicz. 2004. Informative noncompliance in endpoint trials. Current Controlled Trials in Cardiovascular Medicine 5(1):5.

Taha, T. E., G. A. Dallabetta, D. R. Hoover, J. D. Chiphangwi, L. A. Mtimavalye, G. N. Liomba, N. I. Kumwenda, and P. G. Miotti. 1998. Trends of HIV-1 and sexually transmitted diseases among pregnant and postpartum women in urban Malawi. AIDS 12(2):197-203.

Talawat, S., G. J. Dore, S. Le Coeur, and M. Lallemant. 2002. Infant feeding practices and attitudes among women with HIV infection in northern Thailand. AIDS Care 14(5):625-631.

Turnbull, B. 2006. Group sequential tests. In Encyclopedia of Statistical Science. New York: John Wiley & Sons, Inc.

Van Damme, L., G. Ramjee, M. Alary, B. Vuylsteke, V. Chandeying, H. Rees, P. Sirivongrangson, L. Mukenge-Tshibaka, V. Ettiegne-Traore, C. Uaheowitchai, S. S. Karim, B. Masse, J. Perriens, and M. Laga. 2002. Effectiveness of COL-1492, a nonoxynol-9 vaginal gel, on HIV-1 transmission in female sex workers: A randomised controlled trial. Lancet 360(9338):971-977.

Zhang, P., and S. W. Lagakos. In press. Analysis of time to a silent event whose occurrence is monitored with error, with application to mother-to-child HIV transmission. Statistics in Medicine.