Session 3: Data Collection Standards and Monitoring
In order for a new drug therapy to obtain FDA approval, clinical
trials must provide evidence of safety and efficacy. In addition, the FDA
and the NCI frequently audit sites and cooperative groups to ensure the
quality of the data collected and the safety of the experimental treatment
being given. A number of regulations and guidance documents provide
guidelines for the frequency and extensiveness of audits and for the
amount and type of data required to support claims of safety and effi-
cacy. These guidelines are continually revised by the FDA and the NCI to
reflect lessons learned and, when appropriate, to respond to the concerns
of a variety of stakeholders, including government, industry, patients,
and clinicians. However, some clinical trials investigators have suggested
that the NCI and the FDA reduce the number of audits and the amount of data required,
thereby streamlining the clinical trials process and saving both time and
money. At the same time, additional data and resources are needed for
today’s trials—such as those for targeted therapies—to identify patient
candidates for new therapies, for biomarker analysis, and more. The
first session of the second day of the conference explored which data are
essential for demonstrating safety and efficacy and what changes could
improve current auditing and data requirements.
THE NCI PERSPECTIVE
Dr. James Doroshow, director of the Division of Cancer Treatment
and Diagnosis (DCTD) of the NCI, began this session by giving the NCI
MULTI-CENTER PHASE III CLINICAL TRIALS
perspective on data collection and monitoring. He suggested that, given
the expense of complex trials and the limited budgets of government and
industry, one cannot collect all the data that might potentially be needed
for a future review. Instead, he suggested that the best course is collect-
ing only the amount of data that is absolutely necessary, but he noted
that it is debatable just how much data are required to ensure safety and
efficacy in any given situation. For example, new investigational drugs
might have novel toxicities that require the monitoring of numerous signs
and symptoms. But once a novel toxicity, such as hypertension, has been
documented among the first 1,000 patients tested with the experimental
drug, he asked, “do we need to collect blood pressures on the next 10,000
patients?”
The amount of data needed for audits is also a debatable issue, he
said. He noted that the NCI regularly audits 10 percent of the patient cases
at cooperative group sites and identifies major deficiencies in the
categories listed in Box 5. But he noted that there are no standards
for these auditing data and no comparative data against which to measure
them. “I personally happen to think that if there were a major deficiency
in adverse-event reporting of only 2.3 percent of the cases, that’s pretty
good,” Dr. Doroshow said. “But should we give the cooperative groups
kudos or tell them that this is unacceptable? We really do not know.”
Dr. Doroshow also said he has data to suggest that reviewing 10
percent of patient cases is sufficient for on-site audits and that reviewing
additional cases does not improve the information gained. “Monitoring
every patient at every clinical visit is not necessarily going to provide you
with additional information that tells you whether or not a trial or a site
is doing well or is doing badly,” he said.
BOX 5
NCI Cooperative Group Program: Patient
Case Review Categories
• Informed consent
• Eligibility
• Treatment
• Adverse events
• Disease outcome/response
• General data timeliness
SOURCE: Doroshow presentation (July 1, 2008).
Another debate in data collection and monitoring concerns which
kinds of endpoints are valid to use and how much verification is required
for those endpoints. Many clinical trials are moving away from using
survival data as endpoints and instead are using time-to-progression or
tumor response as endpoints. These endpoints are determined by radio-
logic criteria that can be subjective, because they are not easily quantified,
but one study of three different trials found a close correlation between
investigator assessments of radiologic endpoints and those of a radiologic
review (Dodd et al., 2008). “We may think it intuitively obvious that these
radiologic reviews should be done,” Dr. Doroshow commented, “but
where are the data that make it clear that adding a procedure is unequivo-
cally going to be beneficial in improving the quality of the data that leads
to an indication?” He added that, for technical and biostatistical reasons,
an additional blinded central review may introduce as many errors as it
corrects.
In a later talk, Dr. Fyfe agreed with this assessment. Because reviewers use
different methods, select different lesions, and measure inconsistently, she
said, a blinded radiologic review should not look for concordance of findings
as a way to assess the quality of the reviewed site; instead, it should verify
benefit and look for bias between the treatment arms in a study. She pointed out that there often is not a
great degree of concordance in radiological assessments of small-volume
disease, because such assessments can be highly subjective. “As we start
to use independent review facilities, there is a real danger, because more
data are not better data,” Dr. Fyfe said. “They are simply confusing,
whether it’s an industry trial or a cooperative group trial.”
Dr. Doroshow ended his presentation by saying, “It’s neither appro-
priate nor desirable for the NCI to perform clinical trials in a manner that
is identical to the model that industry has used and is continuing to use.
One hundred percent source verification is neither reasonable nor neces-
sary, nor does it necessarily get us a higher quality of data that will lead to
improving treatment for our patients.” In defense of industry, Dr. Canetta
commented, “Sometimes our approach has been that if we believe we
might need data eventually, we had better collect them. We do not relate
only to the FDA, but also to a plethora of other regulatory agencies that
are equally powerful in regulating the use of experimental agents in their
countries.”
INDUSTRY PERSPECTIVE
In the following presentation, Dr. Gwendolyn Fyfe, senior staff scien-
tist in clinical hematology and oncology at Genentech, gave the industry
perspective on data collection and monitoring. She said that much of
the data her company collects is never used. This includes data on vital
signs, concomitant medications, laboratory values, medical histories, and
low-grade toxicities; secondary information about adverse events; and
independent reviews of efficacy endpoints. Furthermore, some data are
collected excessively or in an inefficient manner.
Dr. Fyfe noted that by the time a drug enters Phase III trials, much is
known about its toxicity and the time course of its toxicity. Investigators
should use that information to narrow the collection of data, she said.
By the time Avastin (bevacizumab) reached Phase III trials for colorectal
cancer, for example, its effects on blood pressure and likely effects on
bleeding were known, and that knowledge should have focused the collection
of data on adverse effects. Dr. Fyfe suggested collecting grade 3 or 4 toxicities on a
cycle-specific basis. “I know, for the most part, when I go into Phase III,
whether something is going to happen at day 7 or 14. I do not learn any-
thing by collecting that exact date, and I put a burden on the sites if I ask
for a precise date,” she said.
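The cycle-specific collection of grade 3 or 4 toxicities that Dr. Fyfe describes might be sketched as follows. This is only an illustration: the record layout, the patient identifiers, the event terms, and the function name `worst_grade_by_cycle` are hypothetical and are not taken from any trial or from the presentation.

```python
from collections import defaultdict

# Hypothetical adverse-event records: (patient_id, cycle, event_term, grade).
# All values are made up for illustration.
events = [
    ("P001", 1, "hypertension", 2),
    ("P001", 1, "hypertension", 3),
    ("P001", 2, "bleeding", 1),
    ("P002", 1, "hypertension", 4),
    ("P002", 2, "hypertension", 3),
]

def worst_grade_by_cycle(records, min_grade=3):
    """Keep the worst grade per (patient, cycle, event); drop events below min_grade."""
    worst = defaultdict(int)
    for patient_id, cycle, term, grade in records:
        key = (patient_id, cycle, term)
        worst[key] = max(worst[key], grade)
    # Only grade 3 or 4 events are retained, with no exact onset dates.
    return {key: grade for key, grade in worst.items() if grade >= min_grade}

summary = worst_grade_by_cycle(events)
for (patient_id, cycle, term), grade in sorted(summary.items()):
    print(f"{patient_id} cycle {cycle}: {term} grade {grade}")
```

In this sketch the site reports one number per patient, cycle, and event type, rather than exact start and stop dates for every occurrence.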
Most of the data on adverse events are “rolled up into a worst grade,
so for all the data you collect, you end up with just one number,” Dr.
Fyfe said. “I’m not sure that all the specific stop and start dates helped us
understand the safety profile of bevacizumab.” Dr. Fyfe noted that much
of the data collected on adverse events is superfluous, because it merely
improves confidence intervals without changing clinical decisions. As
Table 2 shows, increasing the number of patients analyzed for adverse
events narrows the confidence intervals, but the resulting statistical
differences are not meaningful in the clinic.
“A physician will not manage patients differently if there is a 40 percent or
a 60 percent adverse event rate,” she said. “When we think about collecting
more data, there is this inference that it’s better that we know more.
It’s simply not true, in terms of being helpful. It probably just slows things
down at an extra cost without providing much value.” Later, during discussion,
Dr. Ralph deVere White suggested that many of the data points
collected to show drug efficacy may also be superfluous since they, too,
might merely improve confidence intervals without having any relevance
to clinical decisions. Dr. Fyfe agreed that this may be so and that it should
be considered when developing minimum data standards.
TABLE 2 Does More Safety Data Provide Greater Certainty About the Safety Profile?
Expected Rate of
Adverse Event      100 Patients   200 Patients   400 Patients   800 Patients
(percentage)       Analyzed       Analyzed       Analyzed       Analyzed
 5                 4.3            3.0            2.1            1.5
10                 5.9            4.2            2.9            2.1
20                 7.8            5.5            3.9            2.8
30                 9.0            6.5            4.5            3.2
40                 9.6            6.8            4.8            3.4
50                 9.8            6.9            4.9            3.5
NOTE: Confidence intervals as a function of patient number.
SOURCE: Fyfe presentation (July 2, 2008).
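The source does not say how the Table 2 values were computed. As an assumption, they appear consistent with the half-width of a normal-approximation 95 percent confidence interval for a proportion, z * sqrt(p(1 - p)/n) with z = 1.96. The sketch below uses that assumed formula; the function name `ci_half_width_pct` is illustrative. Under this assumption nearly every cell is reproduced after rounding (the 30 percent, 200-patient cell computes to about 6.4 rather than the printed 6.5).

```python
import math

Z_95 = 1.96  # two-sided 95 percent normal quantile (assumed confidence level)

def ci_half_width_pct(rate_pct: float, n_patients: int, z: float = Z_95) -> float:
    """Half-width, in percentage points, of the normal-approximation CI for a proportion."""
    p = rate_pct / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n_patients)

if __name__ == "__main__":
    rates = [5, 10, 20, 30, 40, 50]   # expected adverse-event rates (percent)
    sizes = [100, 200, 400, 800]      # number of patients analyzed
    for rate in rates:
        row = "  ".join(f"{ci_half_width_pct(rate, n):4.1f}" for n in sizes)
        print(f"{rate:3d}%  {row}")
```

Because the half-width scales with 1/sqrt(n), quadrupling the number of patients only halves it. At an expected rate of 50 percent, for example, going from 100 to 400 patients moves the half-width from about 9.8 to about 4.9 percentage points, which illustrates Dr. Fyfe's point that added patients buy narrower intervals rather than different clinical decisions.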
Not only are Genentech and other companies collecting informa-
tion that they are not using, Dr. Fyfe said, but they are missing impor-
tant data. In particular, she expressed dissatisfaction about the lack of
placebo-controlled trials, which are needed to ensure unbiased reporting
of adverse events. For example, Genentech’s randomized Phase II trial of
Avastin did not include an arm that received a placebo, and the rate of
thrombosis—blood clots—was much higher in the Avastin arm than in the
control arm receiving standard treatment (26 or 13 percent, depending on
which Avastin arm, versus 9 percent in the control arm). However, the
rates of thrombosis in the control and treatment arms of the
placebo-controlled Phase III trial of Avastin were more similar (16 percent
in the placebo arm versus 19 percent in the Avastin arm) (Hurwitz et al.,
2004; Kabbinavar et al., 2003). “With the best possible intentions, people
under-report adverse events on the control arm,” Dr. Fyfe said.
Dr. Fyfe also noted the importance of understanding why physicians
or patients stop treatment, as that can provide information about the
tolerability of a drug or its efficacy. Often, however, these data are not
collected. “Sometimes patients stop a drug because it’s toxic, but sometimes
they stop the drug because they are progressing but have not reached
that magical ‘progressive disease’ endpoint that we collect data on,” she
said. Collecting data on subsequent treatment is also helpful, as multiple
treatment lines are often pursued, and such data can help assess optimal
treatment paradigms, she added.
Dr. Fyfe suggested collecting data on the deaths, discontinuations,
and serious adverse events (SAEs) in all patients at all sites, and she
commented that collecting data on targeted adverse events is also appropriate
in some cases. She added, however, that questions about detailed adverse-event
profiles in subpopulations are usually inadequately answered in Phase III
trials and are probably better addressed in Phase II or Phase IV trials or
through post-marketing registries.
She suggested that a set of data standards could ensure that the data
collected are adequate to reliably assess whether an unapproved agent
has a good risk-to-benefit ratio. That assessment includes whether the drug
significantly improves outcome when added to or compared with a known
standard, and how safety is affected when the drug is added to
or substituted for a known treatment standard. Ideally, data should also
be collected that might identify subsets of patients in whom the risk/benefit
ratio differs from that in other patients. Dr. Fyfe said that data standards
should be similar for all licensure trials and that minor tweaking of
current cooperative group standards may be all that is needed.
Dr. Fyfe also stressed the importance of verifying study data in licen-
sure trials to ensure accuracy and completeness, but she noted that this
need be done only in a subset of patients. She suggested that stakeholders
work together to quickly determine the most appropriate and consistent
standards for data collection and monitoring. “We simply need to stan-
dardize so that we can assess the risk and benefit of drugs in the medical
milieu of the United States, rather than offshore, as is increasingly hap-
pening,” she said. She added that there should be a funding mechanism
so that cooperative groups can meet the data standards created. She sug-
gested creating a foundation to which industry contributes when doing
Phase III trials with cooperative groups, and she added that there should
also be a surcharge so that the cooperative groups can do trials that help
define the standard of care for patients in the United States.
COOPERATIVE GROUPS PERSPECTIVE
The next speaker, Dr. Robert Comis, president and chairman of the
Coalition of Cancer Cooperative Groups and group chair of ECOG, dis-
cussed the increasing tension within cooperative groups between the need
to reduce data collection in order to save money and time and the need
to provide the data required for licensure of drugs. The pressure to save
money is real: Dr. Comis highlighted the flat budget for the Coopera-
tive Group Program over the past three or four years, which represents
a substantial decrease when adjusted for inflation. He began his talk by
describing the 1997 recommendation of the Armitage Committee, which
reviewed the NCI Cooperative Group Program (NCI, 1997). The commit-
tee recommended that in designing clinical trials, data collection should
be reduced and that investigators should collect only data pertinent to
studying endpoints and safety. At the same time, the FDA released a guid-
ance for industry stating that cooperative group data could be used for
FDA filings (FDA, 1998). This has, in part, fueled an increase in licensure
Phase III trials run by cooperative groups, Dr. Comis said.
But the NCI and the FDA differ in important ways in the data that
they require and how those data are reported, he said. They differ in how adverse
events are reported, in eligibility and dosing checks, in how data are col-
lected in the laboratory and audited and monitored, in what locks are
placed on databases, and in what endpoints are verified. The additional
data or procedures that the FDA requires for licensure add substantial
costs to a clinical trial. Those costs are not necessarily reimbursed by
industry sponsors. For example, the FDA may require:
• cardiac safety monitoring tests and procedures that are not the
standard of care;
• a central review of imaging findings;
• revisions of case report forms; and
• supplemental data management efforts, such as reconciling NCI
and FDA databases of adverse event reports.
“When we do a study that has registration implications,” Dr. Comis
said, “there are tremendous additional workloads that are imposed on
the central offices and sites that are well beyond NCI funding levels.” To
meet those additional requirements, many cooperative groups scramble
in an ad hoc manner to acquire industry or other funding to support their
efforts, but this can be unreliable and difficult given that “the system is so
underfunded that there is no elasticity,” Dr. Comis said. “What if industry
funding goes away?” he asked.
Dr. Comis suggested that cooperative groups, government agen-
cies, and industry develop evidence-based standards for Phase III trials,
including standards for data collection, data and site monitoring, and
the content of case report forms. He also suggested that there be an inde-
pendent and thorough analysis of the value of independent reviews of
imaging findings and data, since many experts question their value and
added expense. During the discussion that followed the presentation, Dr.
Canetta agreed that such an analysis would be beneficial. Dr. Comis’s final
suggestion was that there be a cooperative group–wide support structure
to provide services beyond the capacity of the central offices.
FDA PERSPECTIVE
Dr. Richard Pazdur, director of the Office of Oncology Products in the
Center for Drug Evaluation and Research at the FDA, joined the speaker
panel in the discussion that followed Dr. Comis’s presentation, and in an
impromptu presentation he agreed with many of the points and sugges-
tions made by the previous speakers. He stressed the importance of inde-
pendent reviews of the data and imaging findings for trials that are not
placebo-controlled, but he added that independent review does not have
to be extensive. It is not necessary, for instance, to review every patient
case or solicit the opinions of three different radiologists. Dr. Pazdur sug-
gested exploring alternative mechanisms, including requiring blinded
trials, in order to ensure that there is no systematic bias in studies. The
FDA does not require independent review for blinded trials, he pointed
out. He also agreed with Dr. Fyfe that it is not feasible to require that there
be concurrence between reviewer and investigator in the assessment of
radiologic findings.
On the subject of FDA requirements related to assessing the safety of
a tested drug, Dr. Pazdur pointed out that the amount of data needed to support a
safety claim for an oncology drug is much smaller than that required in other
therapeutic areas. Because of this, for subsequent development of the
drug in supplemental new drug applications (sNDAs), the FDA may require more safety information involving
larger numbers of patients. That has been especially true in recent years,
since the FDA has become more safety conscious in the post-Vioxx era,
Dr. Pazdur said. He pointed out, however, that when oncology drugs fail
to get approved it is not because investigators failed to demonstrate their
safety to the FDA but rather because they failed to demonstrate their
efficacy. He concluded that it would be helpful if the FDA defined more
clearly what an optimal safety database is, and he suggested that a public
hearing and workshop be held on this topic.
Dr. Pazdur also noted that the FDA accepts the NCI’s auditing procedure
but that industry often goes beyond that auditing, not because of
requirements of the FDA but rather in order to meet its own needs. He
suggested developing uniform auditing standards that the NCI, the FDA,
and industry would all follow.
In the general discussion that ensued, Dr. Bruce Hillman, professor of
radiology at the University of Virginia School of Medicine, brought up the
subject of innovative techniques in imaging, such as analysis software that
can provide more precision and less variability in the analysis of images.
“I find it really hard to understand why we are still talking about linear
anatomic measurement criteria in this day and age of this extraordinary
software, especially as we start talking about targeted treatments,” he
said. He suggested considering these innovative analyses of radiologic
findings when developing data and review standards.
A few attendees raised the issue of industry funding of cooperative
group trials and the effect that this might have on the data collected and
reported. “The evidence necessary to sell something under a monopoly
structure at a high price is very different than the evidence needed to
influence the practice of medicine,” Mr. Robert Erwin said. “What is the
impact on the data that are collected—and even the questions that are
asked—by industry stepping into the breach to fill the funding gap left
by NCI?” Dr. Doroshow countered that any effects that industry might
have on clinical trial data could be kept in check by having an indepen-
dent auditing system. Dr. Abrams agreed that such an independent audit
is critical, as conflicts of interest can arise whenever cooperative groups
receive industry support. “If we do not have a robust independent review
of these trials,” he said, “the criticism will be raised quite quickly that
these trials are being done by industry and that public dollars should
not pay for them. What will protect these trials is that they have a very
robust independent review, not just a cooperative group–only review.” Dr.
Pazdur added that most of the cooperative group trials receiving industry
support are for supplemental indications of drugs whose safety and effec-
tiveness are already well established and that clinical trials for primary
indications generally are scrutinized more by the agency.
Dr. Schilsky suggested that ASCO would be more than willing to
oversee the development of minimum data standards for oncology clini-
cal trials that are acceptable to all stakeholders. He also suggested that
ASCO take the lead in determining the appropriate minimal eligibility
requirements among stakeholders; this is important because ineligibility
is a major deterrent to patient participation in clinical trials, as had
been pointed out by Dr. Grubbs on the first day of the workshop. Such
eligibility criteria should ensure that the patient population of the study
is well defined and that the proposed treatment is likely to be safe in the
population to be studied, Dr. Schilsky said. Dr. Pazdur noted that lower-
ing the threshold of eligibility might increase the need for more safety
data. If, for example, people with compromised kidney or liver function
are allowed to participate in a clinical trial and it is not known how an
experimental drug is excreted, safety is more of an issue, he said.
Dr. John Wagner, executive director of clinical pharmacology at
Merck, suggested considering “fitness for purpose” when developing
minimal data standards so that the design of an experimental protocol can
adequately validate and qualify a particular use of the drug or biomarker
being tested. “There can be a minimal set of data-collection standards,
but for particular uses that may need to be augmented in one way or
another,” he said.
Dr. Canetta expanded on Dr. Fyfe’s comment that new toxicities are
rarely found during Phase III trials, because they are already documented
in Phase II trials. Industry and investigators “have been cutting off a lot
of the Phase II activities,” he said, “and therefore we have lost a lot of
learning opportunities.” He suggested a few remedies, such as doing
more randomized Phase II studies before proceeding to Phase III studies
or having an independent data and safety monitoring committee that is
program-wide for Phase II studies. The latter has helped his company
acquire better safety information before proceeding to Phase III studies,
he said.
Dr. Mendelsohn asked Dr. Comis if, in his analysis of cooperative
group trials, he found any key factors that foster adequate patient accrual.
“The most important characteristics of a highly successful trial are
that it answers an interesting question and involves a new approach,” Dr.
Comis responded. “So I think we all need to focus on those things that
are the most cutting-edge.”
Dr. deVere White noted that money is often a driver of change, and he
suggested changing the funding structure of cooperative groups. Currently,
he said, one-third of the money that the NCI gives to a cooperative group
goes toward patient reimbursements and the rest goes to the group’s
infrastructure. He suggested that NCI funding instead be split more equally
between patient reimbursements and cooperative group infrastructure support.
If this were done, more money would be allocated to patients and to the
physicians who put them in clinical trials.
Dr. Buckner noted that the NCI typically does not require data collection
on attribution of adverse events, whereas the FDA does. His clinical
trial findings suggest that attribution is an unreliable endpoint
that perhaps could be removed immediately. Dr. Canetta concurred, noting
that he recently received a letter from the Japanese regulatory agency
asking that such attributions not be done.