Omics research is a continually evolving field, with rapidly advancing scientific techniques that are highly dependent on bioinformatics and rigorous statistical methods to effectively interpret high-dimensional data. Despite the challenges inherent in a new field and some setbacks (Dupuy and Simon, 2007; Ransohoff, 2005; Ransohoff and Gourlay, 2010; Simon et al., 2003), a number of omics-based tests derived from this research have reached the market and are being used to manage patient care across a broad range of medical specialties and subspecialties.
The purpose of the case studies was to examine the test discovery and development processes used in omics research and omics-based test development to help determine the criteria needed to effectively guide development of omics-based tests, and to consider the roles of the various responsible parties in test development (discussed in Chapter 5). These examples have influenced the emerging field of omics-based tests and will continue to guide the field as more information about these tests accrues. The case studies focus on factors such as:
- The discovery and confirmation of omics-based tests;
- Analytical validation;
- Statistical and bioinformatics validation;
- Clinical/biological validation; and
- Clinical utility and clinical use.
The most extensive case study centers on several omics-based tests developed by a Duke University laboratory to predict sensitivity to chemotherapeutic agents (see Appendix B). These tests were used to select patient therapy in the three clinical trials identified in the IOM committee’s statement of task. In addition, the committee examined six tests that are currently commercially available (Table 6-1): Oncotype DX (Genomic Health), MammaPrint (Agendia), Tissue of Origin (Pathwork Diagnostics), OVA1 (Vermillion), AlloMap (XDx), and Corus CAD (CardioDx), as well as one test that did not advance to clinical use (OvaCheck). These cases reflect a number of diseases and types of omics research. The committee also reviewed the development and use of human epidermal growth factor receptor 2 (HER2) testing, one of the earliest single-biomarker tests to guide the choice of cancer therapy. This last case study illustrates the challenges involved in the development of a single analyte test and suggests how the complexity could be magnified for multianalyte, omics-based tests. Despite many years of research and development, challenges remain in defining the optimal test method and interpretation for HER2 testing.
Examination of what transpired in the development of the omics-based tests at Duke University was based on presentations to the committee, a panel discussion with Duke University researchers and administrators, documents provided by the National Cancer Institute and Duke University, and the peer-reviewed literature. The Duke case study illustrates the more systemic challenges involved in the development of omics-based tests and the need for rigorous criteria for test development, and is discussed in more detail in Appendix B. For the other case studies, the committee reviewed publicly available materials, including peer-reviewed publications and Food and Drug Administration (FDA) 510(k) clearance decision summaries. The committee also invited presentations on the development of Oncotype DX and MammaPrint (Shak, 2011; van ‘t Veer, 2011). Each of the six summaries for the commercially available tests was sent to the respective company for review of factual accuracy and completeness.
Detailed case studies were prepared for Oncotype DX, MammaPrint, Tissue of Origin, and AlloMap. The organization of the detailed case studies reflects the test development process that the committee recommends in Chapters 2-4, including candidate omics-based test discovery and confirmation, development into a defined and validated omics-based test, and evaluation for clinical use. Shorter case studies were assembled for HER2, OVA1, OvaCheck, and Corus CAD, and include details on discovery, development, and clinical use. Case studies appear in Appendix A.
TABLE 6-1 Overview of Commercially Available Omics-Based Tests
and Test Website
(diagnostic, prognostic, effect modifier)a
|Oncotype DX||Genomic Health
"a 21-gene assay that provides an individualized prediction of chemotherapy benefit and 10-year distant recurrence to inform adjuvant treatment decisions in certain women with early-stage breast cancer" (Genomic Health, 2011).
ASCO guidelines state:
"Oncotype DX may be used to identify patients who are predicted to obtain the most therapeutic benefit from adjuvant tamoxifen and may not require adjuvant chemotherapy. In addition, patients with high recurrence scores appear to achieve relatively more benefit from adjuvant chemotherapy than from tamoxifen" (Harris et al., 2007).
“a qualitative in vitro diagnostic test service, performed in a central laboratory, using the gene expression profile of fresh breast cancer tissue samples to assess a patient’s risk for distant metastasis (up to 10 years for patients less than 61 years old, up to 5 years for patients ≥ 61 years) … for breast cancer patients, with Stage I or Stage II disease, with tumor size ≤ 5.0 cm and lymph node negative. [Test] result is indicated for use as a prognostic marker only” (FDA, 2011b).
|Tissue of Origin||Pathwork Diagnostics
“The Pathwork® Tissue of Origin Test is an in vitro diagnostic intended to measure the degree of similarity between the RNA expression patterns in a patient’s formalin-fixed, paraffin-embedded (FFPE) tumor and the RNA expression patterns in a database of fifteen tumor types (poorly differentiated, undifferentiated, and metastatic cases) that were diagnosed according to the current clinical and pathological practice” (FDA, 2010).
|Guides referral to a gynecologic oncologist “a qualitative serum test that combines the results of five immunoassays into a single numerical score for women with an ovarian adnexal mass present for which surgery is planned as an aid to further assess the likelihood that malignancy is present when the physician’s independent clinical and radiological evaluation does not indicate malignancy. The test is not intended as a screening or stand-alone diagnostic assay” (FDA, 2011c).||Yes|
“an In Vitro Diagnostic Multivariate Index assay (IVDMIA) test service, performed in a single laboratory, assessing the gene expression profile of RNA isolated from peripheral blood mononuclear cells (PBMC[s]). [It] is intended to aid in the identification of heart transplant recipients with stable allograft function [at least 2 months (≥ 55 days) posttransplant] with a low probability of moderate/severe acute cellular rejection (ACR) at the time of testing” (FDA, 2008).
|Corus CAD||CardioDx
“blood test that can quickly and safely assess whether or not [a] patient’s symptoms are due to obstructive coronary artery disease (CAD) … a decision-making tool that can help identify patients unlikely to have obstructive CAD” (Cardio Dx, 2011).
|a The designation here reflects the intended use described in the FDA 510(k) decision summary, if there is one, or if not, the company-proposed use and/or guideline recommendation.
NOTE: ASCO = American Society of Clinical Oncology, FDA = Food and Drug Administration.
The review of the case studies highlighted several key concepts for omics-based test discovery, development, and validation. These include
- importance of a well-designed development plan;
- data and code availability;
- avoidance of overlap between discovery and validation specimens;
- locking down all aspects of the test prior to evaluation for clinical utility and use;
- interaction with FDA;
- clinical/biological validation characteristics;
- assessment of clinical utility; and
- role of the investigators and institutions in scientific oversight.
An important point to keep in mind is that these case studies are representative of an early period in the new field of omics-based test development, prior to broad agreement on standard processes and criteria for the various stages of test development. As the scientific community gains experience with omics research, more information about ideal test development processes will continue to emerge. Likewise, evidence on clinical utility for these tests will also become richer as more information accrues over time. Recommendations from this IOM committee on the processes for omics-based test development aim to provide needed clarity to the field.
Importance of a Well-Designed Development Plan
As described in Chapter 2, an important aspect of test development is a well-designed development plan. Components of a test development plan include starting with a clinically meaningful question, developing a candidate test on a training set of specimens, locking down the candidate test, and employing rigorous test validation procedures. The availability of appropriate archival tissue for clinical/biological validation can greatly facilitate rigorous development procedures. If appropriate archival samples are not available for assessment of clinical utility, then the development plan should include a prospective clinical trial design.
Tests developed in universities (or where the developmental process is started in universities) are likely to grow out of basic research on the biological meaning of the omics elements that comprise the test and gradually evolve into more formal test development. This is illustrated by the studies of HER2 and MammaPrint, although the development of MammaPrint crossed into a more formalized approach after Agendia was formed. In contrast, companies are likely to start with a focus on a specific test that will eventually have commercial value and thus generally have a clear development plan early on, as seems to be the case with Genomic Health. According to a presentation to the IOM committee by Steven Shak, chief medical officer of Genomic Health, Oncotype DX had a clearly articulated development plan from the start. Investigators defined the purpose of the test and subsequently created and implemented a multistep, multistudy approach to develop their omics-based test “to provide the evidence regarding analytic performance, clinical[/biological] validity, and clinical utility to meet the needs of patients, physicians, payors, and regulators” (Shak, 2011).
Data and Code Availability
As described in Chapter 2, the committee recommends that data and metadata used for identification of an omics-based test should be made available in an independently managed database in standard format, and that code and fully specified computational procedures for omics-based tests should be made sustainably available. Chapters 2 and 5 discuss the importance of sharing data within the scientific community, especially in the setting of omics research, where data and analyses are particularly complex. Sharing data can enable external verification of the results and allow other investigators to generate additional insights from the data. In Chapter 5, the committee recommends that journals and funders require the public availability of data, metadata, prespecified analysis plans, code, and fully specified computational models.
The importance of data availability was highlighted in the OvaCheck case study, in which publicly available datasets enabled external researchers to uncover serious problems in experimental design (Baggerly et al., 2004) that ultimately demonstrated that the published computational model (Petricoin et al., 2002) was based on artifacts and not on relevant biological signals.
In the Duke case, external investigators did not have full access to the data and code, and this limited the ability to independently evaluate the tests to determine whether they were valid (Baggerly and Coombes, 2009; Baron et al., 2010). Conflicting and unclear information in the papers and cited references regarding the data and statistical methods contributed to the inability of colleagues in the scientific community to understand and replicate the generation of the computational models (Baggerly, 2011; McShane, 2010; Review of Genomic Predictors for Clinical Trials from Nevins, Potti, and Barry, 2009). When Baggerly and Coombes attempted to assess the validity of the tests, they determined that there was insufficient information to reproduce the published results using the available data and the methods published in the Nature Medicine paper (Baggerly, 2011; Potti et al., 2006).
A review of the six commercially available tests (Table 6-2) illustrates that public availability of all omics-based test data and code has not been the standard of practice. The field of omics is early in its development and the standards for data sharing have been unclear and only slowly evolving toward more transparency. Commercial interests and protection of proprietary information may also limit the public availability of some data and information.
TABLE 6-2 Data Availability
|Test||Computational Model Published?||Raw Data from Discovery Publicly Available?||Went Through FDA Clearance Processes?||Is the Methodology to Derive the Computational Model Available, and in Sufficient Detail to Fully Reproduce?||Independent Clinical Research Entity Involved?|
|Oncotype DX||Yes (Paik et al., 2004)||Discovery RT-PCR data not available||No||Methodology used to derive and train the computational model is not available in sufficient detail to fully reproduce||Yes (National Surgical Adjuvant Breast and Bowel Project)|
|MammaPrint||No||Discovery microarray data available||Yes||Some of the details needed for independent replication are unclear from the supplementary materials, but inquiries about the computational model have always been answered (see note a)||Yes (TRANSBIG, a Consortium of the Breast International Group)|
|Tissue of Origin||No||Raw data from discovery include both publicly available (GEO accession number GSE2109) and private sources||Yes||Methodology used to derive and train the computational model is not available||No|
|OVA1||No||Initial raw data used to find a preliminary panel of biomarkers are available; raw data used to finalize the panel of biomarkers in the OVA1 test are unavailable; data on samples used to train the computational model are available in the 510(k) decision summary, but are not available in peer-reviewed literature||Yes||Methodology used to derive and train the computational model is not available||Yes (PrecisionMed International housed specimens, Quest Diagnostics performed biomarker measurements, Applied Clinical Intelligence performed data analysis)|
|AlloMap||Yes (Deng et al., 2006)||The microarray data used to initially identify biomarkers during the discovery phase of product development are available in GEO under accession number GSE2445. Raw RT-PCR training data were provided to FDA but not reported in Deng et al. (2006)||Yes||Unknown. A description of the methods for biomarker discovery, test development, and the computational model is provided in the Deng et al. (2006) supplement||Yes (Cardiac Allograft Rejection Gene Expression Observational investigators)|
|Corus CAD||Yes (Rosenberg et al., 2010)||Discovery microarray data available (GEO accession number GSE20686)||No||Some of the details needed for independent replication, such as penalization tuning parameters for Ridge regression, are not provided. RT-PCR data used to derive the computational model were not published but are available upon request to qualified investigators (see note b)||Yes (“The analysis was independently performed under the supervision of Dr. Schork at Scripps Translational Science Institute”)|
NOTE: FDA = Food and Drug Administration, GEO = Gene Expression Omnibus, RT-PCR = reverse-transcriptase polymerase chain reaction.
a Personal communication, Laura van ‘t Veer, Agendia, November 1, 2011.
b Personal communication, Steve Rosenberg, CardioDx, December 12, 2011.
The cases highlight several examples in which test developers explicitly note the availability of data. For example, Paik et al. (2004), Deng et al. (2006), and Rosenberg et al. (2010) report the computational model for Oncotype DX, AlloMap, and Corus CAD, respectively. Both tests developed as laboratory-developed tests (LDTs) had published computational models (Oncotype DX and Corus CAD); only one FDA-cleared test has a published computational model (AlloMap). Discovery microarray data are available for MammaPrint, AlloMap, and Corus CAD (Deng et al., 2006; van ‘t Veer et al., 2002).1 Buyse et al. (2006) report that raw microarray data and clinical data for the MammaPrint clinical validation study were deposited with the European Bioinformatics Institute ArrayExpress database. Although there are examples of developers reporting the availability of a test’s computational model or data used in discovery or validation, as Table 6-2 shows, there often is not enough information publicly available for external investigators to fully reproduce a test.
The IOM committee recognized that it might not always be possible to make this information publicly available due to the protection of intellectual property. For publicly funded research, the committee recommends that code and fully specified computational procedures should be made available at the time of publication or at the end of funding. For commercially developed tests, code and fully specified computational procedures would be submitted for FDA review if seeking approval or clearance, or would be described in a publication in the case of an LDT (see Chapter 2). Companies that seek FDA clearance or approval for their tests would have had to submit data to FDA as part of the 510(k) clearance processes or premarket approval (PMA) processes, respectively, but only the information reported in the FDA decision summary is made publicly available.
Avoidance of Overlap of Discovery and Validation Specimens
One of the challenges of omics research is the lack of available specimens on which to conduct correlative science and exploration of candidate omics-based tests (reviewed in IOM, 2010). Partly due to this challenge, a number of tests have been developed with overlapping training and validation datasets. In the cases that the committee reviewed, two commercial omics-based tests used overlapping training and validation datasets at some point in their development processes: MammaPrint and AlloMap (Table 6-3). Of the samples used in the development of the MammaPrint computational model, 78 percent were reused in the van de Vijver et al. (2002) study, leading to criticism about the overlap between training and validation datasets (Kim and Paik, 2010; Ransohoff, 2003, 2004). A subsequent validation study (Buyse et al., 2006) was performed, and suggested that overfitting due to the use of training samples was likely a problem in the 2002 study. In the AlloMap case, the first validation study used 63 specimens that had not been used in previous development of the test. A second validation study included the 63 primary validation samples plus samples from patients who had contributed samples to the gene discovery and/or diagnostic development phases of the study. Overlap of training and validation datasets can lead to a number of problems in test discovery and development, including overstatement of the accuracy of an omics-based test and incorrect error estimation (Leek et al., 2010). The OvaCheck case study illustrates the importance of independent validation datasets (Diamandis, 2004). If test samples are obtained independently, preferably drawn from another institution at a different point in time, and prepared and analyzed separately, then the risk of batch effects is greatly reduced. As described in Chapter 2, test developers should clearly separate training and validation datasets in order to avert these problems, which are among the most common causes of a test ultimately failing clinical validation.
1 Microarray data from Corus CAD are available, but PCR data used in test development are unavailable. Personal communication, Steve Rosenberg, October 21, 2011.
TABLE 6-3 Statistical and Bioinformatics Validation Considerations
|Test||Lock-Down Reported?||Overlap of Discovery and Validation Datasets at Some Point in Test Development?|
|Oncotype DX||Yes (Paik et al., 2004)||No|
|MammaPrint||Yes (Personal communicationa )||Yes|
|Tissue of Origin||Yes (Monzon et al., 2009; Pillai et al., 2011; personal communicationb )||No|
|OVA1||Yes (Personal communicationc )||No|
|AlloMap||Yes (Personal communicationd )||Yes|
|Corus CAD||Yes (Rosenberg et al., 2010)||No|
|a Personal communication, Laura van ‘t Veer, Agendia, November 28, 2011.
b Personal communication, Ed Stevens, Pathwork Diagnostics, October 18, 2011.
c Personal communication, Scott Henderson, Vermillion, December 12, 2011.
d Personal communication, Mitch Nelles, XDx, October 12, 2011.
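The separation of discovery and validation specimens discussed in this section can be checked mechanically before a validation study begins. The following is a minimal sketch, assuming every specimen is labeled with a patient identifier; the function names and the identifiers are hypothetical and are not drawn from any of the case studies. The check is made at the patient level because, as in the AlloMap example, new samples from a patient who already contributed to discovery are not independent of the training data.

```python
def find_overlap(training_patients, validation_patients):
    """Return the sorted patient IDs that appear in both specimen sets."""
    return sorted(set(training_patients) & set(validation_patients))


def overlap_fraction(training_patients, validation_patients):
    """Fraction of distinct validation patients who also appear in training."""
    validation = set(validation_patients)
    if not validation:
        raise ValueError("validation set is empty")
    return len(validation & set(training_patients)) / len(validation)


# Hypothetical identifiers, for illustration only.
training = ["P001", "P002", "P003", "P004"]
validation = ["P003", "P004", "P005", "P006", "P007"]

shared = find_overlap(training, validation)
fraction = overlap_fraction(training, validation)
print(shared, fraction)
```

A non-empty result would indicate that the validation set is not fully independent and that accuracy estimates from it may be optimistic.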
Locking Down All Aspects of the Test Prior to Evaluation for Clinical Use
In Chapter 4, the committee recommends that a validated omics-based test should not be changed during a clinical trial. The omics-based test should be locked down prior to this stage of development (more information on the importance of locking down an omics-based test can be found in Chapters 2, 3, and 4). As noted by Baggerly and colleagues and by McShane, the computational models developed by a Duke University lab were not locked down prior to use in the clinical trials (Baggerly, 2011; McShane, 2010). This was a serious shortcoming in the development of the Duke omics-based tests. For three other cases, the developers explicitly stated in the study publication that their test was locked down before clinical validation: Oncotype DX (Paik et al., 2004), Tissue of Origin (Monzon et al., 2009; Pillai et al., 2011), and Corus CAD (Rosenberg et al., 2010). Communication with test developers provided additional confirmation that AlloMap, OVA1, and MammaPrint were locked down (Table 6-3).
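One way to make a lock-down auditable is to record a cryptographic fingerprint of the fully specified test at the moment it is locked and to recompute it before each later use. The sketch below is an illustration under stated assumptions, not a procedure described by the committee or used by any developer in the case studies: the specification layout, the feature names, the weights, and the cutoff are all invented, and the helper names `fingerprint` and `is_unchanged` are hypothetical.

```python
import hashlib
import json


def fingerprint(test_spec):
    """Return a SHA-256 digest of a test specification.

    Keys are sorted so the digest does not depend on dictionary order.
    """
    canonical = json.dumps(test_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(current_spec, recorded_digest):
    """True if the specification still matches the digest recorded at lock-down."""
    return fingerprint(current_spec) == recorded_digest


# Hypothetical specification; features, weights, and cutoff are made up.
locked_spec = {
    "features": ["GENE_A", "GENE_B", "GENE_C"],
    "weights": [0.5, -1.25, 2.0],
    "cutoff": 0.3,
}
locked_digest = fingerprint(locked_spec)

# Any later change, even to a single cutoff, produces a different digest.
modified_spec = dict(locked_spec, cutoff=0.35)
print(is_unchanged(locked_spec, locked_digest))
print(is_unchanged(modified_spec, locked_digest))
```

A digest of this kind covers only what is written into the specification; assay procedures and preprocessing steps would need to be captured in the same record for the check to be meaningful.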
Interaction with FDA
In Chapter 4, the committee recommends that FDA clarify the regulation of omics-based tests by developing and finalizing a guidance or regulation defining which omics-based tests require FDA review, the type of review required, and when this review should occur. A similar guidance should be developed and finalized for oversight of LDTs that are currently not reviewed by FDA.
Review of the committee’s case studies demonstrates that companies have pursued both LDT and FDA pathways for developing an omics-based test. The use of multiple pathways indicates a lack of clarity and consistency on the regulatory requirements for omics-based tests. Five of the commercially available tests that the committee examined are performed exclusively by each company’s proprietary Clinical Laboratory Improvement Amendments of 1988 (CLIA)-certified laboratory.2 Two companies did not seek FDA clearance and market their tests as LDTs: Genomic Health (Oncotype DX) and CardioDx (Corus CAD). Four companies received FDA 510(k) clearance of their tests: Agendia (MammaPrint), Pathwork Diagnostics (Tissue of Origin), Vermillion (OVA1), and XDx (AlloMap). The publicly available 510(k) clearance decision documents summarize the analytical and clinical validation results that the company/sponsor submitted to FDA. However, FDA clearance does not mean that a test has clinical utility, and lack of FDA clearance does not mean that a test does not have clinical utility. This is an important distinction between the FDA approval process for drugs versus devices (which include omics-based tests).
2 OVA1 is performed exclusively by Quest Diagnostics, which is subject to CLIA certification (Quest Diagnostics, 2011). Currently Pathwork Diagnostics offers Tissue of Origin exclusively through its CLIA-certified laboratory, but is developing an in vitro diagnostic test kit for other laboratories (Pathwork Diagnostics, 2010).
In Chapter 5, the committee also recommends that FDA communicate the investigational device exemption (IDE) requirements for omics-based tests used in clinical trials. Commercial developers, such as those examined in the case studies, may be more familiar with IDE requirements than academic institutions. In several of the case studies, companies and FDA held a pre-IDE meeting to determine whether an IDE would be required for the company’s test development process. FDA determined that an IDE was not needed for either the AlloMap or the Tissue of Origin test because the test was not directing patient therapy in the studies proposed to assess the test.3 Physicians can now use the test for that purpose, however. Agendia reported that it received an IDE for MammaPrint that helped clarify the process and requirements for the de novo 510(k),4 and Vermillion reported that it received an IDE for OVA1.5 Two ongoing prospective studies (the TAILORx and RxPONDER trials) direct patient management on the basis of the Oncotype DX Recurrence Score. For both trials, information required for approval of investigational use of Oncotype DX in the trial was submitted as part of an investigational new drug application to FDA.6 In the Duke case study, the investigators did consult FDA regarding the need for an IDE (FDA, 2011a). In 2009, FDA sent a letter to Duke stating that the omics-based tests being studied in the three clinical trials named in the IOM statement of task needed to go through the IDE process (Chan, 2009). In response, the investigators made some changes to the protocol of the studies, and Duke contacted FDA for further clarification about whether an IDE was still required (FDA, 2011a; Potti, 2009). The Duke Institutional Review Board (IRB) determined that an IDE was not needed when it did not receive a reply from FDA7 (FDA, 2011a). However, in retrospect, the Duke IRB recognized that an IDE should have been obtained for the omics-based tests because the tests were used to direct patient management in the clinical trials (FDA, 2011a).
Regardless of which pathway is taken to market, consultation with FDA can be beneficial and is recommended. For example, the developers of the OVA1 test sought FDA input, and this early dialogue with FDA prompted Vermillion to include two different cut-off values for the test, depending on a patient’s menopausal status (Fung, 2010). Although Genomic Health did not meet with FDA, the company indicated that it benefited from past experience working with FDA and from the extensive background material FDA provides on its website about assay validation.8
3 Personal communication, Mitch Nelles, XDx, October 12, 2011; Personal communication, Ed Stevens, Pathwork Diagnostics, October 18, 2011.
4 Personal communication, Laura van ‘t Veer, Agendia, November 28, 2011.
5 Personal communication, Scott Henderson, Vermillion, November 1, 2011.
6 Personal communication, Lisa McShane, National Cancer Institute, February 9, 2012.
7 According to the FDA website, the Center for Drug Evaluation and Research has no record of receiving the December 2009 letter from Dr. Anil Potti discussing an exemption for the trial that received pre-IDE review. The letter was brought to FDA’s attention during its 2011 inspection of the Duke IRB and clinical investigators (see http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/ucm289100.htm).
Clinical/Biological Validation Characteristics
The choice of study design for the clinical/biological validation studies of the tests varied widely among the case studies (Table 6-4). Designs included retrospective studies, a prospective–retrospective study (using archived specimens from previously conducted, formal clinical trials that evaluated treatment options that might be affected by the test’s use), and prospective trials in which the test was not directing therapy. In the Duke case, the investigators attempted to validate some of the tests using cell lines, or in one case, clinical samples from patients with breast cancer (Bonnefoi et al., 2007). However, prospective clinical trials in which the omics-based tests were used to direct patient therapy were initiated prematurely, before clinical/biological validity had been established.
As described in Chapter 3, the committee recommended that, when feasible, the identity of the specimens used for clinical/biological validation of the omics-based test should be blinded to the individuals performing the test and interpreting its results. In some instances, this may entail working with another independent group to conduct validation studies. In the case of Oncotype DX, Genomic Health completed reverse-transcriptase polymerase chain reaction (RT-PCR) analysis of the specimens used for clinical validation and supplied the results to the National Surgical Adjuvant Breast and Bowel Project, which conducted the analyses evaluating the association between recurrence score and clinical outcome. In the clinical validation for MammaPrint by Buyse et al. (2006), the data were housed at TRANSBIG,9 and the statistical analyses were conducted by the International Drug Development Institute. For Corus CAD, the publication describing the clinical/biological validation noted that an investigator at the Scripps Translational Science Institute was responsible for the data analyses, while the laboratory work was performed at CardioDx.
8 Personal communication, Steven Shak, Genomic Health, December 13, 2011.
9 A consortium launched by the Breast International Group (BIG) to promote international collaboration in translational research. (See www.breastinternationalgroup.org/Research/TRANSBIG.aspx.)
TABLE 6-4 Choice of Trial Designs for Clinical/Biological Validation
|Test||Clinical Validation Designs|
|Oncotype DX||Prospective–retrospective (Paik et al., 2004); retrospective (Habel et al., 2006)|
|Tissue of Origin||Retrospective|
|OVA1||Prospective (not directing therapy—trial was for development and validation)|
|AlloMap||Prospective (not directing therapy—trial was for development and validation)|
|Corus CAD||Prospective (not directing therapy—trial was for development and validation)|
The development and validation pathways for Oncotype DX and MammaPrint offer some interesting comparisons. Kim and Paik (2010) note that both MammaPrint and Oncotype DX went through their respective development steps either “by design or by demand from the community.” Investigators largely chose different strategies for test discovery and development, including the type of tissue used in discovery; the type of risk score (a dichotomous variable versus low-, intermediate-, and high-risk categories); the regulatory approach; and the intended use population. Oncotype DX was designed for patients with estrogen-receptor-positive, lymph-node-negative, early-stage breast cancer. MammaPrint has a potentially larger indication, including patients with estrogen-receptor-negative tumors. Despite these differences, MammaPrint and Oncotype DX have shown around 80 percent agreement in outcome classification (Fan et al., 2006), although they have only one gene in common.
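An agreement figure of this kind is the fraction of patients to whom two tests assign the same risk call. The sketch below shows the arithmetic only. It assumes, for illustration, that both tests' results have been collapsed to two categories, and the ten calls are invented; they are not data from Fan et al. (2006), and the function name `percent_agreement` is hypothetical.

```python
def percent_agreement(calls_a, calls_b):
    """Fraction of patients for whom two tests give the same risk call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both tests must be scored on the same patients")
    if not calls_a:
        raise ValueError("no patients to compare")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return matches / len(calls_a)


# Made-up calls for ten patients, listed in the same patient order.
test_a = ["low", "low", "high", "high", "high", "low", "high", "low", "high", "high"]
test_b = ["low", "high", "high", "high", "high", "low", "high", "low", "low", "high"]

agreement = percent_agreement(test_a, test_b)
print(f"{agreement:.0%}")
```

Raw agreement does not correct for agreement expected by chance, so it says nothing by itself about whether either test is accurate.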
Assessment of Clinical Utility
Clinical utility is defined as “evidence of improved measurable clinical outcomes, and [a test’s] usefulness and added value to patient management decision-making compared with current management without [the] test” (Teutsch et al., 2009, p. 11). Assessment of clinical utility is not part of the FDA’s evaluation of a test, and generally, clinical utility is determined after a test or device is on the market, sometimes decades later. Clinical utility is different from the intrinsic attributes of a test’s performance characteristics, such as sensitivity and specificity. Clinical utility is not concerned with how a test performs, but rather how its use influences health outcomes. As described in Chapter 4, the ideal way to assess clinical utility is through prospective randomized controlled trials addressing, for example, whether the use of the new test results in an increase in the length or quality of life, a significant increase in progression-free survival, or avoidance of unnecessary treatment (especially toxic treatment) by patients who are unlikely to benefit. Clinical trials assessing clinical utility are important because they can inform how a test is used in practice.
As described in Chapter 4, prospective trial designs in which biomarkers direct patient management (as in Figures 4-4 and 4-5) are useful approaches to assess the clinical utility of the biomarker. However, they are also the study designs that pose the highest risk to trial participants, because treatment decisions are determined by the test result. The justification for using these designs depends substantially on the amount of information known about the test. For example, there were no reported validation attempts using clinical tumor samples from patients with lung cancer for the cisplatin test developed by the Duke University lab, even though the first trial in which the cisplatin test was used to guide therapy was the NCT00509366 trial for advanced lung cancer. This type of trial design may have been premature, given the lack of tumor-specific validation of the cisplatin test.
Evidence on clinical utility evolves as new information about a test emerges, and the absence of data on clinical utility should not be interpreted as a lack of utility. A defining feature of both the MammaPrint and Oncotype DX development stories is the need for a large, prospective trial to provide more information on each test’s utility in clinical practice. In both cases, the prospective trial was not initiated until after the test was on the market and widely available for clinical use as a prognostic factor. The results of the prospective studies TAILORx and MINDACT will provide prospective information on the clinical utility of Oncotype DX and MammaPrint, respectively, for the first time.
The Role of Investigators and Institutions in Scientific Oversight
Chapter 5 comprehensively discusses the roles of responsible parties in the conduct of omics research and omics-based test development. Here, investigators and institutions are highlighted for their central role in ensuring rigorous omics research and test development. As illustrated in the Duke case study (Appendix B), transparency and open communication are integral to the conduct of science, whether it be reporting of data and code, disclosure of conflicts of interest, or reporting of potential breaches in scientific procedures.
Investigators are responsible for the accuracy of their data, the fairness of their conclusions, and responding appropriately to criticism. Investigators ensure that clinical research is conducted with the engagement of appropriate scientific expertise, including the involvement of individuals with proper biostatistics and bioinformatics expertise, and that the research has the approval of relevant review bodies. It also is important for all members of a research team to understand the aims and intricacies of collaborative studies and for coauthors of a publication to keep each other informed about constructive criticism of the work and ways to improve ongoing research.
Institutions play an important role in establishing a culture of scientific integrity and transparency, including setting expectations of behavior, achievement, and integrity, and providing safe environments for reporting irregularities to prevent lapses in scientific integrity. Institutions are directly charged with being the “oversight” bodies when specific scientific questions or challenges arise, including investigating questions of misconduct, or simply investigating “soundness of science.” Oversight processes that will maintain integrity even in the presence of institutional conflicts of interest, both financial and non-financial (such as factors that impact an institution’s reputation), may be especially important in addressing this charge. Closer attention to such conflicts may have been helpful in avoiding the events that occurred in the Duke case (see Appendix B).
The case studies illustrate the multitude of considerations that must be taken into account to move omics research into test development for clinical use. The involvement of investigators, institutions, funders, and journals is essential for ensuring good research practices and oversight of omics-based test discovery and development. A well-designed test development plan addresses a clinically meaningful question and employs rigorous test discovery, development, and validation procedures. This includes locking down all aspects of an omics-based test prior to evaluation for clinical utility and use, and avoiding overlap between discovery and validation specimens. Choosing an appropriate clinical/biological validation strategy and interacting with FDA prior to initiation of validation studies also reflect a well-designed test development plan. Making data and code available is a critical aspect of test development because it enables external verification of the results and generation of additional insights that can advance science and patient care.
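Two of the practices named above, avoiding overlap between discovery and validation specimens and locking down the test before evaluation, lend themselves to simple automated checks. The sketch below is one possible illustration and not a procedure described in the case studies: the function names, specimen identifiers, gene names, weights, and cutoff are all hypothetical, and a real lock-down would also cover data preprocessing steps and software versions.

```python
import hashlib
import json


def overlapping_specimens(discovery_ids, validation_ids):
    """Return specimen IDs that appear in both the discovery and validation sets."""
    return sorted(set(discovery_ids) & set(validation_ids))


def model_fingerprint(model_spec):
    """Return a SHA-256 digest of a canonical JSON form of the model specification."""
    canonical = json.dumps(model_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical specimen identifiers (invented for illustration).
discovery = ["S001", "S002", "S003"]
validation = ["S003", "S004"]
shared = overlapping_specimens(discovery, validation)
print("Specimens in both sets:", shared)

# Hypothetical locked-down model: features, weights, and decision cutoff.
locked_model = {"genes": ["GENE_A", "GENE_B"], "weights": [0.4, -1.2], "cutoff": 0.5}
fingerprint_at_lock = model_fingerprint(locked_model)

# Before validation, recompute the digest to confirm nothing has changed.
unchanged = model_fingerprint(locked_model) == fingerprint_at_lock
print("Model unchanged since lock-down:", unchanged)
```

Recording the digest at the time of lock-down, for example in a dated protocol document, would give outside reviewers a way to confirm that the model evaluated in validation is the one that was locked.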
Baggerly, K. A. 2011. Forensics Bioinformatics. Presentation at the Workshop of the IOM Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, Washington, DC, March 30-31.
Baggerly, K. A., and K. R. Coombes. 2009. Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology. Annals of Applied Statistics 3(4):1309-1334.
Baggerly, K. A., J. S. Morris, and K. R. Coombes. 2004. Reproducibility of SELDI-TOF protein patterns in serum: Comparing datasets from different experiments. Bioinformatics 20(5):777-785.
Baron, A. E., K. Bandeen-Roche, D. A. Berry, J. Bryan, V. J. Carey, K. Chaloner, M. Delorenzi, B. Efron, R. C. Elston, D. Ghosh, J. D. Goldberg, S. Goodman, F. E. Harrell, S. Galloway Hilsenbeck, W. Huber, R. A. Irizarry, C. Kendziorski, M. R. Kosorok, T. A. Louis, J. S. Marron, M. Newton, M. Ochs, J. Quackenbush, G. L. Rosner, I. Ruczinski, S. Skates, T. P. Speed, J. D. Storey, Z. Szallasi, R. Tibshirani, and S. Zeger. 2010. Letter to Harold Varmus: Concerns about Prediction Models Used in Duke Clinical Trials. Bethesda, MD, July 19, 2010. http://www.cancerletter.com/categories/documents (accessed January 18, 2012).
Buyse, M., S. Loi, L. J. van ‘t Veer, G. Viale, M. Delorenzi, A. M. Glas, M. S. d’Assignies, J. Bergh, R. Lidereau, P. Ellis, A. Harris, J. Bogaerts, P. Therasse, A. Floore, M. Amakrane, F. Piette, E. T. Rutgers, C. Sotiriou, F. Cardoso, and M. J. Piccart. 2006. Validation and clinical utility of a 70-gene prognostic signature for women with node-negative breast cancer. Journal of the National Cancer Institute 98(17):1183-1192.
CardioDx. 2011. What Is Corus CAD? http://www.cardiodx.com/corus-cad/product-overview/ (accessed November 21, 2011).
Chan, M. M. 2009. Letter to Division of Medical Oncology, Duke University Medical Center. http://www.fda.gov/downloads/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/UCM289102.pdf (accessed February 9, 2012).
Deng, M. C., H. J. Eisen, M. R. Mehra, M. Billingham, C. C. Marboe, G. Berry, J. Kobashigawa, F. L. Johnson, R. C. Starling, S. Murali, D. F. Pauly, H. Baron, J. G. Wohlgemuth, R. N. Woodward, T. M. Klingler, D. Walther, P. G. Lal, S. Rosenberg, S. Hunt, and for the CARGO Investigators. 2006. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. American Journal of Transplantation 6(1):150-160.
Diamandis, E. 2004. Mass spectrometry as a diagnostic and a cancer biomarker discovery tool: Opportunities and potential limitations. Molecular and Cellular Proteomics 3(4):367-378.
Dupuy, A., and R. M. Simon. 2007. Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. Journal of the National Cancer Institute 99(2):147-157.
Fan, C., D. S. Oh, L. Wessels, B. Weigelt, D. S. A. Nuyten, A. B. Nobel, L. J. van ‘t Veer, and C. M. Perou. 2006. Concordance among gene-expression-based predictors for breast cancer. New England Journal of Medicine 355(6):560-569.
FDA (Food and Drug Administration). 2008. 510(k) Substantial Equivalence Determination Decision Summary Assay and Instrument Combination Template. http://www.accessdata.fda.gov/cdrh_docs/reviews/K073482.pdf (accessed November 21, 2011).
FDA. 2010. 510(k) Substantial Equivalence Determination Decision Summary (K092967). http://www.accessdata.fda.gov/cdrh_docs/reviews/K092967.pdf (accessed November 16, 2011).
FDA. 2011a. FDA Establishment Inspection Report, Duke University Medical Center. http://www.fda.gov/downloads/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/UCM289106.pdf (accessed February 9, 2012).
FDA. 2011b. 510(k) Substantial Equivalence Determination Decision Summary (K101454). http://www.accessdata.fda.gov/cdrh_docs/reviews/K101454.pdf (accessed September 19, 2011).
FDA. 2011c. Substantial Equivalence Determination Decision Summary (K081754). http://www.accessdata.fda.gov/cdrh_docs/reviews/K081754.pdf (accessed October 11, 2011).
Fung, E. T. 2010. A recipe for proteomics diagnostic test development: The OVA1 test, from biomarker discovery to FDA clearance. Clinical Chemistry 56(2):327-329.
Genomic Health. 2011. Overview: What Is the Oncotype DX Assay? http://www.oncotypedx.com/en-US/Breast/HealthcareProfessional/Overview.aspx (accessed September 14, 2011).
Habel, L. A., S. Shak, M. Jacobs, A. Capra, C. Alexander, M. Pho, J. Baker, M. Walker, D. Watson, J. Hackett, N. T. Blick, D. Greenberg, L. Fehrenbacher, B. Langholz, and C. P. Quesenberry. 2006. A population-based study of tumor gene expression and risk of breast cancer death among lymph node-negative patients. Breast Cancer Research 8(3):R25.
Harris, L., H. Fritsche, R. Mennel, L. Norton, P. Ravdin, S. E. Taube, M. R. Somerfield, D. F. Hayes, and R. C. Bast Jr. 2007. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. Journal of Clinical Oncology 25(33):5287-5312.
IOM (Institute of Medicine). 2010. A National Clinical Trials System for the 21st Century: Reinvigorating the Cooperative Group Program. Washington, DC: The National Academies Press.
Kim, C., and S. Paik. 2010. Gene-expression-based prognostic assays for breast cancer. Nature Reviews Clinical Oncology 7(6):340-347.
Leek, J. T., R. B. Scharpf, H. C. Bravo, D. Simcha, B. Langmead, W. E. Johnson, D. Geman, K. Baggerly, and R. A. Irizarry. 2010. Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11(10):733-739.
McShane, L. M. 2010a. NCI Address to the Institute of Medicine Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials. Presented at Meeting 1. Washington, DC, December 20.
Monzon, F. A., M. Lyons-Weiler, L. J. Buturovic, C. T. Rigl, W. D. Henner, C. Sciulli, C. I. Dumur, F. Medeiros, and G. G. Anderson. 2009. Multicenter validation of a 1,550-gene expression profile for identification of tumor tissue of origin. Journal of Clinical Oncology 27(15):2503-2508.
Paik, S., S. Shak, G. Tang, C. Kim, J. Baker, M. Cronin, F. L. Baehner, M. G. Walker, D. Watson, T. Park, W. Hiller, E. R. Fisher, D. L. Wickerham, J. Bryant, and N. Wolmark. 2004. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. New England Journal of Medicine 351(27):2817-2826.
Pathwork Diagnostics. 2010. Pathwork Tissue of Origin test for FFPE cleared by U.S. Food and Drug Administration. http://www.pathworkdx.com/News/M129_FDA_Clearance_Final.pdf (accessed November 17, 2011).
Petricoin, E. F., A. M. Ardekani, B. A. Hitt, P. J. Levine, V. A. Fusaro, S. M. Steinberg, G. B. Mills, C. Simone, D. A. Fishman, E. C. Kohn, and L. A. Liotta. 2002. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 359(9306):572-577.
Pillai, R., R. Deeter, C. T. Rigl, J. S. Nystrom, M. H. Miller, L. Buturovic, and W. D. Henner. 2011. Validation and reproducibility of a microarray-based gene expression test for tumor identification in formalin-fixed, paraffin-embedded specimens. Journal of Molecular Diagnostics 13(1):48-56.
Potti, A. 2009. Letter to FDA’s CDER from Division of Medical Oncology, Duke University Medical Center. http://www.fda.gov/downloads/MedicalDevices/ProductsandMedicalProcedures/InVitroDiagnostics/UCM289103.pdf (accessed February 9, 2012).
Potti, A., H. K. Dressman, A. Bild, R. F. Riedel, G. Chan, R. Sayer, J. Cragun, H. Cottrill, M. J. Kelley, R. Petersen, D. Harpole, J. Marks, A. Berchuck, G. S. Ginsburg, P. Febbo, J. Lancaster, and J. R. Nevins. 2006. Genomic signatures to guide the use of chemotherapeutics. Nature Medicine 12(11):1294-1300.
Quest Diagnostics. 2011. Licenses and Accreditation. http://www.questdiagnostics.com/brand/company/b_comp_licenses.html (accessed November 21, 2011).
Ransohoff, D. F. 2003. Gene-expression signatures in breast cancer. New England Journal of Medicine 348(17):1716.
Ransohoff, D. F. 2004. Rules of evidence for cancer molecular-marker discovery and validation. Nature Reviews Cancer 4(4):309-314.
Ransohoff, D. F. 2005. Bias as a threat to the validity of cancer molecular-marker research. Nature Reviews Cancer 5(2):142-148.
Ransohoff, D. F., and M. L. Gourlay. 2010. Sources of bias in specimens for research about molecular markers for cancer. Journal of Clinical Oncology 28(4):698-704.
Review of Genomic Predictors for Clinical Trials from Nevins, Potti, and Barry. 2009. Durham, NC: Duke University.
Rosenberg, S., M. R. Elashoff, P. Beineke, S. E. Daniels, J. A. Wingrove, W. G. Tingley, P. T. Sager, A. J. Sehnert, M. Yau, W. E. Kraus, K. Newby, R. S. Schwartz, S. Voros, S. G. Ellis, N. Tahirkhelli, R. Waksman, J. McPherson, A. Lansky, M. E. Winn, N. J. Schork, E. J. Topol, and for the PREDICT (Personalized Risk Evaluation and Diagnosis In the Coronary Tree) Investigators. 2010. Multicenter validation of the diagnostic accuracy of a blood-based gene expression test for assessing obstructive coronary artery disease in nondiabetic patients. Annals of Internal Medicine 153(7):425-434.
Shak, S. 2011. Case Study: Oncotype DX Breast Cancer Assay. Presentation at the Workshop of the IOM Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, Washington, DC, March 30.
Simon, R., M. D. Radmacher, K. Dobbin, and L. M. McShane. 2003. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. Journal of the National Cancer Institute 95(1):14-18.
Teutsch, S. M., L. A. Bradley, G. E. Palomaki, J. E. Haddow, M. Piper, N. Calonge, D. Dotson, M. P. Douglas, and A. O. Berg. 2009. The Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Initiative: Methods of the EGAPP Working Group. Genetics in Medicine 11(1):3-14.
van de Vijver, M. J., Y. D. He, L. J. van ‘t Veer, H. Dai, A. A. M. Hart, D. W. Voskuil, G. J. Schreiber, J. L. Peterse, C. Roberts, M. J. Marton, M. Parrish, D. Atsma, A. Witteveen, A. M. Glas, L. Delahaye, T. van der Velde, H. Bartelink, S. Rodenhuis, E. T. Rutgers, S. H. Friend, and R. Bernards. 2002. A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine 347(25):1999-2009.
van ‘t Veer, L. J. 2011. Case study—MammaPrint. Presented at Meeting 2 of the Committee on Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, Washington, DC, March 30.
van ‘t Veer, L. J., H. Dai, M. J. van de Vijver, Y. D. He, A. A. M. Hart, M. Mao, H. L. Peterse, K. van der Kooy, M. J. Marton, A. T. Witteveen, G. J. Schreiber, R. M. Kerkhoven, C. Roberts, P. S. Linsley, R. Bernards, and S. H. Friend. 2002. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530-536.