4


Perspectives of Diagnostic Test and Pharmaceutical Developers

Important Points Highlighted by Individual Speakers

•   There are many sources of IVD performance error—from patient samples to data interpretation and reporting—and establishing clinical validity is just as important as establishing analytical validity for assuring that patients receive the correct therapy.

•   Using robust scientific evidence for determining if benefits outweigh the risks for using a companion diagnostic should be the basis for creating a level playing field for regulating IVDs and LDTs.

•   The implementation of a strong external quality assurance program for IVDs and LDTs is needed as a standard for validating biomarker measurements across laboratories.

•   Studying patients who tested negative for a biomarker but who could have benefited from the associated therapy is important for optimizing patient populations for drugs and for understanding disease biology.

•   Development of test registries to compare test results across multiple laboratory settings could establish stronger links between test performance and clinical outcomes.

•   Several pharmaceutical companies have established internal diagnostic groups for co-development, but the companies have not overlooked the value of collaborating externally with experts.






THERAPEUTIC AND DIAGNOSTIC CO-DEVELOPMENT

To assess the present and future prospects for the use of co-developed companion diagnostics in drug use and development, speakers representing groups involved with test development as well as speakers representing pharmaceutical manufacturing were invited to share their perspectives. A theme that emerged was the importance of obtaining analytical validity, clinical validity, and clinical utility evidence for tests, regardless of how the tests are specifically developed, distributed, or conducted.

Developing the Evidence for Validity and Utility

A number of important shortcomings currently affect clinicians’ use of IVDs. One important issue is that IVDs vary in performance and have many sources of potential error, said Walter Koch, vice president of global research at Roche Molecular Systems. No fewer than a dozen different methods are currently used for mutation detection, he said. Furthermore, tumors are heterogeneous, which raises the possibility that variability in tissue sampling may lead to results based on just a few cells that may not accurately reflect the cellular makeup of the tumor. In addition, reagents used in tests can be variable because of considerable lot-to-lot variation, and in manual analyses data interpretations may vary.

Koch noted that two studies performed in Europe support the notion that procedural steps are not well controlled in some laboratories. In fact, for KRAS testing, only 70 percent of the laboratories accurately reported all of the mutations (Beau-Faller et al., 2011; Bellon et al., 2011; Dequeker et al., 2011).

In addition to the KRAS example above, Koch cited an example from Roche’s clinical trials for Zelboraf, where the cobas® 4800 BRAF V600 Mutation Test was used to identify patients with melanoma tumors harboring the V600E BRAF mutation (Cheng et al., 2012).
While Koch noted that FDA recognizes Sanger sequencing as the “gold standard” for variant detection in the absence of an FDA-approved test, he pointed out that it may be poorly suited for cancer tissue mutation analysis because of its known poor sensitivity for samples containing less than 25 percent mutant alleles, which is frequently the case in cancer (Anderson et al., 2012; Halait et al., 2012). Other potential consequences of relying on Sanger sequencing include invalid results (no results), false negatives (incorrectly identified as wild-type), and false positives (incorrectly identified as BRAF V600E) that may occur more often, as reported by Anderson et al. (2012). The downstream clinical implications of these errors could include inappropriate denial of, or delayed access to, Zelboraf, or patients inappropriately receiving the drug, which may lead to preventable toxicity in addition to poor efficacy. “Today, a lot of laboratories are using [Sanger] technology to do these kinds of mutation analyses,” Koch said. “They are perhaps inappropriate for this use.”
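
The arithmetic behind the concern about low mutant allele content can be illustrated with a short Python sketch. It is a simplified model and not a validated clinical calculation: it assumes a heterozygous mutation, equal ploidy in tumor and normal cells, and no copy number changes. The function names are hypothetical, and the 25 percent figure is the Sanger sensitivity limit cited in the text.

```python
# Illustrative sketch only: a simplified model, not a validated clinical calculation.

SANGER_LIMIT = 0.25  # mutant allele fraction below which Sanger sensitivity is reported to be poor


def mutant_allele_fraction(tumor_cell_fraction: float,
                           mutant_copies: int = 1,
                           ploidy: int = 2) -> float:
    """Expected fraction of mutant alleles in a mixed tumor/normal sample.

    Assumes (hypothetically) that tumor and normal cells share the same ploidy
    and that every tumor cell carries `mutant_copies` mutant alleles.
    """
    if not 0.0 <= tumor_cell_fraction <= 1.0:
        raise ValueError("tumor_cell_fraction must be between 0 and 1")
    if not 0 <= mutant_copies <= ploidy:
        raise ValueError("mutant_copies must be between 0 and ploidy")
    return tumor_cell_fraction * mutant_copies / ploidy


def below_sanger_limit(tumor_cell_fraction: float) -> bool:
    """True when a heterozygous mutation would fall under the 25 percent limit."""
    return mutant_allele_fraction(tumor_cell_fraction) < SANGER_LIMIT


if __name__ == "__main__":
    # Hypothetical tumor cell contents, chosen only to show the trend.
    for tumor_fraction in (0.2, 0.4, 0.6, 0.8):
        fraction = mutant_allele_fraction(tumor_fraction)
        print(f"tumor cells {tumor_fraction:.0%} -> mutant alleles {fraction:.0%}, "
              f"below limit: {below_sanger_limit(tumor_fraction)}")
```

Under these assumptions, any specimen in which fewer than half of the cells are tumor cells would fall below the 25 percent limit, which is consistent with the concern that tissue sampling and tumor cell content can drive false-negative results.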

Even tests using the same technology can produce discrepant results (Gonzalez de Castro et al., 2012). A comparison of the cobas® 4800 BRAF V600 Mutation Test with Therascreen (Qiagen) BRAF RGQ found that the two methods produced different results in tumors with necrosis and low tumor cell content (Longshore et al., 2012). As a result, some patients would be assigned to the wrong category for receiving or not receiving the drug, Koch said.

Response to Potential Solutions

Several stakeholder solutions were proposed to address the current co-development pathway (see Box 1-2), including an American Clinical Laboratory Association proposal that clinical validity should be assured for laboratory tests. Koch examined the standards for clinical validity in the proposed federal legislation, the Modernizing Laboratory Test Standards for Patients Act of 2011 (H.R. 3207), which states, “One or more studies published in a peer-reviewed journal that is generally recognized to be of national scope and reputation, or data from unpublished studies conducted by the submitter or for which the submitter has obtained a right of reference, shall be sufficient to constitute reasonable assurance of the clinical validity of the claimed uses.” Koch suggested that this may not be sufficient to decide on routine use of biomarker testing; rather, replicated studies and more substantial clinical validation should be required.

The proposals made by AdvaMed as well as by FDA reflect a risk-based approach, Koch said. He thought that this solution could be improved upon by answering questions about regulating tests when there is already an existing test and about whether an alternate Class III equivalence mechanism could be used because repeating a clinical trial is not practical.
“All in vitro diagnostics, regardless of where they are made (by a manufacturer or a lab), should be subjected to similar regulatory approaches,” Koch said. “At the end of the day, the same risk–benefit profiles apply to patients when a therapeutic decision is based on that result. So why should they be treated differently?”

Koch disagreed with the recommendations from CAP that “companion analytes” should be defined, because IVDs clearly vary in both analytical and clinical performance. He also disagreed that a single diagnostic prevents further research, as suggested by the American Society of Clinical Oncology (ASCO). “Our own drug company, and the academics that I work with, continue to dig into the complex biology of cancer,” he said. “They are not limited simply by that companion diagnostic.”

Koch said that payment reform is needed “to recognize the value of advanced medical diagnostic tests, their impact on health care, and the resources needed to develop and clinically validate them.” Inadequate payment systems seem to hinder innovation as well as patient access to new tests; however, Koch said, a potential alternative strategy has been suggested by Medicare’s Molecular Diagnostics Services Program at Palmetto GBA,1 which issues reimbursement based on an assessment of levels of analytical and clinical evidence.

Leveling the Playing Field

The co-development process has several benefits, said Pamela Swatkowski, director of regulatory affairs for Abbott Molecular. It provides an opportunity to evaluate the drug and the device in one trial and to select an optimum patient population for a smaller clinical trial. For the pharmaceutical manufacturer, an effective marker can improve the effects of the drug. However, a pharmaceutical company needs to know whether the performance of a diagnostic is robust, and sometimes LDTs may not give that same level of assurance that the correct population has been selected. For diagnostic manufacturers, an effective marker can facilitate use by pharmaceutical manufacturers as well as by clinicians. For example, Swatkowski said, co-development enables new types of diagnostic claims, and patients can be well characterized and receive extensive follow-up and monitoring of outcomes.

A level playing field is needed for all tests to determine device safety and efficacy by answering the basic question of whether “there is enough valid scientific evidence that the benefits outweigh any probable risks,” Swatkowski said. She agreed with ASCO’s point about the challenge of regulatory uncertainty regarding FDA oversight of companion diagnostic LDTs. Having a better understanding of the enforcement discretion of LDTs would be useful for providing evidence for clinical utility and not just analytical performance.

After the test was approved by FDA in 2011, Koch observed that other labs were advertising BRAF tests for Zelboraf use within a short period of time.
However, it was not evident what the performance characteristics of their tests were, what technologies were being used, or how the tests might relate to the FDA-approved test. To Koch, the situation was similar to a drug being approved and having a generic drug available soon after for the same application. In this circumstance “it just doesn’t seem like a level playing field,” he said.

Swatkowski encouraged FDA to work with industry “to define those requirements for development of subsequent assays after the first companion diagnostic is approved, since we all know that samples from the original trial won’t be available.”

1 PalmettoGBA. Homepage. See http://www.palmettogba.com/palmetto/palmetto.nsf/Site Home?ReadForm (accessed October 10, 2013).

Response to Proposed Solutions

With respect to various stakeholder proposals (see Chapter 1 and Box 1-2), Swatkowski highlighted the proposal made by the Coalition for 21st Century Medicine that developers of tests, whether IVDs or LDTs, “need to offer proof of clinical validity in order to obtain coverage and reimbursement,” with “reimbursement based on the performance of the test and the evidence that supports that performance.” Pfeifer’s interpretation of the proposal by the Coalition for 21st Century Medicine is that coverage and reimbursement should be based on performance of each technique. The assumption that a companion diagnostic at some level will provide a level of performance that cannot be matched by LDTs will probably not be borne out, Pfeifer said.

Swatkowski also agreed with the position of AdvaMed that tests should be regulated according to risk, despite the challenges of doing so. For instance, although the CAP proposal emphasizes the analyte used for drug efficacy, defining a test by its analyte does not address the test technology and assay variability among different methodologies. An important point to make, she said, is that an IVD is a system that extends from sample preparation through test generation and bioinformatics to the “algorithms that determine whether a patient is positive or negative, and the cutoff that’s used is really the heart of the IVD device.”

Areas for Consideration

Additional considerations, including financial reimbursement and coding requirements, may need to be addressed in order to improve the current system of IVD use. Swatkowski highlighted several issues related to reimbursement for IVDs, including the need for transparent coding, especially in preparation for next-generation sequencing.
With the current coding system, payers do not necessarily know what they are paying for, Swatkowski said. For example, the code for a test for a particular analyte may not make clear whether the test has been FDA-approved or is an LDT. In addition, differential payments should be considered for clinically validated FDA-approved assays, she said, as is currently done with innovative drugs.

FDA should also consider outlining the requirements for adding additional (i.e., second and third) therapies to the IVD device labeling, Swatkowski said. While the complete dataset from the original clinical trial may not be required, additional statistical testing would be useful to calculate the negative and positive predictive values. Using medical information from consenting patients may be a way to accomplish this so that the demographics of the patient population would be equivalent to those used in the original clinical trial, Swatkowski said. She highlighted several development issues for FDA to consider, including continued joint meetings that include the relevant parts of FDA, pharmaceutical sponsors, and IVD device sponsors.

Next-generation sequencing (NGS) will forcibly change the current landscape of diagnostic testing, Swatkowski said. It will be necessary “to understand how we can analytically validate data that’s generated by these platforms that are equivalent to already cleared or approved” tests based on similar but different technologies. The goal will be to use the analytical data to connect the information to already generated outcome data for clinical utility, Swatkowski said. An important issue for whole genome sequencing will be the selection of suitable reference human genomes for validation purposes. For example, extensive information technology and data storage capabilities to fully analyze complex datasets will be needed, Swatkowski said. Because there are a variety of platforms and sequencing technologies, “any regulatory requirement should have the flexibility to adapt to rapidly changing technology.”

Considerations for IVDs and LDTs

The single aim of Amgen’s companion diagnostics effort is to “accurately identify those patients who can benefit most from therapy,” said Scott Patterson, executive director of medical sciences for Amgen. “Patients who cannot benefit from a particular therapy should not be getting the drug.”

Patterson outlined several implications for this objective. First, he said, false positives or false negatives should be limited, depending on whether the biomarker makes a positive or negative determination. In other words, the “robustness” of a test is critical.
Second, the test must be available in all markets where the therapy will be commercialized and not just within the United States. Third, diagnostic tests should not be used as a means of restricting access to therapeutics. Finally, as others have mentioned during the workshop, efficient testing of multiple biomarkers should be done early in the course of treatment, Patterson said.

Because FDA approval of an IVD provides the desired level of confidence for robustness in a test, Patterson said, if another assay is going to be used, it should meet the same level of evidence. Given the possibility that not all tests for a certain biomarker are equal, determining a patient’s eligibility for a drug by an analyte is only supported if rigorous concordance is established with an IVD that has associated clinical utility. If such an IVD does not exist, then a rigorous analytical concordance equivalent to the appropriate elements of premarket approval validation should be required, Patterson said. Also, if no IVD exists, commutable standards to validate biomarker measurements across individual laboratories are needed. Lastly, ongoing and challenging proficiency testing and external quality assurance (EQA) programs are critical to ensure standards are maintained, whether for approved IVDs or LDTs. LDTs will not go away, he said, but standards need to be high across all laboratories.

Physicians and patients should be educated in order to increase understanding about the quality and status of the test being offered. At the same time, more robust EQA programs are needed in order to produce consistent patient selection along with transparency of results from EQA laboratories (van Krieken et al., 2013). Additionally, Patterson said, there should be an investigation into the utility of testing earlier in the course of treatment for multiple biomarkers, both for conserving samples and for addressing payers’ concerns.

A “test needs the ability to discriminate at a clinical decision point,” Patterson said, with the decision point serving as a cutoff point for identifying and classifying patients. Furthermore, there should be a biological understanding of the biomarker and the cutoff if multiple therapies are to be addressed using the same biomarker. “We really want to try our best to understand the biology behind that biomarker, such that if a cutoff is determined, it will, therefore, be applicable to other therapies in that class,” Patterson said.

In the case of binary test results (i.e., somatic mutation tests), data suggest that greater sensitivity is better because an assay does not provide a yes/no answer, and this is where the cutoff is important, Patterson said. However, variation in the ability of a laboratory to identify mutations, caused by different levels of test sensitivity, may pose a risk to patients.
“Again, it gets back to really having rigorous performance characteristics established for the tests, wherever those tests are being conducted,” Patterson said.

Other assays, such as transcript or FISH assays with continuous variable results, face different variability challenges. The percentage of cells expressing the biomarker and the level of biomarker expression can vary within a sample. As with binary tests, biological plausibility is needed to support the cutoff that was established in the clinical trial outcome data. Even binary tests are unlikely to always identify the same patients, Patterson said, and “continuously variable tests pose greater issues.”

Regarding the individual proposal for financial reimbursement (see Box 1-2), Patterson said that, in principle, a test could be reimbursed along with a drug, but the challenge will come in implementing such an approach. “Will it also stop tests for which the performance characteristics are not as well determined as the IVD being used? Or will all such tests be reimbursed even if their performance characteristics are unknown?” When testing is conducted by a single or limited number of laboratories, the barriers to reimbursement for the test along with a drug appear to be fewer, a fact that may be related to the consistency of results and transparency regarding methodology, said Patterson. Ultimately, the reimbursement logistics associated with distributed testing would need to be taken into account.

If the goal of the combined cost model (see Box 1-2, individual participant submission) for the drug and test is to enforce the use of IVDs, a laboratory is unlikely to use an LDT because it would be paid only a service fee, said Bruce Quinn, senior health policy specialist with Foley Hoag LLP (see Figure 4-1). But payers could not institute such a system unilaterally because that would require that the test be provided for free to the laboratory by the pharmaceutical company or the test manufacturer. The payer and laboratory could work together to provide the LDT, but they would be at a substantial financial disadvantage in doing so, and this would also risk having the laboratory or the pharmaceutical company give free tests to a hospital system in return for using its drug, which would raise potential conflict of interest issues.

Combining payments for tests with payments for therapeutics is an interesting idea, Pfeifer said. In a world of bundled payments, a health care organization may be given a certain amount of money to take care of a patient with cancer. If so, decisions about how to use that money may occur at the local rather than national level. “Each individual institution may have to decide” how to use the allotment, he said.

The combined cost model may not be viable after generic versions of a drug become available and the overhead no longer exists to provide free test kits, especially if the test has to be provided to a large number of patients to find just a few who will benefit from a treatment, said Quinn. There are other ways to enforce the use of an IVD.
For example, in theory, LDTs could be made illegal, or the coding of the test could be reformed to make it clear that an IVD was used. “You can’t [change the coding] today, but that could be constructed in a few months in the coding system,” he said. Enforcing an IVD monopoly would enable a manufacturer to raise the price of a test, leaving pharmaceutical companies and payers without an alternative. This approach does not resolve the challenges of limited tissue specimens and provides no incentives for competition or improved products. However, Quinn said, a more robust CLIA does not resolve return on investment problems for the IVD manufacturers who go through FDA and then find that their approval is followed by the production of similar LDTs.

FIGURE 4-1 Proposed payment model for connecting the costs of co-developed companion diagnostics to those of the related drug.
NOTE: IVD, in vitro diagnostic; MFGR, manufacturer; PHARMA, pharmaceutical company; R&D, research and development.
SOURCE: Quinn, IOM workshop presentation on February 27, 2013.

Test Performance in Use

Once a co-developed drug and a companion diagnostic are approved by FDA, “what are the ramifications as that drug is used in the marketplace?” asked Richard Buller, vice president and head of oncology clinical development at Pfizer. Will clinicians use only the FDA-approved test, or will other LDTs be used preferentially? Demonstrating clinical benefits should be the gold standard for a test, and in that way clinical validity for tests used should be ensured, Buller emphasized.

A test can have either a positive result or a negative result, and a patient can either benefit or not benefit from a drug, Buller said. When a positive test result leads to positive results from the use of a drug, then the patient was correctly selected for treatment. When a negative result points toward a lack of benefit, it is generally the case that a decision will be made to not treat a patient.

What about the false positives, Buller asked, where there is a positive test result but the use of the drug does not lead to clinical benefit? These patients may turn out to be non-responders, or technical issues may have affected the assay or biological sample. These cases of false positives provide an opportunity to understand the biology of the disease. Resistance mutations may have developed during the course of the treatment, or the resistance mutations may have been present originally, in which case those patients may show no improvement and need a different drug, Buller said. He explained that the “reference standard” needs to be a clinical outcome or clinical utility for companion diagnostic development.

With false negatives, it is possible that a patient could have benefitted from the use of a drug despite the difficulties with identification, Buller said, but marker-negative patients need to be tested at some point in the development process to determine if the therapy is of benefit to them.
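
The four combinations of test result and clinical benefit that Buller described form a two-by-two table, from which standard summary measures follow. The sketch below is a minimal Python illustration in which clinical benefit serves as the reference standard; the class name and the patient counts are hypothetical and are not data from the workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutcomeTable:
    """Hypothetical patient counts by test result and observed clinical benefit."""
    pos_benefit: int      # test positive, patient benefited (correctly selected)
    pos_no_benefit: int   # test positive, no benefit (a "false positive")
    neg_benefit: int      # test negative, patient benefited when treated (a "false negative")
    neg_no_benefit: int   # test negative, no benefit

    def positive_predictive_value(self) -> float:
        """Share of test-positive patients who benefited."""
        return self.pos_benefit / (self.pos_benefit + self.pos_no_benefit)

    def negative_predictive_value(self) -> float:
        """Share of test-negative patients who did not benefit."""
        return self.neg_no_benefit / (self.neg_no_benefit + self.neg_benefit)

    def sensitivity(self) -> float:
        """Share of patients who benefited that the test flagged as positive."""
        return self.pos_benefit / (self.pos_benefit + self.neg_benefit)

    def specificity(self) -> float:
        """Share of patients without benefit that the test flagged as negative."""
        return self.neg_no_benefit / (self.neg_no_benefit + self.pos_no_benefit)


if __name__ == "__main__":
    # Made-up counts for a rare marker in 1,000 treated and followed patients.
    table = OutcomeTable(pos_benefit=40, pos_no_benefit=10,
                         neg_benefit=5, neg_no_benefit=945)
    print(f"PPV:         {table.positive_predictive_value():.3f}")
    print(f"NPV:         {table.negative_predictive_value():.3f}")
    print(f"Sensitivity: {table.sensitivity():.3f}")
    print(f"Specificity: {table.specificity():.3f}")
```

The `neg_benefit` cell can be filled in only if marker-negative patients are actually treated and followed, which is why studying test-negative patients matters for estimating how many patients a test misses.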
The magnitude of the problem of false negatives increases with the decreasing prevalence of a disease; when only a small percentage of patients have a marker, a test with a large percentage of false negatives (i.e., one that misses many patients with the marker) would have a major impact. Ultimately the identification of false negatives may create opportunities to more fully understand disease biology.

The performance of tests can vary greatly, even after approval. Buller presented the positive rates of the Abbott Vysis LSI Break Apart FISH Probe Kit, which was used to test for ALK in four different central laboratories following FDA approval of the test and drug. The positive test rates ranged from 2.1 percent to 5.5 percent, Buller said. “There are probably some laboratory testing issues there” that were related to the screening approach and not the assay performance.

Pfizer has a commitment to do post-market evaluation of test-negative patients, Buller said. It also has been supporting method comparison studies across sequential cases, multiple platforms, and multiple countries to see how different tests perform. Pfizer is currently working with Ventana Medical Systems, Inc., to submit a second ALK test to the premarket approval process, and it is working with other central laboratories to understand testing outcomes in the marketplace. In particular, it is looking at patients who have discordant test results in order to improve understanding of the disease biology along with differences in testing.

Test Registries

The use of registries as a test bed for comparative effectiveness research across the spectrum of different diagnostic testing platforms was debated during the ensuing panel discussion. The moderator of the session, Geoffrey Ginsburg, director of the Center for Genomic Medicine at Duke University, specifically queried the four speakers about the potential value of test registries for comparing the efficiency of multiple LDTs.

Buller observed that not just payers but also diagnostic manufacturers and pharmaceutical companies would be interested in a registry because test results and drug use could be linked to test performance and clinical outcomes. Swatkowski noted that a registry could collect information on multiple tests, but in that case it would still be necessary to know the details of the tests in order to make comparisons, such as the technology employed and the cutoffs used. This may present a bioinformatics challenge, she said, noting that “we would have to plan the variables that would be collected in order to make that registry useful.”

Patterson said that a registry is an interesting idea that could reveal how well laboratories are performing tests. A “robust and challenging EQA program” would be another way to achieve that end, he said. Buller suggested that starting with the larger high-volume laboratories would be a good way to see if the approach was useful.

In-House Diagnostic Units

Ginsburg also asked whether pharmaceutical companies are setting up diagnostic units to develop companion tests internally rather than relying on outside companies.
Patterson said that Amgen has decided not to take that approach, because the company works with expert diagnostic companies that can develop the whole range of biomarkers that it needs. “We work on a very broad range of analytes,” he said, “and to cover all those even in one company is very difficult.” However, Amgen does have a department of molecular science that works on biomarker research in early phase trials.

A workshop participant said that Novartis also has an integrated companion diagnostic group.2 In this way, the company could have access to internal expertise on all aspects of IVD development and ensure an integrated approach to co-development. But Novartis also continues to work with external partners, depending on the needs of the individual therapeutic being developed.

Buller said that Pfizer has an integrated group specializing in the diagnostic aspect of co-development, but it chose not to bring a specific technology into the company or to buy a diagnostic company because of the rate at which technology is changing.

Koch said that Roche has both a standalone diagnostics business and a therapeutics business. While its pharmaceutical partners, such as Genentech, do have integrated diagnostics groups, they primarily focus on understanding disease biology both in preclinical and early clinical trials. Buller and Koch both noted that by not having internal diagnostic units, the enterprise has more flexibility to collaborate with the best external groups.

2 Novartis: Our Global Capabilities. See http://www.novartisoncology.com/about-us/our-global-capabilities.jsp (accessed October 10, 2013).