During discussions throughout the workshop, participants commented on the major proposals made by speakers (see Box 7-1). Those discussions are consolidated in this chapter as a way of summarizing the major themes of the workshop.
Throughout the workshop, participants returned to Hayes’ proposal that the LDT pathway be eliminated and all genomic diagnostic tests be reviewed and approved by FDA.
Several participants pointed to the value of the LDT pathway. For example, Conti observed that in some areas of medicine, especially where good publications are available, it may cost far less to develop an LDT than the numbers Siegel cited in her presentation. CLIA provides much value to areas of medical practice, such as infectious diseases, endocrinology, or neurology, that are willing to accept published research as good evidence of treatment improvement. “You can do that very efficiently with respect to capital. Small laboratories can bring up LDTs without having to raise millions of dollars.”
BOX 7-1
Major Proposals Made by Workshop Speakers

• Eliminate laboratory-developed tests and have all genomic diagnostic tests undergo FDA review and approval. (Hayes)
• Base FDA approval on analytical validity and clinical utility, not clinical validity and intended use. (Hayes)
• Consolidate the review of all oncologic products within a single FDA office. (Hayes)
• Base reimbursement on the value of a genomic diagnostic test to patients, payers, and society. (Hayes)
• Clarify the regulatory and reimbursement pathways for genomic test development. (Siegel)
• Preserve physician discretion in ordering, interpreting, and delivering diagnostics, therapies, and other forms of care. (Conti)
• Ensure that agency decision making is transparent, with rulemaking by notice and comment rather than through guidelines. (Conti)
• Standardize the validation of protocols and enhance quality control to improve the efficiency of test development. (van ’t Veer)
• Provide guidance for IRBs on how to review genomic tests. (van ’t Veer)
• Provide opportunities and incentives for guidelines committees and regulatory bodies to harmonize their definitions of clinical utility. (van ’t Veer)
• Reform reimbursement to recognize the value of diagnostic tests, their impact on health care, and the resources needed to develop and validate tests. (Enns)
• Establish reliable and accurate performance standards for new genomic tests. (Enns)
• Compare new genomic tests with traditional practices to establish comparative effectiveness. (Shak)
• Provide ways for patients to get their questions answered by health care providers at the point of clinical decision making. (Gorman)
• Begin test development by discussing the needs of patients rather than how to secure reimbursement for a procedure. (Gorman)
• Use a progressive or adaptive regulatory and reimbursement framework to reflect the accumulation of knowledge and reduction of uncertainty over time. (Tunis)
• Define evidentiary standards through a collaborative process involving regulators, payers, clinicians, patients, and other stakeholders. (Tunis)
• Clarify expectations about the clinical value of technologies to provide criteria for coverage. (Hochheiser)
• Develop standards to accelerate the deployment of genomic tests. (Hochheiser)
• Develop critical reasoning medicine to support coverage decisions even when data are incomplete. (Quinn)
• Develop a public health approach to genetics that evaluates the utility of genomic tests and their impacts at the population level. (Khoury)
• Create incentives for public or private funding of research beyond the discovery phase, for knowledge synthesis and stakeholder convening, and for public and provider education. (Khoury)
• Increase collaboration among and within agencies to enhance the efficiency of regulation. (Gutierrez)
• Create a standard nomenclature and taxonomy to enhance the efficiency of regulation. (Jacques)

Shak said that once test developers have been successful with an LDT, they are eager to produce new tests because of the beneficial effects such tests have on the lives of patients. This is one of the reasons why companies continue to invest in research and development, even when a final test is years away and reimbursement is uncertain. Only a fraction of the possible and promising tests can be developed, so the system should enable more to be developed, not fewer.
With regard to the distinction between CLIA-regulated tests and FDA-approved tests, Shak said that “the devil is in the details.” Either route could yield regulation that is fit for purpose. In that respect, looking at the purpose of a test and then deciding on the proper kind of regulation may be more appropriate than the opposite.
Leonard said that it may be inaccurate to contrast an FDA path with an LDT path because there are many different LDT paths. For example, the path is much different in academia than in industry. “Maybe we need to start talking about different LDT pathways and think about the benefits of each.” van ’t Veer agreed and also observed that the complexity of tests varies greatly. Some require many levels of analysis, data integration, and bioinformatics, while others are relatively simple molecular tests. In addition, tests are used for different purposes, such as diagnosis versus treatment decisions, and regulation could reflect these differences.
An intermediate position proposed by Leonard is that FDA would use risk-based stratification to formally decide which tests can go through an LDT pathway without FDA review. This would be “a better strategy than eliminating the LDT pathway altogether,” said Leonard, since medical care would suffer if the LDT pathway did not exist. The LDT pathway can spur innovation, especially for tests used in low volumes, even if such tests pose difficulties for evidence generation.
Hayes stated that over the course of the day he had come to modify the proposal he set out at the beginning. Perhaps the LDT pathway should still exist, he said, but within FDA, so that a single review process with multiple pathways would exist. This would put more burden on FDA, but it would eliminate the need for many different assessment panels among third-party payers, because they could rely on FDA for review and approval, though
they still would have to set reimbursement levels. This system would be much more similar to how drugs are approved. “Genentech hasn’t to my knowledge run around to every insurance company in the country and gotten approval for Herceptin,” said Hayes. “It all happened because the FDA gave it approval.” Instead, the money now spent by companies on technology assessment could be shifted to FDA to support the extra work needed for the agency to become the single arbiter of whether a diagnostic does or does not have clinical utility.
One problem, Jacques pointed out, is that CMS currently has a very limited budget for technology assessments. Another problem, Hochheiser observed, is that it is very difficult to determine the value of a test. Furthermore, tests must have positive margins, not just value, to be commercially appealing.
Workshop participants also discussed Hayes’ idea of combining FDA offices into a single oncologic office that looks at both diagnostics and therapeutics. Leonard asked how that arrangement would help for diagnostic tests that do not have an accompanying drug. Also, she asked, would every major disease need its own office or standing committee?
Another point raised by Hayes is that reimbursements need to be commensurate with the amount of work needed to develop a diagnostic. Leonard asked how CMS and third-party payers can be convinced to pay more for a test than the cost of performing it. Wylie Burke of the University of Washington, chair of the Roundtable, observed that such evaluations would encompass not only clinical utility but also cost-effectiveness, which is an interesting but radical proposal. Hayes responded that FDA could determine clinical utility while payers conduct analyses of cost-effectiveness.
Burke asked whether a process involving a broader set of stakeholders needs to be developed to evaluate evidence. (This issue is also addressed later in this chapter.) Hochheiser agreed that an effective structure needs to be established but said that no such structure exists today, even though processes may.
Hayes pointed to ODAC as a structure that works. FDA does not have to take ODAC’s advice but usually does. ODAC consists of clinicians, statisticians, patients, and other stakeholders and makes hard decisions, such as whether 3 months of extra survival on average is worthwhile. “There is process and structure to address it in a relatively rational and stakeholder [engaged] way,” said Hayes.
Enns, however, said that he would not want to take a diagnostic test through ODAC and CDER. He is much more comfortable taking products through CDRH. ODAC does not know how diagnostic tests are developed
and how they work, he said. Hayes responded that ODAC could combine CDRH and CDER for oncologic products.
Another major theme of the discussions was the potential for FDA to work more closely with CMS so that decisions about regulatory approval and reimbursement are coordinated. The evidence requirements may not be the same, Leonard pointed out, but test developers would better understand the bars they have to surmount to get approval and then payment.
One problem, said Leonard, is that CMS covers procedures for older populations, but many procedures are aimed at other populations. Also, CMS has different concerns than private payers. Could a private payer group work with both FDA and CMS so that everyone is involved in the regulatory and reimbursement process? A possible incentive to do so is to make payments dependent on participation in such a process.
A conversation between FDA and CMS on reviewing the same evidence could, in some cases, lead to simultaneous regulatory approval and reimbursement, Tunis added. In other cases, it could lead to clarification of the divergence between the two agencies and what CMS is looking for in contrast to regulatory expectations. Parallel review could enable the agencies to clarify for themselves and for the outside world the difference between safe and effective and reasonable and necessary. However, Tunis did not expect greater cooperation to lead to harmonized or identical expectations about evidence because the regulatory expectations usually will be different. Instead, alignment will lead to greater predictability and clarity about how studies need to be designed to address the information needs of the regulators and what additional information is needed for reimbursement decisions and clinical decisions. However, Tunis also observed that it may not be scientifically or economically viable to demonstrate clinical utility for regulatory approval, much less reimbursement. Leonard suggested that NIH may increasingly be willing to consider funding for test validation research and that public-private partnerships also could consider funding evidence development for genomic tests to fill this gap.
Jacques said that once ongoing pilot studies are completed, parallel review will be more formalized and that only a few years should be needed to generate enough experience to develop a framework or guideline for collaboration. At the moment, offers by CMS to collaborate in a review generate “a polite yes, but a somewhat guarded yes. There is nothing that prevents [this] from happening now aside from the reluctance of sponsors to tell . . . FDA we would like you to invite CMS to our meetings,” he said.
The bar for approval at CMS is higher than at FDA, which is one reason why the number of reviews FDA handles is much larger than at CMS,
Gutierrez stated. “Some people don’t have the data or would not want to collect the data, at least at that time, for what it would require to have CMS’s approval. They’re not ready.”
In response to a question about third-party reviews of LDTs, Gutierrez pointed out that FDA does have a third-party review process, though “it has never worked particularly well for diagnostics partly because the expertise hasn’t existed.” But pilot programs are under way, particularly involving third-party inspections conducted in other countries. The agency also recognizes that more expertise is now available, especially for lower-risk devices, and some groups have expressed interest in performing third-party inspections. Jacques added that, while he was unable to discuss details, CMS is open to exploring some of these options.
CMS is exploring the potential to align coverage with evidence development (CED) with FDA’s postmarket requirements. If FDA and CMS could agree with the sponsors of a particular protocol on a way to satisfy both CED and FDA requirements, that would be better than the current system. “In the current system, you have a postmarket requirement and no guarantee that there will be any Medicare funding going to support that,” said Jacques. “It may take forever to accrue that study. If Medicare from day one is essentially saying we’re going to go ahead and pay for the item or service in this particular context, it seems that you would be able to more efficiently address FDA’s issues as well as our issues.”
Collaboration among FDA, CMS, and private payers could facilitate coverage with evidence development or other progressive approval processes for regulation and reimbursement, Leonard said. However, this approach may only work for the LDT pathway given the regulatory and reimbursement systems that exist in the United States. Questions that would have to be answered are how to change the reimbursement level as data are generated, and how to get a test off the market if the evidence does not support its continued use.
van ‘t Veer said that a critical point is to get FDA and CMS to determine the common levels of evidence needed and common strategies of how to get something approved, while also circling back to the people who are developing the test, whether in industry or academia. Different types of tests need different levels of evidence, and these differences need to be integrated into work plans.
Payment decisions need standards that allow for further validation over time, said Leonard. Payers have their own groups that do assessments of evidence, so one question is how all payers could support a single decision.
Also, once a decision is made, how would compliance with that decision be ensured?
In various systems of progressive approval, observed Khoury, there would be continuous collection of information on clinical validity and utility where the stakeholders all agreed to the rules of engagement. As an example, he pointed to whole-genome sequencing. Existing evidence does not necessarily call for whole-genome sequencing, but if the sequence were available, the question could be asked, “What information is actionable in that whole-genome sequence under different clinical scenarios?” Such an approach would direct the conversation rather than forcing it to be reactive. “You can feed different processes that allow you to collect data, get the stakeholders together, fund the research, reimburse some of it, and have tighter controls at the outset.”
Enns briefly mentioned models from other countries. For example, Japan does a simultaneous review of safety, effectiveness, and reimbursement coverage. The process in Japan takes far too long, said Enns, but perhaps it points toward a way for FDA and CMS to work together.
Burke also asked about partnerships that involve not just FDA and CMS but also industry, providers, and patients. What, she asked, are the barriers and incentives to partnering?
Innovative approaches other than CED also could yield valuable evidence, said Leonard. For example, the prospective-retrospective trial design that Genomic Health used for Oncotype DX was an innovative design that worked. For prospective-retrospective designs, specimens from clinical trials need to be archived and clinical data need to be accessible, which adds to the cost of such designs. Also, clinical trial samples can be proprietary when the trials are sponsored by industry. NIH could make it a requirement that samples be archived and available when it funds a clinical trial, as is being done at the National Institute of Diabetes and Digestive and Kidney Diseases.
On the same topic, a participant said that one way to break the vicious cycle of undervalued genomic diagnostic tests is through coverage for field evaluation. That raises the question of when the evidence is strong enough to move to this type of evaluation process. Khoury agreed that such an arrangement is the only effective long-term way to develop genomic tests. “If you get stuck with either the highest level of evidence or nothing at all, genomics will never really come to light.” Whole-genome analysis is an excellent example, because it is not currently useful except in looking for rare and undiagnosed genetic conditions, yet it contains plenty of actionable information.
Hayes pointed to some of the problems with progressive approval. Once a test is being widely used, it is much harder to evaluate, because people already believe either that it should or should not be used, and a
true RCT is much more difficult. Instead, said Hayes, the level of evidence needed for clinical utility should be defined and money should be put into trials to achieve this level. “Let’s get the trials done quickly by not allowing the assay to be available outside the trials, just like we do with drugs. Then we’ll generate much higher levels of evidence much faster. In fact, the entrepreneurs will be rewarded for doing this because the reimbursement will be sufficient for them to do this, and the patients will be better off because we’ll actually know how to use these things faster.” In contrast, allowing an intermediate level of approval risks shutting down innovation “because it’s already there and then it’s harder to test.”
Khoury said that RCTs may or may not be the answer and that information can also come from a variety of sources such as observational studies and modeling. The important thing is to design the rules ahead of time.
Another issue, said Hayes, is whether third-party payers should help fund the clinical trials. In some cases, they may want to be partners in evidence generation, but there has to be value for the third-party payer in the partnership, and partnerships should not be mandated.
Tunis also observed that making regulatory or coverage decisions with less evidence than has been the case in the past implies backloading the evidence requirements, which could increase innovation and economic development. “The only downside is putting the genie back in the bottle,” Tunis said. “If things are going to get into the market earlier and more broadly with less evidence, then on the back end it’s got to be easier to take things off the market. I don’t know how to make that happen from a public acceptability point of view.”
Hayes observed that new drugs cannot be on the market during a randomized trial of that drug. “The assumption is that the new drug must be worthwhile.” Rather, new drugs undergo staged, conditional approvals based on settings. “Perhaps there are ways to do that with biomarkers.”
However, Shak pointed out that the use of some drugs off label by physicians is allowed, which led him to the question of tracking what physicians actually do in practice. “What are the patterns we want to encourage, and what are the ones we want to discourage?” Quinn said that incentives should be in the right direction but that currently the systems to track what happens in practice are weak. Tunis observed that the sophisticated analysis of routinely collected data generated in the course of care could be informative about clinical utility, but the question needs to be asked whether such information will have sufficient reliability to inform decisions. “It goes back to my point . . . about defining evidentiary thresholds and strengths of evidence linked to certain kinds of decision making rather than just let’s collect some information and hope that it happens to be informative. We’ve got to be more thoughtful about what the questions are, what the methods need to be, and then figure out how to do those studies, as opposed to we
happen to have access to this data from claims databases, electronic health records and let’s not bother to do anything else.”
Shak pointed to the example of the Cystic Fibrosis Foundation, which invested in a patient registry and is now feeding back quality metrics to individual centers. “They put that on the web so every family and patient with cystic fibrosis can see and compare their center to others. It really is a very innovative and creative way of empowering patients.” At the same time, survival among cystic fibrosis patients has gone from 28 to 38 years in the past 10 to 15 years.
More broadly, Burke asked how to arrive at what Tunis called a “collective social judgment” regarding the value of a genomic test. Different stakeholders can have different assessments of value. How, she asked, can these differences be bridged?
Tunis observed that the process by which FDA derives regulatory guidance is one example of how to arrive at a collective social judgment, since it is an iterative public process in which there is a push and pull among stakeholders that occurs through a transparent process. Khoury also pointed to the experience with EGAPP, which was based on the model used by USPSTF for clinical preventive services. EGAPP developed methods and published evidence-based guidelines as well as recommendations and systematic reviews. It received pushback from some stakeholders, but Khoury said that the pushback amounted to shooting the messenger rather than the message. EGAPP is now modernizing its approaches to incorporate rapid evaluations and decision modeling so that it becomes more “nimble.” One question is the extent to which stakeholders should be involved or the extent to which EGAPP should be independent.
Tunis also said that public–private partnerships could offer a forum for stakeholders to talk about a wide range of issues, including integrating payer and regulatory requirements and evidentiary thresholds. One example is the Foundation for the National Institutes of Health, which has been working on the validation of individual biomarkers. But even this consortium determined that setting evidentiary thresholds was beyond its scope. Partnerships may have value, but there may not be a marketplace demand or a business model to support such work. In that case, said Burke, perhaps its value to the full range of stakeholders needs to be articulated.
Leonard asked about the objectives of partnerships. Would they do evidence-based reviews for tests on the market, which are being done by AHRQ and other groups? Or would they decide whether tests are medically useful and whether they should be paid for or whether there should be
coverage with evidence development? Even if a group did that, who would pay attention to its recommendations?
Shak suggested that arriving at a collective social judgment may be a two-step process. In the first stage, there would be a dispassionate collection of evidence with transparency about what is known and what is not known. Phase two would then determine whether the benefits of a test outweigh the risks. It will not be possible to get 100 percent agreement on this second phase, he acknowledged. Rather, it will require having many perspectives at the table that can hash out differences and arrive at an assessment. He suggested that professional societies could serve in this role of convening advocates.
Hayes, however, observed that professional societies have perceived conflicts of interest and also that societies would be overwhelmed by the amount of work that needs to be done. Shak countered that the convening function could be structured to be open and transparent and avoid these conflicts. The societies could provide lead areas of expertise as medicine becomes more complicated.
Tunis agreed with the advantages of a two-step process but wondered who could bring together the many different stakeholders involved, from insurance companies to patient advocacy organizations to medical specialty societies. He also worried that such a process might sound like the creation of entities to determine effectiveness, value, availability, and price, which “sounds a whole lot like a rationing body.”
Jacques pointed out that the inherent problem is trying to fund innovation using an insurance paradigm, which is inevitably reactive. An alternative model might be the one used by the Department of Defense, which specifies the performance characteristics of what is needed and determines how much it will spend to support the development of a product.
Khoury said that if the system were being reinvented, the most important component would be the convening of the stakeholder space. “You need the rules of engagement. You need a continuous process of knowledge synthesis so that you can inform the research enterprise. We need investors in that research enterprise. Then we need that space by which validated technology moves into practice in a way that makes sense.” Billions of dollars are now being spent to make new discoveries. The additional expense of doing knowledge synthesis would not be that great, and without such a mechanism, the money spent on basic science discovery will not result in better health outcomes.
In his concluding remarks at the workshop, Robert McCormack of Veridex observed that the workshop brought to light an unprecedented amount of information. In particular, he called attention to the importance of clarity. “The sheer confusion over the number of guideline groups that exist today, and the fact that they all don’t have the same bar or standard, makes it very confusing for manufacturers. What makes it worse is that the playing field is always changing, and you don’t know it’s changing until it’s already changed.”
The legacy of the workshop is not what was said but what will happen once it is over, said McCormack. “It’s incumbent upon us now to identify those next steps . . . and to start putting into place some of those mechanisms to chip away and resolve some of these issues.”