Important Points Highlighted by the Individual Speakers
• A progressive or adaptive regulatory and reimbursement framework, in which initial approval is conditional upon further study, has many advantages over a binary approval model.
• The full range of stakeholders should collaborate to develop standards defining evidentiary thresholds.
• Critical reasoning medicine can support coverage decisions when the data to make these decisions are incomplete.
In a session featuring individuals with experience in health care payments, speakers discussed the issues that arise in making coverage and reimbursement decisions. A repeated theme of their presentations was that binary decisions to approve or disapprove a genomic diagnostic test are not in keeping with the nature of the evidence. A better approach is to make decisions in stages as more evidence becomes available, suggested participants. Rigorous standards can help in implementing these kinds of progressive approval and reimbursement systems.
There is an inherent tension between level of certainty about risks and benefits and early access to new technologies or innovation, said Sean Tunis of the Center for Medical Technology Policy. The higher the level of assurance needed that a patient will benefit from a genomic technology, the greater the burden on gaining the evidence to provide that certainty, which puts downward pressure on innovation. Similarly, the more emphasis that is placed on reducing health care costs, the greater the downward pressure on economic growth and jobs. “It’s not intuitively obvious what the optimal balance of innovation and certainty is that maximizes public health over time,” said Tunis.
Tunis focused on two related barriers to the development of clinically useful genomic diagnostic tests:
• Regulatory and reimbursement decisions rely on a binary model of approval.
• Evidentiary thresholds for regulatory and reimbursement decisions are poorly defined.
Today, regulatory and reimbursement decisions are made as if there were a “magic point” at which suddenly something is true where previously it was false, said Tunis. “We pretend that evidence is kind of an ordinal property as opposed to a continuous function, but that’s obviously not true.”
A much better model, said Tunis, is a progressive or adaptive regulatory and reimbursement framework. In this case, approval or disapproval decisions are not made at a particular time; rather, they are made progressively over time. Coverage with evidence development (CED) and managed entry schemes are examples of such models for reimbursement with initial approval conditional upon further study. Accelerated approval would be an example of a progressive regulatory model. “Having single yes/no decisions over time is just too crude an approach,” said Tunis. “If we’re going to solve this problem with technologies generally, and certainly with diagnostics, we need to think about our regulatory decision making in a way that’s more compatible with the accumulation of knowledge and the reduction of uncertainty over time.”
Decision making today is not predictable due to a lack of clarity regarding the regulatory and reimbursement pathways, Tunis said, reiterating Siegel’s remarks. What is needed is a collaboration involving regulators, payers, clinicians, patients, and other stakeholders to define what the evidentiary thresholds should be. This cannot be done at a generic level but rather must be fit for purpose. The evidentiary threshold will need to be defined in a way that is specific to indications and therapies. “Our current regulatory or reimbursement policy framework is not aligned with the nature of evidence and the accumulation of knowledge over time. Until it is, we’re going to have a very inefficient system,” said Tunis.
Importantly, no new regulatory or statutory authority is needed to take these steps, according to Tunis. FDA and CMS have implemented progressive systems and could institute similar systems in the future.
Effectiveness and Guidance
The Secretary’s Advisory Committee on Genetics, Health, and Society (SACGHS) stated that “information on clinical utility is critical for managing patients, developing professional guidelines, and making coverage decisions” (SACGHS, 2008). This group recommended that the Department of Health and Human Services create “a public-private entity of stakeholders to … establish evidentiary standards and levels of certainty required for different situations.”
Tunis briefly described some work being done at his center that follows up on these recommendations. He and his colleagues are creating documents called effectiveness guidance documents that are analogous to FDA’s regulatory guidance documents. However, instead of describing how to design studies in specific therapeutic areas to meet regulatory requirements, the effectiveness guidance documents reflect the information needs of patients, clinicians, and payers. Such guidance is designed to be complementary to regulatory guidance.
The development of effectiveness guidance documents starts with systematic reviews that identify deficiencies in the existing evidence base, Tunis said. Content experts generate initial draft recommendations, which are refined by a technical working group. The revised recommendations are discussed at a multidisciplinary methods symposium, which brings together various stakeholders for public comment, after which the recommendations are finalized and posted. As a specific example, Tunis cited the following draft recommendation: “Valid outcomes or surrogates for breast cancer prognosis include distant recurrence at 5 or 10 years, disease-free survival, disease-specific mortality, and overall survival.” Whether this strikes the proper balance between innovation and certainty would have to be determined, but the appropriate response is to adjust the threshold and not give up on the process of coming to a consensus.
“We can’t move forward without some kind of mechanism to get everybody on the same page in terms of the minimally acceptable level of certainty for making these regulatory and reimbursement decisions,” said Tunis. “It’s not a property of evidence. It’s a property of collective social judgment, so you need a collective social process to define what these thresholds are.”
Louis Hochheiser of Humana said that standards for determining what should and should not be covered by insurance companies would greatly benefit their decision-making process. “It’s an enormous task. We have such great difficulty.”
Humana provides coverage for 11.5 million people and tries to create policies to cover them in a rational way. Its primary focus is on patients and improving health outcomes, said Hochheiser. This requires both education for providers and the public and the pursuit of cost-effectiveness. Humana preauthorizes its genomic tests and, Hochheiser said, finds that 20 to 25 percent of the tests are ordered inappropriately, a major problem that needs to be addressed.
Decisions need to reflect acceptance from all stakeholders, including physicians, patients, diagnostic companies, payers, regulators, pharmaceutical companies, and policy makers. Humana wants to be part of that decision making, said Hochheiser. “We don’t want to drive it. In fact, we find ourselves driving it now when we don’t want to be driving. But we do want to be in the room to talk about it and give a perspective of what it’s like to be responsible for a large portion of our population.”
Clarifying Expectations Through Standards
CMS and other payers need to clarify standards around developing technologies, Hochheiser said. This would clarify criteria for coverage, both for payers and for the developers of tests. Today, each payer has teams of people who are evaluating new technologies in order to enable decisions regarding coverage. “We are spending millions of dollars a year [on evaluation] that could be going into developing appropriate testing [because] we don’t have a system, we don’t have a set of standards to go by.”
Standards would allow for more rapid deployment of genomic tests rather than waiting for peer-reviewed publications. They also would allow payers to make consistent decisions and would permit uniformity between CMS policy and that of commercial payers. “4.5 million of our 11.5 million [covered individuals] are CMS recipients … and yet we can have different rules,” said Hochheiser.
A system of standards should allow for continuing validation over time. For example, Humana was the first adopter of Oncotype DX in the United States, but it has not stopped at just the coverage decision. The company has continued to study the population that receives the test. It has found that 15 percent of the women with a low recurrence score, indicating negligible benefit from chemotherapy, choose to have it anyway. Half of the people with recurrence scores in the middle choose to have chemotherapy and the other half choose not to. Of the women with a high recurrence score, 15 percent choose not to have chemotherapy; that population needs to be studied as well, said Hochheiser.
Humana shares the responsibility of clarifying the value of clinical technologies. It has a population of people that it can follow over time, and it is working with researchers to study the effects of interventions in this population. The company is willing to consider new coverage models, such as CED, when long-term studies are not practical. It is also willing to meet with test developers and participate in the development process. At the same time, it needs to have a reasonable price point. “If Humana … covers every test, we soon will have a product that we can’t sell competitively with the other insurers, which is why we need a level playing field where everybody knows that everybody is playing on that same field.”
Humana also is interested in working collaboratively to minimize errors. When a provider seeks a preauthorization, education materials are available for that person. The company also works with an outside business to provide genetic counselors.
“It’s absolute need for standards, collaboration at multiple levels, [and a] coordinated approach through all the different stakeholders. [Genomic diagnostic tests] have too much potential impact for us not to do something about it and do it now,” said Hochheiser.
Finally, Hochheiser asked whether venture capital is the best model for developing genomic tests that can make a huge difference in the health of patients. Health care reform, he pointed out, is taxing all insurers on their premiums. “Shouldn’t some of that money be directed toward the innovations that we need in health care to make progress?”
Bruce Quinn of Foley Hoag provided a more theoretical perspective on the utility and adoption of genomic tests. Evidence-based medicine, as it is traditionally approached, can generate anomalies, he said. For example, a 2005 report by the U.S. Preventive Services Task Force (USPSTF) labels the association between mutations in the BRCA gene and breast cancer only “fair,” said Quinn, despite significant research showing that patients who harbor BRCA mutations are at an increased risk of developing cancer (USPSTF, 2005). But because there are no randomized controlled trials (RCTs) of BRCA mutations, evidence of a causal relationship is not strong.
RCTs are designed to distinguish between correlation and causation, Quinn said. For example, high troponin levels are highly correlated with having a heart attack (Thygesen et al., 2010), but giving troponin to people does not cause a heart attack, according to Quinn. Diagnostic tests, for their part, are useful because they indicate a reliable correlation.
Analytical validity, clinical validity, and clinical utility have limited usefulness, Quinn said. He used a book as an analogy. At one level, a book consists of ink, paper, and glue. At the next level, it consists of words, grammar, and a language. At the next level, it has content, meaning, and some measure of usefulness. But the usefulness of a book cannot be determined by studying its ink, paper, and glue. In the same way, a gene test with lower analytical validity may have a better correlation to a clinical outcome than another test for the same gene with higher analytical validity. The same is true for clinical validity, said Quinn, citing the differences in usefulness of hypothetically similar results between PSA testing and use of the Oncotype DX assay in predicting cancer recurrence. There is only a distant relationship between analytical validity, clinical validity, and clinical utility.
Tests often transform a question that cannot be answered into a question that can be answered, said Quinn. For example, the question “do we need to switch your HIV drug” is transformed to “is your HIV RNA count rising?” The key is the correlation between the answer the test provides and the question that needs to be answered.
There are two kinds of true statements, Quinn observed. The first are statements about things, like “this is a rock” or “you have leukemia.” The second are statements about relationships, like “there are 10 dimes in a dollar” or “high troponin levels are associated with heart attacks.” Clinical decision making deals with both kinds of statements. There are general medical rules consisting of principles, facts, and conclusions drawn from evidence, and there are specific statements about a patient. Evidence-based medicine provides the backing for certain conclusions. The problem, said Quinn, is that medical science is very hard and requires considerable thought and expertise. Some evidence-based medicine may not add value when it is done in an unthinking or brute-force way.
An alternative model that Quinn mentioned is critical reasoning medicine, which combines the ideas of “we can believe this” with “we should do this.” Specific patient facts are combined with clinical rules and knowledge. In turn, this reasoning can be used to support coverage decisions, which separately take into account funds, priorities, and available alternatives, even if complete data are never available when a coverage decision is made.