3
Practical Incentives and Barriers to Translation

TRANSLATING MEDICAL INNOVATIONS WITH APPROPRIATE EVIDENCE

Sean Tunis, M.D., M.Sc.

Center for Medical Technology Policy


There is an important tension between innovation and the opportunity for post-market learning or evidence, and the risks and benefits of this tension need to be better understood. Post-market learning can occur only in an environment where payers have started to pay for something with presumably less evidence than they might have wanted. The idea that we can encourage post-market learning in an environment where the evidence requirements are becoming more rigorous is likely to change dramatically.

To be covered by Medicare, an item or service must be determined to be reasonable and necessary. The working definition for reasonable and necessary is that there is adequate evidence to conclude that the item or service improves net health outcomes, is generalizable to the Medicare population, and is as good as or better than currently covered alternatives. The key question here is, what constitutes adequate evidence? Unfortunately, the evidentiary bar is not well defined, which is part of the reason for the tension between innovation and evidence. From the point of view of a company developing products that it wishes to bring to market and for which it hopes to be reimbursed, the evidentiary target is fuzzy.

The approach that Medicare takes to determine whether adequate
evidence exists for the reasonableness and necessity of diagnostic tests has two main components. First, the evidence must be adequate to determine whether the test provides more accurate diagnostic information than existing tests. Second, if the test provides more accuracy, the evidence must be adequate to determine how the changed accuracy affects health outcomes. For example, does it change patient management, and do those changes in patient management actually improve outcomes?

At the request of the Centers for Medicare and Medicaid Services (CMS), experts at Duke University conducted an evidence review on the use of positron emission tomography (PET) in scanning for Alzheimer’s disease. The review concluded that there was adequate evidence to conclude that PET scanning has better sensitivity and specificity than clinical evaluation by an expert neurologist. The experts also constructed a decision model which determined that because available treatments had very limited efficacy and were relatively safe (i.e., treatments for dementia are basically nontoxic and not very effective), the new diagnostic information available from the PET scans had essentially no effect on patient management; that is, it would not change patient outcomes. Furthermore, the small false-negative rate of PET scans might lead to withholding treatment and might lead to worse outcomes than empirically treating anyone with a clinical diagnosis of dementia.

In light of this, Medicare policy is to not cover PET scans for Alzheimer’s disease except in the context of a prospective clinical trial that would evaluate whether the information from PET scans changed patient management in other important ways. Medicare agreed to cover tests in the context of such a study, and a proposal for such a study was developed by scientists at the University of California at Los Angeles and submitted to the National Institutes of Health, but it was never funded.
So the practical effect of this policy is that Medicare does not pay for any PET scanning for Alzheimer’s disease.

What do other payers need in order to determine whether to pay for a diagnostic test? One major private payer’s policy on the clinical utility of ambulatory electrocardiograms (ECG) is that ambulatory ECG is considered experimental and investigational because of the lack of peer-reviewed published reports of prospective clinical trials on the effectiveness of the distinct features of ECG in improving clinical outcomes over standard cardiac event-monitoring services. What this payer is saying, in other words, is that in order to qualify the service for reimbursement one would need to conduct a prospective study of ambulatory ECG versus Holter monitoring, and the results would need to demonstrate that some important clinical outcome was changed as a result of the use of ambulatory ECG. Such a study has never been done, Tunis said, and it is unlikely that any company manufacturing ambulatory ECGs will ever conduct such a study. Thus, while such evidence requirements may be desirable from an evidence-based medicine perspective, they may not be a feasible evidence threshold to use as a condition for reimbursement.

Another example of evidentiary review is the assessment of retrospective gene-expression profiling for breast cancer that was conducted in October 2006 by the California Technology Assessment Foundation. That assessment found that the predictive accuracy of Oncotype Dx is high for recurrence (although the test was never compared to standard risk-assessment tools) and that the National Surgical Adjuvant Breast and Bowel Project (NSABP) Protocol B-14 showed that low-risk patients randomized to chemotherapy and followed for 10 years did no better than did those who did not undergo chemotherapy. In this case, there were 10 years of frozen specimens that could be used in the study. The TAILORx[1] and MINDACT[2] trials (10,000 and 6,000 patients, respectively) are now under way.

For Oncotype Dx, the California contractor for Medicare initially issued a draft decision for non-coverage, which was reversed because of strong feedback from clinical oncologists disagreeing with the draft decision. Furthermore, administrative law judges were reversing denials of payment when the issue was brought before them. The important point is that there is no national policy from Medicare about how much evidence CMS will consider adequate to conclude that there is clinical utility for this test or any molecular diagnostic.

Different payers have different evidence requirements for what is sufficient to determine the clinical utility of a diagnostic test, which makes it very difficult for a company developing a product to know how to design its clinical research portfolio. Payers, physicians, and patients are demanding more evidence on comparative effectiveness and value.
Yet the evidence requirements for coverage are poorly defined, inconsistent, and, in some cases, not feasible.

Furthermore, there is a major problem created by the fact that reimbursement and regulatory evidence requirements are not aligned with one another. There is frequently a mismatch between what the payers would like to know and what the regulators would like to know. This means that even with regulatory approval there is no certainty that reimbursement approval will be forthcoming. The next major area of contention may well be the evidentiary framework of the payers concerning molecular diagnostic tests.

[1] “The Trial Assigning IndividuaLized Options for Treatment (Rx), or TAILORx, will examine whether genes that are frequently associated with risk of recurrence for women with early-stage breast cancer can be used to assign patients to the most appropriate and effective treatment” (National Cancer Institute. http://www.cancer.gov/clinicaltrials/digestpage/TAILORx, accessed January 22, 2008).

[2] “Microarray for Node-Negative Disease may Avoid Chemotherapy (MINDACT) was originally designed to compare the ability of a 70-gene prognostic profile versus clinical and pathological criteria to identify women with node-negative breast cancer who are unlikely to benefit from adjuvant chemotherapy” (Tuma, R. S. 2005. Trial and error: Prognostic gene signature study design altered. Journal of the National Cancer Institute 97(5):331-333).

The Center for Medical Technology Policy (CMTP), a private, nonprofit corporation, has begun work on issues relevant to the discussion of creating high-quality evidence of clinical effectiveness[3] and clinical utility from the perspective of decision makers, that is, from the perspective of payers, clinicians, and patients. For the past two years funding for CMTP has come primarily from foundations, but it is now changing to a membership-funded model with health plan and life science company memberships. The primary mission of CMTP is to support collaborative activities among stakeholders that will improve the quality and efficiency of prospective studies of new medical technologies.

One of CMTP’s projects is to create coverage-guidance documents that will provide a clear, well-defined, and consistent target for what evidence is necessary to demonstrate clinical effectiveness and clinical utility across a broad range of technologies. The primary audience for these documents is product developers. The documents are analogous to FDA guidance documents, but FDA guidance documents articulate evidence requirements for regulatory approval. The idea behind the CMTP documents is that they should serve as companion documents to these FDA documents and articulate the specific evidence requirements for reimbursement or for coverage. The purpose of the documents is to reduce uncertainty, increase consistency, and incorporate a notion of feasibility.

To develop these coverage-guidance documents CMTP will work with multiple stakeholder workgroups (for example, payers, product developers, clinical organizations, and patient groups) to determine what the evidence requirements should be.
Then there will be a web-based, iterative, public-comment process on the draft documents. The first effort being undertaken is to develop an evidence-guidance document on gene-expression profiling for breast cancer. The next topic will be wound-healing interventions.

Tunis concluded by saying that the evidence requirements for clinical utility, clinical effectiveness, and comparative effectiveness[4] should be well defined, and not just by payers or by evidence-based medicine experts. They need to be defined in a collaborative way that involves the perspectives of people who understand what it takes to develop these products as well as the perspectives of the patient and the clinician.

[3] Clinical effectiveness is defined as “the extent to which specific clinical interventions when deployed in the field for a particular patient or population do what they are intended to do, that is, maintain and improve health and secure the greatest possible health gain from the available resources” (NHSE, 1996).

[4] Comparative effectiveness is the “comparison of one diagnostic or treatment option to one or more others. . . . Primary comparative effectiveness research involves the direct generation of clinical information on the relative merits or outcomes of one intervention in comparison to one or more others” (Buckley, 2007).

ASSESSING TECHNOLOGY FOR USE IN HEALTH AND MEDICINE

Naomi Aronson, Ph.D.

Blue Cross and Blue Shield Association


In examining evidence it is important to distinguish between three kinds of policy, Aronson said: medical policy, coverage policy, and payment policy. Medical policy is based on scientific evidence and does not consider costs or coverage issues. Technology assessment is used in the support of medical policy. Medical policy essentially operationalizes two health plan contract provisions, those describing what an investigational service is and those describing what a medically necessary service is. Coverage policy, by contrast, is determined through a contract with purchasers of health plan policies; these purchasers are largely employers. In designing the benefits, their cost-effectiveness may be considered. The clearest example of this is with drug benefits, where decisions include such factors as cost-equivalent substitutability. Finally, there is payment policy, which is the contract between health plans and medical professionals and providers.

The 39 independent Blue Cross and Blue Shield (BC/BS) plans around the country make their own coverage decisions, but they do look to the Technology Evaluation Center (TEC) at the Blue Cross and Blue Shield Association (BC/BSA) for evidence-based analysis because that is the basis for their coverage decisions. With 100 million total members in the 39 independent plans, BC/BS covers one in three Americans.

The purpose of the Technology Evaluation Center (www.bcbs.com/tech) is to provide rigorous assessment of clinical evidence. The TEC is staffed by physicians, epidemiologists, research scientists, medical librarians, and pharmacists, who are employees of the BC/BSA.
Nothing is released as a TEC assessment until it has been approved by an independent expert medical advisory panel under whose authority the TEC operates. The advisory panel is composed of academic clinical researchers and specialty society appointees, including an appointee from the American College of Medical Genetics, an association that was recently added because of the complexity and importance of the area of medical genetics. Only 4 of the 17 votes on the panel are allotted to plan clinicians, which is an important point and emphasizes the independence and scientific rigor of the process. The staff presents its analysis for the panel to decide whether the technology under consideration improves health outcomes. Does it improve length of life, quality of life, or the ability to function? If the panel judges the evidence as supportive of improvement, the report is approved.

During the past three years, the TEC has conducted more than 300 technology assessments, all of which can be viewed at www.bcbs.com/tech. The TEC takes the position that the process of technology assessment should be transparent so that stakeholders can understand the level of evidence used. While the TEC cannot consult with companies, its staff will hold teleconferences with any company that wishes to understand better how the TEC might approach the evidence concerning the company’s technology. The TEC is also an evidence-based practice center for the Agency for Healthcare Research and Quality.

Interest in genomics at the TEC is not new, Aronson said. Ten years ago the TEC assessed BRCA-1 and BRCA-2, found that they met the TEC criteria, and recommended that they should be offered and accompanied by genetic counseling. During the past year the TEC has placed a strong focus on genomics because it understands that genomics is an area that is rapidly evolving and can be somewhat confusing and intimidating to the average clinician.

There are two roles that the TEC can play in genomics. The first is to assess specific currently emerging technologies. The second is “horizon scanning,” that is, looking ahead to see what important technologies are approaching. Assessments of specific emerging technologies have included gene-expression profiling of breast cancer and genetic testing for long-QT syndrome (LQTS). Horizon scanning has examined cardiovascular pharmacogenomics, cancer pharmacogenomics, and genomics of neurologic disorders.

The focus of the TEC is on patient-outcome efficacy, that is, improved health.
This is compatible with the ACCE[5] evaluation model and the framework developed by the Centers for Disease Control and Prevention.

If one understands clinical validity,[6] how can the case for clinical utility be made? When can it be made from inference? When does it need to be directly demonstrated? In an ideal world one would always have direct evidence for clinical utility. Randomized controlled trials (RCTs) are expensive, however, and are typically not the norm in the area of diagnostic testing. The reality is that the case for clinical utility generally relies heavily on indirect evidence, using a causal chain of logic, inference, and linkage of various bodies of literature, from the diagnostic performance of the test to the effect on patient management and, ultimately, to the effect on health outcomes. Bona fide health outcomes for diagnostic tests include avoidance of other tests and avoidance of an invasive procedure.

[5] “ACCE, which takes its name from the four components of evaluation—analytic validity, clinical validity, clinical utility and associated ethical, legal and social implications—is a model process for evaluating data on emerging genetic tests” (http://www.cdc.gov/genomics/gtesting/ACCE.htm, accessed January 24, 2008).

[6] “The clinical validity of a genetic test defines its ability to detect or predict the associated disorder (phenotype)” (http://www.cdc.gov/genomics/gtesting/ACCE.htm, accessed January 24, 2008).

The following example illustrates how the TEC puts together an assessment. Computed tomographic angiography (CTA) has been proposed as a noninvasive alternative to invasive coronary angiography for the evaluation of coronary artery disease. When the findings on CTA are negative, invasive angiography is not necessary, but results showing significant stenosis (positive CTA findings) need to be confirmed by invasive angiography. Comparing the health outcomes of the two technologies involves considering such factors as the number of catheterizations avoided, the risks and the effects of a false-negative CTA, the effects of added radiation (since CTA involves a substantially higher dose of radiation), and the effects of extracardiac findings. The TEC assessment found that the evidence is insufficient to draw conclusions about the effect of CTA on health outcomes. Therefore, CTA does not meet the TEC criterion that requires being able to draw conclusions concerning the effect of the technology on health outcomes.

Another example involves the assessment of genetic testing versus the use of clinical criteria for identifying people with LQTS, a condition that marks individuals as being at risk for lethal arrhythmias. Such individuals are typically under age 40 and usually have a family history of the condition.
The TEC assessment found that the genetic test is accurate in identifying the presence of a mutation but that the diagnostic accuracy for identifying LQTS is not clear because there is no true gold standard for clinical diagnosis of LQTS. The assessment did find, however, that genetic testing identifies more individuals who may have LQTS than are identified through clinical diagnosis alone. The opinion of the medical advisory panel was that there is value in uncovering additional information because LQTS is an underdiagnosed condition. It is treatable with beta blockers, but if LQTS is not identified it can have catastrophic results. In this situation, it is not possible to conduct the kind of quantitative modeling of health outcomes that was done with CTA. However, a qualitative analysis showed that the genetic test had the potential to identify more patients with LQTS, who would then receive low-risk treatment with beta blockers, thereby forestalling the potential catastrophe of untreated disease.

There are many associations in genomics, and more information is rapidly becoming available. However, the relationship between evidence and clinical validity is not well defined. In the previous example one could infer that the genetic test for LQTS would improve health outcomes, but when is direct evidence needed?
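
The quantitative modeling mentioned above for CTA can be pictured as a chain from test accuracy to management consequences. The sketch below only illustrates that chain and is not the TEC's model: the function name, the cohort size, and every input value (prevalence, sensitivity, specificity) are hypothetical placeholders rather than figures from the assessment.

```python
def triage_counts(cohort: int, prevalence: float,
                  sensitivity: float, specificity: float) -> dict:
    """Split a cohort by noninvasive test result.

    Follows the rule described in the text: a negative result means no
    invasive angiography, a positive result is confirmed invasively.
    """
    diseased = cohort * prevalence
    healthy = cohort - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    true_neg = healthy * specificity
    false_pos = healthy - true_neg
    return {
        # Every negative result avoids the invasive procedure.
        "invasive_angiograms_avoided": true_neg + false_neg,
        # Diseased patients wrongly reassured by a negative result.
        "missed_disease": false_neg,
        # Every positive result goes on to invasive confirmation.
        "confirmatory_angiograms": true_pos + false_pos,
        # Healthy patients sent to confirmation by a false positive.
        "unnecessary_confirmations": false_pos,
    }


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
counts = triage_counts(cohort=1000, prevalence=0.20,
                       sensitivity=0.95, specificity=0.80)
for name, value in counts.items():
    print(f"{name}: {value:.0f}")
```

With these made-up inputs, 650 of 1,000 patients would avoid invasive angiography, 10 of whom have disease that the noninvasive test missed. A fuller comparison would also have to weigh the added radiation and the incidental findings noted in the text, which this sketch leaves out.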

One example of the need for direct evidence can be found in the area of lung cancer screening. Lung cancer screening is controversial because it is unclear whether there is any value to early detection. Improved accuracy in detection must be viewed cautiously because of the potential for lead-time bias,[7] length bias,[8] and overdiagnosis bias.[9] In light of these complexities, the National Cancer Institute is carrying out the National Lung Cancer Screening Trial in order to address the question of whether there is value in using spiral computed tomography (spiral CT) for the early detection of disease. Approximately 50,000 patients, both smokers and former smokers, are participating in the trial. They were randomized to receive either spiral CT or X-rays.

This situation can be compared to some of the controversies surrounding the use of genotyping in the decision whether to initiate warfarin dosing. It is not easy to compare genotyping to a reference standard. There are many intervening variables that contribute to warfarin dosing, and there is a narrow window between an effective therapeutic dose that prevents clotting and a too-high dose that leads to bleeding. For these reasons, one needs direct evidence, and that can only be accumulated with prospective trials of dosing algorithms to compare personalized warfarin starting doses with standard dosing in terms of bleeding outcomes. A small trial has been conducted, but the results were not encouraging. With a correct algorithm, however, there may be definitive results. Several trials are currently under way to find out.

While evidence of clinical effectiveness is the cornerstone of the BC/BS plans’ medical and coverage policies, cost-effectiveness and affordability are also pressing issues. Every health plan in business is operating under state regulators.
State regulators may have slight differences in their investigational and medical-necessity language, but those differences are not substantial. The key point is that there is no contract language that allows payers to use cost-effectiveness as a standard or a criterion for coverage. The medical-necessity language does, however, specify that more will not be paid in order to achieve the same results.

The difficulty is that there is no contract language that addresses those situations where there are only incremental benefits compared to costs, and there are many new technologies that have such incremental benefits compared to their costs. The TEC has produced some cost-effectiveness analyses[10] of technologies that met the criteria of improved outcomes, but because there are no clear cost-effectiveness thresholds that are scientifically prescribed, cost-effectiveness analysis is not a solution to the affordability problem. Just because something is of value, does that mean it is affordable?

[7] Lead-time bias means that there is a longer time between diagnosis and death, even though death is not delayed.

[8] Length bias means that slower-growing tumors are more likely to be detected, which biases the resulting data to imply a better prognosis than the actual prognosis.

[9] Overdiagnosis bias is when screening detects cancer that would not, within the lifetime of the individual, have developed into disease.

The affordability problem is real and there is a substantial gap between health care insurance premiums and workers’ wages and inflation. Employers, who are the main source of insurance in this country, are dealing with this problem by shifting costs onto employees. Additionally, there are currently about 47 million uninsured individuals in this country, which is slightly more than the number of people insured by Medicare and somewhat more than half the number insured by BC/BS (Center on Budget and Policy Priorities, 2006).

A sustainable health care system is one that is affordable. As products are designed and brought into the market, thinking about long-term sustainability is a key to long-term success.

Health plans want to make evidence-based decisions, but there are considerable challenges to obtaining good evidence on outcomes for both therapeutic interventions and diagnostic tests. For diagnostic tests, indirect evidence can be used if that evidence is based on performance, from which inferences can be made about clinical utility. But when there are complex associations and intervening variables, direct evidence is necessary. Ultimately, while the TEC process is not aimed at costs, cost-effectiveness and affordability are pressing concerns and will shape the success or failure of technologies, Aronson concluded.

INTEGRATING GENETIC TECHNOLOGY INTO A HEALTH CARE SYSTEM

Wylie Burke, M.D., Ph.D.

University of Washington


The movement of genetics into the health care system is marked by three major trends, Burke said.
First, information that was previously handled by medical geneticists and a few specialists is now moving into more of a specialty–primary care mix. This requires addressing the barriers that exist as that transition is made. Second, although genetics historically has used information as an endpoint that did not improve health care outcome, such improvement in outcome is now possible. This shift requires thinking about the use of genetics in a different way, one that is more like how other health care information is used. Finally, there is movement from a limited amount of information to a great deal of (maybe too much) information. In the past the worry might have been about what was not known, whereas now the worry is about managing the information that is available.

[10] “Cost-effectiveness analysis is a measure or evaluation of the cost of an intervention relative to its impact, usually expressed in dollars per unit of effect” (Modeste, 1996).

There are three ways that genetic research can provide health benefits. The first is through the use of tests to identify genetic diagnosis or to identify genetic risk. The second is the application of gene-expression panels and other kinds of genetic technology that will enable improved disease classification. Third, some innovative therapies have been developed, and there is hope for more. The latter two are likely to have the biggest benefits over time.

In thinking about tests for genetic diagnosis and risk assessment, it is important to acknowledge the different kinds of tests that are currently available. Genetic tests differ in penetrance of the genotype from low to high. Historically, most applications of medical genetics have involved high-penetrance genotypes, that is, genotypes where, in those cases where genetic information is available, there is a great deal of certainty about what the clinical experience of a patient is going to be. The current wave of tests, however, is of a very different sort: they are much more probabilistic.

Another difference from other areas of health care is that in genetics it may be the case that the available information has no connection with measures to improve the disease course, whereas at other times the measures are available.
Historically, medical genetics was defined as a practice that told people about very high risks for which there was not much that could be done and for which the main intervention would be making decisions about reproduction or selective abortion. This led to the development in 1975 of a medical-genetics standard that still applies in those cases where a geneticist is dealing with information regarding a highly penetrant genotype for which there is no treatment. This standard calls for non-directed counseling and is described as “An attempt to help the individual or family to . . . choose the course of action which seems to them appropriate in view of their risk, their family goals, and their ethical and religious standards, and to act in accordance with that decision” (Ad Hoc Committee on Genetic Counseling, 1975).

The first challenge, then, is that medical genetics has a standard that is very counseling-intensive and personnel-intensive, requiring time with individuals to provide them with information that is very charged. But genetics now has a growing presence in such routine medical procedures as the obstetrical screening for trisomy 21. There are also some carrier-screening tests (e.g., Tay-Sachs, hemoglobinopathies, and cystic fibrosis) that are now part of routine health care.

In the future, genomic technologies will likely make it possible to provide many more tests of the carrier-screening variety or of a prenatal-diagnostic variety. One concern is that these tests will provide people with more information about risks to the fetus, and without also giving people the opportunity for in-depth counseling, this information might move individuals along a pathway of selective termination, which is what frequently happens with an abnormal result.

There is, then, a barrier related to the quality of care and ethical practice. How, in often time-pressured practice settings, can one deliver the kind of counseling that medical genetics standards would suggest should be delivered?

There are also cases of high-penetrance[11] conditions where there is an opportunity for treatment benefit, as in newborn screening. Newborn screening is a very successful program that looks for individuals with specific genetic conditions for which there are definitive treatments to improve health outcomes and also where there is a time urgency. A question arises because there are many more conditions for which tests are available than was the case when newborn screening started in the 1960s. Thus it has become necessary to consider what sort of time urgency and what level of outcome improvement is sufficient to justify incorporating tests into this kind of mandated screening program.

There are other examples of conditions with high penetrance that have the opportunity for treatment, such as colorectal cancer. There are two relatively rare hereditary conditions that involve a very high lifetime risk. One is hereditary non-polyposis colon cancer (HNPCC), which has a prevalence of one person in 500 and is correlated with an 80 percent lifetime risk of colorectal cancer. HNPCC also increases the risks of endometrial cancer and ovarian cancer. Screening for HNPCC should start in the early twenties.
A primary-care provider might see 10 or 12 individuals with this condition during his or her career.

The second hereditary condition that increases the risk of colorectal cancer is familial adenomatous polyposis (FAP). Its prevalence is 1 person in 8,000, or rarer. A practitioner might never see a single case, but picking up individuals with this condition is important. In people with this condition there is a 100 percent lifetime risk of colorectal cancer, and prophylactic subtotal colectomy is recommended.

¹¹ Penetrance is the probability of developing a disease (or some other outcome of interest) given that an individual has a particular genotype. The penetrance of a genotype is often estimated by examining the proportion of people with a particular genotype who develop the disease or outcome of interest.

If genetic testing for these conditions follows the medical genetic standard, there would be pre-test counseling, family assessment, determination of who is at risk, explanation of the limitations of current testing technology, and a description of treatment options if the test is positive, all delivered in a very counseling-intensive way. Typically, however, these families come to genetics only after a fairly dramatic occurrence of cancer in the family.

There is a need to do better prospectively. Families like this should be identified in primary care and specialty practice. The challenge in doing that is that it will require physicians to become sophisticated about the continuum of family history, which exists in virtually all common diseases.

A geneticist would look at a configuration such as appears on the right side of Figure 3-1 and be able to determine that the individual almost certainly has either HNPCC or FAP. Testing would be done to determine which it is, and family members would begin the pathway for prevention.

The middle of Figure 3-1 illustrates a family that does not meet the criteria and is unlikely to be at high risk. There may be some risk if information about different members of the family is missing. Based on knowledge of colorectal cancer epidemiology, however, individuals in that middle family are likely to benefit from starting routine colon cancer screening at 40 instead of 50.

On the left side of Figure 3-1 is a family history where a grandfather died with colon cancer at 80. Such a family history is not an indicator or a red flag.

Primary-care physicians as well as all the specialists who may come into contact with this kind of family history must become more sophisticated about making these kinds of distinctions and then referring their patients to the appropriate specialists. There is a great deal of evidence showing that physicians do not do a good job of making these distinctions and are not comfortable doing so.
One major barrier to physicians doing a better job is that there is no current funding mechanism for adequately reimbursing physicians for spending the time that it takes to complete a careful family history assessment.

FIGURE 3-1 Continuum of family history of colorectal cancer.
SOURCE: Burke, 2007.

Apolipoprotein E (APOE) testing is an example of a test with a relatively low predictive value. An APOE4 genotype predicts increased risk for Alzheimer’s disease. At present there are no treatments available that can reduce the risks of Alzheimer’s disease in individuals with this genotype, so there is no way to use this genetic test to improve health outcomes.

The information from the test is potentially actionable in other ways, however. For example, in one very small study about 150 adults whose parents had Alzheimer’s disease were offered APOE genotyping; the statistically significant finding of the study was that individuals who tested positive for APOE4 were more likely to buy long-term care insurance than individuals who either received a negative result or chose not to learn the result (Roberts et al., 2003). There is also a suggestion that APOE4-positive individuals are more likely to buy health insurance and less likely to buy life insurance.

A patient, then, might consider APOE4 testing to be of value and actionable even if health outcomes are not affected. One of the challenges for a health care system is to decide whether this and similar uses of information are something that should be considered part of health care.

Another example involves warfarin. As mentioned briefly above, variants in the genes encoding the enzymes VKORC1 and CYP2C9 can affect how the body responds to warfarin. Only about 35 percent or 40 percent of the variance in response is explained by these genetic variants, however, which means that they are not the sole predictor of how individuals are going to respond to warfarin or what dose they require (Rieder et al., 2005).
Physicians are fairly good at determining warfarin dosing and are good at monitoring reactions, so it is uncertain whether genetic test information about variance will help doctors better manage their patients. This is the kind of situation where additional data are needed.

Another issue in pharmacogenetic testing is ancillary risk information. Many pharmacogenetic variants provide valid information about risks for diseases other than the one for which the test was conducted, and some of those risks are entirely unrelated to the purpose for which the pharmacogenetic testing is done. In some tests there are two or more risks. Undoubtedly, more ancillary risk will be discovered. Is this a good or a bad thing? The answer may well depend upon the test.

A variant of a particular transcription factor gene (TCF7L2 variant DG10S468) has been identified as associated with an increased relative risk for Type 2 diabetes. The effect is statistically significant, but it is small. The major issue then is, what is the clinical utility? The action that would be taken after getting the results of this test is the kind of action that everyone should be taking, test or no test. Therefore the test does not lead to any particularly useful advice to patients.

It will be increasingly feasible to examine variation in multiple genes that contribute to a common disease, Burke said. In such situations it will be possible to identify a very small proportion of the population that has high risk, where the positive predictive value is much higher than would normally be obtained with a single-gene variant. An example of this is age-related macular degeneration. Using variation in three genes, researchers were able to estimate that a small percentage of the population (about 1 percent) had a risk of greater than 50 percent for age-related macular degeneration (Maller et al., 2006). One would want to engage in careful monitoring with this group and, when definitive preventive therapies are developed, these individuals would be the first candidates to receive treatment.

What one also finds in the course of these tests is that most people have risks that are a little bit above or a little bit below the population average. How can one manage that information in order to extract from it what is clinically useful without getting distracted by a lot of information that is not clinically useful for most people?

There are early indicators that genetics has a powerful ability to characterize and classify disease. Both HER-2/neu amplification (which identifies candidates for Herceptin therapy) and gene-expression profiling for breast cancer are good examples of this ability. These are harbingers of an important way in which genetics will provide tools to improve practice in the future. There are also a few novel therapies, such as Gleevec and RNA therapies directed toward the messenger RNA of a virus.
Looking at the idea of different pathways from genetic research to clinical benefit, much of the discussion is focused on the current management of genetic information and what benefits might flow from it. Indeed, very substantial benefits are beginning to emerge. It appears that disease classification and innovative therapy will be two major contributions of genetic research and, perhaps, the major contributions from genetic research to health care. Because those contributions are predicated on the goal of improving health outcomes, the technology assessment issues are analogous to the technology assessment issues for other kinds of health care and pose the same kinds of challenges.

An interesting question is the extent to which tests for gene variants that are associated with increased risk will become an important modulator of either disease classification or innovative therapies. That is, will these tests help achieve greater benefit?

Part of the complexity, as with other innovations, is point-of-service information. That is, how do we integrate these innovations? Clinicians and health care systems are concerned about billable services that improve health care. Clinicians clearly want evidence-based guidelines. Furthermore, there is a need to think about high-quality, cost-effective methods for performing the kind of education and counseling needed by the patient. How to accomplish that in a less resource-intensive way or in a more efficient way is a major challenge, Burke concluded.


VIEW FROM THE TRENCHES: CHALLENGES AND OPPORTUNITIES IN PERSONALIZED MEDICINE

Brad Gray

Genzyme Genetics


Diffusion of genomic innovations into the practice of health care through new product launches requires a balancing of economic risk and reward, Gray said.

The old paradigm in medicine is a series of actions: observation of a disease condition and action to treat it; an observation of response; and then a correction if the desired response is not achieved immediately. When this leads to innovation and improvement, it is deemed a success. In the long run this trial-and-error medicine can lead to great innovation, but in the short term, for the individual patient, it can provide a long, arduous path to identifying the correct treatment approach.

There is a new paradigm for personalized medicine, however, one in which complex testing (some of which is genomic, some of which is proteomic, and some of which relies on other technologies) plays a central role in linking observation to tests and therapy. In such a paradigm, observation is followed by a test that provides specific information for better decision making. This, in turn, is followed by the action, which would be the therapeutic choice or regimen that leads to a predictable response, thereby breaking the cycle of trial and error.

A series of technological innovations has made it possible to categorize diseases much more specifically, transforming what had been gross categorization into very narrow classifications based on genomics, and this, in turn, has improved care significantly.
One hundred years ago, all blood cancers would have been classified as one disease, the disease of the blood. Over time, however, it was recognized that the cancers of the blood could be divided into leukemias and lymphomas, and later it was understood that there are actually several different types of leukemias and lymphomas. With our current ability to understand the specific protein expression, the specific morphology, and the genetics of these diseases, they are being further disaggregated, and it now appears that there may be tens of different diseases in what was once categorized simply as blood disease. As a result, treatments can be tailored to the specific disease, providing a significant reduction in the risk of dying from the disease in the near term. This pattern is a striking example of what will happen in many other diseases, Gray said, with the first improvements probably occurring among the cancers and then later moving into the rest of the disease burden.

There are a growing number of drugs on the market that are tied to specific tests that help identify those patients who would benefit from them (see Figure 3-2). In general, the tests are used to answer one of the following four questions. Which drug should be used? How should the dose be tailored for a specific patient? How can it be confirmed that the drug is actually working for that patient, that is, is there a response observed? And, as a variation of the third question, is the response strong enough to say the disease has been cured?

The timeline of personalized medicine can be divided into three phases, Gray said. The first phase is fear, the second relates to value, and the final phase is acceptance. A few tests have gone all the way through these three phases, but the vast majority are still stuck somewhere in the middle of this timeline because of the barriers encountered.

Which Drug Should I Use?
  Tamoxifen®: Breast Cancer (ER/PR)
  Herceptin®: Breast Cancer (HER2)
  Gleevec®: Leukemia, Chronic Myelogenous (BCR-ABL)
  Erbitux®: Colorectal Cancer (EGFR)
  Tarceva®: Lung Cancer (EGFR)
  Revlimid®: Leukemia, MDS (Deletion (5q))
How Much of the Drug Do I Need?
  Camptosar®: Colorectal Cancer (UGT1A1)
Is the Drug Working?
  Gleevec®: Leukemia, Chronic Myelogenous (Quant BCR-ABL)
  Gleevec®: Leukemia, Chronic Myelogenous (BCR-ABL mutations)
Is My Disease Gone?
  Campath®: Leukemia, Chronic Lymphocytic (Minimal Residual Disease)

FIGURE 3-2 Personalized drugs available today.
SOURCE: Gray, 2007.

In the first phase (fear) there are a number of barriers. Pharmaceutical companies are concerned about their markets being constricted in size by the narrowing of the definition of the disease or its indication. Payers are understandably concerned about making sure that, as these additional tests are performed, there is actually a reduction in cost or an improvement in outcome that appropriately compensates for the additional expense. There are physicians who are concerned that testing will constrict the way they practice medicine. Patients may worry that if a test result comes back negative, they may actually be denied access to a treatment they see as important to their health or survival. Regulators are concerned about how to address the complexities of genomic innovations. Finally, the diagnostics industry, which sees genomic testing as an opportunity, also sees extreme risks and uncertainties concerning the clinical value of these tests as well as risks and uncertainties pertaining to regulation, market adoption, and reimbursement.

In 2005 Genzyme Genetics made a significant push in the field of personalized medicine, focusing explicitly on tests that could be directly tied to a therapeutic intervention. The company was aggressive in licensing technologies with early but promising clinical data that had been published in reputable journals. The company then worked quickly to get those technologies into the marketplace, believing that physicians would be convinced of their value as the data grew stronger and that a test that helped determine the dosing of a therapy would be a compelling value proposition.

Two tests that Genzyme Genetics brought to market offer enlightening examples and shaped the way in which the company currently thinks about new product development. First, the company aggressively brought to market the UGT1A1 test associated with irinotecan.
In June 1996 the FDA had approved irinotecan for second-line treatment for colorectal cancer, and over the next several years a series of studies indicated a connection between polymorphisms in the UGT1A1 gene and toxicities stemming from irinotecan dosing. In June 2005, in response to those studies, the FDA approved the addition of information to the irinotecan label warning that a different starting dose should be considered for people who were homozygous carriers of a certain allele in UGT1A1. Very quickly thereafter, in August 2005, the FDA approved a device for detecting this allele that was manufactured by Third Wave.

Genzyme Genetics was very excited about this technology, believing that an FDA label that included a recommendation of the use of the test would be extremely compelling and that eliminating these toxicities would be universally desired, and so the company worked with Third Wave to bring the test to the U.S. market very quickly. In December 2005, Genzyme Genetics launched the UGT1A1 polymorphism testing service. There was strong clinical evidence for the usefulness of this testing (strong enough for the FDA to change the label), and there was an FDA-approved device that met all the criteria for clinical validity. From the company’s perspective this was a very promising situation, one that seemed as positive as a situation could be.

The experience, however, turned out to be quite different. Physicians said such things as, “I don’t need a test because I can start patients on irinotecan, and when side effects occur, I lower the dose, stop a cycle, or stop treatment,” or “I monitor bilirubin level, so do not need to test.” Physicians who were willing to test asked what dose to use if the patient did have the polymorphism because the dosage and administration section of the drug label did not offer details about what to do if a polymorphism was found. Some physicians decided that the specific polymorphism was rare enough that it was not worth testing all patients.

From these experiences, the company learned that clinical utility data are not sufficient to change clinical practice. Physicians will use work-around solutions when they are modestly effective. Additionally, the inclusion of a test or genomic information in a drug-package insert does not necessarily lead to testing. Finally, package inserts must be clear on the implication of the testing results for dosing, or else physicians will struggle to interpret them.

After an initial pulse of interest, physicians in the United States have largely disregarded the use of UGT1A1 testing when prescribing irinotecan. Dosing with irinotecan without UGT1A1 testing results in about $1,000 in additional costs because of adverse events, Gray said. Theoretically, that additional cost could be eliminated by testing every patient and dosing accordingly. Therefore, $1,000 is the value Genzyme Genetics would assign to the test.
The company, however, is reimbursed based on the Current Procedural Terminology (CPT) code, where the dollars associated with the activities that are used to perform this test are totaled, yielding about $310. That figure, then, is the reimbursement for the test. Therefore, Gray said, the test is delivering about three times the value of its cost, a compelling argument from a health-economic perspective.

One might argue that, for innovations such as this to flourish in the future, a larger portion of the health-economic value delivered to the system should be captured by the company making the test. In the case of the UGT1A1 test, the company struggled to drive adoption and captured only a fraction of the value being delivered.

A second high-profile product that Genzyme Genetics became involved in is epidermal growth factor receptor (EGFR) testing. Mutations in the tyrosine kinase domain of the EGFR govern response to tyrosine kinase inhibitors (TKIs) in non-small-cell lung cancer (NSCLC). The first TKI for non-small-cell lung cancer was gefitinib, which was approved by the FDA in May 2003 for third-line treatment of advanced or metastatic NSCLC. Very shortly afterward, some prominent publications appeared that discussed the way that mutations in this protein would predict the response or non-response to gefitinib. Then, in November 2004, the FDA approved a second drug in the class for second-line treatment of advanced or metastatic NSCLC: Tarceva from Genentech. In response, Genzyme Genetics aggressively pursued worldwide exclusive licensing of EGFR mutation testing. The company paid more than it had ever paid for an intellectual-property license and quickly drove a test to market.

Soon afterward publications emerged that seemed to question the utility of EGFR mutation testing for driving dosing. Since that time there has been disagreement about which is the correct biomarker to predict response to this class of drugs. In July 2006 the C-Path Institute announced an effort to try to resolve the question of biomarkers in NSCLC, but results are not yet available.

When this product was taken to market, only a small minority of NSCLC patients who received TKIs (probably less than 5 percent) actually received the test, Gray said. The penetration is highest in the leading academic centers, where there is a willingness and an ability to navigate the nuances of the emerging evidence. Community physicians, on the other hand, have generally been reluctant to adopt this approach. They are confused about the multiple testing options, and they use what they consider clinical information (e.g., patient’s race, smoking habits) as a proxy for the mutation status. Furthermore, because TKIs are most often used as the last line of treatment in these patients, there is a reluctance to do a test that would suggest that certain patients will not respond.

The company learned several things from this experience. First, the connection between genetics and treatment is not always clear. Community physicians need education and assistance in understanding conflicting evidence.
Robust clinical-utility data will be required to drive adoption by community physicians, who will continue to substitute work-around solutions when they are modestly effective. Furthermore, community physicians are not inclined, in general, to deselect patients from treatment. A test that selects patients in is much easier to sell than one that selects out, especially when there are few alternatives for those patients, Gray said.

The adoption curve for EGFR testing is still heading upward. While the EGFR mutation test has not been adopted as rapidly as a new drug therapy typically would be, the indicators are moving in the right direction. The National Comprehensive Cancer Network (NCCN) guidelines for non-small-cell lung cancer include the test, a point which Genzyme Genetics believes will help community physicians gain comfort with the utility of the test.

Based on past experience, then, Genzyme Genetics has revised its criteria for bringing new personalized medicine tests to market. First, for the company to invest in a test, the test needs to represent the only reliable way to obtain information. When there are low-cost work-around approaches (e.g., measuring a bilirubin count or assessing smoking status), there is too much commercial risk to proceed.

Second, clinical evidence is absolutely critical to driving adoption. There must be proof-of-concept data from inventors, or it must be feasible to run a decisive experiment at reasonable cost and in a reasonable amount of time if the company is going to pursue the innovation.

Third, because reimbursement in the testing sector of the health care system has traditionally not been based on value but on activity-based costing, the economics must support investment in clinical and market development. The reimbursement path must be attractive, either by virtue of its intrinsic coding or because there is the possibility of making a compelling case to be reimbursed on a different basis than activity-based costs.

Furthermore, the company will look for places to invest where intellectual property and know-how are available on an exclusive basis. In situations where only a non-exclusive product is offered, the company will not be able to justify the investment required to perform clinical research or to navigate the regulatory system.

Concerning licensors and inventors, whether in academic medical centers or in small companies, Gray said that they must be educated about the experience of Genzyme Genetics in this area. The company is now looking for a partnership structure that will provide the needed return on investment, given the risk the company would be taking. Genzyme Genetics is still committed to personalized medicine, but with a far more realistic and cautious approach.

To overcome the barriers and to help innovators bring genomics to the market quickly and effectively, several things are needed. The first is education. More information about the new tests must be given to physicians and to health care providers.
Furthermore, organizations such as the NCCN need to develop and provide clinical-practice guidelines. Such guidelines will help interpret information about the test for the benefit of community physicians, who will be playing a more important role in testing than they have in the past. There is also a need to start education about diagnostics and genetics early in medical school.

A second requirement is better data. Industry-wide cooperation is needed to collect and analyze data in a timely manner on the best use and outcomes with diagnostics.

Finally, Gray concluded, the reimbursement system must compensate the innovators for their expense and risk if innovation is to continue. That means that there must be movement toward reimbursement based on value delivered by the test rather than according to activity-based costs. Furthermore, reimbursement must appropriately take into account the regulatory burden undertaken to deliver the test to market.
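
The gap between the two reimbursement bases can be made concrete with the UGT1A1 figures Gray cited earlier: about $1,000 in avoidable adverse-event cost and about $310 in activity-based (CPT) reimbursement. The sketch below is illustrative only. The class and method names are invented here, and the 50 percent value share is a hypothetical parameter, not a figure proposed at the workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticEconomics:
    """Per-test dollar figures for a diagnostic (illustrative model)."""
    value_delivered: float  # avoidable downstream cost per test, USD
    activity_cost: float    # activity-based (CPT) reimbursement, USD

    def value_to_cost_ratio(self) -> float:
        """How many dollars of value each reimbursed dollar delivers."""
        return self.value_delivered / self.activity_cost

    def value_based_payment(self, share: float) -> float:
        """Payment if the payer returned `share` of the value delivered."""
        if not 0.0 <= share <= 1.0:
            raise ValueError("share must be between 0 and 1")
        return share * self.value_delivered


# Figures quoted in the text for UGT1A1 testing with irinotecan.
ugt1a1 = DiagnosticEconomics(value_delivered=1000.0, activity_cost=310.0)

HYPOTHETICAL_SHARE = 0.5  # assumption for illustration only

print(round(ugt1a1.value_to_cost_ratio(), 2))          # 3.23
print(ugt1a1.value_based_payment(HYPOTHETICAL_SHARE))  # 500.0
```

Any share above 31 percent of the value delivered would exceed the current activity-based payment in this example, which is the sense in which a value-based scheme changes the incentive to invest.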

DISCUSSION

Wylie Burke, M.D., Ph.D.

Moderator


One audience member asked whether the issue of legal liability might drive the adoption of some testing, even in advance of clear clinical utility. Gray responded that where the utility of the test is clear and where there is a very clear way to use the information, physicians will likely see performing the test as reducing their liability. With most tests, however, utility is not always clear, and there might be disagreement about how to use the information. Therefore, while liability is a factor, it is difficult to generalize about whether it will promote or inhibit the adoption of new genomic technologies.

Another participant said that she was struck by the parallel between the current state of genetic medicine practice and HIV. When HIV was first recognized, there was fear and stigmatization associated with the diagnosis. As effective treatments were developed and understanding of the disease increased, being HIV positive changed from a death sentence to a chronic disease. Because of the earlier stigmatization, however, there are still many controls and consents that must be included in counseling and other efforts surrounding HIV. As one moves forward with integrating genetics into routine medical practice, the audience member continued, it will be important to evaluate what is occurring and to not maintain all the stigmatization and consent requirements that surround genetics today and that contribute to the lack of use of this information.

Burke responded that it will be important to stratify and recognize differences in genomic medicine. When the possible result of a test is a terminated pregnancy, it is likely that there will still be a need for counseling, but the level of counseling needed will be different, for instance, in the case of a pharmacogenetic test.
Another audience member noted that several speakers used the warfarin example in their presentations. Physicians have a 40- to 50-year history of giving warfarin (Coumadin). Physicians also know that the risk of using Coumadin is highest in the first month and trails off by the third month of use. By the time a patient has been on the drug for years, the dose is rock solid unless there are changes in other medications. The target of interest should be the detection or prevention of an adverse event during the first three months. That is a much smaller market than the millions of people already on Coumadin. The calculations are very different if one is trying to detect a rare adverse event in a defined small population than if one is using disease-based diagnostics, such as APOE. If there were an effective drug for Alzheimer’s disease and the dosage depended on the patient’s APOE genotype, that would be an entirely different matter.

One member of the audience noted that Gray, in his presentation, had described two incentives that diagnostic companies have for generating good-quality clinical data, had discussed the concept of value-based reimbursement, and had explored the ideas of gaining monopoly in a test through the use of intellectual property and a biomarker. What is unclear, the questioner said, is how, without a monopoly and a biomarker, one can squeeze value-based reimbursement out of the payers. If one does not have a monopoly, then the diagnostic companies are simply all going to compete with each other and drive down the price.

Gray responded that the questioner was correct and that the situation is borne out by the examples presented. Those examples illustrate that the innovations that have achieved value-based reimbursement are all cases in which a company owns intellectual property or know-how that cannot be replicated by competitors.

The final question for the panel involved the use of direct or indirect evidence in technology assessment. Direct evidence from clinical trials is preferable, the questioner said, but very few genomic innovations will proceed along that pathway. Indirect evidence, if one can construct the biological pathway, makes sense. The problem is, as Tunis described, that the evidence lines are not clear. What the FDA requires is different from what the third-party payers use. Industry wants the incentive to invest; that is, it wants to recoup its investments. The problem is how to proceed. CMS tried the concept of coverage with evidence development. Could something like that work for innovations that may be close to showing some clinical utility but that still need a clinical trial to demonstrate the additional benefit?
Tunis replied that, for certain clinical applications, obtaining definitive evidence of clinical utility is going to be extremely lengthy, burdensome, and costly. Part of the new paradigm may require that the payer become comfortable with reimbursement tied to indirect evidence or to some threshold of clear clinical validity plus promising evidence of clinical utility, with the subsequent documentation or verification of clinical utility occurring in the post-market setting.

This may be generally true for diagnostics, Tunis said. The evidentiary burden of demonstrating an effect of diagnostics on clinical outcomes through randomized controlled trials (RCTs) is heavy, whether it is for genetic testing or for CT angiography. Therefore, some kind of conditional reimbursement that presumes that some of the additional questions about clinical utility will eventually be answered (not before reimbursement but after) is going to have to be part of the new approach.