3
Simultaneous Group Discussions with Invited Speakers

First Group Discussion: Delivering Better Breast Cancer Screening Services

Etta Pisano, M.D., Professor of Radiology and Biomedical Engineering, Chief of Breast Imaging and Director, UNC Biomedical Research Imaging Center and Member, Committee on Saving Women’s Lives: Strategies for Improving Breast Cancer Detection and Diagnosis

MODERATOR AND RAPPORTEUR: The first speaker on this afternoon’s panel is Stephen Taplin, a senior scientist in the applied research program at the National Cancer Institute, who will tell us about the organization of breast cancer screening services.

STEPHEN TAPLIN, M.D., Senior Scientist, Applied Research Program, NCI: Today we want to talk about how we can improve breast cancer screening by organizing care. I underline that screening is a process that leads to outcomes, not a test. Figure 3.1 presents the steps in the screening process. There are at least four different steps in the screening process: risk assessment, looking at who we are trying to reach; detection, finding an abnormality, where today’s focus may be; diagnosis, evaluating the abnormality to find the cancer; and treatment. The transitions between these steps need to be organized as well. Focusing only on improving the steps, not on how women get from one step to another, will not result in improved breast cancer screening.





FIGURE 3.1 The screening process.

The whole process leads to at least two outcomes that can be examined: the long-term outcome of mortality and some short-term outcomes, like reductions in late stage disease.

To dissect this process, we can start by looking at recruitment. What are the ways to get people from the population at risk into screening? The next step is the detection process, and for this step we need to evaluate sensitivity and specificity, the quality and validity of the test itself. The third part is follow-up.

In a study we just completed, we tried to simplify this process and isolate the problems (Taplin et al., 2004a). We decided to identify the source of all the late stage cases within an organized system. Were they people who were not being recruited, were not being detected, or had a breakdown in the follow-up of their care after a positive screen?

Box 3.1 illustrates the sources of advanced cancers in populations with health insurance coverage where 70 to 80 percent of the women reported they had been screened. We found that 52 percent of the advanced cancers were recruitment failures. This teaches us that organized screening must include organized recruitment.

BOX 3.1 Sources of Late Stage Cancers

Failures in the process are associated with poor outcomes: 1,347 late stage cancers from 10 integrated health plans.
    Absence of screening (1–36 months): 52 percent of late stage breast cancers.
    Absence of detection: 40 percent of late stage breast cancers.
    Breakdown during follow-up: 8 percent (inadequate/other follow-up).
SOURCE: Taplin et al., 2004a.

The limitations of mammography are not trivial, however. Forty percent of these women had a mammogram within the prior three years. So reduction of these detection failures by improving the quality of technology is important, although not the whole story. Failure of follow-up, where I thought the action would be, was, in fact, the cause of only about 8 percent of advanced breast cancers in this population.

We need to think about how to change the system. The Institute of Medicine (IOM) report recommends organized screening as a way to make screening happen in our populations. This builds on several previous IOM reports, beginning with Crossing the Quality Chasm, which stressed the need for systematic change to improve quality (Institute of Medicine, 2001). Organizing screening is a dramatic system change. Then, measurement is critically important to clinicians making changes. Showing people they are making progress, that they are reaching entire populations, is a critical feedback loop. It tells people they are getting results from their actions in a way they otherwise might not appreciate.

European models of care have been mentioned. They demonstrate that organized care does have a definition. As Box 3.2 lists, it is about an explicit policy, specific age categories, a method to do it, and an interval for screening. It is a policy with a very explicit approach and a defined target population, a population at risk as specified in Figure 3.1. It is also about somebody being responsible, a leadership team. These things do not just happen. A quality assurance structure and measures to create feedback are also essential.

BOX 3.2 Definitions of European Organized Screening

European Models of Organized Care (IARC, 2002):
    An explicit policy, with specified age categories, method and interval for screening.
    A defined target population.
    A management team responsible for implementation.
    A health care team responsible for care and clinical decision.
    A quality assurance structure.
    A method of identifying cancer occurrence in the target population.

So organized screening is happening in Europe. Are these programs having an impact? Of course they are. They have shown a clear trend towards reductions in mortality in the populations, as illustrated from 1974 through 1990 in Table 3.1.

TABLE 3.1 Results of Organized Screening Programs

Start      Location                 Evaluation Design/Comparison  Outcome                              Age    Effect  Confidence Interval
1974       Sweden (Gavleborg)       Geographic comparison         Excess breast cancer (BC) mortality  40–64  .84     (.71–1.00)
1975       Netherlands              Geographic comparison         BC mortality                         50–79  .84     (.61–1.17)
1978–1979  Sweden (seven counties)  Geographic comparison         BC mortality                         40–69  .68     (.60–.77)
1987       Netherlands              Before/after comparison       BC mortality                         45–69  .76     (.53–1.09)
1988       U.K.                     Modeled estimate              O/E                                  55–69  .79     n/a
1990       Italy                    Cohort analysis               Mortality ratio                      50–69  .75     (.54–1.04)

NOTE: Effect = breast cancer with organized screening/control population; Confidence Interval = 95 percent confidence interval.
SOURCE: IARC, 2002.

Can it be done in the United States? I just completed 20 years working at Group Health Cooperative, where we put this kind of program in place, and we are also beginning a pilot project in 20 Bureau of Primary Health Care clinics. Group Health is somewhat unique. However, there are a number of plans around the country that have an organized insurance structure, and, in any event, I think our plan’s success was due more to leadership and paying attention and commitment to a direction than to the structure of our medical system.
We organized five mammography screening facilities within our system, which serves 400,000 people in the Northwest with more than 70,000 women age 40 and above. About 35,000 women were screened each year in that population. We created a multidisciplinary team and a team for leadership. There were different providers involved in each of the steps. When we started to influence, and get feedback on, what was happening in our population, it was not a big problem to get these people together to ask what we were doing, what we needed to do.

We had a group that included surgeons, oncologists, nurses, radiologists, administrators, and primary care physicians, all working together to organize this care. We had groups in each region delivering the care. There was clinical leadership at a facility which involved the radiologist and primary care physicians as well as the nursing staff. And we had an information system which mailed reminders. The critical part is identifying the population and communicating with the women.

We organized outreach to look across the whole process and explore how to improve it. One of the first things we learned was that follow-up was not coordinated, and there were people falling through the cracks. So we created a system in which there were nurses in the radiology center who were responsible for the follow-up of all positives. All positive mammograms were in a database, and the nurses took responsibility for communicating with the primary care physician and the surgeon.

We carried out a number of studies funded by the National Cancer Institute (NCI) to look at recruitment. A reminder postcard was found to be effective in improving recruitment (Taplin et al., 1994). We did a risk survey. Collecting risk factor information from a survey and informing women about their personalized risk increased the likelihood that they would come in for mammography to 66.7 percent compared to 42.9 percent among controls, who received generalized risk information (Curry et al., 1993). We found that a simple call, in which the woman was asked to come in and was scheduled, was as effective as a call addressing all the care issues. The reminding phone call itself was sufficient (Taplin et al., 2000).

Then we turned to detection. We measured sensitivity, specificity, recall rates, and positive predictive values, and we provided yearly reports back to the radiologists including all the false negatives, which is more than is required by the Mammography Quality Standards Act (MQSA). Radiologists then went through their own quality assessment program. We looked at a method of improving clinical image quality and reported that interval cancers were more likely to occur in mammograms of poor quality (Taplin et al., 2002), and we are currently studying computer assisted detection. Then we conducted a teaching session with our technologists to try to improve clinical image quality.

We evaluated follow-up and treatment. As I said earlier, our nurses assumed responsibility for communicating both with the primary care physicians and the surgeons to ensure follow-up. If it was not occurring, they contacted the women. Group Health has also been looking at how treatment is organized. We already know that within our group the odds of breast conserving therapy are about 300 percent higher than in the surrounding community.
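
The audit measures named above (sensitivity, specificity, recall rate, and positive predictive value) all derive from the same four counts. The following is a minimal Python sketch of that arithmetic, not a description of Group Health's actual reporting system; the class name `ScreeningAudit` and every count in the example are hypothetical and chosen only to keep the numbers easy to follow.

```python
from dataclasses import dataclass


@dataclass
class ScreeningAudit:
    """Counts for one reading period. All example values are hypothetical."""
    true_pos: int   # recalled, cancer confirmed
    false_pos: int  # recalled, no cancer found
    false_neg: int  # not recalled, cancer diagnosed later
    true_neg: int   # not recalled, no cancer

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def sensitivity(self) -> float:
        # Share of women with cancer whose mammogram was called positive.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of women without cancer whose mammogram was called negative.
        return self.true_neg / (self.true_neg + self.false_pos)

    def recall_rate(self) -> float:
        # Share of all screened women called back for further workup.
        return (self.true_pos + self.false_pos) / self.total

    def ppv(self) -> float:
        # Share of recalled women who turn out to have cancer.
        return self.true_pos / (self.true_pos + self.false_pos)


# Hypothetical audit of 10,000 screening mammograms.
audit = ScreeningAudit(true_pos=40, false_pos=960, false_neg=10, true_neg=8990)
print(f"sensitivity: {audit.sensitivity():.1%}")
print(f"specificity: {audit.specificity():.1%}")
print(f"recall rate: {audit.recall_rate():.1%}")
print(f"PPV:         {audit.ppv():.1%}")
```

Counting false negatives requires linking screening records to later cancer diagnoses, which is why reporting them back to radiologists depends on the kind of information system described earlier.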

Over a 17-year period, we systematically changed the entire system for individuals, by surveying and by recruiting; for the physicians, by giving them feedback about patient participation and results of screening; for the organization as a whole, by a steering committee and a multidisciplinary team; and by creating an information system.

Did it have an impact? Absolutely. Our screening rates (ever had a mammogram) among women age 50 and above increased from under 50 percent to over 80 percent between 1986 and 1990, and we had similar increases in rates for women between 40 and 50 years of age and for mammograms within the last two years (increased from about 26 to 51 percent in women age 50 and above). We also reduced the numbers of women with late stage disease, as shown in Figure 3.2. Our report of these data provides evidence that enrollment in an organized screening program is associated with increased likelihood of mammography and reduced odds of late-stage breast cancer compared to community controls (Taplin et al., 2004).

FIGURE 3.2 Group Health rates of women found with late stage disease are lower as a result of early detection.

We can’t describe the total use of resources in the surrounding community, because we don’t have individual level data, but my suspicion is that our program consumed fewer resources, that is, was more efficient. So organized care has advantages within this health plan in promoting efficient use of screening, affecting screening and screening interval rates, and in reducing late stage disease.

Having made the case in one place, what is the next step? Is it possible to organize screening in other settings? We turned to the cancer collaborative, which is a consortium of agencies (the Bureau of Primary Health Care, the NCI, the CDC, and the Institute for Healthcare Improvement) that have joined in a 2-year effort to change screening in a comparable way within the Bureau of Primary Health Care. The Bureau’s clinic sites are intentionally spread around the country. Of the 800 clinics, there are 20 sites participating, so it is a small proportion, but it is the pilot proportion. The 800 clinics as a whole serve more than a million women over age 45, so there is a chance to have an impact on a large population of people.

The collaborative’s program makes the policy explicit, sets targets for improvement, and then asks about measurement. We create a leadership and implementation team within the primary care group. That team is the physician, the nurse, the PA, the medical records person, and the receptionist. We organize the recruitment, the follow-up, and the referral for treatment. Then we encourage regular changes within the clinic in order to achieve these, and we create a data system to identify how many people receive care.

We emphasize systematic reorganization and practice teams. We meet monthly with these teams. We have three sessions within the year in which we discuss progress and assess what they are doing. We then meet on a monthly basis as they put the new plan into place. The bottom line is they are being asked to look at what they are doing, change it, and measure what the results are.

So in conclusion, we can systematically change the screening process.
It does not require rewriting legislation; it requires the will and the leadership to do it; it takes time, and it takes data. We need also to address barriers to collaboration and whether we can create an environment in which quality improvement is encouraged and reinforced. Those are important questions for us and for our society.

REBECCA SMITH-BINDMAN, M.D., Associate Professor, Radiology, Epidemiology, and Biostatistics, Obstetrics, Gynecology, and Reproductive Medicine, University of California, San Francisco: There has been an enormous amount published over the last few decades about who is getting screening mammography. There have been several predictors of screening noted: age, race, ethnicity, having a usual source of care, rural residence, as well as financial barriers. I think there is a general belief now, however, that the differences by these predictors have declined as a result of numerous mammography outreach efforts.

Our knowledge of the current status of mammography is based largely on two very widely cited surveys, the CDC’s Behavioral Risk Factor Surveillance System and National Health Interview Survey, both of which assess mammography use annually. The surveys have found that the historical discrepancies in mammography use that have been seen by race and ethnicity have declined significantly. According to these self-report surveys, most eligible women are now getting mammography; 70 to 80 percent of women report that they have had a mammogram in the last two years, and there is no reported difference by race and ethnicity. In fact, some minority groups report higher rates of screening than white women. So many people, including policy makers, have concluded that compliance with recommendations for screening mammography is no longer a problem.

We have not discussed differences in cancer statistics a lot today, although Dr. Esserman touched on this in her presentation. However, I think it is understood that there are substantial differences in breast cancer outcomes and breast cancer detection rates by race and ethnicity. In general, non-white women have more advanced disease at diagnosis. There have been improvements in breast cancer mortality over the last decade, but SEER statistics show that the improvements have been largely limited to white women. Mortality curves have been essentially flat for other racial and ethnic groups.

It seemed to me that one might question the value of mammography if mammography use is now the same (and high) among different racial and ethnic groups, but differences persist in breast cancer mortality and tumor stage at diagnosis. Clearly, if mammography were working, we would expect breast cancer mortality rates to decline coincident with improvements in screening mammography rates. One possible explanation for the failure of mortality rates to decline among racial minorities along with their higher use of mammography is that estimates of their use of mammography based on self-reports may be inaccurate.
I think there is growing concern that this might be the case, and that women, particularly minority women, may overestimate their use of mammography. So, I have been interested in investigating whether there are persistent differences in the use of mammography.

We have just completed a study that evaluates screening mammography use among a large number of racially and ethnically diverse women diagnosed with cancer. This study examines recorded mammography use from medical records, and, therefore, we believe these data are more accurate than self-report data. We used data from the NCI funded Breast Cancer Surveillance Consortium, which comprises mammography registries in seven states and is probably the largest data set available to assess actual mammography in the United States.

In this data set, we learned about mammography use based on medical records, radiologist reports, and a survey that each patient completed every time she had a mammogram (such as a patient self-reported breast mass at the time of a mammogram). These data allowed us to assess mammography use in a more detailed way compared with the self-report surveys, which are relatively crude in terms of their assessment of mammography and do not differentiate between a screening mammogram and a diagnostic mammogram, or record whether there was a mass at the time of mammography. Clearly if a woman has a mass at the time of a mammogram, it should not be considered a screening test. Cancer outcomes are complete in this data set, since over 95 percent of cancers are ascertained based on linkage to cancer registries.

The data describe approximately 900,000 women who are racially and ethnically diverse. In these women ages 40 to 85, there were approximately 26,000 breast cancers diagnosed. Table 3.2 displays the characteristics of tumors in these women by race and ethnicity. These numbers should be, and are, very similar to recent reports using SEER data. They display the adjusted odds of advanced stage cancer by race and ethnicity using white women (set at one) as the reference. Essentially, African American, Hispanic, and Native American women are at increased risk of having an advanced cancer at diagnosis, and such tumors are less likely to be curable. These are the results you would expect, looking at current population tumor registry data.

We next looked at the mammography use among these women in the five years prior to breast cancer diagnosis. We categorized women into five groups based on their screening frequency. The most screened women had a mammogram a year before cancer diagnosis. The least screened women had not had a screening mammogram for at least five years prior to cancer diagnosis.
We considered women to have inadequate mammography if they had either never had a mammogram, had not had a mammogram for at least three and a half years, or had their first mammogram after age 55, or only coincident with the diagnosis of cancer.

TABLE 3.2 Breast Cancer Characteristics by Race and Ethnicity with White Women as the Reference (adjusted odds ratios)

                  Large 15mm  Advanced Stage  High Grade  Lymph Node+  Symptomatic
White             1           1               1           1            1
African American  1.45        1.60            1.80        1.25         1.18
Hispanic          1.40        1.44            1.20        1.21         1.26
Native American   1.47        1.22            1.60        1.02         1.90
Asian             1.04        1.03            1.31        .86          1.09

There were substantial differences in mammography use by race and ethnicity. Compared to white women, all minority women were less likely to be screened regularly and more likely to be screened infrequently. In comparison to whites, the odds ratios varied from 1.4 to 1.8, and these ratios suggest minority women were around 40 to 80 percent more likely to not be screened.

We then looked at tumor characteristics after adjusting for mammography. We asked whether women who were similarly screened would have similar types of cancer, or whether minority women who are similarly screened are still at increased risk for advanced cancers. The latter would suggest a biological explanation, the former discrepancies in mammography use. Once we stratified by mammography, the differences in tumor size, stage, lymph node involvement, and symptoms by race and ethnicity were reduced or eliminated. Thus no matter what a woman’s race or ethnicity, similarly screened women had similar types of tumors. Interestingly, differences in tumor grade persisted even after adjusting for mammography.

Figure 3.3 shows the percent of women with large tumors in each racial and ethnic group by use of mammography. From left to right, the groups go from most to least use of mammography. The percentage of women with large tumors increases as mammography decreases, and large tumors were found in about 80 percent of women who were never screened. However, once plotted by mammography, there are few differences by race or ethnicity at each screening level. Similar results were found for advanced stage, lymph node involvement, and symptoms. Tumor grade did not change much from the most recent to the most distant mammography groups, and African Americans persistently had higher grade tumors as noted earlier by Dr. Esserman, apparently for biological reasons. It is clear that mammography use is associated with size of tumors, but differences by race and ethnicity, represented by the four different lines of the figure, are no longer significant.

FIGURE 3.3 Increasing tumor size with increasing interval since mammography.

In summary, there are persistent and dramatic differences in cancer characteristics which are reduced or eliminated when data are adjusted for mammography use. Therefore, differences in mammography use appear to be in large part causal for the differences in tumor characteristics by race and ethnicity. Since mammography clearly contributes to racial and ethnic differences in mortality and to reduction of mortality in general, it is vital to increase the regular use of screening. Having had a mammogram once does not protect against advanced stage cancer; regular screening is required. We consider a 3-year interval to be a minimum requirement.

I now turn to mammography use among elderly women. Medicare billing records are a great source of data to assess population-based use of mammography in Medicare eligible elderly women. Clearly mammography rates in this population have increased over time. However, the overall rates are substantially lower than suggested by self-report surveys. In contrast to self-report data, data from Medicare billing records suggest substantial and persistent racial and ethnic disparities.

FIGURE 3.4 Racial and ethnic differences in mammography in Medicare beneficiaries.
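
The odds ratios of 1.4 to 1.8 quoted above for infrequent screening are read in the talk as "40 to 80 percent more likely." That reading is close when the outcome is uncommon in the reference group and becomes an overstatement as the outcome gets more common. The Python sketch below shows the conversion; the baseline proportions used are hypothetical, since the transcript does not give the proportion of white women who were inadequately screened, and the function names are illustrative only.

```python
def risk_from_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Risk in the comparison group implied by an odds ratio.

    baseline_risk is the proportion with the outcome in the reference group.
    """
    baseline_odds = baseline_risk / (1 - baseline_risk)
    comparison_odds = baseline_odds * odds_ratio
    return comparison_odds / (1 + comparison_odds)


def relative_risk(baseline_risk: float, odds_ratio: float) -> float:
    """Ratio of comparison-group risk to reference-group risk."""
    return risk_from_odds_ratio(baseline_risk, odds_ratio) / baseline_risk


# Hypothetical baseline proportions of inadequately screened women.
for baseline in (0.05, 0.30):
    for odds_ratio in (1.4, 1.8):
        rr = relative_risk(baseline, odds_ratio)
        print(f"baseline {baseline:.0%}, OR {odds_ratio}: relative risk {rr:.2f}")
```

Under these assumed baselines, the implied relative risk always sits between 1 and the odds ratio, and it moves closer to the odds ratio as the baseline proportion shrinks.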

[...] published (Fryback and Thornbury, 1991). Many studies look at technical quality and diagnostic accuracy, the sensitivity and specificity. I have been encouraged to see more recent studies asking how a technology changes the diagnosis, changes management, and changes outcomes. An effect on outcomes is the ultimate test we are looking for. We do not factor in cost or cost-effectiveness. The pure technology assessment is really a clinical evaluation of the evidence.

Ideally, we would like to see direct evidence, randomized controlled trials comparing outcomes with and without the test. You have heard something about the barriers to that, but it is the standard of evidence in most therapeutic technology assessments, such as drug trials. In the screening setting, we do see some randomized controlled trials and that is great, but in the diagnostic imaging literature in general, that is not the reality.

Indirect evidence is the reality that links a chain of evidence: the performance of the diagnostic test; its effect on patient management; and what it does to health outcomes. What are the criteria for a positive test? Does it permit the avoidance of other tests, or invasive procedures? Does it detect a treatable condition earlier?

It is vital when using this kind of indirect framework of evidence to consider separately different patient indications. MRI of the breast differs depending on the indication, the kind of patient, the situation, and what it is being compared to. Is it a replacement for mammography, specifically for screening high risk women, or an adjunct decision aid for biopsy in women with positive mammograms? These are different questions and require very different diagnostic performance of the test. So, the clinical context is critical.

Also, in terms of the effect on patient management, when a non-invasive test is replacing an invasive test or procedure, that represents an obvious advantage. That is an easier technology assessment question than thinking about the ultimate effect on mortality.

I’ll just move on to a couple of examples. I mentioned earlier MRI of the breast in the screening setting. We have looked at this from the technology assessment point of view (http://www.bcbs.com/tec/Vol18/18_15.pdf). For women at high genetic risk, there have been studies comparing MRI and screen-film mammography (Kriege et al., 2004). In the specific population with higher than average breast density, conventional mammography is not as sensitive. Specificity is a little bit more of a tossup. But if the screening test is positive, there will be a biopsy or further workup; if the test is negative, screening will continue. The trade-off is the benefit to the true positives of earlier detection against the harms of false positives, unnecessary biopsies, and delayed diagnosis. About 6 percent of the women in a high-risk population have breast cancer. About four additional cancers will be detected for every seven additional unnecessary biopsies. That is the sensitivity tradeoff, the risk-benefit equation that is part of technology assessment.
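
The "four additional cancers for every seven additional unnecessary biopsies" figure can be restated as the share of the extra biopsies that find a cancer. The short Python sketch below does that arithmetic. It assumes, as a simplification not stated in the talk, that each additional cancer detected corresponds to exactly one biopsy; the function name is illustrative only.

```python
from fractions import Fraction


def added_biopsy_yield(extra_cancers: int, extra_unnecessary_biopsies: int) -> Fraction:
    """Share of the additional biopsies that find a cancer.

    Assumes one biopsy per additional cancer detected (a simplification).
    """
    total_extra_biopsies = extra_cancers + extra_unnecessary_biopsies
    return Fraction(extra_cancers, total_extra_biopsies)


# Figures quoted in the talk: 4 additional cancers per 7 unnecessary biopsies.
share = added_biopsy_yield(4, 7)
print(f"{share} of the additional biopsies find a cancer (about {float(share):.0%})")
```

Whether that yield is acceptable is the judgment the technology assessment has to make; the arithmetic only makes the trade-off explicit.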

In the high risk population, where the prevalence is high, there will be a relatively high number of false positives. Dr. Sullivan mentioned earlier that sensitivity and specificity are not the important parameters, and I agree. When comparing one test against an alternative possible replacement test, you examine the ROC curves. But particularly if you are looking for the add-on value to current management of a test, you need to understand how frequently the test is going to be called positive. What is your operating point on the ROC curve if you are going to use decision analysis modeling to figure how a positive compared to a negative test affects management, affects outcomes? So there is a little bit of that bind when we are stuck using some summary estimates of sensitivity and specificity.

The next example is positron emission tomography or PET (http://www.bcbs.com/tec/Vol18/18_14.pdf). In a woman with a positive mammogram or clinical exam who is told she needs a biopsy, can some unnecessary biopsies be avoided by using PET? A negative PET scan will spare the woman a biopsy; a positive scan will lead to biopsy. So the balance on health outcome is between the harm of delayed diagnosis versus the benefit of avoiding an unnecessary biopsy. The specificity and sensitivity are not bad, but in the populations that generated these results, there was actually a 50 percent prevalence of cancer. In such a population, there is actually a 12 percent risk of a false negative scan. So that is the way the technology assessment equation plays out in that case.

I thought these illustrations would shed some light on technology assessment in our hands. For anyone who is interested in learning more about specific technology assessments and the kind of things we do, we have a website which you can visit at http://www.bcbs.com/tec.
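
The PET example turns on how prevalence changes the meaning of a negative result. The Python sketch below applies Bayes' rule to compute the probability that a woman with a negative scan nevertheless has cancer. The talk gives the prevalence (50 percent) and the resulting risk (about 12 percent) but not the sensitivity and specificity behind it, so the values used here (0.89 and 0.80) are hypothetical, picked only because they land near the quoted figure.

```python
def false_negative_risk(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability of cancer given a negative test, i.e. 1 minus the NPV."""
    missed_cancers = prevalence * (1 - sensitivity)        # cancer, test negative
    correctly_cleared = (1 - prevalence) * specificity     # no cancer, test negative
    return missed_cancers / (missed_cancers + correctly_cleared)


# Hypothetical test characteristics; only the 50 percent prevalence is from the talk.
SENSITIVITY = 0.89
SPECIFICITY = 0.80

for prevalence in (0.50, 0.20, 0.06):
    risk = false_negative_risk(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: risk of cancer after a negative scan {risk:.1%}")
```

With the same assumed test characteristics, the risk carried by a negative scan falls as prevalence falls, which is why the indication and the population have to be fixed before a test's accuracy figures can be interpreted.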
ERIC BAUGH, M.D., Senior Vice President, Medical Affairs, Care First Blue Cross and Blue Shield: I will discuss how we go from the technology assessment that Dr. Flamm described to coverage decisions. At Care First Blue Cross and Blue Shield we serve approximately two and a half million members in Maryland, Washington, D.C., and Delaware. We formed our own coverage policy using a variety of informational resources. A technology assessment from our Technology Evaluation Center is only one of them. The evidence for our medical policy is also reviewed by a committee of community physicians, academic experts, and plan staff. Our community is sophisticated. We have people at Johns Hopkins, the University of Maryland, George Washington, and Georgetown Hospital Center who will participate at some level in our coverage decisions.

FIGURE 3.8 Care First Medical Policy.

Care First, like all Blue Cross and Blue Shield plans, is an independent company and determines its own policies. Figure 3.8 illustrates our medical policy, which is the foundation for everything we do. Medical policy is a proven plan or course of action or guiding principles affecting community standards of diagnosis and treatment. As you can see, technology assessment, FDA approval, pharmacy and therapeutics committees, community standards of care, and the medical literature all go into helping formulate medical policy. This then determines quality of medical care, which is defined as the right care at the right time in the right setting at the right cost. When we reach the stage of building a set of benefits, that is, contractual services provided to implement medical policy that people can buy at a reasonable cost, cost enters into the decision on coverage. Then, of course, we have utilization management. All of these filter through our claims adjudication policy as to whether or not we are going to pay for something and how much. Medical policy development must fit contractual definitions and employ an objective standard of review and process for considering and reaching decisions.

FIGURE 3.9 Coverage criteria.

We use the TEC criteria described in Figure 3.9 for determining if a new technology provides net health benefits at least as great as the best available alternative by objective evidence in peer-reviewed literature. We also use a Hayes report (http://www.hayesinc.com/) that tracks new and emerging health technologies and gives us impact utilization and cost data. I use the term adjunctive for new technologies for breast cancer detection evaluated to date that provide no clinical benefit when compared to mammography or biopsy, or a small benefit for a limited subset of the population when added to mammography. They do not substitute for existing technologies, but may add to the benefit of existing technologies for certain patients. For a new technology of this type, Care First will develop a medical or coverage policy that clearly defines for which patients and indications the technology is available. An example is MRI to investigate a woman with a positive lymph node and negative mammogram. For coverage, we must be able to verify adherence to the policy definition.

There are a number of mechanisms to implement a policy of this type. The first mechanism is prior authorization. We could require that MRI of the breast be prior authorized, and specify the documentation required before approval can be granted. This information will be reviewed by a reviewer. If the reviewer feels that the criteria have not been met, the case will be referred to a physician reviewer. Only physicians have the authority to deny coverage. But prior authorization programs are burdensome and unpopular. We limit the number of services that require authorization and prefer to use other mechanisms to implement medical policies for very specific indications.

Service claims editing is a second mechanism that could be employed. Certain CPT or ICD-9 codes may be selected for review, and the claims will be separated out after service. Certain medical information will be requested from physicians to document that the indications in the policy are met. This gets back to the whole concept of evidence-based medicine and the use of protocols: how was this supposed to be used, and was the doctor applying those protocols appropriately? This is burdensome to the clinician, member, and plan, and plans need to consider the time and cost of this as with other coverage restrictions.

A retrospective review after payment is a third approach to implement medical policy of this type. This gets painful if we decide that the criteria have not been met, and we ask for our money back. We look at claims experience to gauge the appropriateness of this mechanism, and apply it typically when the volume of claims is small and with limited indications. The review information will typically be used to educate participating physicians about the policy and about guidelines and protocols.

New technologies are frequently more expensive than existing technologies. PET and MRI imaging are clearly more complex than mammography. Cost is not considered in the technology assessment, but it may be a factor in formulating the coverage and payment policies, which set the coding and payment rules, the frequency limits, and payment level.
Health plans need to establish payment policies when there are no existing rates, for example, for new technologies or new applications of existing technologies. The payment level may be set based on the cost of the device and the operating costs. Payors may attempt to establish a rate based on price or cost of a comparable technology, or payors may attempt to reimburse certain new technologies or drugs at the same rate as existing technologies that provide comparable clinical benefit for the condition in question. These approaches try to link price with value. They are not very popular with the manufacturers or providers of these services and could have the effect of retarding the dissemination of the technology in question.

I have tried to identify some of the selection criteria for those things that we use to establish coverage policy. These are listed in Figure 3.9. As the report for this meeting documents, FDA approval does not assure that a technology provides clinical benefit or utility. In fact, only ten percent of the new devices and tests that make it to the market have undergone trials to establish safety and effectiveness, because most are cleared by the FDA through the 510(k) process. It is also known that many payors will not cover a new technology that cannot demonstrate an improvement in health outcomes at least as great as the available alternatives.

One way to promote development of data on clinical utility after FDA approval would be for payors to provide coverage during clinical trials. Contingent on completion, such trials could establish utility for coverage. Payment only within these trials will assure their completion. In Maryland, health plans are mandated to cover patient care costs of clinical trials involving serious or life-threatening conditions. In effect, this mandate requires contingency coverage. However, these trials are not limited to those establishing clinical value. It is important, therefore, for the health plans, other payors, and employers to interact with the research community to communicate the importance of trials that measure the impact on net health outcomes. If the trials demonstrate that the technology in question does not improve net health outcomes or is no better than conventional treatment with more side effects, coverage will not be continued once the trial is concluded. This was our experience with autologous bone marrow transplantation for metastatic or advanced breast cancer. Coverage contingent on conduct of a trial (that is, not limited entirely to the trials) dissipates the impetus to field trials, and once provided, runs the risk of alienating members and clinicians when it must be withdrawn downstream.

DIANA PETITTI, M.D., M.P.H., Director, Research and Evaluation, Kaiser Permanente, Southern California: I am also going to speak from the decision-making payor point of view, but with the added perspective of a population and an organized system. In the United States we don’t have an overall health care system; we have a non-system.
I am fortunate to work within a system which defines its responsibility in terms of providing health care coverage and maximizing health insurance investment for the improvement of the health of its population. The population served by our system comprises the 3.1 million members of the Southern California health plan, a population that is larger in size than that of New Zealand, many other countries, and 35 states. For breast cancer, the population health perspective means that our investment of premium dollars must improve early detection for the whole population of members. Within this population health context, I am going to specifically talk about our technology assessment of CAD and our decision not to adopt and deploy this technology.

The goal for our population is to improve detection of breast cancer. We must weigh what we might spend on CAD against alternative investments of our resources. Increasing the number of women in our population who are eligible to be screened and who actually have the test is the first competing alternative investment. Even in our system, with the ability to deploy resources and outreach to our members, we have a rate of breast cancer screening in the 50- to 72-year age group of only 80 percent according to our reports in the Health Plan Employer Data and Information Set (HEDIS). We consider this screening rate unacceptable given benchmarks from our other health plans of upwards of 93 percent; we believe we could attain such rates if we deployed our resources appropriately. So we thought through the CAD decision against a backdrop of an overall screening rate of 80 percent, and a recognition that our first priority in this system would have to be to improve this rate.

However, that was not the only consideration. Within a set of possible new technologies or ways of improving performance of existing technologies, there are a number of competing approaches. Even among screened women, we have incomplete sensitivity and imperfect specificity and high false positive (suspicious and not cancer) rates, which is typical of the technology. How can we change our system and technology to improve performance of the test for the women being screened? The first way is to change the way we organize our existing services. In the IOM report, there is a case study which describes how this was accomplished in the Colorado Kaiser plan (see also Adcock, 2004). Replicating the Colorado model in our 11 facilities in Southern California is our main focus.

Now we can look carefully at CAD in the context of other alternatives. To begin our assessment, we supplemented the Blue Cross and Blue Shield evidence assessment. On reviewing this evidence, we concluded first of all that use of CAD was really not better than an experienced radiologist in terms of sensitivity. We felt that we had a pressing need to get experienced radiologists, or train a few radiologists so that they had high levels of experience, as had been achieved by the organizational changes in Colorado. Secondly, we concluded that the evidence about the effect of CAD on callback rates in populations similar to ours and in similar broad screening efforts was poor.
Evaluations of CAD had mostly been done in highly specialized centers against specially constructed test sets. Such evaluations did not give us a very good idea of what our callback rates would be. This is important information for us because of the possible burden imposed on our system, already stressed by existing service demands of about 80,000 women age-eligible for mammography and performance of about 290,000 screening mammograms each year.

And finally, at the time that we were considering CAD, we were in the process of rebuilding a number of our hospitals due to the seismic safety standards in the state of California. We were cognizant of the fact that imaging technology is moving in the direction of digital imaging, and that we would likely need to replace any CAD devices that we invested in somewhere between two and five years later. All in all, this did not seem to us a good use of resources compared to investments to increase our screening rates and better organize our services.

This kind of assessment and decision-making exemplifies why it is a privilege to work within a system. I believe it also probably represents the kind of thinking that is typical of some of the countries that have been discussed today as models. In such settings, there is a recognition that resources available to spend on any health service are fixed, and that the responsibility of decision-makers is to maximize the deployment of those resources for the benefit of the populations needing that health care service. In his presentation this morning, Dr. Tunis also suggested that we all think comprehensively in these terms about the broad clinical, economic, and social value of new or added technologies.

DR. NORTON: We heard two different views on the role of payors in doing research in imaging technology development. There is existing evidence in existing trials, or you try to get information out of ongoing trials.

DR. PETITTI: We would participate in the trial if we had that opportunity, and if it could be done for the same price as the existing service, or someone else was going to help foot the bill. I was encouraged by what Dr. Tunis said this morning. The amount of money going into direct head-to-head comparison, or even the trials for imaging, has been incredibly limited because the big payor is CMS, and we have a limited ability to mount them on our own.

DR. NORTON: But on the other hand, going back to autologous bone marrow transplantation for breast cancer, the Blues were paying for transplants for ten years. It took us that long to find out it did not work, and the Blues could stop paying for it. Had they supported trials, we would have gotten out in two years.

DR. BAUGH: It was not that we wanted to go in and pay for this intervention. We were mandated to pay by the courts of the United States. We got the cart before the horse.

DR. NORTON: But you wouldn’t pay for participation in clinical trials.

DR. BAUGH: We would have paid for clinical trials had that been an option, but at the time, it was mandated in the courts that we pay without benefit of a clinical trial.

DR. NORTON: We could go on about that, but the point is, in terms of costs, there may be more cost-effective technology out there and we may continue to pay for technology that isn’t as good.

DR. PETITTI: I think we are agreeing that someone should pay. The question is, what pocket does it come out of? So it is not so much a matter of Blue Cross, or Kaiser, or CMS paying; it is that society bears the burden of the inefficiencies that are created by using ineffective or unproven technologies or even multiple, layered technologies, as seems to be happening in imaging. We need to find out how we can pay to get the evidence to sort this out.

DR. BOHMER: Or making the decision to use technologies in the absence of an evaluation of the system-wide impact of those decisions and the system-wide resources that those decisions imply. To some extent, most payors are obliged to make one-by-one decisions in a way that Kaiser, because it combines payor and provider, can avoid and is better off for it.

DR. BAUGH: I agree. I think they are better off. I think the kind of evidence we are talking about needs to be gathered and paid for. People will look to find the money. We have some of the responsibility in the state of Maryland at least. When it first happened, we looked at it with mixed emotions, but I think at this point we are ready to step up to our portion.

DR. NORTON: The argument for reimbursement of the patient care costs in clinical trials is that it is cheaper in the long run for everybody, and better for patients.

DR. BAUGH: But I think it is not up to a single payor. This is something that has to be across the board and shared.

DR. PENHOET: The committee heard some strange ideas about very large trials during the course of our work. There have to be some controls if we are going to expect other people to pay for studies. And are we talking primarily about what Dr. Tunis this morning called practical clinical trials, trials focused on evidence that will help improve practice?

PARTICIPANT: And would you centralize decision-making so you have a public/private collective that could evaluate at the proof of concept point? Dr. Vosburgh, you were talking about technologies that could be out there, that could replace something that exists today, how the evidence could be collected and they could be brought to the marketplace by working with a large enough collective.

DR. PETITTI: At least from our point of view, we looked to the NIH, and maybe they become the clearinghouse, given these public-private partnerships. They have enormous credibility in the kinds of trials and the decision-making process. For example, the ALLHAT trial described in our report was an NIH trial where $20 million came from the pharmaceutical companies to pay for it.
The fact that it came to us as an NIH trial with all the oversight and the integrity that implies made us willing to participate even though, I can tell you, we lost a ton of money. I have documented to the NHLBI how much money we lost in participating in that trial in the short run.

DR. BOHMER: It is an investment.

DR. PENHOET: I think the NIH review mechanisms are pretty good at sorting out the bad ideas. The issue left hanging in the air is whether AHRQ is a hindrance to further progress in this field or a help. It is possible that before we look at this again, we might think about refolding it back to the NIH. It is very hard given the current situation to see how it is going to work otherwise. The existing agency today, NIH, is clearly in the best position.

DR. PETITTI: I am thinking of the CT colonography trial. We would be willing to be in that trial, but it will be competitive. There will be more competent sites interested in participating than can be accommodated. That often happens with the really good NIH trials. Why would we want to be part of the trial? First of all, we get the information early. We are able to tell our members that we are looking at it. We have the satisfaction of making a public contribution, but we would probably lose money on that trial, too.

DR. VOSBURGH: Importantly, I think the NIH is taking a broader view of funding that has marketplace implications rather than scientific implications by broadening out the review process and the participation of different communities. That is essential here. It is not just a matter of clinical efficacy, but of the business case and of the potential to market it. So I think headway is being made there, but it is something that will bear attention as you move forward.

DR. PENHOET: Dr. Hanash, you never did give us your prediction of the date your proteomic marker will come to market.

DR. HANASH: Personally, I think there are multiple strategies which have merit, so which one would pan out remains to be determined. Some investment is needed for early discovery and validation to determine which markers are the winners. I think it would be very premature, at this point, to predict which particular one. In the end, I think there is going to be a continuum whereby we could start with something cheaper than a mammogram and apply progressively more expensive testing to subpopulations to confirm a diagnosis.

DR. NORTON: What is the best way to collect serum proteins for proteomic analysis?

DR. HANASH: To some extent it depends on what type of marker you are after, but in terms of representing what is circulating in the blood, it is clear that plasma is best.
When you subject blood to clotting, you burst a lot of cells and you activate many different subcellular systems. So what you are seeing in serum may not represent what is normally in the circulation. Plasma is a cleaner preparation because avoiding the clotting process eliminates a lot of the proteins from burst cells that you see in serum.

DR. NORTON: I am thinking about the implications of having the most informative samples. I am talking about my experiences in trials. Historically, we did not collect tissue from the tumor over the many years we were doing trials, so that, as we developed therapies that worked, we did not have samples that might identify the responsive subset of patients. Now we have the molecular technologies for classification by gene expression and gene copying and various other things that we can measure. It seems obvious to me that at some point we are also going to have protein patterns which may be informative. If we do not collect specimens prospectively during imaging or other trials, 5 years from now we will not have the opportunity to look back and identify the various subsets of patients in which those technologies were effective. We will have missed a real golden opportunity, don’t you think?

DR. HANASH: Absolutely. There is still a complete disconnect between clinical trials and molecular approaches to cancer biology. We must somehow deal with that. The cooperative trial groups do not seem very adept at designing molecular components into their studies. We have been looking for support for that without much success. At the moment, it seems that the trials are aimed only at finding out if the drug does or does not work.

DR. NORTON: We all know it is critically important. We can’t get agreement on who is supposed to pay for it. Therefore, it is a question of doing something that you can afford.

DR. HANASH: We still need to figure out how to synergize molecular approaches to tumor profiling or serum profiling with cooperative trials. The NCI is very interested in having another workshop like this one to deal with that specific issue. Many challenges remain.

DR. PENHOET: I think it is worth pointing out that it is almost inevitable that in this case, as in many others, the screening test will evolve from the diagnostic test. Most of the money is going into paying for therapeutic trials and not for screening trials. Now we are finding genetic markers that predict therapies, so I think your point is well taken that money invested in clinical trial diagnostics is not money wasted in terms of eventual screening techniques.

DR. VOSBURGH: I had somewhat the same thought as Dr. Norton, but from a different perspective. This came up in early discussions of the committee and may be in our report somewhere. There is a significant role and perhaps some advocacy for the education of patients to support the acquisition of blood or tissue samples so that we can build these longitudinal databases and then go back and validate new technologies as they are developed.
This is something that people can do now for the long-term advancement of detection of disease. There is a call for action here that we probably haven’t emphasized as much because there are so many other good things in the report.

DR. PENHOET: It is possible that a gene chip, if you have a candidate number of genes (you would still need a few hundred), could be very inexpensive to run, or proteomics, if you only have a dozen markers or so.

DR. HANASH: We should be careful not to embellish this. That creates disappointment later. This is really a very slow, painful process of an incremental nature. There is not going to be a revolution overnight, where you wake up and mammography has been replaced by a 100 percent sensitive and specific test. It is incremental and very tedious.