Workshop Summary
Pages 1-68


From page 1...
... . The consensus committee found that while the technical quality of mammography had improved since MQSA implementation, mammography interpretation remained quite variable, and that this variability limited the full potential of mammography to reduce breast cancer mortality by detecting breast cancers at an early stage.
From page 2...
... , and patient advocacy organizations, discussed potential options for action to improve the quality of mammography interpretation. Topics discussed included

• challenges in the delivery of high-quality mammography, including a lack of mammography specialists and geographic variability in patient access to mammography;
• the impact of training and experience on interpretive performance;
• how best to measure interpretive performance and identify radiologists and facilities that could benefit from interventions to improve performance;
• various tools and interventions that could potentially be used to improve interpretation skills, such as self-tests, audits with feedback, and mentoring; and
• the impact of new technologies and supplemental imaging on interpretation of breast screening and diagnostic images.
From page 3...
... • Devise measures that take into account biopsy capture rates and the bias that a lack of capture will have on cancer detection rates and other metrics. (Diana Miglioretti)
From page 4...
... • Undertake research and development on test sets, including identifying ranges for specialized test sets and best strategies for communicating results in a standardized fashion, and validating the value of test sets. (Smith)
From page 5...
... (D'Orsi) • The American Board of Radiology could specify that test sets are acceptable activities for Maintenance of Certification for radiologists.
From page 6...
... have been archived online.

HISTORY OF MAMMOGRAPHY OVERSIGHT AND EFFORTS TO IMPROVE QUALITY

Breast imaging has been the focus of many debates over several decades, noted Diana Buist, senior scientific investigator for Group Health Breast Cancer Surveillance at the Group Health Research Institute, in her opening remarks at the workshop. She said most of the debate has centered on when we should start screening, how often we should screen, and when we should stop.
From page 7...
... In addition, an IOM committee of experts was formed to determine how to improve mammography quality further, in preparation for the anticipated 2007 reauthorization of MQSA. The IOM report that stemmed from this committee's deliberations was published in 2005 and, as reported by Etta Pisano, Dean Emerita and Distinguished University Professor, Medical University of South Carolina, made several recommendations related to improving mammography interpretation, including to

• revise and standardize the MQSA-required medical audit;
• facilitate a voluntary advanced medical audit with feedback;
• designate specialized Breast Imaging Centers of Excellence that undertake demonstration projects and evaluations; and
• study the effects of continuing medical education, reader volume, double reading, and computer-aided diagnosis (CAD)
From page 8...
... A number of changes were made to the medical audit with the publication of the final regulations in 1998, including defining a positive mammogram as one assessed as suspicious or highly suggestive of malignancy (category 4 or 5 on the Breast Imaging Reporting and Data System [BI-RADS]; see Table 1)
From page 9...
... , the cancer detection rate per 1,000 women, and a measurement of the abnormal interpretation rate, that is, those readings that lead to additional imaging or biopsy (recalls for additional imaging rate plus overall biopsy numbers)
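These audit measures are simple ratios of counts from a practice's audit data. As a sketch only, with hypothetical counts (none of these numbers come from the workshop), they might be computed like this:

```python
# Hypothetical audit counts for one radiologist's screening year.
exams = 2_400          # screening mammograms interpreted
recalls = 230          # exams recalled for additional imaging
biopsies = 28          # exams leading to biopsy
cancers_detected = 9   # screen-detected cancers

# Cancer detection rate is conventionally reported per 1,000 women screened.
cancer_detection_rate = cancers_detected / exams * 1_000

# Abnormal interpretation rate: readings that lead to additional imaging
# or biopsy (recalls for additional imaging plus overall biopsy numbers).
abnormal_interpretation_rate = (recalls + biopsies) / exams

print(f"Cancer detection rate: {cancer_detection_rate:.2f} per 1,000")
print(f"Abnormal interpretation rate: {abnormal_interpretation_rate:.1%}")
```

With these illustrative counts the detection rate is 3.75 per 1,000 and the abnormal interpretation rate is just under 11 percent.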
From page 10...
... . Helen Barr, Director, Division of Mammography Quality Standards, FDA, pointed out that shortly after the IOM report was published in 2005, she and her former colleague Charles Finder drafted amended regulations to address some of the IOM recommendations, particularly with regard to those focused on further enhancing the medical audit.
From page 11...
... The NMD also collects exam information, including the date of the exam, physician and facility identifying codes, breast density, assessment category, and management recommendation. In addition, data are included on outcomes, such as biopsy procedure date and result, and for breast cancers detected, tumor size, nodal status, and tumor stage.
From page 12...
... "If you belong to the NMD, you get your stats compared to the rest of the people in the program and compared to the BCSC data," said Barbara Monsees, Emeritus Chief, Breast Imaging Section, Washington University School of Medicine. But unlike the BCSC, the NMD is currently not linked to tumor registries, so there is no way to calculate sensitivity and specificity rates from its data, Monsees noted.
From page 13...
... Although no federal action was taken in response to this recommendation, in 2007 ACR started its own Breast Imaging Centers of Excellence program to complement the Breast Cancer Centers of Excellence program. There are currently more than 1,200 such breast imaging centers scattered throughout the country, as shown in Figure 2. Wyoming is the only state that lacks a Breast Imaging Center of Excellence.
From page 14...
... This collection of difficult as well as easy cases also can be used to satisfy volume requirements. As Pisano noted, "This test set is more valuable than reading 480 consecutive cases at a breast imaging center because with the latter, you might only see one or two cancers, but the test cases include cancers, and all the abnormal cases are pathology proven, whereas if you're in a practice all you have is the opinion of the person sitting next to you about whether the case is positive or not, so you rise to the level of that person training you as opposed to the known truth." Pisano pointed out that the aim of the boot camp program is to improve radiologists' interpretive skills, but it is not intended to be used as a screening tool to assess which physicians should improve their performance.
From page 15...
... Citing both prospective and retrospective studies, Monsees said research indicates that this technology can improve cancer detection rate, decrease recall rates, and improve screening performance in all but fatty breasts (Ciatto et al.,
From page 16...
... screen for a breast cancer.
From page 17...
... Matthew Wallis, director of the Cambridge and Huntington Breast Screening Service, added that "If you are recalling, you've got to be able to see and cope with the tears of distress that are associated with the damage you cause when you write to women saying 'please come back.' Anybody who says recall is not a stressful process does not live in my world." Ganz also stressed that, "The risks of misdiagnosis or a harm from an overdiagnosis or recall are huge at the population level. It is going to cost the health care system a lot." Smith agreed and said, "If you can improve quality and reduce avoidable recalls, you're going to save money." He added that enormous societal costs are avoided if breast cancer is identified and treated early, including the costs of job absenteeism due to cancer treatment or while caring for a relative with advanced cancer, as well as the costs of disability payments and the loss of a valued employee.
From page 18...
... ACR data indicate that only 647 radiologists devote themselves solely to breast imaging, compared to more than 9,000 radiologists who report spending some time reading mammograms, and the more than 7,000 who report spending some time doing breast imaging (ACR, unpublished data)
From page 19...
... The study analyzed data from the Breast Cancer Surveillance Consortium (BCSC)
From page 20...
... in screening mammograms, radiologists who do not have large practices may not accrue adequate experience detecting breast tumors. The volume of mammograms read by radiologists correlates with their interpretive performance, several studies indicate.
From page 21...
... radiologists, especially those in rural practices, to meet such high volume requirements. Legal Challenges Monsees noted that performance expectations are high for radiologists interpreting mammograms in the United States, and that radiologists frequently are sued for malpractice if they fail to identify a cancer in a mammogram.
From page 22...
... Onega cited a study that found that half of breast imaging facilities took nearly a decade to make the transition to digital mammography (Miglioretti et al., 2009) and noted that MRI breast imaging is also slowly diffusing into clinical practice.
From page 23...
... For example, at the site in the city serving an underresourced population, the recall rate is 16 percent and the cancer detection rate is 8 per 1,000 women. But when the same group of radiologists reads at the suburban site, the recall rate is 9 percent and the cancer detection rate is 3 per 1,000 women.
From page 24...
... "It is possible for a radiologist to be board certified without testing in the breast imaging module," Monticciolo noted, although the noninterpretive skills section contains topics pertinent to breast imaging, such as breast screening, recall rates, and radiation safety. "So even if you are not opting for a breast module, you're still going to see breast imaging questions on the certifying exam," Monticciolo stressed.
From page 25...
... As part of their education requirements, technologists must also participate in performance evaluation and recording of quality control tests, and review at least 10 mammogram exams with an MQSA-qualified interpreting physician who evaluates their technique and positioning, and assesses their knowledge of breast anatomy and pathology. The technologist also has to pass an exam that assesses the knowledge and skills typically required of entry-level mammography technologists.
From page 26...
... For example, some screening programs include DCIS in their cancer detection rates or in their percentage of minimal cancers detected rates, while others only count invasive cancers. In addition, some audits lump together diagnostic and screening mammograms when assessing performance while others separate them.
From page 27...
... For example, as the false-positive rate decreases, specificity rises, but the false-negative rate tends to rise as well. To help make sense of this, D'Orsi presented an analogy in which the false-positive rate can be considered the "money paid" to detect breast cancer, and the cancer detection rate is "how much stuff, breast cancers detected, you bought with that money." The minimal cancer detection rate is a measure of "what kind of stuff you bought with that money," he added, continuing the analogy.
From page 28...
... . Confidence intervals around these measures are often wide due to the small volume of mammograms read by most radiologists and the rarity of breast cancer, she added.
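The low-volume point can be made concrete with a binomial confidence interval. A sketch, using the Wilson score interval and a hypothetical low-volume reader (4 cancers detected in 1,000 screens); the numbers are illustrative, not from the workshop:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical low-volume reader: 4 cancers detected in 1,000 screens.
lo, hi = wilson_interval(4, 1_000)
print(f"Detection rate 95% CI: {lo * 1000:.1f} to {hi * 1000:.1f} per 1,000")
```

With these numbers the interval runs from roughly 1.6 to 10.2 per 1,000, so a reader's observed detection rate at this volume cannot reliably be distinguished from common benchmark values.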
From page 29...
... Other Factors That Can Affect Reader Metrics A few participants suggested considering patient demographic characteristics when conducting audits and adjusting them accordingly. Cancer detection rates will vary depending on the age of the population being screened and how frequently they are screened, and other measures will vary according to genetic, ethnic, or sociodemographic characteristics of the population, as noted by Wallis and Onega.
From page 30...
... This session explored possible criteria and cut-points for low performance, challenges in using those cut-offs for quality assurance purposes, and ways to measure facility versus radiologist interpretive performance. Carney began this session by noting the significant variability of the interpretive acumen of radiologists in mammography, with their sensitivity varying between 75 and 95 percent and the specificity ranging between 83 and 98.5 percent (IOM, 2005)
From page 31...
... For work-up of a breast lump, an additional 335 cancers would be diagnosed per 100,000 women, with a reduction in the number of false-positive examinations of 634 per 100,000 women. Carney noted that the normative data used to determine cut-points were based on at least 30 cancer interpretations for sensitivity and 1,000 interpretations for the other performance measures, but these numbers may be too

TABLE 3  Final Cut-Points for Screening Mammography Using the Angoff Method

Measure                  Low Performance Range   Percentage of BCSC Radiologists in Low Performance Range
Sensitivity              <75                     18.0%
Specificity              <88 or >95              47.7%
Recall rate              <5 or >12               49.1%
PPV1                     <3 or >8                38.4%
PPV2                     <20 or >40              34.0%
Cancer detection rate    <2.5/1,000              28.4%

NOTE: BCSC = Breast Cancer Surveillance Consortium; PPV = positive predictive value.
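The screening cut-points in Table 3 amount to simple range checks on each measure. A minimal sketch of such a check, using the table's ranges; the sample reader's metrics are hypothetical:

```python
# Low-performance ranges from Table 3 (screening mammography, Angoff method).
# Each entry: (lower acceptable bound, upper acceptable bound); None = unbounded.
CUT_POINTS = {
    "sensitivity": (75, None),             # flag if <75
    "specificity": (88, 95),               # flag if <88 or >95
    "recall_rate": (5, 12),                # flag if <5 or >12
    "ppv1": (3, 8),
    "ppv2": (20, 40),
    "cancer_detection_rate": (2.5, None),  # per 1,000; flag if <2.5
}

def flag_low_performance(metrics: dict) -> list:
    """Return the names of metrics falling in a low-performance range."""
    flagged = []
    for name, value in metrics.items():
        lo, hi = CUT_POINTS[name]
        if (lo is not None and value < lo) or (hi is not None and value > hi):
            flagged.append(name)
    return flagged

# Hypothetical reader with too-high recall and too-low detection.
flags = flag_low_performance({
    "sensitivity": 80, "specificity": 96, "recall_rate": 13,
    "ppv1": 4, "ppv2": 25, "cancer_detection_rate": 2.0,
})
print(flags)  # → ['specificity', 'recall_rate', 'cancer_detection_rate']
```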
From page 32...
...

TABLE 4b  Final Cut-Points for Diagnostic Mammography to Work Up a Breast Lump Using the Angoff Method

Measure                  Low Performance Range   Percentage of BCSC Radiologists in Low Performance Range
Sensitivity              <85                     31.6%
Specificity              <83 or >95              24.0%
Recall rate              <10 or >25              20.5%
PPV2                     <25 or >50              32.3%
PPV3                     <30 or >55              46.3%
Cancer diagnosis rate    <40/1,000               19.7%

NOTES: BCSC = Breast Cancer Surveillance Consortium; PPV = positive predictive value.
SOURCES: Carney presentation, May 12, 2015; Carney et al., 2010.
From page 33...
... For these combined criteria, a broader range of recall rates was allowed for radiologists with higher cancer detection rates. The percentage of radiologists who met these combined criteria was 62 percent, compared to 40 percent of radiologists who met the original cut-points prior to the combined analysis.
From page 34...
... Hubbard conducted simulations of her binary approach using representative BCSC data and Medicare claims data and the guideline threshold cancer detection rate of 2.5 per 1,000 women and a recall rate of 12 percent. She found that her binary approach in both types of simulations worked well for the recall rate criteria because recalls are relatively common, but was not as precise for the cancer detection rate, which is based on a rarer outcome.
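The statistical intuition behind this finding can be illustrated with a toy simulation (this is an illustration of the general point about common versus rare outcomes, not Hubbard's actual method or data): at the same exam volume, an observed recall rate is far less noisy relative to its true value than an observed cancer detection rate.

```python
import random

random.seed(0)
N_EXAMS = 2_000    # exams per simulated radiologist
N_SIMS = 1_000     # number of simulated radiologists
TRUE_RECALL = 0.12   # guideline recall threshold, 12 percent
TRUE_CDR = 0.0025    # guideline detection threshold, 2.5 per 1,000

def observed_rates(p):
    """Observed event rate for each simulated radiologist with true rate p."""
    return [sum(random.random() < p for _ in range(N_EXAMS)) / N_EXAMS
            for _ in range(N_SIMS)]

relative_sd = {}
for label, p in [("recall", TRUE_RECALL), ("detection", TRUE_CDR)]:
    rates = observed_rates(p)
    mean = sum(rates) / N_SIMS
    sd = (sum((r - mean) ** 2 for r in rates) / N_SIMS) ** 0.5
    relative_sd[label] = sd / mean  # relative noise in the observed rate
    print(f"{label} rate: relative SD ~ {relative_sd[label]:.0%}")
```

The recall rate's relative noise comes out around 6 percent of its true value, while the detection rate's is several times larger, which is why a binary pass/fail classification works better for recall than for detection at realistic volumes.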
From page 35...
... She also stressed that just because recall rate can be estimated well and is more reliable from a statistical standpoint does not mean it is a better tool to measure performance. Wallis noted that Miglioretti's simulation to determine adequate performers is based on cancer detection rate data from radiologists performing a minimum of nearly 3,000 mammograms.
From page 36...
... Carney suggested coaching or mentoring individuals who don't meet performance criteria. Pisano responded that such coaching tends to occur in large academic practices.
From page 37...
... Many test sets provide immediate feedback and detail how the radiologist's interpretation differs from those of experts. Test sets also provide an opportunity to set reference standards, Broeders and Smith pointed out, and to measure performance against that standard or to assess performance on new imaging technology.
From page 38...
... Monsees reiterated that it is easy for radiologists to obtain their recall rates, even if they operate at low volume, whereas it is more difficult to assess cancer detection rates. Thus, she suggested that radiologists could use their audit data to determine their specificity and rely on test sets more to determine their sensitivity and detection rate.
From page 39...
... Test Sets for Quality Assurance

Broeders and Smith reported that some test sets were developed to help ensure quality in national screening programs, including the PERFORMS test set, which debuted in the United Kingdom in 1991 along with its national mammography screening program; the BREAST test set, used in Australia and New Zealand; a test set used to qualify radiologists to read mammograms in British Columbia's provincial mammography screening program; and a test set developed in Italy. PERFORMS is designed to be an educational self-assessment and training program for professionals interpreting mammograms.
From page 40...
... This test set is only taken once to qualify radiologists as mammography readers when they begin working in the province's mammography screening program. Subsequently, bimonthly review of all screen-detected cancers and all interval cancers as well as an annual audit review of individual and program data are deemed sufficient to measure continuing performance and to qualify for the program.
From page 41...
... The experience with the Dutch test set revealed the importance of having location sensitivity measured, according to Broeders, because for some cases there was a large degree of variability in location interpretation, with some radiologists identifying lesions that were not in close proximity to those identified by the experts, and sometimes even identifying lesions in the opposite breast. Use of the Dutch test also underlined the need to have fixed test dates and locations to avoid technical user problems that occurred when participants downloaded the test cases to their workstations at multiple locations, Broeders reported.
From page 42...
... . In contrast, Smith noted an Australian study on BREAST indicated that test set performance correlated well with clinical performance, particularly on measures of sensitivity, recall rate, and detection rate of small cancers, but did not correlate well with specificity, probably due to differences between the settings of clinical practice and testing and because the radiologists were informed the test set was embedded with cancers (Soh et al., 2015)
From page 43...
... For example, should they be voluntary or mandatory? Monsees and Smith suggested it might be best to start with test sets being voluntary and perhaps mandating them later.
From page 44...
... The researchers compared the individual performance on test sets before and after the teaching program and found that although more radiologists made positive changes after the live seminar compared to the DVD arm, the magnitude of those changes was not statistically significant. When the DVD intervention arm was compared to a control group, sensitivity and PPV significantly improved, but not specificity.
From page 45...
... . For example, if readers with low performance scores have high recall rates combined with moderate to high cancer detection rates, their recalls are reviewed with a mentor, whereas those with low recall and low cancer detection rates are encouraged to increase their recalls.
From page 46...
... statistical measures such as sensitivity and specificity are not used because "most radiologists fell asleep in statistics classes," Wallis said. The analysis of cancer detection and recall rates can suggest radiologists who read similarly and should possibly not be paired together for double readings.
From page 47...
... NOTE: CDR = cancer detection rate. SOURCES: Wallis presentation, May 13, 2015; West Midlands Quality Assurance Reference Centre, Public Health England.
From page 48...
... Smith suggested assessing how long performance improvements last following interventions, such as test sets and mentoring. "Do people fall back into their same old reading patterns if they don't have the support of ongoing audits, reviews, and support?"
From page 49...
... The Society of Breast Imaging could also create a subgroup for young breast imagers "and provide care and feeding to that group at regular meetings," he said. Monsees concurred, noting that the Society of Breast
From page 50...
... "It's a signal-to-noise problem and we haven't focused on providing that kind of education, so it is really alien to a lot of people who have only 3 months of mammography in their training after having 3.5 years in anatomy-based imaging," he said. Smith suggested developing a special breast imaging curriculum for new radiologists entering the mammography workforce, with emphasis on the value of tracking their performance and participating in the NMD.
From page 51...
... However, a study she conducted found that radiologists who work up a large number of their own recalls tend to have better performance metrics, such as sensitivity and cancer detection rates, than those who work up fewer than 25 per year (Buist et al., 2014)
From page 52...
... These studies found that use of the trained technologists as prereaders or double readers increased cancer detection rates without significantly increasing recall or false-positive rates (Bassett et al., 1995; Haiart and Henderson, 1990; Pauli et al., 1996; van den Biggelaar et al., 2009; Wivell et al., 2003)
From page 53...
... showed the following correlations:

• Higher false positives with lower volumes (screening, diagnostic, and total volume)
• Significantly lower cancer detection with low diagnostic volume and high percentage of screening
• Significantly lower sensitivity with high percentage of screening

Buist said these data suggest that "we should consider increasing minimum interpretive volume in the United States and also include a minimum diagnostic interpretation requirement." Smith agreed, noting that new volume requirements could be addressed with amendments to MQSA regulations.
From page 54...
... Agreeing with Smith, who called the BCSC a national asset, Monsees added, "If the NMD or the BCSC are national treasures then we have to find federal funding for them." Monsees suggested considering merging the two databases and exploring whether it would be better to support the National Mammography Database and link it to tumor registries in all 50 states, or instead link each mammography facility to a tumor registry. Both options would require a significant amount of funding, she noted.
From page 55...
... She suggested devising measures that take into account biopsy capture rates and the bias that a lack of capture will have on cancer detection rates and other measures, noting a site might appear to have a very low cancer detection rate because it is not capturing a lot of the biopsy results that follow a positive exam. "There are opportunities for collaboration between the BCSC and NMD and understanding the strengths and limitations due to not having the cancer linkage," Miglioretti said.
From page 56...
... I am hopeful that we are also headed in a direction where there is that culture of evaluation of loading something up and sharing how good you are, having new indexes calculated in a meaningful way that will be much more accurate than taking a certification exam," she said. Monsees added that test sets are a good way for radiologists to see how accurately they are able to detect cancer and should be used more to assess performance.
From page 57...
... Monsees also suggested that the American Board of Radiology could specify test sets and other quality improvement and self-assessment projects as part of MOC for radiologists. Seeding Positive Mammograms in Clinical Practice Some participants suggested that seeding a radiologist's clinical caseload with images of confirmed cancers could enable radiologists to more quickly gain expertise in identifying cancers and hopefully improve their cancer detection rates.
From page 58...
... Performance-Based Incentives Smith suggested that breast imaging centers should facilitate the provision of feedback on performance to radiologists, adopt new strategies to improve performance, and reward quality performance. Referring to the RVU (relative value unit)
From page 59...
... She added that a study by Burnside also showed that when prior mammograms are available in the screening setting, the breast cancers detected are more likely to be at an earlier stage, before any spread to lymph nodes (Burnside et al., 2002)
From page 60...
... . Monsees concluded that there are limits to standard mammography and that supplemental imaging with ultrasound, MRI, or other complementary technologies could improve cancer detection, but she emphasized again that it is not yet known who should have supplemental imaging and with what types of technologies.
From page 61...
... "I don't think the population understands what false positives are, and it would be great to come up with some sort of patient education tool that was simple to understand and that every practice could provide," Geller said.
From page 62...
ASSESSING AND IMPROVING THE INTERPRETATION OF BREAST IMAGES

Federal Oversight of Breast Imaging

Barr pointed out several "gaps" in MQSA, which originally was created because of concerns about radiation dose and image quality. There was a focus on training for the technologists and medical physicists to provide a quality image and ensure proper dose, and it was assumed that a quality image would ensure quality mammography; the quality of the interpretation was not given precedence, she said.
From page 63...
... In addition, she emphasized the need to document both the short- and long-term effects of various educational opportunities, such as selectorships at Centers of Excellence, and to determine whether various CME programs and self-assessment tests improve outcomes and for how long. In closing, Buist pointed out that since the advent of MQSA, mammography has been at the forefront in medicine for assessing and ensuring quality performance, and what has been learned from that experience could be applied to other areas of medicine, including lung and colon cancer screening programs.
From page 64...
... 1998. A comparison of cancer detection rates achieved by breast cancer screening programmes by number of readers, for one and two view mammography: Results from the UK National Health Service Breast Screening Programme.
From page 65...
... 2014. Breast cancer screening using tomosynthesis in combination with digital mammography.
From page 66...
... 2008. Comparing screening mammography for early breast cancer detection in Vermont and Norway.
From page 67...
... 2007. Provider's volume and quality of breast cancer detection and treatment.
From page 68...
... 2014. Radiologist interpretive volume and breast cancer screening accuracy in a Canadian organized screening program.

