This chapter explains the committee’s definition of diagnostic error, describes the committee’s approach to measurement, and reviews the available information about the epidemiology of diagnostic error. The committee proposes five purposes for measurement: to establish the incidence and nature of the problem of diagnostic error; to determine the causes and risks of diagnostic error; to evaluate interventions; for education and training purposes; and for accountability purposes. Because diagnostic errors have been a very challenging area for measurement, the current focus of measurement efforts has been on understanding the incidence and nature of diagnostic error and determining the causes and risks of diagnostic error. The committee highlighted the way in which various measurement approaches could be applied to develop a more robust understanding of the epidemiology of diagnostic error and the reasons that these errors occur.
The Institute of Medicine (IOM) has defined quality of care as “the degree to which health services for individuals and populations increase the likelihood of desired health outcomes and are consistent with current professional knowledge” (IOM, 1990, p. 5). The IOM’s report Crossing the Quality Chasm further elaborated on high-quality care by identifying six aims of quality: “[H]ealth care should be (1) safe—avoiding injuries to patients from the care that is intended to help them; (2) effective—providing services based on scientific knowledge to all who could benefit and refraining from providing services to those not likely to benefit; (3) patient-centered—providing care that is respectful of and responsive to individual preferences, needs, and values, and ensuring that patient values guide all clinical decisions; (4) timely—reducing waits and sometimes harmful delays for both those who receive and those who give care; (5) efficient—avoiding waste, including waste of equipment, supplies, ideas, and human resources; and (6) equitable—providing care that does not vary in quality because of personal characteristics, such as gender, ethnicity, geography, and socioeconomic status” (IOM, 2001, p. 6). Communicating accurate and timely diagnoses to patients is an important component of providing high-quality care; errors in diagnosis are a major threat to achieving high-quality care.
The IOM defines an error in medicine to be the “failure of a planned action to be completed as intended (i.e., error of execution) and the use of a wrong plan to achieve an aim (i.e., error of planning) [commission]” (IOM, 2004, p. 30). The definition also recognizes the failure of an unplanned action that should have been completed (omission) as an error (IOM, 2004). The IOM report To Err Is Human: Building a Safer Health System distinguished among four types of error: diagnostic, treatment, preventive, and other (see Box 3-1). An adverse event is “an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient” (IOM, 2004, p. 32).
The committee’s deliberations were informed by a number of existing definitions and definitional frameworks on diagnostic error (see Appendix C). For instance, Graber and colleagues used a classification of error from the Australian Patient Safety Foundation to define diagnostic error as a “diagnosis that was unintentionally delayed (sufficient information was available earlier), wrong (another diagnosis was made before the correct one), or missed (no diagnosis was ever made), as judged from the eventual appreciation of more definitive information” (Graber et al., 2005, p. 1493). They further divided diagnostic error into three main categories: no-fault errors, system-related errors, and cognitive errors. No-fault errors, originally described by Kassirer and Kopelman (1989), stem from factors outside the control of the clinician or the health care system, including atypical disease presentation or patient-related factors such as providing misleading information. The second category, system-related errors, can include technical or organizational barriers, such as problems with communication and care coordination; inefficient processes; technical failures; and equipment problems. Finally, there are cognitive errors that clinicians may make. The causes of these can include inadequate knowledge, poor critical thinking skills, a lack of competency, problems in data gathering, and failing to synthesize information (Chimowitz et al., 1990). Each of these errors can occur in isolation, but they often interact with one another; for instance, system factors can lead to cognitive errors.
BOX 3-1
Diagnostic: Error or delay in diagnosis; failure to employ indicated tests; use of outmoded tests or therapy; failure to act on results of monitoring or testing
Treatment: Error in the performance of an operation, procedure, or test; error in administering the treatment; error in the dose or method of using a drug; avoidable delay in treatment or in responding to an abnormal test; inappropriate (not indicated) care
Preventive: Failure to provide prophylactic treatment; inadequate monitoring or follow-up of treatment
Other: Failure of communication; equipment failure; other system failure
SOURCE: IOM, 2000, p. 36.
Schiff and colleagues (2009, p. 1882) defined diagnostic error as “any mistake or failure in the diagnostic process leading to a misdiagnosis, a missed diagnosis, or a delayed diagnosis.” Schiff and colleagues (2005) divided the diagnostic process into seven stages: (1) access and presentation, (2) history taking/collection, (3) the physical exam, (4) testing, (5) assessment, (6) referral, and (7) follow-up. A diagnostic error can occur at any stage in the diagnostic process, and there is a spectrum of patient consequences related to these errors, ranging from no harm to severe harm. Schiff and colleagues noted that not all diagnostic process errors will lead to a missed, delayed, or wrong diagnosis, and not all errors (either in the diagnostic process or related to misdiagnosis) will result in patient harm. Relating this model to Donabedian’s structure-process-outcome framework, Schiff and colleagues consider diagnosis to be an intermediate outcome of the diagnostic process and any resulting patient harm to be a true patient outcome (Schiff and Leape, 2012; Schiff et al., 2005, 2009).
In describing diagnostic error, Singh focused on defining missed opportunities, where a missed opportunity “implies that something different could have been done to make the correct diagnosis earlier. . . . Evidence of omission (failure to do the right thing) or commission (doing something wrong) exists at the particular point in time at which the ‘error’ occurred” (Singh, 2014, p. 99). Singh’s definition of a missed opportunity takes into account the evolving nature of a diagnosis, making the determination of a missed opportunity dependent on the temporal or sequential context of events. It also assumes that missed opportunities could be caused by individual clinicians, the care team, the system, or patients. Singh also highlighted preventable diagnostic harm—when a missed opportunity results in harm from delayed or wrong treatment or test—as the best opportunity to intervene.
Newman-Toker (2014a,b) developed a conceptual model of diagnostic error that attempted to harmonize the current definitional frameworks. His framing distinguished between diagnostic process failures and diagnostic labeling failures. Diagnostic process failures include problems in the diagnostic workup, and they may include both cognitive and system errors. Diagnostic labeling failures occur when the diagnosis that a patient receives is incorrect or when there is no attempt to provide a diagnosis label. Newman-Toker identified preventable diagnostic error as the overlap between a diagnostic process failure and a diagnostic labeling failure, and he noted that this is similar to Singh’s conceptualization of a missed opportunity (Singh, 2014). A preventable diagnostic error differs from a near-miss process problem, which is a failure in the diagnostic process without a diagnostic labeling failure. Newman-Toker also identified unavoidable misdiagnosis, which is a diagnostic labeling failure that may occur in the absence of a diagnostic process failure and corresponds to the no-fault category described earlier. Furthermore, his model illustrates that harm may or may not result from diagnostic process failures and diagnostic labeling failures.
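The two-by-two structure of this model can be made concrete with a short sketch. The code below is only an illustration of the categories as summarized in the paragraph above, not Newman-Toker’s own formalization; the names `Outcome` and `classify` and the "no failure" label for the fourth cell are assumptions introduced here. Harm is deliberately left out because, in the model as described, it may or may not accompany any of these categories.

```python
from enum import Enum


class Outcome(Enum):
    """Categories named in the text, plus a hypothetical 'no failure' cell."""
    NO_FAILURE = "no failure"  # label assumed for illustration
    NEAR_MISS_PROCESS_PROBLEM = "near-miss process problem"
    UNAVOIDABLE_MISDIAGNOSIS = "unavoidable misdiagnosis"
    PREVENTABLE_DIAGNOSTIC_ERROR = "preventable diagnostic error"


def classify(process_failure: bool, labeling_failure: bool) -> Outcome:
    """Map the two kinds of failure onto the categories described above."""
    if process_failure and labeling_failure:
        # Overlap of both failures: preventable diagnostic error.
        return Outcome.PREVENTABLE_DIAGNOSTIC_ERROR
    if process_failure:
        # Process failed, but the diagnostic label was still correct.
        return Outcome.NEAR_MISS_PROCESS_PROBLEM
    if labeling_failure:
        # Wrong or absent label despite a sound process (no-fault).
        return Outcome.UNAVOIDABLE_MISDIAGNOSIS
    return Outcome.NO_FAILURE


if __name__ == "__main__":
    # Print all four combinations of the two failure flags.
    for process in (False, True):
        for label in (False, True):
            print(process, label, classify(process, label).value)
```

Treating the two failure types as independent flags mirrors the text’s point that a labeling failure can occur with or without a process failure.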
In reviewing the diagnostic error literature, the committee concluded that there are varying definitions and terminology currently in use to describe diagnostic error. For example, there is disagreement about exactly what constitutes a diagnostic error as well as about the precise meanings of a delayed diagnosis, a missed diagnosis, and a misdiagnosis (Newman-Toker, 2014b). Some treat the terms “diagnostic error” and “misdiagnosis” as synonyms (Newman-Toker, 2014b; Newman-Toker and Pronovost, 2009). There are some who prefer the term “diagnosis error” rather than “diagnostic error” because they conclude that diagnostic error should refer to the process of arriving at a diagnosis, whereas diagnosis error should refer to the final multifactorial outcome, of which the diagnostic process is only one factor (Berenson et al., 2014). Some use the term “missed diagnosis” solely for situations in which the diagnosis was found upon autopsy (Graber et al., 2005; Newman-Toker, 2014b). While some definitions of diagnostic error include unavoidable errors, others conceptualize diagnostic error as something that stems from a failure in the diagnostic process (Graber et al., 2005; Newman-Toker, 2014b; Schiff et al., 2009). In part, the various definitions that have arisen reflect the intrinsic dualistic nature of the term “diagnosis,” which has been used to refer both to a process and to the result of that process. Definitions of diagnostic error can also vary by stakeholder; for example, a patient’s definition of a diagnostic error may be different from a clinician- or research-oriented definition of diagnostic error. Other terms used in the diagnostic error literature include diagnostic accuracy (Wachter, 2014), misdiagnosis-related harm (Newman-Toker and Pronovost, 2009), and preventable diagnostic errors (Newman-Toker, 2014b).
Because of this lack of agreement, the committee decided to formulate a new definition of diagnostic error. The committee’s patient-centered definition of diagnostic error is:
the failure to (a) establish an accurate and timely explanation of the patient’s health problem(s) or (b) communicate that explanation to the patient.
The definition frames a diagnostic error from the patient’s perspective, in recognition that a patient bears the ultimate risk of harm from a diagnostic error. The committee’s definition is two-pronged; if there is a failure in either part of the definition, a diagnostic error results. It also conveys that each arm of the definition may be evaluated separately for measurement purposes (see section on measurement and assessment of diagnostic error).
The first part of the committee’s definition focuses on two major characteristics of diagnosis: accuracy and timeliness. A diagnosis is not accurate if it differs from the true condition a patient has (or does not have) or if it is imprecise and incomplete (lacking in sufficient detail). It is important to note that a working diagnosis, described in Chapter 2, may lack precision or completeness but is not necessarily a diagnostic error. The nature of the diagnostic process is iterative, and as information gathering continues, the goal is to reduce diagnostic uncertainty, narrow down the diagnostic possibilities, and develop a more precise and complete diagnosis. The other characteristic the committee highlighted was timeliness. Timeliness means that the diagnosis was not meaningfully delayed. However, the committee did not specify a time period that would reflect “timely” because this is likely to depend on the nature of a patient’s condition as well as on a realistic expectation of the length of time needed to make a diagnosis. Thus, the term “timely” will need to be operationalized for different health problems. Depending on the circumstances, some diagnoses may take days, weeks, or even months to establish, while timely may mean quite quickly (minutes to hours) for other urgent diagnoses.
The second part of the committee’s definition focuses on communication. A fundamental conclusion from the committee’s deliberations was that communication is a key responsibility in the diagnostic process. From a patient’s perspective, an accurate and timely explanation of the health problem is meaningless unless this information reaches the patient so that the patient and health care professionals can act on the explanation. The phrase “explanation of the patient’s health problem(s)” was chosen because it was meant to describe the health problem (or problems) involved as well as the manner in which the information is conveyed to a patient. The explanation needs to align with a patient’s level of health literacy and to be conveyed in a way that facilitates patient understanding. Because not all patients will be able to participate in the communication process, there will be some situations in which it is not feasible to convey the explanation of the health problem to the patient or in which the patient cannot fully appreciate it (e.g., pediatric patients or patients whose health problems limit or prevent communication). In these circumstances, the communication of the health problem would be between the health care professionals and a patient’s family or designated health care proxy. There may also be urgent, life-threatening situations in which a patient’s health problem will need to be communicated following treatment. However, even in these urgent situations, patients and their families need to be informed about new developments, so that decision making reflects a patient’s values, preferences, and needs. Timely communication is also context-dependent: With some health problems, an explanation can take weeks or months to establish. However, throughout this time clinicians can communicate the working diagnosis, or the current explanation of the patient’s health problem, as well as the degree of certainty associated with this explanation.
The phrase “failure to establish” is included in the definition because it recognizes that determining a diagnosis is a process that involves both the passage of time and the collaboration of health care professionals, patients, and their families to reach an explanation. The committee chose the term “health problem” because it is more inclusive than the term “diagnosis” and often reflects a more patient-centered approach to understanding a patient’s overall health condition. For example, a health problem could include a predisposition to developing a condition, such as a genetic risk for disease. In addition, there are circumstances when it is important to focus on resolving the symptoms that are interfering with a patient’s basic functioning, described as “activities of daily living,” rather than focusing exclusively on identifying and following up on all of a patient’s potential diagnoses (Gawande, 2007). Individual patient preferences for possible health outcomes can vary substantially, and with the growing prevalence of chronic disease, patients often have comorbidities or competing causes of mortality that need to be taken into consideration when defining a patient’s health problem and subsequent plan for care (Gawande, 2014; Liss et al., 2013; Mulley et al., 2012).
There could be situations in which clinicians and health care organizations, practicing conscientiously (e.g., following clinical practice guidelines or established standards of care), may be unable to establish a definitive diagnosis. Sometimes a health care professional will need to acknowledge an inability to establish a diagnosis and will need to refer the patient to other specialists for further assessment to continue the diagnostic process. However, in some cases, this iterative process may still not lead to a firm diagnosis. For example, individuals may have signs and symptoms that have not been recognized universally by the medical community as a specific disease. From the patient’s perspective, this could be a diagnostic error, but medicine is not an exact science, and documenting and examining such instances could provide an opportunity to advance medical knowledge and ultimately improve the diagnostic process.
The committee’s definition reflects the six aims of high-quality care identified by the IOM (2001). It specifically refers to effectiveness and efficiency (i.e., accuracy), timeliness, and patient-centeredness as important aspects of diagnosis, while assuming safety and equity throughout the diagnostic process. Patients and their families play a key role in the diagnostic process, but a patient’s care team is ultimately responsible for facilitating the diagnostic process and the communication of a diagnosis (see Chapter 4).
The committee’s definition of diagnostic error differs from previous definitions in that it focuses on the outcome from the diagnostic process (the explanation of the patient’s health problem provided to the patient). Other definitions of diagnostic error focus on determining whether or not process-related factors resulted in the diagnostic error. For example, Singh’s definition focuses on whether there was a missed opportunity to make a diagnosis earlier (Singh, 2014). Likewise, Schiff and colleagues’ (2009) definition of diagnostic error requires a determination that there was a mistake or failure in the diagnostic process. The committee’s focus on the outcome from the diagnostic process is important because it reflects what matters most to patients—the communication of an accurate and timely explanation of their health problem. However, identifying failures in the diagnostic process is also critically important, which is reflected in the committee’s dual focus on improving the diagnostic process and reducing diagnostic errors. The committee’s discussion of measurement includes an emphasis on understanding where failures in the diagnostic process can occur and the work system factors that contribute to these failures (see section on determining the causes and risks of diagnostic error).
Analyzing failures in the diagnostic process provides important information for learning how to improve the work system and the diagnostic process. Some failures in the diagnostic process will lead to diagnostic errors; however, other failures in the diagnostic process will not ultimately lead to a diagnostic error. In this report, the committee describes “failures in the diagnostic process that do not lead to diagnostic errors” as near misses.¹ In other words, a near miss is a diagnosis that was almost erroneous. For example, it would be considered a near miss if a radiologist reported no significant findings from a chest X-ray, but a primary care clinician reviewing the image identified something that required further follow-up (Newman-Toker, 2014b). While there may have been a failure in the diagnostic process, the patient nonetheless received an accurate and timely explanation of the health problem. Examining near misses can help identify vulnerabilities in the diagnostic process as well as strengths in the diagnostic process that compensate for these vulnerabilities (see discussion of error recovery in Chapter 6). Likewise, several of the committee’s recommendations focus on identifying both diagnostic errors and near misses because they both serve as learning opportunities to improve diagnosis.
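The relationship between the committee’s two-pronged definition and its notion of a near miss can be expressed as a small sketch. This is an illustrative reading of the definition only, not a measurement instrument proposed by the committee; the `Episode` record, its field names, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical summary of one diagnostic episode (fields assumed)."""
    accurate: bool         # explanation matches the patient's true condition
    timely: bool           # explanation was not meaningfully delayed
    communicated: bool     # explanation reached the patient (or proxy)
    process_failure: bool  # some failure occurred in the diagnostic process


def is_diagnostic_error(e: Episode) -> bool:
    """Error if either prong fails: (a) accurate and timely, (b) communicated."""
    prong_a_failed = not (e.accurate and e.timely)
    prong_b_failed = not e.communicated
    return prong_a_failed or prong_b_failed


def is_near_miss(e: Episode) -> bool:
    """Near miss: a process failure that does not lead to a diagnostic error."""
    return e.process_failure and not is_diagnostic_error(e)


if __name__ == "__main__":
    # The chest X-ray example: the radiologist's report missed a finding,
    # but the primary care clinician caught it, so the patient still
    # received an accurate, timely, communicated explanation.
    xray_case = Episode(accurate=True, timely=True,
                        communicated=True, process_failure=True)
    print(is_diagnostic_error(xray_case), is_near_miss(xray_case))
```

Keeping the two prongs as separate checks reflects the committee’s statement that each arm of the definition may be evaluated separately for measurement purposes.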
The diagnostic process can lead to a number of outcomes (see Figure 3-1). An accurate and timely diagnosis that is communicated to a patient presents the best opportunity for a positive health outcome because clinical decision making will be tailored to a correct understanding of the patient’s health problem. Diagnostic errors and near misses can stem from a wide variety of causes and result in multiple outcomes, and as evidence accrues, a more nuanced picture of diagnostic errors and near misses will develop. For example, further research can be directed at better understanding the causes of diagnostic errors and vulnerabilities in the diagnostic process. Some of the reasons diagnostic errors and near misses occur may be more amenable to intervention than others. In addition, determining which types of diagnostic errors are priorities to address, as well as which interventions could be targeted at preventing or mitigating specific types of diagnostic errors, will be informative in improving the quality of care.
¹ The term “near miss” is used within many fields—including health care—with varying definitions. For example, an IOM report defined a near miss as “an act of commission or omission that could have harmed the patient but did not cause harm as a result of chance, prevention, or mitigation” (IOM, 2004, p. 227). Because diagnostic errors can have a range of outcomes (including no harm), this definition of near miss is not consistent with the committee’s definition of diagnostic error. However, the committee’s conceptualization of a near miss is similar to previous uses. For example, the 2004 IOM report states that most definitions of a near miss imply an incident causation model, in which there is a causal chain of events that leads to the ultimate outcome: “Near misses are the immediate precursors to later possible adverse events” (IOM, 2004, p. 227). Rather than focus on adverse events as the outcome of interest, the committee’s outcome of interest is diagnostic error. Thus, the committee’s definition of a near miss is a failure in the diagnostic process that does not lead to diagnostic error.
A better understanding of the outcomes resulting from diagnostic errors and near misses will also be helpful. For example, if there is a diagnostic error, a patient may or may not experience harm. The potential harm from diagnostic errors could range from no harm to significant harm, including morbidity or death. Errors can be harmful because they can prevent or delay appropriate treatment, lead to unnecessary or harmful treatment, or result in psychological or financial repercussions. Harm may not result, for example, if a patient’s symptoms resolve even with an incorrect diagnosis. Diagnostic errors and near misses may also lead to inefficiency in health care organizations (e.g., the provision of unnecessary treatments) and increase system costs unnecessarily (covering the costs of otherwise unnecessary care or medical liability expenses). Diagnostic errors and near misses influence both the morale of individuals participating in the diagnostic process and public trust in the health care system. Correct diagnoses, diagnostic errors, and near misses can be used as opportunities to learn how to improve the work system and the diagnostic process (Klein, 2011, 2014).
There is growing recognition that overdiagnosis is a serious problem in health care today, contributing to increased health care costs, overtreatment, and the associated risks and harms from this treatment (Welch, 2015; Welch and Black, 2010). Overdiagnosis has been described as “when a condition is diagnosed that would otherwise not go on to cause symptoms or death” (Welch and Black, 2010, p. 605). Chiolero and colleagues note that advances in prevention and diagnosis “have changed the diagnostic process, expanding the possibilities of interventions across asymptomatic individuals and blurring the boundaries between health, risk, and disease” (Chiolero et al., 2015, p. w14060). Overdiagnosis has been attributed to the increased sensitivity of diagnostic testing (e.g., improved radiographic resolution); the identification of incidental findings; the widening boundaries or lowered thresholds for defining what is abnormal (e.g., hypertension, diabetes, or cholesterol levels); and clinicians’ concerns about missing diagnoses and subsequent medical liability risks (see Chapter 7 for a discussion of defensive medicine concerns) (Chiolero et al., 2015; Gawande, 2015; Moynihan et al., 2012).
Recent discussions in the diagnostic error community have drawn attention to the issue of overdiagnosis and whether overdiagnosis should be defined and classified as an error (Berenson et al., 2014; Newman-Toker, 2014b; Zwaan and Singh, 2015). Although overdiagnosis is a complex and controversial topic, it is distinct from diagnostic error. For example, Chiolero and colleagues (2015, p. w14060) state: “Overdiagnosis is . . . neither a misdiagnosis (diagnostic error), nor a false positive result (positive test in the absence of a real abnormality).” Similarly, Gawande makes the distinction between overdiagnosis and diagnostic error: “Overtesting has also created a new, unanticipated problem: overdiagnosis. This isn’t misdiagnosis—the erroneous diagnosis of a disease. This is the correct diagnosis of a disease that is never going to bother you in your lifetime” (Gawande, 2015). Challenges in terminology and the blurry distinctions between diagnosis and treatment add to the confusion between overdiagnosis and diagnostic error. Recent reports in the literature have used the term “overdiagnosis” broadly to incorporate the concept of overmedicalization, including overdetection, overdiagnosis, overtreatment, and overutilization (Carter et al., 2015). For example, widening the criteria used to define a disease may raise important concerns about overmedicalization, but if a diagnosis is consistent with consensus guidelines for medical practice, it would not constitute a diagnostic error as defined by the committee.
A major reason overdiagnosis is not characterized as an error is that it is found primarily with population-based estimates; it is virtually impossible to assess whether overdiagnosis has occurred for an individual patient (Welch and Black, 2010). Our understanding of biology and disease progression is often not advanced enough to distinguish the individuals who are going to be harmed by their health condition from those whose health conditions are never going to lead to patient harm (e.g., thyroid, breast, and prostate cancers). Thus, clinicians are treating patients based on uncertain prognoses, and many more people are treated than actually benefit from treatment. Likewise, screening guidelines are intended to identify populations that will most likely benefit from screening, but not all individuals who undergo screening will benefit. For example, screening mammography—like many interventions—is an imperfect test with associated harms and benefits; some breast cancers will be missed, some women will die from breast cancer regardless of being screened, and some cancers that are identified will never lead to harm (Pace and Keating, 2014). Because current diagnostic testing technologies often cannot distinguish the cancers that are likely to progress and lead to patient harm from those that will not, clinicians inevitably treat some patients with breast cancer who will not benefit from the treatment (Esserman et al., 2009). It would be incorrect (and largely impossible) to classify these cases as errors because clinicians are basing screening and treatment decisions on the best available medical knowledge, and the assessment of overdiagnosis is dependent on population-based analysis. For example, once a patient has been diagnosed and treated for cancer, it is impossible to know whether the patient’s outcome would have been different if the tumor (which may have been indolent rather than life-threatening) had never been diagnosed.
However, overdiagnosis represents a true challenge to health care quality, and further efforts are warranted to prevent overdiagnosis and associated overtreatment concerns. Reducing overdiagnosis will likely require improved understanding of disease biology and progression, as well as increased awareness of its occurrence among health care professionals, patients, and their families (Chiolero et al., 2015). In addition, an important strategy that has been suggested for preventing overdiagnosis and associated overtreatment is avoiding unnecessary and untargeted diagnostic testing (Chiolero et al., 2015).
Box 3-2 provides an overview of overutilization of diagnostic testing in health care. Based on the committee’s definition of diagnostic error, which focuses on the outcomes for patients, overutilization of diagnostic testing is not necessarily a diagnostic error. Overutilization of diagnostic testing would be considered a failure in the diagnostic process (failure in information gathering—see the measurement section below). Overutilization is a serious concern, and efforts to improve diagnosis need to focus on preventing inappropriate overutilization of diagnostic testing (Newman-Toker, 2014a).
Improving diagnosis should not imply the adoption of overly aggressive diagnostic strategies. Chapter 2 highlights that the goal of diagnostic testing is not to reduce diagnostic uncertainty to zero (an impossible task), but rather to optimize decision making by judicious use of diagnostic testing (Newman-Toker et al., 2013; Kassirer, 1989). This is also why the committee highlighted iterative information gathering and the role of time in the diagnostic process; oftentimes it is not appropriate to test for everything at the outset—further information-gathering activities can be informed by test results, time, and a patient’s response to treatment. The committee makes a number of recommendations throughout the report that are targeted at preventing overutilization in the diagnostic process, including improved collaboration and communication among treating clinicians and pathologists, radiologists, and other diagnostic testing health care professionals, as well as increased emphasis on diagnostic testing in health care professional education (see Chapters 4 and 6).
While diagnostic testing has brought many improvements to medical care, advances in diagnostic testing have also led to some challenges, including an under-reliance on more traditional diagnostic tools, such as careful history taking and the physical exam, and the inappropriate utilization of diagnostic testing (Iglehart, 2009; Newman-Toker et al., 2013; Rao and Levin, 2012; Zhi et al., 2013). Inappropriate use has included both overutilization (testing when it is not indicated) and underutilization (not testing when it is indicated).
The use of diagnostic testing to rule out conditions, clinicians’ intolerance of uncertainty, an enthusiasm for the early detection of disease in the absence of symptoms, and concerns over medical liability can all contribute to overutilization (Grimes and Schulz, 2002; Newman-Toker et al., 2013; Plebani, 2014). In one survey of physicians in specialties at high risk of litigation (emergency medicine, general surgery, orthopedic surgery, neurosurgery, obstetrics/gynecology, and radiology), 59 percent of respondents reported that they ordered more tests than were medically indicated (Studdert et al., 2005). In an analysis that examined patient understanding of medical interventions, researchers identified a complex array of reasons for overuse, including payment systems that favor more testing over patient interaction, the ease of requesting tests, and patient beliefs that more testing and treatment is equivalent to better care (Croskerry, 2011; Hoffmann and Del Mar, 2015). When a clinician does not have enough time to discuss symptoms and potential diagnoses with a patient, ordering a test is sometimes considered more straightforward and less risky (Newman-Toker et al., 2013). Another contributing factor is an overestimation of the benefits of testing; for example, patients often overestimate the benefits of mammography screening (Gigerenzer, 2014; Hoffmann and Del Mar, 2015).
The overutilization of medical imaging techniques that employ ionizing radiation (such as computed tomography [CT]) is of special concern and has gained considerable attention in the wake of research showing a marked increase in radiation exposure from medical imaging in the U.S. population (Hricak et al., 2011). Epidemiological studies have found reasonable, though not definitive, evidence that exposure to ionizing radiation (organ doses ranging from 5 to 125 millisieverts) results in a very small but statistically significant increase in cancer risk (Hricak et al., 2011). Children are more radiosensitive than adults, and cancer risks increase with cumulative radiation exposure. In addition to age at exposure, genetic considerations, sex, and fractionation and protraction of exposure may influence the level of risk. Medical imaging needs to be justified by weighing its potential benefit against its potential risk. It is important to be sure that imaging is truly indicated and to consider alternatives to the use of ionizing radiation, especially for pediatric patients and those with a history of radiation exposure. In 2010 the Food and Drug Administration launched the Initiative to Reduce Radiation Exposure, aimed at promoting the justification of all imaging examinations and the optimization of imaging protocols so as to minimize radiation doses (FDA, 2015). Studies have shown that the use of clinical decision support and guidelines can minimize unnecessary radiation exposure and that they could prevent as many as 20 to 40 percent of CT scans without compromising patient care (Hricak et al., 2011).
For a variety of reasons, diagnostic errors have been more challenging to measure than other quality or safety concepts. Singh and Sittig (2015, p. 103) note that “[c]ompared with other safety concerns, there are also fewer sources of valid and reliable data that could enable measurement” of diagnostic errors. Studies that have evaluated diagnostic errors have employed different definitions, and the use of varying definitions can lead to challenges in drawing comparisons across studies or synthesizing the available information on measurement (Berenson et al., 2014; Schiff and Leape, 2012; Singh, 2014). Even when there is agreement on the definition of diagnostic error, there can be genuine disagreement over whether a diagnostic error actually occurred, and there are often blurry boundaries between different types of errors (e.g., treatment or diagnostic) (Singh and Sittig, 2015; Singh et al., 2012a).
The complexity of the diagnostic process itself, as well as the inherent uncertainty underlying clinical decision making, makes measurement a challenging task (Singh, 2014; Singh and Sittig, 2015). The committee’s conceptual model illustrates the complex, time-dependent, and team-based nature of the diagnostic process as well as all of the potential work system factors that can contribute to the occurrence of diagnostic error. The temporal component of the diagnostic process can complicate measurement because the signs and symptoms of a health condition may evolve over time, and there can be disagreement about what an acceptable time frame is in which to make a timely diagnosis (Singh, 2014; Zwaan and Singh, 2015). Clinical reasoning plays a role in diagnostic errors, but clinical reasoning processes are difficult to assess because they occur in clinicians’ minds and are not typically documented (Croskerry, 2012; Wachter, 2010). Similarly, some measurement approaches, such as medical record reviews, may not identify diagnostic errors because information related to diagnosis may not be documented (Singh et al., 2012a). Furthermore, many people recover from their health conditions regardless of the treatment or diagnosis they receive, so a diagnostic error may never be recognized (Croskerry, 2012).
The Purposes of Measurement
There are a variety of ways that measurement can be used in the context of the diagnostic process and in assessing the occurrence of diagnostic errors. The committee identified five primary purposes for measuring diagnostic errors: establishing the incidence and nature of the problem of diagnostic error; determining the causes and risks of diagnostic error; evaluating interventions to improve diagnosis and reduce diagnostic errors; for educational and training purposes; and for accountability purposes (e.g., performance measurement). Each of these purposes is described in greater detail below.
- Establish the incidence and nature of the problem of diagnostic error. Today this task is primarily the province of research and is likely to remain that way for the foreseeable future. Researchers have used a variety of methods to assess diagnostic errors. Attention to harmonizing these approaches and recognizing what each method contributes to the overall understanding of diagnostic error may better characterize the size and dimensionality of the problem and may facilitate assessment of diagnostic error rates over time.
- Determine the causes and risks of diagnostic error. This use of measurement and assessment is also primarily undertaken in research settings, and this is also likely to continue. Previous research has provided numerous insights into causes and risks, but moving from these insights to constructing approaches to prevent or detect problems more rapidly will require additional work.
- Evaluate interventions. This report should stimulate the development of programs designed to prevent, detect, and correct diagnostic errors across the spectrum, but these programs will require appropriate measurement tools (both quantitative and qualitative) to allow a rigorous assessment of whether the interventions worked. This will be particularly challenging for measuring prevention, as is always the case in medical care. Research needs to focus on the required attributes of these measurement tools for this application.
- Education and training. Given the importance of lifelong learning in health care, it will be useful to have measurement tools that can assess the initial training of health care professionals, the outcomes of ongoing education, and the competency of health care professionals. For this application, these tools need to provide an opportunity for feedback and perhaps decision support assistance in identifying potential high risk areas. In this instance, the measurement tools need to include not only the assessment of whether an event occurred or is at risk for occurring but also effective methods for feeding back information for learning.
- Accountability. In today’s environment, significant pressure exists to push toward accountability through public reporting and payment for every area in which a potential problem has been identified in health care. As an aspiration, the committee recognizes that transparency and public reporting are worthy goals for helping patients identify and receive high-quality care. However, current pushes for accountability neglect diagnostic performance, and this is a major limitation of these approaches. The committee’s assessment suggests that it would be premature either to adopt an accountability framework or to assume that the traditional accountability frameworks for public reporting and payment will be effective in reducing diagnostic error. A primary focus on intrinsic motivation (unleashing the desire on the part of nearly all health care professionals to do the right thing) may be more effective at improving diagnostic performance than programs focused on public reporting and payment. Public awareness may also be a key leverage point, but at this point measurement approaches that reveal weak spots in the diagnostic process and identify errors reliably are lacking. For both health care professionals and patients, it is critical to develop measurement approaches that engage all parties in improving diagnostic performance.
With this in mind, the following discussion elaborates on three of the purposes of measurement: establishing the incidence and nature of diagnostic error, determining the causes and risks of diagnostic error, and evaluating interventions. This section summarizes the approaches to measurement that are best matched to each purpose. All of the data sources and methods that were identified have some limitations for the committee-defined purposes of measurement.
Issues related to assessing the competency of health care professionals are addressed in Chapter 4; because the committee determined that it is premature to consider diagnostic error from an accountability framework, measurement for the purpose of accountability is not described further in this chapter.
Establishing the Incidence and Nature of the Problem of Diagnostic Error
A number of data sources and methods have been used to understand the incidence and nature of diagnostic error, including postmortem examinations (autopsy), medical record reviews, malpractice claims, health insurance claims, diagnostic testing studies, and patient and clinician surveys, among others (Berner and Graber, 2008; Graber, 2013; Singh and Sittig, 2015).
Before reviewing each of these approaches, the committee sought to identify or construct a summary, population-based estimate of the frequency with which diagnostic errors occur. Such a number can underscore the importance of the problem and, over time, be used to evaluate whether progress is being made. To arrive at such a number, the committee considered the necessary measurement requirements to establish the incidence and nature of diagnostic errors. First, one would need an estimate of the number of opportunities to make a diagnosis each year (denominator) and the number of times the diagnosis (health problem) is not made in an accurate and timely manner or is not communicated to the patient (numerator). This formulation takes into consideration the fact that during any given year patients may experience multiple health problems for which a diagnosis is required; each represents an opportunity for the health care system to deliver an accurate and timely explanation of that health problem. About one-third of ambulatory visits are for a new health problem (CDC, 2015). The formulation also reflects the fact that the final product (the explanation of the patient’s health problem) needs to be free of defects; that is, it needs to meet all elements of a correct diagnosis (accuracy, timeliness, and communication).
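The numerator and denominator described above can be made concrete with a minimal sketch. It assumes each opportunity to make a diagnosis has already been judged on the three elements of a correct diagnosis; the class and function names are hypothetical, and the four sample records are invented for illustration, not drawn from any study.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticOpportunity:
    """One health problem for which a diagnosis was required (hypothetical record)."""
    accurate: bool
    timely: bool
    communicated: bool

    def is_error(self) -> bool:
        # The explanation must be free of defects on all three elements.
        return not (self.accurate and self.timely and self.communicated)


def error_rate(opportunities):
    """Numerator (defective diagnoses) divided by denominator (all opportunities)."""
    if not opportunities:
        raise ValueError("at least one diagnostic opportunity is required")
    errors = sum(1 for o in opportunities if o.is_error())
    return errors / len(opportunities)


# Invented example: four opportunities, two with a defect.
sample = [
    DiagnosticOpportunity(accurate=True, timely=True, communicated=True),
    DiagnosticOpportunity(accurate=True, timely=False, communicated=True),
    DiagnosticOpportunity(accurate=True, timely=True, communicated=False),
    DiagnosticOpportunity(accurate=True, timely=True, communicated=True),
]
print(error_rate(sample))
```

Counting opportunities rather than patients reflects the point that one patient may present several health problems in a year, each of which enters the denominator separately.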
Perhaps not surprisingly, the available research estimates were not adequate to extrapolate a specific estimate or range of the incidence of diagnostic errors in clinical practice today. Even less information is available to assess the severity of harm caused by diagnostic errors. Part of the challenge in gathering such data is the variety of settings in which these errors can occur; these settings include hospitals, emergency departments, a variety of outpatient settings (such as primary and specialty care settings and retail clinics), and long-term-care settings (such as nursing homes and rehabilitation centers). A second part of the challenge is the complexity of the diagnostic process itself. Although there are data available to examine diagnostic errors in some of these settings, there are wide gaps and much variability in the amount and quality of information available. In addition, a number of problems arise when aggregating data across the various research methods (such as postmortem examinations, medical record reviews, and malpractice claims). Each method captures information about different subgroups in the population, different dimensions of the problem, and different insights into the frequency and causes of diagnostic error. However, the committee concluded that the evidence, taken together, suggests that diagnostic errors are a significant and common challenge in health care and that most people will experience at least one diagnostic error in their lifetime. The committee based this observation on its collective assessment of the available evidence describing the epidemiology of diagnostic errors. In each data source that the committee evaluated, diagnostic errors were a consistent quality and safety challenge.
The committee anticipates that its definition of diagnostic error will inform measurement activities. The two components of the definition, (a) accuracy and timeliness and (b) communication, will likely have to be accounted for separately. For example, it is often difficult to determine from a medical record review whether the diagnosis has been communicated to the patient. Other data sources, such as patient surveys, may be helpful in making this determination. Alternatively, medical record charting practices could be improved to emphasize communication because of its importance in improving diagnosis and subsequent care. Measuring each arm of the definition is also consistent with the committee’s approach to identifying failures in the diagnostic process; the committee specifies that each step in the diagnostic process can be evaluated for its susceptibility to failures (see section on determining the causes and risks of diagnostic error).
To better understand both the challenges and the opportunities associated with the various measurement methods, the committee examined for each of the data sources (1) the mechanism by which eligible patients were identified for assessment (denominator) and (2) the way that diagnostic errors were identified (numerator). The results are summarized in Table 3-1. In the sections following the table, the committee describes each data source; highlights the features of the data source that enhance or limit its utility for estimating the incidence of diagnostic error; describes the methods that have been used in studies to select cases for review (the denominator); and describes the methods for determining if an error occurred (numerator). Next, a summary of what is known about the incidence of diagnostic errors from studies that use those data sources is offered. Each section ends with a discussion of potential improvements to the methods that use each data source.
|Data Source|Key Features of the Data Source|Method(s) for Selecting Cases for Review (Denominator)|Method for Determining if Error Occurred (Numerator)|
|---|---|---|---|
|Postmortem examination (autopsy)|Deaths only; limited number of reviews; selection bias (typically focused on unexpected deaths); limited workforce|Consecutive series with criteria; convenience samples; prespecified criteria; requests (from clinicians or families)|Comparison to another data source (medical record, interview, location/circumstance of death); cause of death determination; effects or indication of disease|
|Medical records|Rely on documentation (what was recorded, such as clinical history and interview, physical exam, and diagnostic testing)|Prespecified criteria (e.g., trigger tool); random sample|Implicit review/expert assessment; explicit criteria|
|Medical malpractice claims|Requires claim to be filed; more likely for negligent care; most studies done on closed claims|Classification criteria (typically based on claim made in suit)|Claims adjudication process (including courts)|
|Health insurance claims|Requires a billable event; relies on documentation necessary for payment|Criteria-based algorithm (selected); universe of claims| |
|Diagnostic testing|Source data available for review; applies only to diagnoses for which diagnostic testing data are a key factor; focus on interpretation|Random sample; prespecified criteria|Expert assessment compared to original|
|Medical imaging|Source data available for review; applies only to diagnoses for which medical imaging data are a key factor; focus on interpretation|Random sample; prespecified criteria|Expert assessment compared to original|
|Surveys of clinicians|Subject to nonresponse bias; may be difficult to validate|Sample receiving survey|Descriptive statistics on self-report|
|Surveys of patients|Subject to nonresponse bias; may be difficult to validate|Sample receiving survey|Descriptive statistics on self-report|
Postmortem Examinations
Description of the data source. Postmortem examinations, often referred to as autopsies, are highly specialized surgical procedures that are conducted to determine the cause of death or extent of disease. Hoyert (2011, p. 1) identifies two primary types of postmortem exams conducted in the United States: (1) “hospital or clinical autopsies, which family or physicians request to clarify cause of death or assess care,” and (2) “medicolegal autopsies, which legal officials order to further investigate the circumstances surrounding a death.” Postmortem exams may vary from an external-only exam to a full external and internal exam, depending on the request. While this chapter focuses on full-body postmortem exams, Chapter 6 describes the potential future state of postmortem examinations, which may include more minimally invasive approaches, such as medical imaging, laparoscopy, biopsy, histology, and cytology.
Notes about the data source. Postmortem exams are considered a very strong method for identifying diagnostic errors because of the extensiveness of the examination that is possible (Graber, 2013; Shojania, 2002). However, there are some limitations to this data source for the purpose of estimating the incidence of diagnostic error. Postmortem exams are conducted on people who have died; thus, the results can only provide information about diagnostic errors that led to the patient’s death and about other diseases present that had not been previously identified, whether or not they contributed to the patient’s death. A very limited number of postmortem exams are performed annually, and postmortem exam rates can also vary geographically and institutionally. Little information is available for characterizing the relationship between those who receive postmortem exams and the potential number of eligible cases, but those who undergo autopsy are more likely to have experienced a diagnostic error and that error is more likely to have contributed to the patient’s (premature) death (an example of selection bias) (Shojania, 2002).
Methods for identifying cases for review (denominator). The decision about whether an individual patient will receive a postmortem exam is based on requests from clinicians or family members as well as on local criteria set by coroners or medical examiners. With the exception of postmortem examinations done for criminal forensic purposes, family members must consent to having the procedure done. There is no systematic information on the frequency with which the request for an autopsy is refused (which would introduce response bias into results). The performance of postmortem exams has declined substantially in the United States in recent decades (Lundberg, 1998). National data on postmortem exams have not been collected since 1994; at that time, fewer than 6 percent of non-forensic deaths underwent a postmortem exam (Shojania et al., 2002).
Research studies that have used postmortem exam results have used consecutive series, prespecified criteria (including randomly selected autopsies), or convenience samples (Shojania, 2002).
Methods for determining if an error occurred (numerator). The results of the postmortem exam typically provide a cause of death and a description of the presence and severity of other diseases. These results are compared to another data source, typically medical records or interviews with treating clinicians or family members. Discrepancies between what was found in the postmortem exam and what was known prior to that are the basis for determining the occurrence of a diagnostic error. Such determinations are subject to the reliability and validity of both the postmortem exam findings and the results from the data collected from the original sources.
What is known. Postmortem examinations have been described as an important method for detecting diagnostic errors (Berner and Graber, 2008; Graber, 2013). In their review of postmortem examination data, Shojania and colleagues concluded that “the autopsy continues to detect important errors in clinical diagnosis” (Shojania et al., 2002, p. 51). On average, 10 percent of postmortem exams were associated with diagnostic errors that might have affected patient outcomes (i.e., Class I errors).2 They estimated that the prevalence of major errors (i.e., Class I and II errors) related to the principal diagnosis or the cause of death was 25 percent. Some incidental findings found during postmortem exams should not be classified as diagnostic errors; of primary importance is identifying diagnostic errors that contributed to a patient’s death (Class I errors).3 Shojania and colleagues noted that some selection bias is reflected in this estimate because the cases in which there was more uncertainty about the diagnosis were more likely to undergo postmortem exam. A systematic review of diagnostic errors in the intensive care unit found that 8 percent of postmortem exams identified a Class I error and that 28 percent identified at least one diagnostic error (Winters et al., 2012). According to Shojania et al. (2003, p. 2849), the rates of autopsy-identified diagnostic errors have declined over time but remain “sufficiently high that encouraging ongoing use of the autopsy appears warranted.” Based on their findings, they estimated that among the 850,000 individuals who die in U.S. hospitals each year, approximately 8.4 percent (71,400 deaths) have a major diagnosis that remains undetected (Shojania et al., 2003).
2 A Class I error is a major diagnostic error that likely played a role in the patient’s death. A Class II error is a major diagnostic error that did not contribute to the patient’s death. A Class III error is a minor diagnostic error that is not related to the patient’s cause of death but is related to a terminal disease. A Class IV error is a missed minor discrepancy (Winters et al., 2012).
3 For example, incidental findings of prostate cancer that are not relevant to the patient’s provision of health care, terminal disease, or death may not be appropriate to classify as diagnostic error.
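The extrapolation attributed to Shojania et al. (2003) is a single multiplication, and a short sketch makes the arithmetic easy to check. The two input figures are the ones quoted in the text; the variable and function names are hypothetical.

```python
# Figures as quoted in the text from Shojania et al. (2003).
hospital_deaths_per_year = 850_000
undetected_major_diagnosis_rate = 0.084  # 8.4 percent


def expected_cases(population: int, rate: float) -> int:
    """Expected number of cases when a rate is applied to a population."""
    return round(population * rate)


undetected = expected_cases(hospital_deaths_per_year, undetected_major_diagnosis_rate)
print(f"{undetected:,} deaths with an undetected major diagnosis")
```

The result matches the 71,400 deaths reported in the text.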
Opportunities for improvement. The committee concluded that postmortem exams play a critical role in understanding the epidemiology of diagnostic errors and that increasing the number of such exams is warranted. In addition, tracking the number of deaths, those eligible and selected for postmortem exams, and the refusal rate among family members would enable the development of better national estimates of diagnostic error incidence. The committee weighed the relative merits of increasing the number of postmortem examinations conducted throughout the United States versus a more targeted approach. The committee concluded that it would be more efficient to have a limited number of systems that are highly qualified in conducting postmortem exams participate to produce research-quality information about the incidence and nature of diagnostic errors among a representative sample of patient deaths. This approach reflects both financial realities and workforce challenges (i.e., a limited number of pathologists being available and willing to conduct a large number of such exams) (see also Chapter 6). The systems that are selected to routinely conduct postmortem exams could also investigate how new, minimally invasive postmortem approaches compare to full-body postmortem exams.
Medical Records
Description of the data source. A medical record is defined as a documented account of a patient’s examination and treatment that includes the patient’s clinical history and symptoms, physical findings, the results of diagnostic testing, medications, and therapeutic procedures. The medical record can exist in either paper or electronic form.
Notes about the data source. Medical records exist only for patients who have sought care from a clinician, team, or facility. Although there are some common conventions for structuring medical records (both in paper and electronic formats), much of the content of the record depends on what the clinician chooses to include; thus, there may be variations in the extent to which clinical reasoning is documented (e.g., what alternative diagnoses were considered, the rationale for ordering [or not ordering] certain tests, and the way in which the information was collected and integrated). Both regulatory and local rules affect which members of the diagnostic team contribute to the documentation in a medical record and how they contribute. Except in highly integrated systems, patients typically have a separate medical record associated with each clinician or facility from which they have sought care. When patients change their source of care, the information from medical records maintained by the previous clinicians may or may not be incorporated into the new record.
Methods for identifying cases for review (denominator). The most common methods for identifying cases for review are either to draw a random sample of records from a facility (especially hospitals), clinic, or clinician practice or to assemble a criteria-based sample (e.g., a trigger tool). The criteria-based tools typically select events that have been associated with a higher probability of identifying a diagnostic error, such as unplanned readmissions to a hospital, emergency department visits after an outpatient visit, or the failure of a visit to occur after an abnormal test result. Estimates of the incidence of diagnostic errors based on medical records need to account for the probability that an individual is included in the study sample and the likelihood that a visit (or set of visits) requires that a diagnosis be made. Because these factors likely vary by geography and patient populations, arriving at national estimates from studies done in limited geographic areas is difficult.
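A criteria-based sample of the kind described above can be sketched as a simple query over visit records. The sketch below flags outpatient visits that are followed by an emergency department visit or unplanned admission. The record layout, the visit type labels, and the 14-day window are all assumptions made for illustration; the text does not specify a time window, and real trigger tools define their own criteria.

```python
from datetime import date, timedelta

# Hypothetical visit types that count as a follow-up event for the trigger.
FOLLOW_UP_EVENTS = {"ed", "unplanned_admission"}


def flag_records(visits, window_days=14):
    """Return (patient_id, index_date) for outpatient visits followed by a
    trigger event within window_days. Flagged records are candidates for
    expert review, not confirmed diagnostic errors."""
    flagged = []
    for index in visits:
        if index["type"] != "outpatient":
            continue
        deadline = index["date"] + timedelta(days=window_days)
        for later in visits:
            if (later["patient_id"] == index["patient_id"]
                    and later["type"] in FOLLOW_UP_EVENTS
                    and index["date"] < later["date"] <= deadline):
                flagged.append((index["patient_id"], index["date"]))
                break
    return flagged


# Invented records for three patients.
visits = [
    {"patient_id": "A", "date": date(2015, 1, 5), "type": "outpatient"},
    {"patient_id": "A", "date": date(2015, 1, 10), "type": "ed"},
    {"patient_id": "B", "date": date(2015, 1, 5), "type": "outpatient"},
    {"patient_id": "B", "date": date(2015, 3, 1), "type": "unplanned_admission"},
    {"patient_id": "C", "date": date(2015, 1, 7), "type": "outpatient"},
]
flagged = flag_records(visits)
print(flagged)
```

Only patient A is flagged here, because patient B's admission falls outside the assumed window and patient C has no follow-up event. A trigger enriches the sample for likely errors, so the share of flagged records found to contain an error cannot be read as a population incidence without accounting for how the sample was selected.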
Methods for determining if an error occurred (numerator). There are two common methods for determining if an error occurred: implicit and explicit. In the implicit method, an expert reviewer, taking into account all of the information that is available in the medical record, determines whether or not an accurate or timely diagnosis was made and, if a defect in the process occurred, the nature of that problem. In the explicit method, specific criteria are developed and data are abstracted from the medical record to determine whether or not an error occurred. The reliability of implicit and explicit methods for assessing quality of care and patient safety has been studied. Generally, implicit methods have been found to be less reliable than explicit methods (Hofer et al., 2004; Kerr et al., 2007). In the Utah and Colorado Medical Practice Study, which was one of the sources for estimating medical errors in the IOM’s To Err Is Human report, the inter-rater reliability (agreement among reviewers) was κ=0.40–0.41 (95 percent confidence interval, 0.30–0.51) for identifying adverse events and κ=0.19–0.24 (95 percent confidence interval, 0.05–0.37) for identifying negligent adverse events (Thomas et al., 2002). These rates are considered moderate to poor (Landis and Koch, 1977). The reliabilities for the Harvard Medical Practice Study were in the same range (Brennan et al., 1991). Zwaan et al. (2010) reported a reliability of κ=0.25 (95 percent confidence interval, 0.05–0.45) (fair) for identifying adverse events and of κ=0.40 (95 percent confidence interval, 0.07–0.73) (moderate) for whether the event was preventable. Reliability in turn can affect the event rate that is reported. By contrast, the inter-rater reliability for explicit review of records for quality studies has been reported at approximately 0.80 (McGlynn et al., 2003).
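The κ statistics cited above measure agreement between reviewers after correcting for the agreement expected by chance. As a minimal sketch, assuming two reviewers who each make a yes/no judgment on the same set of records, Cohen's kappa can be computed as follows. The function name and the ten judgments are invented for illustration and do not come from any of the cited studies.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal rate, summed over categories.
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Invented judgments: 1 = reviewer found an adverse event, 0 = no event.
reviewer_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reviewer_2 = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2))
```

In this invented example the reviewers agree on 8 of 10 records, yet kappa is about 0.58 because roughly half of that agreement would be expected by chance alone. This is why raw percent agreement tends to overstate the reliability of implicit review.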
What is known. Two studies based on medical record reviews reported in the literature in the 1990s and early 2000s estimated that diagnostic errors account for 17 and 7 percent of adverse events in hospitalized patients, respectively. In the Harvard Medical Practice Study of more than 30,000 patient records, diagnostic errors were identified in 17 percent of the adverse events (Leape et al., 1991). A review of 15,000 records from Colorado and Utah found that diagnostic errors constituted 6.9 percent of adverse events (Thomas et al., 2000).
More recently, Zwaan and colleagues conducted a retrospective patient record review to assess the occurrence of diagnostic adverse events (harm associated with a diagnostic error) within hospitals in the Netherlands (Zwaan et al., 2010). Those researchers found that diagnostic adverse events occurred in 0.4 percent of all hospital admissions and that diagnostic adverse events accounted for 6.4 percent of all adverse events. The researchers had reviewers classify the causes of diagnostic adverse events by human, organizational, technical, patient-related, and other factors (Zwaan et al., 2010). They further divided the “human” category into knowledge-based, rule-based, skill-based, or other (such as violations or failures by deliberate deviations from rules or procedures). They found that human failures were the main cause of diagnostic adverse events—96.3 percent of these events had a human cause.4 However, organizational and patient-related factors were present in 25.0 percent and 30.0 percent of diagnostic adverse events, respectively. The researchers found that the primary causes of diagnostic adverse events were knowledge-based failures (physicians did not have sufficient knowledge or applied their knowledge incorrectly) and information transfer failures (physicians did not receive the most current updates about a patient).
4 It is likely that the “human failures” identified in this study actually related to work system factors.
In another study by Zwaan and colleagues (2012), rather than focusing exclusively on adverse events, the researchers had four internists review 247 patient medical records for patients with dyspnea (shortness of breath) symptoms. The reviewers used a questionnaire to identify failures in diagnostic reasoning, diagnostic errors, and harm. They found that failures in diagnostic reasoning occurred in 66 percent of the cases, that diagnostic errors occurred in 13.8 percent of all cases, and that the patient was harmed in 11.3 percent of cases. Although cases with diagnostic errors and patient harm had more failures in diagnostic reasoning, in 4 percent of the cases diagnostic errors occurred in the absence of diagnostic reasoning failures.
Singh et al. (2014) estimated the frequency of diagnostic error in the outpatient setting using data from three prior studies (Murphy et al., 2014; Singh et al., 2010a, 2012a). Two of the studies used “triggered” electronic queries to identify suspected cases of diagnostic error. In one study these triggers identified medical records in which a patient had a primary care visit followed by an unplanned hospitalization or unscheduled follow-up appointment, while the other study looked for a lack of follow-up for abnormal colorectal cancer findings. The third study examined consecutive cases of lung cancer. Physicians reviewed medical records to determine if there was a diagnostic error (defined as a missed opportunity to make or pursue the correct diagnosis when adequate data were available at the index [i.e., first] visit) (Singh et al., 2012a). The combined estimate of diagnostic error based on these three datasets was about 5 percent. Extrapolating to the entire U.S. population, Singh et al. (2014) estimated that approximately 12 million adults (or 1 in 20 adults) experience a diagnostic error each year; the researchers suggested that about half of these errors could be potentially harmful. Due to the definition of diagnostic error that Singh and colleagues employed, they asserted, as have other researchers, that this number may be a conservative estimate of the rate of outpatient diagnostic errors (Aleccia, 2014).
Opportunities for improvement Medical records will continue to be an important source of data for assessing diagnostic errors. The advent of electronic forms that make some methods more cost-efficient, combined with mechanisms such as health information exchanges that may make it easier to assemble the entire patient diagnostic episode, may enhance the use of these methods. Developing a standard method that could be applied to a random sample of records (either nationally or in prespecified settings) would enhance opportunities to learn about both the incidence and the variation in the likelihood of patients experiencing a diagnostic error. Greater attention to the reliability with which the method is applied, particularly through the use of explicit rather than implicit methods, would also enhance the scientific strength of these studies.
Medical Malpractice Claims
Description of the data source Medical malpractice claims are defined as the electronic and paper databases maintained by professional liability insurers on claims that have been filed by patients or their families seeking compensation for alleged medical errors, including diagnostic errors; the information in support of the claims (medical records, depositions, other reports); and the final determination, whether achieved through a settlement or a court ruling. In addition to files maintained by insurers, the Health Resources and Services Administration, an agency within the Department of Health and Human Services (HHS), maintains the National Practitioner Data Bank (NPDB). The NPDB is a repository of clinician names, affiliations, and malpractice payments that have been made. It serves primarily as a system to facilitate comprehensive review of the credentials of clinicians, health care entities, providers, and suppliers, but it has been used for research as well. Many states also require claim reporting for purposes of maintaining a state-level database of paid claim information.
Notes about the data source For a diagnostic error to be included in malpractice claims datasets, a patient must have filed a claim, which is a relatively rare event (Localio et al., 1991), and is more likely if the patient has experienced significant harm or if negligence is a factor. For example, one study using data from the Harvard Medical Practice Study estimated that the probability of negligent injury was 0.43 percent and that the probability of nonnegligent injury was 0.80 percent (Adams and Garber, 2007). Furthermore, the probability that a claim would be filed was 3.6 percent if a negligent injury occurred and 3.2 percent if a nonnegligent injury occurred. The probability that a claim would be paid was 91 percent for negligent injury claims and 21 percent for nonnegligent injury claims. Thus, malpractice claims data provide a small window into the problem of diagnostic errors and are biased toward more serious diagnostic errors. For diagnosis-related claims, an average of 5 years elapses between the incident and the settlement of the claim (Tehrani et al., 2013). The validity of claims is uncertain; some claims will be filed and closed when no error occurred. Many, if not most, errors do not lead to malpractice claims. Cases may also be dismissed even when a true diagnostic error occurred.
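The narrowing from injuries to filed claims to paid claims can be made concrete by chaining the probabilities quoted above. The sketch below only multiplies the figures given in the text; the base of 100,000 is an arbitrary illustrative choice, and the text does not state the unit of the underlying population.

```python
# Probabilities quoted in the text (Adams and Garber, 2007), as fractions.
P_INJURY = {"negligent": 0.0043, "nonnegligent": 0.0080}
P_CLAIM_GIVEN_INJURY = {"negligent": 0.036, "nonnegligent": 0.032}
P_PAID_GIVEN_CLAIM = {"negligent": 0.91, "nonnegligent": 0.21}


def expected_counts(base=100_000):
    """Expected injuries, filed claims, and paid claims per `base` units.

    `base` is an illustrative denominator, not a figure from the text.
    """
    out = {}
    for kind in P_INJURY:
        injuries = base * P_INJURY[kind]
        claims = injuries * P_CLAIM_GIVEN_INJURY[kind]
        paid = claims * P_PAID_GIVEN_CLAIM[kind]
        out[kind] = {"injuries": injuries, "claims": claims, "paid": paid}
    return out


COUNTS = expected_counts()
for kind, row in COUNTS.items():
    print(kind, {name: round(value, 1) for name, value in row.items()})
```

Under these figures, about 15 of every 430 negligent injuries and about 26 of every 800 nonnegligent injuries would lead to a filed claim, which illustrates how small the window provided by claims data is.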
Methods for identifying cases for review (denominator) Studies of diagnostic error using malpractice claims data use all malpractice claims (any allegation) as the denominator.
Methods for determining if an error occurred (numerator) In malpractice claims, the allegation in the claim is the basis for a determination; multiple allegations can be associated with a single claim. A number of studies have assessed the validity of malpractice claims (Localio et al., 1991; Studdert et al., 2000, 2006). Generally speaking, studies use only closed claims, that is, those for which the insurer has determined that no further legal action will be taken (claims may be closed due to settlement, verdict, dismissal, abandonment, or other reasons). Data from CRICO’s Comparative Benchmarking System indicate that 63 percent of closed diagnosis-related cases were withdrawn, denied, or dismissed with no indemnity payment (CRICO, 2014).
What is known Tehrani et al. (2013) analyzed 25 years of closed medical malpractice claims from the National Practitioner Data Bank in order to characterize the frequency, patient outcomes, and economic consequences of diagnostic errors. The researchers found that diagnostic errors were the leading type of paid malpractice claims (28.6 percent) and were responsible for the highest proportion of total payments (35.2 percent) (Tehrani et al., 2013). Diagnostic errors were almost twice as likely to be associated with patient death as other allegation categories (such as treatment, surgery, medication, or obstetrics claims). Almost 70 percent of diagnostic error claims were from the outpatient setting, but inpatient diagnostic error claims were more likely to be associated with patient death. The researchers estimated that the 2011 inflation-adjusted mean and median per claim payout for diagnostic error were $386,849 and $213,250, respectively.
Schiff and colleagues (2013) reviewed closed primary care malpractice claims in Massachusetts from 2005 to 2009. During that 5-year period, 551 medical malpractice claims were from primary care practices. More than 70 percent of the allegations were related to diagnosis. The diagnoses most often appearing in these claims were cancer, heart diseases, blood vessel diseases, infections, and stroke.
CRICO has conducted comprehensive analyses of its claim files and associated medical records for diagnostic errors (CRICO, 2014; Siegal, 2014). CRICO’s database represents about 30 percent of the NPDB and includes around 400 hospitals and health care entities and 165,000 physicians. In CRICO’s analysis of data from 2008 to 2012 (including more than 4,500 cases and more than $1 billion total incurred losses), the organization reported that diagnosis-related claims represented 20 percent of cases by volume and 27 percent of indemnity payments. It found that diagnostic errors are more common in the ambulatory care setting than in the inpatient or emergency department setting (56 percent versus 28 percent and 16 percent, respectively). Within the inpatient setting, the top diagnoses represented in closed malpractice claims included myocardial infarction (MI) and cardiac events, complications of care (failure to rescue), and infections/sepsis (Siegal, 2014). In the ambulatory care setting, cancer, cardiac care (including MI), and injury (orthopedic, head, and spine) represented the top diagnoses in paid claims. CRICO found that cancer represented almost one-third of all the diagnosis-related medical malpractice claims.
The Doctors Company, another large national medical liability insurer, compiled information from its 2007–2013 claims database for the committee. In its analysis of diagnosis-related claims, The Doctors Company included information from 10 medical specialties (internal medicine, family medicine, obstetrics, cardiology, gynecology, general surgery, emergency medicine, orthopedics, pediatrics, and hospital medicine). For the 10 specialties, diagnosis-related claims constituted between 9 percent (obstetrics) and 61 percent (pediatrics) of total claims. The analysis included the top five diagnoses associated with each specialty’s malpractice claims. That analysis indicated that more than half of the diagnoses appeared within multiple specialties and generally were for commonly encountered diseases (such as acute MI, acute cerebral vascular accident, cancer, and appendicitis) (Troxel, 2014).
Opportunities for improvement For malpractice claims to be useful for estimating the incidence of diagnostic error, it will be necessary to develop a better understanding of the underlying prevalence of diagnostic error as well as of the probability that a claim will be filed if an error has occurred and the likelihood that a filed claim will be settled. This will require significant research activity, and such research would have to explore variations by geography, specialty, type of error, and other factors. Databases from malpractice insurers contain much more clinical detail than the NPDB and are likely to be more useful in describing patterns of diagnostic errors, such as the steps in the diagnostic process that present the highest risk for different diagnoses. CRICO’s benchmarking studies demonstrate the utility of these data for understanding where in the diagnostic process errors are most likely to occur and what factors contributed to the error. This can be useful for designing both monitoring and improvement programs.
Health Insurance Claims
Description of the data source The data source consists of electronic databases maintained by health insurance companies that contain the details of bills submitted by health care professionals and organizations for payment of services delivered. Both public (e.g., Medicare, Medicaid) and private (e.g., Aetna, Blue Cross, United Healthcare) entities maintain such databases on the individuals with whom they have a contractual arrangement to provide payment. Typically, health care professionals and organizations bill multiple insurers for services.
Notes about the data source For information to be present in the database, a patient has to have used a service, a claim must have been filed, the service must have been covered, and (usually) payment must have been made. Claims are based on structured coding systems (ICD-9/10, CPT-IV, NDC, DRG) and do not generally include clinical details (e.g., results of history and physical examinations, diagnostic testing results) except as categorical codes. Because data are available electronically and represent the universe of claims filed for any insurer, the probability that a patient or episode of care has been selected for analysis can be calculated. Because health care professionals and organizations bill multiple insurance companies, each of which has different rules, it can be difficult to understand the health care professionals’ and organizations’ overall practices with data from a single source.
Methods for identifying cases for review (denominator) Although a random sample of claims or groups of claims could be selected, it is more common to focus studies on those with patterns of care consistent with the possibility that a diagnostic error occurred.
Methods for determining if an error occurred (numerator) Frequently, an algorithm is developed to determine when an error likely occurred, such as cases in which there is no evidence that a diagnostic test was done prior to a new diagnosis being made (e.g., breast cancer diagnosis in the absence of a screening mammogram). Health insurance claims data may be linked to other data sources (e.g., National Death Index, diagnostic testing results, medical records) to make a determination that an error occurred.
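An algorithm of the kind described above might look like the following. This is a hedged sketch only: the claim layout, the code labels (placeholders, not real ICD or CPT codes), and the two-year look-back are assumptions made for illustration, and a flagged case indicates a pattern worth reviewing rather than a confirmed error.

```python
from datetime import date, timedelta

# Hypothetical claim lines: (patient_id, service_date, code).
# The code labels are placeholders, not real ICD or CPT codes.
CLAIMS = [
    ("P1", date(2013, 3, 1), "SCREENING_MAMMOGRAM"),
    ("P1", date(2013, 9, 15), "BREAST_CANCER_DX"),
    ("P2", date(2013, 10, 2), "BREAST_CANCER_DX"),
    ("P3", date(2010, 1, 5), "SCREENING_MAMMOGRAM"),
    ("P3", date(2013, 6, 20), "BREAST_CANCER_DX"),
]


def diagnoses_without_prior_test(claims, dx_code, test_code, lookback_days=730):
    """Return patients with a diagnosis claim but no test claim on or before
    the diagnosis date within lookback_days (an assumed window)."""
    lookback = timedelta(days=lookback_days)
    flagged = []
    for patient, dx_date, code in claims:
        if code != dx_code:
            continue
        has_test = any(
            p == patient
            and c == test_code
            and timedelta(0) <= dx_date - d <= lookback
            for p, d, c in claims
        )
        if not has_test:
            flagged.append(patient)
    return flagged


NO_PRIOR_TEST = diagnoses_without_prior_test(
    CLAIMS, "BREAST_CANCER_DX", "SCREENING_MAMMOGRAM"
)
print(NO_PRIOR_TEST)
```

Here P2 has no test claim at all and P3's only test falls outside the assumed window, so both are flagged. Linking to other data sources, as the text notes, would be needed to decide whether either case reflects an error.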
What is known Within the quality and safety field, improvements in the measurement of both process and outcome measures of quality have been made possible by the expanding use of health information technology (health IT) and health insurance claims databases over the past several decades. For example, health insurance claims databases linked to validated federal death registries have made possible the measurement of 30-day mortality for acute MI, heart failure, and pneumonia, all of which are considered as outcome measures of quality. Similar databases provide the backbone for measuring process quality measures (such as 30-day rehospitalizations, appropriate assessment of left ventricular function in patients with congestive heart failure, and retinopathy screening among patients with diabetes). There are a few examples of the use of these data for investigating diagnostic error. Newman-Toker and colleagues (2014) identified patients who were admitted to the hospital with a diagnosis of stroke who in the previous 30 days had been treated and released from an emergency department for symptoms consistent with a stroke. They found that 12.7 percent of stroke admissions reflected potential missed stroke diagnoses and 1.2 percent reflected probable missed diagnoses.
These rates suggest that 15,000 to 165,000 stroke diagnoses are missed annually in the United States, with a higher risk for missed diagnoses among younger, female, and white patients. The researchers note that their estimates of diagnostic error are inferred rather than confirmed because of the lack of clinical detail in health insurance claims.
Opportunities for improvement Health insurance claims databases maintained by the Centers for Medicare & Medicaid Services (CMS) and by commercial insurers offer the possibility of measuring certain types of diagnostic errors, identifying their downstream clinical consequences and costs, and understanding the system-level, health care professional–level, and patient-level factors that are associated with these errors.
For example, analyses of claims data could be used in “look back” studies to identify the frequency with which acute coronary syndrome is misdiagnosed. Specifically, for those enrollees who are ultimately diagnosed with acute coronary syndrome, analysts could explore how frequently these beneficiaries were seen by health care professionals in the week prior to ultimate diagnosis (either in outpatient, emergency department, or hospital settings), the incorrect diagnoses that were made, and the factors associated with the diagnostic error. For instance, this epidemiologic approach using large administrative databases would make it possible to determine whether the diagnostic error occurs more frequently in specific hospitals, among specific types of clinicians or practice settings, or during particular days of the week when staffing is low or the volume of patients treated is unexpectedly high. The strength of this approach to understanding the epidemiology of diagnostic error is its ability to provide national estimates of diagnostic error rates across a vast array of conditions; to understand how these diagnostic error rates vary across geography and specific settings of care; to study the impact of specific care delivery models on diagnostic error rates (e.g., do accountable care organizations lower diagnostic errors?); and to update measurements as quickly as the administrative data are themselves collected. The main critique of this approach concerns the validity of the findings because of the limited availability of the clinical data necessary to confirm a diagnosis. Thus, this data source may be most useful in combination with other sources.
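The look-back design described above can be sketched in a few lines. The example assumes a hypothetical encounter table with free-text diagnosis labels; real claims data would use coded diagnoses. The seven-day window follows the "week prior" mentioned in the text, and a prior visit found this way is a potential missed opportunity, not a confirmed one.

```python
from collections import Counter
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical encounters: (patient_id, date, setting, diagnosis label).
ENCOUNTERS = [
    ("E1", date(2014, 5, 3), "emergency", "chest wall pain"),
    ("E1", date(2014, 5, 6), "hospital", "acute coronary syndrome"),
    ("E2", date(2014, 5, 12), "outpatient", "reflux"),
    ("E2", date(2014, 5, 14), "hospital", "acute coronary syndrome"),
    ("E3", date(2014, 5, 20), "hospital", "acute coronary syndrome"),
    ("E4", date(2014, 3, 1), "outpatient", "anxiety"),
    ("E4", date(2014, 5, 25), "hospital", "acute coronary syndrome"),
]


def look_back(encounters, target_dx, window_days=7):
    """For each encounter with target_dx, collect the same patient's earlier
    encounters within window_days that carried a different diagnosis."""
    window = timedelta(days=window_days)
    prior = []
    for patient, dx_date, _, dx in encounters:
        if dx != target_dx:
            continue
        for p, d, setting, earlier_dx in encounters:
            gap = dx_date - d
            if p == patient and earlier_dx != target_dx and timedelta(0) < gap <= window:
                prior.append((p, d, setting, earlier_dx))
    return prior


PRIOR = look_back(ENCOUNTERS, "acute coronary syndrome")
BY_SETTING = Counter(setting for _, _, setting, _ in PRIOR)
BY_DAY = Counter(DAY_NAMES[d.weekday()] for _, d, _, _ in PRIOR)
print(len(PRIOR), dict(BY_SETTING), dict(BY_DAY))
```

Tabulating the earlier visits by setting or day of the week is the kind of stratification the paragraph describes; with real data the same counts could be broken down by hospital, clinician type, or care delivery model.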
Diagnostic Testing (Anatomic and Clinical Pathology)
Description of the data source Diagnostic testing includes the examination of secretions, discharges, blood, or tissue using chemical, microscopic, immunologic, or pathologic methods for the purposes of making or ruling out a diagnosis. Analysis of the data may involve automated processes or a visual examination by trained health care professionals (clinical and anatomic pathologists).
Notes about the data source A unique feature of this type of data is that the original source data (the samples) are frequently available for reanalysis or inspection by another health care professional, thus allowing for an independent assessment based on the same data. A common taxonomy in this field distinguishes among five phases: pre-pre-analytic (i.e., deciding whether or not to order a particular test), pre-analytic (i.e., sample labeling and acquisition, test performance), analytic (i.e., the accuracy of the test or examination of the sample), post-analytic (i.e., the results are reported correctly, interpreted correctly, and communicated back to the ordering clinician in a timely way), and post-post-analytic (i.e., the ordering clinician uses test results to inform patient care) (Plebani et al., 2011). For the purpose of examining the incidence of diagnostic error, the committee focused on those circumstances in which diagnostic testing results are a key information source. One study estimated that at least 10 percent of diagnoses require diagnostic testing results in order to be considered final; this number is likely higher today (Epner et al., 2013; Hallworth, 2011; Peterson et al., 1992). Primary care clinicians order tests in about one-third of patient visits (Hickner et al., 2014). For anatomic pathology specimens, which require visual inspection and clinical judgment, second reviews by another pathologist offer insight into the potential rate of diagnostic error.
Methods for identifying cases for review (denominator) Two methods—random samples and prespecified criteria—are commonly used to identify cases. Both methods allow for the denominator to be characterized (i.e., the probability that a case was reviewed, the characteristics of the cases reviewed as compared to all cases).
Methods for determining if an error occurred (numerator) Because testing involves multiple steps, there are many different methods for identifying errors, including an examination of other data sources such as medical records, malpractice claims, or pharmacy databases (Callen et al., 2011). For second review studies, an error is typically defined as a discrepancy between the findings of the first pathologist and the second pathologist. This review can identify errors in which a finding that leads to a diagnosis was missed and errors in which a finding was inaccurate (i.e., no disease was found by the second reviewer). Second review studies typically assume that the second review is more accurate, but these studies do not typically link to patient outcomes. Examining second reviews that were linked to patient outcomes, Renshaw and Gould (2005) concluded that in many cases the first reviewer was correct. For other diagnostic tests, errors may be detected in the interpretation or communication of results in a timely manner.
What is known Plebani reported that errors in laboratory medicine studies vary greatly because of the heterogeneity in study designs and the particular step or steps in the process that were examined (Plebani, 2010). A considerable focus on the analytic phase has led to substantial reductions in errors in that step; the pre- and post-analytic phases are seen as more vulnerable to error. A review published in 2002 (that only classified the diagnostic testing process in three phases) found that 32 to 75 percent of errors occurred in the pre-analytic phase, 13 to 32 percent in the analytic phase, and 9 to 31 percent in the post-analytic phase (Bonini et al., 2002). A study of urgent diagnostic testing orders in the hospital, which also classified the diagnostic testing process in three phases, found that 62 percent of errors were in the pre-analytic phase, 15 percent in the analytic phase, and 23 percent in the post-analytic phase (Carraro and Plebani, 2007). One study estimated that 8 percent of errors had the potential to result in serious patient harm (Goldschmidt and Lent, 1995). A systematic review of the literature on follow-up of test results in the hospital found failure rates of 1 to 23 percent in inpatients and 0 to 16.5 percent in emergency department patients (Callen et al., 2011).
As Berner and Graber (2008) note, second reviews in anatomic pathology identify varying discrepancy rates. The College of American Pathologists and the Association of Directors of Anatomic and Surgical Pathology recently published guidelines based on a systematic review of the literature which found a median rate of major discrepancies in 5.9 percent of cases (95 percent confidence interval, 2.1–10.5 percent) (Nakhleh et al., 2015). The study also reported variations in the rate by the service performed (surgical pathology versus cytology), the organ system (single versus multiple), and the type of review (internal versus external). Kronz and Westra (2005) report a diagnostic discrepancy rate for the head and neck found by second review of between 1 and 53 percent for surgical pathology and 17 to 60 percent for cytopathology. A study by Gaudi and colleagues (2013) found that pathologists with dermatopathology fellowship training were more likely to disagree with preliminary diagnoses provided by nonspecialist pathologists.
Opportunities for improvement The contribution of diagnostic testing to diagnosis is substantial, but it has not been systematically quantified recently. The understanding of this critical information source could be improved by developing better methods for identifying and enumerating the diagnoses for which such testing is critical, mechanisms for evaluating the appropriateness of test ordering, and methods for determining the impact on patient outcomes. Additionally, studies that use diagnostic variance as a surrogate for accuracy (second reviews in which the second reviewer is considered more accurate) could benefit from the inclusion of patient outcomes.
Medical Imaging
Description of the data source The data are visual representations of the interior of the body generated using a variety of methods (e.g., X-ray, ultrasound, computed tomography [CT], magnetic resonance imaging, and positron emission tomography) that are collected for the purpose of diagnosis; these visual representations generally require interpretation by a radiologist or, in certain circumstances, physicians in nuclear medicine, emergency medicine, or cardiology. In this context, the medical imaging data are reviewed by at least one other clinician, and the findings of all health care professionals are recorded.
Notes about the data source As with anatomic pathology, a unique feature of this data type is the availability of the original images for review by a second radiologist. The focus is on those diagnoses for which medical imaging results are a key information source. In approximately 15 percent of office visits, an imaging study is ordered or provided (CDC, 2010), whereas one or more medical imaging studies are ordered in approximately 47 percent of emergency department visits (CDC, 2011). In both settings, X-rays are the most common imaging method used.
Methods for identifying cases for review (denominator) Typically a random sample of cases is selected for second review, although some studies have included prespecified criteria (e.g., cases known to have higher potential rates of error in interpretation, or abnormal findings only).
Methods for determining if an error occurred (numerator) An error is assumed to have occurred whenever a discrepancy exists between the two clinicians in interpreting the medical imaging study. Some studies have also involved radiologists conducting a second review of their own previously completed studies.
What is known Berlin noted that medical imaging discrepancy rates as indicated by second review have not changed much over the past 60 years (Berlin, 2014). For instance, a study by Abujudeh and colleagues explored intra- and interobserver variability in medical imaging by having three experienced radiologists review 30 of their own previously interpreted CT exams and 30 CT exams originally interpreted by other radiologists (Abujudeh et al., 2010). They found a major discrepancy rate of 26 percent for interobserver variability and 32 percent for intraobserver variability. Velmahos and colleagues (2001) found an 11 percent discrepancy rate between the preliminary and final readings of CT scans of trauma patients. Discrepancy rates were negatively associated with level of experience: The lower the level of experience of the preliminary reader, the more likely there was to be a discrepancy. In many of the second review studies in imaging, high error rates resulted from using a denominator that consisted only of abnormal cases. Studies that look at real-time errors—that is, devising an error rate using both normal and abnormal exams as the denominator—suggest an error rate in the 3 to 4.4 percent range (Borgstede et al., 2004).
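The effect of the denominator on a reported discrepancy rate can be shown with simple arithmetic. The counts below are hypothetical and chosen only for illustration; they are not drawn from any of the cited studies, and the sketch assumes for simplicity that every discrepancy occurs in an abnormal exam.

```python
def discrepancy_rate(discrepancies, exams_reviewed):
    """Share of reviewed exams on which the two readings disagreed."""
    if exams_reviewed <= 0:
        raise ValueError("exams_reviewed must be positive")
    return discrepancies / exams_reviewed


# Hypothetical counts for illustration only; not taken from any study.
TOTAL_EXAMS = 1000      # normal and abnormal exams together
ABNORMAL_EXAMS = 100    # subset with abnormal findings
DISCREPANCIES = 30      # assumed to fall entirely within the abnormal subset

RATE_ABNORMAL_ONLY = discrepancy_rate(DISCREPANCIES, ABNORMAL_EXAMS)
RATE_ALL_EXAMS = discrepancy_rate(DISCREPANCIES, TOTAL_EXAMS)
print(f"{RATE_ABNORMAL_ONLY:.1%} vs {RATE_ALL_EXAMS:.1%}")
```

The same 30 discrepancies yield a rate ten times higher when only abnormal exams are counted, which is why the choice of denominator needs to be stated when discrepancy rates from different studies are compared.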
Opportunities for improvement Medical imaging plays a key role in many diagnoses, and errors in the use and interpretation of these studies can contribute to diagnostic error. For the purposes of estimating the incidence of diagnostic error due to errors related to medical imaging, it would be useful to identify the subset of diagnoses for which medical imaging results are central to making the diagnosis and to conduct studies to determine the likelihood of errors, the nature of those errors, and the variation in the circumstances under which errors occur. The role of second reviews in error recovery—identifying and “intercepting” errors before they affect patient outcomes—both for medical imaging and for anatomic pathology is discussed in Chapter 6.
Surveys of Clinicians
Description of the data source The data come from questionnaires (written, telephone, interview, Web-based) that obtain clinicians’ self-reports about diagnostic errors they have made or what they know about diagnostic errors made by other clinicians. The information content of such surveys can vary.
Notes about the data source As with all surveys, the results can be affected by a number of biases, including nonresponse bias (nonresponders being systematically different from responders, such as being more or less likely to have committed a diagnostic error) or reporting bias (systematic differences in the information that is revealed or suppressed, such as not reporting more serious errors). Unless the self-report can be compared to an authoritative source, it is difficult to determine the validity of rates based solely on self-report. Surveys usually have the advantage of anonymity, which might make respondents more likely to report their errors accurately than through other methods.
Methods for identifying cases for review (denominator) Surveys are frequently conducted on random samples of clinicians, making the implicit denominator the number of opportunities a clinician had to make a diagnosis in the study period. Convenience samples are also used (e.g., surveys of clinicians participating in a continuing medical education course). Reports of survey findings have used different denominators, but often the denominator is the number of clinicians responding to the survey.
Methods for determining if an error occurred (numerator) An error is judged to have occurred when a clinician self-reports having made one or more diagnostic errors in the study time frame. Some studies have asked about errors known to the clinician that were made by other clinicians or experienced by family members. This approach makes estimating the incidence rate nearly impossible, as the true denominator is unknown.
What is known Schiff et al. (2009) surveyed physicians and asked them to recall instances of diagnostic error. In their analysis of 583 reports of diagnostic error, they found that physicians readily recalled instances of diagnostic error; the most commonly reported diagnostic errors were pulmonary embolism, drug reactions, cancer, acute coronary syndrome, and stroke. Singh and colleagues (2010b, p. 70) surveyed pediatricians about diagnostic errors and found that “more than half of respondents reported that they made a diagnostic error at least once or twice per month.” In another survey of physicians, 35 percent reported that they had experienced medical errors either in their own or a family member’s care (Blendon et al., 2002).
Opportunities for improvement For the purposes of making national estimates of the incidence of diagnostic errors, it would be useful to have more clearly defined sampling frames, more detailed questions about the nature of the errors and the circumstances surrounding the error, and an opportunity to compare this method to other methods that use different data sources. Surveys have the advantage of being a potentially easy way to get a snapshot of diagnostic error rates, but the quality of the information may make this source less useful for other applications. The biases that are inherent in surveys are difficult to overcome and likely limit the utility of this source.
Surveys of Patients
Description of the data source The data come from questionnaires (written, telephone, interview, Web-based) that obtain patients’ self-reports about diagnostic errors they have experienced or their awareness of diagnostic errors experienced by others. The information collected can vary.
Notes about the data source As with all surveys, the results can be affected by nonresponse bias and by reporting bias. Unless there are opportunities to compare answers to other data sources, it may not be possible to confirm the validity of the responses. Patient definitions of diagnostic errors might vary from the definitions of health care professionals. Patient surveys can be very useful in determining whether a new health problem was explained to the patients and whether they understood the explanation.
Methods for identifying cases for review (denominator) Surveys are usually conducted on a sample of patients that is randomly drawn from some population (e.g., geographic area, members of a health plan, and patients who utilize a specific care setting) or selected so that the patients meet certain criteria (similar to the trigger tools discussed above). Convenience samples are also used.
Methods for determining if an error occurred (numerator) The determination of an error is based on self-report by the patient. Some studies inquire about both the patient’s own experience and that of others known to the patient. The latter approach makes it impossible to estimate a true incidence rate because of uncertainty around the real size of the denominator.
What is known In one survey of patients, 42 percent reported that they had experienced medical errors either in their own or a family member’s care (Blendon et al., 2002). A poll commissioned by the National Patient Safety Foundation found that approximately one in six of those surveyed had experience with diagnostic error, either personally or through a close friend or relative (Golodner, 1997). More recently, 23 percent of people surveyed in Massachusetts indicated that they or someone close to them had experienced a medical error, and approximately half of these errors were diagnostic errors (Betsy Lehman Center for Patient Safety and Medical Error Reduction, 2014). Weissman and colleagues (2008) surveyed patients about adverse events during a hospital stay and compared survey-detected adverse events with medical record review. Twenty-three percent of surveyed patients reported at least one adverse event, compared to 11 percent identified by medical record review.
Opportunities for improvement The particular value of patient surveys is likely to be related to understanding failures at the front end of the diagnostic process (failure to engage) and in the process of delivering an explanation to the patient. Both are critical steps, and patients are uniquely positioned to report on those elements of diagnostic performance. The committee did not have examples of this application, and potential future uses are discussed in Chapter 8.
Other Methods
A variety of other methods have been employed to examine different dimensions of diagnostic error. These methods were not included in the table because they are unlikely to be a major source for estimating the incidence of error.
Patient actors, or “standardized patients,” have been used to assess rates of diagnostic error. Patient actors are asked to portray typical presentations of disease, and clinicians are assessed on their diagnostic performance. In one study in internal medicine, physicians made diagnostic errors in 13 percent of interactions with patient actors portraying four common conditions (Peabody et al., 2004). In a more recent multicenter study with unannounced patient actors, Weiner et al. (2010) looked at both biomedical-related errors (such as errors in diagnosis and treatment) and context-related errors (such as the lack of recognition that a patient may be unable to afford a medicine based on certain patient cues) in patient management. They found that physicians provided care that was free from errors in 73 percent of the uncomplicated encounters but made more errors in more complex cases (Weiner et al., 2010).
Many health care organizations in the United States have systems in place for patients and health care professionals to report minor and major adverse events. However, voluntary reporting typically results in under-reporting and covers only a limited spectrum of adverse events (AHRQ, 2014b). For example, one study found that over half of voluntary reports concentrated on medication/infusion adverse events (33 percent), falls (13 percent), and administrative events, such as discharge process, documentation, and communication (13 percent) (Milch et al., 2006). In Maine, the use of a physician champion to encourage voluntary diagnostic error reporting was implemented in 2011. During the 6-month pilot, there were 36 diagnostic errors reported. Half of the diagnostic errors were associated with moderate harm, and 22 percent of the diagnostic errors were classified as causing severe harm (Trowbridge, 2014).
Direct observation is another method that has been used to identify medical errors. Andrews and colleagues (1997) conducted observational research within a hospital setting and found that approximately 18 percent of patients in the study experienced a serious adverse event.
There have also been efforts to assess disease-specific diagnostic error rates, using a variety of data sources and methods. Berner and Graber (2008) and Schiff and colleagues (2005) provide examples of diagnostic errors in a variety of disease conditions.
Summary of Approaches to Assess the Incidence of Diagnostic Error
A number of methods have been used to assess the frequency with which diagnostic error occurs. Based on the committee’s review, the most promising methods for estimating incidence are postmortem exams, medical record reviews, and medical malpractice claims analysis, but none of these alone will give a valid estimate of the incidence of diagnostic error. This conclusion is consistent with studies in the broader area of medical errors and adverse events. For example, the Office of Inspector General of HHS completed an analysis that compared different measurement methods (nurse reviews, analysis of administrative claims data, patient interviews, analysis of incident reports, and an analysis of patient safety indicators) and found that 46 percent of patient safety events were identified by only one of the methods (Office of Inspector General, 2010). Levtzion-Korach and colleagues (2010) compared information gathered with five different measurement approaches—incident reporting, patient complaints, risk management, medical malpractice claims, and executive WalkRounds—and concluded that each measurement method identified different but complementary patient safety issues. In a related commentary, Shojania concluded that “it appears that a hospital’s picture of patient safety will depend on the method used to generate it” (Shojania, 2010, p. 400). This suggests that no one method will perfectly capture the incidence and the nature of medical errors and adverse events in health care: “[A] compelling theme emerged . . . different methods for detecting patient safety problems overlap very little in the safety problems they detect. These methods complement each other and should be used in combination to provide a comprehensive safety picture of the health care organization” (Shekelle et al., 2013, p. 416). 
This likely applies to the measurement of diagnostic errors; with the complexity of the diagnostic process, multiple approaches will be necessary to provide a more thorough understanding of the occurrence of these errors.
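The limited overlap among detection methods can be made concrete with a small calculation. The following is a minimal sketch, not a method used in the studies cited above; the function name `single_method_share` and the event identifiers are hypothetical and serve only to illustrate how one might compute the share of events found by exactly one method.

```python
from collections import Counter


def single_method_share(detections: dict[str, set[str]]) -> float:
    """Return the fraction of distinct events detected by exactly one method."""
    # Count how many methods detected each distinct event.
    counts = Counter(event for events in detections.values() for event in events)
    if not counts:
        return 0.0
    only_one = sum(1 for n in counts.values() if n == 1)
    return only_one / len(counts)


# Hypothetical event IDs, for illustration only (not data from any study).
detections = {
    "record_review": {"e1", "e2", "e3"},
    "incident_reports": {"e3", "e4"},
    "patient_survey": {"e5"},
}
share = single_method_share(detections)
print(f"{share:.0%} of events were found by only one method")
```

A high share in such a calculation would suggest that dropping any single method loses events that no other method captures, which is the argument for using the methods in combination.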
Determining the Causes and Risks of Diagnostic Error
This section describes how measurement can be used to better characterize diagnostic errors by identifying the causes and the risks associated with diagnostic error. Characterization of diagnostic errors requires understanding (1) which aspects in the diagnostic process are susceptible to failures and (2) what the contributing factors to these failures are. The committee used its conceptual model and input from other frameworks to provide a context for the measurement of the causes and the risks of diagnostic error. Measurement can focus on diagnostic process steps, the work system components, or both in order to identify causes and risks of diagnostic error.
The Diagnostic Process and Measurement Approaches to Identifying Potential Failures
Because the diagnostic process is a complex, team-based, iterative process that occurs over varying time spans, there are numerous opportunities for failures. The failures can include (1) the step never occurring, (2) the step being done incompletely or incorrectly (accuracy), and (3) a meaningful delay in taking a step (timeliness). In Figure 3-2, the committee’s conceptual model is used to identify where in the diagnostic process these failures can occur, including the failure of engagement in the health care system, failure in the diagnostic process, failure to establish an explanation of the health problem, and failure to communicate the explanation of the health problem.
Table 3-2 is organized around the major steps in the diagnostic process and adapts Schiff and colleagues’ (2009) framework to the failures associated with each of these steps. For example, diagnostic testing is part of several diagnostic steps where failures may happen, namely, during information gathering, integration, and interpretation. The last column identifies some of the methods that can be used to identify failures in actual practice settings. Experimental laboratory methods are a complementary approach to the methods in Table 3-2 to understand potential failures related to reasoning (Kostopoulou et al., 2009, 2012; Zwaan et al., 2013). The following discussion includes more information about the measurement approaches that can be used at each of these steps.
|Where in the Diagnostic Process the Failure Occurred||Nature of Failure [a]||Methods for Detecting Failures|
|Failure to engage in the health care system or in the diagnostic process||||Analysis of emergency department, urgent care, and other high-risk cohorts; surveys to determine why and what could be done differently|
|Failure in information gathering||||Diagnostic trigger tools (e.g., high-risk cohort algorithms and missed opportunity targets); comparison to checklists; video recording and debriefing (e.g., “stimulated recall”)|
|Failure in information integration||||
|Failure in information interpretation||||Second review of samples|
|Failure to establish an explanation (diagnosis)||||Examination of expected follow-up (e.g., Kaiser Permanente’s SureNet system)|
|Failure to communicate the explanation to the patient||||Video recording and debriefing; medical record review; shared decision making result|
[a] Adapted from Schiff et al., 2009.
Failure of engagement This step primarily involves either patients not recognizing symptoms or health risks rapidly enough to access the health care system or patients experiencing significant barriers to accessing health care. Health care organizations are familiar with routine measures of eligible patients presenting for common screening tests; these systems can be extended to detect other failures to engage (or reengage) related to routine monitoring for disease progress, follow-up of abnormal test results, and so on (Danforth et al., 2014; Kanter, 2014; Singh et al., 2009). Surveys and interviews with patients can be used to identify approaches that are likely to be successful (and unsuccessful) in reducing delays and increasing engagement. The CRICO benchmarking study found that 1 percent of malpractice claims had an error associated with a failure to engage (CRICO, 2014).
Failure in information gathering The information-gathering step can involve failures to elicit key pieces of information; a failure to order the right diagnostic testing (in the right sequence or with the right specification); or technical errors in the way that samples are handled, labeled, and processed. The CRICO benchmarking study found that 58 percent of cases had one or more errors in the initial diagnostic assessment (CRICO, 2014). Failure to order appropriate diagnostic tests has been found to account for 55 percent of missed or delayed diagnoses in malpractice claims in ambulatory care (Gandhi et al., 2006) and 58 percent of errors in emergency departments (Kachalia et al., 2006). In their examination of physician-reported cases of error, Schiff and colleagues (2009) found that a failure or delay in ordering needed tests was the second most common factor contributing to a diagnostic error. Methods of rapid detection might include random reviews, diagnostic trigger tools, checklists, observation, video or audio recording, and feedback.
Failure in interpretation Inaccurate or failed attempts to interpret information gathered in the diagnostic process can involve such things as diagnostic tests, clinical history and interview, or information received from referral and consultation with other clinicians. CRICO reported that 23 percent of cases in its malpractice benchmarking study had errors in diagnostic test interpretation; 49 percent had errors in medical imaging, 20 percent in medicine, 17 percent in pathology, and 8 percent in surgery (CRICO, 2014). Schiff and colleagues (2009) reported that an erroneous laboratory or radiology reading of a test contributed to 11 percent of the diagnostic errors that they examined. Studies have shown that an incorrect interpretation of diagnostic tests occurs in internal medicine (38 percent reported in Gandhi et al., 2006) and emergency medicine (37 percent reported in Kachalia et al., 2006). Hickner and colleagues (2008) found that 8.3 percent of surveyed primary care physicians reported uncertainty in interpreting diagnostic testing. Failure in interpretations for medical imaging and anatomic pathology can be identified through second reviews conducted by expert clinicians.
Failure in integration Integration failures can be divided into failures in hypothesis generation, the suboptimal weighting and prioritization of information gathered in the diagnostic process, and the failure to recognize or weight urgency of clinical signs or symptoms. In examining major diagnostic errors, Schiff and colleagues (2009) found that 24 percent were the result of a failure to consider or a delay in considering the correct diagnosis. Potential approaches to measuring failure in integration include structured debriefings with the clinicians involved, conferences that review diagnostic errors (such as morbidity and mortality [M&M] conferences and root cause analyses), and random reviews.
Failure to establish an explanation (diagnosis) Errors can also arise when no explanation of the patient’s health problem is established. This can include suboptimal weighting and prioritization of clinical signs and symptoms, delays in considering a diagnosis, or failing to follow up with patients (including failing to create and implement an appropriate follow-up plan). CRICO (2014) found that referral errors were common in cancer cases in which there were diagnostic errors (48 percent of cases lacked appropriate referrals or consults). Methods for identifying these failures include random reviews and the analysis of expected follow-up, such as Kaiser Permanente’s SureNet system (Danforth et al., 2014; Graber et al., 2014).
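The analysis of expected follow-up can be illustrated with a simple rule that flags abnormal results lacking any recorded follow-up after a set interval. The following is a minimal sketch under stated assumptions: the record fields (`abnormal`, `result_date`, `followup_date`), the function name `overdue_followups`, and the 30-day window are all hypothetical, and the sketch does not describe how SureNet or any other real system works.

```python
from datetime import date, timedelta


def overdue_followups(results, as_of, window_days=30):
    """Return IDs of abnormal results with no follow-up once the window has passed.

    `results` is a list of dicts with hypothetical keys:
    id, abnormal (bool), result_date (date), followup_date (date or None).
    """
    window = timedelta(days=window_days)
    flagged = []
    for r in results:
        if not r["abnormal"]:
            continue  # normal results are out of scope for this check
        deadline = r["result_date"] + window
        if r["followup_date"] is None and as_of > deadline:
            flagged.append(r["id"])
    return flagged


# Hypothetical records, for illustration only.
results = [
    {"id": "r1", "abnormal": True, "result_date": date(2024, 1, 5), "followup_date": None},
    {"id": "r2", "abnormal": True, "result_date": date(2024, 1, 10), "followup_date": date(2024, 1, 20)},
    {"id": "r3", "abnormal": False, "result_date": date(2024, 1, 2), "followup_date": None},
    {"id": "r4", "abnormal": True, "result_date": date(2024, 2, 25), "followup_date": None},
]
flagged = overdue_followups(results, as_of=date(2024, 3, 1))
print(flagged)
```

In practice the appropriate window would likely differ by test type and clinical urgency; a single fixed interval is used here only to keep the sketch short.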
Failure to communicate the explanation Failures to communicate the explanation of a patient’s health problem can include cases in which no communication was attempted, in which there was a delay in communicating the explanation, or in which the communication occurred but it was not aligned with a patient’s health literacy and language needs and was not understood. CRICO (2014) reported that 46 percent of cases in its benchmarking study involved a failure in communication and follow-up, including 18 percent of cases where the clinician did not follow up with the patient and 12 percent of cases where the information was not communicated within the care team. Potential measurement methods for this step include video recording and debriefing, patient surveys, medical record reviews, and shared decision-making results.
Other researchers have employed different classification schemes to illustrate where in the diagnostic process failures occur. For example, some researchers have classified the diagnostic process into three phases: initial diagnostic assessment; diagnostic test performance, interpretation, and results reporting; and diagnostic follow-up and coordination (CRICO, 2014; Lyratzopoulos et al., 2015). Another framework that is useful to depict the steps in the diagnostic testing process where failures can occur is the brain-to-brain loop model described in Chapter 2. The nine-step process was originally developed in the laboratory medicine setting (Lundberg, 1981; Plebani et al., 2011), but it can be applied to anatomic pathology and medical imaging as well. Targeted measurement has shown that the phases of the process that are most prone to errors occur outside of the analytical phase and include test ordering (part of the diagnostic process information-gathering step) and subsequent decision making on the basis of the test results (part of the interpretation step) (Epner et al., 2013; Hickner et al., 2014; Plebani et al., 2011).
The Work System and Measurement Approaches to Identifying Potential Vulnerabilities and Risk Factors
In considering the options for making significant progress on the problem of diagnostic error, it is important to understand the reasons why these failures occur. For this discussion, the committee draws on the general patient safety literature, and applies it specifically to the challenge of diagnostic error. Traditional approaches to evaluating medical errors have focused on identifying individuals at fault. However, the modern patient safety movement has emphasized the importance of a systems approach to understanding medical errors. According to the IOM report To Err Is Human: Building a Safer Health System:
The common initial reaction when an error occurs is to find and blame someone. However, even apparently single events or errors are due most often to the convergence of multiple contributing factors. Blaming an individual does not change these factors and the same error is likely to recur. Preventing errors and improving patient safety for patients require a systems approach in order to modify the conditions that contribute to errors. People working in health care are among the most educated and dedicated workforce in any industry. The problem is not bad people; the problem is that the system needs to be made safer. (IOM, 2000, p. 49)
Often, a diagnostic error has multiple contributing factors. One analogy that has been employed to describe this phenomenon is the Swiss cheese model developed by psychologist James Reason (AHRQ, 2015a; Reason, 1990). In this model, a component of the diagnostic process would represent a slice of cheese in a stack of slices. Each component within the diagnostic process has vulnerabilities to failure (represented by the holes in a slice of Swiss cheese); in a single step of the diagnostic process, this may not affect the outcome. However, if the vulnerabilities (holes in the Swiss cheese) align, a diagnostic error can result.
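The intuition behind the Swiss cheese model can be expressed numerically. The following is a minimal sketch that assumes each layer fails independently of the others, a simplifying assumption that is not part of the model as described here and that real work systems may well violate; the function name and the probabilities are hypothetical.

```python
from math import prod


def prob_all_layers_fail(hole_probs: list[float]) -> float:
    """Probability that a problem passes through every layer.

    Assumes layers fail independently (an illustrative simplification).
    """
    for p in hole_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(hole_probs)


# Hypothetical per-layer failure probabilities, for illustration only.
layers = [0.1, 0.2, 0.05]
p_error = prob_all_layers_fail(layers)
print(f"Chance that all layers fail together: {p_error:.4f}")
```

Under this assumption, adding a layer or shrinking the holes in any one layer lowers the chance of an error reaching the patient, whereas a shared cause that opens holes in several layers at once (a latent condition, for instance) would make the true risk higher than the product suggests.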
Another way to think about the causes of diagnostic error is to distinguish between active errors and latent errors. Active errors typically involve frontline clinicians (sometimes referred to as the “sharp end” of patient safety) (IOM, 2000). In contrast, latent errors are more removed from the control of frontline clinicians and can include failures in organizations and design that enable active errors to cause harm (often called the “blunt end” of patient safety) (AHRQ, 2015a; IOM, 2000). In the event of a medical error, too often the focus is on identifying active errors, especially within health care organizations with punitive cultures that focus on individual blame and punishment. But the IOM noted that:
Latent errors pose the greatest threat to safety in a complex system because they are often unrecognized and have the capacity to result in multiple types of active errors. . . . Latent errors can be difficult for people working in the system to notice since the errors may be hidden in the design of routine processes in computer programs or in the structure or management of an organization. People also become accustomed to design defects and learn to work around them, so they are often not recognized. (IOM, 2000, p. 55)
In line with the IOM’s earlier work, the committee took a systems approach to understanding the causes and risks of diagnostic errors. Consistent with the committee’s conceptual model, measurement for this purpose examines the different dimensions of the work system to identify the circumstances under which diagnostic errors are more (and less) likely to occur and to identify the risk factors for such errors. Factors contributing to diagnostic errors can be mapped along the components of the work system, including diagnostic team members and their tasks, technologies and tools, organizational characteristics, the physical environment, and the external environment.
Some of the more familiar approaches for assessing the system causes of medical errors are M&M conferences that apply a modern patient safety framework (a focus on understanding contributing factors rather than a focus on individual errors and blame) (Shojania, 2010) and root cause analyses (AHRQ, 2015b). For example, root cause analysis methods were applied to identify the factors that contributed to delays in diagnosis in the Department of Veterans Affairs system (Giardina et al., 2013). Diagnostic errors have also been evaluated in M&M conferences (Cifra et al., 2015).
As the committee’s conceptual model shows, the diagnostic process is embedded in a work system. Examining how the various dimensions of the work system contribute to diagnostic errors or how they can be configured to enhance diagnostic performance leads to a deeper understanding of the complexity of the process. Table 3-3 identifies the dimensions of the work system, the contribution each makes to diagnostic errors, and examples of measurement methods that have been used to assess each area. Although diagnostic team members are a critical component of the work system, approaches to ensuring diagnostic competency are addressed in Chapter 4 and they are not included here. The focus here is on the specific measurement tools that are available to help health care organizations better identify aspects of the work system that present vulnerabilities for diagnostic errors. A distinctive feature of some of these methods is that they can be used proactively to identify risks before an error occurs, versus the measurement methods described above that examine steps leading to an error that has already occurred.
|Work System Dimension||Contribution to Diagnostic Errors||Examples of Methods for Assessing Effects|
|Tasks and workflow||Information visualization (where, when, and how the information is received in the system); fragmented workflow and lack of support for accurate and timely information flow; work-around strategies that increase risk||Cognitive task and work analysis methods (e.g., decision ladder model); observation of care process (e.g., work sampling; task analysis; video recording of care process and debriefing, e.g., stimulated recall); proactive risk assessment, including failure mode and effects analysis|
|Technology||Lack of support for stages/steps of diagnostic process: information gathering, information integration, information interpretation; information visualization (where, when, and how the information is received in the system)||Observation of technology in use; proactive risk assessment, including failure mode and effects analysis|
|Organizational characteristics||Not supporting work system design efforts aimed at improving the diagnostic process and preventing/mitigating diagnostic errors; conflicting messages about regulations across the organization; confusion about responsibilities for tasks with unclear roles; reluctance to question people with greater authority||Surveys aimed at assessing leadership and management in quality/safety improvement; interviews or focus groups with clinicians and patients|
|Physical environment||Additional stressors on diagnostic team members that can affect cognitive tasks in diagnostic process: information gathering, information integration, and information interpretation||Physical human factors/ergonomics methods (e.g., direct assessment of noise and lighting [with equipment], survey of diagnostic team members regarding physical environment); link analysis for assessment of physical layout and team communication|
Tasks and workflow The diagnostic process involves a series of tasks and an implicit or explicit workflow that contains and connects those tasks. A variety of challenges can occur with the tasks and workflow that are required to make a diagnosis, including problems with the information (amount, accuracy, completeness, appropriateness), communication issues, the complexity of the task, a lack of situational awareness, poor workflow design, interruptions, and inefficiencies. These issues contribute to diagnostic error at each step in the information gathering, integration, and interpretation process; they can contribute to problems with the timeliness of information availability, and they can lead to problems in cognitive processing.
A variety of measurement approaches can be used to evaluate tasks and workflow; they are best applied in the real-world environment in which the diagnosis is being made. The methods include cognitive task and work analysis (Bisantz and Roth, 2007; Rogers et al., 2012; Roth, 2008); observation of care processes (Carayon et al., 2014); situation awareness (Carayon et al., 2014; Salas et al., 1995); workflow modeling (Kirwan and Ainsworth, 1992); and proactive risk assessment (Carayon et al., 2014). These methods are briefly described below.
Cognitive task and work analysis The purpose of cognitive task and work analysis is to identify and describe the cognitive skills that are required to perform a particular task, such as making a diagnosis. The most common method used for such an analysis is an in-depth interview combined with observations of the specific task of interest (Schraagen et al., 2000). Because cognitive errors are an important contributing factor to diagnostic errors (Croskerry, 2003) these methods are likely to have considerable utility in efforts to reduce errors. Koopman and colleagues (2015) used cognitive task analysis to examine the relationship between the information needs that clinicians had in preparing for an office visit and the information presented in the electronic health record. They found a significant disconnect between clinician needs and the amount of information and the manner in which it was presented. This disconnect can lead to cognitive overload, a known contributor to error (Patel et al., 2008; Singh et al., 2013). The researchers recommended significant reengineering of the clinical progress note so that it matched the workflow and information needs of primary care clinicians.
Observation of care processes Process observation is a means of verifying what exactly occurs during a particular process (CAHPS, 2012). Frequently, these observations are documented in the form of process maps, which are graphical representations of the various steps required to accomplish a task. The approach is able to capture the complex demands imposed on members of the diagnostic team, and it allows for the “documentation of the coordination and communication required between clinicians to complete a task, use their expertise, tools, information and cues to problem solve” (Rogers et al., 2012). For example, Fairbanks and colleagues (2010) used this method to examine workflow and information flow in an emergency department’s use of digital imaging by applying both hierarchical task analysis and information process diagrams. The analysis identified gaps in how the information system for imaging supported communication between radiologists and emergency department physicians. In analyzing diagnostic error, this technique can identify the role that contextual or social factors play in assisting or impeding problem resolution (Rogers et al., 2012). Observations of care processes can also provide input for other work system analysis methods, such as cognitive task and work analysis as well as failure mode and effects analysis (FMEA).
Situation awareness Endsley (1995, p. 36) defined situation awareness as “the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future.” Situation awareness has been applied at the individual, team, and system levels. There are a variety of approaches to measuring situation awareness, including objective and subjective measures, performance and behavioral measures, and process indices. Because of the multidimensional nature of the construct, a combination of approaches is likely most useful. Examples of measurement tools in medicine include the Anesthetists’ Non-Technical Skills (ANTS) measure (Fletcher et al., 2003), the Ottawa Global Rating Scales (Kim et al., 2006), and an instrument to measure pediatric residents’ self-efficacy skills (which include situation awareness) in crisis resource management (Plant et al., 2011).
Workflow modeling Workflow modeling is a form of prospective analysis used to describe the processes and activities involved in completing clinical tasks. In contrast to observing work processes, modeling techniques allow for quantitative and qualitative estimations of tasks and of the possible paths that can be taken to complete them (Unertl et al., 2009). Challenges to workflow modeling in health care—and diagnosis in particular—include the fact that clinicians must remain flexible because of the need to respond to the nonroutine presentation of symptoms, results, and events as well as the variability in workflow across different health care organizations. Resulting models can be adapted and modified as necessary to reflect observations of care processes. Numerous methods for workflow modeling exist. Carayon et al. (2012) describe 100 methods in 12 categories (e.g., data display/organization methods and process mapping tools) for workflow modeling of the implementation of health IT. Jun et al. (2009) focus on eight workflow or process modeling methods that have been used in quality improvement projects; these include flowcharts and communication diagrams. These methods have great potential for helping to understand the dynamic sequences of tasks performed by various team members in the diagnostic process.
Proactive risk assessment The term “proactive risk assessment” refers to a variety of methods that are used to identify, evaluate, and minimize potential risks or vulnerabilities in a system. An example of such a method is FMEA. Several steps are involved in FMEA, including graphically describing the process, observing the process to ensure that the diagram is an accurate representation, brainstorming about failure modes (i.e., the different ways in which a particular process can fail to achieve its purpose), conducting a hazard analysis, and developing a plan to address each failure mode along with outcome measures. DeRosier and colleagues (2002) describe the use of this method by the Department of Veterans Affairs (VA) National Center for Patient Safety and provide concrete examples of its application.
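The hazard analysis step can be sketched as a simple scoring and ranking exercise. The following is a minimal illustration only: the 1-to-4 severity and probability scales, the multiplication rule, the threshold of 8, and the listed failure modes are all assumptions introduced for this example and are not taken from the text or presented as the VA's procedure.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    step: str          # process step where the failure could occur
    description: str   # how the step could fail
    severity: int      # assumed scale: 1 (minor) to 4 (catastrophic)
    probability: int   # assumed scale: 1 (remote) to 4 (frequent)

    @property
    def hazard_score(self) -> int:
        # Assumed scoring rule: severity multiplied by probability.
        return self.severity * self.probability


def prioritize(modes, threshold=8):
    """Return failure modes at or above the threshold, highest score first."""
    flagged = [m for m in modes if m.hazard_score >= threshold]
    return sorted(flagged, key=lambda m: m.hazard_score, reverse=True)


# Hypothetical failure modes from a brainstorming session.
modes = [
    FailureMode("test ordering", "needed test not ordered", 3, 3),
    FailureMode("specimen handling", "sample mislabeled", 4, 2),
    FailureMode("result reporting", "report delayed by a day", 2, 2),
]
ranked = prioritize(modes)
for m in ranked:
    print(m.hazard_score, m.step, "-", m.description)
```

A ranking of this kind would feed the final FMEA step, in which a plan and outcome measures are developed for each prioritized failure mode.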
Technology A variety of technologies are used in the diagnostic process, and these can contribute to diagnostic errors for a variety of reasons, including inappropriate technology selection, poor design, poor implementation, use error, technology breakdown or failure, and misuse of automation. Technology failures contribute to problems in information gathering, integration, and interpretation; they may also produce information overload and may interfere with cognitive processes because of problems with the way the information is received and displayed.
Methods for improving the selection, design, implementation, and use of technology involve some of the methods described above, such as workflow modeling, FMEA, and other proactive risk assessment methods. In particular, many health care organizations have been concerned about whether enough attention is being paid to the usability of health IT. For example, in a study of physician job satisfaction, Friedberg and colleagues (2013) found that a number of factors related to electronic health records (EHRs) had a substantial impact on satisfaction, including: poor usability, the time required for data entry, interference in patient interactions, greater inefficiencies in workflow, less fulfilling work content, problems in exchanging information, and a degradation of clinical documentation. This study used a mixed-method design which included semi-structured and structured interviews with physicians. Its findings were consistent with research using other methods to assess the extent to which EHRs are enhancing care delivery (Armijo et al., 2009; Unertl et al., 2009). The American Medical Informatics Association Board of Directors issued recommendations about improving the usability of EHRs that were based in large part on usability studies that had been conducted by Middleton and colleagues (2013). The use of various usability evaluation methods can help in ensuring that usability concerns are addressed as early as possible in the design process. For example, Smith and colleagues incorporated usability testing into the design of a decision-support software tool to catch missed follow-up of abnormal cancer test results in the VA (Smith et al., 2013). These various possible usability evaluation methods include heuristic evaluation methods, scenario-based usability evaluation, user testing, and the observation of technology in use (Gosbee and Gosbee, 2012).
Organizational characteristics Culture, leadership, and management are some of the organizational characteristics that can affect the diagnostic process. Some of the culture-related issues that can contribute to diagnostic error are a lack of organizational support for improvements, conflicting messages about regulations, confusion about task responsibilities, and the perception by people that they should not speak up even when they know a problem is occurring. These issues have been identified in the broader context of patient safety but are likely to affect diagnostic processes as well.
The main mechanisms for assessing these organizational characteristics are surveys (about culture, leadership, management, collaboration, communication) and focus groups. For instance, Shekelle and colleagues (2013) identified a number of survey-based measures in these areas as part of a report on the context-sensitivity of patient safety practices.
Physical environment Various characteristics of the physical environment (e.g., noise, lighting, layout) may affect the diagnostic process (Alvarado, 2012; Parsons, 2000). The physical environment places additional stresses on a diagnostic team that can affect the performance of cognitive tasks and information gathering, integration, and interpretation. For example, the layout and lighting of the radiology reading room may hinder accurate viewing of screens. Emergency departments are another example of a place where it makes sense to examine the effects of the physical environment on diagnostic errors (Campbell et al., 2007).
Human factors/ergonomics methods can be used to evaluate the physical environment. These methods include, for example, making a direct assessment of noise and lighting with specific equipment (e.g., a light meter) and direct observation of care processes to identify challenges related to layout. For instance, observing the physical movements of clinicians can help identify communication among team members and the barriers posed by the physical environment (e.g., lack of available equipment or poorly located equipment; see Potter et al., 2004; Wolf et al., 2006). Surveys can also be used to gather data from a larger population of staff and patients about environmental characteristics, such as the adequacy of lighting and the perception of noise and its impact. In an example of this approach, Mahmood and colleagues (2011) surveyed nurses about the aspects of their physical environment that affected the risk of medication errors. Many of these factors contribute to latent errors, for example by creating conditions under which cognitive functioning is impaired because of the work environment itself.
Summary The committee reviewed a number of methods for assessing the effects of the work system on diagnostic error. This section highlights a number of those methods and illustrates how they have been applied in various health care settings to develop insights into the risks of error and to identify potential areas for improvement. The methods have in common the fact that they combine observation of the actual processes (tasks, communication, interaction with technology) with documentation of those processes. These methods can be relatively labor intensive, and they tend to require application at the individual site level, which implies that this is work that all teams and settings in which diagnoses are made need to become more skilled at undertaking. While standardized tools exist (surveys, methods of observation, and analysis of teams) and might be applied to samples of different types of teams and settings to identify particular vulnerabilities for diagnostic error, the most useful application of these methods is typically for improvement at the local level. The human factors science in this area suggests that a number of likely problems can be readily identified—that is, that deep study may not be necessary—but the complexity of the interactions among these various factors suggests that high levels of vigilance and attention to measurement will likely be necessary throughout the health care system.
Measurement will be critical to assessing whether changes that are intended to improve diagnosis and reduce diagnostic errors are effective. Changes can be implemented and evaluated as part of a quality improvement program or a research project. For both purposes it would be helpful to develop assessment tools that can be implemented within routine clinical practice to rapidly identify potential failures in the diagnostic process, to alert clinicians and health care organizations to diagnostic errors, and to ascertain changes in trends over time. For quality improvement approaches, establishing a baseline (knowing the current rate of failure in a particular step in the diagnostic process using some of the measurement methods in Table 3-2) will provide the main method for understanding whether interventions are having the desired effect. For research studies, the specific aims and change strategy under evaluation will indicate what measurement choice should be made from a broader set of possibilities (e.g., long-term clinical outcomes, diagnostic errors, diagnostic process failures, and contextual variables hypothesized or known to influence diagnostic performance). In some cases the aim of measurement will be to assess whether interventions designed to address specific failures are resulting in lower failure rates. In other cases the aim of measurement will be to assess whether a global intervention reduces multiple causes simultaneously. This purpose relates to the work system focal point for analysis and intervention (Table 3-3 measures). An important contribution to research in this area will be the identification of approaches that can reduce the risk for diagnostic error.
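As a rough illustration of the baseline comparison described above, the following sketch computes the failure rate for one step in the diagnostic process before and after an intervention, together with an approximate (Wald) 95 percent interval for the change. The function names and all counts are hypothetical and are not drawn from any study cited in this chapter; a real evaluation would also need to account for secular trends and case mix.

```python
from math import sqrt


def failure_rate(failures: int, opportunities: int) -> float:
    """Proportion of opportunities in which the diagnostic step failed."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return failures / opportunities


def rate_change(base_fail, base_n, post_fail, post_n, z=1.96):
    """Return (difference, lower, upper) for post-intervention minus baseline.

    Uses a simple Wald interval for a difference of two proportions,
    which is an assumption of this sketch and is only reasonable
    for fairly large samples.
    """
    p0 = failure_rate(base_fail, base_n)
    p1 = failure_rate(post_fail, post_n)
    diff = p1 - p0
    se = sqrt(p0 * (1 - p0) / base_n + p1 * (1 - p1) / post_n)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: 60 failures in 500 opportunities at baseline,
# 35 failures in 500 opportunities after the intervention.
diff, lower, upper = rate_change(60, 500, 35, 500)
print(f"change in failure rate: {diff:.3f} ({lower:.3f} to {upper:.3f})")
```

An interval that excludes zero suggests that the change is unlikely to be due to sampling variation alone, but it does not by itself show that the intervention caused the change.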
There have been few studies that have evaluated the impact of interventions on improving diagnosis and reducing diagnostic error. McDonald and colleagues (2013) conducted a systematic review to identify interventions targeted at reducing diagnostic error. They found more than 100 evaluations of interventions and grouped them into six categories: “techniques (changes in equipment, procedures, and clinical approaches); personnel changes (the introduction of additional health care professionals or the replacement of certain health care professionals for others); educational interventions (residency training, curricula, and maintenance of certification changes); structured process changes (implementation of feedback mechanisms); technology-based interventions (clinical decision support, text messaging, and pager alerts); and additional review methods (independent reviews of test results)” (McDonald et al., 2013, p. 383). The measures used in these intervention studies included diagnostic accuracy, outcomes related to further diagnostic test use, outcomes related to further therapeutic management, direct patient-related outcomes, time to correct therapeutic management, and time to diagnosis; 26 of the 100 intervention studies examined diagnostic delays. The researchers identified 14 randomized trials (rated as having mostly a low to moderate risk of bias), 11 of which reported interventions that reduced diagnostic errors. The evidence appeared to be strongest for technology-based interventions and specific techniques. The researchers found that very few studies evaluated the impact of the intervention on patient outcomes (e.g., mortality, morbidity), and they suggested that further evaluations of promising interventions should be conducted in large studies across diverse settings of care in order to enhance generalizability (McDonald et al., 2013).
Two previous reviews evaluated the impact of “system-related interventions” and “cognitive interventions” on the reduction of diagnostic errors (Graber et al., 2012; Singh et al., 2012b). For system-related interventions Singh and colleagues concluded, “Despite a number of suggested interventions in the literature, few empirical studies have tested interventions to reduce diagnostic error in the last decade. Advancing the science of diagnostic error prevention will require more robust study designs and rigorous definitions of diagnostic processes and outcomes to measure intervention effects” (Singh et al., 2012b, p. 160). Graber and colleagues identified a variety of possible approaches to reducing cognitive errors in diagnosis. Not all of the suggested approaches had been tested, and the studies that did test them generally involved observing trainees in artificial settings, making it difficult to extrapolate the results to actual practice. “Future progress in this area,” they concluded, “will require methodological refinements in outcome evaluation and rigorously evaluating interventions already suggested” (Graber et al., 2012, p. 535).
The three systematic reviews of diagnostic interventions draw similar conclusions about the heterogeneity of measures used as well as the dearth of patient-reported outcomes. Synthesizing information from the available interventions is difficult because of the lack of comparable outcomes across studies. As with other areas of quality and patient safety, improving patient outcomes is a common goal, but it may not be practical to assess such patient outcomes during limited-time intervention studies (or quality improvement efforts). Intermediate measures that assess process failures (e.g., the development of algorithms to identify and quantify missed opportunities for making a specific diagnosis among an at-risk population) or cognitive problems (e.g., debriefing to determine what biases are at play and at what frequency) will continue to provide useful information for understanding the influence of an intervention at its point of expected action (as part of the diagnostic process or other component of the work system, or at the sharp or blunt end of care). As with other areas of patient safety research and quality improvement, evidence connecting any intermediate measures to patient outcomes will need proper attention.
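The kind of algorithm mentioned above can be sketched in a few lines. The example below flags patients whose abnormal ("red flag") finding was not followed up within a fixed window and reports the share of evaluable cases flagged. It is a minimal sketch under stated assumptions: the record fields, the 60-day window, and the data are all hypothetical, and flagged cases would be candidates for record review rather than confirmed diagnostic errors.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Record:
    patient_id: str
    red_flag_date: date              # date the abnormal finding was recorded
    follow_up_date: Optional[date]   # date of follow-up action, if any


def flag_missed_opportunities(records, as_of, window_days=60):
    """Return (flagged patient ids, number of evaluable records).

    A record is evaluable once its follow-up window has closed as of
    `as_of`; it is flagged if no follow-up occurred within the window.
    """
    window = timedelta(days=window_days)
    flagged, evaluable = [], 0
    for r in records:
        deadline = r.red_flag_date + window
        if deadline > as_of:
            continue  # window still open; too early to judge
        evaluable += 1
        if r.follow_up_date is None or r.follow_up_date > deadline:
            flagged.append(r.patient_id)
    return flagged, evaluable


# Hypothetical records for illustration only.
records = [
    Record("A", date(2015, 1, 5), date(2015, 1, 20)),   # timely follow-up
    Record("B", date(2015, 1, 10), None),               # no follow-up
    Record("C", date(2015, 2, 1), date(2015, 5, 15)),   # late follow-up
    Record("D", date(2015, 5, 20), None),               # window still open
]
flagged, evaluable = flag_missed_opportunities(records, as_of=date(2015, 6, 1))
print(flagged, f"{len(flagged)}/{evaluable} evaluable cases flagged")
```

Excluding records whose window is still open keeps recent cases from being counted as failures before follow-up could reasonably have occurred.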
Another key area of attention for patient safety intervention research, which applies to diagnostic error measurement, is context-sensitivity. As noted in the section on identifying risks for diagnostic error, work system dimensions have the potential to contribute to diagnostic error. For any diagnostic error reduction intervention, measurement focused on context variables (e.g., dimensions of the work system, as noted in Table 3-3) will allow testing of the hypothesized role of these variables in diagnostic error. Shekelle and colleagues (2013) pointed to the need for evidence about the context in which safety strategies have been adopted and tested in order to help health care organizations understand what works and under what circumstances, so that the intervention strategy can be adapted appropriately to local needs. McDonald summarized domains and measurement options for studying context in relation to quality improvement interventions, which could be extended to new areas such as diagnostic safety interventions. She noted that “efficient and effective means to incorporate the domain of context into research . . . has received relatively minimal attention in health care, even though the salience of this broad topic is well understood by practitioners and policy makers” (McDonald, 2013, p. S51).
In summary, there are a multitude of specific measurement choices when developing and testing interventions for quality improvement or research, but no single repository of options exists. Funders and researchers have developed repositories of measurement tools for various other topics and applications. For example, the Agency for Healthcare Research and Quality’s Care Coordination Measures Atlas is a resource that includes a measurement framework, identified measures with acceptable performance characteristics, and maps of these measures to framework domains (AHRQ, 2014a). A similar resource would be useful for those involved in diagnostic error interventions from proof of concept through the spread of successful interventions with widespread applicability (i.e., cases in which an intervention exhibits limited context sensitivity or the cases in which an intervention works well within many contexts). Such a resource could build on the domains and measures shown in Tables 3-2 and 3-3, as well as other sources from quality improvement and patient safety research applicable to diagnostic error.
Abujudeh, H. H., G. W. Boland, R. Kaewlai, P. Rabiner, E. F. Halpern, G. S. Gazelle, and J. H. Thrall. 2010. Abdominal and pelvic computed tomography (CT) interpretation: Discrepancy rates among experienced radiologists. European Radiology 20(8):1952–1957.
Adams, J. L., and S. Garber. 2007. Reducing medical malpractice by targeting physicians making medical malpractice payments. Journal of Empirical Legal Studies 4(1):185–222.
AHRQ (Agency for Healthcare Research and Quality). 2014a. Care Coordination Measures Atlas update. www.ahrq.gov/professionals/prevention-chronic-care/improve/coordination/atlas2014/index.html (accessed May 26, 2015).
AHRQ. 2014b. Patient Safety Network: Voluntary patient safety event reporting (incident reporting). http://psnet.ahrq.gov/primer.aspx?primerID=13 (accessed May 8, 2015).
AHRQ. 2015a. Patient Safety Network: Patient safety primers. Systems approach. http://psnet.ahrq.gov/primer.aspx?primerID=21 (accessed May 8, 2015).
AHRQ. 2015b. Patient Safety Network: Root cause analysis. www.psnet.ahrq.gov/primer.aspx?primerID=10 (accessed May 8, 2015).
Aleccia, J. 2014. Misdiagnosed: Docs’ mistakes affect 12 million a year. NBC News, April 16. www.nbcnews.com/health/health-news/misdiagnosed-docs-mistakes-affect-12-million-year-n82256 (accessed October 30, 2014).
Alvarado, C. J. 2012. The physical environment in health care. In P. Carayon (ed.), Handbook of human factors and ergonomics in health care and patient safety (pp. 215–234). Boca Raton, FL: Taylor & Francis Group.
Andrews, L. B., C. Stocking, T. Krizek, L. Gottlieb, C. Krizek, T. Vargish, and M. Siegler. 1997. An alternative strategy for studying adverse events in medical care. Lancet 349(9048):309–313.
Armijo, D., C. McDonnell, and K. Werner. 2009. Electronic health record usability: Electronic and use case framework. AHRQ Publication No. 09(10)-0091-1-EF. Rockville, MD: Agency for Healthcare Research and Quality.
Berenson, R. A., D. K. Upadhyay, and D. R. Kaye. 2014. Placing diagnosis errors on the policy agenda. Washington, DC: Urban Institute. www.urban.org/research/publication/placing-diagnosis-errors-policy-agenda (accessed May 22, 2015).
Berlin, L. 2014. Radiologic errors, past, present and future. Diagnosis 1(1):79–84.
Berner, E. S., and M. L. Graber. 2008. Overconfidence as a cause of diagnostic error in medicine. American Journal of Medicine 121(5 Suppl):S2–S23.
Betsy Lehman Center for Patient Safety and Medical Error Reduction. 2014. The public’s views on medical error in Massachusetts. Cambridge, MA: Harvard School of Public Health.
Bisantz, A., and E. Roth. 2007. Analysis of cognitive work. Reviews of Human Factors and Ergonomics 3(1):1–43.
Blendon, R. J., C. M. DesRoches, M. Brodie, J. M. Benson, A. B. Rosen, E. Schneider, D. E. Altman, K. Zapert, M. J. Herrmann, and A. E. Steffenson. 2002. Views of practicing physicians and the public on medical errors. New England Journal of Medicine 347(24):1933–1940.
Bonini, P., M. Plebani, F. Ceriotti, and F. Rubboli. 2002. Errors in laboratory medicine. Clinical Chemistry 48(5):691–698.
Borgstede, J., R. Lewis, M. Bhargavan, and J. Sunshine. 2004. RADPEER quality assurance program: A multifacility study of interpretive disagreement rates. Journal of the American College of Radiology 1(1):59–65.
Brennan, T. A., L. L. Leape, N. M. Laird, L. Hebert, A. R. Localio, A. G. Lawthers, J. P. Newhouse, P. C. Weiler, and H. H. Hiatt. 1991. Incidence of adverse events and negligence in hospitalized patients: Results of the Harvard Medical Practice Study I. New England Journal of Medicine 324(6):370–376.
CAHPS (Consumer Assessment of Healthcare Providers and Systems). 2012. The CAHPS improvement guide. www.facs.org/~/media/files/advocacy/cahps/improvement%20guide.ashx (accessed July 12, 2015).
Callen, J., A. Georgiou, J. Li, and J. I. Westbrook. 2011. The safety implications of missed test results for hospitalised patients: A systematic review. BMJ Quality & Safety in Health Care 20(2):194–199.
Campbell, S. G., P. Croskerry, and W. F. Bond. 2007. Profiles in patient safety: A “perfect storm” in the emergency department. Academic Emergency Medicine 14(8):743–749.
Carayon, P., R. Cartmill, P. Hoonakker, A. S. Hundt, B.-T. Karsh, D. Krueger, M. L. Snellman, T. N. Thuemling, and T. B. Wetterneck. 2012. Human factors analysis of workflow in health information technology implementation. In P. Carayon (ed.), Handbook of human factors and ergonomics in health care and patient safety (pp. 507–521). Boca Raton, FL: Taylor & Francis Group.
Carayon, P., Y. Li, M. M. Kelly, L. L. DuBenske, A. Xie, B. McCabe, J. Orne, and E. D. Cox. 2014. Stimulated recall methodology for assessing work system barriers and facilitators in family-centered rounds in a pediatric hospital. Applied Ergonomics 45(6):1540–1546.
Carraro, P., and M. Plebani. 2007. Errors in a stat laboratory: Types and frequencies 10 years later. Clinical Chemistry 53(7):1338–1342.
Carter, S. M., W. Rogers, I. Heath, C. Degeling, J. Doust, and A. Barratt. 2015. The challenge of overdiagnosis begins with its definition. BMJ 350:h869.
CDC (Centers for Disease Control and Prevention). 2010. National Ambulatory Medical Care Survey: 2010 summary tables. www.cdc.gov/nchs/data/ahcd/namcs_summary/2010_namcs_web_tables.pdf (accessed May 26, 2015).
CDC. 2011. National Hospital Ambulatory Medical Care Survey: 2011 emergency department summary tables. www.cdc.gov/nchs/data/ahcd/nhamcs_emergency/2011_ed_web_tables.pdf (accessed May 26, 2015).
CDC. 2015. Ambulatory health care data. www.cdc.gov/nchs/ahcd.htm (accessed May 18, 2015).
Chimowitz, M. I., E. L. Logigian, and L. R. Caplan. 1990. The accuracy of bedside neurological diagnoses. Annals of Neurology 28(1):78–85.
Chiolero, A., F. Paccaud, D. Aujesky, V. Santschi, and N. Rodondi. 2015. How to prevent overdiagnosis. Swiss Medical Weekly 145:w14060.
Cifra, C. L., K. L. Jones, J. A. Ascenzi, U. S. Bhalala, M. M. Bembea, D. E. Newman-Toker, J. C. Fackler, and M. R. Miller. 2015. Diagnostic errors in a PICU: Insights from the Morbidity and Mortality Conference. Pediatric Critical Care Medicine 16(5):468–476.
CRICO. 2014. Annual benchmarking report: Malpractice risks in the diagnostic process. Cambridge, MA: CRICO. www.rmfstrategies.com/benchmarking (accessed June 4, 2015).
Croskerry, P. 2003. The importance of cognitive errors in diagnosis and strategies to minimize them. Academic Medicine 78(8):775–780.
Croskerry, P. 2011. Commentary: Lowly interns, more is merrier, and the Casablanca Strategy. Academic Medicine 86(1):8–10.
Croskerry, P. 2012. Perspectives on diagnostic failure and patient safety. Healthcare Quarterly 15(Special issue):50–56.
Danforth, K. N., A. E. Smith, R. K. Loo, S. J. Jacobsen, B. S. Mittman, and M. H. Kanter. 2014. Electronic clinical surveillance to improve outpatient care: Diverse applications within an integrated delivery system. eGEMS 2(1):1056.
DeRosier, J., E. Stalhandske, J. P. Bagian, and T. Nudell. 2002. Using health care failure mode and effect analysis™: The VA National Center for Patient Safety’s prospective risk analysis system. Joint Commission Journal on Quality and Patient Safety 28(5):248–267.
Endsley, M. R. 1995. Toward a theory of situation awareness in dynamic systems. Human Factors 37(1):32–64.
Epner, P. L., J. E. Gans, and M. L. Graber. 2013. When diagnostic testing leads to harm: A new outcomes-based approach for laboratory medicine. BMJ Quality & Safety 22(Suppl 2):ii6–ii10.
Esserman, L., Y. Shieh, and I. Thompson. 2009. Rethinking screening for breast cancer and prostate cancer. JAMA 302(15):1685–1692.
Fairbanks, R., T. Guarrera, A. Bisantz, M. Venturino, and P. Westesson. 2010. Opportunities in IT support of workflow & information flow in the emergency department digital imaging process. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 54(4):359–363.
FDA (Food and Drug Administration). 2015. Initiative to reduce unnecessary radiation exposure from medical imaging. www.fda.gov/Radiation-EmittingProducts/RadiationSafety/RadiationDoseReduction/ucm2007191.htm (accessed May 3, 2015).
Fletcher, G., R. Flin, P. McGeorge, R. Glavin, N. Maran, and R. Patey. 2003. Anaesthetists’ Non-Technical Skills (ANTS): Evaluation of a behavioural marker system. British Journal of Anaesthesia 90(5):580–588.
Friedberg, M. W., P. G. Chen, K. R. Van Busum, F. Aunon, C. Pham, J. Caloyeras, S. Mattke, E. Pitchforth, D. D. Quigley, and R. H. Brook. 2013. Factors affecting physician professional satisfaction and their implications for patient care, health systems, and health policy. Santa Monica, CA: RAND Corporation.
Gandhi, T. K., A. Kachalia, E. J. Thomas, A. L. Puopolo, C. Yoon, T. A. Brennan, and D. M. Studdert. 2006. Missed and delayed diagnoses in the ambulatory setting: A study of closed malpractice claims. Annals of Internal Medicine 145(7):488–496.
Gaudi, S., J. M. Zarandona, S. S. Raab, J. C. English, and D. M. Jukic. 2013. Discrepancies in dermatopathology diagnoses: The role of second review policies and dermatopathology fellowship training. Journal of the American Academy of Dermatology 68(1):119–128.
Gawande, A. 2007. The way we age now: Medicine has increased the ranks of the elderly. Can it make old age any easier? The New Yorker, April 30. www.newyorker.com/magazine/2007/04/30/the-way-we-age-now (accessed May 18, 2015).
Gawande, A. 2014. Being mortal: Illness, medicine, and what matters in the end. London, UK: Wellcome Collection.
Gawande, A. 2015. Overkill. The New Yorker, May 11. www.newyorker.com/magazine/2015/05/11/overkill-atul-gawande (accessed July 13, 2015).
Giardina, T. D., B. J. King, A. P. Ignaczak, D. E. Paull, L. Hoeksema, P. D. Mills, J. Neily, R. R. Hemphill, and H. Singh. 2013. Root cause analysis reports help identify common factors in delayed diagnosis and treatment of outpatients. Health Affairs (Millwood) 32(8):1368–1375.
Gigerenzer, G. 2014. Breast cancer screening pamphlets mislead women. BMJ 348:g2636.
Goldschmidt, H. M. J., and R. W. Lent. 1995. From data to information: How to define the context? Chemometrics and Intelligent Laboratory Systems 28(1):181–192.
Golodner, L. 1997. How the public perceives patient safety. Newsletter of the National Patient Safety Foundation 1(1):1–4.
Gosbee, J., and L. L. Gosbee. 2012. Usability evaluation in health care. In P. Carayon (ed.), Handbook of human factors and ergonomics in health care and patient safety, 2nd ed. (pp. 543–555). Boca Raton, FL: Taylor & Francis Group.
Graber, M. L. 2013. The incidence of diagnostic error in medicine. BMJ Quality & Safety 22(Suppl 2):ii21–ii27.
Graber, M. L., N. Franklin, and R. Gordon. 2005. Diagnostic error in internal medicine. Archives of Internal Medicine 165(13):1493–1499.
Graber, M. L., S. Kissam, V. L. Payne, A. N. Meyer, A. Sorensen, N. Lenfestey, E. Tant, K. Henriksen, K. Labresh, and H. Singh. 2012. Cognitive interventions to reduce diagnostic error: A narrative review. BMJ Quality & Safety 21(7):535–557.
Graber, M. L., R. Trowbridge, J. S. Myers, C. A. Umscheid, W. Strull, and M. H. Kanter. 2014. The next organizational challenge: Finding and addressing diagnostic error. Joint Commission Journal on Quality and Patient Safety 40(3):102–110.
Grimes, D. A., and K. F. Schulz. 2002. Uses and abuses of screening tests. Lancet 359(9309):881–884.
Hallworth, M. J. 2011. The “70% claim”: What is the evidence base? Annals of Clinical Biochemistry 48(6):487–488.
Hickner, J., D. G. Graham, N. C. Elder, E. Brandt, C. B. Emsermann, S. Dovey, and R. Phillips. 2008. Testing process errors and their harms and consequences reported from family medicine practices: A study of the American Academy of Family Physicians National Research Network. Quality and Safety in Health Care 17(3):194–200.
Hickner, J., P. J. Thompson, T. Wilkinson, P. Epner, M. Sheehan, A. M. Pollock, J. Lee, C. C. Duke, B. R. Jackson, and J. R. Taylor. 2014. Primary care physicians’ challenges in ordering clinical laboratory tests and interpreting results. Journal of the American Board of Family Medicine 27(2):268–274.
Hofer, T., S. Asch, R. Hayward, L. Rubenstein, M. Hogan, J. Adams, and E. Kerr. 2004. Profiling quality of care: Is there a role for peer review? BMC Health Services Research 4(1):9.
Hoffmann, T. C., and C. Del Mar. 2015. Patients’ expectations of the benefits and harms of treatments, screening, and tests: A systematic review. JAMA Internal Medicine 175(2):274–286.
Hoyert, D. L. 2011. The changing profile of autopsied deaths in the United States, 1972–2007. NCHS Data Brief 67(August).
Hricak, H., D. J. Brenner, S. J. Adelstein, D. P. Frush, E. J. Hall, R. W. Howell, C. H. McCollough, F. A. Mettler, M. S. Pearce, O. H. Suleiman, J. H. Thrall, and L. K. Wagner. 2011. Managing radiation use in medical imaging: A multifaceted challenge. Radiology 258(3):889–905.
Iglehart, J. K. 2009. Health insurers and medical-imaging policy—A work in progress. New England Journal of Medicine 360(10):1030–1037.
IOM (Institute of Medicine). 1990. Medicare: A strategy for quality assurance (2 vols.). Washington, DC: National Academy Press.
IOM. 2000. To err is human: Building a safer health system. Washington, DC: National Academy Press.
IOM. 2001. Crossing the quality chasm: A new health system for the 21st century. Washington, DC: National Academy Press.
IOM. 2004. Patient safety: Achieving a new standard for care. Washington, DC: The National Academies Press.
Jun, G. T., J. Ward, Z. Morris, and J. Clarkson. 2009. Health care process modelling: Which method when? International Journal for Quality in Health Care 21(3):214–224.
Kachalia, A., T. K. Gandhi, A. L. Puopolo, C. Yoon, E. J. Thomas, R. Griffey, T. A. Brennan, and D. M. Studdert. 2006. Missed and delayed diagnoses in the emergency department: A study of closed malpractice claims from 4 liability insurers. Annals of Emergency Medicine 49(2):196–205.
Kanter, M. H. 2014. Diagnostic errors—Patient safety. Presentation to the Committee on Diagnostic Error in Health Care, August 7, 2014, Washington, DC.
Kassirer, J. P. 1989. Our stubborn quest for diagnostic certainty. A cause of excessive testing. New England Journal of Medicine 320(22):1489–1491.
Kassirer, J. P., and R. I. Kopelman. 1989. Cognitive errors in diagnosis: Instantiation, classification, and consequences. American Journal of Medicine 86(4):433–441.
Kerr, E. A., T. P. Hofer, R. A. Hayward, J. L. Adams, M. M. Hogan, E. A. McGlynn, and S. M. Asch. 2007. Quality by any other name? A comparison of three profiling systems for assessing health care quality. Health Services Research 42(5):2070–2087.
Kim, J., D. Neilipovitz, P. Cardinal, M. Chiu, and J. Clinch. 2006. A pilot study using high-fidelity simulation to formally evaluate performance in the resuscitation of critically ill patients: The University of Ottawa Critical Care Medicine, High-Fidelity Simulation, and Crisis Resource Management I Study. Critical Care Medicine 34(8):2167–2174.
Kirwan, B. E., and L. K. Ainsworth (eds.). 1992. A guide to task analysis: The Task Analysis Working Group. Boca Raton, FL: Taylor & Francis Group.
Klein, G. 2011. What physicians can learn from firefighters. Paper presented at the 4th International Diagnostic Error Conference, October 23–26, 2011, Chicago, IL.
Klein, G. 2014. Submitted input. Input submitted to the Committee on Diagnostic Error. December 20, 2014, Washington, DC.
Koopman, R. J., L. M. Steege, J. L. Moore, M. A. Clarke, S. M. Canfield, M. S. Kim, and J. L. Belden. 2015. Physician information needs and electronic health records (EHRs): Time to reengineer the clinic note. Journal of the American Board of Family Medicine 28(3):316–323.
Kostopoulou, O., C. Mousoulis, and B. C. Delaney. 2009. Information search and information distortion in the diagnosis of an ambiguous presentation. Judgment and Decision Making 4(5):408–418.
Kostopoulou, O., J. E. Russo, G. Keenan, B. C. Delaney, and A. Douiri. 2012. Information distortion in physicians’ diagnostic judgments. Medical Decision Making 32(6):831–839.
Kronz, J. D., and W. H. Westra. 2005. The role of second opinion pathology in the management of lesions of the head and neck. Current Opinion in Otolaryngology and Head and Neck Surgery 13(2):81–84.
Landis, J. R., and G. G. Koch. 1977. The measurement of observer agreement for categorical data. Biometrics 33(1):159–174.
Leape, L. L., T. A. Brennan, N. Laird, A. G. Lawthers, A. R. Localio, B. A. Barnes, L. Hebert, J. P. Newhouse, P. C. Weiler, and H. Hiatt. 1991. The nature of adverse events in hospitalized patients: Results of the Harvard Medical Practice Study II. New England Journal of Medicine 324(6):377–384.
Levtzion-Korach, O., A. Frankel, H. Alcalai, C. Keohane, J. Orav, E. Graydon-Baker, J. Barnes, K. Gordon, A. L. Puopolo, E. I. Tomov, L. Sato, and D. W. Bates. 2010. Integrating incident data from five reporting systems to assess patient safety: Making sense of the elephant. Joint Commission Journal on Quality and Patient Safety 36(9):402–410.
Liss, M. A., J. Billimek, K. Osann, J. Cho, R. Moskowitz, A. Kaplan, R. J. Szabo, S. H. Kaplan, S. Greenfield, and A. Dash. 2013. Consideration of comorbidity in risk stratification prior to prostate biopsy. Cancer 119(13):2413–2418.
Localio, A. R., A. G. Lawthers, T. A. Brennan, N. M. Laird, L. E. Hebert, L. M. Peterson, J. P. Newhouse, P. C. Weiler, and H. H. Hiatt. 1991. Relation between malpractice claims and adverse events due to negligence. Results of the Harvard Medical Practice Study III. New England Journal of Medicine 325(4):245–251.
Lundberg, G. D. 1981. Acting on significant laboratory results. JAMA 245(17):1762–1763.
Lundberg, G. D. 1998. Low-tech autopsies in the era of high-tech medicine: Continued value for quality assurance and patient safety. JAMA 280(14):1273–1274.
Lyratzopoulos, G., P. Vedsted, and H. Singh. 2015. Understanding missed opportunities for more timely diagnosis of cancer in symptomatic patients after presentation. British Journal of Cancer 112:S84–S91.
Mahmood, A., H. Chaudhury, and M. Valente. 2011. Nurses’ perceptions of how physical environment affects medication errors in acute care settings. Applied Nursing Research 24(4):229–237.
McDonald, K. M. 2013. Considering context in quality improvement interventions and implementation: Concepts, frameworks, and application. Academic Pediatrics 13(6):S45–S53.
McDonald, K. M., B. Matesic, D. G. Contopoulos-Ioannidis, J. Lonhart, E. Schmidt, N. Pineda, and J. P. Ioannidis. 2013. Patient safety strategies targeted at diagnostic errors: A systematic review. Annals of Internal Medicine 158(5 Pt 2):381–389.
McGlynn, E. A., S. M. Asch, J. Adams, J. Keesey, J. Hicks, A. DeCristofaro, and E. A. Kerr. 2003. The quality of health care delivered to adults in the United States. New England Journal of Medicine 348(26):2635–2645.
Middleton, B., M. Bloomrosen, M. A. Dente, B. Hashmat, R. Koppel, J. M. Overhage, T. H. Payne, S. T. Rosenbloom, C. Weaver, and J. Zhang. 2013. Enhancing patient safety and quality of care by improving the usability of electronic health record systems: Recommendations from AMIA. Journal of the American Medical Informatics Association 20(e1):e2–e8.
Milch, C. E., D. N. Salem, S. G. Pauker, T. G. Lundquist, S. Kumar, and J. Chen. 2006. Voluntary electronic reporting of medical errors and adverse events. Journal of General Internal Medicine 21(2):165–170.
Moynihan, R., J. Doust, and D. Henry. 2012. Preventing overdiagnosis: How to stop harming the healthy. BMJ 344:e3502.
Mulley, A. G., C. Trimble, and G. Elwyn. 2012. Stop the silent misdiagnosis: Patients’ preferences matter. BMJ 345(1):e6572.
Murphy, D. R., A. Laxmisan, B. A. Reis, E. J. Thomas, A. Esquivel, S. N. Forjuoh, R. Parikh, M. M. Khan, and H. Singh. 2014. Electronic health record-based triggers to detect potential delays in cancer diagnosis. BMJ Quality & Safety 23(1):8–16.
Nakhleh, R. E., V. Nosé, C. Colasacco, L. A. Fatheree, T. J. Lillemoe, D. C. McCrory, F. A. Meier, C. N. Otis, S. R. Owens, S. S. Raab, R. R. Turner, C. B. Ventura, and A. A. Renshaw. 2015. Interpretive diagnostic error reduction in surgical pathology and cytology: Guideline from the College of American Pathologists Pathology and Laboratory Quality Center and the Association of Directors of Anatomic and Surgical Pathology. Archives of Pathology & Laboratory Medicine. Epub ahead of print. http://dx.doi.org/10.5858/arpa.2014-0511-SA (accessed December 6, 2015).
Newman-Toker, D. E. 2014a. Prioritization of diagnostic error problems and solutions: Concepts, economic modeling, and action plan. Input submitted to the Committee on Diagnostic Error in Health Care, August 7, 2014, Washington, DC.
Newman-Toker, D. E. 2014b. A unified conceptual model for diagnostic errors: Underdiagnosis, overdiagnosis, and misdiagnosis. Diagnosis 1(1):43–48.
Newman-Toker, D. E., and P. J. Pronovost. 2009. Diagnostic errors: The next frontier for patient safety. JAMA 301(10):1060–1062.
Newman-Toker, D. E., K. M. McDonald, and D. O. Meltzer. 2013. How much diagnostic safety can we afford, and how should we decide? A health economics perspective. BMJ Quality & Safety 22(Suppl 2):ii11–ii20.
Newman-Toker, D. E., E. Moy, E. Valente, R. Coffey, and A. L. Hines. 2014. Missed diagnosis of stroke in the emergency department: A cross-sectional analysis of a large population-based sample. Diagnosis 1(2):155–166.
Office of Inspector General. 2010. Adverse events in hospitals: Methods for identifying events. Washington, DC: Office of Inspector General. https://oig.hhs.gov/oei/reports/oei-06-08-00221.pdf (accessed June 4, 2015).
Pace, L. E., and N. L. Keating. 2014. A systematic assessment of benefits and risks to guide breast cancer screening decisions. JAMA 311(13):1327–1335.
Parsons, K. C. 2000. Environmental ergonomics: A review of principles, methods and models. Applied Ergonomics 31:581–594.
Patel, V. L., J. Zhang, N. A. Yoskowitz, R. Green, and O. R. Sayan. 2008. Translational cognition for decision support in critical care environments: A review. Journal of Biomedical Informatics 41(3):413–431.
Peabody, J. W., J. Luck, S. Jain, D. Bertenthal, and P. Glassman. 2004. Assessing the accuracy of administrative data in health information systems. Medical Care 42(11):1066–1072.
Peterson, M. C., J. H. Holbrook, D. Von Hales, N. L. Smith, and L. V. Staker. 1992. Contributions of the history, physical examination, and laboratory investigation in making medical diagnoses. Western Journal of Medicine 156(2):163–165.
Plant, J. L., S. M. van Schaik, D. C. Sliwka, C. K. Boscardin, and P. S. O’Sullivan. 2011. Validation of a self-efficacy instrument and its relationship to performance of crisis resource management skills. Advances in Health Sciences Education 16(5):579–590.
Plebani, M. 2010. The detection and prevention of errors in laboratory medicine. Annals of Clinical Biochemistry 47(2):101–110.
Plebani, M. 2014. Defensive medicine and diagnostic testing. Diagnosis 1(2):151–154.
Plebani, M., M. Laposata, and G. D. Lundberg. 2011. The brain-to-brain loop concept for laboratory testing 40 years after its introduction. American Journal of Clinical Pathology 136(6):829–833.
Potter, P., S. Boxerman, L. Wolf, J. Marshall, D. Grayson, J. Sledge, and B. Evanoff. 2004. Mapping the nursing process: A new approach for understanding the work of nursing. Journal of Nursing Administration 34(2):101–109.
Rao, V. M., and D. C. Levin. 2012. The overuse of diagnostic imaging and the Choosing Wisely initiative. Annals of Internal Medicine 157(8):574–576.
Reason, J. 1990. Human error. New York: Cambridge University Press.
Renshaw, A. A., and E. W. Gould. 2005. Comparison of disagreement and error rates for three types of interdepartmental consultations. American Journal of Clinical Pathology 124(6):878–882.
Rogers, M. L., E. S. Patterson, and M. L. Render. 2012. Cognitive work analysis in health care. In P. Carayon (ed.), Handbook of human factors and ergonomics in health care and patient safety, 2nd ed. (pp. 465–474). Boca Raton, FL: Taylor & Francis Group.
Roth, E. M. 2008. Uncovering the requirements of cognitive work. Human Factors 50(3):475–480.
Salas, E., C. Prince, D. P. Baker, and L. Shrestha. 1995. Situation awareness in team performance: Implications for measurement and training. Human Factors 37(1):123–136.
Schiff, G. D., and L. L. Leape. 2012. Commentary: How can we make diagnosis safer? Academic Medicine 87(2):135–138.
Schiff, G. D., S. Kim, R. Abrams, K. Cosby, A. S. Elstein, S. Hasler, N. Krosnjar, R. Odwazny, M. F. Wisniewski, and R. A. McNutt. 2005. Diagnosing diagnosis errors: Lessons from a multi-institutional collaborative project for the diagnostic error evaluation and research project investigators. Rockville, MD: Agency for Healthcare Research and Quality.
Schiff, G. D., O. Hasan, S. Kim, R. Abrams, K. Cosby, B. L. Lambert, A. S. Elstein, S. Hasler, M. L. Kabongo, N. Krosnjar, R. Odwazny, M. F. Wisniewski, and R. A. McNutt. 2009. Diagnostic error in medicine: Analysis of 583 physician-reported errors. Archives of Internal Medicine 169(20):1881–1887.
Schiff, G. D., A. L. Puopolo, A. Huben-Kearney, W. Yu, C. Keohane, P. McDonough, B. R. Ellis, D. W. Bates, and M. Biondolillo. 2013. Primary care closed claims experience of Massachusetts malpractice insurers. JAMA Internal Medicine 173(22):2063–2068.
Schraagen, J. M., S. F. Chipman, and V. L. Shalin. 2000. Cognitive task analysis. New York: Psychology Press.
Shekelle, P. G., R. M. Wachter, P. J. Pronovost, K. Schoelles, K. M. McDonald, S. M. Dy, K. Shojania, J. Reston, Z. Berger, B. Johnsen, J. W. Larkin, S. Lucas, K. Martinez, A. Motala, S. J. Newberry, M. Noble, E. Pfoh, S. R. Ranji, S. Rennke, E. Schmidt, R. Shanman, N. Sullivan, F. Sun, K. Tipton, J. R. Treadwell, A. Tsou, M. E. Vaiana, S. J. Weaver, R. Wilson, and B. D. Winters. 2013. Making health care safer II: An updated critical analysis of the evidence for patient safety practices. Evidence Reports/Technology Assessments No. 211. Rockville, MD: Agency for Healthcare Research and Quality.
Shojania, K. G. 2010. The elephant of patient safety: What you see depends on how you look. Joint Commission Journal on Quality and Patient Safety 36(9):399–401.
Shojania, K. G., E. C. Burton, K. M. McDonald, and L. Goldman. 2002. The autopsy as an outcome and performance measure. AHRQ Publication No. 03-E002. Rockville, MD: Agency for Healthcare Research and Quality.
Shojania, K. G., E. C. Burton, K. M. McDonald, and L. Goldman. 2003. Changes in rates of autopsy-detected diagnostic errors over time: A systematic review. JAMA 289(21):2849–2856.
Siegal, D. 2014. Analysis of diagnosis-related medical malpractice claims: Input submitted to the Committee on Diagnostic Error in Health Care, August 4, 2014.
Singh, H. 2014. Helping health care organizations to define diagnostic errors as missed opportunities in diagnosis. Joint Commission Journal on Quality and Patient Safety 40(3):99–101.
Singh, H., and D. F. Sittig. 2015. Advancing the science of measurement of diagnostic errors in healthcare: The Safer Dx Framework. BMJ Quality and Safety 24(2):103–110.
Singh, H., K. Daci, L. A. Petersen, C. Collins, N. J. Petersen, A. Shethia, and H. B. El-Serag. 2009. Missed opportunities to initiate endoscopic evaluation for colorectal cancer diagnosis. American Journal of Gastroenterology 104(10):2543–2554.
Singh, H., K. Hirani, H. Kadiyala, O. Rudomiotov, T. Davis, M. M. Khan, and T. L. Wahls. 2010a. Characteristics and predictors of missed opportunities in lung cancer diagnosis: An electronic health record–based study. Journal of Clinical Oncology 28(20):3307–3315.
Singh, H., E. J. Thomas, L. Wilson, P. A. Kelly, K. Pietz, D. Elkeeb, and G. Singhal. 2010b. Errors of diagnosis in pediatric practice: A multisite survey. Pediatrics 126(1):70–79.
Singh, H., T. D. Giardina, S. N. Forjuoh, M. D. Reis, S. Kosmach, M. M. Khan, and E. J. Thomas. 2012a. Electronic health record-based surveillance of diagnostic errors in primary care. BMJ Quality and Safety 21:93–100.
Singh, H., M. L. Graber, S. M. Kissam, A. V. Sorensen, N. F. Lenfestey, E. M. Tant, K. Henriksen, and K. A. LaBresh. 2012b. System-related interventions to reduce diagnostic errors: A narrative review. BMJ Quality and Safety 21(2):160–170.
Singh, H., C. Spitzmueller, N. J. Petersen, M. K. Sawhney, and D. F. Sittig. 2013. Information overload and missed test results in electronic health record-based settings. JAMA Internal Medicine 173(8):702–704.
Singh, H., A. N. Meyer, and E. J. Thomas. 2014. The frequency of diagnostic errors in outpatient care: Estimations from three large observational studies involving U.S. adult populations. BMJ Quality and Safety 23(9):727–731.
Smith, M., D. Murphy, A. Laxmisan, D. Sittig, B. Reis, A. Esquivel, and H. Singh. 2013. Developing software to “track and catch” missed follow-up of abnormal test results in a complex sociotechnical environment. Applied Clinical Informatics 4(3):359–375.
Studdert, D. M., E. J. Thomas, H. R. Burstin, B. I. Zbar, E. J. Orav, and T. A. Brennan. 2000. Negligent care and malpractice claiming behavior in Utah and Colorado. Medical Care 38(3):250–260.
Studdert, D. M., M. M. Mello, W. M. Sage, C. M. DesRoches, J. Peugh, K. Zapert, and T. A. Brennan. 2005. Defensive medicine among high-risk specialist physicians in a volatile malpractice environment. JAMA 293(21):2609–2617.
Studdert, D. M., M. M. Mello, A. A. Gawande, T. K. Gandhi, A. Kachalia, C. Yoon, A. L. Puopolo, and T. A. Brennan. 2006. Claims, errors, and compensation payments in medical malpractice litigation. New England Journal of Medicine 354(19):2024–2033.
Tehrani, A., H. Lee, S. Mathews, A. Shore, M. Makary, P. Pronovost, and D. Newman-Toker. 2013. 25-year summary of U.S. malpractice claims for diagnostic errors 1986–2010: An analysis from the National Practitioner Data Bank. BMJ Quality and Safety 22:672–680.
Thomas, E. J., D. M. Studdert, H. R. Burstin, E. J. Orav, T. Zeena, E. J. Williams, K. M. Howard, P. C. Weiler, and T. A. Brennan. 2000. Incidence and types of adverse events and negligent care in Utah and Colorado. Medical Care 38(3):261–271.
Thomas, E. J., S. R. Lipsitz, D. M. Studdert, and T. A. Brennan. 2002. The reliability of medical record review for estimating adverse event rates. Annals of Internal Medicine 136(11):812–816.
Trowbridge, R. 2014. Diagnostic performance: Measurement and feedback. Presentation to the Committee on Diagnostic Error in Health Care. August 7, 2014, Washington, DC.
Troxel, D. 2014. Input submitted to the Committee on Diagnostic Error in Health Care from The Doctors Company Foundation, April 28, 2014.
Unertl, K. M., M. B. Weinger, K. B. Johnson, and N. M. Lorenzi. 2009. Describing and modeling workflow and information flow in chronic disease care. Journal of the American Medical Informatics Association 16(6):826–836.
Velmahos, G. C., C. Fili, P. Vassiliu, N. Nicolaou, R. Radin, and A. Wilcox. 2001. Around-the-clock attending radiology coverage is essential to avoid mistakes in the care of trauma patients. American Surgeon 67(12):1175–1177.
Wachter, R. M. 2010. Why diagnostic errors don’t get any respect—and what can be done about them. Health Affairs (Millwood) 29(9):1605–1610.
Wachter, R. M. 2014. Diagnostic errors: Central to patient safety, yet still in the periphery of safety’s radar screen. Diagnosis 1(1):19–21.
Weiner, S. J., A. Schwartz, F. Weaver, J. Goldberg, R. Yudkowsky, G. Sharma, A. Binns-Calvey, B. Preyss, M. M. Schapira, S. D. Persell, E. Jacobs, and R. I. Abrams. 2010. Contextual errors and failures in individualizing patient care: A multicenter study. Annals of Internal Medicine 153(2):69–75.
Weissman, J. S., E. C. Schneider, S. N. Weingart, A. M. Epstein, J. David-Kasdan, S. Feibelmann, C. L. Annas, N. Ridley, L. Kirle, and C. Gatsonis. 2008. Comparing patient-reported hospital adverse events with medical record review: Do patients know something that hospitals do not? Annals of Internal Medicine 149(2):100–108.
Welch, H. G. 2015. Less medicine, more health: 7 assumptions that drive too much medical care. Boston, MA: Beacon Press.
Welch, H. G., and W. C. Black. 2010. Overdiagnosis in cancer. Journal of the National Cancer Institute 102(9):605–613.
Winters, B., J. Custer, S. M. Galvagno, E. Colantuoni, S. G. Kapoor, H. Lee, V. Goode, K. Robinson, A. Nakhasi, and P. Pronovost. 2012. Diagnostic errors in the intensive care unit: A systematic review of autopsy studies. BMJ Quality and Safety 21(11):894–902.
Wolf, L., P. Potter, J. A. Sledge, S. B. Boxerman, D. Grayson, and B. Evanoff. 2006. Describing nurses’ work: Combining quantitative and qualitative analysis. Human Factors 48(1):5–14.
Zhi, M., E. L. Ding, J. Theisen-Toupal, J. Whelan, and R. Arnaout. 2013. The landscape of inappropriate laboratory testing: A 15-year meta-analysis. PLoS ONE 8(11):e78962.
Zwaan, L., and H. Singh. 2015. The challenges in defining and measuring diagnostic error. Diagnosis 2(2):97–103.
Zwaan, L., M. de Bruijne, C. Wagner, A. Thijs, M. Smits, G. van der Wal, and D. R. Timmermans. 2010. Patient record review of the incidence, consequences, and causes of diagnostic adverse events. Archives of Internal Medicine 170(12):1015–1021.
Zwaan, L., A. Thijs, C. Wagner, G. van der Wal, and D. R. Timmermans. 2012. Relating faults in diagnostic reasoning with diagnostic errors and patient harm. Academic Medicine 87(2):149–156.
Zwaan, L., G. D. Schiff, and H. Singh. 2013. Advancing the research agenda for diagnostic error reduction. BMJ Quality and Safety 22(Suppl 2):ii52–ii57.