The Accident Precursors Project
Overview and Recommendations
In 2003, the National Academy of Engineering Program Office undertook the Accident Precursors Project to examine the complex issue of accident precursor analysis and management. This seven-month project was designed to document and promote industrial and academic approaches to detecting, analyzing, and benefiting from accident precursors, as well as to understand public-sector and private-sector roles in using precursor information. The committee examined an array of approaches for benefiting from precursor information and discussed these approaches in a workshop held on July 17 and 18, 2003, in Washington, D.C. This report is the official record of the project and the workshop.
THE ACCIDENT PRECURSORS WORKSHOP
The workshop brought together experts on risk, engineers, practitioners, and policy makers from the aerospace, aviation, chemical, health care, and nuclear industries. Participants were selected for their expertise and their interest in engaging in a cross-industry dialogue. Presentations by invited experts in the field were followed by targeted discussions in breakout groups.
The workshop presentations addressed four general areas:
The Opportunity of Precursor Analysis (Section II): the opportunities presented by precursors and some organizational and analytical approaches to detecting and analyzing them
Risk Assessment (Section III): methods of identifying and modeling different types of precursors
Risk Management (Section IV): how risks can be understood and mitigated once precursors have been identified and how organizations can engage their members in this endeavor
Linking Risk Assessment and Risk Management (Section V): how linking risk assessment and risk management can create a continuous improvement process and how industry and government can facilitate this
Breakout and plenary sessions involved discussions by participants focused on advising both private organizations and government agencies on how they might use precursor information to reduce their risk exposure. Discussions were based on drafts of presenters’ papers (provided before the workshop) and were led by facilitators and designated presenters.
The Committee on Accident Precursors evaluated the presentations and discussions, as well as additional submissions from Drs. Frosch and Westrum (Appendixes A and B). The resulting findings and recommendations are based on these inputs and subsequent committee deliberations.
James Bagian, director of the Department of Veterans Affairs (VA) National Center for Patient Safety, delivered the opening keynote address. Bagian drew on his personal experiences as well as efforts by the VA to promote patient safety. He described the challenges to engaging individuals and organizations, the difficulty of recognizing when current safety standards are inadequate, and the importance of making commitments to the institutional and management processes necessary to achieving lasting, continuous improvements in safety.
Elisabeth Paté-Cornell, chair of the Department of Management Science and Engineering at Stanford University, delivered the dinner keynote address. Paté-Cornell highlighted past examples of the management of precursors. In some cases, precursors were ignored, and catastrophes followed. In other cases, precursors were recognized as warning signs, and disasters may have been avoided. Paté-Cornell also provided an overview of some of the precursor models she and her students have developed for use as decision aids. These models have been used in a broad range of applications, from optimizing the alert thresholds of warning systems, such as fire alarms (Paté-Cornell, 1986), to aiding in combating terrorism (Paté-Cornell and Guikema, 2002).
Workshop presenters discussed how precursors could be identified and managed. Michal Tamuz of the University of Tennessee Health Science Center discussed similarities and differences in approaches to collecting and assessing precursor data in the aviation, health care, and nuclear power industries, among others. William Corcoran, president of the Nuclear Safety Review Concepts Corporation, used historical examples to illustrate the distinctions between different kinds of precursors. Martin Sattison, manager of the Risk, Reliability and Regulatory Support Department at the Idaho National Engineering and Environmental Laboratory, provided a historical overview of the U.S. Nuclear Regulatory Commission (U.S. NRC) Accident Sequence Precursor (ASP) Program and outlined lessons that could be transferred to other industries.
The next group of speakers described organizational barriers to, and opportunities for, leveraging precursor information to reduce the likelihood of accidents. Dennis Hendershot, senior technical fellow of the Rohm and Haas Company, provided everyday and industrial examples illustrating how systems can be designed or redesigned to make them inherently safer. Tjerk van der Schaaf of the Eindhoven University of Technology pointed out potential “blind spots” in reporting systems, showing why many types of near misses can go unreported. John Carroll of the Sloan School of Management of the Massachusetts Institute of Technology discussed how knowledge about potential accidents could be shared throughout an organization, both formally and informally.
The last group of speakers described approaches to engaging stakeholders, institutions, and industries in the process of identifying and managing accident precursors. Linda Connell, director of the National Aeronautics and Space Administration (NASA) Aviation Safety Reporting System (ASRS), described the history and implementation of ASRS and discussed its potential applicability in the health care, nuclear power, maritime, and security domains. Christopher Hart, assistant administrator for the Federal Aviation Administration (FAA) Office of System Safety, identified the hurdles to improving an already high level of safety (a “plateau”) and discussed how a recognition of precursors could help to achieve this end. Yacov Haimes, director of the Center for Risk Management of Engineering Systems of the University of Virginia, discussed the transferability of methods used to identify and mitigate accident precursors to security systems for combating terrorism.
In the aftermath of catastrophes, it is common to find prior indicators, missed signals, and dismissed alerts that, had they been recognized and appropriately managed before the event, might have averted the undesired event. Indeed, the accident literature is replete with examples, including the space shuttle Columbia (CAIB, 2003), the space shuttle Challenger (Vaughan, 1997), Three Mile Island (Chiles, 2002), the Concorde crash (BEA, 2002), the London Paddington train crash (Cullen, 2000), and American Airlines Flight 587 to Santo Domingo (USA Today, May 25, 2003), among many others (Kletz, 1994; Marcus and Nichols, 1999; Turner and Pidgeon, 1997).
Recognizing signals before an accident clearly offers the potential of improving safety, and many organizations have attempted to develop programs to identify and benefit from accident precursors. In this summary, the committee examines how these programs can be designed to reduce system risk exposure and the responsibilities of various constituents in implementing or facilitating these programs.
At first glance, it might appear that the design and operation of precursor programs would be relatively straightforward. This perception may be the result of hindsight bias (Fischhoff, 1975; Hawkins and Hastie, 1990): after an accident, individuals often believe that the accident should have been considered highly likely, if not inevitable, by those who observed the system prior to the accident. (Hindsight bias also helps to explain the frequent discrepancies between pre- and post-accident risk assessments.)
In fact, upon examination, designing and running a precursor management program turns out to be challenging. In order to leverage precursor information, precursor programs must be able to identify possible threats before they occur; detect, filter, and prioritize precursors when they occur; evaluate precursor causes; and identify and implement corrective actions (see for example Lakats and Paté-Cornell, in press).
Although creating programs with all of these features can be difficult, it is important to consider how it can be done and whether existing programs can be improved. For example, are some individuals, companies, organizations, or even industries better able to envision and respond to potential accidents than others? If so, what processes do they use, and what organizational structures, management approaches, and regulatory frameworks support these processes?
The first topic addressed in this summary is the opportunity presented by accident precursors for improving safety. Next, a case is made, based on historical examples, for the need for a better understanding of precursor management. This is followed by several examples of precursor programs illustrating differences and parallels in approaches. The final section includes the committee’s findings and recommendations.
Defining Accident Precursors
Accident precursors can be defined in a number of ways. To encourage a wide-ranging discussion of alternative definitions and reporting systems, the committee deliberately chose a broad definition. Precursors were defined as the conditions, events, and sequences that precede and lead up to accidents. Based on this definition, precursor events can be thought of loosely as “building blocks” of accidents and can include both events internal to an organization (such as equipment failures and human errors) and external events (such as earthquakes and hurricanes). This definition helped the committee (and the workshop participants) focus their discussions on the management of events that could progress to accidents, without unduly limiting or foreclosing those discussions. The definition also helped the committee and workshop participants distinguish between actual events and general underlying conditions (such as an organization’s culture) that may not be part of a specific accident scenario but may still influence the likelihood of an accident.
Some organizations, such as the U.S. NRC, have chosen to limit the use of the term “precursors” to events that exceed a specified level of severity. For example, precursors might be defined as the complete failure of one or more safety systems and/or the partial failure of two or more safety systems. Similarly, a quantitative threshold may be established for the conditional probability of an accident given a precursor, with events of lesser severity either not considered precursors or at least not singled out as deserving of further analysis.
Other organizations have designed and implemented incident reporting systems that address incidents with a much wider range of severities, including defects or off-normal events that may involve inconsequential losses of safety margins. In such cases, of course, screening, filtering, and prioritizing reported incidents is necessary to identify the events that merit further analysis; in addition, there must be a recognition that the reporting of an event is not necessarily a prejudgment of its risk significance.
Both approaches to defining precursors have advantages and disadvantages. Setting the threshold for reporting too high or defining reportable precursors too precisely may mean that risk-significant events go unreported, especially if they were not anticipated. Moreover, it may be impossible to develop a precise definition of reportable precursors in relatively new or immature technologies and systems or in systems for which no quantitative risk analyses are available.
Conversely, setting the threshold for reporting too low runs the risk that the reporting system may be overwhelmed by false alarms, especially if the system design requires some corrective action or substantial analysis for all reported events. In addition, too low a reporting threshold can lead to a perception that the reporting system is of little value. These competing trade-offs can lead to errors, as shown in Table 1. Type I errors are reported events that do not pose a significant risk. Type II errors are events that do pose a significant risk but are not reported.
TABLE 1 Errors in Event Reporting

                      Event is significant    Event is not significant
Event reported        Correct report          Type I error
Event not reported    Type II error           Correct non-report
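The trade-off summarized in Table 1 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each event carries a numeric severity score and a later judgment of whether it was risk significant; the function names, the scores, and the two thresholds are hypothetical and are not taken from any program described in this report.

```python
from collections import Counter

def classify(reported: bool, significant: bool) -> str:
    """Map one event onto the four cells of Table 1."""
    if reported and significant:
        return "correct report"
    if reported and not significant:
        return "Type I error"
    if not reported and significant:
        return "Type II error"
    return "correct non-report"

def tally_outcomes(events, threshold):
    """Count Table 1 outcomes when events scoring at or above
    `threshold` are reported.

    `events` is an iterable of (severity_score, is_significant) pairs.
    Both fields are hypothetical inputs chosen for this sketch.
    """
    return Counter(classify(score >= threshold, significant)
                   for score, significant in events)

# Hypothetical events: (severity score, judged risk significant?)
events = [(0.9, True), (0.7, False), (0.4, True), (0.2, False), (0.1, False)]

low_threshold = tally_outcomes(events, threshold=0.15)
high_threshold = tally_outcomes(events, threshold=0.8)

print("low threshold: ", dict(low_threshold))
print("high threshold:", dict(high_threshold))
```

With these made-up data, the low threshold produces only Type I errors and the high threshold produces only Type II errors, which is the tension the text describes. Real programs would rarely have a single score that separates events this cleanly.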
Finally, even reporting systems based on strict definitions of accident precursors with high thresholds for reporting may need a mechanism that allows for reporting new and previously unexpected precursors if they are judged to be severe. Sometimes, a single unrecognized or “hidden” flaw can render a technology much less safe than had been believed (Freudenburg, 1988), and precursor reporting systems are typically used for technologies in which unforeseen problems can have serious consequences.
THE OPPORTUNITY OF PRECURSOR MANAGEMENT
Programs for managing accident precursors have a number of benefits, as outlined by van der Schaaf et al. (1991). First, reviewing and analyzing observed precursors can reveal what can go wrong with a particular system or technology and how accidents can develop (modeling). For example, a precursor may reveal a previously unknown failure mode, which can then be incorporated into an updated model of accident risk. Second, because precursors generally occur much more often than accidents, analyses of accident precursors can help in trending the safety of a system (monitoring). For example, a precursor reporting system can provide evidence of improving or deteriorating safety trends and hence decreasing or increasing accident likelihoods. This information might not be apparent from sparse or nonexistent accident data. Trends in observed precursors can also be used to analyze the effectiveness of actions taken to reduce risk. Finally, and perhaps most important, precursor programs can improve organizational awareness (mindfulness) of safety problems (Weick and Sutcliffe, 2001). In organizations where actual accidents are rare, the dissemination of information on accident precursors can reduce complacency. Thus, the establishment of a precursor program may encourage an ongoing dialogue about safety in an organization, resulting in greater awareness of what can go wrong and greater willingness to discuss potential risks and safety hazards. Even if these discussions are not part of the formal precursor program, the more effective safety culture that they represent may still be a result of that program.
One way organizations seek to benefit from precursors is by analyzing near misses (sometimes referred to as near accidents, near hits, or close calls), which are fragments of an accident scenario that can be observed in isolation, without the occurrence of an accident. For a given accident scenario, near misses generally occur more often than the actual event (Bird and Germain, 1996). Several examples from the accident literature confirm this expectation, including the Concorde air crash (BEA, 2002), the London Paddington train crash (Cullen, 2000), and the Morton Salt chemical plant explosion (CSB, 2002); all three of these catastrophes were preceded by near misses, and some of the precursor events in the near misses were also parts of the eventual accident scenarios.
To organizations seeking to learn about potential accidents, near misses represent inexpensive learning opportunities for analyzing what can go wrong.
Near misses are especially important for organizations that have not experienced a major accident, because they enable these organizations to experience what March et al. (1991) refer to as “small histories,” or fragments of what might be experienced if an accident occurred. To benefit from near misses, organizations ranging from hospitals to manufacturing facilities and from airlines to power plants have set up management systems for reporting and analyzing near misses (see examples documented in this report and Barach and Small, 2000; Bier and Mosleh, 1990; Jones et al., 1999; van der Schaaf, 1992).
Analyses of accident precursor data can also be useful in conjunction with probabilistic risk analyses (PRAs). A PRA, also sometimes called a quantitative risk assessment or probabilistic safety assessment, is a method of estimating the risk of failure of a complex technical system by deconstructing the system into its component parts and identifying potential failure sequences. PRA has been used in a variety of applications, including transportation, electricity generation, chemical and petrochemical processing, aerospace, and military systems.
PRA methods make it possible to quantify the likelihood that each type of precursor will lead to accidents of different severities by assessing the conditional probability of accidents given certain precursors (Bier, 1993; Cooke and Goosens, 1990; Minarick and Kukielka, 1982). Such information can be helpful in prioritizing precursors for further investigation and/or corrective action. For an in-depth discussion of PRA, see for example Bedford and Cooke (2001) or Kumamoto and Henley (2000).
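As a rough illustration of how conditional probabilities might be used to prioritize precursors, the sketch below filters and ranks a few precursor types. The `Precursor` class, the `prioritize` function, the precursor names, the probability values, and the screening threshold are all hypothetical; in practice the conditional probabilities would come from a PRA model, not from hand-entered numbers.

```python
from dataclasses import dataclass

@dataclass
class Precursor:
    name: str
    # Conditional probability of an accident given this precursor.
    # Values used below are invented for illustration only.
    p_accident_given_precursor: float

def prioritize(precursors, threshold):
    """Keep precursors whose conditional accident probability meets the
    screening threshold, ordered from most to least risk significant."""
    selected = [p for p in precursors
                if p.p_accident_given_precursor >= threshold]
    return sorted(selected,
                  key=lambda p: p.p_accident_given_precursor,
                  reverse=True)

observed = [
    Precursor("partial failure of one safety system", 1e-6),
    Precursor("complete failure of one safety system", 1e-4),
    Precursor("partial failure of two safety systems", 1e-3),
]

ranked = prioritize(observed, threshold=1e-5)
for p in ranked:
    print(f"{p.p_accident_given_precursor:.0e}  {p.name}")
```

Ranking on conditional probability alone is a simplification; a fuller treatment might also weight each precursor by how often it is observed and by the severity of the accident it could lead to.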
Precursor analyses have different strengths and weaknesses than PRAs and can, therefore, be used in conjunction with PRA models. PRA risk estimates are often heavily dependent on assumptions in the PRA model. For example, although every attempt is made to include important dependencies when they are recognized, a PRA may nonetheless incorrectly assume that two particular events are independent of each other. Because empirical data on observed precursors are relatively free of such assumptions, they can be used to assess the validity of those assumptions. Thus, if two events are positively correlated rather than independent, precursors involving both of them will tend to occur more often than predicted under the assumption of their independence, providing a potentially more accurate estimate of accident risks (and a check on the validity of the PRA model).
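The independence check described above can be sketched as a comparison between the observed number of joint occurrences and the number expected if the two events were independent. The function name and the counts below are hypothetical, and the ratio is only a crude screening indicator, not a formal statistical test.

```python
def joint_occurrence_ratio(n_periods, count_a, count_b, count_both):
    """Ratio of observed to expected joint occurrences of events A and B.

    Under independence, the expected number of observation periods in
    which both events occur is n * (count_a / n) * (count_b / n).
    A ratio well above 1 suggests positive dependence.
    """
    if n_periods <= 0 or count_a <= 0 or count_b <= 0:
        raise ValueError("counts and number of periods must be positive")
    expected_both = count_a * count_b / n_periods
    return count_both / expected_both

# Hypothetical precursor record: 1000 observation periods,
# event A seen in 50, event B in 40, both together in 10.
ratio = joint_occurrence_ratio(n_periods=1000, count_a=50,
                               count_b=40, count_both=10)
print(f"observed / expected joint occurrences: {ratio:.1f}")
```

In this invented example, independence predicts two joint occurrences and ten were observed, so an analyst might revisit the independence assumption in the PRA model. With small counts, a ratio like this can arise by chance, so it would normally be followed by a proper statistical assessment.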
Other approaches have also been used to take advantage of precursor data. Automated surveillance systems, fault detection algorithms, and a variety of alarm systems are examples of systems that attempt to recognize precursors automatically. These methods have one common characteristic—they attempt to leverage precursor data to gain a better understanding of potential accidents.
Compared to purely statistical analyses of observed accident frequencies, near-miss analyses, PRA methods, and other precursor analyses can be viewed as examples of “decomposition” (i.e., breaking an accident scenario up into its component parts or building blocks). Forecasting expert J. Scott Armstrong of the Wharton School, University of Pennsylvania, notes that decomposition typically leads to better judgments, particularly in cases where uncertainty is high (as in the likelihood of an accident, where estimates can vary by orders of magnitude). Armstrong (1985) describes the following merits of decomposition:
It allows the forecaster to use information in a more efficient manner. It helps to spread the risk; errors in one part of the problem may be offset by errors in another part. It allows the researchers to split the problem among different members of a research team. It makes it possible for expert advice to be obtained on each part. Finally, it permits the use of different methods on different parts of the problem.
Comparing Accident Analysis and Precursor Analysis
One of the most attractive aspects of precursor analysis is the abundance of precursor events compared to actual accidents (Bird and Germain, 1996). Thus, precursor data sets are often much richer than accident data sets. Analyzing precursor data can therefore reduce the uncertainty about the likelihood of an accident and lead to better decisions.
The committee believes that in many cases precursor events can be more effectively analyzed than accidents. After an accident, it may be difficult to determine what actually occurred for a variety of reasons: damage can be so severe that accident reconstruction may be inaccurate; the investigation may require too much time or money; legal and financial concerns may create disincentives that affect the investigation (e.g., individuals or organizations may be unwilling to disclose information that could increase their liability, or they may share information selectively); and witnesses may be unavailable. In contrast, when analyzing accident precursors, the system itself is usually intact, and stakeholders and witnesses may be more willing to report and share information about the event.
Comparing precursor analysis with accident analysis also reveals some of the challenges of benefiting from precursor information. Because precursors are likely to be numerous, resource limitations may make it impractical to investigate all of them to the level of detail that would normally be used in an accident investigation. Hence, thresholds are often set to select the precursors that are most indicative of system risk (Paté-Cornell, 1986). If a large number of precursors are considered important enough for analysis, they may be subjected to further prioritization and filtering.
Moreover, the potential for precursor events to develop into actual accidents might be unclear. As in any use of decomposition methods, the resulting model may not be entirely accurate (Bier et al., 1999); for example, there may be erroneous assumptions as to which additional events would be necessary to cause an accident given a particular precursor. In fact, non-accident precursors are inherently ambiguous (Bier and Mosleh, 1990) because they provide indications of system safety (e.g., the fact that an actual accident did not occur), as well as indications of risk (e.g., the fact that a precursor did occur). Thus, if a precursor occurs and no accident follows, some individuals may (correctly or incorrectly) conclude that the system is less prone to accidents than was initially believed, and there may be disagreements and debates about how seriously that precursor should be taken.
Because of their less dramatic end states, precursor events may seem less salient as lessons learned than accidents. For example, corrective actions developed in response to precursor data may be less persuasive and more open to question than corrective actions based on actual accidents (March et al., 1991). Because accidents are at least partly random, there is no guarantee that corrective actions adopted in response to even relatively severe precursors will actually prevent an accident. Decision makers may, therefore, pay less attention to precursors than to accidents, and it may be difficult to persuade them to make changes in technical or organizational designs based on observations of precursors.
Finally, legal concerns may compel an organization to analyze an accident thoroughly but may also inhibit the use of precursor data. For example, showing that an organization knew about a particular precursor but did not take corrective action could increase the organization’s liability in the event of an actual accident. As a result, some organizations may be reluctant to establish formal precursor reporting programs; for example, they may rely on oral, rather than written, notification of observed precursors.
We can also compare the costs associated with precursor and accident analysis. Accidents can have a number of direct costs, such as medical expenses, costs associated with employee convalescence, and equipment damage. In contrast, precursor events may have minimal if any direct costs. Accidents also have a number of indirect costs that may far outweigh the direct costs. Typical indirect costs include lost production, a drop in employee morale, scheduling delays, additional hiring/training, legal costs, and the costs of implementing corrective actions. After a precursor event, many of these indirect costs may not apply (e.g., there may be no lost production) or may be lower than if an actual accident had occurred.
From this comparison, one might wonder if implementing a precursor analysis program can be more cost effective than assuming the risks and costs of the accident the program is intended to prevent. To the committee’s knowledge, no comprehensive cost-benefit analysis of precursor analysis programs has been conducted. Nonetheless, the committee firmly believes that precursor programs can be, and often are, cost effective. That is, the costs associated with achieving risk reduction through a precursor program are far lower than the risk-adjusted costs assumed when no such program is in place and precursors are not systematically analyzed.
Encouraging the Use of Precursor Analysis
The relatively high frequency and low cost associated with precursor events suggest that many industries could benefit from using precursor analyses to reduce the risk of accidents. Perhaps not surprisingly, industries that have traditionally sought to benefit from precursor analysis (e.g., aviation, aerospace, nuclear power, and the chemical process industry) are subject to accidents that can be so severe, but also so infrequent, that the advantages of precursor analysis are especially attractive.
One factor that seems to be essential for the adoption of precursor programs is the active engagement of companies—a company must “own” a precursor program. Thus, an organization must have leadership and a “safety culture” that can support such a program. The concept of a safety culture was developed by the International Atomic Energy Agency in the analysis of contributing factors to the Chernobyl disaster (Wiegmann et al., 2002). Although there are a number of industry-specific definitions of safety culture (see Wiegmann et al., 2002, for several examples), Pidgeon (1991) provides one encompassing definition:
[A safety culture is] the set of beliefs, norms, attitudes, roles, and social and technical practices that are concerned with minimizing the exposure of employees, managers, customers and members of the public to conditions considered dangerous or injurious.
Carroll and Hatakenaka (2001) describe an example of a plant, the Millstone Nuclear Power Station, in New London, Connecticut, that underwent an organizational shift and became a safety-conscious work environment that exhibited many of the characteristics associated with a healthy safety culture. In 1996, the Millstone Nuclear Power Station was featured in a Time magazine cover story as a rogue utility that cut corners and intimidated or fired employees who raised safety concerns (Pooley, 1996). The U.S. NRC placed Millstone on a watch list of plants receiving additional regulatory attention, and, following a shutdown of the plant’s three units, ordered that all three demonstrate that they were safe and in compliance with license and regulatory requirements prior to restarting.
In an effort to address shortcomings in compliance and safety, Northeast Utilities (Millstone’s owner) changed the top leadership of its nuclear program and brought in Bruce Kenyon to be CEO of Northeast Nuclear Energy Company. Carroll and Hatakenaka (2001) describe how Kenyon engineered an organizational transformation. Afterward, instead of suppressing the sharing of safety-related concerns, leadership of the company considered it essential that safety concerns be shared among employees and management. Some of the key changes included: the dismissal or demotion of senior managers who were identified by their peers as underperformers; the hiring of new managers to run the employee safety-concern program; the creation of formal structures and forums for two-way communication for the sharing of safety-related information between employees and management; and the hiring of third-party consultants to oversee and monitor the effectiveness of instituted changes.
Carroll and Hatakenaka’s (2001) account of Millstone’s transformation underscores that leadership is essential but not the sole component of an effective safety culture; all members and strata of the organization must embrace the safety culture. Nonetheless, if the parent company’s leadership had not embraced the sharing of safety-related concerns and instituted changes to enable this sharing, it appears unlikely that Millstone would have been able to transform itself.
Leadership may be even more important in organizations and industries with less regulatory oversight or where safety reporting is voluntary. In such organizations, a culture and leadership that encourage reporting may be one of the few compelling reasons for employees, contractors, and front-line managers to share safety concerns and, potentially, information regarding precursors to accidents.
The private sector, industry associations, government, and third parties can all play a role in helping organizations understand and manage their risk exposures through the sharing of risk-related information and precursor analysis. Economic and regulatory mechanisms can provide incentives for organizations or companies to institute precursor analysis programs.
Some regulatory agencies use command-and-control regulation to mandate the reporting of certain types of precursors (e.g., the Licensee Event Reports mandated by the U.S. NRC in Code of Federal Regulations 10 CFR 50.73). Other organizations have voluntary programs, such as the Aviation Safety Action Program (discussed below), that protect individuals who report precursors from sanctions provided that certain “cardinal rules” are followed. Adhering to nonpunitive guidelines (under which individuals are not punished for reporting events in which they were involved) helps to build and maintain trust, although there is generally a threshold above which some type of punishment may apply. For example, incidents that involve clear violations, such as criminal or malicious behavior, are typically managed separately from precursor programs to avoid protecting individuals who have committed such violations.
Other incentives to encourage precursor management could include monetary or other rewards for companies that institute programs to identify and collect data on precursor events. For example, insurance premiums could be reduced for organizations that try to reduce their risk exposure through the systematic use of precursor information (Kunreuther et al., 2003). In lieu of involvement by regulatory agencies, third parties, such as trade organizations, insurers, accrediting bodies, and comparable companies, could inspect companies to determine whether they have effective and appropriate precursor programs in place (Er et al., 1998; Kunreuther et al., 2002).
Legal safeguards could also be used to protect individuals and companies that collect and share information about risk. Under current law, precursor reports generated prior to an accident are often considered discoverable evidence after an accident. This may deter companies from soliciting and collecting reports about safety problems, and some industries have taken steps to insulate reporters of safety problems from liability.
LEARNING FROM PAST EXPERIENCE
The loss of the space shuttle Columbia and other major events (such as the terrorist attacks of September 2001) and recent lapses in safety (such as the serious corrosion problems discovered at the Davis-Besse nuclear power plant in Ohio in 2002 and the major blackout in the eastern United States in August 2003) have raised questions about how organizational structures and cultures can learn from precursors. These events have raised issues about how knowledge can be disseminated and applied throughout an organization; the feasibility and challenges of transferring precursor approaches from one industry to another; and the potential transferability of precursor approaches to problems outside the area of technological accidents.
The Space Shuttle Columbia
The Columbia accident occurred about six months prior to the workshop. The signals leading up to the accident and how NASA managed them were analyzed extensively by the Columbia Accident Investigation Board (CAIB, 2003). Like the analyses of many other accidents, the CAIB study of the Columbia accident (mission STS-107) revealed a number of warning signals suggesting that the likelihood of an accident was greater than NASA had perceived at the time.
Considerable analysis by the CAIB (2003) addressed what sociologist Diane Vaughan calls “the normalization of deviance” (Vaughan, 1997). The CAIB report concluded that, although certain precursor events in missions prior to STS-107 had indicated problems, their continued occurrence without resulting in accidents had led to a misperception that they were consistent with normal operation. In other words, precursors were initially considered warning signals, but over time were no longer considered indicative of serious risks. The CAIB report argued that NASA had thus “normalized” precursor events that today are generally believed to have been the direct cause of the orbiter loss.
The direct cause of the accident appears to have been insulating foam detached from the external tank striking the left wing of Columbia during the orbiter’s ascent and piercing the orbiter’s thermal protection system. During reentry into the Earth’s atmosphere, hot plasma gases then entered the orbiter and disintegrated the orbiter’s internal structure (CAIB, 2003). Debris strikes that had not penetrated the thermal protection system had been well documented in previous missions and had been carefully monitored and analyzed. Debris strikes resulting from detached foam had been observed in 65 of the 79 missions for which photographic imagery was available.
In fact, Paté-Cornell and Fischbeck (1993) had undertaken PRA studies to analyze the case of foam becoming detached from the external tank, hitting the tiles of the orbiter, and causing enough damage to the thermal protection system to result in “burn-through” during reentry. They concluded that the likelihood of this event was sufficiently high to merit some attention to this problem.
The debris strike on STS-27R (on December 2, 1988) was similar to the eventual failure of the Columbia on mission STS-107, but the CAIB noted that during STS-27R, the orbiter had been inspected and managed much more diligently than during STS-107. The CAIB concluded that NASA’s perception of the severity of debris strikes had changed between missions STS-27R and STS-107: “NASA engineers and managers increasingly regarded the foam-shedding as inevitable, and as either unlikely to jeopardize safety or simply an acceptable risk.” The CAIB report concluded that the shuttle program lacked the “institutional memory” to benefit from the lessons of STS-27R (CAIB, 2003). This finding demonstrates how changes in organizational culture can affect the way precursors are perceived and managed.
INTRAORGANIZATIONAL SHARING AND ANALYSIS OF PRECURSOR INFORMATION
Some researchers believe that certain complex, tightly coupled, high-hazard organizations routinely maintain better than expected levels of safety and reliability. These are generally referred to as “high-reliability organizations” (HROs). Examples of HROs include nuclear power plants (Bourrier, 1996; Marcus, 1995), air traffic control systems (LaPorte, 1988; LaPorte and Consolini, 1998), and aircraft carriers (LaPorte and Consolini, 1998; Roberts, 1990; Rochlin et al., 1987; Weick and Roberts, 1993).
Researchers on the cultures, structures, and processes of HROs have postulated that one of the defining characteristics of HROs is a high sensitivity to things that can go wrong. HROs are believed to have organizational cultures that encourage “a rich awareness of discriminatory detail and facilitates the discovery and correction of errors capable of escalation into catastrophe” (Weick et al., 1999).
One factor that contributes to greater sensitivity and attentiveness to precursors in HROs is transparency, that is, an environment conducive to the free flow of information about potential risks. In some organizations, such as air traffic operations, in which constant communication reinforces confidence in the integrity and status of operations, information is shared almost continually (Rochlin, 1999). Data may also be exchanged on a more occasional basis through informal channels that encourage discussions of risks and lessons learned at all levels of an organization (Roberts, 1990). Either way, the important point is that the climate created makes it easy for information about problems to be brought to the attention of key decision makers, including front-line personnel and senior managers.
It is important to keep in mind that attentiveness to precursors is not the only characteristic of an HRO. Organizations may exhibit a high sensitivity to precursors but fail to achieve high reliability because they do not have key characteristics of effective safety management. As Westrum and Adamski (1999) and Dowell and Hendershot (1997) point out, the search for errors can sometimes increase system risk if intended safety improvements inadvertently create more risk-prone systems. As Rochlin (1999) observes, “the search for safety is not just a hunt for errors.”
INTERORGANIZATIONAL SHARING OF INFORMATION
The management and exchange of information pertaining to risks beyond a single organization is an important issue associated with precursor management. Organizations can be deluged with information from internal and external sources, which can make it harder to filter that information, identify problem areas, and recognize precursors to accidents. This, in turn, makes it more difficult to determine which information should be shared outside the organization. Even for precursors that are recognized, concerns about releasing proprietary knowledge, tarnishing a firm’s image, or incurring legal recriminations may discourage information sharing.
Sharing of information across organizations is important because many hallmark accidents that have drawn attention to the importance of precursor management were preceded by similar but non-catastrophic precursor events in other organizations. Because of a lack of effective information exchange, the organization that experienced the eventual accident was often unaware that others had learned from and acted on related precursor events.
This was the case in the Three Mile Island (TMI) accident, in which one of the factors in the partial core meltdown was a pressure relief valve that was stuck open, leading to confusion and misinterpretation in the plant control room. A similar event in which signals from a stuck relief valve had been temporarily misinterpreted had occurred at the Davis-Besse nuclear power plant in Ohio a year-and-a-half earlier. Fortunately, the progression of the Davis-Besse scenario had been halted, and an accident at that plant was averted. Although the Davis-Besse management had documented the event and learned from it internally, the information had not been shared with anyone outside the plant. Thus, management at TMI was not able to benefit from the experience (Chiles, 2002).
A similar situation led to the development of the ASRS (Aviation Safety Reporting System) in the aviation industry. On December 1, 1974, TWA Flight 514 was inbound to Dulles Airport near Washington, D.C. During the descent, the flight crew misunderstood the approach instructions and descended prematurely to the final approach altitude. The premature descent, coupled with limited visibility due to inclement weather, significantly contributed to the pilots flying the aircraft into a mountaintop, killing everyone on board. During the National Transportation Safety Board’s accident investigation, a disturbing finding emerged. Six weeks prior to the accident, under similar conditions, a United Airlines flight crew had experienced a similar misunderstanding and had narrowly averted hitting the same mountain. After landing, the crew had reported the near miss to their company’s new internal reporting program, and an alert had been issued to all United Airlines pilots about the potential hazard. Because there was no established mechanism for sharing this information externally, the crew of TWA 514 was unaware of the hazard (ASRS, 2001).
Research suggests that transparency and the free flow of information should ideally extend to observers outside of an organization (i.e., “institutional permeability”). The need is illustrated most vividly when the absence of institutional permeability contributes to disasters. For example, Turner and Pidgeon (1997) discuss cases in which “individuals outside the principal organizations … had foreseen the danger which led to the disaster, and had complained, only to meet with a high-handed or dismissive response.” The examples include a mine-tailings landslide that killed 144 people and a rail-crossing accident that killed 11 people. Related issues are addressed by Lodwick (1993) and Martin (1999).
Chess et al. (1992) note that “organizations can develop systems to amplify the concerns of those outside the plant so that these voices can be heard easily by personnel inside the plant who have the capability to reduce risk”; they also describe how this was achieved by a small chemical manufacturer through the implementation of “an exemplary risk communication program.” Although concerns about protecting proprietary information are valid, some level of institutional permeability (especially receptiveness to concerns raised by “outsiders”) can expand the range of information available to an organization and can counteract complacency and the normalization of deviance.
SAMPLE INDUSTRY APPROACHES
A number of industries have implemented programs for taking advantage of precursor information, several within the past few years. They include the Accident Sequence Precursor (ASP) Program and the Institute of Nuclear Power Operations’ Significant Event Evaluation and Information Network in the nuclear industry, the ASRS in the aviation industry (DOT-FAA, 2002), site-specific and company-specific near-miss programs in the chemical industry (van der Schaaf, 1992), the U.K. rail industry’s confidential reporting systems (CIRAS, 2003), voluntary reporting programs for maritime safety (BTS, 2002a), surveillance systems to detect adverse drug events in health care (Kilbridge and Classen, 2002), national voluntary reporting systems in health care (IOM, 2000), and motor vehicle safety programs defined under the TREAD Act (DOT-NHTSA, 2002).
To illustrate the differences among these approaches, several methods of collecting and analyzing precursor data are highlighted below. These descriptions are not intended to be representative of all approaches used in a given industry, and the committee does not endorse one approach over another.
Accident Sequence Precursor Program
The ASP Program, overseen by the U.S. NRC, analyzes and disseminates findings from potential precursor events at U.S. commercial nuclear power plants. This nationwide precursor program overseen by a federal agency is discussed in more detail in the paper by Martin Sattison (p. 89 in this volume). The ASP Program was initiated several decades ago, following publication of the first PRAs of nuclear power plants to analyze precursors to a potentially catastrophic core meltdown by (USNRC, 1978):
quantifying and ranking the safety significance of events at operating reactors
determining the generic implications of these events
characterizing risk based on those events
providing feedback for operators of other plants to learn from these experiences
The ASP Program defines an accident sequence precursor as an operational event or plant condition that is an element of a postulated accident sequence that could lead to inadequate core cooling and hence to core damage. The precursors analyzed in the ASP Program are selected primarily from Licensee Event Reports that must be submitted to the U.S. NRC by plant licensees. Each event is reviewed to determine its severity and relevance to safety. Accident precursors estimated to have a conditional core damage probability greater than 1.0 × 10⁻⁶ (greater than a one in a million chance of resulting in core damage) are selected for further analysis (Johnson and Rasmuson, 1996; Reisch, 1994).
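The screening step described above can be illustrated with a short Python sketch. It is not the NRC's actual procedure: the event records, field names, and probability values are hypothetical, and only the one-in-a-million threshold is taken from the text.

```python
from dataclasses import dataclass

# Threshold stated in the text: events with a conditional core damage
# probability (CCDP) greater than 1.0e-6 are selected for further analysis.
CCDP_THRESHOLD = 1.0e-6


@dataclass
class Event:
    """Hypothetical record for a reviewed operational event."""
    event_id: str
    ccdp: float  # estimated conditional core damage probability


def select_precursors(events, threshold=CCDP_THRESHOLD):
    """Return the events above the threshold, most significant first."""
    selected = [e for e in events if e.ccdp > threshold]
    return sorted(selected, key=lambda e: e.ccdp, reverse=True)


# Hypothetical values, for illustration only.
events = [
    Event("LER-A", 3.0e-7),
    Event("LER-B", 4.5e-5),
    Event("LER-C", 1.0e-6),  # exactly at the threshold, so not selected
    Event("LER-D", 2.0e-6),
]

for event in select_precursors(events):
    print(event.event_id, event.ccdp)
```

Sorting the selected events by probability is an added convenience of this sketch; it reflects the program's stated aim of ranking events by safety significance, not a documented implementation detail.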
Aviation Safety Action Programs
Aviation safety action programs (ASAPs) are airline-initiated programs that encourage employees to voluntarily report safety information that may be critical to identifying potential accidents (DOT-FAA, 2002). ASAPs are based on memorandums of understanding (MOUs) between airlines (or repair stations), the FAA, and applicable third parties representing employees, such as labor associations. Although ASAPs are carrier operated, the programs must adhere to federal guidelines, and information is shared between the carriers and the FAA. In a recent advisory circular, the FAA states (DOT-FAA, 2002):
The objective of the ASAP is to encourage air carrier and repair station employees to voluntarily report safety information that may be critical to identifying potential precursors to accidents. The Federal Aviation Administration has determined that identifying these precursors is essential to further reducing the already low accident rate.
Although ASAPs are company-administered programs, all signatories to an MOU must adhere to its provisions in the execution of the program. ASAP guidelines have been updated periodically after analyses of demonstration programs and as more companies have developed their own ASAPs (DOT-FAA, 1997, 2000, 2003). Although reports are managed internally, the information is shared with the FAA and throughout the industry when warranted.
Each ASAP has an event review committee (ERC) that evaluates whether submitted reports should be included in the ASAP program. Members (and alternates) of an ERC are designated representatives of the FAA, the certificate holder (i.e., an airline or repair station), and a third party, such as an employee union. ERCs have five specific responsibilities (DOT-FAA, 2002):
Reviewing and analyzing reports submitted to the program.
Determining whether reports qualify for inclusion in the program.
Identifying actual and potential safety issues from the information in the reports.
Proposing corrective actions to remedy identified safety concerns.
Following up on ERC recommendations for corrective actions to assess whether they have been satisfactorily accomplished.
Several demonstration programs initiated after a 1997 advisory circular (DOT-FAA, 1997) have engaged employees in discussing safety issues. Among these programs are the USAir Altitude Awareness Program, the American Airlines Safety Action Partnership, and the Alaska Airlines Altitude Awareness Program. Since their inception, more than two dozen ASAP programs have been established (DOT-FAA, 2002). To encourage wider participation by carriers, President Clinton announced that ASAPs would be part of a national effort to reduce aviation accidents (White House, 2000).
ASAPs have been promoted because they encourage aviation employees to report safety problems quickly (DOT-FAA, 2002). The programs stress implementation of corrective actions over punishment and discipline, although the FAA can prosecute cases involving egregious acts (e.g., substance or alcohol abuse or the intentional falsification of information). ASAPs provide previously unavailable information rapidly and directly from those responsible for day-to-day aviation operations. These programs are expected to lead to improvements in FAA management of the National Airspace System, airline flight operations and maintenance procedures, pilot-controller communications, human-machine interactions and interfaces, and training programs, ultimately helping to meet the FAA’s goal of reducing the accident rate for commercial aviation by 2007 (White House, 2000).
Adverse Drug Events
Programs to detect potential and actual adverse drug events (ADEs) in health care are examples of how precursors can be actively and automatically monitored and how work processes can be structured around precursor detection. ADEs, events in which patients are harmed as a result of drug interventions, are some of the most frequent negative outcomes in health care, and their cumulative effects are enormous. Every year, an estimated one million serious medication errors are made in hospitals (Birkmeyer et al., 2000). Two well-known cases of fatal ADEs are the deaths of Betsy Lehman (a health care reporter for the Boston Globe, who died of a chemotherapy overdose after being given four times the normal dosage over a four-day interval [Cook et al., 1998]) and Libby Zion (an 18-year-old woman who died when she took a prescribed drug that had a known, potentially fatal interaction with an antidepressant she was also taking [Asch and Parker, 1988]). Health care institutions have recently shown a good deal of interest in creating surveillance systems to monitor ADEs (see, for example, Bates et al., 1999, and Kilbridge and Classen, 2002).
ADEs can occur for a wide variety of reasons (Classen, 2003). Allergic reactions, drug-drug interactions, incorrect dosage prescriptions, incorrect dosage administration, and unintended repeated dosages are a few common examples. Although voluntary reporting systems encourage the reporting of these events or their precursors, many ADEs appear to go unreported (O’Neill et al., 1993). An alternative approach is to implement surveillance systems that automatically monitor for precursor events and to establish work processes to ensure that when an incident is detected, the impending accident is averted.
An example of the latter approach is prescription error-detection software, which is often integrated into computerized physician order entry (CPOE) systems used for ordering medications. Once a surveillance system has been implemented, a wide variety of precursors to ADEs can be detected, and potential harm to patients can be averted. For instance, if a doctor mistakenly orders penicillin for a patient who is allergic to it, an alert automatically informs the doctor of the precursor event.
When implemented successfully, surveillance systems have been shown to decrease ADEs dramatically (Bates et al., 1998, 1999; Evans et al., 1998). Based on the potential of surveillance systems to improve safety, the Leapfrog Group (a coalition of large health care purchasers that seeks to align health care purchasing with health care safety) has encouraged hospitals to implement CPOE systems. To meet Leapfrog’s CPOE standard, hospitals must satisfy the following requirements (Leapfrog Group, 2003):
Ensure that physicians enter hospital medication orders via a computer system that includes error-prevention software.
Demonstrate that the inpatient CPOE system can alert physicians to at least 50 percent of common, serious prescribing errors using a testing protocol now under development.
Require that a physician electronically document the reason for overriding an interception prior to doing so.
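The alerting behavior described above, together with the requirement to document a reason before overriding an alert, can be sketched in Python as follows. This is a simplified illustration and not a clinical tool or any vendor's CPOE logic: the drug-to-allergen table, the function names, and the returned fields are all hypothetical.

```python
# Hypothetical cross-reference from drug name to allergen class.
DRUG_ALLERGEN_CLASS = {
    "penicillin": "penicillins",
    "amoxicillin": "penicillins",
    "ibuprofen": "nsaids",
}


def check_order(drug, patient_allergies):
    """Return an alert message if the drug matches a recorded allergy, else None."""
    allergen = DRUG_ALLERGEN_CLASS.get(drug.lower())
    if allergen is not None and allergen in patient_allergies:
        return f"ALERT: {drug} ordered for patient with recorded {allergen} allergy"
    return None


def place_order(drug, patient_allergies, override_reason=None):
    """Accept the order unless an alert fires and no override reason is documented."""
    alert = check_order(drug, patient_allergies)
    if alert is not None and not override_reason:
        # The precursor is intercepted: the order does not proceed.
        return {"accepted": False, "alert": alert, "override_reason": None}
    return {"accepted": True, "alert": alert, "override_reason": override_reason}


# A penicillin order for an allergic patient is blocked until a reason is recorded.
print(place_order("penicillin", {"penicillins"}))
print(place_order("penicillin", {"penicillins"}, override_reason="tolerated previously"))
```

Returning the alert alongside the documented reason keeps a record of each intercepted precursor, which is the kind of data a surveillance system could later analyze.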
An automated surveillance approach could also potentially be applicable to other industries. In fact, a wide variety of alarm systems can be considered surveillance systems for detecting precursors to accidents. For example, near midair collisions and trains passing a red signal (indicating danger), both of which are generally considered precursors to accidents, can be automatically detected.
Surveillance systems have certain advantages over voluntary reporting systems. First, surveillance systems can often be built into work-flow processes so that precursors that might otherwise progress to accidents can be halted through detection and alerts. In addition, these systems frequently yield higher reporting rates than voluntary reporting systems and sometimes even encourage individuals to submit more voluntary near-miss or safety-related reports. However, surveillance systems also have some drawbacks. For instance, they may not capture all types of precursors because they generally detect only unambiguous signals that are known to have the potential to progress to accidents and that can be readily monitored. In addition, surveillance systems can create new, unexpected problems. For instance, if alerts are triggered too often, people may disregard them.
FINDINGS AND RECOMMENDATIONS
These findings and recommendations are based on surveys of the literature by National Academy of Engineering staff and the project committee, committee meetings, workshop presentations, feedback from workshop participants, and the workshop papers reproduced in this report. The recommendations are intended to help organizations design, refine, and oversee precursor programs and to help government agencies encourage the use of precursor data in a range of domains. In keeping with the cross-industry focus of the study, the recommendations are not industry specific. The findings and recommendations are presented in five sections—opportunity, precursor management, organizational commitment, engaging industry, and engaging government.
Finding 1. The collection, filtering, and analysis of accident precursor data, followed by the implementation of corrective actions, can improve reliability and safety.
There is ample evidence showing that improvements have resulted from precursor-type programs. In aviation, for instance, a variety of precursor programs have led to improvements in safety. Flight operational quality assurance (FOQA) programs, in which flight data are routinely analyzed regardless of whether an incident was observed or reported, have identified a number of potential precursors and led to the adoption of new safety measures. These include modifications of pilot training, revisions to or renewed emphasis on standard operating procedures, equipment fixes, and the issuance of alerts to pilots regarding potential hazards (GAO, 1998). The Flight Safety Foundation’s publication, Flight Safety Digest, shows that other aviation safety reporting and sharing platforms, including ASAPs, ASRS, and the Global Aviation Information Network, also frequently identify precursors and support analyses of precursor events (DOT-FAA, 2002). Studies of other industries also cite safety improvements after the institution of precursor programs (see examples in the papers by James Bagian [p. 37] and Dennis Hendershot [p. 103] in this volume).
This finding does not explicitly address the cost effectiveness of precursor programs. However, as indicated earlier, continued major lapses in safety management (such as the loss of the space shuttle Columbia, the corrosion problems discovered at the Davis-Besse nuclear power plant in 2002, and the August 2003 blackout) suggest that we are far from the point of diminishing returns on investments in safety.
Recommendation 1. Organizations involved in operations with significant safety and reliability concerns should evaluate the opportunities for risk reduction through precursor analysis programs.
The effective management of precursors, near misses, and close calls poses a number of challenges. Managing a single incident involves recognizing that a precursor has occurred, ensuring that the event is reported, and analyzing the event to assess its causes and identify possible corrective actions. Managing an entire precursor program requires identifying the types of precursors to be reported, prioritizing and filtering observed incidents (e.g., deciding which precursors justify reporting, which reports justify further analysis, and which analyses justify corrective actions), and deciding which reports to disseminate and which corrective actions to implement on an organizational scale.
The following findings address specific issues associated with the management of accident precursors. They are not intended to be comprehensive, and some aspects of precursor management (such as root-cause analysis, discussed by William Corcoran, p. 79 in this volume) are not addressed here.
Finding 2. Effective precursor management programs include clear definitions of risk, risk-reduction objectives, and the types of precursor data needed for risk management.
The range of precursors reported depends on how precursors are defined. Definitions vary from highly specific criteria (such as exceeding a specific quantitative threshold) to broad definitions that encompass a wide range of events and circumstances. Definitions of near misses and close calls can also vary from one industry or setting to another.
Designers and managers of precursor programs may assume that participants know what types of events to report and that they will recognize them when they occur. However, even highly knowledgeable individuals can have different views of the meaning of accident precursors, which can substantially affect the range of incidents reported. Phimister et al. (2003) cite examples from the chemical industry of personnel identifying precursor events that would have been of interest to management but not reporting them because they did not match the stated definition of the precursor program.
Recommendation 2. Precursor programs should define the precursors of interest in a way that is readily understandable to everyone expected to report a precursor, close call, near miss, or other safety-related occurrence.
Finding 3. The expected operation of a technology is not always characterized in a way that makes deviations readily apparent. This can result in precursors going unreported.
Although it is not always possible to distinguish between normal and abnormal operations, distinguishing precursor events based on a defined, ideal mode of operation has several advantages. First, if participants in precursor programs have a clear understanding of the standards of operation, they can compare an observed incident with the standards to determine if the deviation is significant. Second, a clear understanding of ideal operation can provide a basis for deciding whether a corrective action is necessary and, if so, which action to take. Third, explicit contrasts between precursors and the standards of operation can help in the prioritization of observed precursors.
Defining ideal operations involves not only knowing about the operation of the system in question, but also making value judgments about the range of acceptable deviations. This requires the identification of a consistent threshold between ideal and abnormal operations. Although some deviations from ideal operation may be considered acceptable (and may, in fact, be unavoidable in some situations), Vaughan (1997) has illustrated the risks associated with the normalization of deviance. Therefore, there should be a high “safety margin” in evaluating the risks posed by deviations. Deviations that are judged to be unacceptable after careful scrutiny should trigger corresponding contingency responses.
Recommendation 3. Activities with potentially significant risks should be subjected to an appropriate level of hazard analysis, which should then be used to help identify and define precursor events of concern.
Finding 4. Barriers to reporting precursor events include a variety of factors: fear of blame for an event; reluctance to report a coworker’s failure; concerns about liability; and lack of time to complete reports.
Precursor events that do not result in damage or loss, that are witnessed by only a few people, or that cannot be readily monitored by a surveillance system can be difficult to capture in a reporting system. For management to learn of such events, the workforce must be actively engaged in the program. Christopher Hart outlines a number of legal and political barriers that can impede the reporting of potential errors to management or regulatory authorities, including (p. 147 in this volume):
The belief that an individual may be held responsible for a precursor event that he or she reports.
The potential for criminal prosecution of the individuals involved in an event.
The possibility that the information could be disseminated to the public.
The possibility that the information could be used in civil litigation proceedings.
Others have cited additional barriers to reporting, including lack of confidence that a report will result in safety improvements and lack of time to complete the report and still complete other tasks (Bridges, 2000). Management must develop strategies to overcome such barriers.
Recommendation 4. Organizations that implement precursor management systems should ensure that the work environment encourages honest reporting of problems as part of a positive safety-improvement culture.
Prioritizing Precursors and Disseminating Precursor Information
Finding 5. Organizations considering or implementing precursor programs face a variety of challenges, including filtering and prioritizing reports for effective analysis and identifying sound risk-reduction responses to observed precursors.
Programs that motivate individuals to report precursors face other challenges, such as how to manage the reported information effectively. If only a few reports are submitted, they can all be analyzed and disseminated to the relevant parties (as is typically done for serious accidents). However, if a large number of precursor reports are submitted, resource constraints may make it difficult to analyze all of them, and it may be impractical to share information about all reported events with everyone participating in the program. For example, ASRS receives about 2,900 reports a month, only 15 to 20 percent of which are logged because of resource constraints (Strauss and Morgan, 2002).
Prioritizing precursor events once they have been reported can also be a challenge. A number of approaches are currently used to prioritize precursors. In some programs, one or more individuals involved in the program simply screen precursor events and prioritize them subjectively. Sometimes, a database of historical events and precursors is used for trending purposes (e.g., to identify increasing or decreasing rates of particular types of precursors over time). In addition, mathematical modeling can be used to assess the probability of an accident conditional on a given type of precursor—as a measure of precursor severity, for example. PRA can be used to estimate the likelihood of accidents based on precursor information and to reduce uncertainties about accident risk. Delphi approaches can also be used to solicit and aggregate expert information on the likelihood of accidents.
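Two of the approaches mentioned above, trending and estimating the probability of an accident conditional on a given type of precursor, can be illustrated with a minimal Python sketch. The records, precursor type names, and pseudo-count smoothing are assumptions made for illustration; they are not drawn from any program described in this report, and a full PRA would involve far more detailed modeling.

```python
from collections import Counter

# Hypothetical history: (reporting period, precursor type, led to accident?)
history = [
    (1, "valve_stuck", False),
    (1, "alarm_ignored", False),
    (2, "valve_stuck", False),
    (2, "valve_stuck", True),
    (3, "valve_stuck", False),
    (3, "alarm_ignored", False),
    (3, "valve_stuck", False),
]


def counts_by_period(records, precursor_type):
    """Trending: number of reports of one precursor type in each period."""
    counts = Counter(p for p, t, _ in records if t == precursor_type)
    periods = sorted({p for p, _, _ in records})
    return [counts.get(p, 0) for p in periods]


def conditional_accident_probability(records, precursor_type, prior_a=1.0, prior_b=1.0):
    """Estimate P(accident | precursor type) as a smoothed relative frequency.

    prior_a and prior_b are pseudo-counts (an assumption of this sketch) that
    keep the estimate away from exactly 0 or 1 when few reports are available.
    """
    outcomes = [accident for _, t, accident in records if t == precursor_type]
    return (sum(outcomes) + prior_a) / (len(outcomes) + prior_a + prior_b)


def rank_precursors(records):
    """Order precursor types by estimated conditional accident probability."""
    types = sorted({t for _, t, _ in records})
    scored = [(t, conditional_accident_probability(records, t)) for t in types]
    return sorted(scored, key=lambda item: item[1], reverse=True)


print(counts_by_period(history, "valve_stuck"))
print(rank_precursors(history))
```

With so few records the estimates are highly uncertain, which is one reason the text pairs frequency data with modeling and expert judgment.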
Recommendation 5. Organizations should link precursor programs to the hazard assessment methodology used to manage safety and reliability, thereby developing a basis for setting priorities and using precursor information to establish measurements for improvements in risk.
The ability to leverage precursor information to reduce risk exposure depends heavily on organizational endorsement, commitment, and leadership. Organization leaders must be involved in the development and implementation of precursor programs and must have a clear understanding of each program’s structure, merits, and potential vulnerabilities.
Finding 6. Each organization has its own management structures, history, and culture, which are integral to both its safety philosophy and the role of precursor programs as part of the organization’s commitment to safe, reliable operation.
The design of a precursor program must be sensitive to the characteristics of the particular situation, such as management structures, industry and organizational history, government and labor relations, the regulatory environment, legal considerations and constraints, the financial health of the industry and organization, and public perceptions of the risks posed by the industry in question.
To ensure continued participation, precursor programs must also lead to demonstrable improvements in safety. Because improvements resulting from precursor programs may not be readily visible to the casual observer, they should be audited and evaluated in terms of both risk reduction and cost effectiveness, and the resulting information should be shared with the people expected to participate in the program to encourage them to continue their participation. Evaluating whether safety improvements achieve the desired objectives requires organizational and management commitment to the program, as well as adequate resources.
Recommendation 6. Precursor programs should be implemented with the commitment of management at all levels, and measurable safety improvements attributable to the program should be publicized.
Finding 7. Many precursor events (and major accidents) occur in the private sector. Therefore, to reduce accident rates through precursor management, the private sector must be actively engaged in identifying and managing precursor events.
Although an increasing number of companies in high-hazard industries (i.e., industries that may experience catastrophic events) have initiated precursor or near-miss reporting programs, the committee believes this represents only a small fraction of the companies that could benefit from such programs. The committee encourages companies that do not have programs in place to examine industry best practices and implement programs suited to their needs and the hazards they face.
Recommendation 7. Companies in high-hazard industries should institute and/or maintain formal precursor programs for the collection, analysis, and sharing of risk-related information.
Finding 8. In some cases, channels for communicating risk-related information among companies in high-hazard industries are weak or nonexistent.
Many companies have valid concerns about sharing information, such as concerns about releasing proprietary information and/or the legal implications of sharing information. As a result, important information may either not be shared or may be shared only after it has been stripped of essential facts, so that it is of relatively little use to the recipient.
Participation by multiple parties in information sharing often amplifies the benefits derived from the information, especially when the parties face common risks. Hence, the committee encourages companies to work to overcome the barriers and develop novel approaches to sharing risk-related information. For instance, in a regulated industry, a private third party could play the role of honest broker, instead of a government agency, with government approval of the overall approach. A similar model is already being used in the chemical industry, where a number of chemical companies participate in the Process Safety Incident Database maintained by the Center for Chemical Process Safety (CCPS). The CCPS (a division of the American Institute of Chemical Engineers) collects, de-identifies, and shares anonymous information about accidents, incidents, and near misses with participating companies (Kelly and Clancy, 2001).
Recommendation 8. Companies in high-hazard industries should develop strategies for sharing risk-related information with other companies, when possible, as well as with other plants and facilities within their own companies, and should work to make proprietary information “shareable” between companies.
Finding 9. Greater cross-industry sharing of risk-related research, experiences, and practices could be widely beneficial, as evidenced by the cross-industry learning experienced at the workshop.
The advance of precursor practices and research requires open channels of communication—not only among the facilities of a single company or among firms in the same industry, but also among industries. It was evident at the workshop that industries have much to learn from each other and that obstacles in one industry might be overcome by leveraging the research and practices of other industries. More cross-industry sharing would encourage both research and the conversion of research results to reliable, effective practices. Cross-industry sharing could be facilitated by bringing together members of high-hazard industries regularly to discuss risk-related issues. This could be done by trade organizations, the National Academies, the Society for Risk Analysis, the Public Entity Risk Institute, and/or government bodies.
Recommendation 9. Organizations should support and participate in cross-industry collaborations on precursor management and research.
Even though government institutions are already engaged in facilitating the reporting and analysis of precursors, the committee believes that government could do more to foster the cross-company and cross-industry sharing of information. However, government actions must be carefully considered to ensure that they encourage, rather than discourage, participation by individuals and organizations in precursor identification and management programs.
Finding 10. Existing regulatory models for using precursor data are potentially applicable to multiple industries.
Government agencies seeking to leverage precursor information in an industry should consider adapting approaches that have already been developed for other industries. For example, analogous versions of the ASAP and ASRS models have been developed for industries other than aviation. In the ASAP model, each company collects and manages near-miss and precursor data in parallel with other companies using similar data-collection methods. Phimister et al. (2003) and Barach and Small (2000) discuss similar reporting systems in the chemical and health care industries, respectively. In the ASRS model, a third party (in this case, NASA) is endorsed by the regulatory agency as an honest broker. The Department of Veterans Affairs uses a similar reporting system in health care settings.
Transferring precursor program models from one industry to another must be done carefully, however. Workforces may have different cultures that affect the acceptability of particular models; stakeholders may have different relationships; issues of proprietary information may impede the transfer of safety-sensitive information; and legal issues may hinder the sharing of information. Finally, incentives for sharing information about risks may differ from one industry to another. Steps that can be taken to encourage the adoption of precursor programs include providing economic incentives for information sharing, aligning market mechanisms to encourage precursor management (e.g., through reductions in insurance premiums), and third-party inspections of corporate risk-management programs (Carroll and Hatakenaka, 2001; Kunreuther et al., 2002).
Recommendation 10. Government agencies overseeing high-hazard industries or technologies that do not have a cohesive strategy for managing precursor information should develop an initial agency policy on precursor management to initiate a dialogue on how precursors can and should be managed.
The committee notes that some industries and agencies have already initiated activities consistent with this recommendation. For example, a white paper prepared by the Volpe Center (2003) served as the basis for a discussion at a railroad industry workshop held in 2003. The paper and workshop helped initiate an industry dialogue to evaluate how precursor information is currently used in the industry and how it could be used more effectively to improve railroad safety. In addition, as part of the Safety Data Initiative at the Bureau of Transportation Statistics, working groups have been charged with collecting better data on accident precursors and expanding the collection of near-miss data to all modes of transportation (BTS, 2002b).
Finding 11. There is already an ongoing research agenda in precursor analysis and management.
The committee believes that further research on precursor management would lead to higher levels of system safety. Given the number and severity of technological accidents in the past two decades, research should be considered a high priority for agencies that regulate high-hazard industries. The source(s) and amount of funding for such research will vary from one industry to another.
Because many disciplines in engineering, physical sciences, and social sciences can contribute to precursor analysis and management, and because the research needs vary from one industry to another, it is difficult to prioritize research topics. However, areas of general interest that may benefit precursor management programs might include: the identification of trends in large amounts of statistical data; the design of fault-tolerant systems; human factors analysis; the design of human-machine interfaces; team dynamics in safety-critical system operations; and organizational learning and leadership.
Research topics directly usable in precursor programs might include: data acquisition methods; improved fault-detection algorithms; risk modeling and trending methods; the relative effectiveness of alternative regulatory frameworks for precursor reporting and management; industry epidemiological analyses; and strategies for engaging large organizations in risk management. Academia, industry, government, and collaborative public-private projects could all be involved in research on these topics and other challenges identified in the papers in this report.
The committee also believes that basic research on precursor management would benefit numerous industries. Some of the most effective practices in precursor management are summarized in this report, but there are still significant uncertainties about the effectiveness of different approaches, partly because of insufficient scientific evaluations of precursor management methods. For example, basic scientific research could compare the merits of voluntary and mandatory reporting systems or quantify the decrease in system risks achieved by precursor programs (e.g., using PRA or industry epidemiological analysis). The committee encourages the National Science Foundation and the mission agencies to support basic research in these and related areas.
Recommendation 11. Mission agencies with discretionary research budgets should support precursor-related research and pilot studies relevant to their respective missions. In addition, funding agencies and foundations should support basic research on using accident precursors in risk management programs and the characteristics of effective precursor information management.
The practice of searching for and learning from accident precursors is a valuable complement to other safety management practices, such as sound system engineering, adherence to standards, and the design of robust, fault-tolerant systems. Maintaining safety is an ongoing, dynamic process that does not stop when a technology has been designed, built, or deployed. Despite the best engineering practices, and despite strict adherence to standards and ongoing maintenance, indicators of future problems can and do arise. Organizations that formally search for and manage accident precursors can continually find opportunities for improving safety and can thereby reduce the probability of disasters.
REFERENCES
Armstrong, J.S. 1985. Long-Range Forecasting: From Crystal Ball to Computer. New York: John Wiley and Sons.
Asch, D.A., and R.M. Parker. 1988. The Libby Zion case: one step forward or two steps backward? New England Journal of Medicine 318(12): 771–775.
ASRS (Aviation Safety Reporting System). 2001. The Office of the NASA Aviation Safety Reporting System. Callback 260. Moffett Field, Calif.: National Aeronautics and Space Administration.
ASRS. 2003. ASRS Program Overview. Available online: http://asrs.arc.nasa.gov/overview_nf.htm.
Barach, P., and S.D. Small. 2000. Reporting and preventing medical mishaps: lessons from non-medical near miss reporting systems. British Medical Journal 320(7237): 759–763.
Bates, D.W., L.L. Leape, D.J. Cullen, N. Laird, L.A. Petersen, J.M. Teich, E. Burdick, M. Hickey, S. Kleefield, B. Shea, M. Vander Vliet, and D.L. Seger. 1998. Effect of computerized physician order entry and a team intervention on prevention of serious medication errors. Journal of the American Medical Association 280(15): 1311–1316.
Bates, D.W., J.M. Teich, J. Lee, D. Seger, G.J. Kuperman, N. Ma’Luf, D. Boyle, and L. Leape. 1999. The impact of computerized physician order entry on medication error prevention. Journal of the American Medical Informatics Association 6(4): 313–321.
Battles, J.B., H.S. Kaplan, T.W. Van der Schaaf, and C.E. Shea. 1998. The attributes of medical event-reporting systems: experience with a prototype medical event-reporting system for transfusion medicine. Archives of Pathology and Laboratory Medicine 122(3): 231–238.
BEA (Bureau d’enquetes et d’analyses pour la securite de l’aviation civile). 2002. Accident on 25 July 2000 at “La Patte d’Oie” in Gonesse (95), to the Concorde, registered F-BTSC operated by Air France. Paris: Ministere de l’equipement des transports et du logement. Available online: http://www.bea-fr.org/docspa/2000/f-sc000725pa/pdf/f-sc000725pa.pdf.
Bedford, T., and R. Cooke. 2001. Probabilistic Risk Analysis: Foundations and Methods. Cambridge, U.K.: Cambridge University Press.
Bier, V.M. 1993. Statistical methods for the use of accident precursor data in estimating the frequency of rare events. Reliability Engineering and System Safety 41: 267–280.
Bier, V.M., Y.Y. Haimes, J.H. Lambert, N.C. Matalas, and R. Zimmerman. 1999. A survey of approaches for assessing and managing the risk of extremes. Risk Analysis 19(1): 83–94.
Bier, V.M., and A. Mosleh. 1990. The analysis of accident precursors and near misses: implications for risk assessment and risk management. Reliability Engineering and System Safety 27(1): 91–101.
Bird, F.E., and G.L. Germain. 1996. Practical Loss Control Leadership. Revised ed. Calgary, Alberta: Det Norske Veritas.
Birkmeyer, J.D., C.M. Birkmeyer, D.E. Wennberg, and M.P. Young. 2000. Leapfrog Safety Standards: Potential Benefits of Universal Adoption. Washington, D.C.: The Leapfrog Group.
Bourrier, M. 1996. Organizing maintenance work at two nuclear power plants. Journal of Contingencies and Crisis Management 4: 104–112.
Bridges, W.G. 2000. Get Near Misses Reported, Process Industry Incidents: Investigation Protocols, Case Histories, Lessons Learned. Pp. 379–400 in Proceedings of the International Conference and Workshop on Process Industry Incidents: Investigation Technologies, Case Histories, and Lessons Learned. October 2, 5, 6, 2000. New York: American Institute of Chemical Engineers.
BTS (Bureau of Transportation Statistics). 2002a. Project 6 Overview: Develop Better Data on Accident Precursors or Leading Indicators. In Safety in Numbers Conference Compendium. Washington, D.C.: Bureau of Transportation Statistics.
BTS. 2002b. Project 7 Overview: Expand the Collection of “Near-Miss” Data to All Modes. In Safety in Numbers Conference Compendium. Washington, D.C.: Bureau of Transportation Statistics. Available online: http://www.bts.gov/publications/safety_in_numbers_conference_2002/project07/project7_overview.html.
CAIB (Columbia Accident Investigation Board). 2003. Columbia Accident Investigation Board Report. Vol. 1. Washington, D.C.: National Aeronautics and Space Administration. Available online: www.caib.us/news/report.
Carroll, J.S., and S. Hatakenaka. 2001. Driving organizational change in the midst of crisis. MIT Sloan Management Review 42(3): 70–79.
Chess, C., A. Saville, M. Tamuz, and M. Greenberg. 1992. The organizational links between risk communication and risk management: the case of Sybron Chemicals Inc. Risk Analysis 12(3): 431–438.
Chiles, J.R. 2002. Inviting Disaster: Lessons from the Edge of Technology. New York: HarperCollins.
CIRAS (Confidential Incident Reporting and Analysis System). 2003. CIRAS Executive Report. Glasgow, U.K.: CIRAS.
Classen, D. 2003. Engineering a Safer Medication System Creating a National Standard. Presentation to the National Academy of Engineering/Institute of Medicine Workshop on Engineering and the Health Care System, February 6–7, 2003, Irvine, California.
Cook, R., D. Woods, and C. Miller. 1998. A Tale of Two Stories: Contrasting Views of Patient Safety. Chicago: National Patient Safety Foundation.
Cooke, R., and L. Goossens. 1990. The Accident Sequence Precursor methodology for the European post-Seveso era. Reliability Engineering and System Safety 27: 117–130.
CSB (Chemical Safety Board). 2002. Investigation Report: Chemical Manufacturing Incident. NTIS PB2000-107721. Washington, D.C.: Chemical Safety Board.
Cullen, W.D. 2000. The Ladbroke Grove Rail Inquiry. Norwich, U.K.: Her Majesty’s Stationery Office.
DOT-FAA (U.S. Department of Transportation, Federal Aviation Administration). 1997. Advisory Circular Aviation Safety Action Programs (ASAP), AC# 120-66. Washington, D.C.: Federal Aviation Administration.
DOT-FAA. 2000. Advisory Circular Aviation Safety Action Programs (ASAP), AC# 120-66A. Washington, D.C.: Federal Aviation Administration.
DOT-FAA. 2002. Advisory Circular: Aviation Safety Action Programs. AC# 120-66B. Washington, D.C.: U.S. Department of Transportation.
DOT-FAA. 2003. Advisory Circular: Aviation Safety Action Programs. AC# 120-66C. Washington, D.C.: U.S. Department of Transportation.
DOT-NHTSA (U.S. Department of Transportation, National Highway Traffic Safety Administration). 2002. Reporting of Information and Documents About Potential Defects; Retention of Records That Could Indicate Defects; Final Rule, CFR, Vol. 67, No. 132. Washington, D.C.: U.S. Department of Transportation.
Dowell, A.M., and D.C. Hendershot. 1997. No good deed goes unpunished: case studies of incidents and potential incidents caused by protective systems. Process Safety Progress 16(3): 132–139.
Er, J., H.C. Kunreuther, and I. Rosenthal. 1998. Utilizing third-party inspections for preventing major chemical accidents. Risk Analysis 18(2): 145–153.
Evans, R.S., S.L. Pestotnik, D.C. Classen, T.P. Clemmer, L.K. Weaver, J.F. Orme, J.F. Lloyd, and J.P. Burke. 1998. A computer assisted management program for antibiotics and other antiinfective agents. New England Journal of Medicine 338(4): 232–238.
Fischhoff, B. 1975. Hindsight ≠ foresight: the effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance 1: 288–299.
Freudenburg, W.R. 1988. Perceived risk, real risk: social science and the art of probabilistic risk assessment. Science 242(4875): 44–49.
GAO (General Accounting Office). 1998. U.S. efforts to implement flight operational quality assurance programs. Aviation Safety 17(7-9): 1–36.
Hawkins, S.A., and R. Hastie. 1990. Hindsight: biased judgments of past events after the outcomes are known. Psychological Bulletin 107: 311–327.
IOM (Institute of Medicine). 2000. To Err Is Human: Building a Safer Health System, L.T. Kohn, J.M. Corrigan, and M.S. Donaldson, eds. Washington, D.C.: National Academies Press.
Johnson, J.W., and D.M. Rasmuson. 1996. The US NRC’s Accident Sequence Precursor Program: an overview and development of a Bayesian approach to estimate core damage frequency using precursor information. Reliability Engineering and System Safety 53: 205–216.
Jones, S., C. Kirchsteiger, and W. Bjerke. 1999. The importance of near miss reporting to further improve safety performance. Journal of Loss Prevention in the Process Industries 12: 59–67.
Kelly, B.D., and M.S. Clancy. 2001. Use a comprehensive database to better manage process safety. Chemical Engineering Progress 97(8): 67–69.
Kilbridge, P., and D. Classen. 2002. Surveillance for Adverse Drug Events: History, Methods and Current Issues. VHA Research Series, Vol. 2. Irving, Texas: Veterans Health Administration.
Kletz, T. 1994. Learning from Accidents, 2nd ed. Oxford, U.K.: Butterworth-Heinemann.
Kumamoto, H., and E.J. Henley. 2000. Probabilistic Risk Assessment and Management for Engineers and Scientists. New York: John Wiley and Sons.
Kunreuther, H.C., P.J. McNulty, and Y. Kang. 2002. Third-party inspection as an alternative to command and control regulation. Risk Analysis 22(2): 309–318.
Kunreuther, H.C., S. Metzenbaum, and P. Schmeidler. 2003. Leveraging the Private Sector: Management-Based Strategies for Improving Environmental Performance. Paper Presented at Conference on Leveraging the Private Sector: Management-Based Strategies for Improving Environmental Performance, July 31–August 1, 2003, Resources for the Future, Washington, D.C.
Lakats, L.M., and M.E. Paté-Cornell. In press. Organizational warning systems: a probabilistic approach to optimal design. IEEE Transactions on Engineering Management 51(2).
LaPorte, T.R. 1988. The United States Air Traffic System: Increasing Reliability in the Midst of Rapid Growth. Pp. 215–244 in The Development of Large-Scale Technical Systems, R. Mayntz and T. Hughes, eds. Boulder, Colo.: Westview Press.
LaPorte, T.R., and P. Consolini. 1998. Theoretical and operational challenges of “high-reliability organizations”: air-traffic control and aircraft carriers. International Journal of Public Administration 21: 847–852.
Leapfrog Group. 2003. The Leapfrog Group Factsheet: Computerized Physician Order Entry System. Revision 4/18/03. Washington, D.C.: The Leapfrog Group.
Lodwick, D.G. 1993. Rocky Flats and the evolution of distrust. Research in Social Problems and Public Policy 5: 149–170.
March, J.G., L.S. Sproull, and M. Tamuz. 1991. Learning from samples of one or fewer. Organization Science 2(1): 1–13.
Marcus, A.A. 1995. Managing with danger. Industrial and Environmental Crisis Quarterly 9(2): 139–152.
Marcus, A.A., and M.L. Nichols. 1999. On the edge: heeding the warnings of unusual events. Organization Science 10(4): 482–499.
Martin, B. 1999. Suppression of dissent in science. Research in Social Problems and Public Policy 7: 105–135.
Minarick, J.W., and C.A. Kukielka. 1982. Precursors to Potential Severe Core Damage Accidents: 1969–1979, A Status Report. NUREG/CR-2497. Washington, D.C.: U.S. Nuclear Regulatory Commission.
O’Neill, A.C., L.A. Petersen, E.F. Cook, D.W. Bates, T.H. Lee, and T.A. Brennan. 1993. Physician reporting compared with medical-record review to identify adverse medical events. Annals of Internal Medicine 119(5): 370–376.
Paté-Cornell, M.E. 1986. Warning systems in risk management. Risk Analysis 5(2): 223–234.
Paté-Cornell, M.E., and P. Fischbeck. 1993. Probabilistic risk analysis and risk-based priority scale for the tiles of the space shuttle. Reliability Engineering and System Safety 40(3): 221–238.
Paté-Cornell, M.E., and S.D. Guikema. 2002. Probabilistic modeling of terrorist threats: a systems analysis approach to setting priorities among countermeasures. Military Operations Research 7(4): 5–23.
Phimister, J.R., U. Oktem, P.R. Kleindorfer, and H. Kunreuther. 2003. Near miss incident management in the chemical process industry. Risk Analysis 23(3): 445–459.
Pidgeon, N.F. 1991. Safety culture and risk management in organizations. Work and Stress 12(3): 202–216.
Pooley, E. 1996. Nuclear warriors. Time, March 4, pp. 46–54.
Reisch, F. 1994. The IAEA asset approach to avoiding accidents is to recognize the precursors to prevent incidents. Nuclear Safety 35: 25–35.
Roberts, K.H. 1990. Some characteristics of one type of high reliability organization. Organization Science 1(2): 160–176.
Rochlin, G.I. 1999. Safe operation as a social construct. Ergonomics 42(3): 1–12.
Rochlin, G.I., T.R. LaPorte, and K.H. Roberts. 1987. The self-designing high-reliability organization: aircraft carrier flight operations at sea. Naval War College Review 40(4): 76–90.
Strauss, B., and M.G. Morgan. 2002. Everyday threats to aircraft safety. Issues in Science and Technology 19(2): 82–86.
Turner, B.A., and N. Pidgeon. 1997. Man-made Disasters, 2nd ed. London: Butterworth-Heinemann.
USNRC (U.S. Nuclear Regulatory Commission). 1978. Risk Assessment Review Group Report. NUREG/CR-0400. Washington, D.C.: Nuclear Regulatory Commission.
van der Schaaf, T.W. 1992. Near Miss Reporting in the Chemical Process Industry. Ph.D. Thesis, Eindhoven University of Technology, the Netherlands.
van der Schaaf, T.W., D.A. Lucas, and A.R. Hale, eds. 1991. Near Miss Reporting as a Safety Tool. Oxford, U.K.: Butterworth-Heinemann.
Vaughan, D. 1997. The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. Chicago: University of Chicago Press.
Volpe Center. 2003. Improving Safety through Understanding Close Calls. Cambridge, Mass.: Volpe Center.
Weick, K.E., and K.H. Roberts. 1993. Collective mind and organizational reliability: the case of flight operations on an aircraft carrier deck. Administrative Science Quarterly 38: 357–381.
Weick, K.E., and K.M. Sutcliffe. 2001. Managing the Unexpected: Assuring High Performance in an Age of Complexity, Vol. 1. New York: John Wiley and Sons.
Weick, K.E., K.M. Sutcliffe, and D. Obstfeld. 1999. Organizing for high reliability. Pp. 81–123 in Research in Organization Behavior 21, R.S. Sutton and B.M. Staw, eds. Stamford, Conn.: JAI Press.
Westrum, R., and A.J. Adamski. 1999. Organizational factors associated with safety and mission success in aviation environments. Pp. 67–104 in Handbook of Aviation Human Factors, D.J. Garland, J.A. Wise, and V.D. Hopkin, eds. Mahwah, N.J.: Lawrence Erlbaum Associates.
White House. 2000. President Clinton Announces New Public-Private Partnerships to Increase Aviation Safety. Press release, January 14. Washington, D.C.: Office of the Press Secretary, White House.
Wiegmann, D.A., H. Zhang, T. von Thaden, G. Sharma, and A. Mitchell. 2002. A Synthesis of Safety Culture and Safety Climate Research. Technical Report ARL-02-3/FAA-02-2. Urbana-Champaign, Ill.: Aviation Research Laboratory, Institute of Aviation, University of Illinois.