Trustworthy Medical Device Software
Kevin Fu, PhD
This report summarizes what the computing research community knows about the role of trustworthy software for safety and effectiveness of medical devices. Research shows that problems in medical device software result largely from a failure to apply well-known systems engineering techniques, especially during specification of requirements and analysis of human factors. Recommendations to increase the trustworthiness of medical device software include (1) regulatory policies that specify outcome measures rather than technology, (2) collection of statistics on the role of software in medical devices, (3) establishment of open-research platforms for innovation, (4) clearer roles and responsibilities for the shared burden of software, (5) clarification of the meaning of substantial equivalence for software, and (6) an increase in Food and Drug Administration (FDA) access to outside experts in software. This report draws upon material from research in software engineering and trustworthy computing, public FDA data, and accident reports to provide a high-level understanding of the issues surrounding the risks and benefits of medical device software.
Software plays a significant and increasing role in the critical functions of medical devices. From 2002 to 2010, software-based medical devices resulted in over 537 recalls affecting more than 1,527,311 devices (Stewart and Fu, 2010). From 1999 to 2005, the number of recalls affecting devices containing software more than doubled from 118 to 273 (Bliznakov et al.,
2006). During this period, 11.3% of all recalls were attributable to software failures. This rate is nearly double that of the 1983–1997 period, when only 6% of recalls were attributable to computer software (Wallace and Kuhn, 2001). For pacemakers and implantable cardioverter–defibrillators, the number of devices recalled because of software abnormalities more than doubled compared with the 1991–2000 period (Maisel et al., 2002). In 2006, Faris noted the milestone that over half the medical devices on the US market now involve software (Faris, 2006).
Yet, despite the lessons learned by tragic accidents, such as the radiation injuries and deaths caused by the Therac-25 linear accelerator over 20 years ago (Leveson and Turner, 1993), medical devices that depend on software continue to injure or kill patients in preventable ways. Problems in medical device software result largely from a failure to apply well-known systems engineering techniques, especially during specification of requirements and analysis of human factors.
“The ability of software to implement complex functionality that cannot be implemented at reasonable cost in hardware makes new kinds of medical devices possible…” (NRC, 2007).
Software Can Help and Hurt
Software can significantly affect patient safety in both positive and negative ways. Software helps to automatically detect dangerous glucose levels that could be fatal for a person using an insulin pump to treat diabetes. Medical linear accelerators use software to target the radiation dose more precisely. Remote monitoring of implanted devices may help to discover malfunctions more quickly and may lead to longer survival of patients (Kolata, 2010). However, medical device software has also contributed to the injury or death of patients. Problems ranging from poor user interfaces to overconfidence in software have led to accidents such as fatally incorrect dosages on infusion pumps (FDA, 2004a, 2010; Meier, 2010) and in radiation therapy (Leveson and Turner, 1993; Bogdanich, 2010b). A common trait of adverse events involving medical device software is that the problems are often set in place before any implementation begins (see Table D-1).
Medical Devices Ought to Be Trustworthy
In the context of software, trustworthiness is inextricably linked with safety and effectiveness. There are several definitions of trustworthy software (see Sidebar 1) that vary by the specific contributions and terminology of various research subdisciplines. However, the fundamental idea is that software trustworthiness is a system property measuring how well a software system meets requirements such that stakeholders will trust in the
TABLE D-1 Examples of Adverse Events Where Medical Device Software Played a Significant Role
Linear accelerator: Patients died from massive overdoses of radiation.
An FDA memo regarding the Corrective Action Plan (CAP) notes that “unfortunately, the AECL response also seems to point out an apparent lack of documentation on software specifications and a software test plan” (Leveson, 1995).
Pacemakers and implantable defibrillators: Implant can be wirelessly tricked into inducing a fatal heart rhythm (Halperin et al., 2008).
Security and privacy need to be part of the early design process.
Infusion pump: Patients injured or killed by drug overdoses.
Software that did not prevent key bounce misinterpreted key presses of 20 mL as 200 mL (Flournoy, 2010).
Infusion pump: Underdosed patient experienced increased intracranial pressure followed by brain death.
Buffer overflow (programming error) shut down pump (FDA, 2007).
Ambulance dispatch: Lost emergency calls.
An earlier system for the London Ambulance Service failed two major tests and was scuttled (Graham, 1992). Ambulance workers later accused the computer system of losing calls and said that “the number of deaths in north London became so acute that the computer system was withdrawn” (Tompsett, 1992). The ambulance company attributed the problems to “teething troubles” with a new computer system (Tompsett, 1992).
Health information technology (HIT) devices: Computer systems globally rendered unavailable.
An anti-virus update misclassified a core Windows operating system component as malware and quarantined the file, causing a continuous reboot cycle for any system that accepted the software update (Leyden, 2010). Numerous hospitals were affected. At Upstate University Hospital in New York, 2,500 of the 6,000 computers were affected (Tobin, 2010). In Rhode Island, a third of the hospitals were forced “to postpone elective surgeries and stop treating patients without traumas in emergency rooms” (Svensson, 2010).
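The key-bounce failure in Table D-1 shows how a small omission in input handling becomes a tenfold overdose: a mechanically bouncing key registers twice, so "2, 0" reads as "2, 0, 0". A minimal debouncing sketch in illustrative Python follows; the 50 ms window, function name, and event format are assumptions for this example, not taken from any real device, where the threshold would be derived from the keypad's measured bounce characteristics:

```python
DEBOUNCE_WINDOW_MS = 50  # assumed threshold, chosen for illustration only

def debounce(events, window_ms=DEBOUNCE_WINDOW_MS):
    """Filter raw (timestamp_ms, key) events, dropping repeats of the
    same key that arrive within the debounce window (contact bounce)."""
    accepted = []
    last_seen = {}  # key -> timestamp of last accepted press
    for t, key in events:
        if key in last_seen and t - last_seen[key] < window_ms:
            continue  # treat as contact bounce, not a second press
        last_seen[key] = t
        accepted.append((t, key))
    return accepted

# A bouncing "0" key: without debouncing, "2", "0", "0" reads as 200 mL.
raw = [(0, "2"), (120, "0"), (135, "0")]  # second "0" arrives 15 ms later
clean = debounce(raw)
entered = "".join(k for _, k in clean)    # "20", not "200"
```

The point is not the particular threshold but that the requirement ("a bouncing contact must not register as two presses") is stated and enforced in one place, rather than left implicit.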
Sidebar 1. The many definitions of trustworthiness.
One definition of trustworthy software is “software that is dependable (including but not limited to reliability, safety, security, availability, and maintainability) and customer-responsive. It can fulfill customer trust and meet the customer’s stated, unstated, and even unanticipated needs” (Jayaswal and Patton, 2007). Another definition emphasizes the multidimensional, system-oriented nature that trustworthiness of a system implies that it is worthy of being trusted to satisfy its specified requirements (e.g., safety, effectiveness, dependability, reliability, security, privacy) with some [quantifiable] measures of assurance (Neumann, 2006). The National Science Foundation associates trustworthiness with properties of security, reliability, privacy, and usability—arguing that these “properties will lead to the levels of availability, dependability, confidentiality, and manageability that our systems, software and services must achieve in order to overcome the lack of trust people currently feel about computing and what computing enables” (NSF, 2010).
operation of the system. The requirements include overlapping and sometimes competing notions of safety, effectiveness, usability, dependability, reliability, security, privacy, availability, and maintainability.
Failure to meaningfully specify requirements, complacency, and lack of care for human factors can erode trustworthiness. The lack of trustworthy medical device software leads to shortfalls in properties such as safety, effectiveness, usability, dependability, reliability, security, and privacy. Good systems engineering (Ryschkewitsch et al., 2009) and the adoption of modern software engineering techniques can mitigate many of the risks of medical device software. Such techniques include a technical and managerial mindset that focuses on “design and development of the overall system” (Leveson, 1995) as opposed to focusing on optimization of components; meaningful specification of requirements such as intent specifications (Leveson, 2000); application of systems safety (Leveson, 1995); and static analysis (NITRD, 2009).
Although it is possible to create trustworthy medical device software under somewhat artificial constraints to achieve safety and effectiveness without satisfying other properties, in practice it is difficult to find environments where the properties are not linked. A medical device that works effectively in isolation may lose the effectiveness property if another component engineered separately joins the system, causing unanticipated interactions. For example, a computer virus caused 300 patients to be turned away from radiation therapy because of shortfalls in security (BBC News, 2005). A security component can also reduce effectiveness if not designed in the context of the system. For instance, a mammography imaging system may become ineffective if an automatic update of an anti-virus program designed to increase security causes the underlying operating system to fail instead (Leyden, 2010).
Innovations that combine computer technology with medical devices could greatly improve the quality of health care (Lee et al., 2006; NITRD, 2009), but the same life-saving technology could reduce safety because of the challenges of creating trustworthy medical device software. For instance, an implantable medical device with no physical means to wirelessly communicate over long distances may work safely and effectively for years. However, adding remote monitoring of telemetry to the device introduces an interface that fundamentally changes the properties of the overall system. The new system must require not only that any component designed to interact with the device be trustworthy, but also that any component capable of communicating with the device be trustworthy.
MEDICAL DEVICES, BUT WITH SOFTWARE: WHAT’S THE DIFFERENCE?
Patients benefit from software-based medical devices because “computers provide a level of power, speed, and control not otherwise possible” (Leveson, 1995). Without computer software, it would not be feasible to innovate a closed-loop, glucose-sensing insulin pump; a remotely monitored, implantable cardiac defibrillator; or a linear accelerator that calculates the radiation dose based on a patient’s tissue density in each cross-section. However, the methodologies used in practice to mitigate the risks inherent in software have not kept pace with the deployment of software-based medical devices. For example, using techniques that work well to assure the safety and effectiveness of hardware or mechanical components will not mitigate the risks introduced by software. The following points use the writing of Pfleeger et al. (2001) with permission. There are several reasons why software requires a different set of tools to assure safety and effectiveness:
The discrete (as opposed to continuous) nature of software (Lorge Parnas, 1985). Software is sensitive to small errors. Most engineered systems have large tolerances for error. For example, a 1-inch nail manufactured to be 1.0001 inch or 0.9999 inch can still be useful. Manufacturing is a continuous process, and small errors lead to results essentially the same as the exact, desired result. However, consider a slight error in entering a bolus dosage on an infusion pump. A single key press error in selecting hours vs minutes could result in a bolus drip at 60 times the desired rate of drug delivery (FDA, 2004b). With some exceptions, small changes in continuous systems lead to small effects; small changes to discrete systems lead to large and often disastrous effects. The discrete nature of software also leads to limited ability to interpolate between test results. A system that correctly provides radiation doses of 20 centigrays (cGy) and 40 cGy does not, on its own, justify the interpolation that it would also work correctly for 32 cGy. Nor is there, in general, a direct equivalent for software systems of the “over-engineering” safety margins used in physical systems.
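The hours-versus-minutes error described above is exactly the kind of discrete fault that a requirements-level plausibility check can trap at entry time instead of at the patient. The following Python sketch is purely illustrative; the rate limit, function name, and units are assumptions invented for this example, not drawn from any real pump:

```python
MAX_BOLUS_RATE_ML_PER_HR = 50.0  # illustrative hard limit, not from any real device

def bolus_rate_ml_per_hr(volume_ml, interval_value, interval_unit):
    """Compute a bolus rate and reject implausible entries outright,
    rather than trusting the operator's unit selection."""
    if interval_unit == "hours":
        hours = interval_value
    elif interval_unit == "minutes":
        hours = interval_value / 60.0
    else:
        raise ValueError(f"unknown unit: {interval_unit!r}")
    rate = volume_ml / hours
    if rate > MAX_BOLUS_RATE_ML_PER_HR:
        # A 'minutes' entry where 'hours' was intended inflates the rate
        # 60-fold; force confirmation or re-entry instead of delivering.
        raise ValueError(f"rate {rate:.1f} mL/hr exceeds limit; confirm units")
    return rate
```

Entering a 40 mL bolus over 20 hours yields a plausible 2 mL/hr; the same entry with "minutes" selected computes to 120 mL/hr and is refused at the boundary, which is where the requirement, not the operator, catches the discrete error.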
The immaturity of software combined with rapid change. We keep running at an ever-faster pace to develop or use increasingly complex software systems that we do not fully understand, and we place such software in systems that are more and more critical. For example, a Networking and Information Technology Research and Development Program (NITRD) report of the High-Confidence Medical Devices, Software, and Systems (HCMDSS) Workshop (NITRD, 2009) notes that
Many medical devices are, essentially, embedded systems. As such, software is often a fundamental, albeit not always obvious, part of a device’s functionality …. Devices and systems are becoming increasingly complicated and interconnected. We may already have reached the point where testing as the primary means to gain confidence in a system is impractical or ineffective.
The recent reporting of several radiation deaths stemming from medical linear accelerators (Bogdanich, 2010a) further highlights how complexity outpaces the maturity of present-day practices for creating trustworthy medical device software:
“When it exceeds certain levels of complexity, there is not enough time and not enough resources to check the behavior of a complicated device to every possible, conceivable kind of input,” said Dr. Williamson, the medical physicist from Virginia.
But the technology introduces its own risks: it has created new avenues for error in software and operation, and those mistakes can be more difficult to detect. As a result, a single error that becomes embedded in a treatment plan can be repeated in multiple radiation sessions.
Despite these challenges, software has improved the effectiveness of critical systems in contexts such as avionics. Modern airplanes would be difficult to fly without the assistance of software, but airplanes have also introduced safety risks of software by using fly-by-wire (electronic) controls instead of pneumatics. However, there is a substantial belief among
software engineers that the medical device community (unlike the avionics community) does not take full advantage of well-known techniques for engineering software for critical systems. Many software engineers feel that adoption of well-known technology not only lags but that the technology is often ignored outright by medical device manufacturers. The safety culture of the avionics community does not appear to be universally appreciated in the medical device community.
TECHNIQUES TO CREATE TRUSTWORTHY MEDICAL DEVICE SOFTWARE
While the role of software in medical devices continues to increase in significance, the deployment of well-known techniques that can mitigate many of the risks introduced by software continues to lag. The following discussion draws from several technical documents on software engineering for critical systems.
The reader is strongly encouraged to read the full text of reports from NITRD on high-confidence medical devices (NITRD, 2009) and from the National Academies on software for dependable systems (NRC, 2007). Highly recommended reading on software engineering for critical systems includes Safeware: System Safety and Computers (Leveson, 1995) and Solid Software (Pfleeger et al., 2001) as well as evidence-based certification strategies such as the British Ministry of Defence Standard 00-56 (Ministry of Defence, 2007).
Adopt Modern Software Engineering Techniques
Medical device software lags in the adoption of modern software engineering techniques ranging from requirements specification to verification techniques. Fortunately, mature technology is already available to address common problems in medical device software, and that technology has been successful in other safety-critical industries such as avionics.
Programming languages that do not support software fault detection as comprehensively as possible should be avoided in medical device software. The C programming language, for example, has a very weak type system, and so the considerable benefits of strong type checking are lost. By contrast, the Ada programming language provides extensive support for software fault detection. Similarly, mechanical review of software using a technique known as static analysis is a mature technology that can identify possible faults quickly and efficiently. Static analysis supports the overall goal of developing trustworthy software and should be employed to the extent possible. Type checking and static analysis are two mature methods that guide software engineers toward safer and more effective medical device software by reducing or eliminating common sources of software errors.
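The benefit of strong type checking can be sketched concretely. Ada rejects unit mix-ups at compile time; the illustrative Python fragment below (all names and types are invented for this example) approximates the same discipline with unit-bearing wrapper types checked at run time, so a bare number or a wrong unit cannot silently cross an interface:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hours:
    """A duration explicitly tagged as hours."""
    value: float

@dataclass(frozen=True)
class Minutes:
    """A duration explicitly tagged as minutes."""
    value: float

def schedule_bolus_interval(interval):
    """Accept only an Hours value; a bare float or a Minutes value is
    rejected instead of being silently misread as hours."""
    if not isinstance(interval, Hours):
        raise TypeError(f"expected Hours, got {type(interval).__name__}")
    return interval.value

schedule_bolus_interval(Hours(20))      # accepted
# schedule_bolus_interval(Minutes(20))  # rejected: the 20-minute/20-hour
#                                       # confusion cannot cross the type boundary
```

A statically typed language with distinct unit types catches the same mistake before the software ever runs; the sketch merely shows the kind of error class that strong typing removes.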
Some programming systems permit a specification of software to be embedded into the software itself so that compliance of the code with the specification can be checked mechanically. A commercial system that provides this capability along with commercial support of both the language itself and the associated static analysis tools is SPARK Ada. Techniques such as these should be employed whenever possible to enable more effective testing and analysis of software.
A software specification is a statement of what the software has to do. Stating precisely what software has to do has proved extremely difficult, and specification is known to be a major source of software faults. Research over many years has yielded formal languages—i.e., languages with semantics defined in mathematics—that can help to avoid specification errors. Formal specification has been shown to be effective, and formal specifications for medical devices should be employed whenever possible.
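The core idea of formal specification, stating what the software must do as a precise predicate kept separate from how any implementation does it, can be sketched as follows. This is an illustrative fragment, not a formal language; the names and numbers are invented, and a real formal method would prove the property for all inputs rather than check a sample:

```python
# Specification, stated separately from any implementation: a predicate
# defining what "correct" means for a course of radiation delivery.
def spec_dose_ok(prescribed_cgy, delivered_cgy):
    return 0 <= delivered_cgy <= prescribed_cgy

# Implementation under test (illustrative): split a prescription
# evenly across treatment sessions.
def plan_session_doses(prescribed_cgy, n_sessions):
    per_session = prescribed_cgy / n_sessions
    return [per_session] * n_sessions

# Mechanical check of the implementation against the specification.
prescribed = 40.0
doses = plan_session_doses(prescribed, 5)
assert spec_dose_ok(prescribed, sum(doses))
```

Because the predicate is precise, the check is mechanical; ambiguity in the specification, a major source of software faults, has nowhere to hide.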
Meaningfully Specify Requirements
Safety failures in software tend to stem from flaws during specification of requirements (Leveson, 1995). The first example in Table D-1 represents a failure of requirements specification in a 1980s linear accelerator that killed a number of patients, and some believe that the lack of meaningful systems-level specification of requirements contributed to the deaths in the recent radiation overdoses from a modern linear accelerator (Bogdanich, 2010a).
In critical systems, meaningful specification of requirements is crucial to properly anchor testing and analysis. Shortfalls in specification of requirements will lead to a false sense of safety and effectiveness during subsequent design, implementation, testing, etc. An example of meaningful specification of a requirement might be “stop delivery if dose exceeds patient’s prescription” or “patient’s received level of radiation must match level of radiation specified by operator.” Such specification of requirements goes beyond purely functional descriptions such as “pressing start button begins primary infusion” or “delivered level of radiation adjusts to tissue density” that do not meaningfully capture the end-to-end system properties of a medical device.
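A requirement such as "stop delivery if dose exceeds patient's prescription" can be carried directly into the code as a checked contract. The sketch below is illustrative only (the function name, units, and values are assumptions); tools such as SPARK Ada verify contracts like these statically, whereas here they are merely asserted at run time:

```python
def deliver_dose(prescribed_cgy, delivered_so_far_cgy, step_cgy):
    """Deliver one increment of radiation, never exceeding the prescription."""
    # Preconditions: the specification's assumptions, stated in code.
    assert prescribed_cgy > 0, "prescription must be positive"
    assert 0 <= delivered_so_far_cgy <= prescribed_cgy
    assert step_cgy > 0

    # Cap the increment so the running total cannot pass the prescription.
    step = min(step_cgy, prescribed_cgy - delivered_so_far_cgy)
    total = delivered_so_far_cgy + step

    # Postcondition: the end-to-end requirement "stop delivery if dose
    # exceeds patient's prescription", checked on every call.
    assert total <= prescribed_cgy
    return total
```

Note that the contract expresses the end-to-end system property, not a purely functional description of one button or subsystem, which is precisely the distinction drawn above.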
Leading software engineers believe that many medical device manufacturers have an opportunity to significantly improve specification of requirements. In comparing medical devices to avionics systems, researchers wrote in the NITRD report High-Confidence Medical Devices: Cyber-Physical Systems for 21st Century Health Care (NITRD, 2009) that “perhaps the most striking [difference] is the almost complete lack of regard, in the medical-device software domain, for the specification of requirements.” A National Academies report (NRC, 2007) similarly noted that “at least in comparison with other domains (such as medical devices), avionics software
appears to have fared well inasmuch as major losses of life and severe injuries have been avoided.”
The NITRD report emphasizes that business models and incentives in the medical device sector lead to highly proprietary technologies that have two detrimental side effects: (1) companies are less likely to perceive value from specification of requirements, and (2) academic researchers have a much harder time in participating in the innovation of medical device technology.
The National Academies report recommended a direct path to dependable software (Jackson, 2009) for critical systems such as found in medical devices. Under this philosophy, system designers focus on providing direct evidence to support claims about software dependability. The approach contrasts with prescriptive standards that may otherwise dictate the specific claims. System designers are given flexibility to innovate by selecting the claims deemed necessary for the specific application at hand. Designers are forced to think carefully about proving the claims, but a difficulty remains in that the results are only as meaningful as the chosen claims.
Apply a Systems Engineering Approach
Software adds such complexity to the design of medical devices that the device must be treated as a system rather than as an isolated component. The behavior of medical device software depends on its context within a system. Whereas biocompatibility of a material may lend itself to conventional testing (Kucklick, 2006), the complexity of software requires a systems engineering approach (Ryschkewitsch et al., 2009). At a recent workshop on infusion pumps, it was pointed out that the 510(k) process is mostly a checklist, and this checklist approach provides less assurance as devices exhibit increasingly complex system behavior (Chapman, 2010). Shuren (2010) provides an example of software-based medical devices that may operate safely and effectively in isolation, but not when integrated as a system:
Images produced by a CT scanner from one vendor were presented as a mirror image by another vendor’s picture archiving and communication system (PACS) web software. The PACS software vendor stipulates that something in the interface between the two products causes some images to be randomly “flipped” when displayed.
The NITRD report of the HCMDSS workshop (NITRD, 2009) notes that
Integrating technology into the clinical environment—which includes practitioners, workflows, and specific devices—often lacks a holistic, systems perspective. Many medical devices are designed, developed, and marketed largely as
individual systems or gadgets. Device integration, interoperability, and safety features are not considered during development, acquisition, or deployment.
The rapid push toward device interoperability, wireless communication, and Internet connectivity will likely improve the effectiveness of care but will also reinforce the notion of medical device software as systems rather than isolated devices. Because medical devices are no longer isolated devices, an effective strategy for increasing trustworthiness is to follow good systems engineering methodology.
Evaluation of medical device software should require independent, third-party review by experts who are not connected with the manufacturer. Third-party evaluation in combination with good systems engineering can mitigate many of the system-level risks of medical device software.
Mitigate Risks Due to Human Factors
Poor understanding of human factors can lead to the design of medical device software that reinforces risky behavior, which can result in injury or death. For instance, a software application card used in an implantable drug pump was recalled because of a user interface where the hours and minutes fields for a bolus rate were ambiguously labeled on the computer screen (FDA, 2004a). A patient with an implantable drug pump died from an overdose because the health care professional set the bolus interval to 20 minutes rather than 20 hours (FDA, 2004b). Thus, the drug was administered at 60 times the desired rate. The patient passed out while driving, experienced a motor vehicle accident, and later died after the family removed life support.
Unmitigated risks of human factors also contributed to the recent radiation overdoses of patients treated by linear accelerators. One report from the New York Times (Bogdanich and Ruiz, 2010) quotes Dr. James Thrall, professor of radiology at Harvard Medical School and chairman of the American College of Radiology, saying, “There is nothing on the machine that tells the technologist that they’ve dialed in a badly incorrect radiation exposure.”
Medical device software must accommodate inevitable human errors without affecting patient safety. Moreover, the specification of requirements should take into account all the key stakeholders. For instance, it is believed that some infusion pump manufacturers specify requirements based mostly on interactions with physicians rather than the primary operators of the pump: nurses. When nurses become disoriented and frustrated using infusion pumps, operational problems can result. Inadequate attention to human factors during specification of requirements will promote hazardous situations.
Mitigate Low-Probability, High-Consequence Risks
Manufacturers, health care professionals, and users often put too much confidence in medical device software. “It can’t happen here.” “There are no reported problems.” Such statements have only a shallow basis in fact, but lead to a false sense of security. The manufacturer of the Therac-25 linear accelerator, which killed and injured a number of patients with radiation overdoses, initially responded to complaints from treatment facilities by saying that “the machine could not possibly over treat a patient and that no similar complaints were submitted to them” (Leveson and Turner, 1993; Leveson, 1995; Faris, 2006). It is very difficult to reproduce problems in software—often leading to denial rather than discovery of root causes. This difficulty derives in part from the complexity of a device’s system-of-systems architecture and from the embedded nature of the system.
Security and privacy fall into the category of low-probability, high-consequence risks that could lead to widespread problems with little or no warning. Problems range from downtime to intentional harm to patients. Because devices can easily connect with physically insecure infrastructure such as the Internet and because software vulnerabilities (see Sidebar 2) are
Sidebar 2. Medical devices are susceptible to malware.
Medical devices are no more immune to malware (e.g., viruses, botnets, and keystroke loggers) than any other computer. Computer viruses can delete files, change values, expose data, and spread to other devices. A computer virus does not distinguish between a home computer and a hospital computer. Yet in the health care setting, the consequences of malicious software could lead to less effective care (e.g., corrupted electronic medical records that necessitate retesting) and diminished safety (e.g., overdoses from infusion pumps, radiation therapy, or implantable medical devices).
For these reasons, vendors may advise health care providers to install antivirus software with automated Internet-based updates. However, these products introduce risks that can themselves reduce the trustworthiness of the medical device software. When McAfee released an automated update of its virus definition files, the antivirus product incorrectly flagged a critical piece of Windows software as malicious—and quarantined the software (Leyden, 2010). This disruption of a critical file caused a number of hospitals to suffer downtime. Medical systems were rendered unavailable.
often discovered with little or no warning before threats exploit the vulnerability (Staniford et al., 2002), security and privacy outcome measures should play a central role in all major aspects of software development of medical device software (specification, design, human factors, implementation, testing, and maintenance).
Patients who receive treatment from a potentially lethal medical device should have access to information about its evaluation just as they have access to information about the side effects and risks of medications (NRC, 2007).
Specification of requirements should address low-probability, high-consequence risks. If a high-consequence risk proves too difficult or costly to mitigate, health care professionals deserve to know about the risks, no matter how small.
Innovations in wireless communication and computer networking have led to great improvements in patient care ranging from remote, mobile monitoring of patients (e.g., at-home monitors for cardiac arrhythmias or diabetes) to reduced risks of infection as a result of removing computer equipment from the sterile zone of an operating room (e.g., wireless wands for pacemaker reprogramming). However, the increased interconnectedness of medical devices leads to security and privacy risks for medical devices both in the hospital and in the home (Kilbridge, 2003; Fu, 2009; Maisel and Kohno, 2010). For instance, there is no public record of a specification that requires a home monitoring device to be physically incapable of reprogramming an implanted cardiac device. Thus, a malicious piece of software could change the behavior of a home monitor to quietly disable therapies or even induce a fatal heart rhythm—without violating the public specification.
POLICY RECOMMENDATIONS FOR TRUSTWORTHY MEDICAL-DEVICE SOFTWARE
Regulatory and economic policies should promote innovation while incentivizing trustworthiness in a least burdensome manner. One study of medical device recalls concludes that poor quality does not in general result in severe financial penalties for the affected company (Thirumalai and Sinha, 2010). The policy recommendations below focus on technical and managerial issues rather than financial penalties or incentives.
Specify Outcome Measures, Not Technology
The safety and effectiveness of software-based medical devices could be better regulated in terms of outcome measures rather than in terms of
specific technologies. The regulatory infrastructure should aim at making industry meet meaningful goals and show proof of achieving such goals.
The push toward prescriptive standards leads to an oversimplification in that the trustworthiness of a device depends on context. For example, one FDA notice advises to “update your operating system and medical device software” (FDA, 2009). However, software updates themselves can carry risks that should be either accepted or mitigated depending on the situation specific to each medical device. On a desktop computer used to update a portable automated external defibrillator (AED), it might be reasonable to routinely update the operating system even if there is a risk that the update may fail in a manner that makes the desktop machine inoperable. However, updating the operating system on the defibrillator itself carries a risk that failure could render the AED inoperable. A hospital that updates all its devices simultaneously is vulnerable to systemwide inability to provide care.
Rather than prescribe specific technologies, regulatory policies should incentivize manufacturers to specify meaningful outcome measures in the context of the given device and be required to prove such claims. Lessons from evidence-based medicine (IOM, 2007) could assist in creating outcome measures for trustworthy medical device software.
Collect Better Statistics on the Role of Software in Medical Devices
Many questions about the trustworthiness of medical device software are difficult to answer because of lack of data and inadequate record keeping. Questions include the following:
To what degree are critical device functions being performed by software (vs hardware)? Is the amount increasing? Decreasing?
What effect does software have on reliability? Availability? Maintainability? Ease of use?
How do these software characteristics compare with similar implementations in hardware? Does the software make the device safer or more effective?
What do data on the predicate device reveal about the new device? Do predicate data save time in specification of the new device? Do predicate data save time in testing of the new device?
Many record-keeping tools are already in place (e.g., the MAUDE adverse event database and the recalls database at FDA), but these tools are severely underutilized and suffer from underreporting. For example, during the same 2002–2010 period, MAUDE contains only 372 adverse event reports citing “computer software issues,” despite well over 500 entries in the recall database citing software as a reason for the recall. In the Department of Veterans Affairs, “over 122 medical devices have been compromised by malware over the last 14 months,” according to House testimony (Baker, 2010). Yet MAUDE contains no records citing a “computer system security issue.”
Scott Bolte of GE Healthcare emphasizes that for security problems, formal reporting is especially lacking (Bolte, 2005):
Although there is a lot of anecdotal evidence that malicious software has compromised medical devices, there is a notable lack of formal evidence. So without this formal reporting, FDA is limited in its ability to act or intervene. Reporting is something providers and arguably the manufacturers themselves can and should start doing immediately.
Policies should encourage better reporting of adverse events and recalls. Otherwise it will be possible only to point out anecdotal failures, rather than to identify with confidence the trends behind successful products that epitomize innovation in trustworthy medical device software.
Enable Open Research in Software-Based Medical Devices
The highly proprietary nature of the medical device software industry makes it difficult for innovators to build upon the techniques behind well-engineered systems. Some information may become public after an accident, but that information teaches about failure rather than success. More open access to success stories in engineering medical device software would spur innovation of safer and more effective devices. The NITRD report (2009) explains:
Today we have open-research platforms that provide highly effective support for the widespread dissemination of new technologies and even the development of classified applications. The platforms also provide test beds for collaborations involving both researchers and practitioners. One spectacular example is the Berkeley Motes system with the TinyOS operating system.
The medical-device community could benefit from the existence of such open-research platforms. They would enable academic researchers to become engaged in directly relevant problems while preserving the need for proprietary development by the industry. (TinyOS facilitates academic input even on government-classified technology, which is an example of what is possible.)
An open research community needs to be established comprising academics and medical device manufacturers to create strategies for
the development of end-to-end, principled, engineering-based design and development tools.
Clearly Specify Roles and Responsibility
In complex systems of systems that rely on software, it is difficult to pinpoint a single party responsible for ensuring the trustworthiness of software, because trustworthiness is a property of the system of systems rather than of its individual components. A modern linear accelerator is an example of a complex system of systems: commercial off-the-shelf (COTS) software such as Windows may serve as the underlying operating system for a separately engineered software application that plans and calculates dose distribution, and an embedded software system then uses the treatment plan to control the mechanical components that deliver radiation therapy to a patient. When different entities separately manage software components in such systems, system-level properties such as safety are more difficult to ensure because no single entity is responsible for overall safety.
The FDA notes that a key challenge is shared responsibility for failures in software (FDA, 2009). If the user updates the software on a medical device, is the manufacturer truly at fault? If a medical device relies on third-party software such as an operating system, who is responsible for maintaining that software?
Technology alone is unlikely to mitigate risks that stem from system-level interactions of complex software designed by different organizations with different agendas and outcome measures. The problem is probably intractable without a single authority responsible for the trustworthiness of interfaces between interacting systems. The interface between medical device application software and COTS software is a common battleground for disclaimers of responsibility (see Sidebar 3).
Leveson (1995) points out that diffusion of responsibility and authority is an ineffective organizational structure that can have disastrous effects when safety is involved. The British Ministry of Defence (2007) provides a good example of clear roles and responsibilities for safety management of military systems. The ideas apply broadly to critical systems and may work well for medical systems.
Clarify the Meaning of Substantial Equivalence for Software
In the context of the 510(k) pre-market notification process, demonstration of “substantial equivalence” to a previously approved “predicate” medical device allows a manufacturer to more quickly seek approval to market a medical device (see Sidebar 4).
Imagine that the predicate device has a function implemented in hardware, and the manufacturer claims that the new version is substantially equivalent because the only difference is that the new version is implemented in software. Because hardware and software exhibit significantly different behavior, it is important that the design, implementation, testing, human factors analysis, and maintenance of the new device mitigate the risks inherent in software. But this very difference casts doubt on substantial equivalence: the different technological characteristics raise different risks to safety and effectiveness. Furthermore, when does a software-related flaw in a recalled predicate device imply that the same flaw exists in the new device?
As was noted at the Institute of Medicine Workshop on Public Health Effectiveness of the FDA 510(k) Clearance Process held in June 2010, there is doubt as to whether hardware can act as a predicate for functions implemented in software. Dr. David Feigal, former director of the FDA Center for Devices and Radiological Health (CDRH), said that “one of the interesting classes is radiation equipment … even the software, which I wonder where they got the first predicate for software” (IOM, 2010). The interpretation of substantial equivalence needs clarification for software-based medical devices.
Sidebar 3: Take Service Pack 3 and see me in the morning.
Medical devices can outlast the underlying operating system software. Many medical devices rely on commercial off-the-shelf (COTS) software, but COTS software tends to have a shorter expected lifetime than a typical medical device. For instance, Microsoft mainstream support for Windows XP lasted less than 8 years (December 2001–April 2009) (Microsoft, 2003), whereas an MR scanner may have an operational life of 10–20 years (Bolte, 2005).
It is not uncommon for a newly announced medical device to rely on an operating system no longer supported by its manufacturer. Microsoft ended support for security patches for Windows XP Service Pack 2 and advises vendors to upgrade products to Service Pack 3. But hospitals often receive conflicting advice on whether to update software. House testimony (Joffe, 2009) mentions that
As a sobering side-note, over the last three weeks, in collaboration with a researcher from Georgia Tech in Atlanta who is involved with the Conficker Working group, I have identified at least 300 critical medical devices from a single manufacturer that have been infected with Conficker. These devices are used in large hospitals, and allow doctors to view and manipulate high-intensity scans (MRI, CT Scans etc), and are often found in or near ICU facilities, connected to local area networks that include other critical medical devices.
Worse, after we notified the manufacturer and identified and contacted the hospitals involved, both in the US and abroad, we were told that because of regulatory requirements, 90 days notice was required before these systems could be modified to remove the infections and vulnerabilities.
Users of medical picture archiving and communication systems (PACS) struggle with conflicting requirements: medical device manufacturers require health care facilities to use old, insecure operating systems, while FDA guidelines advise keeping operating systems up-to-date with security patches. One anonymous posting on a technical support Web site (Windows Client Tech Center, 2008) reads:
I am setting up a medical imaging facility and I am trying to do the same thing as well. The PACS system we are integrating with is only compatible with SP2. I order 6 new Dell workstations and they came preloaded with SP3. There are “actual” versions of programs out there that require SP2. For instance, the $250,000 Kodak suite I am installing. Plus a $30,000/yr service contract. This holds true for the majority of the hospitals which have PACS systems.
However, if what you are saying is true then I found something useful within your post. You stated “if you installed XP with integrated sp3, it is not possible to downgrade sp3 to sp2,” is this true? Do you have any supporting documentation as this would be very helpful so that I can provide Dell with a reason why I need to order downgraded XP discs.
The plaintive quality of this call for help shows how helpless some users feel because of the diffusion of responsibility for maintaining the COTS software contained within medical devices.
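The lifetime mismatch described in the sidebar above can be made concrete with a small, illustrative calculation. The XP support dates follow the report's figures; the purchase date and the 15-year service life are assumptions chosen for illustration:

```python
from datetime import date

# Figures from the report: Windows XP mainstream support ran
# December 2001 to April 2009; an MR scanner may operate 10-20 years.
os_support_start = date(2001, 12, 31)
os_support_end = date(2009, 4, 14)

support_years = (os_support_end - os_support_start).days / 365.25

# A hypothetical scanner purchased near the end of the OS support window
# and kept in service for a mid-range 15-year operational life:
device_purchased = date(2008, 6, 1)
device_retired = device_purchased.replace(year=device_purchased.year + 15)
unsupported_years = (device_retired - os_support_end).days / 365.25

print(f"OS mainstream support lasted {support_years:.1f} years")
print(f"Device runs on an unsupported OS for roughly {unsupported_years:.1f} years")
```

Under these assumptions, the device spends most of its operational life on an operating system that no longer receives security patches.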
Increase FDA Access to Outside Experts in Software Engineering
The FDA should increase its ability to maintain safety and effectiveness of medical devices by developing a steady pipeline of human resources with expertise in software engineering for critical systems.
Various offices within FDA’s CDRH employ a small number of software experts. FDA also has a number of successful fellowship programs, including the Commissioner’s Fellowship Program, the Medical Device Fellowship Program, and the Device Evaluation Intern Program, to attract students and experienced experts from the medical and scientific communities. However, software experts are notably underrepresented in these programs. The Web page for the Medical Device Fellowship Program targets health professionals, and other existing programs primarily target biomedical engineers rather than software engineers. Of the fifty Commissioner’s Fellows selected in 2009, none had formal training in computer science. In 2008, one of the fifty fellows had a computer science degree but did not work in CDRH. A former FDA manager indicated that software experts rarely participate in these fellowship programs. Another person familiar with FDA processes noted that seldom does an FDA inspector assigned to review a 510(k) application have experience in software engineering, even though the majority of medical devices today rely on software.
Sidebar 4: Substantial equivalence: paper or plastic?
An interesting thought experiment is to ask how the trustworthiness of electronic health records differs from that of traditional paper records. FDA generally does not consider a paper medical record to be a medical device, but it may consider an electronic health record to be one. Adding automated algorithms that prioritize the display of data from an electronic medical record would shift the system toward regulation as a medical device.
Paper records are subject to threats such as fire, flood, misplacement, incorrect entry, and theft. They are cumbersome to back up and require large storage rooms. But electronic records introduce risks qualitatively different from those of paper records. Making changes to a paper record tends to leave behind physical evidence that is auditable, whereas making electronic records auditable requires intentional design. A single coding error or errant key press could destroy an entire collection of electronic records, especially encrypted data. The speed of technology can make electronic record keeping easier, but it can also encourage bad habits that lead to difficult-to-detect mistakes. For instance, a computer display that clears the screen after an operation completes makes it difficult to trace back a sequence of changes. Overconfidence in software for electronic medical records could lead to financially motivated decisions to discontinue paper-based backup systems. One full-scale failure of a clinical computing system at the Beth Israel Deaconess Medical Center lasted four days, forcing the hospital to revert to manual processing of paper records (Kilbridge, 2003). While paper-based backup procedures allowed care to continue, few of the medical interns had any experience with writing orders on paper. When health care professionals struggle with technology, patients are at risk.
Heated debates about paper versus electronic records appear in other contexts, such as voting. A National Academies report (NRC, 2005) provides context for the electronic voting debate with arguments applicable to the safety and effectiveness of electronic medical records.
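Sidebar 4 observes that paper alterations leave physical evidence, while electronic auditability must be designed in. One well-known way to design it in is a tamper-evident log in which each entry embeds the hash of its predecessor. The sketch below is purely illustrative, not a production design; the field names and clinical details are invented:

```python
import hashlib
import json

def append_entry(log, action, user, detail):
    """Append a tamper-evident entry to an audit log.

    Each entry embeds the SHA-256 hash of the previous entry, so any
    later modification of an earlier record breaks the chain.
    """
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry = {"action": action, "user": user,
             "detail": detail, "prev_hash": prev_hash}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(entry)
    return entry

def verify_chain(log):
    """Recompute every hash; return False if any record was altered."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, "create", "dr_smith", "order: 5 mg warfarin")
append_entry(log, "amend", "dr_jones", "order: 2.5 mg warfarin")
assert verify_chain(log)

log[0]["detail"] = "order: 50 mg warfarin"  # retroactive tampering
assert not verify_chain(log)
```

Unlike a paper chart, where erasures are visible by accident of the medium, the electronic log detects tampering only because the chaining was designed in from the start.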
The FDA should expand its access to outside experts for medical device software by creating fellowship programs that target software engineers. For instance, FDA could more aggressively recruit students and faculty from computer science and engineering—especially individuals with advanced training in software engineering topics such as system and software safety, dependable computing, formal methods, formal verification, and trustworthy computing.
The lack of trustworthy medical device software leads to shortfalls in safety and effectiveness, which are inextricably linked with properties such as usability, dependability, reliability, security, privacy, availability, and maintainability. Many risks of medical device software could be mitigated by applying well-known systems engineering techniques, especially during specification of requirements and analysis of human factors. Today, the frequency of news reports on tragic, preventable accidents involving software-based medical devices falls somewhere between that of automobile accidents and that of airplane accidents. Given the continued increase in the complexity of medical device software and present-day regulatory policies that do not adequately encourage modern software engineering and systems engineering practices, reporting of tragic medical device accidents is likely headed toward the frequency of the former.
Acknowledgments
Several individuals deserve thanks for assistance with this report. Shari Lawrence Pfleeger provided extensive feedback and contributed material on threats and on why software behaves differently from hardware. John Knight provided feedback and contributed material on modern software engineering techniques. Philip Henneman, Daniel Jackson, Leon Osterweil,
and Jerry Saltzer provided feedback on early drafts of this report. Paul L. Jones, Nancy Leveson, John F. Murray Jr., and Gideon Yuval provided further technical input. Quinn Stewart collected data pertaining to FDA medical device recalls. Hong Zhang assisted with formatting of the report. Heather Colvin and Abigail Mitchell at the Institute of Medicine deserve special thanks for organizing a successful series of workshops on the Public Health Effectiveness of the FDA 510(k) Clearance Process.
References
Baker, R.W. 2010. Statement of the Honorable Roger W. Baker. House Committee on Veterans’ Affairs, Subcommittee on Oversight and Investigations, Hearing on Assessing Information Security at the US Department of Veterans Affairs.
BBC News. 2005. Hospital struck by computer virus. [cited 10/29/2010] Available from http://news.bbc.co.uk/2/hi/uk_news/england/merseyside/4174204.stm.
Bliznakov, Z., G. Mitalas, and N. Pallikarakis. 2006. Analysis and classification of medical device recalls. World congress on medical physics and biomedical engineering 2006, 14(6):3782–3785.
Bogdanich, W. 2010a. As technology surges, radiation safeguards lag. The New York Times [cited 10/29/2010] Available from http://www.nytimes.com/2010/01/27/us/27radiation.html.
Bogdanich, W. 2010b. Radiation offers new cures, and ways to do harm. The New York Times [cited 10/29/2010] Available from http://www.nytimes.com/2010/01/24/health/24radiation.html.
Bogdanich, W., and R. Ruiz. 2010. FDA to increase oversight of medical radiation. The New York Times [cited 10/29/2010] Available from http://www.nytimes.com/2010/02/10/health/policy/10radiation.html.
Bolte, S. 2005. Cybersecurity of medical devices [cited 10/29/2010] Available from http://www.fda.gov/MedicalDevices/Safety/MedSunMedicalProductSafetyNetwork/ucm127816.htm.
Chapman, R. 2010. Assurance cases for external infusion pumps. FDA Infusion Pump Workshop [cited 10/29/2010] Available from http://www.fda.gov/downloads/MedicalDevices/NewsEvents/WorkshopsConferences/UCM217456.pdf.
Faris, T.H. 2006. Safe and Sound Software: Creating an Efficient and Effective Quality System for Software Medical Device Organizations. ASQ Quality Press.
FDA (US Food and Drug Administration). 2004a. Medtronic announces nationwide, voluntary recall of model 8870 software application card. [cited 10/29/2010] Available from http://www.fda.gov/MedicalDevices/Safety/RecallsCorrectionsRemovals/ListofRecalls/ucm133126.htm.
FDA. 2004b. Neuro n’vision programmer. MAUDE Adverse Event Report [cited 10/29/2010] Available from http://www.accessdata.fda.gov/SCRIPTs/cdrh/cfdocs/cfMAUDE/Detail.cfm?MDRFOI__ID=527622.
FDA. 2007. Baxter healthcare pte. ltd. colleague 3 cxe volumetric infusion pump 80frn. MAUDE Adverse Event Report. [cited 10/29/2010] Available from http://www.accessdata.fda.gov/SCRIPTs/cdrh/cfdocs/cfMAUDE/Detail.cfm?MDRFOI__ID=914443.
FDA. 2009. Reminder from FDA: Cybersecurity for networked medical devices is a shared responsibility [cited 10/29/2010] Available from http://www.fda.gov/MedicalDevices/Safety/AlertsandNotices/ucm189111.htm.
FDA. 2010. Infusion pump improvement initiative [cited 10/29/2010] Available from http://www.fda.gov/medicaldevices/productsandmedicalprocedures/GeneralHospitalDevicesandSupplies/InfusionPumps/ucm205424.htm.
Flournoy, V.A. 2010. Medical device recalls. [cited 10/29/2010] Available from http://www.fda.gov/downloads/MedicalDevices/NewsEvents/WorkshopsConferences/UCM216992.pdf
Fries, R.C. 2005. Reliable Design of Medical Devices, Second Edition. CRC Press.
Fu, K. 2009. Inside risks, reducing risks of implantable medical devices. Communications of the ACM, 52(6).
Graham, D.R. 1992. London ambulance service computer system problems. Risks Digest, 13(38) [cited 10/29/2010] Available from http://catless.ncl.ac.uk/Risks/13.38.html.
Halperin, D., T.S. Heydt-Benjamin, B. Ransford, S.S. Clark, B. Defend, W. Morgan, K. Fu, T. Kohno, and W.H. Maisel. 2008. Pacemakers and implantable cardiac defibrillators: Software radio attacks and zero-power defenses. In Proceedings of the 29th Annual IEEE Symposium on Security and Privacy.
IOM (Institute of Medicine). 2007. The Learning Healthcare System: Workshop Summary. IOM Roundtable on Evidence-Based Medicine. Washington, DC: The National Academies Press.
IOM. 2010. Public Health Effectiveness of the FDA 510(k) Clearance Process: Balancing Patient Safety and Innovation. Edited by T. Wizemann. Washington, DC: The National Academies Press.
Jackson, D. 2009. A direct path to dependable software. Communications of the ACM, 52(4):78–88.
Jayaswal, B.J., and P.C. Patton. 2007. Design for Trustworthy Software. Prentice Hall, Pearson Education.
Joffe, R. 2009. House Committee on Energy and Commerce Subcommittee on Communications, Technology and the Internet. Testimony of Rodney Joffe, May 1 2009, [cited 10/29/2010] Available from http://energycommerce.house.gov/Press_111/20090501/testimony_joffe.pdf.
Kilbridge, P. 2003. Computer crash–lessons from a system failure. New England Journal of Medicine, 348(10):881–882.
Kolata, G. 2010. New tools for helping heart patients. The New York Times [cited 10/29/2010] Available from http://www.nytimes.com/2010/06/22/health/22heart.html.
Kucklick, T.R. 2006. The Medical Device R&D Handbook. Taylor & Francis.
Lee, I., G. Pappas, R. Cleaveland, J. Hatcliff, B. Krogh, P. Lee, H. Rubin, and L. Sha. 2006. High-confidence medical device software and systems. Computer, 39(4):33–38.
Leveson, N.G., and C. Turner. 1993. An investigation of the Therac-25 accidents. IEEE Computer, 26(7):18–41.
Leveson, N.G. 1995. Safeware: System Safety and Computers. Addison-Wesley.
Leveson, N.G. 2000. Intent specifications: An approach to building human-centered specifications. IEEE Transactions on Software Engineering, 26:15–35.
Leyden, J. 2010. Rogue McAfee update strikes police, hospitals and Intel. [cited 10/29/2010] Available from http://www.theregister.co.uk/2010/04/22/mcafee_false_positive_analysis/.
Lorge Parnas, D. 1985. Software aspects of strategic defense systems. Communications of the ACM, 28(12):1326–1335.
Maisel, W. H. and T. Kohno. 2010. Improving the security and privacy of implantable medical devices. New England Journal of Medicine, 362(13):1164–1166.
Maisel, W. H., W.G. Stevenson, and L.M. Epstein. 2002. Changing trends in pacemaker and implantable cardioverter defibrillator generator advisories. Pacing and Clinical Electrophysiology, 25(12):1670–1678.
Meier, Barry. 2010. FDA steps up oversight of infusion pumps. The New York Times [cited 10/29/2010] Available from http://www.nytimes.com/2010/04/24/business/24pump.html.
Microsoft. 2003. Windows XP history. [cited 10/29/2010] Available from http://www.microsoft.com/windows/WinHistoryProGraphic.mspx.
Ministry of Defence. 2007. Ministry of defence standard 00-56: Safety management requirements for defence systems, June 2007. Issue 4.
Neumann, P.G. 2006. Risks of untrustworthiness. In 22nd Annual Computer Security Applications Conference (ACSAC).
NITRD (Networking and Information Technology Research and Development). 2009. High-confidence medical devices: Cyber-physical systems for 21st century health care. Networking and Information Technology Research and Development Program.
NRC (National Research Council). 2005. Asking the Right Questions About Electronic Voting. Washington, DC: The National Academies Press.
NRC. 2007. Software for Dependable Systems: Sufficient Evidence? Washington, DC: The National Academies Press.
NSF (National Science Foundation). 2010. Trustworthy computing. The National Science Foundation. [cited 10/29/2010] Available from http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503326.
Pfleeger, C.P., and S.L. Pfleeger. 2007. Security in Computing, 4th edition. Prentice Hall.
Pfleeger, S.L., L. Hatton, and C. Howell. 2001. Solid Software. Prentice Hall.
Ryschkewitsch, M., D. Schaible, W. Larson. 2009. [cited 10/29/2010] The art and science of systems engineering. NASA Monograph, Available from http://spacese.spacegrant.org/uploads/images/Art_and_Sci_of_SE.pdf.
Shuren, J. 2010. Testimony of Jeffrey Shuren, Director of FDA’s Center for Devices and Radiological Health. Health Information Technology (HIT) Policy Committee Adoption/Certification Workgroup.
Staniford, S., V. Paxson, and N. Weaver. 2002. How to own the internet in your spare time. In Proceedings of the 11th USENIX Security Symposium, pages 149–167.
Stewart, Q., and K. Fu. 2010. A second look: Analysis of FDA recalls of software-based medical devices. Manuscript, July 2010.
Svensson, P. 2010. McAfee antivirus program goes berserk, freezes PCs. Associated Press, [cited 10/24/2010] Available from http://abcnews.go.com/Technology/wireStory?id=10437730.
Thirumalai, S., and K.K. Sinha. 2010. Product recalls in the medical device industry: An empirical exploration of the sources and financial consequences. Accepted to Management Science, May 2010.
Tobin, D. 2010. University hospital computers plagued by anti-virus glitch. The Post-Standard, [cited 10/29/2010] Available from http://www.syracuse.com/news/index.ssf/2010/04/university_hospital_plagued_by.html.
Tompsett, B. 1992. Long call wait for London ambulances. Risks Digest, Volume 13, Issue 42, [cited 10/29/2010] Available from http://catless.ncl.ac.uk/Risks/13.42.html#subj8.
Wallace, D. R., and D. Kuhn. 2001. Failure modes in medical device software: an analysis of 15 years of recall data. International Journal of Reliability Quality and Safety Engineering, 8:351–372.
Windows Client Tech Center. 2008. Downgrade from SP3 to SP2. http://social.technet.microsoft.com/forums/en-US/itproxpsp/thread/5a63cf6d-f249-4cbe-8cd2-620f301ec6fc/ (accessed 2010).