Three requirements for acceptability that would allow advanced aerial mobility to grow beyond today's small scale are safety, cybersecurity, and their relationship to autonomy. While autonomy is not critical for achieving some types of urban air mobility (UAM)/advanced aerial mobility, some level of autonomy will be required when human control alone cannot assure safety: autonomy will be necessary to overcome the human factors problems that arise from operating complex, high-speed equipment in crowded environments. Furthermore, high levels of automation will most likely be required to achieve the anticipated economic benefits. To provide autonomy, challenges in assuring the safety and security of highly automated, and thus software-intensive, systems will have to be overcome.
Accidents and a perceived lack of safety in advanced aerial mobility will be a major challenge, potentially preventing acceptance of this technology. The hardware safety issues are comparatively easy to handle, as the hardware in this environment will bear many similarities to that used today in certificated systems, perhaps with upgrades in acoustics and electrical systems. Measures to manage safety in drone systems are in place today and are maturing. However, there are major challenges in ensuring acceptable safety and cybersecurity for manned vehicles.
System safety concerns not only the vehicles themselves but also the environments in which they will be used, including such things as collision avoidance, traffic management, and contingency management (e.g., to handle Global Positioning System or traffic management outages). Inability to meet these challenges will result not only in negative public perceptions of safety and unwillingness to participate but also in liability, insurance, legal, and other societal challenges that could prevent a UAM system from achieving traffic levels above those in existence today.
The committee heard about two approaches to ensuring safety while using autonomy: testing and simulation. Unfortunately, neither of these is sufficient for complex, software-intensive systems. The usual safety engineering approaches will need to be utilized and scaled to handle software and the types of systems envisioned.
Exhaustive testing of software is impossible. The problem can be explained by examining what “exhaustive” might mean in the domain of software testing, as follows:
- Inputs. The domain of possible inputs to a software system includes both valid and invalid inputs, potential time validity of inputs (i.e., an input may be valid at a certain time but not at other times), and all the possible sequences of inputs when the design includes history (which is almost all software). This domain is too large for testing to cover more than a very small fraction of the possible inputs in a realistic time frame.
- System states. Like the number of potential inputs, the number of states in these systems is enormous. For example, TCAS, an aircraft collision avoidance system, was estimated to have 10^40 possible states, and even today, after many years in service, problems are still being found and fixed.1 Note that collision avoidance is only one small part of the automation that will be required to implement autonomous (and even nonautonomous) vehicles.
- Coverage of the software design. Taking a simple measure of coverage like “all the paths through the software have been executed at least once during testing” involves enormous and impractical amounts of testing time and does not guarantee correctness, let alone safety.
- Execution environments. In addition to the problems listed so far, the execution environment becomes significant when the software outputs are related to real-world states that may change frequently, such as weather, temperature, altitude, pressure, and so on.
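The scale of the input-sequence problem in the list above can be made concrete with a back-of-the-envelope calculation. The sketch below is purely illustrative: the input alphabet size, history depth, and test rate are hypothetical values chosen only to show the combinatorics.

```python
# Illustrative only: why exhaustive input testing is infeasible.
# Assume (hypothetically) a system that accepts one of 100 distinct
# input values per step and whose behavior depends on the last 20
# inputs (history). The number of distinct input sequences is then:
num_values = 100          # hypothetical input alphabet size
history_length = 20       # hypothetical depth of input history
sequences = num_values ** history_length

# Suppose a test harness could run one million test cases per second.
tests_per_second = 1_000_000
seconds_per_year = 60 * 60 * 24 * 365
years_needed = sequences / (tests_per_second * seconds_per_year)

print(f"{sequences:.1e} sequences, ~{years_needed:.1e} years to test")
```

Even under these generous assumptions, the required test time exceeds the age of the universe by many orders of magnitude, which is why exhaustive input testing is dismissed as infeasible.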
In addition, even if it were possible to test the software exhaustively, virtually all accidents involving software stem from unsafe requirements.2,3 Testing can show only the consistency of the software with the requirements, not whether the requirements are flawed. While testing is important for any system, including software, it cannot be used as a measure or validation of acceptable safety.
All simulation depends on assumptions about the environment in which the system will execute. Autonomous cars have now been subjected to billions of test cases in simulators and have still been involved in accidents as soon as they are used on real roads. The problems described for testing apply here as well, but the larger problem is that accidents occur when the assumptions used in the simulation do not hold. Put another way, some accidents occur because of “unknown unknowns” in engineering design. There is no way to determine what the unknown unknowns are; thus, simulation can show only that the industry has handled the things it thought of, not the ones it did not think about.
The problem is not hopeless. System safety engineering has never depended exclusively on testing or simulation, so it is surprising that these two approaches are being suggested for advanced aerial mobility and other complex system development today. To handle all the states, even when an enormous number is involved, system safety engineers use modeling and analysis. An abstraction or model of the system is created, and that model is analyzed to ensure that the system it represents cannot get into a hazardous state.
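A minimal sketch of this modeling-and-analysis idea follows. The state names and transitions below are hypothetical, and real safety analysis tools handle vastly larger and richer models; the point is only that the analysis searches the model, not the software, for reachable hazardous states.

```python
from collections import deque

# Hypothetical model: the system as a finite transition graph. The
# analysis checks, by exhaustive search of the *model*, that no
# hazardous state is reachable from the initial state.
transitions = {
    "idle":    ["armed"],
    "armed":   ["climb", "idle"],
    "climb":   ["cruise"],
    "cruise":  ["descend"],
    "descend": ["idle"],
    # "uncommanded_descent" is the hazardous state in this toy model;
    # the search below confirms no transition path reaches it.
    "uncommanded_descent": [],
}
hazardous = {"uncommanded_descent"}

def reachable(start):
    """Breadth-first search over the model's state graph."""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# The model, as designed, cannot enter a hazardous state.
assert reachable("idle").isdisjoint(hazardous)
```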
Safety modeling and analysis tools have been used for the past 60 to 70 years in safety-critical systems. These traditional techniques have technical limitations, however, that need to be overcome: they were developed before computers were used in hazardous systems and do not work for today’s level of technical complexity, although people still try to use them, perhaps for lack of alternatives.
1 N.G. Leveson, M.P.E. Heimdahl, H. Hildreth, and J.D. Reese, 1994, Requirements specification for process-control systems, IEEE Transactions on Software Engineering SE-20(9).
2 N. Leveson, 1995, Safeware: System Safety and Computers, Addison-Wesley, Boston, Mass.
3 R. Lutz, 1993, Analyzing software requirements errors in safety-critical, embedded systems, Proceedings of the International Conference on Software Requirements, https://doi.org/10.1109/ISRE.1993.324825.
There are currently military operators and contractors that have accumulated significant experience with unmanned aerial systems, including operations alongside piloted aircraft. However, these aircraft often operate in restricted or highly controlled airspace (e.g., military ranges within the United States or contested territory overseas) and not over civilian areas. Military drones have an accident rate that would be totally unacceptable in civilian aviation or wherever human life is involved. The military has acknowledged the loss of hundreds of drones in the past 10 years out of a relatively small number of total flights compared to civilian aircraft. The number of accidents per thousand flight hours is a better measure than the absolute number of crashes and demonstrates that substantial improvements in reliability will be required for commercial drone operations.4
The first limitation of traditional hazard analysis tools is that they handle hardware but not software. Attempts to use the same models and analysis methods for software do not work because of the unique nature of software compared to hardware. Hardware fails in a probabilistic fashion. Software does not fail probabilistically. In addition, traditional safety analysis is based on a model of accident causality that assumes that accidents are caused by the failures of system components. This assumption is usually acceptable for purely hardware systems, but nobody is building systems today without software components.
Software does not fail like hardware. Instead, it almost always executes the instructions it was given.5 An accident results only if those instructions are unsafe in the environment in which the software is executing. It is not simply a matter of the software not satisfying its requirements; those requirements may even change over time. Safety for hardware can be adequately estimated by the reliability of the hardware, but the same is not true for software.
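The deterministic nature of software flaws can be illustrated with a toy example. Everything in the sketch below is invented for illustration (the function, the units mix-up, and the threshold): the code always does exactly what its requirement says, and it is the flawed requirement, not a random component failure, that makes it unsafe when the environment violates the requirement's assumption.

```python
# Hypothetical sketch: software "failures" are deterministic design
# flaws, not random events. This climb command is computed correctly
# per its (flawed) requirement, which assumed altitude is always
# reported in feet; fed meters, it repeatably misbehaves.
def target_climb_rate(reported_altitude, floor_ft=1000):
    # Requirement (flawed) assumed `reported_altitude` is in feet.
    return 500 if reported_altitude < floor_ft else 0

# Behaves as specified when the environment supplies feet...
assert target_climb_rate(1500) == 0
# ...but deterministically misbehaves when it supplies meters
# (500 m is about 1640 ft, yet a climb is still commanded) --
# the same wrong answer on every single run, unlike a hardware fault:
assert target_climb_rate(500) == 500
```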
The unique nature of software essentially reduces the software safety problem to the safety of the software requirements provided to the programmers. Showing the consistency of the requirements with their implementation in the software instructions can be handled using standard software engineering approaches. However, as noted, virtually all software-related accidents can be traced to unsafe requirements and related software requirements flaws. There needs to be a way of generating safe requirements or at least validating that the ones provided will result in a safe system.
One implication of the criticality of requirements is that safety must be built into the system from the beginning of development. Starting with a potentially unsafe set of requirements and relying on after-the-fact assurance will not be effective: unsafe requirements are guaranteed to produce unsafe software, and changing the requirements late in development is impractical because of the enormous cost. The bottom line is that it is not possible to insert a property, like safety, into a system that does not already have it. Creating software that is safe from the beginning of development is a prerequisite for ensuring acceptable risk in the final overall system and its components.
Because of the important distinctions between the systems for which traditional system safety engineering approaches were created and those being built today, which depend on software to a large degree, a paradigm change will be required in safety engineering. The nature of the paradigm change has been described,6 and the first tools to deal with software-intensive systems have been created and are being used successfully to create safer systems. The greater efficacy of the new tools over the traditional techniques has been proven both theoretically and empirically, but extending these new modeling and analysis tools to handle the increasingly complex systems being considered, including UAM, will require research and either extensions to the tools available today or new tools.
Finding: Testing and simulation alone are not adequate to ensure safety in complex, software-intensive systems like UAM.
4 See, for example, Wikipedia, “List of UAV-Related Incidents,” https://en.wikipedia.org/wiki/List_of_UAV-related_incidents (for civilian accidents); J. Judson, 2018, “These Two Drones Are Leaders in Accident Rates. How Is the US Army Responding?,” DefenseNews.com, April 25, https://www.defensenews.com/digital-show-dailies/aaaa/2018/04/25/these-two-drones-are-leaders-in-accident-rates-how-is-the-us-army-responding/; A. Susini, 2015, “A Technocritical Review of Drones Crash Risk Probabilistic Consequences and its Societal Acceptance,” pp. 27-38 in RIMMA Risk Information Management, Risk Models, and Applications, Lecture Notes in Information Sciences, Vol. 7, https://www.researchgate.net/publication/291697791_A_Technocritical_Review_of_Drones_Crash_Risk_Probabilistic_Consequences_and_its_Societal_Acceptance; C. Cole, 2019, “Accidents Will Happen: A Dataset of Military Drone Crashes,” Drone Wars, June 9, https://dronewars.net/2019/06/09/accidents-will-happen-a-dataset-of-military-drone-crashes/.
5 The exception occurs when the computer hardware, on which the software is executing, experiences a failure. This case is easily handled using redundancy and standard reliability techniques and is not considered further here.
6 N. Leveson, 2012, Engineering a Safer World, MIT Press, Cambridge, Mass.
Finding: Traditional hazard analysis and safety engineering modeling and analysis tools do not apply to systems that include software for control. New types of tools will be required. Simply trying to extend existing tools to include analysis of software will not work.
Finding: The National Aeronautics and Space Administration (NASA), in coordination with the Federal Aviation Administration (FAA), could provide education on the need for new approaches beyond testing and simulation to the advanced aerial mobility development community.
Recommendation: In coordination with the FAA, NASA should support research on new safety analysis tools, more powerful than those widely used today, that can be applied to software-intensive advanced systems.
Because of the dependence of advanced aerial mobility on software, cybersecurity will be a potential critical vulnerability. While cybersecurity efforts in the past have focused primarily on information security and privacy, the safety-critical element here changes the consequences and amplifies the challenge. For example, the cybersecurity challenge in advanced aerial mobility is not to prevent the theft of information from vehicles or passengers but to prevent outsiders from making the system and software behave unsafely.
Advanced aerial mobility faces several cybersecurity concerns: threats to onboard networks and code, attacks on vehicle/air traffic control (ATC) datalinks, and introduction of adversarial or incorrect data potentially used for safety-critical decisions and/or machine learning. Research in cybersecurity for onboard networks and traditional flight software is required to improve automated analysis and testing so as to reduce software and data handling costs. Datalink security will require diversity and redundancy in communication links, along with new strategies for incorporating cutting-edge cryptography into living standards capable of assuring data authenticity despite evolving network attacks. Research is also needed to recognize and minimize the impacts of adversarial training examples in learning systems7 capable of adapting to new or unexpected percepts or data sets.
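As one illustration of the data-authenticity element mentioned above, a keyed message authentication code lets a receiver detect a tampered datalink report. This is a minimal sketch using Python's standard hmac module; the key, message format, and scenario are hypothetical, and a real datalink standard must also address key management, replay protection, and algorithm agility, none of which are shown here.

```python
import hashlib
import hmac

# Hypothetical pre-shared key -- real systems need proper key management.
SHARED_KEY = b"example-key-not-for-real-use"

def tag(message: bytes) -> bytes:
    """Compute a keyed authentication tag for a datalink message."""
    return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()

def verify(message: bytes, received_tag: bytes) -> bool:
    """Constant-time check that the message matches its tag."""
    return hmac.compare_digest(tag(message), received_tag)

# Hypothetical position report and its tag, as sent over the datalink:
report = b"N123AB,lat=42.3601,lon=-71.0589,alt=1200"
t = tag(report)

assert verify(report, t)            # authentic report accepted
assert not verify(report + b"0", t) # altered report rejected
```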
It is almost impossible to keep hackers out of any system today, and advanced aerial mobility systems will not be an exception. In terms of vulnerability, advanced aerial mobility will depend on the operation of other complex software-intensive systems such as ATC, the Global Positioning System, and various types of shared communication systems. If advanced aerial mobility becomes an important infrastructure component in the United States, adversaries will find it a tempting target in many attack scenarios.
As with modeling and designing in safety, new approaches to cybersecurity are required to make advanced aerial mobility a success. The paradigm changes that have been proposed for ensuring safety are also applicable to cybersecurity, but again research and development is needed.
As with safety, traditional testing and simulation alone are not adequate to ensure cybersecurity in complex, software-intensive systems like advanced aerial mobility. A system that learns and adapts its behavior according to experience cannot be made secure by current techniques. New techniques are required, and NASA is a capable research agency that can support the FAA in developing them.
Finding: Current cybersecurity approaches that rely on threat analysis, maintaining impenetrable boundaries, and focusing primarily on information security will not be adequate for UAM missions involving learning and adaptive platforms.
Finding: Current airworthiness hardware and software cybersecurity techniques do not accommodate advanced aerial mobility platforms.
7 See X. Yuan, P. He, Q. Zhu, and X. Li, 2019, Adversarial examples: Attacks and defenses for deep learning, IEEE Transactions on Neural Networks and Learning Systems 30(9): 2805-2824.
Finding: NASA has initiated research into the area of complex autonomous systems to include leveraging of cybersecurity-related investigations performed by other agencies (April 2018 System-Wide Safety Project Plan, Section 2.1.4). The committee believes this is important research.
Recommendation: NASA should conduct research and development on cybersecurity for advanced aerial mobility systems.
Recommendation: Working with the FAA certification experts, NASA should develop potential software and hardware certification techniques and guidelines to verify and validate the performance of complex software and hardware, including nondeterministic functionality. This NASA research into methods to demonstrate performance will provide valuable input to the FAA, including material for advisory circulars, to help applicants in the certification process.
Notably, other government agencies, like the Defense Advanced Research Projects Agency, have also conducted work on complex autonomous systems, and significant work is being done in the private sector. NASA may learn from these other efforts and find potential partners for its own.
Advanced aerial mobility is expected to require airspace access for increasingly autonomous systems. Contingency response may be required when vehicle or infrastructure systems fail, environmental conditions are hazardous, passengers are distressed or disruptive, or special events require real-time rerouting. At the vehicle level, simplified vehicle operations or fully autonomous operations necessitate increasingly autonomous contingency management. At the traffic management level, increased traffic densities, route/mission complexities, and the need for new and novel contingency management necessitate autonomous traffic deconfliction and system-level contingency management.
Secure datalink is essential for advanced aerial mobility since voice-based communication has low bandwidth and introduces a multiple-second delay from event occurrence, to announcement on frequency, to comprehension by the human recipient. With secure datalink, real-time positions and velocities will enable software to rapidly identify and resolve conflicts in nominal and complex multivehicle route geometries that are impossible for human controllers to mentally model and manage. Air traffic contingency management will be required whenever unexpected bad weather is encountered, system elements fail or are attacked, or non-cooperative air traffic enters an airspace region normally occupied by cooperative traffic. Autonomous two-vehicle deconfliction and redundant datalinks are on the immediate horizon for manned aircraft and unmanned air systems. Weather and wind observations and forecasts improve each year. Autonomous multivehicle traffic deconfliction and increasingly resilient datalink systems are key technology needs for safe advanced aerial mobility in densely populated airspace.
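The kind of software conflict identification described above can be sketched, under strong simplifying assumptions, as a closest-point-of-approach calculation. The sketch below assumes just two vehicles, constant velocities, flat two-dimensional geometry, and a hypothetical separation threshold; an operational deconfliction system must handle many vehicles, maneuvering trajectories, and uncertainty.

```python
# Illustrative sketch: with real-time positions and velocities from a
# datalink, pairwise conflict detection reduces to a closest-point-of-
# approach (CPA) computation. 2D, constant-velocity assumption.
def cpa(p1, v1, p2, v2):
    """Return (time, distance) of closest approach for two vehicles."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]  # relative position
    wx, wy = v2[0] - v1[0], v2[1] - v1[1]  # relative velocity
    w2 = wx * wx + wy * wy
    # Time of closest approach (clamped to the future):
    t = 0.0 if w2 == 0 else max(0.0, -(rx * wx + ry * wy) / w2)
    dx, dy = rx + wx * t, ry + wy * t
    return t, (dx * dx + dy * dy) ** 0.5

# Two vehicles 1000 m apart, converging head-on at 50 m/s each:
t, d_min = cpa((0, 0), (50, 0), (1000, 0), (-50, 0))
# A (hypothetical) 150 m separation threshold would flag this conflict
# 10 s in advance, with predicted miss distance of 0 m:
assert d_min < 150
```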
A typical contingency management sequence requires initial detection or perception of a problem or an anomaly. An inference or a decision process triggered by the perceived problem leads to an action aimed at remediating or gracefully degrading in a manner that consistently maintains an acceptable level of risk. Perception systems have to detect situations in which contingency response might be required and have to balance missed detection versus false alarm risks. Traditionally, human pilot perception has been relied upon for contingency management based on feedback from aircraft automation and the environment. Advanced aerial mobility will require autonomous contingency management. Autonomous systems are distinct from traditional automation in their authority to make decisions and take necessary action without human oversight. Autonomous system authority is essential for advanced aerial mobility to assure risks are mitigated in time to restore a safe flight operational state despite the absence of a highly qualified onboard flight crew. Contingency management autonomy has to select and execute mitigation actions accurately and without vehicle-level loss of control, collision with other aircraft or obstacles/terrain, or unnecessary disruption to other air traffic or traffic management services. Contingency management autonomy will also need to integrate effectively with human system participants and evolve gracefully from legacy systems.
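The detect-infer-act sequence described above can be sketched as follows. The detection threshold, battery margin, and action names below are hypothetical, chosen only to show the structure; a real system would involve far richer perception, inference, and risk models.

```python
# Hedged sketch of a contingency management sequence:
# detect an anomaly, infer a decision, act to degrade gracefully.
def detect(sensor_residual, threshold=3.0):
    """Flag an anomaly when the sensor residual exceeds a threshold.

    A lower threshold catches more real anomalies (fewer missed
    detections) at the cost of more false alarms -- the balance
    the perception system must strike.
    """
    return sensor_residual > threshold

def decide(anomaly, battery_margin):
    """Map the perceived situation to a graceful-degradation action."""
    if not anomaly:
        return "continue"
    # Prefer the least disruptive action that keeps risk acceptable:
    if battery_margin > 0.2:
        return "divert_to_alternate"
    return "land_immediately"

assert decide(detect(1.2), battery_margin=0.5) == "continue"
assert decide(detect(4.7), battery_margin=0.5) == "divert_to_alternate"
assert decide(detect(4.7), battery_margin=0.1) == "land_immediately"
```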
Finding: Due to the expected increase in number of aircraft operations per day and an observed steady-to-decreasing pilot training pipeline, autonomy for contingency management will be an essential component of advanced aerial mobility. Simplified vehicle operations with lower cost and reduced pilot training requirements are expected to be a precursor to fully autonomous aircraft operations. Well-trained pilots struggle with high workloads typical in emergencies requiring contingency response, so it is expected that pilots with less training and experience will be less prepared.
Finding: Encoding well-established contingency management procedures into autonomy will provide a rich baseline capability for automated contingency management in the near term. These procedures can be certified using a combination of existing and emerging certification practices to provide assurance that they will activate and execute safely and correctly. Software-based evaluation tools can be applied to rigorously evaluate autonomy for well-defined deterministic contingency management to reduce the manpower and cost required to use today’s certification practices.
Finding: Real-time data processing will be required to enable appropriate autonomous perception, decision-making, and action outcomes in contingency management cases not recognized and matched with established procedures. In such cases, pilots, especially inexperienced pilots, would also be required to ingest real-time data and adapt their situation understanding and decisions in real time. No guarantees of correct response are possible when either autonomy or pilot must learn in real time, yet learning and acting offer a better chance of survival or recovery than shutting down. Machine learning operating in the background might be able to assist in situational awareness (i.e., perception). Decisions informed by machine or human learning can be useful even when correctness guarantees are impossible. Supervisory constraints on learning or adaptive systems can limit machine learning system authority to situations in which automation is essential for success.
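The supervisory-constraint idea in the preceding finding can be sketched as a run-time gate: a learned component may propose any action, but an independently assured monitor grants it authority only inside a safe envelope. The envelope limit and command names below are hypothetical.

```python
# Minimal sketch of a supervisory constraint on a learning system.
# The (hypothetical) monitor limits the learned component's authority
# to a pre-verified envelope; outside it, a certified baseline rules.
SAFE_BANK_DEG = 30.0  # hypothetical envelope limit

def supervised_command(learned_bank_cmd, fallback_bank_cmd=0.0):
    """Grant the learned command authority only inside the envelope."""
    if abs(learned_bank_cmd) <= SAFE_BANK_DEG:
        return learned_bank_cmd   # within envelope: ML has authority
    return fallback_bank_cmd      # outside: revert to certified baseline

assert supervised_command(15.0) == 15.0   # in-envelope proposal accepted
assert supervised_command(55.0) == 0.0    # out-of-envelope rejected
```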
Finding: Advanced aerial mobility will typically rely on a variety of real-time data sources for detect and avoid, traffic coordination, and access to data updates—for example, weather and winds. Cyber resilience, the ability for a vehicle or local vehicle group to safely continue a flight operation despite loss or corruption of one or more datalinks or server connections, is an essential component of advanced aerial mobility contingency management.
Recommendation: NASA should conduct research, development, and testing of autonomy for contingency management to support safe advanced aerial mobility.