A key part of the committee’s initial focus was on the enterprise architecture of the Next Generation Air Transportation System (NextGen), per its tasking. What soon became clear, however, was that an enterprise architecture1 is a necessary but not sufficient component of a successful “system of systems,”2 such as NextGen. An enterprise architecture serves as documentation and support of existing systems and business processes. A system architecture models and defines the structure and behavior of a system in a way that supports reasoning about the system and its characteristics. See Box 2.1 for more on the distinction between enterprise architecture and system architecture. Accordingly, and consistent with other elements of its task, such as software development,3 the committee
1 The Office of Management and Budget (OMB) requires that every federal agency have an enterprise architecture designed to “promote mission success by serving as an authoritative reference, and by promoting functional integration and resource optimization with both internal and external service partners” (OMB, “The Common Approach to Federal Enterprise Architecture,” Washington, D.C., May 2, 2012, http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/common_approach_to_federal_ea.pdf, p. 5). Generally, an enterprise architecture (and its associated legislation and treatment by OMB) targets traditional “enterprise” information technology systems and is not focused on systems of systems that include real-time control, operations, and so on.
2 For simplicity, the committee will usually use the term “system,” because all systems of systems are systems themselves.
3 The legislation referencing the committee’s task (P.L. 112-95, Section 212) uses the term “software development,” which is sometimes understood narrowly by laypeople as referring to such things as coding techniques or software development methodology. Here,
explored the question of architecture more broadly, focusing also on the system architecture for NextGen.
The committee has drawn on its collective experience and expertise (in software engineering, as executives at major government contractors, and in the architecture of large-scale systems) to develop its recommendations, which are focused around the importance and implications of system architecture. It is reasonable to hope that there might be architectures in other domains, at companies or in other agencies, that would serve as useful exemplars. However, it is difficult to point to specific other architectures as exemplars without having conducted an in-depth analysis of them, an effort that would have gone beyond the time and resources available for this study. Moreover, NextGen and the National Airspace System (NAS) pose distinctive challenges, and the committee notes that pointing to any one specific other example would be a distraction, as it would too readily raise the possibility of arguments about whether or not the comparison is fair, or about ways in which the analogy or comparison does not work. Instead, this chapter focuses on the features and aspects needed in a system architecture for a system such as NextGen and points out lessons to be learned from industrial approaches to architectural governance and the development of architectural leadership. This chapter discusses the committee’s assessment of the Federal Aviation Administration’s (FAA’s) current architectural approach for NextGen, key elements of architectural leadership, and recommendations for change.
During the course of its study, the committee heard briefings and studied documents related to the FAA’s NAS enterprise architecture and its overall approach to system architecture. This section provides the committee’s impressions regarding what the FAA was doing with respect to architecture at the time of the committee’s information gathering, and the next sections offer recommendations for how the FAA could improve its approach.
too, the committee has taken a broader view more consistent with contemporary software engineering theory and practice that, especially in the case of large-scale complex systems, the term encompasses such pre-coding activities as requirements specification, system architecture, and design as well as coding and testing. As with its use of the term system architecture, this broad notion of the term software development ensures that the full range of software issues critical to NextGen success are considered. NextGen contains a large software element, and that software element will have an architecture (and should have its own architecture description). Standard practice in software architecture recognizes that the “development view,” the structure of the code being developed, is just one view. Many of the important attributes about which one would want to reason are defined only by moving outside the development view to a run-time, data, or other view.
BOX 2.1 An Enterprise Architecture Is Necessary but Not Sufficient
Every system has an architecture, even when that architecture is not documented. Just as building architects distinguish architectures (what they have in mind) from blueprints (how their ideas are recorded), current practice in architecting distinguishes architectures from the artifacts, documents, models, or other work products expressing those architectures. Mature architecting practices include making “tacit architectures” explicit by means of architecture descriptions. Typically, these are distinguished as follows:1
- Architecture (of a system): fundamental concepts or properties of a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution.
- Architecture description: work product used to express an architecture.
The maturation of architecting as a discipline has had to contend with a variety of distinctions, such as distinguishing an enterprise architecture from a systems architecture and from a software architecture. Each of these as typically understood has strengths and weaknesses:
- Enterprise architecture has evolved out of the management of large information technology (IT) systems. Some versions of enterprise architecture are little more than “bookkeeping” of IT assets. Other versions have come to focus on critical but “softer” aspects of enterprises, such as business vision, strategy and goals, human resources, and organizations.
- System architecture is perhaps the most mature of the three in its management of “non-functional” concerns such as reliability, affordability, and safety.
Although the FAA noted in briefings that the NAS enterprise architecture is meant to serve multiple purposes (“to align systems and technologies, identify duplication of effort, address the need for increased efficiency and interoperability, provide a common language for linkages and communications, and provide a framework for managing change by facilitating efficient identification of changes and implementations”), it was described to the committee as an explanatory set of documents, which is insufficient to meet NAS and NextGen needs (see Box 2.1). The enterprise architecture “describes the enterprise that directly supports operational air traffic services” and “describes the enterprise that supports FAA administrative operations.”4 Parts of the enterprise architecture
4 Remarks from FAA briefing “Role of Enterprise Architecture NextGen: Briefing to National Research Council” to study committee, March 2013.
- Software architecture is perhaps the most mature in its modeling practices and associated automated tools.
The current state of enterprise architecture is not adequately mature to manage large, distributed, real-time systems where safety-critical concerns predominate, nor is it clear that even the best instantiation of an enterprise architecture is intended for such uses. Enterprise architecture has focused on “bookkeeping” of enterprise IT assets. For a system such as NextGen, a more comprehensive notion of architecture is needed. Having high-quality descriptions of a system is insufficient to ensure that the depicted system is fit for anything. Standard ISO/IEC/IEEE 42010 addresses this dichotomy.2 The quality of the design choices has to be assessed on its own, and the quality of the drawings is not a surrogate.
An enterprise architecture is typically interpreted as a set of documents instead of a set of decisions. This is consistent with what the committee learned in its briefings. However, an emphasis on documents and compliance over decision making is misplaced. A close reading of architecture description standards, such as the Department of Defense Architecture Framework, will show that this issue was recognized by the description standard authors. Such frameworks are careful to identify the need for an architecture team to identify the purpose of the system, the purpose of the architecture, the information needed, and only then fold that information into standardized document products. That basic discipline should be employed in any assessment of an architecture. The linkage between architecture purpose, stakeholder concerns, and the contents of a description document is explicit in Standard 42010. In this report, the committee calls for that broader approach and calls it a “system architecture.”
1 ISO/IEC/IEEE 42010:2011, 2011.
are used to justify expenditures; an enterprise architecture is required as programs work through the acquisition management system (AMS) and must be approved by the FAA’s Joint Resource Council. The enterprise architecture also serves to meet the OMB requirement for agency-wide enterprise architecture as laid out in the Clinger-Cohen Act.5 Given that a goal of the act is to ensure efficient capital planning and investment in information technology (IT), the enterprise architecture’s focus on business structures and process is not surprising. In briefings, when asked to discuss the system architecture, FAA staff noted that there is no “software
5 The Information Technology Management Reform Act (ITMRA) (Division E) and the Federal Acquisition Reform Act (FARA) (Division D) were signed into law as part of the National Defense Authorization Act for Fiscal Year 1996. The ITMRA and FARA were subsequently designated the Clinger-Cohen Act of 1996 (P.L. 104-106), encompassing both.
or hardware architecture” per se for NextGen and the NAS as a whole but that enterprise architecture at the program level describes how the system (comprised of software and hardware) will work.6
The FAA uses an integrated systems engineering framework (NAS ISEF)7 to build its enterprise architecture, which is based loosely on the Department of Defense Architecture Framework (DoDAF).8 According to briefings provided to the committee, program offices develop their own architectures, in compliance with the NAS ISEF and in compliance with high-level interface specifications embodied in additional diagrams contained within the enterprise architecture. Additionally, the Chief Architect’s office ensures that the enterprise architecture is integrated horizontally (identifying linkages and interdependencies from system to system within the system of systems) and vertically within functions or components (to address shortfalls and help facilitate prioritization analysis).
The NAS ISEF supports visualizing the broad scope and complexities of the architecture and allows for varying views. These views provide overviews and details aimed at specific stakeholders. In addition to the views provided by DoDAF—all view, systems view, operational view, and technical view—the NextGen enterprise architecture provides for two additional views that are important for acquisition. The executive view provides planning roadmaps and highlights the evolution and delivery of NAS capabilities, and the financial view provides expenditure forecasts. While these multiple views provide overviews and details aimed at the varying levels of the NAS, there is a risk that each program office will see only what it needs to (in a narrow sense) without an understanding of the full picture and without ensuring that the various perspectives in the architecture are consistent and interoperable. Further, the absence of a system architecture for the entirety of NextGen makes it difficult for the developers of the individual subsystems and components to reason about the impact that the characteristics of their separate systems will have on such key overall NextGen system characteristics as safety, security, efficiency, robustness, and evolvability.
6 The FAA is in the process of buying the hardware and software that make up NextGen, and each component will be, in fact, a system in its own right. Each will then have an architecture, whether or not the FAA has chosen to make it explicit and effective. For instance, both ERAM and STARS/TAMR pre-date NextGen and will need to be adapted to the NextGen architecture. As noted in NRC-AF-2008, the quality of the decomposition is likely to be a major determinant of success. If the hardware and software architectures have not been considered, then the quality of the decomposition is left to chance.
8 Details about the DoDAF are available at Department of Defense, Chief Information Officer, “The DoDAF Architecture Framework Version 2.02, Change 1,” released January 2015, http://dodcio.defense.gov/TodayinCIO/DoDArchitectureFramework.aspx.
Ultimately, the committee’s conclusion with regard to the NAS enterprise architecture is that the as-is architecture has evolved to also become the dominant understanding of the to-be architecture. That is, the existing design and deployment of the NAS embodies a tacit architecture that is described, at a non-detailed level, by the NAS enterprise architecture documentation. This induced system-of-systems architecture is, therefore, bottom-up and program-driven, and it imposes implicit limits on what system capabilities can be realized and how. This has ramifications for how effective it can be, especially for reasoning about safety, security, and robustness, and in laying groundwork for future evolvability and enhancements.
As described in the committee’s interim report,9 the FAA developed an enterprise architecture responsive to OMB’s requirements. The committee was concerned that there was insufficient technical content in the enterprise architecture to allow clear traceability to lower-level architecture. As first defined by the Institute of Electrical and Electronics Engineers in Standard 1471,10 a system architecture is “the fundamental organization of a system embodied in its components, their relationships to each other and to the environment and the principles guiding its design and evolution.” (Box 2.2 describes system architecture in more detail.)
Although the documentation regarding the enterprise architecture is extensive, as noted above, the de facto “system architecture” for the NAS is the unmodified system as it is today, regardless of any documents to the contrary. The current enterprise architecture appears to be a set of functional enclaves that are providing individual services, described in a set of documents at the NAS enterprise architecture level. Additional improvements and modifications seem to be either changes to what is already deployed, or overlays onto what is already there. It was challenging to discern precisely what the FAA’s architectural approach and strategy are, and some of it had to be inferred. The documents for the as-is architecture that the committee reviewed do not abstract to the higher-level concepts that are used in the mid-term and far-term document sets. (For example, one cannot trace an ADS-B target from the ADS-B receiver to a display screen without going through several programs’ documentation.) Nor is it clear that the abstractions generated are sufficient to describe the
9 National Research Council (NRC), Interim Report of a Review of the Next Generation Air Transportation System Enterprise Architecture, Software, Safety, and Human Factors, The National Academies Press, Washington, D.C., 2014.
10 Institute of Electrical and Electronics Engineers (IEEE) Standard 1471:2000 has since been retired and replaced by the revised standard, ISO/IEC/IEEE 42010:2011, “Systems and Software Engineering—Architecture Description” (International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC)/IEEE, December 2011, http://www.iso.org/iso/catalogue_detail.htm?csnumber=50508).
BOX 2.2 On System Architecture
The 2010 National Research Council report Critical Code: Software Producibility for Defense offers a useful description of architecture and its importance:1
Just as in physical systems, architectural commitments comprise more than structural connections among components of a system. The commitments also encompass decisions regarding the principal domain abstractions to be represented in the software and how they will be represented and acted upon. The commitments also include expectations regarding performance, security, and other behavioral characteristics of the constituent components of a system, such that an overall architectural model can facilitate prediction of significant quality-related characteristics of a system that is consistent with the architectural model. Architecture represents the earliest and often most important design decisions—those that are the hardest to change and the most critical to get right. Architecture makes it possible to structure requirements based on an understanding of what is actually possible from an engineering standpoint—and what is infeasible in the present state of technology. It provides a mechanism for communications among the stakeholders, including the infrastructure providers, and managers of other systems with requirements for interoperation. It is also the first design artifact that addresses the so-called non-functional attributes, such as performance, modifiability, reliability, and security that in turn drive the ultimate quality and capability of the system. Architecture is an important enabler of reuse and the key to system evolution, enabling management of future uncertainty. In this regard, architecture is the primary determiner of modularity and thus the nature and degree to which multiple design decisions can be decoupled from each other. 
Thus, when there are areas of likely or potential change, whether it be in system functionality, performance, infrastructure, or other areas, architecture decisions can be made to encapsulate them and so increase the extent to which the overall engineering activity is insulated from the uncertainties associated with these localized changes.
A principal goal of a system architecture is to provide a specification of the structure of the system in order to foster the design and implementation of a system whose important properties are sufficiently well understood, able to be reasoned about, and assured.2 The architecture must provide a high-level view of the general nature of the system and support an understanding of the lower levels of the system that are needed in order to be assured that the system will satisfy key properties and behaviors. A well-formed architecture should provide a clear and consistent view of how each of its levels relates to the other levels and how the components of each of the various levels fit with each other. Key mechanisms for doing this are hierarchy, abstraction, and separation of concerns. A hierarchical architecture specifies how each component is comprised of a collection of lower-level components. Abstraction expresses functions of higher-level components of the hierarchy in terms of general concepts that suppress the details of the lower-level components. Abstraction allows for hiding specifics of decisions so that only the properties of concern need be addressed, avoiding inappropriate complexity. Separation of concerns helps ensure that appropriate decomposition takes place and reduces the opportunity for confusing overlaps or mismatches across components and views. The specifications, at all levels, are used to reason about properties the system must ensure.
A higher-level component specification is essentially a summary of the key features, aspects, and behaviors that the component must implement. Typically these are specified as the set of interfaces that the component presents to the other components at its architectural level. But this specification should not address how these features and behaviors are to be achieved, except when dominating, systemic concerns prevail (for example, to specify canonical information sources for certain sorts of functions). Specifications of how to satisfy the specification are abstracted away at the higher level in order to provide flexibility and evolvability to implementers, thereby enabling changes to the actual implementation as system contexts, requirements, and experience change over time. The component’s decomposition into lower-level subcomponents provides these implementation details, specifying how the mandated abstract features and behaviors are to be realized. The features and behaviors of these lower-level components are specified, in turn, as sets of interfaces to each other that are abstractions of still lower-level subcomponents. Ultimately, a hierarchical architecture specification’s decomposition stops at a set of leaf components, namely, the lowest-level components whose structure need not be further elaborated in order to reason about the system architecture.
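The hierarchical decomposition described above can be illustrated with a minimal sketch. It assumes a hypothetical `Component` record whose component and interface names are invented for illustration; they are not drawn from any FAA architecture product. Each component lists only the interfaces it presents at its own level, and the decomposition stops at leaf components.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One node in a hierarchical architecture description (hypothetical model)."""
    name: str
    interfaces: list[str]  # what the component presents to peers at its level
    subcomponents: list["Component"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf is a component whose structure is not elaborated further.
        return not self.subcomponents

    def leaves(self) -> list["Component"]:
        # Collect leaf components in left-to-right order.
        if self.is_leaf():
            return [self]
        found = []
        for child in self.subcomponents:
            found.extend(child.leaves())
        return found

    def depth(self) -> int:
        # Number of levels from this component down to its deepest leaf.
        if self.is_leaf():
            return 1
        return 1 + max(child.depth() for child in self.subcomponents)


# Illustrative names only; not an actual NAS or NextGen decomposition.
surveillance = Component(
    name="surveillance",
    interfaces=["report_target_position"],
    subcomponents=[
        Component("receiver", ["decode_message"]),
        Component("tracker", ["update_track"]),
    ],
)
system = Component(
    name="system",
    interfaces=["provide_traffic_picture"],
    subcomponents=[surveillance, Component("display", ["render_track"])],
)

leaf_names = [c.name for c in system.leaves()]
print(leaf_names)  # ['receiver', 'tracker', 'display']
```

A peer of `surveillance` would depend only on `report_target_position`; whether a receiver and a tracker exist beneath it is a detail the higher level does not expose.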
A hierarchical system architecture can be useful even if its specifications are relatively informal, as may well be the case early in the conceptualization and initial development of a complex system, or in the case of a system about which relatively few assurances are needed. In these cases, the architectural specification may need to be only lightly decomposed, and left relatively informal, with most of the development work being left to subsequent designers and implementers. If the architecture’s leaf components specify only very-high-level features and capabilities, then a very great deal must be assumed about the correctness of the implementation of these subcomponents, and there is consequent room for relatively greater doubt about whether the eventually implemented system will satisfy its critical properties and characteristics.
For complex systems, about which there need to be strong assurances about many potentially conflicting characteristics, more highly elaborated and more detailed architectural specifications are important. Such system architectures will probably start out as relatively informal high-level specifications, but should be expected to become increasingly complete and precise over time, as requirements, contexts, and available technologies all become better understood.3 The importance of more complete and precise architecture specifications is that they become increasingly effective in supporting increasingly definitive reasoning about key characteristics of the overall system. Eventually, the system architecture should be decomposed to sufficiently low levels, in sufficient detail and with sufficient precision, to support needed reasoning about such key overall system characteristics as safety, security, speed, robustness, and evolvability.4
At present, the NextGen system architecture appears to be tacit or, at best, still at a very high level and largely specified quite informally. While this might be an acceptable, indeed inevitable, state of affairs early in the development of such a complex and critical system, it poses considerable risk and difficulty at this relatively late stage in the development of NextGen. The lack of a well-defined, sufficiently deeply decomposed system architecture for NextGen poses at least two serious problems. First, the fact that the architecture description is insufficiently deeply defined means that the specifications for the system’s component parts
may be too vague and incomplete to provide effective guidance to the developers of these lowest-level components. Unfortunately, if firm assurances about security and robustness (for example) of the overall system are required, then correct implementation of these vaguely specified components will have to be assumed. The more vague and incomplete these specifications, the greater the risk that their implementations may not be correct, leaving assurance of the desired properties in doubt. Second, even if an architecture specification does indeed decompose down to lower-level subcomponents, it is still essential that the interfaces to these components be very carefully specified. Imprecise specifications make it difficult to assure that the delivered components will indeed fit with each other as needed. More important, however, the less precisely the components and their interfaces are specified, the less definitively the critical properties of the overall system can be determined.
Finally, a system architecture may encompass separate, but related, subarchitectures (sometimes referred to as architecture views), each of which addresses a different set of issues. Thus, for example, a system architecture may well incorporate a data architecture, specifying how data is managed; a user interface architecture, specifying how various system capabilities are presented to users; a safety architecture, specifying how various safety risks are attenuated by the system; a resilience architecture, specifying how the system can operate even though some components have failed; and a security architecture, specifying how the system provides safeguards against possible damage due to attacks. Each of these needs to complement the overall system architecture and be consistent with the others; in other words, both horizontal and vertical correspondences are needed. And for each of these architectures, it is important that each component be described in terms of the abstract features and capabilities that must be provided, and that implementation details not be provided. A data architecture, for example, might specify that certain types of data must be logically centralized, leaving to the elaboration of this specification details and decisions about whether the data should or should not be physically centralized.
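The data architecture example above, in which data is logically centralized while its physical placement is left open, can be sketched as follows. The `FlightPlanStore` interface and both implementations are hypothetical names introduced only for illustration, and the replication scheme is deliberately naive.

```python
from abc import ABC, abstractmethod


class FlightPlanStore(ABC):
    """Architecture-level contract: one logical store (hypothetical interface)."""

    @abstractmethod
    def put(self, flight_id: str, plan: str) -> None: ...

    @abstractmethod
    def get(self, flight_id: str) -> str: ...


class SingleSiteStore(FlightPlanStore):
    """Physically centralized: all records live in one place."""

    def __init__(self) -> None:
        self._plans: dict[str, str] = {}

    def put(self, flight_id: str, plan: str) -> None:
        self._plans[flight_id] = plan

    def get(self, flight_id: str) -> str:
        return self._plans[flight_id]


class ReplicatedStore(FlightPlanStore):
    """Physically distributed: every write goes to every site (naive scheme)."""

    def __init__(self, site_count: int) -> None:
        self._sites: list[dict[str, str]] = [{} for _ in range(site_count)]

    def put(self, flight_id: str, plan: str) -> None:
        for site in self._sites:
            site[flight_id] = plan

    def get(self, flight_id: str) -> str:
        # Read from the first site that holds the record.
        for site in self._sites:
            if flight_id in site:
                return site[flight_id]
        raise KeyError(flight_id)


def file_and_read(store: FlightPlanStore) -> str:
    # Client code sees only the logical store, never the physical layout.
    store.put("FLT123", "KJFK-KLAX")
    return store.get("FLT123")


print(file_and_read(SingleSiteStore()))   # KJFK-KLAX
print(file_and_read(ReplicatedStore(3)))  # KJFK-KLAX
```

Because `file_and_read` depends only on the abstract contract, the decision between one site and several can be deferred or revisited without changing any client.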
complete system. Moreover, it is not clear whether, or how, systems engineering work products are derived from the mid-term architecture.
As one example, the committee noted that the first architecture rule written in the AV-1 “Mid-Term Overview and Summary Information” document states that
NAS [enterprise architecture] products shall be developed and decomposed only to the level of detail required to portray enterprise “To-Be” Operational Improvements/Sustainment and transformation priorities. The level of detail should articulate enterprise-level operations, functions, and systems without infringing on Program-level detail. Thus, the lowest
Note that a coordinated collection of different subarchitectures is necessary even for systems and products that are far less complex and far better understood than NextGen. Thus, as an analogy, a building architect will typically have considerable experience in building office buildings and a firm grasp on the characteristics of materials and structures. Even so, each new office building project must start with a building architecture that must be rigorous, complete, and sufficiently precise about electrical systems, plumbing systems, elevator systems, and heating/cooling systems, in addition to the configuration of the building’s various structural members. The need for analogously complete subarchitectures is even more critical in the case of NextGen, which is an interconnected system of systems with far more complexity than an office building. With regard to data alone, NextGen has to cope with a range of information from weather data to flight plans to real-time traffic data to emergency declarations. In both cases, the building architecture and the NextGen system architecture, different notations and formalisms may be used to support specification of the different kinds of architectural features. But each notation or formalism must be precise, and the specification must be sufficiently deeply decomposed, in order to support reasoning about that architectural feature and its relations to the other architectural features.
1 NRC, Critical Code: Software Producibility for Defense, The National Academies Press, Washington, D.C., 2010, pp. 68-69.
2 Architecting is the process of conceiving, defining, expressing, documenting, communicating, certifying proper implementation of, maintaining, and improving an architecture throughout a system’s life cycle, per ISO/IEC/IEEE 42010:2011 (2011, pp. 1-46).
3 A note on precision: precision in this discussion should not be viewed as an absolute; rather, precision evolves with the understanding of engineering trade-offs between requirements, designs, and business constraints throughout the life cycle.
4 If very firm assurances about these characteristics are required, then the architecture eventually must be defined in a notation that is sufficiently formal and precise to support the definitive reasoning needed for such assurances.
level of detail (e.g., leaf nodes) of the enterprise-level products should serve as the context/highest level of Program-level architecture elements.11
Although the infrastructure set up by the FAA to host its enterprise architecture is robust, there seems to be no explicit connection between
11 FAA, “Mid-Term Overview and Summary Information (AV-1), Version 3.0, Part of Integrated Mid-Term Release Package 3.0,” document NAS-EA-AV-1-Mid-Term-v3.0-022814, National Airspace System Enterprise Architecture, Office of NextGen, February 28, 2014, https://nasea.faa.gov/architecture/enterprise/display/4/tab/Mid-Term, p. 4. This document has been superseded by Version 4.0, in which similar language can be found on p. 12.
the “leaf nodes” and the program-level architectures or descriptive documents. This may be sufficient to meet the OMB mandate for an enterprise architecture and to rationalize acquisition efforts, but it is insufficient to support the technical steering of an effective system architecture. Capturing and providing appropriately detailed abstract specifications and interfaces is essential to the ability to reason about NextGen’s key properties and improves the ability to determine where and how to make improvements and how to assess impacts on other subsystems.
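The missing connection between enterprise-level leaf nodes and program-level architectures is the kind of gap that a simple traceability check could surface. The sketch below assumes, purely for illustration, that each program-level architecture names the enterprise leaf node it elaborates; the function, the service names, and the program names are all hypothetical.

```python
def trace_gaps(enterprise_leaves: set[str], program_roots: dict[str, str]):
    """Return (leaf nodes no program elaborates, programs citing unknown leaf nodes).

    program_roots maps a program-level architecture name to the enterprise
    leaf node it claims as its context (an assumed convention).
    """
    cited = set(program_roots.values())
    untraced = sorted(enterprise_leaves - cited)
    dangling = sorted(
        program for program, leaf in program_roots.items()
        if leaf not in enterprise_leaves
    )
    return untraced, dangling


# Illustrative data only; not taken from the NAS enterprise architecture.
enterprise_leaves = {"surveillance-service", "flight-data-service", "weather-service"}
program_roots = {
    "program-a": "surveillance-service",
    "program-b": "flight-data-service",
    "program-c": "messaging-service",  # cites a leaf node that does not exist
}

untraced, dangling = trace_gaps(enterprise_leaves, program_roots)
print(untraced)  # ['weather-service']
print(dangling)  # ['program-c']
```

A check of this kind establishes only that links exist; it says nothing about whether the linked specifications are precise enough to support reasoning about system properties.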
NextGen goals and associated programs should, by definition, provoke changes and adjustments in the NAS system architecture. That tacit architecture is diffused through many different programs, not all of which are under NextGen control. NextGen’s programs have developed, not under the control of the NextGen office, but under other FAA organizations. Thus, one program’s engineers must exhaustively search through all other programs’ documentation, hoping that it is up to date, to assess and understand changes. In such a situation, engineers may be unable to gain sufficiently clear insights into the nature of needed subsystems and subcomponents to assure that the program’s product will integrate correctly with the other subsystems and make the needed contributions to system-level requirements.
The committee urges a focus on system architecture that reflects a set of fundamental, structural decisions about a system and that is distinguished from the architecture description, a document that records those decisions (much as the FAA uses the enterprise architecture). The FAA needs a system architecture for the NAS so as to ensure proper operation of the system; allow proper analyses for prediction of system behavior, performance, security, safety, and so on; and ensure future flexibility. A proper system architecture specifies the interfaces between different subsystems in enough detail that their design and implementation can proceed independently with reasonable confidence that they will interoperate correctly. (Internet protocols are a textbook example of this.) The current situation relates to the basic contract mechanism used by the FAA. Different functional elements are being built by different contractors selected and managed by different programs that may not have a clear view of the system architecture.
However, certain basic services need to be provided to various higher-level services as abstractions where the basic services are implemented system wide. Examples include security properties to assure that NextGen is resistant to possible attacks, robustness properties to assure that the system will degrade gracefully in the presence of equipment failures and unexpected contingencies, and evolvability properties to assure that NextGen can be modified and enhanced as new challenges are presented and so the advantages of new technologies can be exploited. Abstraction
should allow design decisions such as physical locations, formats of data, protocols, and so on to be hidden until the specifics of those decisions are needed. An appropriate separation of concerns can ensure that adequate information is available for reasoning about systemic properties.
Data that is crucial to the various services needs to be defined carefully by an abstraction that is common to all services and that provides site-independent guarantees on availability, timeliness, ordering, throughput of the data, and so on. For example, various high-level functions need to be able to store their specific data, but there are important guarantees that the high-level functions need about data storage. To achieve this effectively, there would need to be a customizable data service structure that can be tailored by each high-level service according to its needs but which guarantees properties such as availability. Similarly, high-level functions need to have guarantees that when errors arise (for whatever reason) there are alternatives for backup data storage, processing, database availability, and so on.12 Without such abstractions and associated interfaces, developers of higher-level functions would begin to implement such elements individually, which is happening, as the committee learned through briefings with both the FAA and contractors. Although this is understandable given the current context, it is unlikely to be a very satisfactory approach for the system as a whole.13 Additional data properties, such as ownership, span of use, demand, safety, security, latency, freshness, and location, often have architectural significance and need to be managed at the architectural level. Moreover, different categories of data may necessitate services with very different characteristics.
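As an illustration of a shared data service that each high-level function can tailor, the sketch below assumes a hypothetical `DataService` placed in front of several replicas, with a per-service `DataPolicy`. The in-memory store, the class names, and the acknowledgment rule are assumptions made for the example and do not describe any existing NAS component.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class StoreUnavailable(Exception):
    """Raised when a backend (or the whole service) cannot serve a request."""


class InMemoryStore:
    """Stand-in backend; `up` simulates whether the site is reachable."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.up = True

    def put(self, key: str, value: str) -> None:
        if not self.up:
            raise StoreUnavailable("site down")
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        if not self.up:
            raise StoreUnavailable("site down")
        return self.data.get(key)


@dataclass
class DataPolicy:
    """Per-service tailoring: how many replicas must acknowledge a write."""
    min_write_acks: int = 1


class DataService:
    """Common abstraction: callers never see which site holds the data."""

    def __init__(self, replicas: List[InMemoryStore], policy: DataPolicy) -> None:
        self.replicas = replicas
        self.policy = policy

    def put(self, key: str, value: str) -> None:
        acks = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                acks += 1
            except StoreUnavailable:
                continue
        # This sketch does not roll back partial writes.
        if acks < self.policy.min_write_acks:
            raise StoreUnavailable(f"only {acks} replica(s) acknowledged")

    def get(self, key: str) -> Optional[str]:
        any_reachable = False
        for replica in self.replicas:
            try:
                value = replica.get(key)
            except StoreUnavailable:
                continue  # fall back to the next replica
            any_reachable = True
            if value is not None:
                return value
        if not any_reachable:
            raise StoreUnavailable("no replica reachable")
        return None
```

A service that needs stronger durability would request `DataPolicy(min_write_acks=2)`, while a less critical one could accept a single acknowledgment; in both cases the fallback logic lives in one place rather than being reimplemented by each higher-level function.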
As an example of a system-wide service in NextGen, the Surveillance and Broadcast Services (SBS) Final System Specification Rev. 4 states that
SBS is a system of systems, requiring functionality on participating aircraft and vehicles, in a ground infrastructure, and in participating automation systems. The SBS System provides ADS-B, TIS-B, ADS-R, Wide Area Multilateration (WAM), Airport Surface Surveillance Capability (ASSC), FIS-B, VHF Voice Communications, and Weather Observation Services. Performance for each of the services is specified and allocated herein
12 Modern industrial approaches such as Oracle, SQL Server, or Google’s Spanner are examples of the provision of this sort of functionality.
13 And indeed, inconsistencies or misunderstandings related to data definitions often lead to interface errors among software components. A stark example of this sort of error resulted in the loss of the Mars Climate Orbiter in the late 1990s when NASA used the metric system and its contractor used English units of measurement. Modern software architecture standards all call for a “data view” or “logical view” or something equivalent. This is usually conceived of as an implementation-independent definition of core data, which is transformed into physical data models during implementation.
to the aircraft, surface vehicles, the ground infrastructure, and the ATC automation system. [emphasis added]14
This statement suggests that the NextGen organization provides little architectural direction to the enterprise. Although the choice above may in the end be a plausible architectural decision, the committee is concerned that this choice is being made by default rather than as the result of a considered process through trades on the basis of explicitly stated criteria.15 Moreover, insufficient abstractions to high-level services place roadblocks in the way of those attempting to insert new technologies or services (such as unmanned aircraft systems (UAS) or cybersecurity protections, discussed in Chapter 3) into the context of the NAS.
As noted above, the FAA has adopted the DoDAF in an attempt to satisfy both OMB’s requirements and the need for a system architecture required to develop its NextGen systems. Reinforcing the committee’s conclusion that the enterprise architecture, as currently used and understood, falls short in providing what a proper system architecture could provide, a recent Department of Transportation Inspector General’s assessment of the enterprise architecture shows the FAA falling short in the OMB sense:
Overall, the [enterprise architecture]’s usefulness as a strategic planning tool for NextGen has been limited due to incomplete information, a lack of policy and guidance, and unresolved NextGen design decisions.16
That the system architecture is not well developed is hard to discern for two reasons: (1) the nearly exclusive focus on the enterprise architecture, which is important but does not address the technical issues of just how a new NAS could be built and (2) the NAS system architecture could be shown to meet all of the new requirements that have been communicated to the Congress and the public. A complex system (of systems) requires several architecture perspectives, each aimed at demonstrating how each of several different kinds of requirements is to be met. Each perspective needs to be clearly communicated so that developers can understand the local intentions and system integrators can reason
14 FAA, “Final Program Requirements, SBS-002, Revision 04,” Surveillance and Broadcast Services Program Office, June 26, 2012, p. 22. See Appendix C for acronyms.
15 This is recommended in NRC, Pre-Milestone A and Early-Phase Systems Engineering (The National Academies Press, Washington, D.C.), a 2008 report that examines the role systems engineering can play in the acquisition lifecycle.
16 Department of Transportation, Addressing Underlying Causes for NextGen Delays Will Require Sustained FAA Leadership and Action, Office of Inspector General Audit Report AV-2014-031, February 25, 2014, https://www.oig.dot.gov/library-item/28823.
about system behaviors and unintended consequences. Indeed, one of the main roles of a system architecture is to facilitate the ability to envision and express how the system will (or could) evolve. It can reveal how new capabilities (e.g., ADS-B surveillance) fit in and what implications they have. A system architecture allows one component’s evolution to be planned to co-evolve with other components, and so on. Rather than being a simple snapshot or a log of changes to date, a well-developed system architecture should play a critical role in system evolution over time, providing a way to look forward and map out likely outcomes from a variety of scenarios.
A useful reference with which to examine architectural approaches is the 2008 NRC report Pre-Milestone A and Early-Phase Systems Engineering.17 Although that report’s recommendations do not universally apply to the FAA’s situation, they are nonetheless useful. In particular, the 2008 report identifies certain patterns indicative of success and failure, including the need for appropriate engineering talent and clear lines of authority, the need to perform trade-offs and manage complexity early in the process, the importance of a stable set of system requirements, and the need to plan ahead for change through architectural choices. As discussed in the rest of this report, the committee observes some of these success patterns being ignored in NextGen.
The upshot of the limitations in the existing enterprise architecture discussed above is that programmatic risk and engineering risk are both increased. In addition, architectural perspectives have not been exposed to users (pilots and controllers) at sufficient depth and with sufficient interaction and discussion about how operations would change as a result of implementing new features (which themselves are not documented at a depth sufficient to foster useful conversations). Furthermore, it will be much more difficult to evolve the system to meet requirements and take advantage of new technologies.
See Box 2.3 for a discussion of an ERAM system failure18 that occurred during the course of the committee’s work; the committee’s analysis of this failure reinforces the importance and potential of more comprehensive approaches to system architecture. This incident and subsequent analysis are suggestive of the need for proper system architectures that would allow modeling and reasoning about the system as a whole. ERAM was developed as a replacement for the legacy host system; it provides core functionality to the NAS. This may have seemed initially like a component-for-component replacement. But ERAM relies on complex
17 NRC, Pre-Milestone A and Early-Phase Systems Engineering, 2008.
18 The fire at Chicago Air Route Traffic Control Center in September 2014 provides further evidence of an architectural resilience failure in the NAS architecture.
BOX 2.3 April 2014 En Route Automation Modernization Failure and Architectural Implications
During the course of the committee’s work, the Los Angeles Air Route Traffic Control Center (ZLA ARTCC) experienced failures of the En Route Automation Modernization (ERAM) Flight Data Manager (FDM) software resulting in a ground stop for all flights passing through that center. FAA statements at the time summarized the situation:
On April 30, 2014, an FAA air traffic system that processes flight plan information experienced problems while processing a flight plan filed for a U-2 aircraft that operates at very high altitudes under visual flight rules. […] The computer system interpreted the flight as a more typical low altitude operation, and began processing it for a route below 10,000 feet. The extensive number of routings that would have been required to de-conflict the aircraft with lower-altitude flights used a large amount of available memory and interrupted the computer’s other flight-processing functions.1
Because ERAM is a major NextGen program, the committee asked for information about this failure in order to better understand how such failures come about, how they are handled, and what improvements with regard to architecture and software development the committee could suggest.2 While not tasked with undertaking a complete analysis of this or any other specific incident, the committee offers the following limited analysis—based on necessarily limited data—of why such failures are a concern and how appropriate system architecture and software development approaches could help reduce the likelihood of such failures in the future.
First, it should be noted that this incident led to a considerable loss of air traffic control service. The result could have been catastrophic. That it was not catastrophic does not mitigate the seriousness of the event. In the committee’s view, the investigation of this (or similar events affecting so much of the NAS) and the subsequent reporting should have been at the same scale as would have been required had there been an accident with loss of life.
The documentation and reporting suggest that ERAM failed essentially in its entirety (including its backup system), and that indicates a serious, systemic design flaw. Incorrect flight data was entered for a particular aircraft that, coupled with the activity of the aircraft in question, resulted in exhaustion of a fixed-size memory area, which ultimately led to a failure of the flight data processor. The backup system suffered from the same bad data. There were several ways in which this failure exposed poor design choices:
- Failure due to exhaustion of resources. The system should be monitoring resources, data integrity, equipment availability, response times, queue lengths, activity levels, and many other system parameters continuously. Provision for handling deviations from planned levels should be present for all system parameters.
- Unhandled software exceptions in a critical system. There was apparently an unhandled software exception that led to the exhaustion of resources.
- Mishandling of poor data entry. Apparently, the altitude of the aircraft in question was entered incorrectly. The very large extent of the consequent adjustments to civil air traffic flight plans could have triggered a resiliency response (such
as an “Are you sure?” reply to the flight plan as entered). Financial services sites routinely do this, for example, when amounts entered for online transactions are outside of determined normative bounds. This is all part of a properly designed input-validation (and taint removal) process.
- Primary and backup systems failed simultaneously. The simultaneous failure of primary and backup systems is a design flaw resulting in part from an inadequately considered system architecture. The decision to automatically hand something off to a backup computer is appropriate for a hardware failure; it is also likely acceptable when the failure is caused by the total state of the system. In this case, though, the flaw was in how one specific input in combination with a particular aircraft’s behavior was handled. The system was always going to fail with these bad inputs. Handling such a failover appropriately requires a fundamentally different approach to error-handling than is suggested by the high-level overview provided to the committee. That this problem was “fixed” by increasing a buffer size exposes a security problem (in addition to an availability issue). Merely increasing a buffer size does not guarantee the system will not fail with some unlikely complex combination of existing and proposed flight plans. If there is no proof that the capacity of the buffer cannot be exceeded by a possible set of inputs, then any fixed buffer size could lead to a system failure.
- Inadequate recovery actions. The immediate corrective actions taken—increase the buffer size and change operational procedures—are not sufficient to address the problems exposed by this failure. The system design and architecture need to be examined to make the changes necessary to achieve the required availability. That faults occur in such systems is well known, as are the techniques to cope with such faults. A root cause analysis is needed when errors such as these occur in high-reliability systems.
- Lack of coherent approach to common-mode failures. A footnote in the document the committee was provided states, “This capability was added to ERAM in the most recent software build (EAC1400) as a result of efforts to address Common Mode failures. This is the change that added the 128 KB buffer and the logic for its use. To date, this fix has prevented four failures.”3 This statement is of concern because common-mode failures are well known and generally the result of a design fault, the most difficult type of fault to tolerate. Making a change that is described as a “fix” suggests that the basic design of ERAM does not have a comprehensive architecture-based approach to common-mode failures. The fact that “this fix has prevented four failures” suggests that common-mode failures are frequent, which is troubling.
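Several of the design points listed above (input validation with a confirmation step, explicit resource budgets, and isolating a bad input rather than replaying it on a backup) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the altitude bounds, the `FlightPlan` fields, the budget value, and the "quarantined" outcome are all hypothetical and are not taken from ERAM.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Verdict(Enum):
    ACCEPT = "accept"
    CONFIRM = "confirm"   # ask the submitter "Are you sure?"
    REJECT = "reject"


@dataclass(frozen=True)
class FlightPlan:
    callsign: str
    altitude_ft: int


# Hypothetical bounds, chosen only for illustration.
MAX_PLAUSIBLE_ALTITUDE_FT = 100_000
TYPICAL_ALTITUDE_RANGE_FT = (0, 60_000)


def validate(plan: FlightPlan) -> Verdict:
    """Reject impossible values; ask for confirmation on unusual ones."""
    if plan.altitude_ft < 0 or plan.altitude_ft > MAX_PLAUSIBLE_ALTITUDE_FT:
        return Verdict.REJECT
    low, high = TYPICAL_ALTITUDE_RANGE_FT
    if not (low <= plan.altitude_ft <= high):
        return Verdict.CONFIRM
    return Verdict.ACCEPT


class BudgetExceeded(Exception):
    """Raised when one request would consume more than its allotted work."""


def plan_reroutes(conflicts: List[str], budget: int) -> List[str]:
    """Compute one reroute per conflict, refusing work beyond the budget."""
    if len(conflicts) > budget:
        raise BudgetExceeded(f"{len(conflicts)} conflicts exceeds budget {budget}")
    return [f"reroute:{c}" for c in conflicts]


def process(plan: FlightPlan, conflicts: List[str], budget: int) -> str:
    """Handle one plan so that a bad input affects only that plan."""
    verdict = validate(plan)
    if verdict is not Verdict.ACCEPT:
        return verdict.value
    try:
        plan_reroutes(conflicts, budget)
    except BudgetExceeded:
        # Set the single offending plan aside for human review instead of
        # exhausting shared memory or handing the same input to a backup.
        return "quarantined"
    return "processed"
```

The design choice illustrated here is that the limit is checked before the work is done, so exceeding it is an ordinary, handled outcome for one request rather than a failure of the whole processor.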
1 Dan Whitcomb, FAA Says Air Traffic Computer Was Overwhelmed by U-2 Spy Plane, Reuters, May 5, 2014, http://www.reuters.com/article/2014/05/06/us-usa-airport-losangeles-idUSBREA4501C20140506; Laura Stampler, FAA Confirms Spy Plane Caused LAX Chaos, Time, May 6, 2014, http://time.com/89130/faa-spy-plane-los-angeles/.
2 FAA, “ZLA Air Traffic Control (ATC) - Summary of Events Surrounding Declaration of ATC Zero at ZLA,” 2014; received following committee inquiry.
3 FAA, “ZLA Air Traffic Control (ATC) - Summary of Events,” 2014; received following committee inquiry. See also A. Scott and J. Menn, Exclusive: Air traffic system failure caused by computer memory shortage, Reuters.com, May 12, 2014, http://www.reuters.com/article/2014/05/12/us-airtraffic-bug-exclusive-idUSBREA4B02320140512.
software that likely resulted in many changes to the system (as is appropriate). If there were a system architecture with appropriate data, process, and other perspectives available, then much of the updated functionality could be simulated, modeled, evaluated, and reasoned about.
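One simple form of the reasoning that an explicit architecture model makes possible is a search for single points of failure. The sketch below assumes a hypothetical dependency graph (the component names are invented for illustration) and reports every intermediate component whose removal disconnects an input from an output. It is a toy model, not an analysis of ERAM or the NAS.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical data-flow graph: each key feeds the components in its list.
EXAMPLE_GRAPH: Dict[str, List[str]] = {
    "surveillance_input": ["radar_a", "radar_b"],
    "radar_a": ["flight_data_processor"],
    "radar_b": ["flight_data_processor"],
    "flight_data_processor": ["controller_display"],
}


def reachable(graph: Dict[str, List[str]], start: str, goal: str,
              removed: Optional[str] = None) -> bool:
    """Breadth-first search from start to goal, ignoring one removed node."""
    if removed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, []):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(graph: Dict[str, List[str]],
                             start: str, goal: str) -> List[str]:
    """Return intermediate nodes whose loss cuts every path from start to goal.

    If no path exists even with all nodes present, every intermediate
    node is reported, so callers may want to check reachability first.
    """
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    candidates = sorted(nodes - {start, goal})
    return [n for n in candidates if not reachable(graph, start, goal, n)]
```

In the example graph the two radars back each other up, but the flight data processor does not have an alternative, so it is the only component reported. A real model would also need to capture common-mode dependencies, such as two replicas that process the same bad input.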
Rather than being a system that is developed within the conceptual framework of architectures that have been shown to meet requirements, NextGen is instead a collection of projects each aimed at upgrading or replacing existing componentry and capabilities. The upgrades are needed, so NextGen is delivering value. But they are upgrades and enhancements of the existing system, based on the longstanding architectures and designs of the NAS. As such, their ability to meet NextGen’s stated objectives and requirements is unknown, and indeed, without an appropriately scaled and specified system architecture, probably unknowable.
Finding: The FAA’s approach to enterprise architecture is not an adequate technical foundation for steering NextGen’s technical governance and managing the inevitable changes in technology and operations.
Finding: Absent an appropriately scaled and specified system architecture, the ability of any given change, upgrade, or enhancement to meet stated objectives or requirements is unknown and unknowable.
The committee did learn about the existence of architectural steering groups, but it was not clear how much authority those groups have. Unfortunately, having de facto established the existing architecture as the architecture for NextGen, many opportunities to use the architecture in forward-looking ways have been ruled out. Thus, through its architectural choices, the FAA has put itself in a position where some important advances are going to be extremely challenging to accomplish, such as the ability to create persuasive and credible forecasts of change costs, technical risks, capability upgrades, and performance improvements. The committee’s recommendations in this chapter take this into account and offer suggestions as to how to move forward most productively in developing better architectural approaches.
Any large-scale, software-intensive systems endeavor requires a system architecture that specifies how all of its parts fit together and interact, and which can be used in a dynamic way to help inform and drive planning for change and related decision making. A system architecture provides the capacity to develop and validate analyses that can help detect issues such as single points of failure and emergent properties early in the process. Architectural leadership is essential to the success of any large system of systems, as architecture embodies some of the most important design decisions and includes structural and design commitments that will constrain and guide subsequent expectations and capabilities. Astute architectural leadership encompasses the following elements:
- Recognition that, although there is a single system architecture, no single architecture perspective is sufficient. An enterprise perspective (embodied in the NAS enterprise architecture and that sought by OMB) serves different purposes from a software architecture, which is different still from a security architecture, for example.
- Architectural leadership at the system-of-systems level to help maintain appropriate alignment among the various architectural perspectives and to ensure that the architecture is kept consistent as requirements change and development proceeds.
- Thoughtful and consistent attention to ensuring that the system architecture is flexible and evolvable.19 While modernization and incremental improvements proceed, care should be taken to ensure that the architecture does not become so rigid that innovative changes cannot be put in place later on. A suitable architecture helps to position its users for future flexibility: it can provide an infrastructure that can be exploited for possible (unanticipated) future applications and enables thinking about the future in a structured and disciplined way.
- Assurance that verification and validation considerations are incorporated early in architectural perspectives. Verification and validation efforts relate to a wide range of attributes including system functional behavior, security, availability and resilience, performance, response to anomalous human behavior, and so on. Such an approach ensures that the highest value and highest risk elements of system assurance are also addressed earlier in the life cycle.
19 Some efforts go so far as to incorporate an evolution viewpoint to guide the reasoning required for dealing with change. In this approach, points of change, their sources, dependencies among changes and impacts are explicitly managed as first-class entities. See for example, M. Razavian and P. Lago, “A Viewpoint for Dealing with Change in Migration to Services,” in Proceedings of the Joint 10th Working IEEE/IFIP Conference on Software Architecture and 6th European Conference on Software Architecture (WICSA/ECSA), IEEE Computer Society, http://ieeexplore.ieee.org, 2012.
- Development and maintenance of effective architectural documentation. Exhortations for “better architecture” often result only in more volume and more detail, but no more (and sometimes even less) insight and effective guidance. This is not what the committee advocates. NextGen architectural leadership should focus instead on identifying system features and aspects that have the highest value (taking into account cost) or represent the greatest risk (see Chapter 3), and align technical projects with a coherent understanding of structural, operational, and performance intentions. An architecture that is documented in an effective useful manner is easier to understand, analyze, maintain, update, and use.20
- A practice of effective architecture evaluation. Both effective documentation and architecture are encouraged through periodic reviews (architecture evaluation).21
- A common understanding of key performance parameters, along with models to assess how each would be affected by various alternative choices within NextGen.22 In NextGen, such key performance parameters might include the ability to incorporate large growth in UAS, or versions of stated goals such as shorter routes, improved navigation around weather, reduced time and fuel costs, reduced delays, increased airport capacity and so on. These may need to be traded against each other, but doing so is best done transparently.23
- Attention to cost and sustainability. A key aspect of architecting in the large, beyond devising technical solutions and approaches, is ensuring that solutions are feasible, cost-effective, and sustainable. In any large-scale system there will be numerous trade-offs, just as there are numerous risks (discussed in
20 For a given system, such as NextGen, there may be several descriptions of the architecture used for different purposes. Ideally, these should be kept consistent, but each may have different details and emphases. Some will be used for communication among projects and with stakeholders, others for specification of how things must be done in component systems, for planning, budgeting, and so on. To serve multiple purposes an architecture description should be explicitly linked from its stakeholders (who cares?) to their concerns (what do they care about?) to the viewpoints framing those concerns (how are these aspects of the architecture modeled?) to the views and models comprising the descriptions (how are the concerns of the stakeholders addressed?).
21 P. Clements, R. Kazman, and M. Klein, Evaluating Software Architectures: Methods and Case Studies, SEI Series, Addison-Wesley, 2002; H. Obbink et al. Report on Software Architecture Review and Assessment (SARA), Tech. Rep. Version 1.0, The SARA Working Group, February 2002.
22 The existence of such models is noted as a key success pattern in the 2008 NRC report Pre-Milestone A and Early-Phase Systems Engineering.
23 Many of these potential key performance parameters are still up for debate and potential trades even though the NextGen effort has been under way for quite some time. Resolving which of the (many, shifting) implicit and explicit goals of NextGen should be key performance parameters is beyond the scope of this committee’s tasking. Instead, the committee has focused on the importance of a suitable architectural approach—regardless of which key performance parameters are chosen—to make progress. A stronger architectural foundation will also make it possible to explore the relationship between the key parameters and the system design.
the next chapter). Architecture leadership should aim to devise solutions that are feasible, fit within budget, and minimize disruption to existing operations.
- An architectural community with growing diversity of thought, perspective, and knowledge. In the committee’s view, there is too much reliance at the FAA on tacit knowledge in the heads of a very small number of heroic architects. Developing and retaining a “deeper bench” of talent to grow FAA architectural capability is essential.
The most important thing on which the FAA should focus with respect to architecture is building a community of architecture leaders within and outside the agency. The FAA will need to increase its system architecture capabilities and establish a more capable architecture community. Good architects will tailor the efforts effectively, independent of the processes, methods, and artifacts with which they need to work. Architectural leadership requires creative skills and abstract reasoning coupled with domain experience. The imposition of additional process requirements or more training will not significantly help to transform an average systems engineer into a capable architect. Developing an effective architectural leadership team is more of a selection and hiring challenge than it is a training issue. Major software companies and system integrators have highly evolved career paths, mentoring programs, and peer selection boards in place to identify, develop, and certify qualified architects. The FAA and its contractor community could learn much from commercial practice in building stronger communities of architectural leadership.
Good architects need to be empowered with authority, within a suitable organizational structure, to succeed. The committee’s impression is that on many projects (particularly when OMB rules are in play), architecting is treated as a parallel activity to engineering and development with an emphasis on producing the mandated artifacts. Successful architects often must negotiate requirements, influence acquisition, change budgets, and recommend significant changes on-the-fly. Unfortunately, traditional project management structures are often not conducive to “architecture-centric” efforts. Much can be learned from commercial practices (such as in software development), although some of these have the advantage of being single-product oriented efforts. Another important model, for systems of this scale, is the city planner model, which empowers planners via legislation and building codes.
As discussed in the section above on architectural leadership, more intellectual leadership, diversity of thought and approach, and more
people challenging the status quo with good ideas are needed. FAA’s contractor community itself possesses some of the necessary architectural skills and expertise and could be harnessed to help provide input to the overall NAS architecture design and evolution. The FAA could leverage the relationships it already has with contractors working on particular systems and components and solicit higher-level input and advice from them while not abdicating its own overall responsibility for the architecture.
One challenge is that while such a community would be helpful in long-range, pre-competitive environments (pre-RFP), there is little incentive for contractors competing for work to share their best ideas in multistakeholder forums. Addressing this structural issue will be a challenge. Some consideration should be given to the nature of how the architects are organized. The FAA may need to structure an alignment, for its contractors, of decision authority and responsibility that incentivizes them both to participate in this collective process and to share responsibility for outcomes. One successful model—rarely found in government projects, but found in building architecture—separates the organizations responsible for architecting from organizations responsible for implementation or operation and separates architect from client. The architect works for the client and oversees and certifies implementation. The architect’s role is especially critical in managing “systemic properties,” such as security, safety, availability, end-to-end performance, information management, and interoperability, across the portfolio of NAS elements in a coherent manner.
The architecture leadership will need to prioritize—focusing on those properties considered systemic and on the evolvability of the system as a whole, while delegating implementation choices to specific programs. Some government agencies create advisory panels or similar constructs to review and comment on plans and efforts. In the FAA’s case, it would be important to ensure that there are technically competent people, however the leadership community is created, who are familiar with system architecture and help stimulate creative thinking. One downside of advisory panels as typically structured is that they do not have an explicit stake in the outcome of the development or responsibility to stay with an effort if it goes astray. In whatever form it takes, an architecture leadership community is a locus within which to accomplish the following:
- Provide leadership regarding requirements negotiation, acquisition, budgeting, verification and validation, testing and certification, acceptance, integration, and key changes and their anticipated effects on the system as a whole.
- Define coherent overarching technical objectives to provide direction to incremental program steps.
- Manage and communicate horizontal dependencies better.
- Design and develop an architecture for change, incorporating both flexibility and evolvability, so as to be ready for unanticipated demands or changes in requirements. These changes may range from new hazards that will need to be modeled to new requirements imposed by legislation. Architectural models will need to manage a variety of potential constraints on the architecture.
- Determine appropriate approaches to information hiding and abstraction to support basic system infrastructure services. For example, the NextGen architecture does not address availability or cybersecurity in a comprehensive way, which in part derives from this lack of infrastructure abstraction.24 By abstracting infrastructure aggressively, the system avoids reliance on particular hardware choices, but can instead allow the hardware (and some systems layers above, perhaps) to stay current with mainstream computational infrastructure. If managed well, this confers the benefit of lower infrastructural operations and maintenance costs and the added advantage of a steady increase in capacity to meet rises in demand.
- Ensure that suitable methods are in use for aggregating systemic properties from the levels at which they are understood to higher levels of abstraction where they can be managed.25
- Ensure that mechanisms are in place to discover and share architectural techniques. There are multiple technical and engineering fields that are relevant and which have value to large, complex, software-intensive efforts such as NextGen. A key insight that architects can provide is that complex systems entail multiple concerns, and therefore require a multidisciplinary approach.
- Communicate changing circumstances and reset expectations among stakeholders as needed (as described in Chapter 1). This is perhaps the most important activity for which the community will be well positioned. The critical role of an effective architecture community is twofold: (1) playing offense by assessing new features, new value propositions, and deployment plans and persuading stakeholders with proposed changes that improve the efficiency or effectiveness of the system and (2) playing defense by quantifying and prioritizing critical risks, trading off mitigation approaches, and designing solutions that fit within programmatic constraints.
- Tap into the best ideas available worldwide, in public and private sector organizations, for air traffic control and airspace management.
24 The need to pay attention to availability was well illustrated by the recent ERAM failure described in Box 2.1. Just as security should begin with a threat model, design for availability should start with a comprehensive hazard analysis and proceed with suitable models of mitigation based on architectural specifics.
25 System-level issues such as safety, availability, and the like are typically assessed at the level of individual systems. It will be important to apply these assessment techniques at the system architecture level as well. Although there is progress in this area, that sort of abstraction remains very much an ongoing research topic.
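As a purely illustrative sketch of what aggregating a systemic property to a higher level of abstraction can mean, the following Python fragment rolls component availabilities up into a single figure. It uses the textbook series and parallel formulas and assumes that component failures are independent, which is a strong simplification. The function names and all numbers are hypothetical and are not drawn from FAA data or from the committee's analysis.

```python
from math import prod


def series_availability(parts):
    """Availability when every component must be up.

    Assumes independent failures, so the availabilities multiply.
    """
    return prod(parts)


def parallel_availability(parts):
    """Availability when at least one redundant component must be up.

    Assumes independent failures, so the unavailabilities multiply.
    """
    return 1.0 - prod(1.0 - a for a in parts)


# Hypothetical figures for illustration only (not FAA data).
radar_feed = parallel_availability([0.99, 0.99])  # two redundant feeds
service = series_availability([radar_feed, 0.999, 0.9995])  # feed, network, display

print(f"redundant feed availability: {radar_feed:.6f}")
print(f"end-to-end service availability: {service:.6f}")
```

Real hazards are often correlated (shared power, shared software, common-mode faults), so a model of this kind can at best serve as a starting point for the hazard analysis described in the footnotes above.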
To be clear, the committee does not urge the premature creation of more detailed specifications and artifacts absent deeper insights and stronger analyses of risks and trade-offs. In many ways, such efforts would be counterproductive, translating into more overhead (process and documentation) and less attention, resources, and expertise focused on better design, decisions, tests, and earlier integration. One failure pattern in the systems and software industry that the FAA should strive to avoid is building an extremely precise version of plans, scope, and architecture, with only an imprecise understanding of likely trade-offs, user needs, or the team’s capability. Additional premature precision often ultimately translates into future rework and waste.
There are some areas where commitments need to be deferred. Uncertainties associated with these deferred decisions can be encapsulated using suitably designed abstractions. Specific commitments should be deferred until the choice is both necessary to make and well informed (through precedent, modeling, simulation, prototyping, or other means) and both the likelihood and extent of adverse consequences of a wrong commitment are reduced. Hence the committee’s emphasis on abstraction (exposing appropriate information and functionality at appropriate levels) and perspective.26 Documentation practices, and architecting in general, should be agile: the simplest thing that works, with an emphasis on minimal documentation targeted to the specific needs of specific stakeholders.
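One way to picture how an abstraction can encapsulate a deferred decision is the minimal Python sketch below. It assumes a hypothetical messaging service whose concrete technology has not yet been chosen. The names `MessageTransport`, `InMemoryTransport`, and `publish_position` are invented for illustration and do not correspond to any NextGen component.

```python
from abc import ABC, abstractmethod


class MessageTransport(ABC):
    """Hypothetical abstraction: callers depend on this interface,
    not on any particular messaging technology."""

    @abstractmethod
    def send(self, topic: str, payload: bytes) -> None:
        ...


class InMemoryTransport(MessageTransport):
    """Placeholder for prototyping until the real choice is well informed."""

    def __init__(self) -> None:
        self.sent = []  # list of (topic, payload) pairs, kept for inspection

    def send(self, topic: str, payload: bytes) -> None:
        self.sent.append((topic, payload))


def publish_position(transport: MessageTransport, flight_id: str,
                     lat: float, lon: float) -> None:
    """Application logic written only against the abstraction."""
    transport.send("positions", f"{flight_id},{lat},{lon}".encode())


transport = InMemoryTransport()
publish_position(transport, "AB123", 1.5, -2.0)
print(transport.sent)
```

When the commitment is eventually made, a new subclass of `MessageTransport` can replace the placeholder without changes to `publish_position`, which is the sense in which the uncertainty stays contained behind the interface.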
Balance with regard to both documentation and technical commitments is important. Finding an appropriate balance between the level of specific technical commitment in a system-architectural model (with all the benefits of fixing on particular choices in this regard) and the ability to respond to the continuing rapid evolution of technology and infrastructure is key. This ability to respond is determined both by the quality of the abstractions built into the architectural models (at any given moment) and by the nimbleness of the process for updating and evolving the architecture. An unfortunate pattern that can result is that “architecture” comes to refer to a set of enduring standards that are generally divorced from the current state of technology and that, consequently, cannot yield the intended benefits of a true system architecture.
26 Additional precision and detail can and should be added incrementally at appropriate levels as sharper perceptions and more solid understandings evolve.
Governance and appropriate authorities are vital for developing an effective architecture. The previous section noted the lack of explicit connection between “leaf nodes” in the enterprise architecture and the program-level architectures. The programs will need to be compliant with specifications in the higher-level architecture and show clearly how they satisfy it—but facilitating such compliance does not require detailed specifications of all the programs in the architecture itself. Because programs are developed by other parts of the FAA that are not under control of the NextGen office, there needs to be a decision authority—informed and guided by an able architecture leadership community—to enforce governance.
The architectural leadership should be responsible for balancing tensions among a number of competing goals, such as innovation and stability or safety, value delivered and costs incurred, security and openness, uncertainty and predictability, process maturity and agility, and so on—avoiding an excessively bureaucratic process- and document-bound approach while providing sufficient leadership and direction. Governance will be needed to facilitate an effective process by which issues can be raised, debated, decided, recorded, and implemented. There are obvious potential complications and conflicts of interest in involving architects from contractors, from other agencies, and from among and between different FAA projects. Without appropriate architectural governance and enforcement, involving the industry’s stakeholders, the community will fail to exert influence on NAS development.
Commercial practice has long recognized the need for nurturing and identifying strong architectural skills. Global systems integrators and large government contractors have disciplined programs for technical career paths that attract professionals with exceptional skills for architecting, research, and innovation. For example, IBM has evolved corporate standards for technical roles and technical career paths, including systems and software architects. These prestigious positions are achieved through years of apprenticeship, a track record of accomplishment, and selection by a certification board composed of technical peers. Job titles such as “distinguished engineer” and “fellow” reflect highly influential roles that have executive standing within companies like IBM and Microsoft.
An architectural community represents a set of eclectic skills coupled with deep domain experience. Relatively few engineering-trained professionals can excel in architectural decision making.27 Because such talent is scarce, the FAA will be challenged to attract and retain it. The financial incentives and dynamic opportunities of commercial markets are significantly more attractive. Therefore, the FAA will need to look externally for these skills in resource pools such as academia, systems integrators, and professional societies.
27 For instance, at IBM, where there is a very large pool of technical employees (greater than 100,000), the percentage of distinguished engineers and fellows is less than 1 percent of the technical population.
The committee believes that the NextGen challenge represents a unique and attractive technical opportunity for qualified architects. NextGen is a world-class challenge with broad impact on the livelihoods of many people and a cornerstone of our way of life. It is the sort of system that easily ends up making news on the front page of every national newspaper if it works well, or if it does not. Marketing this opportunity and channeling the best of the available talent pool will require some technical leadership and compensation models that are foreign to the FAA and government contracts. The technical steering and technical governance model of the NAS requires some innovative thinking.
Recommendation: The Federal Aviation Administration (FAA) should initiate, grow, and engage a capable architecture community—leaders and peers within and outside FAA—who will expand the breadth and depth of expertise that is steering architectural changes.
Recommendation: The Federal Aviation Administration should conduct a small number of experiments among its system integration partners to prototype candidate solutions for establishing and managing a vibrant architectural community.
Recommendation: The Federal Aviation Administration should use an architecture leadership community and an effective governance approach to assure a proper balance between documents and artifacts and to provide high-level guidance and a capability that (1) enables effective management and communication of dependencies, (2) provides flexibility and evolvability to ensure accommodation of future needs, and (3) communicates changing circumstances in order to align expectations.