This report responds to a request from the U.S. Department of Defense (DOD) to identify engineering practices that have proved successful for system development and testing in industrial environments. It is the latest in a series of studies by the National Research Council (NRC), through the Committee on National Statistics, on the acquisition, testing, and evaluation of defense systems. The previous studies have been concerned with the role of statistical methods in testing and evaluation, reliability practices, software methods, combining information, and evolutionary acquisition. This study was sponsored by DOD’s Director of Operational Test and Evaluation (DOT&E) and the Under Secretary of Defense for Acquisition, Technology, and Logistics (USD-AT&L). It was conducted by the Panel on Industrial Methods for the Effective Test and Development of Defense Systems.
The study panel’s charge was to plan and conduct a workshop to explore how developmental and operational testing, modeling and simulation, and related techniques can improve the development and performance of defense systems, particularly techniques that have been shown to be effective in industrial applications and are likely to be useful in defense system development. This workshop, which featured speakers describing practices from the software and hardware industries, was the panel’s main fact-finding activity.
We emphasize that we could not, and did not, carry out a comprehensive literature review or examination of industrial and engineering methods for system development. Rather, drawing on information from
the workshop and the experience and expertise of the panel’s members, we focused on the techniques that have been found to be useful in industrial system development and their applicability to the DOD environment, while acknowledging the differences in the two environments. To that end, we also considered the availability and access to data (especially test data), the availability of engineering and modeling expertise, and the organizational structure of defense acquisition.
Many, perhaps even most, of the industrial practices we discuss and recommend are or have been used in DOD, but they are not systematically followed. We do not offer new policy or procedural recommendations when (1) the techniques are already represented in DOD acquisition policies and procedures, (2) DOD has been trying to implement the desirable practices, or (3) the desirable practices have previously been recommended in other NRC reports or by other advisory bodies. In these cases we reiterate the benefits of and the need to fully adopt and follow the relevant policies, procedures, and practices. We do offer recommendations when reviewing policies, procedures, and practices that are new or have new elements, or when we find that the defense acquisition community is moving in the wrong direction.
Conclusion 1: It is critical that there be early and clear communication and collaboration with users about requirements. In particular, it is extremely beneficial for users, developers, and testers to collaborate on initial estimates of feasibility and for users then to categorize their requirements into a prioritized list of “must haves” and a “wish list” that can be used for trade-offs at later stages of system development if necessary.
Although communication with users is common in defense acquisition, the emphasis at the workshop was on a continuous exchange with and involvement of users in the development of requirements. In addition, the industrial practice of asking customers to separate their needs into a list of “must haves” and a “wish list” forces customers to carefully examine a system’s needs and capabilities and any discrepancies between them and thus make decisions early in the development process. It is also important to use input from the test and evaluation community in the setting of initial requirements.
Conclusion 2: Changes to requirements that necessitate a substantial revision of a system’s architecture should be avoided as they
can result in considerable cost increases, delays in development, and even the introduction of other defects.
Having stable requirements during development allows the system architecture to be optimized for a specific set of specifications, rather than modified in a suboptimal manner to try to accommodate various updates and changes over time. However, there must also be some flexibility to allow modifications that are responsive to users’ needs and changing environments. Although existing DOD regulations mandate that changes in requirements go through a rigorous engineering assessment before they are approved, these regulations do not appear to be strictly enforced.
Conclusion 3: Model-based design tools are very useful in providing a systematic and rigorous approach to requirements setting. There are also benefits from applying them during the test generation stage. These tools are increasingly gaining attention in industry, including among defense contractors. By providing a common representation of the system under development, they will also enhance interactions with defense contractors.
The term “model-based design tools” refers to formal methods used to translate and quantify requirements from high-level system and subsystem specifications, assess the feasibility of proposed requirements, and help examine the implications of trading off various performance capabilities (including aspects of effectiveness and suitability, such as durability and maintainability). This approach has also been called model-based engineering. In addition to rigorously assessing the feasibility of proposed requirements and helping assess the results of “lowering” some requirements while “raising” others, model-based design tools are known to provide a range of benefits: as a formal specification of the intended functionality, they document the requirements; because the model is executable, ambiguities can be identified; the model can be used to automatically generate test suites; and, possibly most importantly, the model captures knowledge that can be preserved.
DOD should have expertise in these tools and technologies and use them with contractors and users. More broadly, DOD should actively participate, if not lead, in the development of model-based design tools.
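As a minimal, hypothetical sketch of the idea (not drawn from this report or any DOD system), a requirement can be captured as an executable state-machine model; because the model is executable, candidate behaviors can be checked against it, and a covering test suite can be generated automatically from it. All names here (states, events) are illustrative assumptions.

```python
# Hypothetical requirement model: valid operating-mode transitions for a
# notional subsystem, expressed as a state machine. States, events, and
# the subsystem itself are illustrative, not taken from any real system.
MODEL = {
    "OFF":     {"power_on": "STANDBY"},
    "STANDBY": {"arm": "ARMED", "power_off": "OFF"},
    "ARMED":   {"fire": "STANDBY", "disarm": "STANDBY"},
}

def simulate(events, start="OFF"):
    """Execute the model; raise if an event is invalid in the current state,
    which is how ambiguous or infeasible requirements surface early."""
    state = start
    for e in events:
        if e not in MODEL[state]:
            raise ValueError(f"event {e!r} invalid in state {state!r}")
        state = MODEL[state][e]
    return state

def generate_tests(depth=3, start="OFF"):
    """Enumerate every valid event sequence of up to `depth` transitions:
    an automatically generated test suite covering all paths of that length."""
    suites = []
    def walk(state, path):
        if path:
            suites.append(list(path))
        if len(path) == depth:
            return
        for event, next_state in MODEL[state].items():
            path.append(event)
            walk(next_state, path)
            path.pop()
    walk(start, [])
    return suites
```

In this toy model, `generate_tests(depth=2)` yields three test sequences, and each generated sequence is guaranteed to be executable against the model, which is the linkage between requirements setting and test generation described above.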
Recommendation 1: The Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics and the Office of the Director of Operational Test and Evaluation of the U.S. Department of Defense and their service equivalents should acquire expertise
and appropriate tools related to model-based approaches for the requirements setting process and for test case and scenario generation for validation.
DESIGN AND DEVELOPMENT
Technological Maturity and Assessment
Conclusion 4: The maturity of technologies at the initiation of an acquisition program is a critical determinant of the program’s success as measured by cost, schedule, and performance. The U.S. Department of Defense (DOD) continues to be plagued by problems caused by the insertion of immature technology into the critical path of major programs. Since there are DOD directives that are intended to ensure technological readiness, the problem appears to be caused by lack of strict enforcement of existing procedures.
Technological immaturity is known to be a primary cause of schedule slippage and cost growth in DOD program acquisition. Many studies, including those of the National Research Council (2011), the Defense Science Board (1990), and the U.S. General Accounting Office (1992) and its successor, the U.S. Government Accountability Office (2004), have discussed the dangers associated with inserting insufficiently mature technologies in the critical path of DOD design and development.
Recommendation 2: The Under Secretary of Defense for Acquisition, Technology, and Logistics (USD-AT&L) should require that all technologies to be included in a formal acquisition program have sufficient technological maturity, consistent with TRL (technology readiness level) 7, before the acquisition program is approved at Milestone B (or earlier) or before the technology is inserted in a later increment if evolutionary acquisition procedures are being used. In addition, the USD-AT&L or the service acquisition executive should request the Director of Defense Research and Engineering (the DOD’s most senior technologist) to certify or refuse to certify sufficient technological maturity before a Milestone B decision is made. The acquisition executive should also
• review the analysis of alternatives assessment of technological risk and maturity;
• obtain an independent evaluation of that assessment as required in DOD instruction (DODI) 5000.02; and
• ensure, during developmental test and evaluation, that the materiel developer shall assess technical progress and maturity against critical technical parameters that are documented in the Test and Evaluation Master Plan (TEMP).
A substantial part of the above recommendation is currently required by law or by DOD instructions. Moreover, earlier NRC reports have made similar recommendations. Nonetheless, DOD has been moving in the wrong direction regarding the enforcement of an important and reasonable policy stated in DODI 5000.02.
Conclusion 5: The performance of a defense system early in development is often not rigorously assessed, and in some cases the results of assessments are ignored; this is especially so for suitability assessments. This lack of rigorous assessment occurs in the generation of system requirements; in the timing of the delivery of prototype components, subsystems, and systems from the developer to the government for developmental testing; and in the delivery of production-representative system prototypes for operational testing. As a result, throughout early development, systems are allowed to advance to later stages of development when substantial design problems remain. Instead, there should be clear-cut decision making during milestones based on the application of objective metrics. Adequate metrics do exist (e.g., contractual design specifications, key performance parameters, reliability criteria, critical operational issues). However, the primary problem appears to be a lack of enforcement.
Defense systems should not pass milestones unless there is objective quantitative evidence that major design thresholds, key performance parameters, and reliability criteria have been met or can be achieved with minor product improvements.
Conclusion 6: There are substantial benefits to the use of staged development, with multiple releases, of large complex systems, especially in the case of software systems and software-intensive systems. Staged development allows for feedback from customers that can be used to guide subsequent releases.
The “agile development” process for software systems (discussed at the workshop) is a disciplined framework that ensures that best practices
are consistently used throughout system development. A staged development appears to be natural for large-scale complex software systems, and it may also be appropriate for some hardware systems. Each of the stages must retain the functionality of its predecessor systems, at the very least to satisfy the natural expectations of the customer over time.
The panel supports the recommendations on testing that have appeared in previous reports on this topic by the NRC. These recommendations have addressed the following issues:
• the importance of comprehensive test planning (National Research Council, 1998)
• the benefits from use of state-of-the-art experimental design principles and practices (National Research Council, 1998)
• the potential benefits from combining information for operational assessment (National Research Council, 1998)
• that testing should be carried out with an operational perspective (National Research Council, 2006)
• that testing should give greater emphasis to suitability (National Research Council, 1998)
• the benefits from the use of accelerated reliability testing methods (National Research Council, 1998)
COMMUNICATION, RESOURCES, AND INFRASTRUCTURE
Conclusion 1 highlights the need for early and clear communications about requirements. In addition, industry representatives at the workshop stressed the importance of collaboration and communication among customers and program developers, as well as participants across all aspects of system development and testing to avoid long, costly, and unsuccessful product development programs. Leading industrial companies have established programs to promote higher levels of collaboration among suppliers, manufacturers, customers, service organizations, and the ultimate users of the product.
A Data Archive
Conclusion 7: A data archive containing developmental test, operational test, and field data will provide a common framework for discussions on requirements and priorities for development. In addition, it can be used to expedite the identification and correction of design flaws. Given the expense and complexity of developing such an archive, it is important that its benefits be adequately demonstrated to support its development.
The collection and analysis of data on test and field performance, including warranty data, is a standard feature in commercial industries. The development of a data archive has been discussed in previous NRC reports, and we repeat its importance here. One possible reason for DOD’s failure to establish a data archive is the lack of an incentive to support this or any other central activity. DOD needs to be convinced of the advantages of building and maintaining such a database and then to commission an appropriate group of people with experience in program development to develop a concrete proposal on how the data archive should be structured.
Recommendation 3: The U.S. Department of Defense should create a defense system data archive containing developmental test, operational test, and field performance data from both contractors and the government. Such an archive would achieve several important objectives in the development of defense systems:
• substantially increase DOD’s ability to produce more feasible requirements,
• support early detection of system defects,
• improve developmental and operational test design, and
• improve modeling and simulation through better model validation.
As DOD plans the creation of a defense system data archive, at least three issues need immediate resolution: (1) whether the archive should be DOD-wide or stratified by type of system to limit its size, (2) what data are to be included and how the data elements should be represented to facilitate linkages of related systems, and (3) what database management structure should be used. A flexible architecture should be used so that if the archive is initially limited to a subset of the data sources recommended here due to budgetary considerations, it can be readily expanded over time to include the remaining sources.
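To make the second of these issues concrete, the following is a minimal, hypothetical sketch of a record structure that links developmental test, operational test, and field observations through a shared component identifier; the field names, phases, and component labels are illustrative assumptions, not a proposed DOD schema.

```python
# Hypothetical sketch: a minimal archive record that links developmental
# test (DT), operational test (OT), and field data by component, so the
# history of a component can be traced across the systems that use it.
# All field names and example values are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class TestRecord:
    system: str       # program or platform identifier
    component: str    # shared key that links related systems
    phase: str        # "DT", "OT", or "FIELD"
    metric: str       # e.g., a reliability measure such as "MTBF_hours"
    value: float

class Archive:
    def __init__(self):
        # Index records by component to support cross-system queries.
        self._by_component = defaultdict(list)

    def add(self, record: TestRecord):
        self._by_component[record.component].append(record)

    def history(self, component, metric):
        """All observations of a metric for one component, across systems
        and phases: the cross-system linkage the archive is meant to enable."""
        return [(r.system, r.phase, r.value)
                for r in self._by_component[component]
                if r.metric == metric]
```

Querying a component’s history across programs and phases is precisely the feedback that supports more feasible requirements and earlier defect detection; in practice this structure would sit on top of whatever database management system is selected in issue (3).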
Conclusion 8: Feedback loops can significantly improve system development by strengthening both developmental and operational test design and the use of modeling and simulation. Feedback systems can function similarly to the warranty management systems that have proved essential in the automotive industry. To develop feasible requirements, it is extremely useful to understand how components installed in related systems have performed when fielded, in order to understand their limitations for possibly more stressful use in a proposed system. To support such feedback loops, data on field performance, test data, and results from modeling and simulation must be easily accessible, which highlights the necessity of a test and field data archive.
Field performance data are the ultimate indicators of how well a system is functioning in operational conditions. By field performance data, we also mean data on all the circumstances that can have an impact on the quality of the components, subsystems, and systems. These data include all relevant pre- and postdeployment activities, including transportation, maintenance, implementation, and storage. They could also include training data, if such data were collected objectively. Such information can and should be used to better understand the strengths and weaknesses of newly fielded systems in undertaking various missions, including such tactical information as identifying the scenarios in which the current system should and should not be used. Unfortunately, these data are rarely archived in a way that facilitates analysis.
Recommendation 4: After a test and field data archive has been established, the Under Secretary of Defense for Acquisition, Technology, and Logistics (USD-AT&L) and the acquisition executives in the military services should lead a U.S. Department of Defense (DOD) effort to develop feedback loops on improving fielded systems and on better understanding tactics of use of fielded systems. The DOD acquisition and testing communities should also learn to use feedback loops to improve the process of system development, to improve developmental and operational test schemes, and to improve any modeling and simulation used to assess operational performance.
SYSTEMS ENGINEERING EXPERTISE
Conclusion 9: In recent years, the U.S. Department of Defense has lost much of its expertise in all the key areas of systems engineering. It is important to regain in-house capability in areas relating to the design, development, and operation of major systems and subsystems. One such area is expertise in the model-based design tools discussed earlier.
Commercial companies place a great deal of importance on systems engineering expertise, which is key for system development as well as for requirements setting, model development, and testing. Unfortunately, DOD’s expertise in systems engineering was decimated by congressionally mandated manpower reductions in the late 1990s and additional reductions by the services. DOD has recognized this problem and is taking steps to rectify it. However, given the time it will take to build up that expertise in house, DOD should examine the short-term use of contractors, academics, employees of national laboratories, and others.
Enforcement of DOD Directives and Procedures
Conclusion 10: Many of the critical problems in the U.S. Department of Defense acquisition system can be attributed to the lack of enforcement of existing directives and procedures rather than to deficiencies in them or the need for new ones.
As workshop participants noted, there are many studies, documents, and DOD procedures relating to best practices. The problem is that they are not systematically followed in practice.
Role of a DOD Program Manager
The role of program manager is noticeably different in industry than in DOD. In industry, the program manager’s tenure covers the entire product realization process, from planning, design, development, and manufacturing to even initial phases of sales and field support, and the program manager is fully responsible and accountable for all of these activities. This tenure ensures a smooth transition across the different phases of acquisition, as well as transfer of knowledge. In contrast, in DOD the tenure of a program manager rarely covers more than one phase of a project, and there is little accountability. Moreover, there is little incentive for a DOD program manager to take a comprehensive approach to seek and discover system defects or design flaws.
Recommendation 5: The Under Secretary of Defense for Acquisition, Technology, and Logistics should provide for an independent evaluation of the progress of ACAT I systems in development when there is a change in program manager. This evaluation should include a review by the Office of the Secretary of Defense (OSD), complemented by independent scientific expertise as needed, to address outstanding technical, manufacturing, and capability issues, to assess the progress of a defense system under the previous program manager, and to ensure that the new program manager is fully informed of and calibrated to present and likely future OSD concerns.
Clearly, there are many details and challenges associated with developing and implementing this recommendation that are beyond the panel’s scope and expertise. However, we emphasize that there are systemic problems with the current system of program management and that they are serious obstacles to the implementation of efficient practices.