Through the Cooperative Threat Reduction (CTR) Program, the United States works with partner countries to address threats of mutual concern that could manifest in, transit through, or emanate from their territory.1 Congress directed the Secretary of Defense to “develop and implement metrics to measure the impact and effectiveness of activities of the CTR Program of the Department of Defense [DoD] to address threats arising from the proliferation of chemical, nuclear, and biological weapons and weapons-related materials, technologies, and expertise” (Section 1304, P.L. 111-84, see Appendix A). The Secretary completed a report describing DoD’s metrics for the CTR Program (DoD, 2010; here called the DoD Metrics Report, see Appendix B) in September 2010 and, as required in the same law, contracted with the National Academy of Sciences to review the metrics DoD developed and identify possible additional or alternative metrics, if necessary. This report provides that review and advice.
What is a metric? As implied in the language cited above, a metric is an evidence-based tool that measures impact and effectiveness, which can be defined in terms of the performance of a program or a project with respect to its objectives. Metrics alone cannot ensure that the best options have been identified or are being implemented, nor by themselves can they tell managers and decision makers why progress is or is not being made (see What Metrics Cannot Do, in Chapter 3), but they can be helpful in establishing reference points and informing decision makers of whether the efforts are bearing fruit.
The CTR Program’s early focus was primarily on dismantlement and nonproliferation of weapons of mass destruction (WMD), WMD materials, and WMD expertise in the former Soviet Union. The Program has evolved both geographically and topically to include work aimed at improving partner nations’ abilities to deter, detect, and respond to emerging WMD threats, and the Program is envisioned to shift further in this direction. In sum, the Program has shifted from dealing with specific sources of known risk—weapons, weapons materials, and expertise that had been associated with actual WMD programs—to dealing with potential sources of future risk. This is particularly true of the CTR Cooperative Biological Engagement Program (CBEP). CBEP’s work resides in a gray mission space that overlaps with public health activities. Impact, effectiveness, and success are difficult to measure in such efforts: there are no simple analogs to counting the number of delivery vehicles destroyed or the fraction of weapons-usable nuclear material secured or eliminated that can effectively measure the impact of a complex capacity-building program such as CBEP. Developing metrics for a partner’s capabilities and for the personal and institutional relationships established, which are the main products of a successful capacity-building program, is complex and challenging. As a result, the committee concludes that it is possible to successfully accomplish what is easily measurable and still fail in the engagement.
1 The CTR Program began in 1991 as a means of assisting the former Soviet Union and later additional countries with strategic offensive arms elimination; nuclear warhead dismantlement; nuclear weapons storage security; chemical weapons destruction; biological weapons proliferation prevention; reactor core conversions; nuclear material protection, control and accounting; export control initiatives; defense conversion; as well as other projects. Across the U.S. Government, CTR projects are administered by DoD, the Department of Energy, the Department of Commerce, and the Department of State. This report addresses only DoD programs and projects. The DoD CTR Program was incorporated into the Defense Threat Reduction Agency (DTRA) when the agency was established in 1998.
ASSESSMENT OF THE DOD METRICS REPORT
The committee evaluated the DoD CTR metrics described in the DoD Metrics Report based on whether the metrics provide decision makers with the information essential to manage the effectiveness and impact of CTR programs. Most of the CTR programs try to develop a sustainable capability of some sort through a set of projects implemented with partner countries on their territory. The DoD Metrics Report contains reasonable metrics for the CTR programs that consolidate and eliminate weapons and weapons materials,2 and a solid starting point for developing metrics for the newer capacity-building programs. For meaningful evaluation, the committee assessed that DoD must (1) state the objectives of the program and the projects (i.e., the goals of the actual activities); (2) identify the capabilities it is trying to develop or maintain; (3) link those capabilities to metrics; (4) ensure that the metrics reflect program effectiveness and impact; and (5) plan for and measure sustainment.3 It is generally good practice for the program to establish minimum performance levels (performance that must be achieved for the project not to fail) and aspirational goals (the desired performance above the minimum) for each metric.4
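The practice of pairing a minimum performance level with an aspirational goal for each metric can be illustrated with a small, hypothetical scoring sketch. The metric, threshold values, and numbers below are invented for illustration and are not drawn from the DoD Metrics Report.

```python
# Hypothetical sketch: scoring a metric against a minimum performance
# level (performance that must be achieved for the project not to fail)
# and an aspirational goal (desired performance above the minimum).
# The metric name and all numbers are illustrative assumptions.

def score_metric(value, minimum, aspiration):
    """Return a 0-1 score: 0 below the minimum performance level,
    1 at or above the aspirational goal, linear in between."""
    if value < minimum:
        return 0.0
    if value >= aspiration:
        return 1.0
    return (value - minimum) / (aspiration - minimum)

# Illustrative metric: fraction of border crossings with functioning
# radiation detectors (invented for this sketch).
print(score_metric(0.50, minimum=0.40, aspiration=0.90))  # above minimum, short of goal
print(score_metric(0.30, minimum=0.40, aspiration=0.90))  # below minimum: fails
```

A scheme like this makes explicit both whether a project has cleared its floor and how far it remains from the desired performance, rather than reporting a single pass/fail judgment.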
• Overall, the DoD Metrics Report describes CTR’s highest-level objectives, and the difficulties in developing metrics, clearly and succinctly in its introductory section. However, the report does not connect these objectives to threats for the capacity-building programs. This is not to say that there is no connection, nor that the explanation does not appear in other documents,5 just that the DoD Metrics Report does not describe the connection. For example, describing the connection would clarify how building the capacity to better track respiratory disease in East Africa reduces threats to U.S. national security. The lack of a concise statement of the objectives of each program, and of how the actions planned under the program are intended to reduce threat or risk, is a deficiency that makes the Report less effective for communicating with people outside the program and makes internal development and refinement of metrics more difficult.
• CTR programs are meant to be partnerships and work best when they are carried out as partnerships with other countries (hence, cooperative threat reduction). This includes joint development of the objectives with the partner countries. The DoD Metrics Report does not make clear whether and how partner countries participate in determining objectives and metrics, or the partner country’s role in measurement. The report reads as if metrics are U.S. measurements of the partner’s progress toward U.S. goals rather than the impact and effectiveness of the program measured with jointly agreed metrics.
2 The committee did not address the Strategic Offensive Arms Elimination Program, which is not discussed in the DoD Metrics Report because DoD plans to use the long-standing “Nunn-Lugar Scorecard” metrics for that program.
3 In its report, DoD seems to use the term sustainability to refer to both the ability to sustain the program, security state, or other improvements of the CTR programs, and to the actual result or act of sustaining them. The committee refers to the former as sustainability and the latter as sustainment.
4 In capabilities-based planning, these are referred to as threshold values and objective values. Because the term “objective” is used in so many different ways, the committee has not adopted this usage.
5 See, for example, Nacht (2009).
• DoD did not use a consistent framework for developing and articulating program objectives, capacities, and metrics and did not prioritize among its metrics. As a consequence, the DoD Metrics Report mixes project management measures with higher-level program performance metrics for some of the CTR programs and weights equally metrics that are critically important and others that are not.
• DoD plans to leverage other U.S. Government agencies’ experience, capabilities, and assets as CTR expands to new countries and as it continues existing programs, and DoD needs to communicate, coordinate, and cooperate with the relevant agencies. At the same time, most of DoD’s metrics do not obviously draw on or tie into metrics developed by other agencies for similar or related purposes and programs.
• The DoD Metrics Report deliberately does not consider future missions or changes in objectives, and for some programs it does not take into consideration or explicitly discuss planned and unplanned change over time. Projects pass through different phases, each making a different kind of progress toward the program’s impact and effectiveness. Circumstances also change, including the threats, the political environment, the program’s priorities, and the funding available. A project’s performance itself feeds back into the management and decision-making process and can lead to change. It will be difficult to address change and sustainability without considering and incorporating these factors explicitly.
The practical consequences of some of these shortcomings, such as not articulating the connection to threat or risk, might not be large for projects in progress under longstanding agreements. But they are important for good management, new projects, and especially for new partnerships.
The committee has several recommendations for DoD on how it can improve its metrics for the CTR programs and its reports on CTR. The committee does not recommend specific alternative metrics (DoD has to develop its own metrics with its partners), but does recommend what it considers a more effective approach for DoD to take in developing metrics. In the report, the committee provides an example of its recommended approach applied to CBEP.
Objectives and Partnership
1. For each program in the DoD Metrics Report, DoD should include a concise statement of its objectives and of how the program is intended to reduce threat or risk.
2. Objectives for projects and the overall CTR Program in a partner country should be developed jointly by the United States and the partner country. An agreed set of metrics should also be built into projects from the outset. The metrics may change, but the parties responsible for the projects should know at any given time the metrics that will be used to measure impact and effectiveness.
The CTR Program was established by Congress with clear authorities, and each activity must begin with a clear statement of the United States’ authorized overall objectives. DoD then needs to work with partner nations to define mutual objectives for their joint efforts. To measure impact and effectiveness, metrics must include outputs (e.g., changes in apprehension rates at borders), not just inputs (e.g., training materials provided).6 Where possible, DoD should develop its metrics from outputs linked to the capacities that the programs are trying to build, and incorporate into the DoD-partner agreement provisions for metrics and the means to carry out those measurements.
As DoD takes CTR to new countries, it has opportunities to utilize lessons from 20 years of experience with cooperative threat reduction and develop metrics in a logical way, integrating them in the programs and in the projects from the beginning. The committee summarizes a logical order to developing metrics as follows.
1. State clearly the objectives of the overall U.S. CTR Program, including linkage to threat or risk.
2. Work with the partner country to define objectives for joint activities in their country. U.S. goals and partner-country goals do not need to be congruent (match exactly), but they must be compatible and should be explicitly stated.
3. Identify the partner capacity needed to meet the U.S. CTR Program objectives.7
4. Work with the partner country to define capacity development objectives for each capability. These may be prioritized based on their anticipated impact and the resources required to achieve good results (together yielding “bang for the buck”).
5. Define metrics with the partner country based on capacity objectives. Agree on baselines, data milestones, and measures of success. Identify the source of data collected for each metric and who will provide and maintain the data. Ideally, an entity independent of the decision makers and implementers would collect and report on the performance. Different metrics may be appropriate for different stages.
6. Prioritize the metrics based on their importance to achieving the program objectives and the increase in capacity required.
7. Build metrics, including exercises if appropriate, into the implementation.
8. Provide metrics data, in addition to the time and costs expended, for each project. Evaluate results independently (United States only) and together with the partner country.
9. Feed evaluation back into the U.S. and partner country decision-making process.
3. The committee judges that using a consistent framework to prioritize and refine metrics within a program would help DoD and other CTR decision makers. Using such a framework, DoD can identify the highest-priority metrics, ensure that the metrics are usable and useful, and allow decision makers to feed results back into the overall CTR objectives and budgetary process. Any of several decision-making or prioritization frameworks would work, including the decision analysis technique of swing-weight analysis and the DoD capabilities-based planning process.
6 It may not be possible to directly measure the higher-level outcomes of some CTR programs’ or projects’ performance, even with the best metrics. For example, DoD may never know how many illegal shipments were not interdicted at a border crossing assisted by a CTR program, or how many patients sick with an illness of interest did not go to the hospital. Indeed, it can be difficult to interpret the meaning of a change in a metric. Recognizing this fact at the outset will help to avoid wasted time and effort. These challenges are discussed in greater depth in Chapter 3 (see, for example, the section on Metrics for the Weapons of Mass Destruction Proliferation Prevention Program).
7 This mapping need not be one to one; a given partner capacity may support more than one objective.
DoD developed each set of metrics in the DoD Metrics Report ab initio, using a different approach for each program (chemical, biological, borders, and nuclear). Although each CTR program is distinct and possibly unique, without a consistent framework it is more difficult to be comprehensive, consistent, and focused. An important component of measuring what matters is prioritizing among goals and within sets of metrics. Not all goals are equally important, and the act of prioritization will help avoid double counting and similar pitfalls. DoD should consider using a consistent framework to prioritize and refine metrics within a program. There are many decision and prioritization frameworks. In this report, the committee highlights the decision analysis technique called swing-weight analysis,8 and the framework used widely in DoD called capabilities-based planning. Using a consistent framework does not mean that the metrics for each CTR program or program element should be the same, but that a common framework should be used for defining program objectives, partner capacities, capacity objectives, and metrics.
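The mechanics of swing-weight prioritization can be sketched briefly. In the hypothetical example below, each metric is assigned an unnormalized swing weight reflecting both its importance to the decision and how much its realistic range of variation (its “swing”) differentiates between alternatives; the weights are then normalized so that priorities sum to one. The metrics, importance judgments, and numbers are invented for illustration and are not taken from the DoD Metrics Report.

```python
# Hypothetical sketch of swing-weight prioritization. A metric that is
# very important to the decision, and whose realistic swing from worst
# to best case matters greatly, receives a high raw weight; a metric
# that barely differentiates between alternatives receives a low one.
# All names and values below are illustrative assumptions.
raw_swing_weights = {
    "pathogen samples consolidated (%)": 100,  # very important, large swing
    "disease reports filed on time (%)": 60,   # important, moderate swing
    "staff trained (count)": 25,               # input measure, small swing
}

# Normalize so the weights sum to 1 and can be read as priorities.
total = sum(raw_swing_weights.values())
normalized = {m: w / total for m, w in raw_swing_weights.items()}

# Report metrics in priority order, highest weight first.
for metric, weight in sorted(normalized.items(), key=lambda kv: -kv[1]):
    print(f"{weight:.2f}  {metric}")
```

The point of the exercise is not the arithmetic but the explicit, documented judgment it forces: program staff and partners must agree on which metrics matter most before results start arriving.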
Working with Other Agencies
4. DoD should leverage other U.S. Government agencies’ experience, capabilities, and assets as CTR expands to new countries and as it continues existing programs, and should communicate, coordinate, and cooperate with the relevant agencies.
DoD is not the only agency engaged in capacity-building programs (such as those for deterrence, detection, and defense against WMD). U.S. Customs and Border Protection operates a program with equipment, training, and services similar to the CTR Proliferation Prevention Program along many thousands of kilometers of border and has developed metrics for its mission and operations. The U.S. Department of Agriculture’s Animal and Plant Health Inspection Service is a leader in an international surveillance network that has many parallels to the global network DoD leadership envisions for reducing biothreats. The United States Agency for International Development operates capacity building programs with partners across the world to foster democratic institutions and has had to develop metrics for these hard-to-measure efforts. The DoD Defense Security Cooperation program also shares important similarities with the DoD CTR Program and may in some cases serve as a model.
DoD’s CTR metrics do not address the full scope of the threat in each of the WMD areas. Instead, the metrics address only the scope of the funded projects. While the project metrics are useful in assessing annual project status, they do not help Congress and senior leaders in DoD and other parts of the government to understand the full scope of the potential for cooperative threat reduction, which could help identify the need for and scope of future projects.
8 A swing weight matrix defines the importance and range of variation for a set of metrics. The idea is straightforward: a metric that is very important to the decision should be weighted higher than one that is less important, and a metric that differentiates between alternatives should be weighted more than one that does not.
Not only can DoD CTR learn from these other agencies and programs, but DoD will also be working with them in a “whole of government” effort. These agencies and others might already have mechanisms in place for measuring impact and effectiveness that could be useful to DoD CTR.
Time and Change
5. DoD’s metrics and planning process should factor in more explicitly both planned and unplanned change over time. Sustainment is its own stage and requires its own resources (budgets, equipment, and trained people). During the phases of active DoD involvement in a CTR project, and afterward during sustainment, clearer planning for how changes and metrics results will feed into decision making will make the metrics more credible and useful for both DoD and the partner country.
The purpose of metrics is to inform CTR Program decision makers. Effectiveness and impact metrics should be tracked over time, and impact and effectiveness should be compared to the time and resources expended. In the early stages, even when operating according to plan, projects are unlikely to have measurable impact, but that does not mean that they are not on track or will not have an impact. Different project stages require different metrics as measures of progress. CTR Program managers and other U.S. Government decision makers need this information to ensure that project resources are achieving the program objectives. If progress is not satisfactory, DoD and the partner country can also use this information to signal the need to develop and implement corrective action plans so that resources can be better utilized or reallocated. Explicitly factoring in change will help DoD with the clarity and completeness of the metrics. When objectives change due to changing circumstances, managers may need to change the metrics, as the original metrics may no longer be relevant. A phased, adaptive approach will enable DoD to use appropriate metrics at each stage of a project so that the right kind of impact and effectiveness can be shown even at the beginning of a project. This is especially true for sustainment.
6. Capacity building programs need independent evaluation of how the capabilities being built perform in action. This can be accomplished by several means, ranging from expert observations of routine operations to comprehensive exercises that test the full scope of capabilities. The level of effort can be tailored to the scope of the program, its resources, and its relative importance. DoD and its partners should build such independent evaluation into each project. The Defense Security Cooperation Program might be a good model for how to proceed.
Independent evaluation establishes a degree of credibility that is hard to achieve by other means. Especially for capacity building programs, some kind of independent evaluation is essential. Exercises are a good way to measure effectiveness and sustainment. The kind of
evaluation employed needs to be tailored to the scope of the program, its resources, and its relative importance. The evaluation might take the form of periodic expert observations of the project operations or the partner country’s capacities. It might be an impromptu test of a randomly selected part of the system (e.g., a border protection system). Or it might be an exercise of the system, such as was performed for CBEP in Georgia as part of the Initial Operational Capability assessment. Ideally, the exercises would be designed by an entity independent from the groups being tested.
Measuring progress in building capacity and effectiveness of programs to prevent low-frequency, high-consequence events is difficult. Assessment against standards and guidelines (e.g., is the partner’s action plan for interdicted nuclear material consistent with the International Atomic Energy Agency model action plan?) is one important component, but DoD and Congress care more about likely performance when the event occurs (i.e., how effectively can the partner implement the action plan?). Exercises can help measure both capability and performance in such programs. While exercises might be structured differently in different countries and for different projects, they should be built into the implementation and evaluation components of capacity-building projects and programs from the beginning, starting with a baseline evaluation and proceeding with midcourse and final or sustainment evaluations.
OTHER MAJOR ISSUES FOR CTR IN THE FUTURE
Some Congressional authorizers and appropriators have questioned whether CBEP should in fact be a part of DoD’s CTR mission. It is beyond the scope of this study to comment on what is inside or outside the scope of the CTR mission. Fundamentally, whether DoD should be the agency to carry out the CBEP mission is only secondarily a metrics question. The primary questions are whether the U.S. Government wishes to prioritize the work done under CBEP and what mix of government agencies is best equipped to carry it out. The increases in budget and scope to date, as well as the National Strategy for Countering Biological Threats, indicate that CBEP is a growing priority for the Administration, and DoD’s involvement reflects a conscious choice to use DoD for this mission because of its experience with biodefense research and with threat reduction programs. Critics may dispute these decisions, and can legitimately point out that difficulties in developing reliable direct metrics for impact and effectiveness increase the program’s risks, but mixing these issues with questions about the metrics themselves confuses matters and makes it more difficult to make progress in the program and in the debate about it.
Finally, defining and measuring completion—how do we know when we are done?—and sustainability—will the improvements take hold and will the partner nation support and sustain the programs when U.S. funding stops?—are critically important for CTR programs, particularly capacity-building programs. What completion and sustainability mean and how they should be implemented and measured for a given program should be part of the formulation of objectives.
There is a mismatch in the vision of sustainability and measuring completion of the program among different CTR decision makers in the Administration and on Capitol Hill. One vision might be called a project view, in which DoD partners with a country, engages in a set of concrete activities with a well-defined beginning and end, and then DoD exits and monitors sustainment after project completion. The other main vision might be called a relationship view, in which DoD partners with a country, works with the partner to build a joint or multilateral
network that is exercised regularly to maintain an ongoing relationship with no defined end date. These visions appear mutually exclusive, but capacity-building programs have different phases: the initial phase may involve intensive effort and capital expenditures, and there should be schedules and milestones for completing it. The long-term relationship that follows may be open ended, but it should also require far less funding, which should allay some concerns about programs with no exit strategy.