Executive Summary
The Climate Change Science Program (CCSP) and its predecessor U.S. Global Change Research Program (USGCRP) have sponsored climate research and observations for nearly 15 years. Although significant scientific discoveries and societally beneficial applications have resulted from these programs, the overall progress of the program has not been measured systematically. Metrics—a system of measurement that includes the item being measured, the unit of measurement, and the value of the unit—offer a tool for measuring such progress, improving program performance, and demonstrating program successes to Congress, the Office of Management and Budget, and the public.
Metrics have been applied successfully to research programs in industry, academia, and the government. The challenge is applying them to a complex program such as the CCSP, which involves 13 federal agencies and sponsors a wide range of activities—from basic research on the earth-ocean-atmosphere-human system, to assessment and risk analysis, to decision making. At the request of James Mahoney, director of the Climate Change Science Program and chair of the Subcommittee on Global Change Research, the National Research Council’s Committee on Metrics for Global Change Research was convened to
- provide a general assessment of how well CCSP objectives lend themselves to quantitative metrics;
- identify three to five areas of climate change and global change research that can and should be evaluated through quantitative performance measures;
- for these areas, recommend specific metrics for documenting progress, measuring future performance (such as skill scores, correspondence across models, correspondence with observations), and communicating levels of performance; and
- discuss possible limitations of quantitative performance measures for other areas of climate change and global change research.
The committee approached its task first by examining the experience of industry, federal agencies, and academia with implementing metrics, and then by formulating possible metrics for a wide range of CCSP objectives. It began its deliberations with some skepticism as to whether metrics would apply to many of the elements of the program. However, analysis showed that it is possible to develop meaningful and useful measures for all parts of the CCSP. The difficulty arises in selecting a few areas of global change and climate change for which metrics should be developed (charge 2). The committee found that it was not possible to make this selection without a clearer sense of program priorities. The CCSP strategic plan does not contain measures of success, and program objectives are written too broadly for them to be inferred. However, even if such guidance were available, the committee found that a broader range of quantitative and qualitative metrics would be a more valuable tool for managing the program. The key to promoting successful outcomes is to consider the program from end to end, starting with program processes (e.g., planning and peer review) and inputs (e.g., resources) and extending to outputs (e.g., assessments, forecasts), outcomes (e.g., results for science and society), and long-term impacts. Principles and a framework for creating and implementing metrics for the entire CCSP are described below.
PRINCIPLES FOR DEVELOPING METRICS
Industry, federal agencies, and academia have different objectives in developing metrics. Industry has long used metrics to gauge progress in meeting business objectives and to identify where adjustments should be made to optimize performance and increase profits. Federal agencies are increasingly relying on metrics, either to manage programs or to increase their accountability to Congress and the public. The latter motivation was strengthened by the Government Performance and Results Act of 1993, which required federal agencies to set strategic goals and to measure program performance against those goals. Finally, academia uses metrics to supplement peer evaluation in decisions to hire or promote faculty members, allocate resources among departments, or compare the performance of departments at different universities.
Based on the collective experience of these three sectors, the committee offers the following principles for developing useful metrics and avoiding unintended consequences:
- Good leadership is required if programs are to evolve toward successful outcomes. The overall program will suffer if no one has the authority to direct resources and/or research effort and to develop and apply metrics. The leadership of a few individuals in supporting research and/or publicizing the implications of research results, for example, helped speed understanding of the causes of Antarctic ozone loss. These actions ultimately led to regulations on the reduction of chlorofluorocarbon emissions, which are expected to return effective chlorine amounts in the stratosphere to pre-ozone-hole conditions by mid-century.
- A good strategic plan must precede the development of metrics. Such a plan includes well-articulated goals against which to measure progress and a sense of priorities. Absent this context, it is difficult to select the most important measures for guiding the program.
- Good metrics should promote strategic analysis. Demands for higher levels of accuracy and specificity, more frequent reporting, and larger numbers of measures than are needed to improve performance can result in diminishing returns and escalating costs. The nearly continuous assessments of the Intergovernmental Panel on Climate Change, for example, have the potential to provide only incremental improvements in policy guidance while imposing a heavy burden on the scientific community.
- Metrics should serve to advance scientific progress or inquiry, not the reverse. Good measures will promote continuous improvements in the program, whereas poor measures could encourage actions to achieve high scores (e.g., “teaching to the test”) and thereby lead to unintended consequences. For example, a metric to measure the convergence of climate models succeeds if it leads to an improved understanding of the physical processes being modeled, but fails if it subtly encourages researchers to adjust their models solely to bring them into better agreement with one another.
- Metrics should be easily understood and broadly accepted by stakeholders. In standard land classifications, for instance, areas covered by dense canopy are considered “forest,” even if they are severely logged and degraded. This land-cover metric would not be acceptable to paper companies, environmental groups, local governments, or other stakeholders without additional information on forest and environmental characteristics.
- Promoting quality should be a key objective for any set of metrics. Quality is best assessed by independent, transparent peer review.
- Metrics should assess process as well as progress. Metrics in a complex program such as the CCSP will be diverse, measuring factors that range from program planning, to resulting scientific knowledge and practical applications, to the ultimate impact of policy decisions on society.
- A focus on a single measure of progress is often misguided. Relying solely on the metric of reducing uncertainty, for example, can create an erroneous sense of progress, since uncertainty can increase, decrease, or remain constant as the understanding of causal factors improves.
- Considerable challenge should be expected in providing useful a priori outcome or impact metrics for discovery science. Care should be taken to avoid applying measures that stifle program elements for which the outcome is unknown. For example, metrics could have been devised to monitor the progress of C.D. Keeling’s measurements of CO2 in the atmosphere, two to four years after the program started. However, they would have missed the fundamental achievement enabled by this and subsequent measurements—the discovery of an annual cycle and decadal trend in atmospheric composition.
- Metrics must evolve to keep pace with scientific progress and program objectives. Adjustments to the measures will be required as program managers gain experience and the program itself matures and evolves. For example, the CCSP strategic plan places greater emphasis on scientific assessments, decision support, and short-term outcomes than USGCRP plans and requires a greater breadth of metrics.
- The development and application of meaningful metrics will require significant human, financial, and computational resources. It is possible to develop and apply thousands of metrics for the CCSP, but doing so would be costly and may not lead to improved program performance. A deliberative process of selecting the few most appropriate metrics, collecting the necessary information, and carrying out the evaluation will be required.
Although each of these principles is important, three merit especially careful attention: (1) leadership to guide the program and apply the metrics, (2) a plan of action against which to apply the measures, and (3) the potential to use metrics not just as simple measures of progress, but as tools to guide strategic planning and foster future progress. The first two are generally required if a program is to succeed. The last is a lesson learned from industry, and it informed the committee’s approach to developing metrics for the CCSP.
METRICS FOR THE CCSP
The first challenge in developing metrics is to choose goals against which progress should be measured. The CCSP strategic plan has hundreds of goals and objectives, stated at different levels of specificity, from five overarching goals to 224 milestones, products, and payoffs. The committee found that the milestones, products, and payoffs could be grouped into eight themes, which cover the scope of the program and are amenable to the development of metrics:
- improve data sets in space and time (e.g., create maps, databases, and data products; densify data networks);
- improve estimates of physical quantities (e.g., through improvement of a measurement);
- improve understanding of processes;
- improve representation of processes (e.g., through modeling);
- improve assessment of uncertainty, predictability, or predictive capabilities;
- improve synthesis and assessment to inform;
- improve assessment and management of risk; and
- improve decision support for adaptive management and policy making.
One or two case studies were developed for each of these themes to explore how metrics could be developed and to determine how difficult it would be to generalize them to other elements of the program. The assumption was that a long list of unique metrics would emerge and that the challenge would be to choose and refine the few that seemed most important. A long list of metrics was in fact produced. However, comparison of the metrics revealed that many were similar (especially those that measured the research and development process or inputs to it) and that others could be rewritten more generically. This observation and subsequent tests led to a surprising conclusion: a general set of metrics can be developed and used to measure progress and guide strategic thinking across the entire CCSP. The general metrics recommended by the committee are given in Box ES.1.
Not every metric will be applicable to every CCSP program element. Moreover, it would be too expensive to measure and monitor all elements of the program, especially if the results are not going to be used. Consequently, efforts should be made to select the most appropriate measures. Metrics to guide strategic thinking will focus on identifying and monitoring program strengths and weaknesses with the object of enabling managers to make decisions that support successful outcomes. These measures will become apparent from even rough scores or answers to the metrics listed in Box ES.1. Metrics for demonstrating program progress will depend on how CCSP agencies define what constitutes success. As agencies gain experience, the initial metrics listed in Box ES.1 will be refined and simplified until only the most useful emerge.
Box ES.1
- Process Metrics (measure a course of action taken to achieve a goal)
- Input Metrics (measure tangible quantities put into a process to achieve a goal)
- Output Metrics (measure the products and services delivered): … (c) new and applicable measurement techniques, (d) scenarios and decision support tools, and (e) well-described and demonstrated relationships aimed at improving understanding of processes or enabling forecasting and prediction.
- Outcome Metrics (measure results that stem from use of the outputs and influence stakeholders outside the program)
- Impact Metrics (measure the long-term societal, economic, or environmental consequences of an outcome)
The way in which general metrics are used depends on both the identity of the evaluators and the granularity of the program element being evaluated. Agency managers might give rough answers to all of the general metrics to assess strengths and weaknesses of the program and then determine an appropriate course of action. Indeed, the process of evaluating the program and selecting the measures should be as valuable to the agencies as the measures themselves. Expert panels might use the general metrics to develop a broader context for the project being reviewed. Finally, stakeholders might focus on outcome and impact metrics.
Highly focused programs may require highly specific metrics. The general metrics provide the categories to be evaluated, but they will have to be narrowed down and reworded in terms that are specific to the program goal. In refining the metrics, care must be taken to recognize and minimize biases, which are inevitable in subjective judgments. Attention must also be paid to developing an evaluation system to score each of the metrics and to aggregate different types of measures.
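As one minimal sketch of the kind of evaluation system this paragraph calls for, the code below averages individual metric scores within each of the five categories named in Box ES.1. The 1-5 scoring scale, the simple within-category averaging, and all function and variable names are illustrative assumptions, not details from the report.

```python
from statistics import mean

# The five metric categories from Box ES.1; the 1-5 scoring scale and the
# equal-weight averaging within a category are illustrative assumptions.
CATEGORIES = ("process", "input", "output", "outcome", "impact")

def aggregate_scores(scores):
    """Average the individual 1-5 metric scores within each category.

    Categories that were not scored come back as None, so gaps in an
    evaluation stay visible instead of silently counting as zero.
    """
    summary = {}
    for cat in CATEGORIES:
        vals = scores.get(cat, [])
        summary[cat] = round(mean(vals), 2) if vals else None
    return summary

# A hypothetical program element: strong on process, weaker on impact.
element_scores = {
    "process": [4, 5, 4],
    "input": [3, 4],
    "outcome": [2, 3],
    "impact": [1, 2],
}
print(aggregate_scores(element_scores))  # "output" is None: never evaluated
```

Because the report stresses that context is as important as the score, a real system would carry an explanation alongside each number rather than reporting bare values.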
CONCLUSIONS
- Meaningful metrics can be developed for most aspects of the CCSP, from enhancement of data networks to increases in public awareness of climate change issues. The general set of metrics developed by the committee provides a useful starting point for identifying a small set of important measures, and the principles provide guidance for refining the metrics and avoiding unintended consequences.
- The metric used most commonly to gauge progress of the CCSP—reduction of uncertainty—has the potential to be misleading and should not be used in isolation. Uncertainty about future climate states may increase, decrease, or remain the same as more is understood about the elements that control the system.
- A mixture of qualitative and quantitative metrics is required to assess the progress of the CCSP. Quantitative measures (e.g., numerical scores, yes or no answers) are most useful for evaluating management, assessing the research and development process, or measuring aspects of research output. Qualitative measures are most useful for assessing quality and program results. In general, peer review is required to assess quality or progress toward improved understanding, and stakeholder judgments are required to assess the usefulness or impact of many programs.
- Discovery and innovation are difficult to measure with quantitative metrics. The best approach is to use process and input measures that ensure the promotion of discovery and innovation. As the science matures, more output, outcome, and impact measures become appropriate.
- A number of candidate CCSP metrics, especially those that assess outcomes and impacts, will depend on a wide range of factors, including some outside of the program (e.g., politics, technological advance). To avoid misinterpreting these measures (e.g., one weak component dominating the evaluation of an otherwise strong program), the explanation should accompany the score or answer. The context or explanation is as important as the score.
- Although some metrics can measure short-term impacts (e.g., CCSP payoffs scheduled to occur within two to four years), it may take decades to fully assess the substantial contributions to the global debate on climate change being made by the CCSP and its predecessor USGCRP.
Although the maxim “what gets measured, gets managed” is not always true, the reverse generally is. A system of metrics, developed through an iterative process and evaluated in consultation with stakeholders, could be a valuable tool for managing the CCSP and for further increasing its usefulness to society.