Appendix C
Performance Measures Used by Other Agencies and Organizations

Many agencies and organizations involved in managing research programs are attempting to develop performance measures to evaluate the quality of their programs. To understand how metrics can be used to evaluate programs effectively, the committee reviewed evaluation tools used by the National Institute of Standards and Technology (NIST), the U.S. Air Force, Texas, Maine, Kansas, and two academic programs: the National Science Foundation (NSF) Experimental Program to Stimulate Competitive Research (EPSCoR) and the National Sea Grant Program.

Typically, two primary mechanisms drive the use of performance measures: specific legislative requirements and the desire to benchmark outcomes and impacts of a program (J. Melkers, University of Illinois, Chicago, personal commun., June 8, 2002).

PERFORMANCE MEASURES USED TO EVALUATE FEDERAL AGENCIES

The NIST Advanced Technology Program (ATP) and the U.S. Air Force Scientific Advisory Board (SAB) are two examples of federal-agency research programs that have developed specific metrics or evaluation criteria to gauge the performance of their research programs.



National Institute of Standards and Technology Advanced Technology Program

The NIST ATP has developed a complex program-evaluation tool, the business reporting system (BRS). The BRS, implemented in 1994, is used to track companies that have received funding through the ATP. It is an impressive tool that comprehensively evaluates the business and economic impacts of each research project from start to finish (see Box C-1). Companies are asked to respond to a number of detailed surveys before, during, and after their projects are completed. The surveys include questions regarding the commercial application of proposals, business goals, strategies for commercialization and for protecting intellectual property, dissemination efforts, publication in professional journals and presentations at conferences, participation in user associations, public-relations efforts, R&D status, collaborative efforts, impacts on employment, and attraction of new funding.

U.S. Air Force Research Program

Evaluations of the U.S. Air Force research program are conducted by the U.S. Air Force SAB. The first SAB evaluation was conducted in 1991 (R. Selden, U.S. Air Force SAB, personal commun., Jan. 9, 2003). Programs are evaluated for quality and relevance of research, and each directorate is evaluated every 2 years. Typical metrics used to evaluate research programs are university metrics (publications, patents, and peer review) and a grading system that rates the components of the research programs in each directorate against 10 criteria (see Box C-2). Scores are normalized across the different directorates (Selden 1998).

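The normalization method is not described in detail here; purely as an illustration (an assumed approach, not necessarily the SAB's), dividing each program's score by the average score within its own directorate would let a program scoring 8.0 in a directorate that averages 6.4 and a program scoring 9.0 in a directorate that averages 7.2 both register 1.25, making them directly comparable despite differences in internal grading standards.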

PERFORMANCE MEASURES USED TO EVALUATE STATE-LEVEL PROGRAMS

Many states have developed science and technology performance metrics to evaluate whether and how research programs are encouraging economic development; state governments tend to be more interested in economic development than in the quality or value of the research itself. Texas, Kansas, and Maine have developed processes for using performance measures to evaluate their research and science and technology programs, and their performance measures are representative of similar evaluations being conducted by other states.

BOX C-1 NIST ATP Performance Metrics

Companies are asked to submit information regarding their progress and economic contributions in the form of four electronically submitted reports:

Baseline report. Companies submit information regarding commercial application of proposals, business goals, strategies for commercialization and for protecting intellectual property, and dissemination efforts. Companies are asked to rank the importance of publishing in professional journals, presenting papers at conferences, participating in user associations, and public-relations efforts (NIST 2002a).

Anniversary report. Companies list major commercial applications identified in the proposal, business goals, progress toward commercialization, R&D status, collaborative efforts, employment impacts, attraction of new funding, strategies for protecting intellectual property, and dissemination efforts. This report also asks companies to evaluate the status of their R&D cycle as a result of ATP funding (NIST 2002b). These reports are completed annually.

Close-out report. Companies discuss commercial applications, business goals, early impacts on revenue and cost, future projections of revenue and costs, R&D status, collaboration impacts, impacts on employment, attraction of new funding, strategies for protecting intellectual property, and dissemination plans (NIST 2002c). These reports are completed after the conclusion of projects.

Post-project summary. Companies provide information on post-project affiliation, funding sources, the impact of ATP funding on product development and process capability, anticipated future market activity, and R&D partnering with other organizations (NIST 2002d). These reports are completed 2, 4, and 6 years after projects are completed.

Texas

The primary research effort in Texas is known as the Advanced Research Program/Advanced Technology Program (ARP/ATP). Evaluation efforts for ARP/ATP are coordinated by the Texas Higher Education Coordinating Board. Progress reports and final reports are used to track the impact of research projects, and surveys are used to gather information about the progress of research. The survey metrics used to evaluate research projects include number of publications and performance of graduate students (see Box C-3) (ARP/ATP 2002; J. Melkers, University of Illinois, Chicago, personal commun., July 8, 2002).

BOX C-2 U.S. Air Force Scientific Advisory Board Science and Technology Review Evaluation Criteria

Science Foundation
  Work described is based on sufficiently understood phenomena.
  Uses best and most recent available science applicable to the problem.
  All scientific issues, together with the work to address those issues, are identified.
  There is a rigorous approach to the stated technical problem.
  Distinction is made between innovative design concepts and innovative science.

Strategic Vision
  How does the technology fit into evolving military capabilities?
  A clear, identifiable path exists that connects technology to military capabilities.
  Leadership of the scientific community into new research areas of high leverage for the Air Force?
  Commercial technology growth is forecast and planned for incorporation.

Focus of Efforts
  Sufficient resources for a critical mass.
  Accountability exists for technical milestones.
  Scope is defined to maximize output.
  Maximum leverage with other programs exists where appropriate.

Research Environment
  Quality and capabilities of facilities and equipment.
  Work atmosphere fosters productive interaction and allows for constructive criticism without fear of retribution.
  External experts are consulted (excluding lab contractors); the lab is receptive to external ideas.

Approach
  Addresses timely delivery of product.
  Leverages similar research from government/industry.
  Teams with the best from government and/or industry.
  Uses or modifies commercial technology.
  Current effort differs in approach from what was done 5 or 10 years ago.
  Maintains a balanced approach between cost and performance.

Innovation
  Applies novel techniques and cross-disciplinary science.
  New concepts/techniques/devices emerged as a result of the technology.
  Reflects “out-of-the-box” thinking.

Output
  Results of technology have effective and timely transition.
  Milestones result in “interim” products (prototypes, increased knowledge).
  Technical quality is evidenced by awards from technical societies.
  Customer satisfaction is evidenced by customer feedback.
  Understand the metrics of success.

People
  Qualifications, reputation, and technical productivity compared with other organizations in the same discipline.
  “Top guns” are involved in research.
  Solid mentoring system in place.
  Program managers are recognized as experts in their fields.

Context
  Understand military capability needs and priorities for technology development.
  Technology fits properly in context with similar research.
  Technology addresses the “tall poles.”
  There is an awareness of similar research inside/outside of DOD.

Long-term Relevance
  Technology will have a short-term and/or long-term impact on Air Force capabilities, weapon systems, personnel, and environment.
  Technology addresses unique long-term DOD/USAF weapon system or infrastructure needs (technology push).
  Technology provides meaningful improvement to weapon-system sustainability (reliability, maintainability).

Source: USAFSAB/USAFRL 1998.

Kansas

The Kansas Technology Enterprise Corporation is responsible for managing grant programs for applied research and for equipment used in science and technology skill training. Examples of metrics used to evaluate programs are rankings of the importance of commonly accepted economic-development goals (such as job creation and encouraging technologic innovation and entrepreneurial spirit) and literature reviews (Burress et al. 1992).

Maine

Maine’s Research and Development Evaluation Project (headed by the Maine Science and Technology Foundation) was asked by the state legislature to undertake a comprehensive 5-year evaluation focusing on how the state R&D program has evolved and how it has affected the R&D industry and the level of innovation-based economic development in the state. Evaluation of Maine’s Public Investment in Research and Development, a report produced by the Maine Science and Technology Foundation (MSTF 2001), documents each research program that has been evaluated and the processes and methods used to evaluate it. Performance measures vary by program. For instance, the evaluation of the Maine Biomedical Research Program focused on output and outcome measures. Output measures include a plan showing how the funds would be used and the resulting research and economic benefits, peer-reviewed journal articles demonstrating the competitiveness of the institution’s research, and the amount of funding from outside sources and its use. Outcome measures include an evaluation of the direct and indirect economic impact of the funded research and an assessment of the contribution of the funded research to scientific advancement and to the institution’s competitive position (MSTF 2001). The foundation has prepared a survey for research institutions to assist them in collecting data for program evaluation (see Box C-4).

BOX C-3 Texas Higher Education Coordinating Board Research-Project Performance Metrics

The following are the questions asked in 2002 of all persons who have received ARP/ATP funds. Responses are provided electronically.

A. Provide a short (200-word) description of what you did during this project. The description should be written for a lay rather than a highly technical audience.
B-1. Over the term of this project, how many different people (including the PIs) have been supported by this project? Categories for response include men, women, black, Hispanic, Native American, and foreign national.
B-2. Over the term of the grant, how many additional people have worked on, but not been supported by, the project? Same categories as for B-1.
C. Identify those students who have worked on this project and graduated from your institution. Identify where they are now working.
D-1. Over the term of the grant, what additional funding has your institution received, directly or indirectly, as a result of participation in this program?
D-2. Over the term of the grant, what additional funding has your institution requested as a result of participation in this program?
E. Over the term of this grant, how many different publications resulted from this project? Categories are refereed journals, conference proceedings, technical reports, book chapters, and other. For actual publications, full citations are requested.
F. Brief “success” stories are requested.
G. Briefly describe any “industrial” or commercial connections your project has.
H. Briefly describe what you have done to effect technology transfer of the work done in this project.

Subsequent questions request information about interaction with actual or potential collaborators, commercialization, knowledge utilization, and possible licensing opportunities.

Source: ARP/ATP 2002.

PERFORMANCE EVALUATION MEASURES IN ACADEME

Program evaluation is also used widely in academe, particularly in programs that involve improving educational opportunities and academic competitiveness. Two such programs that are federally funded but administered by universities are NSF’s EPSCoR and the National Sea Grant Program.

BOX C-4 Draft Survey for Maine Research Institutions (Revised February 1, 2002)

Name of Research Institution:
Name of Person Completing Survey:
Position:

Institutional Capacity:
  Increase in number of enrolled science and engineering graduate students attributable to state R&D funding
  Increase in number of science and engineering graduate degrees awarded attributable to state R&D funding
  Number of new degree programs established as a result of state R&D funding
  New and/or renovated R&D space available as a result of state R&D funding
  Value of new facilities and fixed equipment acquired as a result of state R&D funding
  Number of new FTEs hired as a result of state R&D funding
  Major (purchase price > $50,000) new research equipment acquired as a result of state R&D funding

Outcomes of State R&D Investments:
  Number of publications (total)
  Number of publications in refereed journals
  Number and value of research proposals submitted
  Dollar value of research proposals submitted
  Number and value of research proposals submitted jointly with other Maine institutions
  Number and value of research proposals submitted jointly with non-Maine institutions
  Number and value of new federal research grants/contracts/subcontracts awarded (total)
  Number and value of new federal research grants/contracts/subcontracts awarded (EPSCoR only)
  Number and value of new federal research grants/contracts/subcontracts awarded (earmarked only)
  Number and value of new industrial research grants/contracts/subcontracts awarded (total)
  Number and value of new industrial research grants/contracts/subcontracts awarded (by Maine companies)
  Number and value of new foundation grants awarded
  Number of new companies formed on the basis of state-supported R&D
  Number of jobs in these companies at spin-off
  Number of disclosures made
  Number of patents applied for
  Number of patents awarded
  Number of copyrights obtained
  Number of plant breeder’s rights obtained
  Number of licensing agreements completed (total)
  Number of licensing agreements completed with Maine companies

Source: MSTF 2001.

The Experimental Program to Stimulate Competitive Research

EPSCoR is designed to improve the R&D competitiveness of states that have traditionally received smaller amounts of federal research and development funding on a per capita basis. The program requires a commitment on the part of the states to improve the quality of science and engineering research and training at colleges and universities.
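To make the per capita comparison concrete (the figures here are hypothetical, not drawn from NSF data), a state receiving $60 million in federal academic R&D obligations and having 4 million residents would be credited with $15 per capita, a figure that can be compared directly with states of very different sizes.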

Three key groups of metrics describe a state’s science and technology environment: NSF support, total federal academic R&D contribution, and high-technology activity (NSF 2002). Each group contains a number of metrics that can be compared across states and over time. Most of the metrics involve assessments in terms of people, programs, and dollars. The following are examples:

  Total number of NSF research-support awards per year.
  Academic R&D obligations by all federal agencies per year.
  Total number of graduate students in science and engineering.

Additional measures of the effectiveness of the programs include number of grant proposals submitted, number of grant proposals funded, quality of peer-reviewed research, professional contributions of students, publication and patent productivity, return on investment, and contribution to the state (for example, an improved environmental program) (NSF 2002).

National Sea Grant Program

The National Sea Grant Program, created in 1966, established a partnership between the National Oceanic and Atmospheric Administration and universities to encourage the development of sea-grant institutions that engage in research, education, outreach, and technology transfer in support of stewardship of the nation’s marine resources. Performance benchmarks have been developed to determine whether the goals and strategic plans of each sea-grant institution are being met. Programs are evaluated according to the following weighted criteria (NOAA, unpublished material, 1998):

  Effective and aggressive long-range planning (relative weight, 10%).
  Organizing and managing for success (relative weight, 20%).
  Connecting sea grant with users (relative weight, 20%).
  Producing significant results (relative weight, 50%).
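Purely as an illustration of how the weights combine (the common 100-point scale is an assumption, not specified in the material cited above), a program scoring 80 on long-range planning, 90 on management, 70 on connecting with users, and 85 on producing significant results would receive a composite score of 0.10(80) + 0.20(90) + 0.20(70) + 0.50(85) = 82.5, with the results criterion alone accounting for half of the total weight.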

Each sea-grant institution is given sets of recommended questions or established expected-performance benchmarks designed to gauge how well the program has met the goals established during strategic planning. Benchmarks typically include questions about the quality of the peer-review process, detailed information about the strategic-planning process, measures of the quality of program management, the ability of the program to develop private-sector matching funds, the number of published peer-reviewed papers in relation to the size of the research program, and questions to gauge the social, economic, and scientific contributions of program research (NOAA, unpublished material, 1998).

CONCLUSIONS

Federal research programs tend to focus more on the collection of product metrics than process metrics. Among federal research programs, there tends to be a presumption that peer review is the key process necessary to ensure a successful program, but there is relatively little discussion of who is responsible for conducting the peer-review evaluations. The committee considers peer review a necessary but not sufficient condition for a successful program.

Evaluations at the state level are driven principally by economic considerations. There tends to be little targeting of specific research topics except in broad terms, such as nanotechnology. Many of the evaluations are based on surveys of participating institutions and on data routinely collected at the state level, such as the number of students enrolled in institutions of higher learning.

NSF’s EPSCoR produces a level of standardization that allows comparison of R&D across states and over time. That consistency across time and place is an important attribute of metrics.

REFERENCES

ARP/ATP (Advanced Research Program/Advanced Technology Program). 2002. Research Projects Performance Metrics. Texas Higher Education Coordinating Board, Austin, TX. [Online]. Available: http://www.arpatp.com/online/ [accessed June 12, 2002].

Burress, D., M. El-Hodiri, and V.K. Narayanan. 1992. An Evaluation Model to Determine the Return on Public Investment (ROPI) for the Kansas Technology Enterprise Corporation. Report No. 211. Institute for Public Policy and Business Research, The University of Kansas. November. [Online]. Available: http://www.ukans.edu/cwis/units/ippbr/resrep/pdf/M211.pdf [accessed Jan. 22, 2003].

MSTF (Maine Science and Technology Foundation). 2001. Evaluation of Maine's Public Investment in Research and Development. [Online]. Available: http://www.mstf.org/evaluation/pdfjump.html [accessed Jan. 29, 2003].

NIST (National Institute of Standards and Technology). 2002a. ATP Baseline Business Report. Optional Worksheet for Organizing Data. Advanced Technology Program, National Institute of Standards and Technology, Technology Administration, U.S. Department of Commerce.

NIST (National Institute of Standards and Technology). 2002b. ATP Anniversary Business Report. Optional Worksheet for Organizing Data. Advanced Technology Program, National Institute of Standards and Technology, Technology Administration, U.S. Department of Commerce.

NIST (National Institute of Standards and Technology). 2002c. ATP Close-Out Business Report. Optional Worksheet for Organizing Data. Advanced Technology Program, National Institute of Standards and Technology, Technology Administration, U.S. Department of Commerce.

NIST (National Institute of Standards and Technology). 2002d. ATP Post-Project Summary Business Report. Optional Worksheet for Organizing Data. Advanced Technology Program, National Institute of Standards and Technology, Technology Administration, U.S. Department of Commerce.

NSF (National Science Foundation). 2002. Experimental Program to Stimulate Competitive Research (EPSCoR). National Science Foundation. [Online]. Available: http://www.ehr.nsf.gov/epscor/statistics/start.cfm/ [accessed Dec. 19, 2002].

Selden, R.W. 1998. Air Force Science and Technology Quality Review. Review Overview Document. Scientific Advisory Board Science and Technology. June.

USAFSAB/USAFRL (U.S. Air Force Scientific Advisory Board and U.S. Air Force Research Laboratory). 1998. Memorandum of Understanding for the Air Force Science and Technology Quality Review between U.S. Air Force Scientific Advisory Board and U.S. Air Force Research Laboratory, Appendix III. Evaluation Criteria. August 1998.