tion—despite dissimilar panels and mandates produced similar results, lending credence to the technique.
During the experiments, committee members identified several particular strengths of international benchmarking:
Panels were able to identify institutional and human-resource factors that are crucial to maintaining leadership status in a field and that are unlikely to have been identified by other methods.
Benchmarking allows a panel to determine the best measures for a particular field while providing corroboration through the use of different methods, as opposed to the "one-size-fits-all" approach of some common evaluation methods.
Benchmarking can produce a timely but broadly accurate "snapshot" of a field.
The experiments were sufficiently thorough to provide guidelines for future experiments, including the following:
Because the panels rely on expert judgment, the choice of panelists is key to the credibility of the results. A tendency toward national biases can be mitigated by ensuring diverse geographic membership of panels; the same is true of the groups that select the panel members. In particular, it is critical to include non-US participants in the selection of panelists and as panel members because they provide perspective and objectivity.
Because major fields of research change slowly, benchmarking can probably detect important changes in quality, relevance, and leadership in fields when conducted at intervals of 3 to 5 years. It is unlikely that changes can be detected by annual benchmarking.
The choice of research fields to be evaluated is both challenging and critical. A "field" might best be considered the array of related domains between which investigators can move without leaving their primary area of expertise.
Benchmarking produces information that administrators, policy-makers, and funding agencies find useful as they decide what activities a federal research program should undertake and as they respond to demands for accountability, such as those of the Government Performance and Results Act.
If federal agencies use benchmarking, the wide variation in agency missions dictates that each agency tailor the technique to its own needs.
Indicators of the degree of uncertainty and reliability of benchmarking results might enhance the presentation of panel assessments of leadership status.