Once COSEPUP decided on the use of expert panels for benchmarking, it found few models on which to base a method. Traditional peer-review panels provided a sound precedent, although most peer reviewers focus only on the quality of a program or project. They might or might not be asked to assess quality relative to world standards, to identify the key determinants of research performance, or to assess the future performance of research programs. The latter exercises, however, are integral to international benchmarking and require experts who have a broad understanding of a field as a whole and knowledge of the researchers who are most influential in that field.
2.1 General Features of Methodology
COSEPUP decided that the benchmarking of each field should be guided by an oversight group that included people with broad backgrounds. The oversight group, which included US members of relevant National Research Council commissions and boards, was asked to define a field's subfields (which could later be modified by the panel) and to select a benchmarking panel of the highest quality. In addition, because each panel was part of an overall study, it was important to strive for a consistent approach. COSEPUP offered the following guidelines to each panel before it began its work:
Panels should develop findings and conclusions regarding the US leadership status in a research field, but not recommendations. In particular, the panel members should avoid statements that might be construed as recommendations to increase the funding for their field.
Each panel, in developing its report, should seek a consensus based mainly on the informed judgments of panel members.
Panel members should focus on the accomplishments of researchers in the field, not on funding levels, as indicators of leadership, and they should consider the development of human resources as a component of leadership. For example, if all the interesting research in a country is being done by senior researchers, that country might lack sufficient young researchers to develop accomplishments in the future.
2.2 Specific Charge to the Panels
In particular, COSEPUP charged each panel to answer three questions:
What is the position of US research in the field relative to that in other regions or countries? Is it at the forefront? The leader? Behind the leaders? Where is the most exciting research occurring internationally? Given that the United States should be at the forefront of research, where does US research stand relative to the forefront?
On the basis of current trends in the United States and worldwide, what will be our relative position in the near and longer-term future? Given current trends, will US research in the field remain at the forefront? Take the lead? Fall behind the leaders?
What are the key factors influencing relative US performance in the field? Why is the critical research occurring either in or outside the United States? Is the equipment, infrastructure, or supply of young people superior or inferior?
Each panel adapted this charge to the characteristics of its particular field. Complete descriptions of methodology can be found in the reports of the three panels, which are in the attachments.
2.3 Selection of Panel Members
To achieve balance and diversity, COSEPUP asked the oversight committees to include panel members from the following general groups: US experts in the research field being assessed, experts in related fields of research, non-US experts in the field and related fields, and "users" of research results.
The concept of "users" is meant to embrace those who can judge both the quality of research and its relevance to further research, to industrial applications, and to other societal objectives, including the advancement of knowledge. These users might be found in academe, government, industry, or other sectors. Users of mathematical research, for example, might include an academic chemist, an industrial engineer, a foreign mathematician, an economist, and a representative of a professional society. COSEPUP also decided to seek an expert in policy analysis for each panel. With this diverse composition, each panel was equipped to judge the quality, relevance, and leadership status of the research field under study.
In addition, the oversight committees tried to create panels that represented geographic and professional diversity. They sought multinational viewpoints to enhance objectivity and reduce national bias. For example, the immunology panel selected "up-and-coming leaders as well as established ones, investigators from all over the world, and leaders of all sub-subfields, both basic and clinical" to "incorporate the opinions of a variety of respected members of the immunology community." Membership from industrial firms was deemed essential to provide perspective on the work of leading industrial researchers and on the ultimate utility of research to developmental and manufacturing processes on which firms depend. Inclusion of younger and senior researchers was seen as essential to provide innovative thinking and fresh perspective.
2.4 Selection of Research Fields
For these experiments, the committee deliberately chose difficult subjects—fields that are diverse in scope and subject matter. Mathematics is the closest to being a traditional discipline, but even mathematics is broad in the sense that it is the language and tool of most of the sciences.
Immunology is not a disciplinary field in the traditional sense. Although immunologists work in virtually every department and division of the life sciences, few universities have departments of immunology. Immunology embraces many disciplines—including biochemistry, genetics, and microbiology—and its findings translate into diverse clinical subjects, such as rheumatology, surgery, endocrinology, neurology, and allergy.
Materials science and engineering spans multiple disciplines concerned with the structure, properties, processing, and performance of materials. Nearly all fields of science and engineering are involved in some way with materials, and many ideas in materials science and engineering emerge from disciplines as diverse as solid-state physics, chemistry, electronics, biology, and mechanics.
The process of selecting fields reminded the committee how difficult it is to divide research activities into discrete categories, but such division is a key to effective benchmarking. There is probably an optimal size and complexity of fields to be benchmarked, but these early experiments could give only rough indications of those measures. Two of the fields, immunology and materials, were initially judged by their panel chairs to be too large for truly rigorous treatment (the chairs later changed their minds).
As scientific research becomes more interdisciplinary and complex, scientists and engineers are challenged to describe the limits of their own intellectual activity. It seems likely that the definitions required by benchmarking exercises can help to illuminate criteria for defining fields and subfields. In the case of superconductivity mentioned above, American scientists who were already working in adjacent subspecialties were able to move quickly into superconductivity research. This could offer a definition of the subfields within a field along functional lines; that is, a field might be defined as the array of related domains among which investigators can move without leaving the realm of their expertise.
2.5 Evaluation of Panel Results
COSEPUP evaluated the quality of panel results independently and via comments from the oversight groups. In addition, the feasibility and utility of benchmarking were assessed during meetings with disciplinary societies and at a full-day workshop attended by representatives of federal agencies, universities, Congress, and the executive branch. The summary of that workshop in appendix C provided valuable information for COSEPUP members.
Of particular importance were the contributions of reviewers, whose array of expertise was comparable with that of the panel. The reviewers, chosen to represent diverse industrial and academic backgrounds, provided invaluable commentary and criticism for use by the panels and a means of validating the panels' findings.
Among the topics proposed for further discussion were the use of benchmarking during the budgetary process and whether benchmarking would be useful in helping to set national science policy. In addition, federal-agency representatives were asked about their own procedures for evaluating research and whether benchmarking might have a role to play in those procedures. Finally, the relevance of benchmarking to GPRA was discussed.