Review of Abt Report and Comments on the STC Evaluation Process
The preface discusses the panel's general beliefs about the Abt report. Two major issues are addressed here:
Review and interpretation of the data gathered and reported on by Abt Associates.
Comments on the STC evaluation process.
To establish the context for our discussion of these issues, we first review the objectives of the Abt evaluation of the STCs, as stated in its report, and the relationship that NSF expected the Abt evaluation to have with the COSEPUP panel's objectives.
According to volume I of the evaluation of the STC program by Abt, the evaluation had three objectives: “1) to provide relevant and timely information to NSF decisionmakers considering whether or not to continue support of the STC program as presently constituted; 2) to document whether or not the STC program's research centers were, in the aggregate, accomplishing their research, education, and knowledge transfer objectives consistent with the original rationale for the STC program; and 3) to provide inputs to a pilot evaluation process under the Government Performance and Results Act (GPRA). . . . The study sought to identify aspects uniquely attributable to the center mechanism of operation.”
Abt envisioned the role of an independent expert panel as follows (volume I, page ix): “In preparing the study design for the present evaluation, we outlined a rationale for incorporating a series of qualitative dimensions of performance in the evaluation of a fundamental research program, and for the use of an expert panel to assess the quality of the program's research and other accomplishments on the basis of structured, qualitative data. However, such a panel was not included in the study.”
The COSEPUP panel was specifically asked by NSF to
(1) review and interpret the data gathered by an outside contractor (Abt Associates); (2) reach conclusions regarding progress the STC program has made toward its goals; and (3) make recommendations concerning NSF's future use of the STC mode of support. [memorandum from Larry McCray to COSEPUP Panel to Evaluate the NSF's Science and Technology Centers Program, September 22, 1995].
NSF viewed the STC evaluation process as a possible model for future evaluations of programs of this type:
The NSF program evaluation staff considers the STC Program to be an experiment, and is interested in the results of the COSEPUP and Abt Associates evaluations both for the resultant insights into the program and also for insights into the art of program evaluation. [COSEPUP proposal to NSF, September 12, 1995, p. 2].
REVIEW AND INTERPRETATION OF THE ABT REPORT
The Abt report is based on a historical review of the STC program, analysis of secondary data, bibliometric and patent analyses, and surveys of populations associated with the centers. The historical review developed information on changes in basic program structure, program goals, eligibility guidelines, review criteria and procedures, and management policies and practices. Secondary data included information from OSTI and NSF databases and OSTI copy files on the 25 STCs with respect to center funding, staff, and students; copies of the original grant-award jackets; and files on the third-year reviews required for all centers and conducted via site visits by peer researchers. Bibliometric analyses were used to study the amount and quality of publications by STC participants, patterns of coauthorship of the publications across institutions, patterns of citations of the publications, and the research foci of publications (applied versus basic). Patent analyses were used to measure the influence, science linkage, and technological currency of STC-based patents.
Eight groups of persons associated with STCs were surveyed:
Principal investigators (all).
STC advisory board chairs (all).
University deans or provosts (all).
Industry or federal laboratory representatives (three persons identified by each STC).
Educational-outreach collaborators (three persons identified by each STC).
STC administrators (all).
STC graduate-school alumni (all).
Job supervisors of graduate-school alumni (all).
The report offers a large amount of descriptive material about individual centers and the STC program itself. Particularly illuminating and rich are the numerous quotations from the groups surveyed. Collectively, the quotations document qualitatively the wide variations in research foci, educational outreach, and knowledge transfer among centers, and the strengths and weaknesses of the center concept and its administration by NSF. Also useful are the individual center profiles, which contain time-series data on sources of support, graduate and undergraduate educational activity, educational-outreach activity, and interaction with industry and federal agencies.
However, the report suffers from serious shortcomings that limit its usefulness for evaluating the STC program. First, the primary data on the achievements and impacts of each center consist of self-reports by persons who, because they are direct or indirect beneficiaries of the center, do not have an independent perspective. Thus, the report assumes the tone of an advocacy document much like a legal brief, rather than an objective program evaluation. Readers do not fully recognize until they are deep into the report that there are no data from groups that might have independent or even negative views about the centers.
Abt surveyed various participants in the STC program with the expectation that, “in most cases, individuals associated with the centers would take the opportunity to ‘put their best foot forward'” (Abt volume II, page 1-32). An expert panel, Abt assumed, would incorporate this positive bias in its deliberations and achieve a balanced set of conclusions.
In justifying that approach, Abt stated (page 1-32) that “there is no source [of] impartial or ‘neutral' opinion about the centers; individuals who are knowledgeable in depth about a center are likely to be either professional collaborators or competitors of the center.” The COSEPUP panel rejects that position because it implies that the initial STC-proposal review process, the periodic site-visit process, and indeed the entire peer-review process are invalidated.
Second, and similarly, the study was not designed to elicit data from comparison groups with which the importance of the achievements and impacts identified in the surveys can be judged. The achievements and impacts described stand in isolation; there is no objective basis on which to assess the extent to which they can be attributed to the features characteristic of centers or whether they might have occurred in the absence of centers. Readers of the report must rely on the perceptions of principal investigators and others who have substantial stakes in the success of the centers with which they are affiliated.
Bibliometric analyses potentially offer a basis for judging the quality and quantity of publications generated by center participants and comparable groups, but that potential was not fully exploited. Rather than comparing the publications of center participants with their own prior records (one indicator of the influence of center participation on publication activity and patterns) or with the contemporaneous records of comparable researchers in non-STC environments (another appropriate indicator of the quality, quantity, and nature of center publications), Abt chose to compare center output with that of all researchers publishing in similar journals in the same fields. As a result, few clear conclusions can be drawn about the effect of center participation on researchers' publication records, and thus about the effect of centers themselves on the quantity and quality of research output in their fields. Abt also could have examined how center participation might have changed the foci of investigators' research.
Third, Abt did not carry out any analyses of the relationships among the several key sources of data or among categories of respondents that produced the same types of data. Doing so would have constituted an important check on the validity of the achievement and impact data. In the former case, the principal investigators' self-reports of the centers' most-important achievements and impacts could have been compared with the available NSF site-visit reports for each center and with the center-profile data on graduate and undergraduate education, educational outreach, and industry activity. In the latter case, the perceptions of principal investigators could have been compared with those of advisory board chairs. Consistency would have increased confidence in the results, and discrepancies could have been explored with data from other sources.
COMMENTS ON THE ABT-COSEPUP PROCESS AS A MODEL FOR FUTURE EVALUATIONS
If NSF wishes to have the benefits of different program-evaluation methods in deciding the fate of programs, such as the STC program, it should either (1) follow a carefully designed and reviewed agency-generated overall evaluation plan, contract for separate independent studies that use alternative methods or strategies (one of which might be a peer-review panel), and synthesize the resulting data through in-house evaluation or (2) support an evaluation by a single professional evaluator who maintains full control over the evaluation design (which could include a peer-review component).
The panel strongly recommends against NSF's use of a process like the one used in the STC program evaluation as a model for future program evaluations. As the process has evolved in this case, the panel has been placed in the position of acting as a check against the positive biases of the Abt report. At the same time, the timing of the process was such that Abt was unable to incorporate comments from the panel about questions addressed in the evaluation or on the evaluation design itself. This failure is both untenable and unacceptable.