

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




DATA NEEDS FOR PROGRAM EVALUATION

To assess the numerical adequacy of the nation's biomedical and behavioral research personnel and to make judgments about the quality of their training, we need both quantitative and qualitative information. Timely, accurate, and relevant information is essential to the success of this effort. A number of data sets maintained by NIH and other federal agencies are directly relevant to the responsibilities of the Committee on Biomedical and Behavioral Research Personnel. Unfortunately, some of these sets are complex and difficult to manipulate, and even when one manages to retrieve information from them, its quality is sometimes questionable. We propose to extract from them and from other sources a data set tailored to the information needs of this committee and potentially to those of NIH.

The proposed "evaluation data matrix" could form the core of a management information system for use in tracking and evaluating the National Research Service Award (NRSA) program. The difficulties encountered by this committee in simply attempting to correlate the historical levels of prior committee recommendations with actual NRSA awards by field suggest that such tracking has been difficult at best. What is needed is a coordinated, systematic data set that provides descriptive and comparative statistics relevant to the committee. It should be coordinated and systematic in its use of common taxonomies and measures that satisfy the needs of the committee. It should be descriptive of programs at a level of detail that the committee deems appropriate, providing information about the characteristics both of the programs and of their participants, and it should be comparative across programs and through time.
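The tabular data set proposed here can be pictured concretely as one matrix per fiscal year, with rows for program categories and columns for program and participant measures. The Python sketch below is a minimal illustration of that structure; the activity-code labels, column names, and numbers are hypothetical placeholders, not actual NIH figures.

```python
# Minimal sketch of an evaluation data matrix: one matrix per fiscal year,
# rows keyed by program category (NIH activity code), columns by measure.
# All labels and numbers below are illustrative placeholders, not NIH data.
matrix_by_year = {
    1988: {
        "F31 predoctoral individual NRSA fellowship": {
            "number_of_recipients": 450,
            "median_training_years": 2.5,
        },
        "T32 institutional NRSA traineeship": {
            "number_of_recipients": 3200,
            "median_training_years": 2.0,
        },
    },
}

def cell(year, program_category, column):
    """Return one cell of the matrix for a given fiscal year."""
    return matrix_by_year[year][program_category][column]

print(cell(1988, "F31 predoctoral individual NRSA fellowship",
           "median_training_years"))  # -> 2.5
```

Comparisons across programs or through time then reduce to reading the same column out of different rows, or out of different years' matrices.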
The needed data set can be conceptualized simply as a time series of matrices whose rows represent program categories (or "activity codes," as they are called by NIH), such as F31: predoctoral individual NRSA fellowship. The columns represent characteristics of the programs and their participants. For example, one of the nine program characteristics requested was median length of training time; a cell in this column would thus hold the median training time for the program type that the particular row represents.

Primarily for the benefit of future versions of our committee, we have undertaken an initial design and pilot construction of such a data matrix. Our design of both rows and columns is shown below. To gain an idea of the magnitude and feasibility of the project, we contracted with a firm experienced in working with the relevant data bases to undertake a pilot construction.

Results: The contracted firm concluded that, while some items of the data matrix could be constructed fairly readily, others would require a greater level of effort. In principle, at least, data exist with which to construct all cells of the 39-by-20 matrix. The exercise also revealed several important problems, most of which the committee had already been aware of. The five most serious problems with the existing data sets are described briefly in the following section.

Problems in the Data Sets

The data sets are subject to a number of criticisms, five of which are most serious for our purposes.

1. Access: Since our study was located in NAS/NRC's Office of Scientific and Engineering Personnel (OSEP), which also houses the Survey of Earned Doctorates (SED) and the Survey of Doctorate Recipients (SDR), there was no problem in accessing these data. However, there are some difficulties in accessing the major NIH data sets. Although our staff have direct access, it appears preferable to work through the NIH staff.
The Information for Management, Planning, Analysis and Coordination (IMPAC) file is the primary source of financial and other data on Public Health Service (PHS) extramural programs; data are organized by fiscal year. The Information Systems Branch of the Division of Research Grants (DRG) responds to requests for data regarding these PHS activities. In addition, DRG provides annual data that are used by the National Research Council (NRC) to update the Trainee Fellow File (TFF) and Consolidated Grant Applicant File (CGAF). These two files are organized by individual recipients of traineeships, fellowships, or grants. Data derived by DRG from the IMPAC file are considered to be "official" data, while data derived by others from TFF and CGAF may be considered for some purposes to be "unofficial." A written request for data to be extracted from the IMPAC file was sent on behalf of the committee to DRG. Some materials were received from DRG, although most were not at the level of detail needed for the data matrix. Some unofficial data were extracted by the contracted firm from the TFF and CGAF files and provided to the committee, forming the basis of the statistical profile of training programs in Chapter 2.

2. Quality of CGAF and TFF files: These two files rearrange the IMPAC information to identify all the training received by a single individual (TFF) and all the research grants given to an individual principal investigator (CGAF). This information is essential for the committee to determine the subsequent research participation of those who have received NRSA research training and to conduct longitudinal studies of participation in NRSA research. Some concerns have been expressed by DRG and NIH about the quality of these two data sets. We therefore recommend that NIH evaluate the accuracy of a sample of the data in these files.

3. Classification of race/ethnicity and sex: Apparently there are problems of nonreporting and incorrect reporting of sex and race/ethnicity data on the IMPAC file; a representative of DRG discourages the use of these data.
This is most discouraging in light of the clear need to monitor the progress of women and minorities in science. We recommend further investigation of data quality, including the matching of individuals across sources of data and over time in an effort to resolve inconsistent reporting.

4. Classification of training field: The definition of fields of science presents several problems. The Discipline/Specialty/Field (D/S/F) codes in the IMPAC file apparently have not been coded consistently across the various institutes of NIH and ADAMHA. The DRF, derived from the SED, provides data only for Ph.D.s and does not take into account persons with degrees in one field who are receiving additional training in another. The D/S/F codes need further investigation; they may provide sufficiently accurate data at the broad field levels of the biomedical and behavioral sciences, but we cannot be certain without further study. We strongly recommend that NIH investigate the accuracy of this classification and, if necessary, design better centralized quality control methods and apply them to these classifications.

5. Response rates in the SDR: These rates have been extremely low for many years, varying across fields from the high 40s to the low 70s (in percent). We recognize the complexities and problems of attrition in a longitudinal data set such as this but recommend that ways be found to improve the rates. An NAS panel on the NSF data system, of which the SDR is a part, recently has made a similar recommendation.8 We understand that a study of nonresponse bias in the SDR currently is under way within NAS/NRC and await its results with interest.

8 C. F. Citro and G. Kalton (eds.), Surveying the Nation's Scientists and Engineers: A Data System for the 1990s. Washington, D.C.: National Academy Press, 1989.
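The recommended matching of individuals across data sources, to resolve inconsistent reporting of sex and race/ethnicity, can be sketched as a simple record-linkage pass. Everything below is hypothetical: the identifier, field names, and records merely stand in for whatever linkage keys the real files support.

```python
# Sketch: match the same individual across two source files and reconcile
# sex and race/ethnicity codes, filling gaps and flagging true conflicts.
# Records and field names are invented for illustration.
source_a = [
    {"person_id": "A17", "sex": "F", "race_ethnicity": None},
    {"person_id": "B42", "sex": "M", "race_ethnicity": "Asian/Pacific Islander"},
]
source_b = [
    {"person_id": "A17", "sex": "F", "race_ethnicity": "White"},
    {"person_id": "B42", "sex": None, "race_ethnicity": "Asian/Pacific Islander"},
]

def reconcile(rec_a, rec_b):
    """Merge two records for one person; report fields that disagree."""
    merged = {"person_id": rec_a["person_id"]}
    conflicts = []
    for field in ("sex", "race_ethnicity"):
        va, vb = rec_a[field], rec_b[field]
        if va is not None and vb is not None and va != vb:
            conflicts.append(field)   # inconsistent reporting: flag for review
        merged[field] = va if va is not None else vb
    return merged, conflicts

b_by_id = {r["person_id"]: r for r in source_b}
for rec in source_a:
    merged, conflicts = reconcile(rec, b_by_id[rec["person_id"]])
    print(merged, conflicts)
```

A production linkage would of course need richer keys (name, institution, degree year) and a rule for adjudicating flagged conflicts; the point is only that cross-source matching makes gaps and contradictions visible.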

Preliminary List of Evaluation Data Matrix Rows: Program Categories

1. Basic Biomedical Sciences
   a. Predoctoral
      o Individual
         oo NRSA Fellowships
         oo Other
      o Institutional
         oo MARC Undergraduate
         oo NRSA Traineeships
         oo Other
   b. Postdoctoral
      o Individual
         oo NRSA Fellowships
         oo Career Development Awards (K07, K08)
         oo Other
      o Institutional
         oo NRSA Traineeships
         oo Other

2. Behavioral Sciences
   a. Predoctoral
      o Individual
         oo NRSA Fellowships
         oo Other
      o Institutional
         oo MARC Undergraduate
         oo NRSA Traineeships
         oo Other
   b. Postdoctoral
      o Individual
         oo NRSA Fellowships
         oo Career Development Awards (K07, K08)
         oo Other
      o Institutional
         oo NRSA Traineeships
         oo Other

3. Clinical Sciences
   a. Predoctoral
      o Individual
         oo NRSA Fellowships
         oo Other
      o Institutional
         oo NRSA Traineeships
         oo Other
   b. Postdoctoral
      o Individual
         oo NRSA Fellowships
         oo Career Development Awards (KXX series, including K11, K15)
         oo Other
      o Institutional
         oo NRSA Traineeships
         oo Other
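Because the program categories above form a fixed hierarchy, the matrix's row labels can be generated mechanically from it. The Python sketch below walks a deliberately truncated, illustrative version of the tree; it does not reproduce the full row list.

```python
# Flatten a (truncated) program-category tree into matrix row labels.
# The tree here is an abbreviated illustration, not the complete taxonomy.
taxonomy = {
    "Basic Biomedical Sciences": {
        "Predoctoral": {
            "Individual": ["NRSA Fellowships", "Other"],
            "Institutional": ["MARC Undergraduate", "NRSA Traineeships", "Other"],
        },
    },
    "Behavioral Sciences": {
        "Predoctoral": {
            "Individual": ["NRSA Fellowships", "Other"],
        },
    },
}

def row_labels(tree, path=()):
    """Depth-first walk yielding one label per leaf category."""
    for name, subtree in tree.items():
        if isinstance(subtree, dict):
            yield from row_labels(subtree, path + (name,))
        else:  # a list of leaf categories
            for leaf in subtree:
                yield " / ".join(path + (name, leaf))

for label in row_labels(taxonomy):
    print(label)
```

Generating labels from one shared tree is one way to enforce the "common taxonomies" requirement: every table drawn from the matrix then uses identical row definitions.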

PRELIMINARY LIST OF MATRIX COLUMNS: CHARACTERISTICS OF PROGRAMS AND PARTICIPANTS

1. Program characteristics in given year
   a. Goals
   b. Number of institutions involved
   c. Number of recipients
   d. Median length of training
   e. Total enrollment
   f. Total cost
   g. Cost per recipient month
   h. Median number of trainees per institution
   i. Publication counts of faculty in primary department(s) of program

2. Participant characteristics in given year (median, except as noted)
   a. Baccalaureate selectivity scores (A. Astin)
   b. Quality rating (NRC, 1982) of doctoral department
   c. Quality rating of primary department in postdoctoral programs
   d. GRE scores
   e. Percent female
   f. Percent Asian/Pacific Islander
   g. Percent other minority
   h. Number of publications in first K postdoctoral years
   i. Number of citations in first K postdoctoral years
   j. Percent who apply for research grants in first K postdoctoral years
   k. Percent who receive research grants in first K postdoctoral years
   l. Percent in academia K years after termination of training
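Most of the participant-characteristic columns are simple summary statistics (medians or percentages) computed over individual records for one program category in one year. A minimal sketch of how two such cells might be computed, using invented records:

```python
from statistics import median

# Hypothetical individual records for one program category in one year.
records = [
    {"trainee": "p1", "training_months": 24, "female": True},
    {"trainee": "p2", "training_months": 36, "female": False},
    {"trainee": "p3", "training_months": 30, "female": True},
]

def median_training_months(recs):
    """Cell for a 'median length of training' column."""
    return median(r["training_months"] for r in recs)

def percent_female(recs):
    """Cell for the 'percent female' column."""
    return 100.0 * sum(r["female"] for r in recs) / len(recs)

print(median_training_months(records))    # -> 30
print(round(percent_female(records), 1))  # -> 66.7
```

The same pattern extends to the other columns; the hard part, as the problems above make clear, is not the arithmetic but assembling accurate, consistently coded individual records to feed it.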
