Biomedical and Behavioral Research Scientists: Their Training and Supply: Volume III: Commissioned Papers (1989)

Chapter: 1. Evaluating the National Research Service Award Program: A Review and Recommendation for the Future


EVALUATING THE NATIONAL RESEARCH SERVICE AWARD PROGRAM: A REVIEW AND RECOMMENDATIONS FOR THE FUTURE

Georgine M. Pion1

The focus of this paper is (1) to review previous evaluation activities of the National Research Service Award (NRSA) program and (2) to propose an agenda describing the types of evaluation activities that should be carried out over the next 5 years. In line with this emphasis, a description of the major evaluation questions of interest to key program constituencies will be presented. Then, previous evaluation efforts will be discussed in terms of whether they addressed these questions and provided answers that could be viewed with a reasonable degree of confidence. The "match" between the questions of interest and the availability of sound evidence for answering these questions, as gleaned from previous evaluations to date, will serve as the basis for recommending future evaluation priorities.

Throughout this paper, two considerations should be kept in mind. One concerns the diversity of the NRSA program itself. At first glance the overarching goal of this program is relatively straightforward: to train individuals for health-related research and teaching careers. However, in achieving this mandate, several different components and activities are encompassed by the program. For example, NRSA awards support training in a heterogeneous group of disciplines, ranging from genetics to health services research, and activities are administered by a variety of federal agencies and institutes, each with varying levels of experience in supporting research training. The training sponsored by these agencies also differs in terms of academic level (undergraduate, predoctoral, or postdoctoral), target populations (e.g., M.D.s, Ph.D.s, or ethnic minorities), and strategies (e.g., short-term training versus formal degree programs or disciplinary versus multidisciplinary approaches). Further, distinct funding mechanisms (individual fellowships versus institutional training grants) are used to support activities, incorporating different selection procedures and educational strategies (e.g., "one-to-one" individually negotiated student/mentor apprenticeships versus more formally structured degree programs within an institution). Consequently, previous evaluations of the NRSA program differ in terms of the specific program of interest, the target populations examined, the training activities involved, and the outcomes studied.

It also must be remembered that the NRSA program is but one benefactor of research training. Both the National Institutes of Health (NIH) and the Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA) sponsor other programs with quite similar aims (i.e., increasing the supply of productive researchers in health-related areas). These programs either directly or indirectly sponsor research training and/or research career development (e.g., Research Career Development Awards, Clinical Investigator Awards, the Minority Biomedical Research Support program, and individual investigator R01 grants). In addition, other federal agencies and nonfederal organizations support research training in biomedical and behavioral research at some level (e.g., the National Science Foundation's Graduate Fellowship Program and the Robert Wood Johnson Clinical Scholars Program). Thus, the goals, functioning, outcomes, and effects of NRSA programs must be viewed within the larger context of research training occurring in university departments, medical schools, faculty laboratories, and independent research centers.

1I would like to thank David Cordray, Peter Rossi, Robert McGinnis, Grace Carter, and Robert Boruch for their critical and insightful comments on previous versions of this paper. Also, all the individuals interviewed, particularly Charles Sherman and Walter Schaffer, deserve special thanks for their willingness to answer questions, identify and locate materials, and discuss issues. The opinions expressed in this paper are the author's and do not reflect those of either the author's affiliation--the Vanderbilt University Institute for Public Policy Studies--the Committee on Biomedical and Behavioral Research Personnel, or the National Research Council.

WHAT TYPES OF EVALUATION QUESTIONS ARE OF CURRENT INTEREST?

In reviewing the quality of NRSA evaluation efforts, a major issue concerns the extent to which previous evaluations have addressed questions posed by major constituencies. Given that evaluations are intended to provide useful results, studies should speak to the key concerns expressed by the various stakeholders involved with the program. Four major constituencies for the NRSA program can be identified. They include (1) Congress, which is responsible for authorizing the program and appropriating funds; (2) NIH and ADAMHA, along with their individual institutes, which are in charge of administering the programs; (3) the individual fellows, trainees, and faculty involved in NRSA-supported training activities; and (4) other audiences with vested interests in training researchers (e.g., professional associations, scientific societies, and national "blue ribbon" committees concerned with research and science policy).

In order to identify the issues of primary interest to these constituencies, relevant legislation and evaluation reports were reviewed. Individual interviews with congressional staff, federal agency personnel responsible for NRSA policies and evaluation activities, and individuals in charge of specific NRSA programs (N = 16) also yielded insight into the questions for which evaluative data are sought.2

It should be noted that interest in these questions is not always generated independently by each constituency, nor is each constituency equal in terms of the urgency with which its demands are accommodated. For example, questions dictated by reauthorizing legislation and formal requests from congressional oversight committees to federal agencies mandate a response; Congress indeed is the holder of the purse strings, and given limited time and resources, its requests often rank higher on the list of agency priorities for evaluation. In addition, the evaluation questions of most interest to a particular group often depend on the extent of its involvement with the NRSA program. For example, agency staff whose major responsibility lies in administering institutional training grants may be most enthusiastic about collecting data that could improve their ability to monitor and guide programs; in contrast, scientific societies' demands may stem primarily from their desire to develop stronger arguments for increased NRSA funding in their respective discipline(s).

The major evaluation questions that have been and/or currently are of interest to key NRSA constituencies can be categorized into seven generic types. These include questions about the following:3

o the demand for the NRSA program (e.g., the adequacy of the current supply of biomedical researchers);

o levels of program participation, including numbers and characteristics of awardees;

o characteristics of program operation and functioning, such as whether payback requirements affect the attractiveness of the NRSA program to qualified applicants;

o program outcomes (e.g., the research career accomplishments of awardees) and/or program effectiveness (e.g., whether the subsequent success of awardees in obtaining federal grants is directly attributable to the program);

o outcomes and/or effectiveness of individual NRSA components (e.g., whether the Medical Scientist Training Program is more successful in training physician/investigators than extramural postdoctoral traineeships, intramural fellowships, physician/scientist awards, and/or a combination of training support mechanisms);

o cost-effectiveness of the NRSA program; and

o the development and maturation of scientific careers in general and the role of research training in this process (e.g., the components and determinants of scientific productivity).

2A list of individuals interviewed is available from the author upon request.

3Because of time constraints, attention was focused on those constituencies most involved in setting priorities and administering policies for NRSA programs (i.e., Congress and federal agencies).

Appendix A provides detailed examples of the questions that emerged during interviews with congressional staff, federal agency personnel, and others involved in research training activities and policy.

DEMAND FOR THE PROGRAM AND LEVELS OF PARTICIPATION

Definitive data on the need for research training support and on levels of participation in NRSA programs are "bottom line" demands of all major stakeholders. For example, both the authorizing and reauthorizing legislation for the NRSA program (e.g., P.L. 93-348 and P.L. 100-607) specify that awards are to be made only in areas/fields that have demonstrated a need for researchers. As such, congressional appropriation committees traditionally have sought to base their fiscal decisions on this information, and actual "numerical recommendations" that indicate the number of training slots necessary to address shortages of researchers in specific areas have been frequently requested.4 Agency staff share this predilection for reasonably precise estimates of researchers needed in specific fields, disease categories, or problem areas. Other groups, including both those who lobby Congress regularly for NRSA funding in individual disciplines and those concerned with the overall health of the scientific enterprise, also clamor for better projections of supply and demand. Occasionally, these stakeholders have even launched their own data collection efforts in an attempt to obtain this information (e.g., Barries, 1986; Porter, 1979).

4In addition to specifying the number of training slots needed in a field or research area, there was a consensus among congressional staff that better explanations of the ways in which recommendations were derived (e.g., the assumptions underlying supply and demand models) were needed.
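To make the arithmetic behind such "numerical recommendations" concrete, the sketch below works through a toy stock-and-flow projection. It is illustrative only: the function, parameter names, and all numbers are hypothetical and are not drawn from this paper or from any NIH/ADAMHA supply and demand model.

```python
# Illustrative only: a toy stock-and-flow projection of the kind that might
# underlie a "numerical recommendation." All values are hypothetical and do
# not come from the paper or from any agency model.

def recommended_slots(current_supply, target_supply, years,
                      annual_attrition_rate, completion_rate):
    """Estimate annual training slots needed to move the research work force
    from current_supply to target_supply over a given number of years."""
    # New researchers needed each year = replacement of losses + growth.
    replacement = current_supply * annual_attrition_rate
    growth = (target_supply - current_supply) / years
    entrants_needed = replacement + growth
    # Not every trainee completes training and enters a research career.
    return entrants_needed / completion_rate

# Hypothetical example: 50,000 active researchers, a target of 55,000 in
# 5 years, 4% annual attrition, and 60% of trainees entering research.
slots = recommended_slots(50_000, 55_000, 5, 0.04, 0.60)
print(f"Recommended annual training slots: {slots:,.0f}")  # 5,000
```

As the footnote above suggests, the usefulness of such a figure depends entirely on the defensibility of the assumed attrition, growth, and completion parameters.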

Related to supply and demand issues are questions about the "niche" occupied by NRSA programs in the overall landscape of research training support. All constituencies interviewed want to know the types of sponsors, the levels of their investment, and the major priority areas for funding. Congress, in particular, wants such descriptive information so as to ascertain what the appropriate role of the federal government should be in the research training enterprise.

A third question of perennial interest, frequently arising at congressional hearings, centers around the distribution of NRSA programs and funds. All constituencies want an accounting of awarded fellowships and traineeships, the research fields supported (e.g., nursing or primary care research), and changes over time. Such data are perceived as crucial to determining whether NRSA expenditures are targeted at "shortage" areas, to ascertaining whether agencies have responded to specific congressional directives, and to identifying where changes in NRSA program priorities or policies may be warranted. Also viewed as important is information on the characteristics of awardees, typically in terms of their sex, race/ethnicity, and institutional affiliation. Many of these questions have been spurred by disappointment in the low rates of participation by women and ethnic minorities in science, coupled with a concern that the nation's pool of scientists and engineers may prove inadequate to meet future challenges (e.g., Office of Technology Assessment, 1985; Vetter, 1989).

PROGRAM CHARACTERISTICS AND OPERATION

Of primary interest to federal agency staff who administer NRSA programs and policies are questions related to program functioning. These questions are quite diverse in their scope and content. They include requests for information on how institutional review groups (IRGs) make decisions about training grant awards, the amount and types of research training received by predoctoral and postdoctoral trainees, and whether faculty mentors indeed have active research programs in areas most relevant to an institute's goals and objectives. Program officers, however, are not the only source of these questions. The legislation for the NRSA program itself (e.g., P.L. 100-607) speaks to the general need for program monitoring so as to "determine what modifications in the [NRSA] programs are required to meet the needs [for research personnel]." More explicitly framed "operational" questions also have been posed by Congress, including how the payback requirement and current stipends for NRSA awards affect participation in the program.

PROGRAM OUTCOMES AND PROGRAM EFFECTIVENESS

Questions related to both absolute outcomes (the accomplishments of NRSA trainees and fellows) and comparative outcomes (e.g., the performance of NRSA-funded predoctoral students in the life sciences as contrasted with those supported by the National Science Foundation) are specified clearly in the legislation. All authorizing and reauthorizing language states that the National Academy of Sciences shall "identify the kinds of research positions available to and held by individuals completing [NRSA and other current training programs]" (e.g., P.L. 100-607, Part F, Section 489). Another example appears in the Health Research Extension Act of 1985 (P.L. 99-158), which requested data on the "number of persons who receive NRSA awards and who engage in health research or training as a career."

On the other hand, questions related to program effectiveness (i.e., whether outcomes are directly attributable to NRSA-supported training) are much less frequent and less clearly enunciated. A broad and relatively vague mandate for effectiveness data appears in the law (P.L. 100-607); NAS is directed to "assess [current NRSA programs] and other current training programs available for . . . such personnel." Aside from this fairly global injunction, however, being able to confidently link NRSA training with specific achievements ranks lower on Congress's list of evaluation priorities. Agency staff members also express less interest in effectiveness questions, particularly if the attention paid to them is at the expense of other data collection efforts. What does generate enthusiasm among this group is obtaining better information on program outcomes--both in absolute and comparative terms. Training officers at the various institutes want to know what happens to their awardees--for example, whether they remain active in research, whether their research is in the area of their NRSA training, and whether they have been instrumental in training other researchers. If these outcomes ultimately can be contrasted with the performance of individuals who received other types of research training supported by either their own agencies or other sponsors, this would be an additional asset.

It is likely that the lower priority assigned to addressing effectiveness issues stems from an array of factors. For example, there is an awareness of the enormous difficulty and cost involved in obtaining unequivocal data on the effects of research training, given the current structure of these programs (e.g., the heterogeneity of training experiences and the lack of uniformly applied selection criteria). Related to this is the strong sentiment, based on the substantial erosion in NRSA training monies over the last decade, that the first priority for spending any additional funds appropriated by Congress must be to increase the number of training slots rather than to initiate rigorous impact evaluations. Further, in many programs staff members maintain that the necessary data for answering more basic questions about program demand and operation are not available and that this situation must be corrected before such "second-order" questions as program effectiveness are considered.

When questions as to the effectiveness of NRSA programs do surface, they typically center around issues of relative effectiveness. For example, data that can "tease out" the effects of NRSA programs in producing biomedical researchers relative to the performance of other research training programs with similar goals are deemed more salient than evaluations aimed at understanding whether NRSA training is more effective than no research training at all or than research training financed entirely by the individual through loans or other personal sources.

OUTCOMES AND/OR EFFECTIVENESS OF INDIVIDUAL PROGRAM COMPONENTS

As previously mentioned, the NRSA comprises a heterogeneous group of programs, many of which also have distinct program components. These include different funding mechanisms, different target populations, and different training philosophies and strategies. Outcomes associated with these individual components and their relative effectiveness have comprised the focus of congressional and administrative inquiries. For example, the Health Research Extension Act of 1985 requested a study on "the effectiveness of [the training grant] mechanism in encouraging individuals to engage in health research and training as a career." Of constant concern to agency staff is "what works best" among or within NRSA components. Illustrative of this interest are such questions as "Are M.D./Ph.D. programs or postdoctoral traineeships more efficacious in producing physician/investigators?" and "Is predoctoral training that is grounded in a particular discipline more successful for increasing the number of researchers attacking alcohol-related health problems than predoctoral training that incorporates several disciplinary perspectives and methodologies?"

COST-EFFECTIVENESS

For the most part, cost-effectiveness questions do not constitute a high priority among major constituencies. The few questions that emerged in the interviews pertained to identifying ways to "best use the training buck," particularly if research training funds continue to erode. Somewhat related to this concern are more global questions associated with the personal, disciplinary, and social costs incurred from having an insufficient amount of research monies available to support the number of high-quality applications for individual investigator awards from researchers who have been trained in NRSA programs.

DEVELOPMENT AND MATURATION OF SCIENTIFIC CAREERS

More frequently, the questions of interest to key NRSA constituencies are those that address research training, scientific productivity, and scientific career development in general rather than with regard to NRSA programs in particular. These questions span a variety of topics, including the relationship between research training and the quality of research, the factors governing an individual's choice to pursue a scientific career, and the resources required to most successfully maintain a productive research career. Also included in this category are questions posed by agency staff about how best to measure relevant outcomes of research training (e.g., "active involvement in research" and "quality of research"). Although such questions are important for guiding and improving future evaluations of the NRSA program and can indeed be addressed by well-designed studies, it must be kept in mind that providing answers is the sole responsibility of neither the NRSA program nor the evaluation efforts connected with it.

WHERE ARE IMPROVEMENTS IN EVALUATION ACTIVITIES NEEDED?

In the previous section the major evaluation questions of current interest to key constituencies were identified. Although these questions covered all aspects of the program, the priority areas centered around issues associated with demand for the program, levels of participation, characteristics of training and recipients, and program outcomes. To date, past evaluation efforts have provided a wealth of data about National Research Service Awards, but many questions basic to understanding how these programs operate and what happens to awardees remain. Individuals in charge of NRSA programs often continue to find themselves operating in almost a vacuum with regard to having sound, empirical data about how awardees are selected, the characteristics of participants, and the training environments and activities supported. Further, knowledge about the subsequent performance of awardees currently is confined to a limited set of indicators that vary considerably across individual NRSA programs and that incompletely characterize the intended outcomes. Improving this situation (i.e., "filling the gaps") is what must drive the individual items included in any portfolio of future evaluation activities.

Four major gaps exist in terms of having an adequate knowledge base about NRSA programs:

1. Basic questions about program participation and functioning, although of great interest to key constituencies, have remained inadequately addressed.

2. Our understanding of program outcomes, let alone program effects, is still limited.

3. Insufficient attention has been given to determining what works best across and within program components.

4. Evaluation efforts have been sparse in many fields and research problem areas.

The first three gaps focus on "points of slippage" between the types of evaluation questions currently of interest to major constituencies and those that have comprised the thrust of prior evaluation efforts. An examination of the generic questions addressed by the 16 evaluation studies/reports reviewed indicated that 56 percent (N = 9) addressed program participation issues, and 50 percent (N = 8) collected data on program characteristics and operation. The overwhelming majority (94 percent, N = 15) presented information on one or more outcomes for programs or program elements, and 38 percent (N = 6) attempted to address in some way the effectiveness of NRSA programs or distinct components. None of the studies reviewed dealt with issues of cost-effectiveness.5

At first glance these percentages suggest that many of the questions of interest to NRSA stakeholders (e.g., program outcomes) indeed have been addressed. However, it must be remembered that within each of these generic evaluation issues lie a variety of subquestions. For example, questions about program operation encompass the nature of the trainee selection process, the characteristics of training, and the relationship of the payback requirement to participation levels.

5Given that studies could focus on more than one type of evaluation question, these percentages do not sum to 100 percent. Information on how the evaluations were chosen for review and on the broad categories of questions addressed by each is presented in Appendixes B and C, respectively. In classifying these evaluations, distinguishing between "outcome" studies and "effectiveness" studies has been in many cases a matter of judgment. For the purposes of this paper, an effectiveness study is one that incorporated either methodological (e.g., matching) and/or statistical procedures so as to control for selectivity bias. Those studies that attempted to compare outcomes but did not include any real consideration of selectivity bias were designated, rightly or wrongly, outcome studies.
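As a concrete illustration of the matching procedures mentioned in footnote 5, the sketch below pairs each awardee with the most similar non-awardee on observed covariates before outcomes are contrasted. It is a minimal, hypothetical sketch: the records, field names, and distance rule are invented and do not reconstruct any reviewed study's method.

```python
# Hypothetical illustration of covariate matching as a simple control for
# selectivity bias (cf. footnote 5). Records and field names are invented.

def match_on_covariates(awardees, comparisons, covariates):
    """Pair each awardee with the most similar unused comparison case,
    where distance = number of covariates on which the records disagree."""
    pairs = []
    pool = list(comparisons)
    for a in awardees:
        best = min(pool, key=lambda c: sum(a[v] != c[v] for v in covariates))
        pairs.append((a, best))
        pool.remove(best)  # match without replacement
    return pairs

# Invented data: one awardee and two potential comparison cases.
awardees = [{"field": "genetics", "phd_year": 1975, "prestige": "high", "grants": 3}]
comparisons = [
    {"field": "genetics", "phd_year": 1975, "prestige": "high", "grants": 1},
    {"field": "nursing", "phd_year": 1974, "prestige": "low", "grants": 0},
]

pairs = match_on_covariates(awardees, comparisons, ["field", "phd_year", "prestige"])
# Outcome contrast within matched pairs, e.g., mean difference in grants held:
mean_diff = sum(a["grants"] - c["grants"] for a, c in pairs) / len(pairs)
print(f"Matched-pair difference in grants: {mean_diff:+.1f}")
```

Matching on observed covariates only reduces, and cannot eliminate, selection on unobserved characteristics, which is why even the studies classified here as effectiveness studies support causal claims only weakly.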

As shall be seen, the match between constituency priorities and evaluation efforts on these more specific questions is where the discrepancies surface. (See Appendix A for a detailed enumeration of the questions posed by constituencies and Appendix D for a listing of those addressed by evaluation activities.)

GAPS IN UNDERSTANDING PROGRAM PARTICIPATION AND OPERATION

Although past evaluation efforts have addressed aspects of program participation and operation, several issues have escaped careful examination, including some that are basic to understanding any discrete program or intervention (e.g., program implementation). This situation is partly an outgrowth of the limited amount of resources that have been allotted for evaluation activities. Consequently, some programs (e.g., those sponsored by ADAMHA) have received little scrutiny. Another problem has concerned the fact that when evaluations were initiated, the short timelines imposed often dictated that the focus be on collecting outcome data (no mean feat by itself), with only secondary attention given to examining participation levels or program characteristics. As a result, summary profiles describing NRSA applicants, awardees, and program activities are either nonexistent, sketchy, or idiosyncratic in terms of the populations covered, the variables of interest, and the time periods examined.

Needed Information on the Demand for the Program

Although the development of better supply and demand indicators for biomedical and behavioral science research personnel is covered more thoroughly in the full committee report, one related component deserves special mention in this paper. This concerns the extent of our knowledge about the research training enterprise as a whole (e.g., the total amount of funds, training opportunities, and types of training provided by all sponsors). Congress's motivation for having such information stems from its desire to ascertain what its role should be in financing research training and then to apply this understanding when making decisions about the NRSA program. Similarly, the interest of NIH and ADAMHA staff arises from their wish to better understand their own agency's total involvement in research training, particularly via mechanisms other than those covered under the NRSA umbrella (e.g., research assistantships paid by grants to individual investigators). Answers to these questions also are requested in the charge for evaluation specified in the authorizing and reauthorizing legislation for the NRSA program: to "assess [current NRSA programs] and other current training programs available for the training of such personnel" (P.L. 100-607, Part F, Section 489).

To carry out this charge, a map of the geography and topography of non-NRSA funding sources and mechanisms for research training must be constructed. Previous committee reports (e.g., National Research Council, 1977, 1981) have attempted to survey this terrain, but this is no easy task. Currently, the best sources of data are the annual Survey of Earned Doctorates (SED) conducted by the National Research Council and the National Science Foundation's (NSF) Survey of Graduate Science Students and Postdoctorals. However, each has certain limitations. For example, the NSF survey requires institutions to indicate only one source of support for a graduate student. Although respondents to the SED are instructed to identify all sources of support and estimate the percentage of support received from each source, their ability to reconstruct these data accurately is unclear.

Understanding the variety of research training activities sponsored by NIH and ADAMHA via non-NRSA mechanisms represents one step toward mapping the terrain, however. Of particular interest here is predoctoral and postdoctoral research training paid by research grants to individual investigators. Available data suggest that the use of this mechanism in supporting research training is not infrequent; research assistantships paid by federal and other grants were a source of predoctoral support for 16 percent of the 1987 Ph.D. recipients in the life sciences--an almost equal percentage to that reported for NIH traineeships (Coyle and Thurgood, 1989).

Developing the capacity to obtain detailed training support data for all Public Health Service agencies may be more feasible than one might expect, given earlier and more recent efforts by NIH and ADAMHA. Information on all paid and unpaid personnel working on research grants was collected on a sample basis for all PHS grants in 1963 and NIH grants in 1969; beginning in 1973, these data were again requested of all NIH grantees in the NIH Research Grants Manpower Survey. Unfortunately, this effort was abolished in 1980 despite a reasonably favorable evaluation (Williams, 1979). ADAMHA staff members currently are investigating the feasibility of implementing a similar system for their own research grants and have already developed the system specifications, along with conducting some preliminary pilot tests (see Tjioe, 1989, for a description of this system). Variables in this system include name, social security number, role on the grant, type of position, sex, highest degree(s), year of degree, birth date, field (e.g., surgery), and research discipline (e.g., brain damage) for all personnel connected with awards made by ADAMHA institutes.
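The variables just listed amount to one record per person per award. The sketch below casts them as a minimal record layout; it is purely illustrative, with field names and types assumed rather than taken from the actual system specification described by Tjioe (1989).

```python
# Illustrative only: a hypothetical record layout for the grant-personnel
# data system described above. Field names and types are assumptions; the
# actual ADAMHA specification is not reproduced here.

from dataclasses import dataclass
from typing import Optional

@dataclass
class GrantPersonnelRecord:
    name: str
    social_security_number: str  # person-level identifier; enables linkage
    award_number: str            # the institute award the person works on
    role_on_grant: str           # e.g., "principal investigator", "research assistant"
    position_type: str           # e.g., "paid", "unpaid"
    sex: str
    highest_degree: str          # e.g., "Ph.D.", "M.D."
    degree_year: Optional[int]   # may be missing for personnel still in training
    birth_date: str
    field: str                   # e.g., "surgery"
    research_discipline: str     # e.g., "brain damage"

# With one such record per person per award across all PHS agencies, questions
# about how research training is supported across fields and levels of
# training could be addressed by straightforward tabulation.
```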

If resources were available to establish and maintain this data base for all PHS awards, questions relating to the various ways in which research training is supported by these agencies, along with differences across fields and levels of training, could be addressed. Moreover, the ability to address additional questions concerning research training and personnel, including detecting shifts over time, would be enhanced. For example, postdoctoral research associates supported on faculty research grants (R01s) could be identified by the data base at the start of their tenure; these individuals (or a sample) then could be surveyed about their training (e.g., the extent of their involvement in relevant activities and the nature of their relationships with advisors and other faculty) so as to determine whether and how these experiences may differ from those received by NRSA-funded individuals. To the extent that students are supported by various types of mechanisms, a more complete picture of each trainee's or fellow's total PHS-supported research training would be obtained. This would work toward acquiring a better sense of an individual's training support history and help improve efforts to elucidate the relationship of outcomes to types and lengths of research training.

Needed Information on Participation Levels and Participant Characteristics

Data relevant to questions asking for the numbers of awardees or positions are readily available. They are published annually in data books and other reports issued by NIH (1987, 1988) and ADAMHA (1989a), along with their individual institutes. They also appear in the majority of the evaluation studies reviewed, along with each of the committee's previous reports (e.g., National Research Council, 1983, 1985). Depending on the specific report, information on the number of awardees may be disaggregated by type of training mechanism (individual fellowship versus traineeship), level of training (e.g., predoctoral versus postdoctoral), type of training (e.g., M.D. or Ph.D.), major field grouping, sponsoring institute (e.g., National Cancer Institute), or some combination of these variables.

Detailed profiles depicting even basic demographic characteristics of participants (e.g., sex, race/ethnicity, and educational background), however, are less frequently found in formal evaluation reports, although there are some exceptions. A few evaluation studies (i.e., Coggeshall and Brown, 1984; Garrison and Brown, 1986; National Institute of Dental Research Ad Hoc Consultant Panel, 1988; Velletri et al., 1985) did report descriptive data on awardees' educational backgrounds. Information on the sex of awardees was included in Garrison and Brown's (1986) study of NIH postdoctoral appointments. Data on race/ethnicity typically have appeared only in internal program reports or evaluations of the Minority Access to Research Careers (MARC) programs (Garrison and Brown, 1985; Primos, 1989a,b; Sherman, 1983b; Task Force on Minority Research Training, 1986).

Developing detailed profiles or time series of participation levels by one or more of these major indices from these documents also is not easy. In part, this is because there exists no agreed-upon format for how best to report these data. It is not uncommon for individual evaluation studies and agency publications to use slightly different lexicons for classifying research fields and specialties, to apply different schemes for aggregating individual disciplines into major field clusters, or to employ different counting strategies (e.g., number of individuals versus full-time equivalents). For example, in some instances MARC awards have been included under predoctoral awards, and in other documents they have been demarcated; sometimes awards made by the Fogarty Center have been included in the total counts (National Research Council, 1979), and sometimes they have not (National Institutes of Health, 1987). Similarly, in 1981 ADAMHA began using a different system for classifying the major fields of its trainees and fellows. Such discrepancies often may be difficult to detect, even for the most savvy user, because of the variability across studies in the use of detailed footnotes.

A more central problem lies in the fact that the accuracy of the information for some demographic variables in the major research training data bases is questionable. Look (1989) identified problems in the IMPAC data file maintained by NIH (the master data base from which the Trainee-Fellow File [TFF] is constructed); these included nonreporting and incorrect reporting of gender and race/ethnicity, along with inconsistent reporting of data on discipline, field, and specialty.

It is the case that demographic data are available from other sources, which might then be used to fill gaps in existing agency files. For example, for some programs, particularly small programs such as those administered by the Minority Resources Branch at the National Institute of Mental Health (NIMH), demographic data on trainees and fellows funded by the MARC program are collected and maintained by program personnel (e.g., Primos, 1989a). Information on all Ph.D.s awarded by U.S. universities, which includes those who received NRSA support, also is collected by the SED. However, gaining access to and merging these files with PHS data require resources, and monies typically become available only when large-scale evaluation studies, which focus on outcomes, are commissioned. In addition, information on non-Ph.D. populations (e.g., M.D.s) has proved more difficult to obtain (see Carter, Robyn, and Singer, 1983, and Martin, 1986, for examples of problems with specific data bases).
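Merging such files is essentially a record-linkage exercise: agency records with missing demographic values are matched against a second source on a shared identifier, and the gaps are filled where a match exists. The sketch below illustrates the idea; the keys, field names, and records are all hypothetical, and no actual IMPAC, TFF, or SED layout is assumed.

```python
# Hypothetical illustration of filling demographic gaps in an agency file by
# linkage to a second source. Keys, field names, and data are invented.

def fill_gaps(agency_records, other_source, key, fields):
    """Fill missing (None) values in agency_records using other_source,
    matching records on a shared identifier."""
    lookup = {rec[key]: rec for rec in other_source}
    for rec in agency_records:
        match = lookup.get(rec[key])
        if match is None:
            continue  # no linkage possible; the gap remains
        for f in fields:
            if rec.get(f) is None and match.get(f) is not None:
                rec[f] = match[f]
    return agency_records

# Invented example records:
agency = [{"id": "A-1", "name": "Doe, J.", "sex": None, "race_ethnicity": None}]
sed = [{"id": "A-1", "name": "Doe, J.", "sex": "F", "race_ethnicity": "Black"}]
filled = fill_gaps(agency, sed, key="id", fields=["sex", "race_ethnicity"])
```

In practice, a clean shared identifier rarely exists across files, which is one reason such merges demand the resources noted above.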

some descriptive data can be ''patched together" from previous evaluation studies, many audiences do not know answers to such simple questions as "What have been the trends in NRSA awards in the clinical sciences?" and "What percentages of women and ethnic minorities have received NRSA support?" This same situation is even more characteristic of applicants to NRSA programs, although the issue is somewhat more complicated. "Applicants" include individuals who apply for NRSA fellowships and institutions that apply for NRSA training grants. Some information is available on the numbers and characteristics of individual applicants for fellowships, and although fellowships do not constitute the bulk of training slots funded, their numbers are not insubstantial (147 fellowships for NIMH in 1987 and 1,664 for NIH in 1986~. However, individual applicants who might apply for training grant slots cannot be identified. These persons are selected by training grant directors/committees at individual institutions, and information is seldom reported on unsuccessful candidates, assuming that at J east in some cases individuals do apply and some type of selection process occurs (most likely for postdoctoral traineeships). The selection processes inherent in the training grant mechanism thus make a comprehensive profile of all applicants unworkable without primary data collection efforts such as those initiated by Bickel et al. (1981) for the MST program. At the same time, determining whether the award selection process is working as intended and ultimately the effects of the training grant mechanism require that attention be paid to gathering applicant data. In addition, the characteristics of the institutions that apply for training grants are worthy of examination in terms of understanding the geographical distribution of applicants and awards, the factors correlated with awarding funds to a grant application, and so forth. Increased attention to collecting data on participation issues would contribute to laying a firmer groundwork for understanding not only the demand for the program but also issues associated with how individual or program characteristics may be related to certain successful outcomes. For example, relationships have been found between sex and grant activity for postdoctoral fellows (e.g., Garrison and Brown, 1986) and NSF predoctoral fellows (Snyder, 1988) and between certain characteristics of training programs and subsequent grant application efforts of trainees (C. Roth, personal communication, June 1989~. Needed Information on Program Characteristics and Activities There exists a plethora of questions about the characteristics of NRSA programs and their functioning (see Appendix A). These questions not only include ones raised by Congress, which typically focus on program procedures and 14

regulations (e.g., the payback requirement), but also those generated by agency staff (e.g., the nature of the trainee selection process and the types of training activities carried out in funded programs). About half (56 percent) of the 16 evaluation studies reviewed devoted some attention to program "operational" issues (see Appendix C). For example, descriptive statistics on the duration of NRSA support were contained in evaluations of NIH predoctoral appointments (Coggeshall and Brown, 1984; National Research Council, 1976; Vel~etri et al., 1985), NIH postdoctoral appointments (Garrison and Brown, 1986), and the National Institute of Drug Abuse and NIDR trainees and fellows (Clouet, 1986; NIDR Ad Hoc Consultant Panel, 1988~. Related to this is the need for data describing the extent to which individual awardees receive multiple NRSA awards (e.g., predoctoral and postdoctoral). This aspect has been addressed in a few studies (e.g., ADAMHA, 1986, unpublished report; Coggeshall and Brown, 1984 ; Garrison and Brown, 1986~. Such data are important in order to obtain a good sense of the "dosage" of training and how it might subsequently red ate to measures of outcomes and effectiveness. Other aspects of program operation also have received some scrutiny. In response to a specific congressional request, the influence of the payback requirement on the number and quality of applicants and the number of awardees who pursued health research or training careers was examined by NIH ~ 198 6 ~ . An in-depth exploration of NIH program "processes" was undertaken by the National Research Council (1978), with site visits to institutions with training grants conducted to obtain a better sense of how training monies were used and to suggest ways in which training policies might be improved. Garrison and Brown (1985), in their evaluation of MARC undergraduate training grants, also obtained qualitative information gleaned from site visits on such operational issues as the activities on which MARC funds were spent, departmental composition, and recruitment practices. At the same time, a profile of how programs function in terms of recruitment, selection, and actual training activities- -issues currently of interest to major constituencies--is not available for the majority of NRSA programs. The lack of these data is disturbing, not only because program modifications may then fall prey to being guided more by personal judgment and experience than by empirical data but also because the success of other evaluation efforts (i.e., outcome and effectiveness studies) hinges on understanding how participants are selected and the distinct types of training (if any) they receive. Take, for example, the question about how trainees are assigned to training grants, particularly in terms of Ph.D. 15

predoctoral training and M.D. postdoctoral training. It has been speculated that NRSA support is simply viewed by departments as another "pot of money" that can be channeled to students who currently are not receiving other types of financial aid. If this practice is common, it may mean that traineeships, rather than being highly competitive, are reserved for those individuals judged as less qualified to compete for other sources of support (e.g., prestigious university fellowships or postdoctoral appointments). The situation is further compounded if NRSA stipends are lower than those offered by most other sponsors or if payback obligations are viewed as burdensome by the individuals who would be most qualified to receive NRSA support. GAPS IN UNDERSTATING PROD OUTCOMES AND EFFECTS Questions concerned with program outcomes clearly are specified in the legislative authority for the NRSA program, along with being of significant interest to agency staff in charge of these training programs (see Appendix A). Here the focus is on knowing what happens to awardees (e.g., "Are they engaged in health research careers?". As previously discussed, questions concerning program effectiveness also are implied in the legislative authority for the NRSA program and, although not the highest priority, do generate some enthusiasm among some agency personnel. The questions on effectiveness generating the most interest, however, are not those of the breed "Are NRSA-supported predoctoral fellows more likely to be successful health researchers than those who receive no predoctoral training and/or support?" Rather, they address the issue of relative effectiveness. For example, congressional policymakers want to know if NRSA programs are more effective than research training programs administered by NSF. Agency staff want answers to such questions as "Is predoctoral or postdoctoral training more effective in producing researchers in the clinical sciences?" Given the strong interest in outcomes, examining program achievements comprised a major emphasis in the overwhelming majority of evaluation studies reviewed (see Appendix C). For the most part, the unit of analysis for these studies was the individual awardee. The role of NRSA training support and the consequences of losing this support on departments were, however, explored in two of these efforts (National Research Council, 1978, 1981~. The bulk of studies focused on those outcomes that to varying degrees reflect involvement in research, given the legislative authority for the NRSA. There was some heterogeneity in terms of the number of outcomes examined, with a few studies 16

There was some heterogeneity in the number of outcomes examined, with a few studies (ADAMHA, 1986, unpublished report; Schneider, 1980) providing data on essentially only one outcome (success in obtaining NIH/ADAMHA funding or type of employer). Across the various reports, there also was considerable variability in which outcomes were examined, depending on the training sponsor (e.g., NIH or ADAMHA), degree and field of training (e.g., Ph.D. or M.D. and biomedical or behavioral sciences), level of training (e.g., predoctoral versus postdoctoral), and time periods examined. For any given outcome, however, there was considerably less variation in terms of how it was measured. Awardee outcomes typically were operationalized as attainment of the doctorate and pursuit of postdoctoral research training (for predoctoral award recipients); type of employment, usually academic employment; time spent in research; pursuit of and success in obtaining external research grants, particularly grants awarded by NIH and ADAMHA; and publication performance (numbers of publications and citations).

One primary reason for this is that only 4 of the 12 studies involved any primary data collection on awardees; Clouet (1986), the National Research Council (1977), and Sherman et al. (1981) all collected at least some data directly from awardees, and Bickel et al. (1981) surveyed medical school deans about students in their programs. Instead, the typical study has relied on archival data: data on the demographic characteristics, educational history, and employment plans of individuals who have just earned their Ph.D. (the Doctorate Records File compiled from the SED); data on all individuals who have applied for and/or been awarded grants from PHS agencies (the Consolidated Grants Application File, along with similar data, where available, from such other funding sponsors as NSF); data from a biennial sample survey conducted by NSF on the employment activities of Ph.D.-holders in science and engineering fields (the Survey of Doctorate Recipients [SDR]); and employment data from reports submitted by awardees after completion of their NRSA appointments in order to fulfill payback requirements. This reliance on archival data has at least partly resulted from the constraints imposed by limited funding for evaluation, short timelines for reporting, and OMB regulations for data collection efforts contracted by federal agencies.

In addition to gathering information on outcomes for NRSA recipients, four studies (Coggeshall and Brown, 1984; Garrison and Brown, 1986; NIH, 1986; National Research Council, 1976) did address program effectiveness at some level.

These studies were all efforts sponsored by the committee, and for the most part, the major focus was on evaluating NIH programs.6 Given that NRSA supports both predoctoral and postdoctoral training for M.D.s and Ph.D.s and that the training strategies for each of these groups are reasonably distinctive, evaluation activities will be discussed separately.

Predoctoral Training for Ph.D.s

Three major studies examined outcomes associated with NRSA-sponsored predoctoral training (Coggeshall and Brown, 1984; National Research Council, 1976, 1977). In general, the results indicated that NIH awardees distinctly outperformed their comparisons in terms of greater involvement in research (e.g., receipt of additional postdoctoral research training, time spent in research, and grant application/award activity). These individuals also had somewhat better track records in carrying out high-quality research (as measured by citations). Similar to the results of previous studies on the determinants of academic careers (Long et al., 1979; McGinnis and Long, 1988), awardees did not experience any greater success in locating academic employment, once prestige of doctoral institution had been controlled. For each of these findings, however, the causal linkages between NRSA-funded training and these outcomes remain unclear.

For example, in its survey of 1971-1975 Ph.D. recipients in the biomedical and behavioral sciences, along with nurses who had earned their doctorates during the same time period, the National Research Council (1977) found distinct differences between those individuals who had received NRSA predoctoral support from ADAMHA/NIH/HRA (Group 1) and those who did not receive this support (Group 2). The size of these discrepancies, however, varied across the three broad fields.

6None of these studies had designs that could confidently support causal attributions, however. The majority did use multiple comparisons that embodied differing levels of selectivity, but in some cases, comparisons (e.g., predoctoral awardees versus those with no Ph.D. training) were so hopelessly confounded as to be meaningless. Those that paid some attention to issues of selectivity are the ones considered in this paper. Even with this narrower focus, summarizing the results of this smaller set of studies is difficult, given the diversity of programs examined, the variability in career patterns and research activities in different fields and specialties, and the influence that differing time periods of training may have on outcomes (e.g., the effect of labor market expansion and contraction on the availability of academic positions).

Looking at individuals' reported success in obtaining PHS support for their current research, the proportions of NRSA awardees versus those who received no NRSA support were 58.5 percent versus 43.4 percent in the biomedical sciences, 29.5 percent versus 16.0 percent in the behavioral sciences, and 38.1 percent versus 18.4 percent for nurses. NRSA awardees in all three fields also were slightly more likely to report greater time spent in research, although the differences were quite small; the average time reported was 59.8 percent for Group 1 members versus 52.0 percent for Group 2 members in the biomedical sciences, with corresponding percentages of 28.4 percent versus 22.6 percent for behavioral scientists and 15.5 percent versus 11.6 percent for nurse-researchers. Awardees in the biomedical sciences also were much more likely to have spent the first year after their doctorate in postdoctoral study (65.2 percent versus 47.9 percent) in contrast to behavioral science Ph.D.s (15.6 percent versus 10.4 percent). Comparisons of the percentages employed in academic environments for each group yielded little or no difference.

Better performance in research as a function of NRSA predoctoral support may, however, be simply a product of preexisting differences between the groups and differential training experiences. The groups examined in this study were those with NRSA support and those without NRSA support. This latter group is quite heterogeneous, composed of individuals who received other types of federal support, university support, or no financial support for their graduate training. Further, the fact that a variety of predoctoral training experiences characterized this group, some of which may or may not be similar to NRSA-supported training activities, makes interpretation of both the differences and the lack of differences between these two groups impossible.

Significantly better information is available from two studies that used comparison groups designed to help control for differences associated with heterogeneity of training experiences and, to a lesser degree, selectivity. The National Research Council (1976) compared the types of employment (e.g., academic and business) and time spent in various activities from 1968 to 1970 for three groups of Ph.D.s: (1) awardees of NIH predoctoral support; (2) those who received other, identifiable non-NIH predoctoral support (e.g., awards from NSF); and (3) those who received predoctoral support neither from NIH nor from the other agencies covered in the study. In addition, the attempt was made to include only those individuals who had not engaged in postdoctoral study so as to reduce the possible influence of additional formal research training. Similar results were found. Looking at Groups (1) and (2), although the percentage employed in academic environments was almost identical (71.4 percent versus 71.6 percent, respectively), NIH awardees spent more time in research than those Ph.D.s supported by other sources (an average of 53 percent versus 41 percent).

A more recent study (Coggeshall and Brown, 1984) of NIH predoctoral awards also attempted to at least partially control for the heterogeneity of training experiences and selectivity. Looking at those individuals who received their Ph.D.s in the biomedical sciences between 1967 and 1981, three study groups were compared: (1) those who received at least 9 months of NIH predoctoral support; (2) those who earned their degree from the same departments as the first group but who received 0-8 months of NIH support; and (3) those who graduated from departments that did not have NIH training funds. This strategy permitted two important considerations: (1) those departments receiving NIH funds, often the top-ranked departments in the biomedical sciences, apply the same criteria in accepting students, and thus the students enrolled in the doctoral program, regardless of their source of predoctoral support, may be more similar in terms of individual differences (e.g., abilities); and (2) students who are in departments with NIH-funded programs but who are not supported by these funds for an extended length of time may benefit from certain resources accruing to NIH-supported departments.

The most instructive comparison in this study for questions pertaining to specific outcomes associated with NRSA support is between those groups from the same set of departments. The findings of previous studies were confirmed. Although no differences emerged in terms of subsequent academic employment between those with or without NIH support, distinctions did appear in variables related to research performance. For example, the percentage of NIH awardees who subsequently received NIH postdoctoral support was 34.4 percent, as compared to only 20.7 percent of their departmental counterparts. Recipients of NIH training funds also were 32 percent more likely to apply for NIH research grants (30.5 percent versus 23.1 percent) and, if they applied, 13 percent more likely to be awarded them (62.3 percent versus 55.0 percent). In general, there was a stronger tendency for NIH awardees to have published at least one article (e.g., 86.3 percent of NIH-supported individuals versus 63.2 percent of those not receiving support for FY 1977 Ph.D.s) and to have more citations per article (e.g., 8.2 versus 6.0 for this same cohort). By examining data on length of NIH predoctoral support, Coggeshall and Brown provided additional insight into NRSA awards, indicating a clear relationship between length of NIH support and performance on several major outcomes.
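For clarity, the "32 percent" and "13 percent" figures just cited are simple rate ratios between the two departmental groups; the underlying arithmetic is:

    \[ \frac{30.5}{23.1} \approx 1.32 \qquad \text{(about 32 percent more likely to apply)} \]
    \[ \frac{62.3}{55.0} \approx 1.13 \qquad \text{(about 13 percent more likely to be funded, among applicants)} \]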

While such results do suggest that NIH predoctoral support, at least in the biomedical sciences, increases the probability that an individual will have a research career in health-related areas, that contribution nonetheless appears to be small. Analyses regressing the number of NIH research grant applications, the average priority score awarded to NIH grant applications, the total number of articles published, and the average number of citations per article published on years since Ph.D., the quality of the predoctoral institution, and total months of NIH predoctoral support yielded R²s of .08, .06, .06, and .07, respectively (Coggeshall and Brown, 1984); that is, these predictors jointly accounted for less than 10 percent of the variance in each outcome. Thus, it is apparent that other factors (e.g., individual abilities, the nature of research in certain fields, subsequent postdoctoral research training, and available resources for research at the awardee's employment setting) remain plausible and key contributors to producing active researchers in these areas.

Postdoctoral Research Training for Ph.D.s

Three major studies have focused on identifying the outcomes of NRSA-supported postdoctoral training, primarily those of biomedical scientists. In general, those with postdoctoral training, regardless of the sponsor, outperformed on all measures those who were supported for their predoctoral education but who did not choose to pursue additional postdoctoral study. More recent examinations of NIH postdoctoral training in the biomedical sciences have been carried out for 1964-1977 Ph.D. recipients (National Institutes of Health, 1986) and for 1961, 1966, 1971, and 1976 Ph.D. recipients in the biomedical sciences (Garrison and Brown, 1986). Here the major comparison groups were (1) NIH postdoctoral trainees and fellows, (2) Ph.D.s who had likely received postdoctoral training from other sponsors, and (3) those who reported no plans for postdoctoral study at the time they received their degree.

Substantial differences emerged between NIH postdoctoral awardees and those who indicated no plans for postdoctoral study; for example, Garrison and Brown (1986) found that NIH awardees were three times as likely as the "no plans" group to have applied for NIH/ADAMHA research grants (56.9 percent versus 19.6 percent) and four times as likely to have been awarded a grant (40.0 percent versus 9.2 percent). This latter difference was reduced somewhat when only those who applied for grants were considered (70.3 percent of NIH awardees versus 47.1 percent of the "no plans" group). They also were more likely than those with no postdoctoral training to have obtained faculty appointments 8-9 years after their Ph.D. (66.7 percent versus 52.7 percent) and, depending on the specific cohort examined, to have published more articles and received more citations per article. A study by NIH (1986) revealed similar findings in terms of academic employment and research funding activity.

Given that individuals who choose to undergo the additional years of training involved in postdoctoral appointments may share certain characteristics distinct from those of individuals who do not engage in postdoctoral study, comparing the outcomes of NIH postdoctoral awardees with those of individuals whose postdoctoral training was sponsored by other agencies is more appropriate for addressing questions related to the specific value of NRSA support. In the studies previously described, the advantages of NIH support remained, although the differences between the two groups were smaller. Comparing Ph.D.s with NIH postdoctoral appointments to those who had their postdoctoral study supported by another sponsor, the National Research Council (1976) found that the NIH study group spent substantially more time in research (an average of 61.2 percent, as compared to 40.0 percent for those with non-NIH appointments) and published articles that were more frequently cited by their colleagues (e.g., 73.1 citations per person versus 62.0 for those investigators aged 31-40). In contrast, although both groups had high rates of academic employment, the non-NIH-supported postdoctorates were somewhat more likely to be in universities and medical schools (90.5 percent versus 82.2 percent).

As Garrison and Brown (1986) found, NIH awardees continued to outperform, in terms of grant application activity, those individuals whose postdoctoral training was supported via another source (56.9 percent versus 34.5 percent). They also were more likely to have been awarded a grant (40.0 percent versus 22.3 percent); this disparity decreased substantially, however, when considering only those applying for such grants (70.3 percent versus 64.8 percent). There did appear to be some advantage in terms of academic employment; the percentage obtaining a faculty position was 66.7 percent for NIH awardees as compared to 56.7 percent for those with other types of postdoctoral training, but consonant with previous research (McGinnis, Allison, and Long, 1982), this relationship could be primarily accounted for by other factors (e.g., prestige of doctoral institution). Similar results were reported by NIH (1986).

Similar to the situation that exists for predoctoral training, however, multiple regressions on such outcome measures as grantsmanship that controlled for other factors (e.g., selectivity of baccalaureate institution and reputation of doctoral institution) yielded small multiple R²s, ranging for the most part from .06 to .14 (Garrison and Brown, 1986). This reinforces the conclusion that several other factors contribute to fostering successful research career paths and achievements, although little is known about the exact nature and strength of the relationships.

Postdoctoral Training for M.D.s

The role of postdoctoral training for M.D.s was examined by the three studies discussed in the preceding paragraphs. However, the difficulty in interpreting the results--stemming from problems encountered in drawing comparison groups that resemble, in both orientation and experience, M.D.s with NRSA-supported postdoctoral research training--is exacerbated, given that the vast majority of physicians do not follow research careers.

In addition, identifying reasonable comparison groups in these retrospective studies is further complicated by the fact that existing data bases for physicians typically are less complete than those for Ph.D. recipients.

Differences between M.D.s with postdoctoral appointments and those without postdoctoral training, some of which appear to be substantial, were found by the National Research Council (1976) for certain outcomes: employment in medical schools and universities (40.9 percent versus 7.4 percent, respectively); the average amount of time reported in conducting research (10.6 percent versus 2.6 percent); and numbers of publications and citations (e.g., 58.6 citations versus 10.3 citations per person for M.D.s aged 41-50). Through the use of additional comparison groups, a strong relationship between the existence and length of formal research training and outcomes also appeared--a relationship that has been supported by analyses of more recent trainees (Levey et al., 1988; Sherman, 1983a, 1983b, 1989). In addition to the M.D. groups specified above, two other groups were identified: individuals who had earned both an M.D. and a Ph.D. and who had or had not received postdoctoral training. With the exception of average time spent in research, the results showed a ranking among these groups in line with the amount of research training received. For example, the proportions employed in academic settings were 67.5 percent for M.D./Ph.D.s with postdoctoral appointments, 60.4 percent for M.D./Ph.D.s who did not pursue postdoctoral study, 40.9 percent for M.D.s who had NIH-supported postdoctoral appointments, and 7.4 percent for M.D.s with neither a Ph.D. nor postdoctoral training. On each of the four measures used in the study, the performance of M.D./Ph.D.s, regardless of whether they had engaged in postdoctoral study, was higher than that of M.D.s who did not possess a Ph.D.

The two remaining studies tried to draw comparison groups that addressed selectivity issues in some way. Rather than looking only at all M.D.s without postdoctoral training, Garrison and Brown (1986) also identified another group of M.D.s who received their degree in 1965 or 1974, who reported their primary activities to be "research" or "teaching," but who had not received postdoctoral research training. Looking at 1974 M.D.s only, there were differences between this group and NIH postdoctoral trainees and fellows. For example, those M.D.s with NIH-supported postdoctoral training were slightly more apt to have applied for NIH/ADAMHA research grants (18.6 percent versus 12.0 percent) and subsequently to have been awarded funding (8.7 percent versus 5.5 percent).

A comparison of these outcomes between M.D.s who had NIH postdoctoral fellowships and those who had unsuccessfully applied for these fellowships was performed by the NIH (1986).

Although both this study and the Garrison and Brown (1986) study demonstrated that NIH fellows comprise a small and select group of M.D.s with NIH postdoctoral awards, this comparison is instructive, although still equivocal, in that it attempts to address some issues of selectivity. Looking at 1968 and 1971 M.D. recipients, the National Institutes of Health found that NIH fellows consistently outperformed their unsuccessful applicant counterparts in terms of medical school faculty appointments (65.1 percent versus 43.5 percent) and NIH/ADAMHA application activity (27.4 percent versus 19.4 percent). Of those who applied for grants, 59.1 percent of the fellows versus 33.3 percent of the unsuccessful fellow applicants received an award.

In general, all of the previously described studies on predoctoral and postdoctoral training have contributed to our knowledge about certain accomplishments of NRSA awardees. Because of unresolved problems with selectivity and heterogeneity of training experiences, however, they have yet to yield strong evidence concerning the effects of NRSA training.

GAPS IN UNDERSTANDING OUTCOMES AND EFFECTS OF NRSA PROGRAM COMPONENTS

The NRSA program comprises several different programs and/or components. For example, there are two basic award mechanisms--individual fellowships and departmental training grants. These mechanisms can be regarded as distinct components, given that they involve different selection procedures and possibly different training experiences. Evaluation studies have paid attention to identifying outcomes and, occasionally, effects associated with these two funding mechanisms, along with examining similar questions for other types of programs/components.

Differences in the outcomes for trainees versus fellows have been investigated by several efforts (ADAMHA, 1986, unpublished report; Clouet, 1986; Coggeshall and Brown, 1984; Garrison and Brown, 1986; NIDR Ad Hoc Consultant Panel, 1988; NIH, 1986; Velletri et al., 1985). For the most part, each has shown that fellows consistently outperformed their trainee counterparts on all measures of interest.

Garrison and Brown (1986) looked at several distinct programmatic strategies aimed at providing M.D.s with postdoctoral research training. These included training appointments for study at NIH and extramural awards (individual fellowships and training grants). Five groups were examined: (1) M.D.s who had extramural fellowships and NIH intramural appointments; (2) M.D.s who had extramural traineeships and NIH intramural appointments; (3) those who had received only NIH intramural appointments; (4) M.D.s who received only extramural fellowships; and (5) M.D.s who received only extramural traineeships.

In general, the results indicated that individuals with appointments stemming from very competitive selection procedures (extramural fellows and intramural appointees) were more likely to receive research support from NIH or other major biomedical research sponsors, to have academic appointments, and to exhibit better publication records. Those with both forms of training had the highest performance on every outcome measure.

Other distinct NRSA program components also have been evaluated. One program involves the institutional training grants supported by MARC (Honors Undergraduate Research Training Grants). Garrison and Brown (1985) conducted a study of those programs that were sponsored by NIH, gathering data on how the program was functioning, the outcomes of program graduates, and the impact on the institution. For example, they examined the types of activities supported by these programs, student satisfaction with the program, the educational and occupational status of former program alumni, and institutional enrollments of ethnic minorities in the biomedical sciences. From these data, they found general satisfaction with the program, a substantial proportion of program alumni currently enrolled in another graduate or professional program, and a higher percentage of biology baccalaureates awarded by institutions that had MARC programs as compared to those with no MARC program. Limited outcome data on MARC awardees supported by ADAMHA also have been collected, with the most recent results indicating that 75 percent of students supported during the 1980-1986 period were enrolled in graduate school, along with another 4 percent in medical school (Primos, 1989a).

Another component--short-term research training in health profession schools that received support from NIH--was examined by Sherman (1984). This study involved two components: (1) an examination of whether short-term trainees who had been enrolled in the program during the 1961-1970 period had applied for and/or received NIH/ADAMHA research grants and (2) a comparison of the career and research plans and preferences of 1982 and 1983 medical school graduates who had and had not received short-term training. Although the data suggested a stronger research commitment among program participants as compared to those graduates who were not enrolled in this program, selectivity problems make interpretation of these results impossible.

One distinct and highly visible component is the Medical Scientist Training Program (MSTP), which is aimed at training physician/scientists. Only two small-scale studies of this program have been conducted. Bickel et al. (1981) surveyed medical school deans to examine the adequacy of the number of MSTP training slots, given the pool of qualified applicants, and to compare attrition rates for students in MSTP with those for other M.D./Ph.D. programs.

A second study, by Sherman et al. (1981), attempted to assess the effectiveness of MSTP in producing physician/scientists, along with that of other M.D. research training programs (NIH-supported postdoctoral fellowships and traineeships, NIH research associateships, and NIH clinical associateships). Individuals graduating from MSTP between 1968 and 1973 were matched with students who subsequently trained in one of the other three programs and who were similar in certain characteristics (e.g., type of medical school, age, and MCAT scores). The results suggested that MSTP graduates outperformed the other groups in terms of achieving faculty status and advancement, obtaining NIH research grants, and publication performance. Since that time, evaluation activities for the MSTP have been minimal (i.e., a brief telephone survey of graduates by program staff and procurement of additional analyses of the data collected in Coggeshall and Brown's 1984 study of NIH predoctoral trainees and fellows). To date, there are no firm plans for a more comprehensive evaluation, although the committee in 1983 recommended that a better "picture of costs, training completion rates, post-training employment histories, scientific accomplishments, etc." be developed (National Research Council, 1983).

GAPS IN BASIC EVALUATIVE DATA FOR PROGRAMS IN SPECIFIC FIELDS/RESEARCH PROBLEMS

The large majority of evaluation studies have focused on NRSA programs administered by NIH. In contrast, evaluations of NRSA-supported training at ADAMHA are both few in number and restricted in scope and coverage. Of the three efforts reviewed, only one included all ADAMHA awardees; another focused on individuals supported by a single institute (NIDA), and the third examined one specific program within an institute (psychology research training sponsored by NIMH). To date, the primary source of evaluative information about programs in health services research (now residing in NCHSR) and those in nursing, administered by the Center for Nursing Research, is the committee's report issued in the late 1970s (National Research Council, 1977).

Even when evaluations of ADAMHA programs were carried out, these efforts were circumscribed in nature. Activities typically focused on only a small set of outcomes (e.g., success at obtaining PHS research grants or initial employment of predoctoral trainees after receipt of the doctorate). No attempts were made at examining program effectiveness.

This situation is in some ways not surprising.

Since 1975, the bulk of training has been supported by NIH; about 9 of every 10 fellowships and training grants have been awarded by NIH programs (Biomedical and Behavioral Research Personnel Committee, 1989). At the same time, however, this skewed distribution of evaluation efforts has resulted in a lack of information about research training in many fields/research areas, given that ADAMHA, NCHSR, and the Center for Nursing Research provide much of the federal research training support in the behavioral sciences, health services, and nursing.

RECOMMENDATIONS FOR FUTURE EVALUATION ACTIVITIES

The major "gaps" described above imply that any portfolio of future evaluation efforts for the NRSA program should consider the following issues:

o The quality of the major data bases on NRSA appointments should be assessed so as to ensure that information on program recipients covers the key characteristics of most interest, is accurate, and is collected uniformly for all NRSA components.

o A core set of evaluative data to be collected for all research training programs funded by NRSA should be identified.

o Future efforts should be targeted at gathering information on program characteristics and operation, including data on the selection of fellows and trainees, the types of training activities received by individuals in the program, and how these may differ across various program components, fields, and so on.

o Increased attention should be paid to measuring the full range of program outcomes, both in absolute and in comparative terms.

o In order to facilitate the development of outcome assessments, basic research on scientific career development and maturation, scientific productivity, and the dynamics of training should be supported.

o Although interest in determining the effectiveness of NRSA programs is relatively circumscribed and many aspects of the program currently are not amenable to rigorous impact assessments, the feasibility of implementing effectiveness studies, at least for distinct components of the program, should be explored.

ENSURING THE QUALITY OF DATA BASES ON NRSA APPOINTMENTS

The Trainee-Fellow File (TFF), derived from the IMPAC file on all extramural awards made by PHS for research, training, and other activities, is the primary source of information on all persons applying for and receiving fellowships and/or traineeships from NIH, ADAMHA, and other PHS agencies. Thus, having accurate and up-to-date information is extremely important for identifying the populations and subgroups eligible for inclusion in evaluations of the NRSA program. In addition, the ability to successfully use archival data from other major data bases (e.g., the SED and the Consolidated Grants Application File) to augment the TFF would be enhanced.

Clouet's (1986) study of NRSA trainees and fellows supported by the National Institute of Drug Abuse identified misclassification problems in the TFF, at least for NIDA awardees. In the course of identifying NIDA grantees listed in this data base, she found that 67 percent of the student awards identified from agency files were not in the TFF and that over 1,000 people in NIDA nonacademic training programs were included. The extent to which these problems apply to data for other institutes administering NRSA programs is unclear, although interviews with NIH officials did not suggest that they are symptomatic of NIH programs. At the same time, however, Look's 1989 examination of the information available on trainees and fellows suggests that other problems (e.g., inconsistent and even inaccurate reporting on basic demographic characteristics) may exist. Thus, attention should be paid to assessing the quality of data in the TFF and resolving any problems, particularly in light of the following recommendations aimed at increasing evaluation efforts for NRSA programs.

ENSURING A CORE SET OF EVALUATION ACTIVITIES FOR ALL NRSA PROGRAMS

To date, evaluation activities for programs in certain fields or areas (e.g., the behavioral sciences and health services research) have been minimal and haphazard. For this reason there are no bodies of information for these fields comparable to those existing for the biomedical and clinical sciences. For example, the primary source of outcome data on health services research training support is the National Research Council's 1977 survey.

In the case of the behavioral sciences, similar data were collected in the National Research Council's 1977 study, and although other evaluations did include the behavioral sciences in their populations, they were less than illuminating--either because the studies did not analyze the results for the behavioral sciences separately (e.g., Coggeshall and Brown, 1984) or because the measurement of outcomes was restricted to a single variable, with no incorporation of reasonable comparison groups (ADAMHA, 1986, unpublished report; Schneider, 1980).

This distinct gap in our understanding of the role played by NRSA-supported training in major broad fields of biomedical and behavioral research should be remedied. As found in previous studies of research training, distinct differences in patterns of study and accomplishment appear with respect to individual fields (National Research Council, 1981b; Snyder, 1988). Further, those fields that are most lacking in evaluative data are also those that make substantial contributions to addressing key health and mental health problems (e.g., National Research Council, 1985), along with other social concerns (e.g., Gerstein et al., 1988). Moreover, there is some suggestion that there may be future shortages of qualified researchers in these areas. A recent analysis by ADAMHA (1989b) indicated that the pool of researchers working in the alcohol abuse, drug abuse, and mental health areas is aging rapidly; in 1979, 26 percent of ADAMHA-funded investigators were 35 or younger, but by 1987 this percentage had declined to 13 percent. The proportion of young applicants to ADAMHA has followed a parallel trend. In addition, the average age of Ph.D. principal investigators has increased 1.6 times faster than the average age of NIH-funded researchers.

The key point is that a core set of evaluative data should be collected on all NRSA programs and research areas. Only in this manner can the scope and breadth of NRSA activities be examined and strategies developed to improve programs aimed at all fields, levels, and types of research training supported by NRSA funds.

INITIATING EVALUATION ACTIVITIES THAT PROVIDE BETTER INFORMATION ON PROGRAM PARTICIPATION AND OPERATION

At present, it is difficult to characterize what is happening in NRSA programs, regardless of the type, level, or field of training. For example, we cannot readily provide profiles of the basic demographic characteristics of predoctoral and postdoctoral fellows and trainees, the characteristics of institutions receiving NRSA awards, the types of training models and activities being supported, the components of NRSA training that distinguish it from other training programs aimed at producing researchers in the same fields, and so forth.

This dearth of knowledge not only handicaps our ability to understand the basic components of the NRSA program but also hampers the design of studies that can better assess the outcomes (and, ultimately, the effects) of NRSA training and identify with greater confidence which types of training strategies may work better than others (e.g., multidisciplinary programs versus those that focus on one discipline or specialty).

Efforts need to be initiated to rectify this situation. Although retrospective studies of former trainees and fellows have provided useful insight into training gaps and deficiencies (e.g., Gentile et al., 1987), they are limited. Surveys of recent Ph.D.s and M.D.s/Ph.D.s upon receipt of their doctorate could provide some preliminary insight into the types of variables that should be examined and how they can best be measured (e.g., the phrasing of survey questions, if appropriate). The same can be said for surveys of NRSA-supported postdoctoral trainees and fellows who have just completed their postdoctoral appointment. The preferred strategy, however, is to collect data from individuals at key intervals during the training process so as to avoid the problems of selective memory and to obtain a more complete picture of how and when training actually occurs.

Such efforts also could provide data on many other issues. Program implementation could be examined (i.e., did individuals actually receive the training set forth in the funded applications?). If information on trainee satisfaction were collected, the need for modifications in training policies or regulations also might be identified. Finally, if such efforts were extended to a sample of predoctoral students or postdoctoral associates without traineeships in the same academic program, and possibly to students in similar programs in departments without training grants, we would begin to develop a better sense of the strength and integrity of various NRSA "treatments" (e.g., whether NRSA programs provide training experiences sufficiently different from those received by students in the same program or in other programs).

The manner in which trainees are selected is another important area worthy of close scrutiny. We know, at least for the biomedical sciences, that predoctoral trainees tend to receive their training in top-ranked institutions/departments; presumably, the training grant application review process is selecting the best programs, and the departments in which these programs are based are highly selective in terms of graduate student admissions policies.7

7These programs have been described as "contests" (Ross, personal communication, May 1989) in that the most promising applicants are chosen to receive awards. It appears that the main contestants are institutions; the top-ranked research universities clearly are given preference. However, within the departments that house the NRSA programs, it is not clear whether the contest rules are the same, given the problems described in this paper. This area requires further investigation, modeled on the work by Carter et al. (1987) for Research Career Development Awards.

However, we do not know whether those who actually receive traineeships are awarded them because they are the best "match" in terms of the requirements of the NRSA-supported program, because a traineeship is a convenient source of money at a particular period of time, or for other reasons. The same problem plagues applicants for postdoctoral appointments, particularly in medical schools. Needless to say, the ability to ultimately provide answers to questions seeking evidence on effectiveness is hampered by our lack of knowledge about how individuals are selected to receive NRSA support.

INCREASED ATTENTION TO ASSESSING PROGRAM OUTCOMES

Those associated with NRSA programs, particularly at the program administrative level, want to know what works best in research training, either in terms of specific training models (e.g., broad versus specialized training) or in terms of specific populations (e.g., predoctoral support versus postdoctoral support versus both types of support in producing physician/scientists). Previous large-scale evaluation efforts that have had to rely on retrospective assessment strategies have been unable to provide unequivocal answers to these questions because of the problems associated with drawing solid comparison groups.

Given this substantial interest in program outcomes, particularly comparative outcome data, concerted attention should be devoted to examining the full range of outcomes implied by the goals of the NRSA program (i.e., both the research and the teaching activities specified in program announcements and payback requirements for NRSA programs) and to pilot testing new data collection strategies. The issues and variables warranting consideration are discussed by Fox (1983) and Gee (1989) and should be guided by previous research on scientific careers (e.g., Long et al., 1979; McGinnis et al., 1982; McGinnis and Long, 1988; Stephan, 1986). Not only would efforts to accurately measure outcomes and to test various measurement strategies help to better examine scientific productivity and career development, but they also would allow an understanding of the marginal gain associated with using these measures as contrasted with previous approaches that have had to rely on existing archival data bases.

Outcome assessments also could benefit from efforts geared at improving the use of existing archival data bases.

For example, rather than using archival data bases to obtain a "snapshot" of research grant success at one point in time, the feasibility of using such data bases to track individuals over time, similar to the approach taken by Biddle et al. (1988) in examining the career accomplishments of Research Scientist Development awardees, could be explored. Another possibility would be to assess the feasibility of using relevant data from other archival sources that could augment the information on traditionally used measures (e.g., research grant success) or provide data on outcomes that have received only minimal attention (e.g., teaching others to be researchers). For example, Yasumura (1986) found that recipients of K06 awards were more likely to receive PHS training grants--a relevant outcome, based on the request made by Congress for information on the number of awardees who engage in health research or teaching as a career (P.L. 99-158).

In addition, exploring relationships between program characteristics and outcomes also is needed. For example, recent data on NHLBI awardees suggest that those M.D.s who were postdoctoral trainees in programs designed to train both M.D.s and Ph.D.s were more likely to receive an NIH grant than those who were postdoctoral trainees in programs aimed only at M.D.s (C. Roth, personal communication, June 1989). More efforts of this kind can guide subsequent outcome studies in terms of identifying key programmatic variables that should be considered.

Another focus should be on designing outcome studies in which the entire "dosage" of training is measured. At the very least, for outcome studies of ADAMHA, NIH, or other PHS support, this information should be available from the TFF currently in existence. However, given that NRSA research training is just one part of funding for scientific personnel development, measuring the length of training must take into account all types of research training and development (e.g., short-term research training, predoctoral and postdoctoral traineeships and fellowships, and career development awards). If the previously described efforts for developing and/or reinstituting data bases on personnel working on PHS grants also are initiated, they would provide additional information on dosage by supplying data on predoctoral or postdoctoral support financed by faculty research grants from these agencies. The completeness and accuracy of this information then could be checked by actually contacting the individuals chosen for examining outcomes to determine what types of training support they actually received--both from PHS and from other sponsors.

The final point is a call for evaluation efforts that incorporate a more longitudinal perspective. To date, the majority of the studies have been of the snapshot variety--career accomplishments at one point in time. Future evaluation activities ideally must track individuals from the time they begin training (or apply for training opportunities) through the completion of their training and during their scientific careers.

Even more limited efforts would benefit from following the Carter et al. (1987) and Biddle et al. (1988) approach of measuring outcomes (e.g., PHS research grant activity) at several points after the individual has completed training. Only within a longitudinal perspective can a specific achievement or pattern of achievements be interpreted and yield information that can illuminate our understanding of these programs and of research training in general.

INCREASING THE AVAILABILITY OF RESOURCES FOR RESEARCH ON RESEARCH TRAINING

Developing better outcome strategies is at least partially dependent on a better understanding of the nature of scientific careers, what influences productivity, and other similar relationships. Currently, funding for research on scientific careers, resources, and so forth is not abundant. The NSF's Division of Science Resources devotes some funds to supporting studies in these areas (e.g., National Science Foundation, 1988). However, monies for this program are limited, and proposals often must include all scientific and engineering fields rather than those most relevant to NRSA programs or must focus on using preexisting survey data collected by NSF (data that typically include, for example, only small numbers of NRSA-supported individuals in various categories).

Research on scientific career development and enhancement can go a long way toward addressing basic questions about motivations for choosing a scientific career, the factors that facilitate or inhibit research productivity, and so on. In conjunction with these research programs, efforts need to be targeted at developing new measures of scientific productivity and quality of research and at testing their feasibility. Work on assessing the quality of training programs, apart from the accomplishments of their graduates, is another component that requires attention, particularly if future evaluation efforts aimed at judging the quality of training grant recipients (i.e., programs/departments within institutions) are considered worthy of exploration.

The list of research questions is endless, and reasonably strong arguments could be made for choosing any particular question or methodological strategy as the starting point. The major point is that resources for these efforts should be increased, so as to enhance our ability to carry out evaluation activities that can both meet the needs of major NRSA constituencies and assist in the continual improvement of training activities in all fields of health-related research.

EXPLORING THE FEASIBILITY OF EVALUATION DESIGNS FOR ASSESSING EFFECTIVENESS

Given our lack of knowledge about the nature of the selection and training processes, improving the quality of future designs certainly represents a technical challenge. Perhaps the first step toward obtaining better estimates of the relative effectiveness of different types of research training programs, and toward determining the training strategies or mechanisms that work best, is to identify where high-quality designs can be implemented. One way to begin this process is to focus initially on exploring opportunities that may exist for assessing specific components. For example, the MSTP appears to be a promising candidate for consideration, given that training program directors often vigorously recruit students. Thus, a pool of applicants may be available to use as a comparison group. The feasibility of "beating the bushes" to augment the size of the applicant pool, of convincing directors to collect necessary data on applicant characteristics, and of persuading them to adhere to a standard selection process and criteria should be explored. Even if random assignment is not possible (a likely event), the use of such designs as regression-discontinuity is worth examining. This latter type of design was used successfully in Carter et al.'s (1987) evaluation of the Research Career Development Award.
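To make the regression-discontinuity idea concrete, the following is a minimal illustrative sketch, written in Python with simulated data; the cutoff value, variable names, and effect size are hypothetical and are not drawn from any NRSA or Carter et al. analysis. The design exploits the fact that awards follow a priority-score rule: regressing a later outcome on the centered score plus an award indicator estimates the program effect at the cutoff.

    import numpy as np

    # Simulated applicants: lower priority scores are better, and only
    # applicants scoring below the cutoff receive awards (hypothetical rule).
    rng = np.random.default_rng(0)
    n = 500
    score = rng.uniform(100.0, 500.0, n)        # review priority score
    cutoff = 250.0
    awarded = (score < cutoff).astype(float)    # award indicator

    # Simulated outcome (e.g., later publication count): varies smoothly
    # with score, plus a true award effect of 2.0 built into the simulation.
    outcome = 10.0 - 0.01 * score + 2.0 * awarded + rng.normal(0.0, 1.5, n)

    # Fit outcome ~ intercept + centered score + award indicator by least
    # squares; the indicator's coefficient estimates the effect at the cutoff.
    X = np.column_stack([np.ones(n), score - cutoff, awarded])
    beta, _, _, _ = np.linalg.lstsq(X, outcome, rcond=None)
    print(f"estimated award effect at cutoff: {beta[2]:.2f}")  # near 2.0

Because applicants scoring just above and just below the cutoff are nearly comparable, the discontinuity in outcomes at the cutoff, rather than random assignment, carries the causal inference; this is the sense in which such a design can substitute when randomization is infeasible.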

In addition, exploring the feasibility of implementing high-quality evaluation designs for the MST program is attractive, given the level of interest in this component. For example, some concern has been expressed over the available supply of physician-researchers, and M.D.-Ph.D. training programs are being established by other agencies to resolve this problem. At the same time, however, the best way to produce physician-scientists is a point of controversy. Further, questions have been raised about the fact that total expenditures per graduate for MST programs have been increasing over the last decade and are now significantly higher than those for graduates supported by other predoctoral training grants (National Research Council, 1983). Well-designed evaluations of the MST program, focusing on the outcomes of its graduates and on its effectiveness, both on certain outcomes and as compared to other research training alternatives for physician-scientists, would do much toward addressing these issues.

The lower level of interest in effectiveness questions expressed by constituencies, coupled with the difficulty of carrying out high-quality studies to address these questions, does not, however, imply that exploring the feasibility of implementing rigorous designs for estimating program effects is not important. The long-term objective for NRSA evaluation efforts should be to gain an understanding of what works in training, which programs and program elements work better, and how training should be assessed. Small-scale pilot tests of more rigorous approaches can be quite instructive in identifying where better designs can be implemented and can ultimately yield an understanding of research training itself.

Many of the concerns being expressed today (e.g., future shortages of trained scientists, issues of scientific misconduct, and the lack of interdisciplinary research efforts in many major problem areas) have human resources and training at their core. If we are successful in providing a better understanding of NRSA research training and how it operates and contributes to the development of outstanding researchers and research mentors, one step toward addressing these issues will have been taken.

REFERENCES

Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA). 1989a. ADAMHA NRSA Research Training Tables FY 1988. Rockville, MD.

Alcohol, Drug Abuse, and Mental Health Administration. 1989b. Addressing the Shortage of Young ADAMHA Principal Investigators. (Available from Walter Schaffer, ADAMHA, Parklawn Building, 5600 Fishers Lane, Rockville, MD 20857.)

Barries, J. 1986. Projected supply and demand for Ph.D. biologists. Paper presented at the meeting of the American Zoological Society, December 1986.

Bickel, J. W., C. R. Sherman, J. Ferguson, L. Baker, and T. E. Morgan. 1981. The role of M.D.-Ph.D. training in increasing the supply of physician-scientists. New England Journal of Medicine 304:1265-1268.

Biddle, A. R., G. M. Carter, J. S. Uebersax, and J. D. Winkler. 1988. Research Careers of Recipients of the Research Scientist and Research Scientist Development Award. Santa Monica, CA: The RAND Corporation.

Carter, G. M., A. Robyn, and A. M. Singer. 1983. The Supply of Physician Researchers and Support for Research Training: Part I of an Evaluation of the Hartford Fellowship Program (N-2003-HF). Santa Monica, CA: The RAND Corporation.

Carter, G. M., J. D. Winkler, and A. K. Biddle. 1987. An Evaluation of the NIH Research Career Development Award. Santa Monica, CA: The RAND Corporation.

Clouet, D. H. 1986. The Career Achievements of Trainees and Fellows Supported by the National Institute of Drug Abuse. Rockville, MD: NIDA.

Coggeshall, P. E., and P. W. Brown. 1984. The Career Achievements of NIH Predoctoral Trainees and Fellows. Washington, D.C.: National Academy Press.

Coyle, S. L., and D. H. Thurgood. 1989. Summary Report 1987: Doctorate Recipients from United States Universities. Washington, D.C.: National Academy Press.

Fox, M. F. 1983. Publication productivity among scientists: A critical review. Social Studies of Science 13:285-305.

Garrison, H. H., and P. W. Brown. 1986. Minority Access to Research Careers: An Evaluation of the Honors Undergraduate Research Training Program. Washington, D.C.: National Academy Press.

Gee, H. 1989. Productivity. Paper commissioned by the Biomedical and Behavioral Research Personnel Committee, National Academy of Sciences, Washington, D.C.

Gentile, N. O., G. S. Levey, P. Jolly, and T. H. Dial. 1987. Post-doctoral Research Training of Full-Time Faculty in Departments of Medicine. Washington, D.C.: Association of Professors of Medicine and Association of American Medical Colleges.

Gerstein, D. R., R. D. Luce, N. J. Smelser, and S. Sperlich (eds.). 1988. The Behavioral and Social Sciences: Achievements and Opportunities. Washington, D.C.: National Academy Press.

Levey, G. S., C. R. Sherman, N. O. Gentile, L. J. Hough, T. H. Dial, and P. Jolly. 1988. Postdoctoral research training of full-time faculty in academic departments of medicine. Annals of Internal Medicine 109:414-418.

Long, J. S., P. D. Allison, and R. McGinnis. 1979. Entrance into the academic career. American Sociological Review 44:816-830.

Look, M. 1989. Final Report on the Proposed Data Matrix for the Biomedical and Behavioral Research Personnel Committee. Bethesda, MD: Quantum Research Corporation.

Martin, M. R. 1986. The Results of a Manual Look-up of 300 References from Non-foreign Medical School Faculty Clinicians from the B/I/D Journal Set and an Analysis of a Report by the AAMC on the Number and Character of Physician Researchers. (Available from Division of Program Analysis, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20092.)

McGinnis, R., and J. S. Long. 1988. Entry into academia: Effects of stratification, geography, and ecology. In D. W. Brenneman and T. I. K. Youn (eds.), Academic Labor Markets and Careers. New York: Falmer Press.

McGinnis, R., P. D. Allison, and J. S. Long. 1982. Postdoctoral training in bioscience: Allocation and outcomes. Social Forces 60:701-723.

National Institute of Dental Research (NIDR) Ad Hoc Consultant Panel. 1988. The National Research Service Award Program: Report of the NIDR Ad Hoc Consultant Panel. Bethesda, MD: NIDR.

National Institutes of Health (NIH). 1986. Effects of the National Research Service Award Program on Biomedical Research and Teaching Careers. Washington, D.C.: NIH.

National Institutes of Health. 1987. NIH Data Book 1987. Washington, D.C.

National Institutes of Health. 1988. NHLBI Fact Book Fiscal Year 1988. Washington, D.C.: NIH.

National Research Council (NRC). 1976. Research Training and Career Patterns of Bioscientists: The Training Programs of the National Institutes of Health. Washington, D.C.: National Academy of Sciences.

National Research Council. 1977. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy of Sciences.

National Research Council. 1978. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy of Sciences.

National Research Council. 1979. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy of Sciences.

National Research Council. 1981a. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy of Sciences.

National Research Council. 1981b. Postdoctoral Appointments and Disappointments. Washington, D.C.: NRC.

National Research Council. 1983. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy Press.

National Research Council. 1985. Personnel Needs and Training for Biomedical and Behavioral Research. Washington, D.C.: National Academy Press.

National Research Council. 1989. Biomedical and Behavioral Research Scientists: Their Training and Supply. Washington, D.C.: NRC.

National Science Foundation. 1988. Project Summaries: FY 1988 (NSF 88-336). Washington, D.C.: U.S. Government Printing Office.

Office of Technology Assessment. 1985. Demographic Trends and the Scientific and Engineering Work Force (OTA-TM-SET-35). Washington, D.C.: U.S. Government Printing Office.

Porter, B. F. 1979. Transition: A follow-up study of 1973 postdoctorals. In the American Physical Society, The Transition in Physics Doctoral Employment 1960-1990. New York, NY: APS.

Primos, M. E. 1989a. Minority Access to Research Careers Program: Briefing on Program Activities, FY 1980 to 1986. (Available from M. Primos, Minority Research Resources Branch, NIMH, Parklawn Building, 5600 Fishers Lane, Rockville, MD 20857.)

Primos, M. E. 1989b. Minority Fellowship Program: Summary of Program Activities, FY 1973 to FY 1987. (Available from M. Primos, Minority Research Resources Branch, NIMH, Parklawn Building, 5600 Fishers Lane, Rockville, MD 20857.)

Schneider, S. 1980. Positions of psychologists trained for research. American Psychologist 35:861-866.

Sherman, C. R. 1983a. Notes on the NIH Role in Support of Postdoctoral Research Training of Two Groups of Physicians. (Available from Charles Sherman, NIH, 9000 Rockville Pike, Bethesda, MD 20092.)

Sherman, C. R. 1983b. Training and Manpower Development. Presentation at the meeting of the Advisory Committee to the Director, NIH, Bethesda, MD.

Sherman, C. R. 1984. Perspectives on Short-Term Training in Schools of the Health Professions. (Available from Charles Sherman, NIH, 9000 Rockville Pike, Bethesda, MD 20892.)
Sherman, C. R. 1989. The NIH Role in the Training of Individual Physician Faculty: A Supplementary Analysis. (Available from Charles Sherman, Office of Science Policy and Legislation, NIH, 9000 Rockville Pike, Bethesda, MD 20892.)
Sherman, C. R., H. P. Jolly, T. E. Morgan, E. J. Higgins, D. Hollander, T. Bryll, and E. R. Sevilla III. 1981. On the Status of Medical School Faculty and Clinical Research Manpower 1968-1990. Washington, D.C.: National Institutes of Health.
Snyder, J. 1988. Early Career Achievements of National Science Foundation Graduate Fellows, 1967-1976. Washington, D.C.: National Research Council.
Stephan, P. 1986. Age publishing patterns in science: Estimation and measurement issues. In Anthony F. J. van Raan (ed.), A Handbook of Quantitative Studies of Science and Technology. Amsterdam, Holland: Elsevier Science Publishing Division of North Holland.
Task Force on Minority Research Training. 1986. Minority Research Training Programs at the National Institute of Mental Health: Descriptions and Recommendations. Rockville, MD: National Institute of Mental Health.
Tjioe, P. 1989. Researchers' Tracking System (RTS) User's Guide. Laurel, MD: General Sciences Corporation.
Velletri, P. A., C. R. Sherman, and G. Bowden. 1985. A Comparison of the Career Achievements of NIH Predoctoral Trainees and Fellows. Washington, D.C.: National Institutes of Health.
Vetter, B. M. 1989. Recruiting Doctoral Scientists and Engineers Today and Tomorrow (Occasional Paper 89-2). Washington, D.C.: Commission on Professionals in Science and Technology.
Williams, P. C. 1979. Evaluation of the Annual Manpower Survey of NIH Research Grants. Sumner, MD: Cooper-Williams Associates.
Yasumura, S. 1984. The Research Career Award (R06): A 20-Year Perspective and an Analysis of Research Productivity. Washington, D.C.: National Institutes of Health.

Zumeta, W. 1985. Extending the Educational Ladder: The Changing Quality and Value of Postdoctoral Study. Lexington, Mass.: D.C. Heath.
