CHAPTER 4
TOWARD MEASURING THE EFFECTIVENESS
OF NRSA TRAINING PROGRAMS
OVERVIEW
NRSA personnel programs are designed to ensure the adequate supply and quality
of biomedical and behavioral researchers. The principal mechanisms are fellowships,
which influence individual career choices, and training grants, which also strengthen
institutional training capabilities. However, the complexity of these NRSA programs, as
well as methodological and data problems, makes it difficult to measure their effectiveness.
Recent studies of NRSA programs at NIH do suggest that participants outperform
non-participants in terms of subsequent involvement in research during their careers.
These differences held for a wide variety of performance measures, including grant
applications, grants received, publication counts, and citation counts. It cannot be
concluded from the studies, however, that the training programs were responsible for these
differences. There have been no such evaluations of NRSA programs at ADAMHA or the
HRSA.
Much of the information needed for more rigorous evaluations is available in
existing data sets. Other information could be gathered through surveys of former
participants. More information is needed on the determinants of a research career and on
the process used to select trainees. The biggest methodological problem is the lack of
adequate control groups.
In the case of physician/scientists, program evaluations are more complex. Few
M.D. trainees go on to careers of bench-level research, yet their clinical research is vital in
applying new knowledge of molecular biology to patient care. Several reports have
recommended changes in the program of study in training grant programs for
physician/scientists.
Background
This chapter considers the two most advanced pools in the education pipeline:
predoctoral and postdoctoral educational programs in biomedical and behavioral science.1
The intent here is to examine how effective the National Research Service Award (NRSA)
programs are in training individuals who move into successful research careers that meet
national needs. A related concern is the question of how variations in effectiveness are
related to program substance. With credible information about program effectiveness, plus
more refined and thorough versions of the cost data presented in the Executive Summary,
it will become possible to determine whether these programs demonstrate acceptable
cost/effectiveness.
The committee was aware from the outset that it would be impossible to provide
definitive information about the effectiveness of these training programs because of
insufficient time and inadequate prior research and data bases. Our more realistic
ambition was to derive a program of research and data improvement that, when
implemented, could provide important steps toward definitive evaluations.
1We consider the words "training" and "education" to be synonymous, but the term
"training program" in the context of this report often refers to specific educational efforts
and participants supported in part by the National Institutes of Health (NIH) or Alcohol,
Drug Abuse, and Mental Health Administration (ADAMHA) funds. In this context, training
programs may be considered distinct from fellowship programs.
The Education Pipeline
Vital national interests require adequate supplies of highly qualified health-related
scientists. As was stressed in the Executive Summary, our education pipeline--from
elementary schools to universities and professional schools--is the key mechanism for
assuring both the adequacy of the supply and its quality. Public policy must focus on
effective and efficient ways to ensure that the highest quality and appropriate numbers of
scientists are produced by the pipeline while at the same time containing costs.
However, the complexity of producing new entrants into science makes it difficult
to affect the course of the process. In addition to being leaky, the U.S. educational system
is decentralized, ill-coordinated, and only loosely coupled. In an ideal world the lower
levels of an educational system would guide properly prepared young people toward the
universities and colleges. These institutions would in turn provide basic scientific training
to especially talented young people and encourage them to pursue scientific and
professional training and apprenticeships in graduate and professional schools. In that
perfect educational system, every young person who showed promise would advance along
the pipeline promptly and in proportion to their promise.
In reality, however, local school boards run primary and secondary schools under
loose state coordination, universities operate under a variety of jurisdictions, and science
departments enjoy a remarkable degree of autonomy within universities. In this
less-than-ideal world, therefore, the choice points that move one toward scientific and professional
occupations are unclear, both to students and to educational institutions. This lack of
clarity, together with the cumulative nature of the pipeline, makes it easier to get out than
to stay in. A second result of this uncoordinated educational system is that policies
directed at just one of the critical points cannot produce maximum effects, because the
processes taking place at any one critical juncture only partially control the total
production scheme. The only exception may consist of policies directed at the last stage--
graduate and professional schools. Even though the flow is smallest at this point, the
degree of leakage appears to be among the largest. Policies that stem this leakage could
simultaneously effect improvements in the quality of training.
This discussion requires two caveats. First, even if new pipeline policies are
implemented, their effects may take years to become discernible because of the length of
the pipeline. Second, as in other areas of human behavior, public policy may be only a
minor factor in shaping the flow of personnel into the work force of science; endogenous
processes dominate the shaping of that flow. As a result, the effects of policy innovations
may be slight and will be manifested only over long time periods. Their detection requires
very sensitive measurement, and their analysis needs sophisticated research.
THE AIMS AND EFFECTS OF TRAINING AND FELLOWSHIP PROGRAMS
The overall goals of NRSA personnel programs are easier to state than to achieve.
As a matter of policy, they are intended to ensure that the supply of biomedical and
behavioral research personnel is sufficient to meet the demand, that their quality is high
enough to meet the needs of a constantly improving level of biomedical research, and that
the pool of skills is responsive to shifts in the demand for various kinds of specialized
personnel.
To reach these goals, the policy uses a number of devices whose adoption is based on
assumptions, explicit or implicit, concerning how occupational choices are made, how
biomedical research skills are acquired, and what the market for biomedical research
personnel will be. It is useful to examine the first two sets of assumptions to see how
closely they match the actual programs pursued and to consider the alternatives to those
assumptions that were not adopted. (Labor market issues were considered earlier in this
report.)
Occupational Choice
The most direct goal of NRSA programs is to influence the occupational choices of
potential research personnel. The implicit model is that critical choices at each point in the
career path are influenced by the balance between anticipated benefits and costs of
alternative paths. NRSA programs are designed to influence the balances by lowering costs
of particular choices through stipends.
The effectiveness of this strategy depends not only on when the stipend is offered,
but also on the array of alternative choices offered by the environment. In many
engineering fields, for example, graduate stipends must compete against immediate
employment as a B.S. engineer. Similar competitive circumstances face fellowship programs
designed to recruit M.D.s into research.
Training
Another purpose of NRSA programs is to provide students with training
opportunities that may not otherwise be available to them. (In this context, training is
meant to cover both formal course work and apprentice-like participation in research.)
This purpose can be achieved in two ways: either by enhancing the ability of academic
departments to provide training, or by improving the range and quality of training choices
available to students.
Training grants may be best at enhancing institutional training capacities, for
instance by expanding the number of traineeships, providing a departmental focus, or
enhancing opportunities for cross-disciplinary training and research. However, it is an
open question whether trainees receive different educational experiences than other
graduate students in the same departments. Stipends may release students from the
necessity to support themselves by unrelated employment, but traineeships may also
compete with employment that is directly related to training, particularly research
assistantships in which graduate students directly participate in faculty research as
apprentices. It is unclear whether the quality of training is affected by being employed as
a trainee rather than as a research assistant.
Fellowships awarded to individuals provide fewer advantages to departments, but
they too enhance training opportunities for individuals and may also enhance the research
of the faculty sponsors.
Training Efficiency
All other things being equal, the shorter the training period, the more research
personnel can be produced. Traineeships and fellowships are believed to shorten the
training period by making it less necessary for their incumbents to engage in income-
generating activities that are not training opportunities: the more time devoted to training,
the quicker it is attained.
Several caveats must be taken into account in assessing this argument. First, the
greater efficiency of the entire biomedical and behavioral research personnel "industry"
can only be attained if there are qualified and promising candidates for training who
cannot be accommodated by the industry's capacity. Second, stipends are fungible--they
can substitute for job earnings unrelated to training, but they can also increase the
consumption of goods and services or even prolong one's stay in a training position. The
fungibility of stipends also allows departments to use stipends to substitute for other funds,
thereby increasing the resources available for other purposes or, more likely, increasing the
number of graduate students. The connection between traineeship or fellowship strategies
and increased efficiency is not necessarily causal except where expansion is required as a
condition for awarding a grant.
Honor
Because fellowships and traineeships are awarded mainly in competition, they honor
those who win the awards and hold the resulting positions. Honor may affect subsequent
performance by increasing self-esteem, self-confidence, and the expectations of others.
This effect may be reduced if traineeships are doled out in the same manner as other
support, while fellowships awarded by national competitions may carry additional honor.
Merit-Tested Selection
Traineeships and fellowships presumably go to the most promising among the pool
of eligible candidates. Ironically, those most likely to be selected are also those most likely
to become biomedical researchers without the traineeship or fellowship in question. As a
result it is difficult to estimate the net effects of biomedical and behavioral research
personnel programs. If every promising candidate receives some support, there will be no
available controls--no persons of equal merit who were not chosen. (Statutorily ineligible
persons, such as foreign nationals, differ from those chosen in other important respects.)
Programs that use merit-tested selection can only be evaluated for their net effects by
drastically altering the selection process in what may be regarded as undesirable ways. It
might be tempting, for example, to randomly deny fellowships and traineeships to selected
persons in order to form controls, but such a strategy would surely produce both
unsatisfactory controls and strong opposition.
Marginal Effects
A final policy concern is the marginal effects of the programs: how much would be
gained from expanding the program? Marginal effects are especially of interest for
programs that are not likely to be terminated, have not reached saturation coverage, and in
which policy concerns center around whether the program should be expanded (or
contracted). Biomedical and behavioral research training programs are unlikely candidates
for termination, but their level of support does sometimes come under scrutiny. The issue
of coverage saturation is not settled: there may or may not be additional traineeship or
fellowship candidates who are qualified for support. The import of this discussion is that,
whatever estimates are made of the effects of the biomedical research personnel programs,
attention should be given in the first place to their marginal effects in preference to
estimates of main effects.
THE EVALUATION OF TRAINING AND FELLOWSHIP PROGRAMS
Recent evaluation activities related to the NRSA programs have addressed the above
questions and serve to identify areas of highest priority for future research. (A detailed
discussion of these activities appears in the commissioned paper by Georgine Pion in
Volume III of this report.) Such evaluations may serve two different purposes:
definitive or descriptive. That is, an evaluation may aim at a definitive statement
describing the effects of a program (i.e., a statement of how the world would be different
if the program did not exist and evidence of the truth of that statement that is sufficiently
rigorous to be acceptable to the scientific community). This requires well-designed
research: the programs may have small effects, delayed effects, effects that differ for
different subpopulations and under different conditions, and control groups are difficult
to find. There has been no evaluation of an NRSA program to date that would meet
reasonable scientific standards for a demonstration of causality.
Recent evaluation activities have instead sought the much more modest goal of
providing some facts about certain aspects of the program. Most were outcome studies that
examined selected aspects of the subsequent careers of recipients and compared them to
persons who did not receive NRSA support. These outcome studies are reviewed in the
next section, as are the gaps in knowledge about NRSA that appear important to fill. These
evaluations do not include any causal inference, but they can still support judgments about
the attributes of appropriate policy. For example, if practically all graduates of a
particular training program have outstanding research careers, it may be judged good
policy to continue the program, even if the program had no causal effect on its recipients'
careers. If on the other hand very few graduates of a program ever enter research, it may
be judged that the program is not worthwhile, even if the program does have a causal
effect on those who do succeed. Most evaluations fall between these two extremes.
Outcomes
What happens to persons who receive research training support from the NRSA
program? Recent attempts to answer this question focused on indicators of whether the
graduates are engaged in health-related research and measures of scientific productivity
(e.g., grant application data, publications, citations).2 These studies typically construct a
"comparison group" of persons who did not undergo NRSA training, with which to compare
the performance of those who were in the NRSA program. In practice, however,
comparison groups have been poorly matched; perfect matching is likely to be impossible.
This methodological problem weakens the conclusions that might be drawn about the effect
of NRSA training programs on performance differences.
The consistent finding of almost all of the evaluation studies is that NRSA
awardees outperform comparison group members in terms of research involvement during
their careers. The magnitude of the difference between participants and the comparison
group depends in part on the composition of the comparison group. For example,
Coggeshall and Brown used two groups for comparison with those who received pre-Ph.D.
support under NRSA.3 The first group consisted of age-matched Ph.D.s who received their
degrees from departments that had received an NIH training grant but who had not
received an NIH stipend themselves. The second comparison group consisted of Ph.D.s who
had received their degrees from other departments and who had not received NIH support
themselves. The study found that the performance of participants in NIH-sponsored
predoctoral training modestly exceeds the performance of nonparticipants from the same
departments and greatly exceeds the performance of the second comparison group. This
was true for a wide variety of performance measures, including postdoctoral research
support, subsequent involvement in NIH research, publication counts, and citation counts.
Given that the first comparison group should have received exactly the same graduate
education as the NIH trainees, the differences suggest that at least some training grant
directors are effectively selecting their better Ph.D. students for the NIH award.
A second study found that NIH post-Ph.D. awardees also go on to have more
research-intensive careers than do members of two comparison groups: (1) biomedical
science Ph.D.s who indicated on a survey that they planned to take a postdoctoral
appointment but did not receive NIH support and (2) biomedical science Ph.D.s without
postdoctoral plans.4 Again, the differences appeared on many performance measures (grant
applications, publication counts, citation rates) and were much greater for the second
comparison group than for the first.
2For an extensive discussion of the concept of productivity, see the paper by Helen H.
Gee in Volume III of this report.
3Porter Coggeshall and Prudence Brown, The Career Achievements of NIH Predoctoral
Trainees and Fellows, Washington, D.C.: National Academy Press, 1984.
Garrison and Brown also studied the post-training performance of NIH M.D.
postdoctoral awardees. Here, because research is unlikely to be the career goal of a
physician, there are substantial problems in developing an informative comparison group.
Comparison data were collected for (1) all M.D.s and (2) a subset of M.D.s who said, a few
years after their degree, that their primary activity was either research or teaching. Not
surprisingly, the proportion of NIH M.D. postdoctoral awardees who were engaged in
research or teaching exceeded that of the typical physician. The subsequent research
involvement of M.D.s who had received an NIH fellowship also greatly exceeded the
comparison group of self-identified researchers and teachers. However, M.D.s who had
received NIH postdoctoral support under a training grant were less likely to be involved in
research than the comparison group of self-identified researchers and teachers. This
unexpected result merits replication; if the finding is repeated, it merits further
investigation as part of a program of research into outcomes of NRSA training programs.
A more interesting question is how these outcomes are related to program
characteristics. The best information would let one estimate how outcomes for NRSA
trainees would change if small amounts of funds were shifted among programs (e.g., from
institutional training grants to fellowships awarded to individuals). Training grant
programs and individual fellowship applications are assigned priority scores to describe the
scientific merit of each application; and (in most NIH programs) funding decisions for
each award are made in priority score order. The "payline" is the point at which funds run
out. If the outcome for persons who receive training under an application that is close to
the payline for each type of grant were known, then one could estimate how the outcomes
for NRSA trainees would change if small amounts of funds were shifted among programs.
However, none of the studies addressed the relationship between outcome and the priority
score given to the fellowship or training grant application.
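The payline comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method used in any of the studies reviewed. The record fields (`mechanism`, `priority_score`, `outcome`), the function name, and the width of the window around the payline are all hypothetical, and the sketch assumes that lower priority scores are better, so that an application is funded when its score is at or below the payline.

```python
from statistics import mean

def near_payline_outcomes(awards, paylines, window=10):
    """Average outcome of funded awards whose score is close to the payline.

    awards   -- list of dicts with hypothetical keys 'mechanism',
                'priority_score', and 'outcome' (e.g., 1 if the awardee
                later applied for a research grant, else 0)
    paylines -- dict mapping each mechanism to its payline score
    window   -- how far inside the payline a score may fall and still
                count as "marginal" (an arbitrary illustrative choice)

    Assumption: lower scores are better, so funded means score <= payline.
    """
    result = {}
    for mechanism, payline in paylines.items():
        marginal = [
            a["outcome"]
            for a in awards
            if a["mechanism"] == mechanism
            and payline - window <= a["priority_score"] <= payline
        ]
        # None signals that no funded award fell inside the window.
        result[mechanism] = mean(marginal) if marginal else None
    return result
```

If the near-payline average were higher for fellowships than for training grants, that would suggest, without proving, that shifting a small amount of funding toward fellowships would raise average outcomes. No such data are reported in the studies discussed here.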
There is some information about the average outcome for recipients of various
components of the NRSA awards. It is important to compare only programs with a
reasonable chance of having comparable results. For example, one must expect that post-
Ph.D. programs will produce a higher return in researchers per trainee than predoctoral
programs because of the greater commitment to research demonstrated by those persons
who have successfully completed the Ph.D. and applied for a postdoctoral appointment.
Also, because the current M.D. curriculum provides little research training, one must expect
a greater return in researchers per trainee from post-Ph.D. programs than from post-M.D.
programs. To do better it would be necessary for selected M.D.s to have had enough
research experience to be similarly committed to a research career. (See "A Special
Note on the Training of Physician/Scientists," below.)
The most comparable programs are training grant and fellowship programs aimed at
persons with the same previous research training. The few evaluation studies that
addressed this issue found that fellows outperformed trainees on most measures of
subsequent research involvement. The differences were less pronounced for Ph.D.s than
for M.D.s, however: 62 percent of post-Ph.D. fellows applied for an NIH or ADAMHA
research grant, compared to 52 percent of post-Ph.D. trainees; for M.D.s, the corresponding
figures are 43 percent for fellows and 17 percent for trainees.5
4Howard Garrison and Prudence Brown, The Career Achievements of NIH Postdoctoral
Trainees and Fellows, Washington, D.C.: National Academy Press, 1986.
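The size of the fellow/trainee difference for each degree group can be worked out directly from the four percentages quoted above. The short Python sketch below is only an arithmetic aid; the function and variable names are illustrative, and no figures beyond those in the text are used.

```python
# Percent of former awardees who applied for an NIH or ADAMHA research
# grant, as quoted in the text above.
APPLIED_PCT = {
    "Ph.D.": {"fellows": 62, "trainees": 52},
    "M.D.": {"fellows": 43, "trainees": 17},
}

def gap_and_ratio(fellows_pct, trainees_pct):
    """Return (percentage-point gap, ratio) of fellows relative to trainees."""
    return fellows_pct - trainees_pct, fellows_pct / trainees_pct

for degree, pct in APPLIED_PCT.items():
    gap, ratio = gap_and_ratio(pct["fellows"], pct["trainees"])
    print(f"{degree}: gap = {gap} points, ratio = {ratio:.2f}")
```

The gap is 10 percentage points for Ph.D.s (a ratio of about 1.19) and 26 percentage points for M.D.s (a ratio of about 2.53), which is the sense in which the difference is less pronounced for Ph.D.s.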
It would be desirable to know how outcomes are related to other aspects of the
NRSA program. The section on the training requirements for physician/scientists (below)
notes the empirical evidence supporting the hypothesis that the length of time spent in
postdoctoral research training is a strong predictor of the subsequent research involvement
of M.D.s. What is not known, however, is the amount of time M.D. recipients of NRSA
awards spend in research training supported by non-NRSA mechanisms, such as privately
supported fellowships.
There also have been no adequate studies of the outcomes of two of the most
promising (and expensive) ways of training physician/researchers: the Medical Scientist
Training Program and the Physician/Scientist Award program. Because they both provide
longer periods of NIH-supported research training, they may also yield substantially higher
returns to research than the more traditional fellowship and training programs, but the
facts currently are unknown.
Most evaluation activities have focused on NRSA programs administered by NIH.
The few studies that included ADAMHA awardees tended to use fewer outcome measures.
There have been no evaluations of the NRSA programs sponsored by the Health Resources
and Services Administration. Similarly, there have been no evaluations of the effect of training
grants on the training capacity or training efficiency of recipient institutions.
Data Needs for Program Evaluation
Program statistics on the number and characteristics of persons receiving each type
of award are the most basic information about the training received by NRSA recipients.6
To provide this information, NIH sponsored the creation of the Trainee Fellow File, which
provides information on all NRSA students, and the Consolidated Grant Application File,
which contains information on programs for advanced research training and on
institutional awards. One deficiency in these data bases is the difficulty involved in
constructing definitions of attributes, such as field of study, that will provide consistent
time series. A second problem is the lack of information about program outcomes. A third
deficiency is the lack of a set of adequate measures for career outcomes, including
scientific productivity. The proposals for a framework for evaluating program
effectiveness and for an evaluation data matrix (discussed below) would remove many of
the difficulties involved in the use of these data.
The evaluation matrix proposed in the appendix would deal with the ease of use of
currently available statistics. However, there are three areas where all currently available
statistics are inadequate: research participation by physicians,7 non-NRSA sources of
support for research training, and program evaluations by former trainees. In many
medical schools the faculty roster conducted by the Association of American Medical
Colleges (AAMC) is not answered by the individual faculty member; as a result, the
information in the survey is frequently out of date or otherwise inaccurate. The only firm
information available on the amount of time that physicians spend on research comes from
a one-time survey of the faculty of departments of internal medicine.8 Information is
lacking on other specialties.
5Garrison and Brown, op. cit., Tables 4.2 and S.2A.
6See the appendix for a further description of existing and proposed data sets discussed
in this chapter.
7Research participation by Ph.D.s is covered in the SDR.
Information about sources of support would be obtained most accurately from a
survey of the training sponsors, although it could also be collected on an individual basis.
This information would give a more accurate picture of the total training received by
NRSA recipients and would greatly facilitate the design of more effective evaluation
studies.
Some outcome measures are available from data sets such as the SDR and the
Institute for Scientific Information's Science Citation Index. Other basic measures are
available only from former trainees themselves (and, where appropriate, from credible
comparison groups). Former trainees' assessments of the impact of NRSA training on their
subsequent careers are just one basic set of information that could be of substantial value in
future versions of this report. Other valuable items would include sense of satisfaction
with one's career and sense of contribution to the field. Because the SDR is based on a
small sample, it is usually inappropriate as the source of inferences about small populations
such as NRSA trainees in a given field of science. In this case, occasional surveys of
former trainees and appropriate control groups are altogether warranted.
Very little is known about the process used to select trainees for institutional grants.
No records are kept of unfunded applicants or of persons who are offered a traineeship but
turn it down. The lack of such basic information about the demand for training makes it
difficult to assess important parameters of the program, such as the level of stipends and
the effects of the payback provision.
Finally, and perhaps most importantly, there is a great need for basic research on
the determinants of a research career. NRSA programs attempt to intervene in a complex
decision process that is poorly understood. Little is known about how the characteristics of
a training program affect the research abilities of persons who participate in that
training. Although the recent evaluation studies suggest that NRSA training is correlated
with success as a researcher, the correlations are very small: the total effect of NRSA
training and other indicators of preexisting quality explained only 6-14 percent of the
variance in outcome measures. Better understanding of the factors that influence career
decisions and research ability is the key to designing more effective and efficient training
programs.
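To put the 6-14 percent figure on the more familiar correlation scale, one can take the square root of the proportion of variance explained, which gives the multiple correlation between the outcome and the combined predictors. The Python sketch below does only this conversion; the function name is illustrative, and the sketch assumes the reported figures are ordinary R-squared values from a linear model.

```python
import math

def multiple_correlation(variance_explained):
    """Convert a proportion of variance explained (R squared) to R."""
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return math.sqrt(variance_explained)

low = multiple_correlation(0.06)
high = multiple_correlation(0.14)
print(f"multiple correlation: {low:.2f} to {high:.2f}")
print(f"variance left unexplained: {1 - 0.14:.0%} to {1 - 0.06:.0%}")
```

Under that assumption the multiple correlation is roughly 0.24 to 0.37, and 86 to 94 percent of the variance in outcomes is left unexplained by NRSA training and the indicators of preexisting quality taken together.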
A FRAMEWORK FOR PROGRAM EVALUATION
There are two major limitations in conducting an adequate evaluation of NRSA
programs: (1) inadequate control groups with which to compare the awardees and (2)
inadequate measures of many of the outcomes that need to be assessed. As discussed above,
the problem of control groups is related to the process by which trainees are selected: those
selected might have more successful careers (by whatever measure) than those not selected,
independent of the advantages provided by the training program. An ideal experimental
design would consist of choosing trainees randomly, independent of their characteristics, so
that differences in career outcomes could be attributed to the effect of the training
program. This ideal methodological approach is unreasonable in practice. A reasonable
and practical alternative to random selection would be careful study of the process that
determines selection as an NRSA trainee. This approach would also provide insights that
can be used to improve the selection process and, to the extent that the process is modeled
adequately, it would be possible to introduce statistical controls into the analysis of the
effects of being a trainee on career outcomes. Consequently, the committee's first
recommendation in designing future evaluation studies is to include detailed information
on the process by which trainees are selected from all applicants. The next step is to model
the effects of the training program on career outcomes, including productivity measures.
8G. S. Levey, et al., "Postdoctoral Research Training of Full-time Faculty in Academic
Departments of Medicine," Annals of Internal Medicine, vol. 109, no. 5 (September 1988), pp.
414-418; their findings are discussed in Chapter 5 of this report.
9Coggeshall and Brown, op. cit.; Garrison and Brown, op. cit.
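The control-group problem and the proposed remedy of statistical controls can be illustrated with a small simulation. The Python sketch below is entirely synthetic and rests on strong assumptions that are not drawn from any NRSA data: each candidate has a single "merit" score, selection depends on merit plus noise, the career outcome is a linear function of merit plus noise, and the true training effect is set by the caller (zero by default). All names and numeric settings are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Part of y that a straight-line fit on x does not explain."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def simulate(n=20000, true_effect=0.0, seed=0):
    """Return (naive trainee/non-trainee gap, merit-adjusted training effect)."""
    rng = random.Random(seed)
    # Hypothetical preexisting merit of each candidate.
    merit = [rng.gauss(0, 1) for _ in range(n)]
    # Merit-tested selection: reviewers see merit with some noise and
    # select candidates above a cutoff.
    trained = [1.0 if m + rng.gauss(0, 0.5) > 1.0 else 0.0 for m in merit]
    # Career outcome depends on merit, on training (by true_effect), and on chance.
    outcome = [m + true_effect * t + rng.gauss(0, 1)
               for m, t in zip(merit, trained)]

    treated = [y for y, t in zip(outcome, trained) if t == 1.0]
    untreated = [y for y, t in zip(outcome, trained) if t == 0.0]
    naive_gap = mean(treated) - mean(untreated)

    # Statistical control: remove the part of both training status and
    # outcome that merit explains, then relate what is left. This equals
    # the training coefficient in a regression of outcome on training and merit.
    adjusted_effect, _ = fit_line(residuals(merit, trained),
                                  residuals(merit, outcome))
    return naive_gap, adjusted_effect
```

With `true_effect=0.0` the naive gap between trainees and non-trainees is large even though training does nothing, while the merit-adjusted estimate is close to zero. The adjustment works here only because the simulated selection process is fully captured by the observed merit score; this is the sense in which statistical controls depend on the selection process being modeled adequately.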
The commissioned paper by Helen H. Gee (see Volume III of this report) establishes
guidelines that should be used in planning productivity assessments, including a number of
general points about the use of productivity measures for NRSA programs:
o Define program goals specifically enough to provide guidance in
  constructing measures of their success. For example, "contribute to the
  research enterprise" does not narrow down the many ways this can be
  accomplished--through publications, patents, administration, and teaching.
o Recognize multiple pathways (activities and career paths) that can lead to
  those goals by designing evaluation studies that assess the variety of
  potential outcomes.
o Exclude those scientists whose career paths and research productivity cannot
  be assessed adequately with available methods and data. For example, if
  methods for assessing the productivity of nonacademic scientists are not
  practical, those scientists should be excluded from comparisons of other
  groups.
o Identify the uses to which the results of the assessment are to be put and let
  them guide the design of evaluation studies. Evaluations designed to assist
  program managers, for example, will not necessarily provide the information
  required by those making policy decisions.
Recent evaluation studies have tended to focus exclusively on the single measure of
publications and the single characteristic of whether or not the trainee sought funding
from NIH. The committee recommends that future studies consider a broader spectrum of
outcome measures, including the following:
o  receipt of a Ph.D. (for predoctoral trainees);
o  time required to complete the Ph.D. (for predoctoral trainees);
o  years of postdoctoral training;
o  type of employer;
o  type of work activity;
o  pursuit and receipt of NIH and ADAMHA funding;
o  publications and citations; and
o  area of research.
For the evaluations to be most effective, these measures (many of which have been used in
other studies) should be followed over an extended period of the career rather than be
measured only at a single point in time. Longitudinal studies should be used that track
changes in employers, work activity, grant activity, publications, citations, and area of
research over at least the first decade of the career. Statistical comparisons of the career
activities of trainees and the control group will provide a much better insight into the
effectiveness of NRSA training programs.
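As a rough illustration of the kind of longitudinal comparison described above, the following minimal sketch tabulates a single outcome measure by career year for trainees and a comparison group. The record layout, group labels, function names, and sample values are all hypothetical and invented for illustration; they are not drawn from any NRSA data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical longitudinal records: one row per person per career year.
# Every identifier and value here is invented for illustration only.
RECORDS = [
    # (person_id, group, career_year, publications_that_year)
    ("a", "trainee", 1, 1), ("a", "trainee", 2, 2),
    ("b", "trainee", 1, 0), ("b", "trainee", 2, 2),
    ("c", "control", 1, 0), ("c", "control", 2, 1),
    ("d", "control", 1, 1), ("d", "control", 2, 1),
]


def mean_outcome_by_year(records):
    """Return {career_year: {group: mean outcome}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for _person, group, year, outcome in records:
        buckets[year][group].append(outcome)
    return {
        year: {group: mean(values) for group, values in groups.items()}
        for year, groups in sorted(buckets.items())
    }


def yearly_gap(records, treated="trainee", comparison="control"):
    """Mean outcome of the treated group minus the comparison group, per year."""
    table = mean_outcome_by_year(records)
    return {
        year: groups[treated] - groups[comparison]
        for year, groups in table.items()
        if treated in groups and comparison in groups
    }


if __name__ == "__main__":
    for year, gap in yearly_gap(RECORDS).items():
        print(f"career year {year}: trainee minus control = {gap:+.2f}")
```

A raw gap of this kind only describes the two groups. Because trainees are selected rather than assigned at random, it does not by itself show that the training program caused the difference.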
In summary, it is possible to design and carry out research that will produce
unbiased estimates of marginal program effects by carefully expanding the program to
include additional trainees and fellows. Persons selected under this controlled expansion
need to be followed over a period of time. Furthermore, the heterogeneity of the programs'
aims and mechanisms also creates difficulties because there are many kinds of intended
effects and additional side effects--some desirable, some simply benign, and others possibly
subversive of the main aims of the programs. Thus, the committee recommends that two
evaluations of program effects be undertaken:
1. A comprehensive assessment of the effects on institutions, departments, and
individual trainees and fellows; and
2. A less comprehensive evaluation of the effects of program participation on
individual awardees.
A SPECIAL NOTE ON THE TRAINING OF PHYSICIAN/SCIENTISTS
Program evaluation for clinical investigators is further complicated by the
complexities of training and tracking the academic physician/scientist. M.D. faculty are
supported in their research training not only through NRSA fellowships and institutional
training grants, but also by a variety of foundations and volunteer health agencies. Thus,
receipt of NIH support for post-M.D. training and receipt of post-M.D. training are not
synonymous.
Evaluation is further complicated when application for and receipt of NIH research
grants (generically called R01) by former trainees are used as program outcome variables.
The subsequent careers of M.D. awardees may involve (1) no research, (2) bench-type
research, or (3) academic "hands-on" patient research. It is predominantly those in the
second category--a comparatively low number--who are likely to apply for and obtain NIH
R01 research support. Yet there is evidence that far more M.D. faculty in the third
category--perhaps as many as 50-60 percent of the total NRSA M.D. trainees--are doing
productive clinical investigation but not at the bench level that is generally required for
NIH R01 funding.
There is a vital need for well-trained clinical investigators who can take the
enormous explosion of knowledge in molecular biology and apply it to the care of patients.
Over the last decade, however, it has been increasingly difficult for clinical investigators
doing hands-on patient research to obtain funding through NIH. These individuals may
account for a very large proportion of the "unsuccessful" trainees from the NRSA
institutional grants. If so, they must be identified and quantified for adequate program
evaluation. The same holds true for the cadre of clinical investigators who will have to be
trained in the methodologies of epidemiology, biostatistics, health services research,
economics, and outcome assessment in the near future.
James Wyngaarden, a former director of NIH, has emphasized that research training
and career development programs have a priority for NIH "virtually equal to the support
of research project grants."¹⁰ He also acknowledges, however, that NIH-sponsored training
programs have variable success rates, with the least certain being the traditional training
programs for physician/scientists. Far too few of these M.D. trainees apply for and receive
NIH research grants, according to Wyngaarden, and some training programs merely serve as
support vehicles for subspecialty clinical training. He calls for a comprehensive, critical
review of NIH research training programs, specifically examining whether current training
programs for physician/scientists should be modified.
¹⁰ J. B. Wyngaarden (memorandum to BID Directors and OD Staff), "Review of NIH's
Biomedical Research Training Program," April 19, 1989.
Wyngaarden's position is echoed by Lloyd H. Smith (see Volume III of this report).
Smith's position is that the serious physician/scientist must receive in-depth training in a
scientific discipline relevant to medicine and that rigorous scientific training can rarely be
achieved in a specialty division of a clinical department. Smith argues that the training of
physician/scientists should be comparable to Ph.D. programs in rigor and scope and that the
physician should not be burdened with clinical responsibilities during the research training
period. Smith believes that at least three years of rigorous training in modern biological
science is usually necessary for most individuals to achieve independence as an
investigator.
Smith's paper buttresses remarks made by Joseph Goldstein in his 1986 address to
the American Society for Clinical Investigation. Paraphrasing from that address,
intelligence, curiosity, and drive are necessary but not sufficient for the productive
physician/scientist; there must also be technical skill and the ability to reduce a
complicated clinical phenomenon to a manageable biochemical problem.¹¹ Given the
complexities of modern biomedical research, a clinical investigator must have a
sophisticated understanding of the fundamental sciences, a mentor in the sciences to direct
development, the opportunity to learn techniques, and uninterrupted time in the laboratory
to conduct the research.
Those committee members who have experience in the training of
physician/scientists endorse the suggestions made by Smith and Goldstein and suggest that
the following changes be made in the postdoctoral institutional training programs for
physician/scientists:
o  a true consortium between the clinical and preclinical departments of the
   institution, with shared responsibility for the design and administration of
   the program;
o  selection of trainees based on evidence of some previous experience in
   research and overall promise;
o  formal course work in the physical and biochemical sciences sufficient to
   give graduates a theoretical background comparable to those with graduate
   degrees in the biological sciences;
o  not less than three years of research training, primarily in direct research
   experience under the supervision of a mentor; and
o  modules of instruction, specifically tailored to the needs of the physician
   trainee, in such areas as basic laboratory techniques, chromatography,
   radioimmunoassay, protein purification, advanced instrumental techniques,
   fundamental principles of enzymology and molecular biology, subcellular
   fractionation techniques, computer technology, evaluation of experimental
   data, epidemiology, and statistics and data base management, as well as grant
   and manuscript writing.
¹¹ J. L. Goldstein, "On the Origin and Prevention of PAIDS (Paralyzed Academic
Investigator's Disease Syndrome)," Journal of Clinical Investigation, vol. 78, 1986, pp. 848-854.
¹² G. S. Levey, et al., op. cit.
A 1986 survey of full-time faculty in departments of medicine made similar
recommendations regarding postdoctoral research training.¹² The survey identified several
features of training experiences that were associated with the faculty member currently
being an active researcher, including the following:
1. Most postdoctoral training occurred in medical schools and the primary
source of funding was NIH.
2. For faculty members with an M.D. degree, the length of training was a
significant predictor for subsequently being an active researcher and
principal investigator for a peer-reviewed research grant.
3. The average length of time between the end of postdoctoral research training
and obtaining the first peer-reviewed research grant was 24 months,
regardless of length of training, source of training support, training site, or
type of academic degree (M.D., M.D./Ph.D., or Ph.D.).
4. Respondents advocated incorporating formal course work, particularly in the
basic sciences and statistics, within the structure of the postgraduate training
programs, with less time allocated to patient care.
5. Contributing factors to being a successful researcher in academic medicine
include the following: two or more years of postdoctoral research training,
including formal course work in the fundamental sciences pertinent to
biomedical research; two to three years of full research funding from an
academic institution until the first extramural grant is obtained; and the
investigator's commitment of at least 33 percent of time to research
activities.
The former trainees, upon reflection, favored changing the curriculum to include more
formal course work and training in fundamentals, particularly mathematics/computer
science, statistics, research techniques, grant administration, and medical writing. Of equal
interest, the vast majority (65 percent) wanted less time devoted to clinical medicine during
the training program. The committee as a whole finds merit in these suggestions and
recommends that NIH establish a committee, conference, or study to consider whether
changes should be made in the program of study in postgraduate institutional training
grants for physician/scientists.
Deficiencies in the evaluation of these training programs are described in detail in
the commissioned paper by Georgine Pion (see Volume III of this report). There are
inherent difficulties in retrospective survey designs, and evaluation must focus on the
career development of those who trained as many as 10-15 years ago to determine long-
range effects of these training programs. The evaluation must also define what constitutes
"success." For example, although this committee may conclude that institutional training
grants need to be revised, it also recognizes that even in their current form, these programs
have made a positive contribution. Graduates of the institutional training grants have
populated all the clinical departments in our medical schools and, even during their short
training periods, they have done valuable research work in the laboratories of established
investigators.