Methodology
Quality . . . you know what it is, yet you don't know
what it is. But that's self-contradictory. But some
things are better than others, that is, they have more
quality. But when you try to say what the quality is,
apart from the things that have it, it all goes poof!
There's nothing to talk about. But if you can't say
what Quality is, how do you know what it is, or how do
you know that it even exists? If no one knows what it
is, then for all practical purposes it doesn't exist
at all. But for all practical purposes it really does
exist. What else are the grades based on? Why else
would people pay fortunes for some things and throw
others in the trash pile? Obviously some things are
better than others . . . but what's the "betterness"?
. . . So round and round you go, spinning mental wheels
and nowhere finding anyplace to get traction. What the
hell is Quality? What is it?
Robert M. Pirsig
Zen and the Art of
Motorcycle Maintenance
Both the planning committee and our own study committee have given
careful consideration to the types of measures to be employed in the
assessment of research-doctorate programs. The committees recognized
that any of the measures that might be used is open to criticism and
that no single measure could be expected to provide an entirely
satisfactory index of the quality of graduate education. With respect
1A description of the measures considered may be found in the third
chapter of the planning committee's report, along with a discussion of
the relative merits of each measure.
to the use of multiple criteria in educational assessment, one critic
has commented:
At best each is a partial measure encompassing a frac-
tion of the large concept. On occasion its link to the
real [world] is problematic and tenuous. Moreover,
each measure [may contain] a load of irrelevant super-
fluities, "extra baggage" unrelated to the outcomes
under study. By the use of a number of such measures,
each contributing a different facet of information, we
can limit the effect of irrelevancies and develop a
more rounded and truer picture of program outcomes.2
Although the use of multiple measures alleviates the criticisms
directed at a single dimension or measure, it certainly will not sat-
isfy those who believe that the quality of graduate programs cannot be
represented by quantitative estimates no matter how many dimensions
they may be intended to represent. Furthermore, the usefulness of the
assessment is dependent on the validity and reliability of the criteria
on which programs are evaluated. The decision concerning which mea-
sures to adopt in the study was made primarily on the basis of two fac-
tors:
(1) the extent to which a measure was judged to be
related to the quality of research-doctorate pro-
grams and
(2) the feasibility of compiling reliable data for
making national comparisons of programs in par-
ticular disciplines.
Only measures that were applicable to a majority of the disciplines to
be covered were considered. In reaching a final decision the study
committee found the ETS study,3 in which 27 separate variables were
examined, especially helpful, even though it was recognized that many
of the measures feasible in institutional self-studies would not be
available in a national study. The committee was aided by the many
suggestions received from university administrators and others within
the academic community.
Although the initial design called for an assessment based on ap-
proximately six measures, the committee concluded that it would be
highly desirable to expand this effort. A total of 16 measures (listed
in Table 2.1) have been utilized in the assessment of research-doctor-
ate programs in economics, political science, psychology, and sociol-
2C. H. Weiss, Evaluation Research: Methods of Assessing Program Ef-
fectiveness, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1972,
p. 56.
3See M. J. Clark et al. (1976) for a description of these variables.
TABLE 2.1 Measures Compiled on Individual Research-Doctorate Programs
in the Social and Behavioral Sciences
Program Size1
01 Reported number of faculty members in the program, December 1980.
02 Reported number of program graduates in last five years (July 1975
through June 1980).
03 Reported total number of full-time and part-time graduate students
enrolled in the program who intend to earn doctorates, December 1980.
Characteristics of Graduates2
04 Fraction of FY1975-79 program graduates who had received some national
fellowship or training grant support during their graduate education.
05 Median number of years from first enrollment in graduate school to
receipt of the doctorate--FY1975-79 program graduates.3
06 Fraction of FY1975-79 program graduates who at the time they completed
requirements for the doctorate reported that they had made definite
commitments for postgraduation employment.
07 Fraction of FY1975-79 program graduates who at the time they completed
requirements for the doctorate reported that they had made definite com-
mitments for postgraduation employment in Ph.D.-granting universities.
Reputational Survey Results4
08 Mean rating of the scholarly quality of program faculty.
09 Mean rating of the effectiveness of the program in educating research
scholars/scientists.
10 Mean rating of the improvement in program quality in the last five years.
11 Mean rating of the evaluators' familiarity with the work of the program's
faculty.
University Library Size5
12 Composite index describing the library size in the university in
which the program is located, 1979-80.
Research Support
13 Fraction of program faculty members holding research grants from the
Alcohol, Drug Abuse, and Mental Health Administration, the National
Institutes of Health, or the National Science Foundation at any time
during the FY1978-80 period.6
14 Total expenditures (in thousands of dollars) reported by the university
for research and development activities in a specified field, FY1979.7
Publication Records8
17 Number of published articles attributed to the program faculty members,
1978-80.
18 Fraction of program faculty members with one or more published articles,
1978-80.
1Based on information provided to the committee by the participating
universities.
2Based on data compiled in the NRC's Survey of Earned Doctorates.
3In reporting standardized scores and correlations with other variables, a
shorter time-to-Ph.D. is assigned a higher score.
4Based on responses to the committee's survey conducted in April 1981.
5Based on data compiled by the Association of Research Libraries.
6Based on matching faculty names provided by institutional coordinators with
the names of research grant awardees from the three federal agencies.
7Based on data provided to the National Science Foundation by universities.
8Based on data compiled by the Institute for Scientific Information.
ogy. Fifteen of these were used in evaluating programs in anthropology
and geography, and 14 were used in history. (Data on research expendi-
tures are unavailable in each of the latter three disciplines, and data
on ADAMHA/NIH/NSF research support of faculty investigators are not ap-
plicable to most programs in history.) For nine of the measures data
are available describing most, if not all, of the social and behavioral
science programs included in the assessment. For five measures the
coverage is less complete but encompasses at least a majority of the
programs in every discipline. The actual number of programs evaluated
on every measure is reported in the second table in each of the next
seven chapters.
The 16 measures describe a variety of aspects important to the
operation and function of research-doctorate programs--and thus are
relevant to the quality and effectiveness of programs in educating
scientists for careers in research. However, not all of the measures
may be viewed as "global indices of quality." Some, such as those
relating to program size, are best characterized as "program
descriptors" that, although not dimensions of quality per se, are
thought to have a significant influence on the effectiveness of
programs. Other measures, such as those relating to university library
size and support for research and training, describe some of the
resources generally recognized as being important in maintaining a
vibrant program in graduate education. Measures derived from surveys
of faculty peers or from the publication records of faculty members,
on the other hand, have traditionally been regarded as indices of the
overall quality of graduate programs. Yet these too are not true
measures of quality.
We often settle for an easy-to-gather statistic, per-
fectly legitimate for its own limited purposes, and
then forget that we haven't measured what we want to
talk about. Consider, for instance, the reputation approach
of ranking graduate departments: We ask a sample of
physics professors (say) which the best physics
departments are and then tabulate and report the re-
sults. The "best" departments are those that our re-
spondents say are the best. Clearly it is useful to
know which are the highly regarded departments in a
given field, but prestige (which is what we are measur-
ing here) isn't exactly the same as quality.4
To be sure, each of the 16 measures reported in this assessment has its
own set of limitations. In the sections that follow an explanation is
provided of how each measure has been derived and its particular limi-
tations as a descriptor of research-doctorate programs.
4 John Shelton Reed, "How Not to Measure What a University Does," The
Chronicle of Higher Education, Vol. 22, No. 12, May 11, 1981, p. 56.
PROGRAM SIZE
Information was collected from the study coordinators at each uni-
versity on the names and ranks of program faculty, doctoral student
enrollment, and number of Ph.D. graduates in each of the past five
years (FY1976-80). Each coordinator was instructed to include on the
faculty list those individuals who, as of December 1, 1980, held aca-
demic appointments (typically at the rank of assistant, associate, and
full professor) and who participated significantly in doctoral educa-
tion. Emeritus and adjunct members generally were not to be included.
Measure 01 represents the number of faculty identified in a program.
Measure 02 is the reported number of graduates who earned Ph.D. or
equivalent research doctorates in a program during the period from
July 1, 1975, through June 30, 1980. Measure 03 represents the total
number of full-time and part-time students reported to be enrolled in
a program in the fall of 1980, who intended to earn research doctor-
ates. All three of these measures describe different aspects of pro-
gram size. In previous studies program size has been shown to be
highly correlated with the reputational ratings of a program, and this
relationship is examined in detail in this report. It should be noted
that since the information was provided by the institutions partici-
pating in the study, the data may be influenced by the subjective deci-
sions made by the individuals completing the forms. For example, some
institutional coordinators may be far less restrictive than others in
deciding who should be included on the list of program faculty. To
minimize variation in interpretation, detailed instructions were pro-
vided to those filling out the forms.5 Measure 03 is of particular
concern in this regard since the coordinators at some institutions may
not have known how many of the students currently enrolled in graduate
study intended to earn doctoral degrees.
CHARACTERISTICS OF GRADUATES
One of the most meaningful measures of the success of a research-
doctorate program is the performance of its graduates. How many go on
to lead productive careers in research and/or teaching? Unfortunately,
reliable information on the subsequent employment and career achieve-
ments of the graduates of individual programs is not available. In the
absence of this directly relevant information, the committee has relied
on four indirect measures derived from data compiled in the NRC's
Survey of Earned Doctorates.6 Although each measure has serious
limitations (described below), the committee believes it more desirable to
5A copy of the survey form with the instructions sent to study
coordinators is included in Appendix A.
6A copy of the questionnaire used in this survey is found in Appen-
dix B.
include this information than not to include data about program gradu-
ates.
In identifying program graduates who had received their doctorates
in the previous five years (FY1975-79),7 the faculty lists furnished
by the study coordinators at universities were compared with the names
of dissertation advisers (available from the NRC survey). The latter
source contains records for virtually all individuals who have earned
research doctorates from U.S. universities since 1920. The institu-
tion, year, and specialty field of Ph.D. recipients were also used in
determining the identity of program graduates. It is estimated that
this matching process provided information on the graduate training and
employment plans of more than 90 percent of the FY1975-79 graduates
from the social and behavioral science programs. In the calculation
of each of the four measures derived from the NRC survey, program data
are reported only if the survey information is available on at least
10 graduates. Consequently, in a discipline with smaller programs--
e.g., geography--slightly less than three-fourths of the programs are
included in these measures, whereas nearly all of the economics,
history, and psychology programs are included.
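
The reporting rule described above can be sketched in a few lines of Python. This is an illustration only, not the committee's actual procedure: the function name fraction_measure and the example counts are hypothetical, and only the threshold of 10 graduates comes from the text.

```python
from typing import Optional

MIN_GRADUATES = 10  # reporting threshold stated in the text


def fraction_measure(n_with_trait: int, n_graduates: int) -> Optional[float]:
    """Fraction of a program's graduates with some trait (for example,
    national fellowship support), or None when survey information is
    available on fewer than MIN_GRADUATES graduates."""
    if n_graduates < MIN_GRADUATES:
        return None  # the measure is not reported for this program
    return n_with_trait / n_graduates


# Hypothetical programs: (graduates with the trait, graduates with survey data)
for with_trait, graduates in [(12, 40), (3, 8)]:
    print(graduates, fraction_measure(with_trait, graduates))
```

Returning None rather than a fraction keeps small programs visible in a listing while making it explicit that no value is reported for them.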
Measure 04 constitutes the fraction of FY1975-79 graduates of a
program who had received at least some national fellowship support,
including National Institutes of Health fellowships or traineeships,
National Science Foundation fellowships, other federal fellowships,
Woodrow Wilson fellowships, or fellowships/traineeships from other U.S.
national organizations. One might expect the more selective programs
to have a greater proportion of students with national fellowship sup-
port--especially "portable fellowships." Although the committee con-
sidered alternative measures of student ability (e.g., Graduate Record
Examination scores, undergraduate grade point averages), reliable in-
formation of this sort was unavailable for a national assessment. It
should be noted that the relevance of the fellowship measure varies
considerably among disciplines. In the biomedical sciences a substan-
tial fraction of the graduate students are supported by training grants
and fellowships; in the social and behavioral sciences the majority are
supported by teaching assistantships and research assistantships.
Measure 05 is the median number of years elapsed from the time pro-
gram graduates first enrolled in graduate school to the time they
received their doctoral degrees. For purposes of analysis the commit-
tee has adopted the conventional wisdom that the most talented students
are likely to earn their doctoral degrees in the shortest periods of
time--hence, the shorter the median time-to-Ph.D., the higher the
standardized score that is assigned. Although this measure has
frequently been employed in social science research as a proxy for
student ability, one must regard its use here with some skepticism.
It is quite possible that the length of time it takes a student to
complete requirements for a doctorate may be significantly affected by
the explicit or implicit policies of a university or department. For
example, in certain cases a short time-to-Ph.D. may be indicative of
less stringent requirements for the degree. Furthermore, previous
studies have demonstrated that women and members of minority groups,
for reasons having nothing to do with their abilities, are more likely
than male Caucasians to interrupt their graduate education or to be
enrolled on a part-time basis.8 [...] be longer for programs with
larger fractions of women and minority students.
7 Survey data for the FY1980 Ph.D. recipients had not yet been compiled
at the time this assessment was undertaken.
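
The text states only that a shorter median time-to-Ph.D. is assigned a higher standardized score; it does not give the scaling at this point. The sketch below therefore rests on assumptions: a conventional standardized score with mean 50 and standard deviation 10, the population standard deviation, and hypothetical names and sample values.

```python
from statistics import mean, pstdev


def standardized_scores(values, higher_is_better=True, center=50.0, scale=10.0):
    """Convert raw program values to standardized scores.

    center and scale are assumed conventions, not taken from the text.
    When higher_is_better is False (as for time-to-Ph.D.), the sign is
    flipped so that smaller raw values receive larger scores.
    """
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [center for _ in values]  # no spread: every program is average
    sign = 1.0 if higher_is_better else -1.0
    return [center + scale * sign * (v - m) / s for v in values]


# Hypothetical median years to the doctorate for four programs
median_years = [5.5, 6.0, 7.5, 9.0]
scores = standardized_scores(median_years, higher_is_better=False)
for years, score in zip(median_years, scores):
    print(f"{years:4.1f} years -> score {score:5.1f}")
```

Flipping the sign inside the scoring function keeps the raw medians untouched, so the same data can still be reported in years.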
Measure 06 represents the fraction of FY1975-79 program graduates
who reported at the time they had completed requirements for the doc-
torate that they had signed contracts or made firm commitments for
postgraduation employment (including postdoctoral appointments as well
as other positions in the academic or nonacademic sectors) and who pro-
vided the names of their prospective employers. Although this measure
is likely to vary discipline by discipline according to the availabil-
ity of employment opportunities, a program's standing relative to other
programs in the same discipline should not be affected by this varia-
tion. In theory, the graduates with the greatest promise should have
the easiest time finding jobs. However, the measure is also influenced
by a variety of other factors, such as personal job preferences and re-
strictions in geographic mobility, that are unrelated to the ability
of the individual. It also should be noted parenthetically that unem-
ployment rates for doctoral recipients are quite low and that nearly
all of the graduates seeking jobs find positions soon after completing
their doctoral programs.9 [...]ation is by no means a measure of
career achievement, which is what one
would like to have if reliable data were available.
Measure 07, a variant of measure 06, constitutes the fraction of
FY1975-79 program graduates who indicated that they had made firm com-
mitments for employment in Ph.D.-granting universities and who provided
the names of their prospective employers. This measure may be presumed
to be an indication of the fraction of graduates likely to pursue ca-
reers in academic research, although there is no evidence concerning
how many of them remain in academic research in the long term. In many
social and behavioral science disciplines the path from Ph.D. to junior
faculty has traditionally been regarded as the road of success for the
growth and development of research talent. The committee is well
aware, of course, that other paths, such as employment in the major
laboratories of industry and government, provide equally attractive
opportunities for growth. Indeed, in recent years increasing numbers
8For a detailed analysis of this subject, see Dorothy M. Gilford and
Joan Snyder, Women and Minority Ph.D.'s in the 1970's: A Data Book,
National Academy of Sciences, Washington, D.C., 1977.
9For new Ph.D. recipients in science and engineering the unemployment
rate has been less than 2 percent (see National Research Council, Post-
doctoral Appointments and Disappointments, National Academy Press,
Washington, D.C., 1981, p. 313).
TABLE 2.2 Percentage of FY1975-79 Doctoral Recipients with Definite
Commitments for Employment Outside the Academic Sector*
Anthropology          16
Economics             32
Geography             18
History               22
Political Science     19
Psychology            47
Sociology             13
*Percentages are based on responses to the NRC's Survey of Earned Doc-
torates by those who indicated that they had made firm commitments for
postgraduation employment and who provided the names of their prospec-
tive employers. These percentages may be considered to be lower-bound
estimates of the actual percentages of doctoral recipients employed
outside the academic sector.
of graduates are entering the nonacademic sectors. Unfortunately, the
data compiled from the NRC's Survey of Earned Doctorates do not enable
one to distinguish between employment in the top-flight laboratories
of industry and government and employment in other areas of the non-
academic sectors. Accordingly, the committee has relied on a measure
that reflects only the academic side and views this measure as a useful
and interesting program characteristic rather than a dimension of qual-
ity. In the social and behavioral science disciplines, in which less
than one-third of the graduates with definite employment plans intend
to take jobs outside the academic environs (see Table 2.2), this limi-
tation is of lesser concern than it is in the engineering and physical
science disciplines.
The inclusion of measures 06 and 07 in this assessment has been an
issue much debated by members of the committee; the strenuous objec-
tions by three committee members regarding the use of these measures
are expressed in the Minority Statement, which follows Chapter X.
REPUTATIONAL SURVEY RESULTS
In April 1981 survey forms were mailed to a total of 1,770 faculty
members in anthropology, economics, history, political science, psy-
chology, and sociology. Survey forms were mailed to 150 geography fac-
ulty in September 1981. The evaluators were selected from the faculty
lists furnished by the study coordinators at the 228 universities cov-
ered in the assessment. These evaluators constituted approximately 13
percent of the total faculty population--14,898 faculty members--in the
social and behavioral science programs being evaluated (see Table 2.3).
The survey sample was chosen on the basis of the number of faculty in a
particular program and the number of doctorates awarded in the previous
five years (FY1976-80)--with the stipulation that at least one
evaluator was selected from every program covered in the assessment.
In selecting the sample each faculty rank was represented in proportion to
the total number of individuals holding that rank, and preference was
given to those faculty members whom the study coordinators had nomi-
nated to serve as evaluators. As shown in Table 2.3, 1,686 individuals
--88 percent of the survey sample in the social and behavioral sciences
--had been recommended by study coordinators.10
Each evaluator was asked to consider a stratified random sample of
no more than 50 research-doctorate programs in his or her discipline--
with programs stratified by the number of faculty members associated
with each program. Every program was included on 150 survey forms.
The set of programs to be evaluated appeared on each survey form in
random sequence, preceded by an alphabetized list of all programs in
that discipline that were being included in the study. No evaluator
was asked to consider a program at his or her own institution. Ninety
percent of the survey sample group were provided the names of faculty
members in each of the programs to be evaluated, along with data on the
total number of doctorates awarded in the last five years.11 The
inclusion of this information represents a significant departure from
the procedures used in earlier reputational assessments. For purposes
of comparison with previous studies, 10 percent (randomly selected in
each discipline) were not furnished any information other than the
names of the programs.
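
One way to picture how the programs on a single survey form might be drawn is sketched below. The text specifies only a stratified random sample of no more than 50 programs, stratification by number of faculty, exclusion of the evaluator's own institution, and a random sequence on the form. The use of three equal-sized strata, the allocation rule, and every name in the code are assumptions made for illustration.

```python
import random


def draw_programs(programs, own_institution, max_programs=50, n_strata=3, seed=None):
    """Pick the programs listed on one evaluator's form.

    programs: list of (institution, faculty_count) tuples for one discipline.
    Returns at most max_programs programs in random order, never including
    the evaluator's own institution.
    """
    rng = random.Random(seed)
    eligible = [p for p in programs if p[0] != own_institution]
    if len(eligible) <= max_programs:
        chosen = list(eligible)  # small discipline: every other program is listed
    else:
        ranked = sorted(eligible, key=lambda p: p[1])  # order by faculty size
        n = len(ranked)
        chosen = []
        for i in range(n_strata):
            # Cut the size-ranked list into n_strata nearly equal strata.
            stratum = ranked[i * n // n_strata:(i + 1) * n // n_strata]
            # Near-equal share per stratum; the total never exceeds max_programs.
            quota = (i + 1) * max_programs // n_strata - i * max_programs // n_strata
            chosen.extend(rng.sample(stratum, min(quota, len(stratum))))
    rng.shuffle(chosen)  # programs appear on the form in random sequence
    return chosen


# Hypothetical discipline with 120 programs of increasing faculty size
example = [("U%d" % i, 5 + i) for i in range(120)]
form = draw_programs(example, own_institution="U7", seed=1)
print(len(form), form[:3])
```

Drawing within size strata ensures that each form mixes small, medium, and large programs instead of leaving that balance to chance.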
The survey items were adapted from the form used in the Roose-
Andersen study. Prior to mailing, the instrument was pretested using
a small sample of faculty members in chemistry and psychology. As a
result, two significant improvements were made in the original survey
design. A question was added on the extent to which the evaluator was
familiar with the work of the faculty in each program. Responses to
this question, reported as measure 11, provide some insight into the
relationship between faculty recognition and the reputational standing
of a program.12 Also added was a question on the evaluator's field of
specialization--thereby making it possible to compare program evalu-
ations in different specialty areas within a particular discipline.
A total of 1,195 faculty members in the social and behavioral sci-
ences--62 percent of those asked to participate--completed and returned
survey forms (see Table 2.3). Two factors probably have contributed to
this response rate being approximately 17 percentage points below the
rates reported in the Cartter and Roose-Andersen studies. First, be-
cause of the considerable expense of printing individualized survey
forms (each 25-30 pages), second copies were not sent to sample members
10A detailed analysis of the survey participants in each discipline is
given in subsequent chapters.
11This information was furnished to the committee by the study
coordinators at the universities participating in the study.
12Evidence of the strength of the relationship is provided by
correlations presented in Chapters III-IX, and an analysis of the
relationship is provided in Chapter X.
TABLE 2.3 Survey Response by Discipline and Characteristics of
Evaluator
                           Total Program      Survey Sample
                                 Faculty    Total    Respondents
                                       N        N        N     %
Discipline of Evaluator
  Anthropology                     1,181      210      125    60
  Economics                        2,163      279      185    66
  Geography                          640      150      106    71
  History                          2,820      306      166    54
  Political Science                1,880      249      152    61
  Psychology                       4,299      450      280    62
  Sociology                        1,915      276      181    66
Faculty Rank
  Professor                        7,629    1,000      628    63
  Associate Professor              4,014      611      383    63
  Assistant Professor              2,984      299      179    60
  Other                              271       10        5    50
Evaluator Selection
  Nominated by Institution         4,543    1,686    1,082    64
  Other                           10,355      234      113    48
Survey Form
  With Faculty Names                N/A*    1,728    1,072    62
  Without Names                     N/A*      192      123    64
Total All Fields                  14,898    1,920    1,195    62
*Not applicable.
not responding to the first mailing13--as was done in the Cartter and
Roose-Andersen efforts. Second, it is quite apparent that within the
academic community there has been a growing dissatisfaction in recent
years with educational assessments based on reputational measures. In-
deed, this dissatisfaction was an important factor in the Conference
Board's decision to undertake a multidimensional assessment, and some
faculty members included in the sample made known to the committee
their strong objections to the reputational survey.
As can be seen in Table 2.3, there is some variation in the re-
sponse rates in the seven social and behavioral science disciplines.
Of particular interest is the relatively high rate of response from
geographers and the low rate from historians.14 The high response
rate in geography may be attributable, in part, to the fact that geog-
raphy was added later to the original list of disciplines to be in-
cluded in the assessment and consequently the timing and circumstances
relating to the survey activity in this discipline were somewhat dif-
ferent. It is not surprising to find that the evaluators nominated by
study coordinators responded more often than did those who had been
selected at random. No appreciable differences were found among the
response rates of assistant, associate, and full professors, nor be-
tween the rates of those evaluators who were furnished the abbreviated
survey form (without lists of program faculty) and those who were given
the longer version.
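
The response rates discussed above follow directly from the sample and respondent counts in Table 2.3. The short sketch below recomputes them as a check on the arithmetic; the counts are copied from the table, while the variable and function names are illustrative only.

```python
# (survey sample N, respondents N) by discipline, copied from Table 2.3
TABLE_2_3 = {
    "Anthropology": (210, 125),
    "Economics": (279, 185),
    "Geography": (150, 106),
    "History": (306, 166),
    "Political Science": (249, 152),
    "Psychology": (450, 280),
    "Sociology": (276, 181),
}


def response_rate(sample: int, respondents: int) -> float:
    """Respondents as a percentage of the survey sample."""
    return 100.0 * respondents / sample


for discipline, (sample, respondents) in TABLE_2_3.items():
    print(f"{discipline:<18}{response_rate(sample, respondents):5.0f}")

# Totals across all seven disciplines
total_sample = sum(s for s, _ in TABLE_2_3.values())
total_respondents = sum(r for _, r in TABLE_2_3.values())
print(f"{'All fields':<18}{response_rate(total_sample, total_respondents):5.0f}")
```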
Each program was considered by an average of approximately 90 sur-
vey respondents from other programs in the same discipline. The eval-
uators were asked to judge programs in terms of scholarly quality of
program faculty, effectiveness of the program in educating research
scholars/scientists, and change in program quality in the last five
years. The mean ratings of a program on these three survey items
constitute measures 08, 09, and 10. Evaluators were also asked to in-
dicate the extent to which they were familiar with the work of the pro-
gram faculty. The average of responses to this item constitutes mea-
sure 11.
In making judgments about the quality of faculty, evaluators were
instructed to consider the scholarly competence and achievements of the
individuals. The ratings were furnished on the following scale:
5 Distinguished
4 Strong
3 Good
2 Adequate
1 Marginal
0 Not sufficient for doctoral education
X Don't know well enough to evaluate
13A follow-up letter was sent to those not responding to the first
mailing, and a second copy was distributed to those few evaluators who
specifically requested another form.
14To compare the response rates obtained in the earlier surveys, see
Roose and Andersen, Table 28, p. 29.
15A copy of the survey instrument with its instructions is included
in Appendix C.
In assessing the effectiveness of a program, evaluators were asked to
consider the accessibility of faculty, the curricula, the instructional
and research facilities, the quality of the graduate students, the
performance of graduates, and other factors that contribute to a pro-
gram's effectiveness. This measure was rated accordingly:
3 Extremely effective
2 Reasonably effective
1 Minimally effective
0 Not effective
X Don't know well enough to evaluate
Evaluators were instructed to assess change in program quality on the
basis of whether there has been improvement in the last five years in
both the scholarly quality of faculty and the effectiveness in educat-
ing research scholars/scientists. The following alternatives were
provided:
2 Better than five years ago
1 Little or no change in last five years
0 Poorer than five years ago
X Don't know well enough to evaluate
Evaluators were asked to indicate their familiarity with the work of
the program faculty according to the following scale:
2 Considerable familiarity
1 Some familiarity
0 Little or no familiarity
In the computation of mean ratings on measures 08, 09, and 10, the
"don't know" responses were ignored. An average program rating based
on fewer than 15 responses (excluding the "don't know" responses) is
not reported.
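
The computation of a program's mean rating can be sketched as follows. The exclusion of "don't know" (X) responses and the minimum of 15 responses come from the text. The standard error formula (sample standard deviation divided by the square root of the number of responses) is a conventional choice assumed here, and the function name and example responses are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

MIN_RESPONSES = 15  # minimum number of usable responses stated in the text


def summarize_ratings(responses):
    """Return (mean rating, standard error) for one program on one item.

    responses: list of integer ratings, with "X" for "don't know".
    Returns None when fewer than MIN_RESPONSES usable ratings remain,
    in which case no average rating is reported.
    """
    numeric = [r for r in responses if r != "X"]  # ignore "don't know"
    if len(numeric) < MIN_RESPONSES:
        return None
    m = mean(numeric)
    se = stdev(numeric) / sqrt(len(numeric))  # assumed standard error formula
    return m, se


# Hypothetical responses on measure 08 (scale 0-5) for two programs
well_known = [5] * 8 + [4] * 8 + ["X"] * 4
little_known = [3] * 14 + ["X"] * 30
print(summarize_ratings(well_known))
print(summarize_ratings(little_known))
```

Because the standard error shrinks as the number of usable ratings grows, programs that many evaluators mark as "don't know" end up with less precise averages.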
Measures 08, 09, and 10 are subject to many of the same criticisms
that have been directed at previous reputational surveys. Although
care has been taken to improve the sampling design and to provide eval-
uators with some essential information about each program, the survey
results merely reflect a consensus of faculty opinions. As discussed
in Chapter I, these opinions may well be based on out-of-date informa-
tion or be influenced by a variety of factors unrelated to the quality
of the program. In Chapter X a number of factors that may possibly
affect the survey results are examined. In addition to these limita-
tions, it should be pointed out that evaluators, on the average, were
unfamiliar with almost one-third of the programs they were asked to
consider.16 As might be expected, the smaller and less prestigious
programs were not as well known, and for this reason one might have
less confidence in the average ratings of these programs. For all four
16See Table 10.6 in Chapter X.
survey measures, standard errors of the mean ratings are reported; they
tend to be larger for the lesser known programs. The frequency of re-
sponse to each of the survey items is discussed in Chapter X.
Two additional comments should be made regarding the survey activ-
ity. First, it should be emphasized that the ratings derived from the
survey reflect a program's standing relative to other programs in the
same discipline and provide no basis for making cross-disciplinary
comparisons. For example, the fact that a much larger number of psy-
chology programs received "distinguished" ratings on measure 08 than
did anthropology programs indicates nothing about the relative quality
of the faculty in these two disciplines. It may depend, in part, on
the total numbers of programs evaluated in these disciplines; in the
survey instructions it was suggested to evaluators that no more than
10 percent of the programs listed be designated as "distinguished."
Nor is it advisable to compare the rating of a program in one disci-
pline with that of a program in another discipline because the ratings
are based on the opinions of different groups of evaluators who were
asked to judge entirely different sets of programs. Second, early in
the committee's deliberations a decision was made to supplement the
ratings obtained from faculty members with ratings from evaluators who
hold research-oriented positions in institutions outside the academic
sector. These institutions include industrial research laboratories,
government research laboratories, and a variety of other research es-
tablishments. Over the past 10 years increasing numbers of doctorate
recipients have taken positions outside the academic setting. The ex-
tensive involvement of these graduates in nonacademic employment is re-
flected in the percentages reported in Table 2.2: An average of 31
percent of the recent graduates in the social and behavioral science
disciplines who had definite employment plans indicated that they
planned to take positions in nonacademic settings. Data from another
NRC survey suggest that the actual fraction employed outside academia
may be significantly higher. The committee recognized that the in-
clusion of nonacademic evaluators would furnish information valuable
for assessing nontraditional dimensions of doctoral education and
would provide an important new measure not assessed in earlier studies.
Results from a survey of this group would provide an interesting com-
parison with the results obtained from the survey of faculty members.
A concentrated effort was made to obtain supplemental funding for ad-
ding nonacademic evaluators in selected disciplines to the survey sam-
ple, but this effort was unsuccessful. The committee nevertheless
remains convinced of the importance of including evaluators from non-
academic research institutions. These institutions are likely to em-
ploy increasing fractions of graduates in many disciplines, and it is
urged that this group not be overlooked in future assessments of grad-
uate programs.
UNIVERSITY LIBRARY SIZE
The university library holdings are generally regarded as an important
resource for students in graduate (and undergraduate) education.
The Association of Research Libraries (ARL) has compiled data from
its academic member institutions and developed a composite measure
of a university library's size relative to those of other ARL members.
The ARL Library Index, as it is called, is based on 10 characteristics:
volumes held, volumes added (gross), microform units held, current se-
rials received, expenditures for library materials, expenditures for
binding, total salary and wage expenditures, other operating expendi-
tures, number of professional staff, and number of nonprofessional
staff.17 The 1979-80 index, which constitutes measure 12, is available
for 89 of the 228 universities included in the assessment. (These
89 tend to be among the largest institutions.) The limited coverage of
this measure is a major shortcoming. It should be noted that the ARL
index is a composite description of library size and not a qualitative
evaluation of the collections, services, or operations of the library.
Also, it is a measure of aggregate size and does not take into account
the library holdings in a particular department or discipline. Fi-
nally, although universities with more than one campus were instructed
to include figures for the main campus only, some in fact may have re-
ported library size for the entire university system. Whether this
misreporting occurred is not known.
RESEARCH SUPPORT
Using computerized data files18 provided by the National Science
Foundation (NSF) and the National Institutes of Health (NIH), it was
possible to identify which faculty members in each program had been
awarded research grants during the FY1978-80 period by either of these
agencies or by the Alcohol, Drug Abuse, and Mental Health Administra-
tion (ADAMHA). The fraction of faculty members in a program who had
received any research grants from these agencies during this three-year
period constitutes measure 13. Since these awards have been made on
the basis of peer judgment, this measure is considered to reflect the
perceived research competence of program faculty. However, it should
be noted that significant amounts of support for research in the social
and behavioral sciences come from other federal agencies and from pri-
vate foundations and other nonfederal sources as well, though it was
not feasible to compile data from these other sources. Perhaps as
many as half of the faculty investigators in the social and behavioral
sciences derive their support from nonfederal sponsors. It is estimated19
that 22 percent of the university faculty members in these
disciplines who received federal R&D funding obtained their support
from NIH, another 20 percent from NSF, and approximately 13 percent
17See Appendix D for a description of the calculation of this index.
18A description of these files is provided in Appendix E.
19Based on special tabulations of data from the NRC's Survey of
Doctorate Recipients, 1979.
from ADAMHA. The remaining 45 percent received support from a variety
of other federal agencies. It also should be pointed out that only
those faculty members who served as principal investigators or coinves-
tigators are counted in the computation of this measure. As mentioned
earlier, since very few faculty members in history programs receive re-
search support from NIH, NSF, or ADAMHA, measure 13 was not included in
the assessment of programs in this discipline.
Measure 14 describes the total FY1979 expenditures by a university
for R&D in a particular discipline. These data have been furnished to
the NSF20 by universities and include expenditures of funds from both
federal and nonfederal sources. If an institution has more than one
program being evaluated in the same discipline, the aggregate univer-
sity expenditures for research in that discipline are reported for each
of the programs. In each discipline data are recorded for the 100 uni-
versities with the largest R&D expenditures. As already mentioned,
such data are not available for programs in anthropology, geography,
and history.
This measure has several limitations related to the procedures by
which the data have been collected. The committee notes that there is
evidence within the source document21 that universities use different
practices for categorizing and reporting expenditures. Apparently, in-
stitutional support of research, industrial support of research, and
expenditure of indirect costs are reported by different institutions in
different categories (or not reported at all). Since measure 14 is
based on total expenditures from all sources, the data used here are
perturbed only when these types of expenditures are not subsumed under
any reporting category. In contrast with measure 13, measure 14 is not
reported on a scale relative to the number of faculty members and thus
reflects the overall level of research activity at an institution in a
particular discipline. Although research grants in the sciences and
engineering provide some support for graduate students as well, these
measures should not be confused with measure 04, which pertains to fel-
lowships and training grants.
PUBLICATION RECORDS
Data from the 1978, 1979, and 1980 Social Science Citation Index22
have been compiled on published articles by faculty members in anthro-
pology, economics, geography, history, political science, psychology,
20A copy of the survey instrument used to collect these data appears
in Appendix E.
21National Science Foundation, Academic Science: R and D Funds,
Fiscal Year 1979, U.S. Government Printing Office, Washington, D.C.,
NSF 81-301, 1981.
22The publication data have been compiled and provided for the
committee's use by the Institute for Scientific Information.
and sociology. Publication counts were associated with research-doc-
torate programs by matching authors' names with the names of individual
program faculty members (provided by the universities). To differenti-
ate authors who have the same last name and first initial,23 the
matching process took into account the institutional affiliation of the
author and the discipline of the journal in which the article appeared.
In the case of a coauthored article, each author is fully credited with
that article.
Two measures have been derived from publication records: measure
17--the total number of articles published in the 1978-80 period that
have been identified with individual faculty members in a research-doc-
torate program; and measure 18--the fraction of program faculty members
with one or more published articles during this three-year period.
Since both of these publication measures are based on individual name
matches with program faculty, they are quite different from measures
15 and 16, which are presented in the committee's reports on research-
doctorate programs in the mathematical and physical sciences, engineer-
ing, and biological sciences.24 The latter two measures are associ-
ated with programs on the basis of the discipline of the journal in
which an article appeared and the institution with which the author was
affiliated. Therefore, articles by program faculty members, students,
research personnel, and even members of other programs/departments in
the university who publish in those journals are included. Measures
17 and 18 reported here reflect only articles authored by program fac-
ulty members.
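The derivation of measures 17 and 18 can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: names are matched exactly, whereas the actual matching used last name, first initial, institutional affiliation, and journal discipline; the function and the sample names are hypothetical. The text does not say whether an article coauthored by two members of the same program counts once or twice toward measure 17, so the sketch assumes it counts once per matched author.

```python
def publication_measures(faculty, articles):
    """Return (measure 17, measure 18) for one program.

    faculty:  names on the program's faculty roster
    articles: one list of author names per published article
    """
    roster = set(faculty)
    counts = {name: 0 for name in roster}
    for authors in articles:
        # Each matched author is fully credited with the article.
        for name in set(authors) & roster:
            counts[name] += 1
    measure_17 = sum(counts.values())  # total articles credited
    # Fraction of faculty members with one or more articles.
    measure_18 = sum(1 for c in counts.values() if c > 0) / len(roster)
    return measure_17, measure_18


# Hypothetical roster and articles.
faculty = ["A. Smith", "B. Jones", "C. Lee", "D. Kim"]
articles = [
    ["A. Smith"],
    ["A. Smith", "B. Jones"],     # coauthored within the program
    ["B. Jones", "X. Outsider"],  # coauthor outside the roster is ignored
]
print(publication_measures(faculty, articles))  # (4, 0.5)
```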
Although physical and biological scientists publish a large share
of their research in the form of articles, this is not as often the
case for social or behavioral scientists. Thus, measure 17--confined
to articles published in the 1978-80 period in journals covered by the
Social Science Citation Index--tends to overestimate the contributions
of faculty members who publish articles and to underestimate the con-
tributions of those who publish in books. To the extent that the
former more often do experimental and quantitative research and that
the latter do qualitative, theoretical, and historical research, pro-
grams emphasizing experimental and quantitative orientations are likely
to receive higher counts on this measure. The significance of book
publication in the social and behavioral sciences should not be over-
looked. A recently published list25 of the most frequently cited
23The full names of individual authors are not available from the
Social Science Citation Index.
24Data on measures 15 and 16 are also available for research-doctorate
programs in psychology and are presented in Appendix J.
25E. Garfield, "The 100 Articles Most Cited by Social Scientists,
1969-77," Current Contents: Social and Behavioral Sciences, #32,
August 7, 1978; and E. Garfield, "The 100 Books Most Cited by Social
Scientists, 1969-77," Current Contents: Social and Behavioral
Sciences, #37, September 11, 1978.
authors in the social and behavioral sciences and their most frequently
cited works suggests that books are of major significance in these
disciplines.
Readers should also be aware that measure 17 does not take into ac-
count the different sizes of programs. Thus, programs with larger fac-
ulties may appear to be more productive than those with smaller facul-
ties. Since measure 17 reflects the total number of published articles
by individual faculty members in a program, the average number of arti-
cles per faculty member may be derived by dividing measure 17 by the
number of faculty members (measure 01). Measure 18--the fraction of
faculty members with at least one published article during this three-
year period--has been corrected for the program faculty size but does
not reflect the rate at which individual members have written articles.
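The size adjustment described above amounts to a single division. The following Python sketch uses two hypothetical programs to show how a total count and a per-member average can rank programs differently; the function name and all figures are illustrative only.

```python
def articles_per_faculty(measure_17, measure_01):
    """Average number of articles per faculty member."""
    return measure_17 / measure_01


# Hypothetical programs: X has more articles in total,
# but Y has more articles per faculty member.
print(articles_per_faculty(60, 30))  # program X: 2.0
print(articles_per_faculty(40, 10))  # program Y: 4.0
```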
Since the data are confined to the years between 1978 and 1980 and
were compiled by author's name and institutional affiliation, they do
not take into account institutional mobility of authors during this
three-year span. Thus, faculty members who have moved from one insti-
tution to another during this span are not credited with those articles
for which the author's affiliation is his or her former institution.
Procedures for allocating credit for multiauthored papers must also
be considered in assessing measures 17 and 18. Equal weight has been
assigned to the program affiliations of all authors of such papers.
Thus, these measures tend to overestimate the contributions of faculty
members given to collaborative publications. It should also be noted
that the Social Science Citation Index does not completely cover all
journals in which social and behavioral scientists publish. For example,
papers by psychologists working in the neurosciences may not be
counted since many neuroscience journals do not appear in the Social
Science Citation Index.26 The same may be true for anthropologists,
historians, and sociologists who tend to publish results of their re-
search in traditionally humanistic journals not covered by the Social
Science Citation Index. Finally, neither measure 17 nor measure 18
provides any indication of the impact or influence of the articles
written by program faculty members. Although publication productivity
and the impact of published articles tend to be correlated, previous
investigation27 indicates that they are quite different variables.
Citation counts, had it been feasible to compile them, would have
complemented measures 17 and 18 and been a highly desirable measure in
assessing the publication records of program faculty members.
26For a list of journals covered by the Social Science Citation In-
dex, see Appendix F.
27J. R. Cole and S. Cole, "Measuring the Quality of Sociological
Research: Problems in the Use of the Science Citation Index," American
Sociologist, Vol. 6, No. 1, February 1971, pp. 23-28.
ANALYSIS AND PRESENTATION OF THE DATA
The next seven chapters present all of the information that has
been compiled on individual research-doctorate programs in anthropol-
ogy, economics, geography, history, political science, psychology, and
sociology. Each chapter follows a similar format, designed to assist
the reader in the interpretation of program data. The first table in
each chapter provides a list of the programs evaluated in a discipline
--including the names of the universities and departments or academic
units in which programs reside--along with the full set of data com-
piled for individual programs. Programs are listed alphabetically ac-
cording to name of institution, and both raw and standardized values
are given for all measures. For the reader's convenience an insert of
information from Table 2.1 is provided that identifies each of the 16
measures reported in the table and indicates the raw scale used in re-
porting values for a particular measure. Standardized values, con-
verted from raw values to have a mean of 50 and a standard deviation
of 10,28 are computed for every measure so that comparisons can easily
be made of a program's relative standing on different measures. Thus,
a standardized value of 30 corresponds with a raw value that is two
standard deviations below the mean for that measure, and a standardized
value of 70 represents a raw value two standard deviations above the
mean. While the reporting of values in standardized form is convenient
for comparing a particular program's standing on different measures,
it may be misleading in interpreting actual differences in the values
reported for two or more programs--especially when the distribution of
the measure being examined is highly skewed. For example, the numbers
of published articles (measure 17) associated with four psychology pro-
grams are reported in Table 3.1 as follows:
Program    Raw Value    Standardized Value
A          [?]          37
B          [?]          38
C          [?]          41
D          [?]          42
(Only two raw values, 2 and 32, are legible; their rows cannot be
determined.)
Although programs C and D have many times the number of articles as
have programs A and B, the differences reported on a standardized scale
appear to be small. Thus, the reader is urged to take note of the raw
values before attempting to interpret differences in the standardized
values given for two or more programs.
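The conversion to standardized values, and the compression it produces in a skewed distribution, can be sketched in Python. The raw values below are hypothetical and are not taken from Table 3.1. The use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def standardize(raw_values):
    """Rescale raw values to have a mean of 50 and a standard deviation of 10."""
    m = mean(raw_values)
    s = pstdev(raw_values)  # population standard deviation (assumed)
    return [50 + 10 * (x - m) / s for x in raw_values]


# Hypothetical, highly skewed article counts for five programs.
raw = [2, 4, 30, 32, 900]
for x, z in zip(raw, standardize(raw)):
    print(x, round(z, 1))
```

With these hypothetical counts, the program with 32 articles has 16 times as many as the program with 2, yet their standardized values differ by less than one point because the single very large value dominates the standard deviation.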
The initial table in each chapter also presents estimated standard
errors of mean ratings derived from the four survey items (measures
08-11). A standard error is an estimated standard deviation of the
28The conversion was made from the precise raw value rather than from
the rounded value reported for each program. Thus, two programs may
have the same reported raw value for a particular measure but different
standardized values.
sample mean rating and may be used to assess the stability of a mean
rating reported for a particular program.29 For example, one may
assert (with .95 confidence) that the population mean rating would lie
within two standard errors of the sample mean rating reported in this
assessment.
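The standard error and the two-standard-error interval can be computed as in the following Python sketch. The ratings are hypothetical, the function name is illustrative, and the use of the sample standard deviation is an assumption, since the report does not specify which form was used.

```python
from math import sqrt
from statistics import mean, stdev


def rating_interval(ratings, width=2.0):
    """Return (mean, standard error, lower bound, upper bound).

    The standard error is the standard deviation of the ratings divided
    by the square root of the number of ratings; the bounds lie `width`
    standard errors on either side of the mean.
    """
    m = mean(ratings)
    se = stdev(ratings) / sqrt(len(ratings))  # sample form (assumed)
    return m, se, m - width * se, m + width * se


# Hypothetical ratings from 20 evaluators.
ratings = [2, 4] * 10
m, se, low, high = rating_interval(ratings)
print(m, round(se, 3), round(low, 2), round(high, 2))
```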
No attempt has been made to establish a composite ranking of pro-
grams in a discipline. Indeed, the committee is convinced that no
single measure adequately reflects the quality of a research-doctorate
program and wishes to emphasize the importance of viewing individual
programs from the perspective of multiple indices or dimensions.
The second table in each chapter presents summary statistics (i.e.,
number of programs evaluated, mean, standard deviation, and decile val-
ues) for each of the program measures.30 The reader should find
these statistics helpful in interpreting the data reported on individ-
ual programs. Next is a table of the intercorrelations among the vari-
ous measures for that discipline. This table should be of particular
interest to those desiring information about the interrelations of the
various measures.
The remainder of each chapter is devoted to an examination of re-
sults from the reputational survey. Included are an analysis of the
characteristics of survey participants and graphical portrayals of the
relationship of the mean rating of scholarly quality of faculty
(measure 08) with the number of faculty (measure 01) and the relationship
of the mean rating of program effectiveness (measure 09) with the
number of graduates (measure 02). A frequently mentioned criticism of the
Roose-Andersen and Cartter studies is that small but distinguished pro-
grams have been penalized in the reputational ratings because they are
not as highly visible as larger programs of comparable quality. The
comparisons of survey ratings with measures of program size are pre-
sented as the first two figures in each chapter and provide evidence
about the number of small programs in each discipline that have re-
ceived high reputational ratings. Since in each case the reputational
rating is more highly correlated with the square root of program size
than with the size measure itself, measures 01 and 02 are plotted on a
square root scale.31 To assist the reader in interpreting results
of the survey evaluations, each chapter concludes with a graphical
29The standard error estimate has been computed by dividing the
standard deviation of a program's ratings by the square root of the
number of ratings. For a more extensive discussion of this topic, see
Fred N. Kerlinger, Foundations of Behavioral Research, Holt, Rinehart,
and Winston, Inc., New York, 1973, Chapter 12. Readers should note that
the estimate is a measure of the variation in response and by no means
includes all possible sources of error.
30Standardized scores have been computed from precise values of the
mean and standard deviations of each measure and not the rounded
values reported in the second table in each of the following chapters.
31For a general discussion of transforming variables to achieve linear
fits, see John W. Tukey, Exploratory Data Analysis, Addison-Wesley,
Reading, Massachusetts, 1977.
presentation of the mean rating for every program of the scholarly
quality of faculty (measure 08) and an associated "confidence interval"
of 1.5 standard errors. In comparing the mean ratings of two programs,
if their reported confidence intervals of 1.5 standard errors do not
overlap, one may safely conclude that the program ratings are signifi-
cantly different (at the .05 level of significance)--i.e., the observed
difference in mean ratings is too large to be plausibly attributable
to sampling error.32
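The significance level implied by the nonoverlap rule can be checked with a short Python sketch. It assumes that the two mean ratings are independent and approximately normally distributed, which the report does not state explicitly; the function name is hypothetical. Two intervals of 1.5 standard errors fail to overlap when the difference in means exceeds 1.5 times the sum of the standard errors, while the standard error of the difference is the square root of the sum of their squares.

```python
from math import erfc, sqrt


def nonoverlap_significance(se_ratio, half_width=1.5):
    """Two-sided significance level at which two intervals of
    +/- half_width standard errors just touch, given the ratio of the
    two standard errors (normal approximation, independent means)."""
    z = half_width * (1 + se_ratio) / sqrt(1 + se_ratio ** 2)
    return erfc(z / sqrt(2))  # two-sided normal tail probability


print(round(nonoverlap_significance(1.0), 3))   # equal standard errors: about .034
print(round(nonoverlap_significance(2.41), 3))  # ratio of 2.41: about .050
```

Under these assumptions the rule is stricter than the .05 level whenever the ratio of the standard errors is below about 2.41, and it reaches roughly .034 when the two standard errors are equal.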
The final chapter of this report gives an overview of the evalua-
tion process in the seven social and behavioral science disciplines and
includes a summary of general findings. Particular attention is given
to some of the extraneous factors that may influence program ratings
of individual evaluators and thereby distort the survey results. The
chapter concludes with a number of specific suggestions for improving
future assessments of research-doctorate programs.
32This rule for comparing nonoverlapping intervals is valid as long as
the ratio of the two estimated standard errors does not exceed 2.41.
(The exact statistical significance of this criterion then lies between
.050 and .034.) Inspection of the standard errors reported in each
discipline shows that for programs with mean ratings differing by less
than 1.0 (on measure 08), the standard error of one mean very rarely
exceeds twice the standard error of another.