7
General Conclusions and Recommendations
The Committee to Examine the Methodology to Assess
Research-Doctorate Programs was presented with the task
of examining the methodologies used in the 1995 National
Research Council Study, Research-Doctorate Programs in
the United States: Continuity and Change (referred to
throughout this report as the "1995 Study") to determine the
feasibility of significant improvements. The previous
chapters have made specific recommendations on how to
conduct an assessment of research-doctorate programs under
the assumption that it will be done. The more fundamental
question remains to be addressed: Should another study be
carried out at all? This chapter presents the Committee's
conclusions on this and other general issues along with the
reasons for supporting them.
SHOULD ANOTHER ASSESSMENT OF RESEARCH-
DOCTORATE PROGRAMS BE UNDERTAKEN BY THE
NATIONAL RESEARCH COUNCIL?
The Committee was asked to examine the methodology
of the 1995 Study and to identify both its strengths and its
weaknesses. Where weaknesses were found, it was asked to
suggest methods to remedy them.
The strengths of the 1995 Study identified by the Com-
mittee were:
· Wide acceptance. It was widely accepted, quoted, and
utilized as an authoritative source of information on the
quality of doctoral programs.
· Comprehensiveness. It covered 41 of the largest fields
of doctoral study.
· Transparency. Its methodology was clearly stated.
· Temporal continuity. For most programs, it maintained
continuity with the NRC study carried out 10 years earlier.
Finally, it should be noted that the study was a useful tool
for doctoral programs to improve themselves and hence to
improve doctoral education. A frequent use of the study by
administrators is to examine the characteristics of programs
that are rated more highly than their own. If the study were
carried out again, it would provide the quantitative basis for
such analyses.
The weaknesses were:
· Data presentation. The emphasis on exact numerical
rankings encouraged users of the study to draw spurious
inferences of precision.
· Flawed measurement of educational quality. The
reputational measure of program effectiveness in graduate
education, derived from a question asked of faculty raters,
confounded research reputation and educational quality.
· Emphasis on the reputational measure of scholarly
quality. This emphasis gave users the impression that a
"soft" criterion, subject to "halo" and "size effects," was
being relied on for the assessment of programs.
· Obsolescence of data. The period of 10 years between
studies was viewed as too long.
· Poor dissemination of results. The presentation of the
study data was in a form that was difficult for potential
students to use since it was inaccessible and difficult to
interpret.
· Use of an outdated or inappropriate taxonomy of fields.
Particularly for the biological sciences, the taxonomy did
not reflect the current organization of graduate programs in
many institutions.
· Inadequate validation of data. Data were not sent back
to providers for a check on accuracy, and some unnecessary
errors were propagated.
The weaknesses listed above were addressed in earlier
chapters. In addition to these difficulties, it must be noted
that assessments of research-doctorate programs are costly.
The direct costs are those of staff and committee time, but far
greater and invisible costs are incurred by university faculty and
administrative personnel in amassing data for inclusion in
the study. The benefits of the NRC study must outweigh
these costs if it is to be undertaken.
One other issue to be addressed is the possibility of dupli-
cative studies. Broad rankings of doctoral programs in some
fields are conducted periodically by US News and World
Report (USN&WR). Unless the NRC study differs in
important respects from that study, there seems little reason
to incur the known costs.
Both the USN&WR and the NRC reports publish reputa-
tional rankings, but the resemblance ends there. USN&WR
rankings appear with somewhat greater frequency, and they
cover a more limited set of fields outside of professional
schools. With the exception of engineering, USN&WR
publishes only reputational rankings (as of their 2004 edi-
tion). Quantitative data are collected for engineering, but
USN&WR employs a weighted average of quantitative data
and reputational ratings to arrive at a composite ranking. The
problem with this approach is that any ranking based on
weighted averages of quantitative indicators is necessarily
subjective. The weights represent an implementation of someone's
prejudices regarding the relative importance of the various
indicators.
There are additional technical objections to the
USN&WR rankings. For those fields for which there is over-
lap with the NRC fields, the response rates for USN&WR
were 10-20 percentage points below those obtained for the
1995 NRC report. Even more importantly, USN&WR
targets administrators as respondents and asks their views of
programs in fields outside their area of expertise. The NRC
makes every effort to obtain ratings from within-field peers
who are primarily faculty.
The differences between the two studies also reflect a
difference in audience. USN&WR is aimed directly at the
potential student and purports to contain material that would
be helpful to students applying to graduate school. The 1995
NRC Study was primarily directed to faculty, administrators,
and scholars of higher education. It was not especially user-
friendly. (In fact, Brendan Maher, a co-author of the 1995
Study, subsequently wrote a guide for students and others.1)
Because of the transparent way in which NRC studies
present their data, the more extensive coverage of fields out-
side of professional schools, their focus on peer ratings, and
the relatively high response rates they obtain, there is clearly
value added in having the NRC conduct the assessment once
again. However, two questions still remain: Do reputational
ratings do more harm than good to the enterprise that they
seek to assess? And, does the fact that ratings are published
by a prestigious organization, such as the NRC, lend more
credence to rankings than should be due?
1 "How to Read the 1995 National Research Council Report Research-
Doctorate Programs in the United States." 1996.
ASSESSING RESEARCH-DOCTORATE PROGRAMS
Ratings would be harmful if they gave a distorted view of
the graduate enterprise or if they encouraged behavior
inimical to improving its quality. The Committee believes
that a number of steps recommended in previous chapters
would minimize these risks. Presenting ratings as ranges
would diminish the focus of some administrators on hiring
decisions designed purely to "move up in the rankings."
Ascertaining whether programs track student outcomes
would encourage programs to pay more attention to improv-
ing student outcomes. Asking students about the education
they have received would encourage programs to focus on
graduate education as well as on research. Expanding the set
of quantitative measures would permit deeper investigations
into components of a program that contribute to a reputation
for quality. More frequent updating of these data would
provide more timely and objective assessments. A careful
analysis of the correlates of reputation would improve public
understanding of the factors that contribute to a highly
reputed graduate program.
Recommendation 1: The assessment of both the schol-
arly quality of doctoral programs and the educational
practices of these programs is important to higher
education, its funders, its students, and to society. The
National Research Council should continue to conduct
such assessments on a regular basis.
One of the major objections to previous NRC studies is
that they are performed only every 10 years. The reason for
this is a practical one. A national survey of graduate faculty
is an enormous undertaking and changes in scholarly quality
occur slowly. Little new information would be gained at a
high cost if faculty were to be questioned frequently about a
slowly changing phenomenon. The ability to gather quanti-
tative data electronically at little cost, however, makes
possible more frequent reporting of quantitative data. We
will attempt to produce periodically and, ideally, annually
updatable proxy assessments based on quantitative informa-
tion. The Committee believes that Web-based data gathering
should be a part of the next study and suggests the establish-
ment of an updatable database on graduate programs.
Further, once a statistical analysis of the relationship between
quantitative measures and the reputational measure has been
conducted for each field, it will be possible to construct a
"synthetic reputational measure," constructed under the
assumption that the parameters that relate the quantitative
measures to reputation have held steady over time, but that
the values themselves have changed. Although the measure
is weighted, the weights are not subjective except in the sense
that they will be statistically determined, and the combination of
measures that provides the best fit will be used to construct
the indicator for subsequent years. The measures and their
parameters are then frozen in time although the values of
the measures may change.
Recommendation 2: Although scholarly reputation and
the composition of program faculty change slowly and
can be assessed over a decade, quantitative indicators
that are related to quality may change more rapidly and
should be updated on a regular and more frequent basis
than scholarly reputation. The Committee recommends
investigation of the construction of a synthetic measure
of reputation for each field, based on statistically derived
combinations of quantitative measures. This synthetic
measure could be recalculated periodically and, if
possible, annually.
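The synthetic measure discussed above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit, which this report does not specify, and it uses invented programs, two hypothetical measures (publications per faculty member and the share of faculty holding grants), and toy ratings constructed to be exactly linear in those measures so that the result is easy to check. The weights are fitted once in the base year and then held fixed while the measure values are updated.

```python
# Illustrative sketch only. The data, the measure names, and the choice of
# ordinary least squares are assumptions, not the Committee's specification.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_weights(measures, ratings):
    """Least-squares fit of ratings on measures: [intercept, w1, w2, ...]."""
    rows = [[1.0] + list(r) for r in measures]  # prepend an intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def synthetic_rating(weights, program_measures):
    """Apply the frozen weights to a program's current measure values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], program_measures))


# Hypothetical base-year data: (publications per faculty, share with grants).
base_measures = [(2.0, 0.5), (3.0, 0.6), (4.0, 0.4), (5.0, 0.8), (1.0, 0.2)]
# Toy reputational ratings built as 0.5 + 0.5 * publications + 2.0 * grants.
base_ratings = [2.5, 3.2, 3.3, 4.6, 1.4]

frozen_weights = fit_weights(base_measures, base_ratings)

# A later year: one program's measures have changed, the weights have not.
updated_measures = (3.5, 0.7)
print(round(synthetic_rating(frozen_weights, updated_measures), 2))
```

With real data the fit would not be exact, and the choice of measures and functional form would need to be examined separately for each field.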
As described in Chapter 6, reputational rankings depend
on the dispersion of the aggregated ratings of many raters.
This dispersion is relatively narrow for the very best pro-
grams but increases for other programs simply because
information about such programs is not as widely known. A
number of factors may contribute to this phenomenon: lack
of rater knowledge about the program, the likelihood that
smaller programs may specialize in some subfields but not
others, and the fact that different raters value different
dimensions of program quality when they assign ratings.
Although it may greatly disappoint those programs which
would like to boast about their place in the ratings, the Com-
mittee believes that presenting ratings in a way that portrays
dispersion (or lack of rater agreement about the exact
ranking) would improve the usefulness of the ratings.
Recommendation 3: The presentation of reputational
ratings should be modified so as to minimize the drawing
of a spurious inference of precision in program ranking.
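One simple way to portray the dispersion described above is to report a range of rater scores for each program rather than a single number. The sketch below is illustrative only: the rater scores are invented, and the interquartile range is an assumption chosen for simplicity, not necessarily the method discussed in Chapter 6.

```python
from statistics import quantiles

# Hypothetical rater scores on a 0-5 scale; not data from any NRC study.
rater_scores = {
    "Program A": [4.8, 4.9, 4.7, 4.8, 5.0, 4.9, 4.8, 4.7],
    "Program B": [3.0, 4.0, 2.5, 3.5, 4.5, 2.0, 3.0, 3.5],
}


def rating_range(scores):
    """Return (lower quartile, median, upper quartile) of the raters' scores."""
    q1, median, q3 = quantiles(scores, n=4)
    return q1, median, q3


for program, scores in rater_scores.items():
    low, mid, high = rating_range(scores)
    print(f"{program}: median {mid:.2f}, middle half of raters {low:.2f} to {high:.2f}")
```

In this toy example the highly rated program has a much narrower range than the other, which is the pattern the text attributes to differences in rater knowledge and rater agreement.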
In addition to the quantitative measures collected in the
1995 Study, additional measures would add to the ability of
study users to analyze the correlates of reputation. These are
discussed in detail in Chapter 4, but include data on elec-
tronic acquisitions by libraries and field-specific measures,
such as laboratory space in the sciences, and number of
books in the humanities.
Recommendation 4: Data for quantitative measures
should be collected regularly and made accessible in a
Web-readable format. These measures should be reported
whenever significantly updated data are available. (See
Recommendation 4.1 for details.)
The education of doctoral students for a wide range of
employment beyond that in academia has become an object
of growing attention in the educational policy community
and among the students themselves. In addition to collect-
ing data on educational practices and resources, the Com-
mittee proposes that the next NRC study collect data from
advanced-to-candidacy students in a small number of fields
in order to assess their educational experiences, their
research productivity, program practices, and institutional
and program environments. Further, although the Commit-
tee realizes that it would not be feasible to conduct a large
study of outcomes, it believes that information on whether
programs collect and publish such information would be
valuable to potential students.
Recommendation 5: Comparable information on
educational processes should be collected directly from
advanced-to-candidacy students in selected programs
and reported. Whether or not individual programs
monitor outcomes for their graduates should be reported.
The Committee constructed a taxonomy of fields for the
proposed study that reflected changes that have taken place
in the past decade, especially in the biological sciences.
Although it was not able to identify many interdisciplinary
fields that offered doctoral programs, it did recommend a
new category that would present data on such fields as they
emerged. Many such fields may still be included in more
traditional programs. The committee appointed to conduct
the proposed study should consider the exact details of the
taxonomy. This is an open question, still subject to review.
Recommendation 6: The taxonomy of fields should be
changed from that used in the 1995 Study to incorporate
additional fields with large Ph.D. production. The agri-
cultural sciences should be added to the taxonomy and
efforts should be made to include basic biomedical fields
in medical schools. A new category, "emerging fields,"
should be included.
In the 1995 Study, data were not sent back to the providers
for validation. This introduced a number of errors.
For example, for multicampus institutions, whole programs
were omitted and a number of faculty lists were inaccurate.
The next study should make sure this does not happen. This
is made much more feasible by the availability of informa-
tion technology.
Recommendation 7: All data that are collected should be
validated by the providers.
There is an increasing trans-border flow of doctoral
students between Canadian and U.S. doctoral programs.
Although there are differences between the national systems,
there are many similarities as well. The Committee believes
that the inclusion of Canadian research-doctorate programs
would be useful to programs in both countries.
Recommendation 8: If the recommendation of the
Canadian Research-Doctorate Quality Assessment Study,
which is currently underway, is to participate in the pro-
posed NRC study, Canadian doctoral programs should
be included in the next NRC assessment.
The past decade has seen enormous strides in information
technology. It is now feasible, as demonstrated by the pilot
trials, to collect data using Web questionnaires. This is a
cost-effective technology, saving not only postage but also
the time of coders and permitting rapid validation of data.
Electronic technology can and should also play an important
role in the dissemination of the report. Databases can be
made available on-line, as can simple analytic software that
would enable users to select peer institutions as well as
conduct comparative analyses, while maintaining rater con-
fidentiality. The database for the proposed study should be
designed with this sort of dissemination in mind.
Recommendation 9: Electronic Web-based means of
dissemination should be used extensively for both
the initial report and periodic updates (cf. Recommenda-
tions 2 and 4).
THE FORM OF THE PROPOSED STUDY
The 1995 Study was disseminated as a book of 740 pages,
64 pages of which comprised the text. The remaining pages
contained tables of data and rankings. The bulky study was
also made available on the Web. Two years later, a CD was
published with these data and supplemental data on the
ratings of raters. Electronic technology now makes it
possible to immediately publish all the data, aggregated to
preserve rater confidentiality, on the Web. The same tech-
nology makes it possible for data from the next study to be
pre-released to designated researchers for analytic studies
and for those studies to be published as the print "report" of
the study. Furthermore, a Web-based release makes it
possible to provide analytical tools to users so that they can
compare and rate programs using a la carte quantitative
weights of their own choosing. The Committee believes
strongly that publication of the data alone, without an explo-
ration of its strengths and limitations, should not happen
again. The funding of analytic work should be built into the
study and appear as a prominent part of the report.
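The kind of analytical tool described above, in which users rate programs with weights of their own choosing, might look like the following minimal sketch. The program names, measures, and weight sets are all hypothetical, and the measures are assumed to have been rescaled to a common 0-1 range beforehand.

```python
# Hypothetical measures, already rescaled to 0-1 with higher meaning better.
programs = {
    "Program A": {"publications": 0.9, "time_to_degree": 0.4, "student_support": 0.5},
    "Program B": {"publications": 0.6, "time_to_degree": 0.8, "student_support": 0.9},
    "Program C": {"publications": 0.7, "time_to_degree": 0.6, "student_support": 0.6},
}


def composite_score(measures, weights):
    """Weighted average of a program's measures under user-chosen weights."""
    total = sum(weights.values())
    return sum(weights[name] * measures[name] for name in weights) / total


def rank_programs(programs, weights):
    """Program names ordered from highest to lowest composite score."""
    return sorted(
        programs,
        key=lambda name: composite_score(programs[name], weights),
        reverse=True,
    )


# Two users with different priorities choose different weights.
research_weights = {"publications": 0.8, "time_to_degree": 0.1, "student_support": 0.1}
student_weights = {"publications": 0.2, "time_to_degree": 0.4, "student_support": 0.4}

print(rank_programs(programs, research_weights))
print(rank_programs(programs, student_weights))
```

The two weightings reverse the order of the first and last programs, which shows why a composite built on any single set of weights reflects the priorities of whoever chose them.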
Finally, since the report will have considerably more
information of interest to students, it would be very helpful
to include as an integral part of the report a section, entitled
How to Read This Report, similar to the guide written by
Brendan Maher in 1996.