Read "An Assessment of Research-Doctorate Programs in the United States: Mathematical and Physical Sciences" at NAP.edu

Suggested Citation:"I. Origins of Study and Selection of Programs." National Research Council. 1982. An Assessment of Research-Doctorate Programs in the United States: Mathematical and Physical Sciences. Washington, DC: The National Academies Press. doi: 10.17226/9730.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Origins of Study and Selection of Programs

Each year more than 22,000 candidates are awarded doctorates in engineering, the humanities, and the sciences from approximately 250 U.S. universities. They have spent, on the average, five-and-a-half years in intensive education in preparation for research careers either in universities or in settings outside the academic sector, and many will make significant contributions to research. Yet we are poorly informed concerning the quality of the programs producing these graduates. This study is intended to provide information pertinent to this complex and controversial subject.

The charge to the study committee directed it to build upon the planning that preceded it. The planning stages included a detailed review of the methodologies and the results of past studies that had focused on the assessment of doctoral-level programs. The committee has taken into consideration the reactions of various groups and individuals to those studies. The present assessment draws upon previous experience with program evaluation, with the aim of improving what was useful and avoiding some of the difficulties encountered in past studies.

The present study, nevertheless, is not purely reactive: it has its own distinctive features. First, it focuses only on programs awarding research doctorates and their effectiveness in preparing students for careers in research. Although other purposes of graduate education are acknowledged to be important, they are outside the scope of this assessment. Second, the study examines a variety of different indices that may be relevant to the program quality. This multidimensional approach represents an explicit recognition of the limitations of studies that rely entirely on peer ratings of perceived quality--the so-called reputational ratings.
Finally, in the compilation of reputational ratings in this study, evaluators were provided the names of faculty members involved with each program to be rated and the number of research doctorates awarded in the last five years. In previous reputational studies evaluators were not supplied such information.

During the past two decades increasing attention has been given to describing and measuring the quality of programs in graduate education. It is evident that the assessment of graduate programs is highly important for university administrators and faculty, for employers in industrial and government laboratories, for graduate students and prospective graduate students, for policymakers in state and national organizations, and for private and public funding agencies. Past experience, however, has demonstrated the difficulties with such assessments and their potentially controversial nature. As one critic has asserted:

. . . the overall effect of these reports seems quite clear. They tend, first, to make the rich richer and the poor poorer; second, the example of the highly ranked clearly imposes constraints on those institutions lower down the scale (the "Hertz-Avis" effect). And the effect of such constraints is to reduce diversity, to reward conformity or respectability, to penalize genuine experiment or risk. There is, also, I believe, an obvious tendency to promote the prevalence of disciplinary dogma and orthodoxy. All of this might be tolerable if the reports were tolerably accurate and judicious, if they were less prescriptive and more descriptive; if they did not pretend to "objectivity" and if the very fact of ranking were not pernicious and invidious; if they genuinely promoted a meaningful "meritocracy" (instead of simply perpetuating the status quo ante and an establishment mentality). But this is precisely what they cannot claim to be or do.[1]

The widespread criticisms of ratings in graduate education were carefully considered in the planning of this study. At the outset consideration was given to whether a national assessment of graduate programs should be undertaken at this time and, if so, what methods should be employed. The next two sections in this chapter examine the background and rationale for the decision by the Conference Board of Associated Research Councils[2] to embark on such a study. The remainder of the chapter describes the selection of disciplines and programs to be covered in the assessment. The overall study encompasses a total of 2,699 graduate programs in 32 disciplines.
In this report--the first of five reports issuing from the study--we examine 596 programs in six disciplines in the mathematical and physical sciences: chemistry, computer sciences, geosciences, mathematics, physics, and statistics/biostatistics. These programs account for more than 90 percent of the research doctorates awarded in these six disciplines. It should be emphasized that the selection of disciplines to be covered was determined on the basis of total doctoral awards during the FY1976-78 period (as described later in this chapter), and the exclusion of a particular discipline was in no way based on a judgment of the importance of graduate education or research in that discipline. Also, although the assessment is limited to programs leading to the research-doctorate (Ph.D. or equivalent) degree, the Conference Board and study committee recognize that graduate schools provide many other forms of valuable and needed education.

[1] William A. Arrowsmith, "Preface," in The Ranking Game: The Power of the Academic Elite, by W. Patrick Dolan, University of Nebraska Printing and Duplicating Service, Lincoln, Nebraska, 1976, p. ix.
[2] The Conference Board includes representatives of the American Council of Learned Societies, American Council on Education, National Research Council, and Social Science Research Council.

PRIOR ATTEMPTS TO ASSESS QUALITY IN GRADUATE EDUCATION

Universities and affiliated organizations have taken the lead in the review of programs in graduate education. At most institutions program reviews are carried out on a regular basis and include a comprehensive examination of the curriculum and educational resources as well as the qualifications of faculty and students. One special form of evaluation is that associated with institutional accreditation:

The process begins with the institutional or programmatic self-study, a comprehensive effort to measure progress according to previously accepted objectives. The self-study considers the interest of a broad cross-section of constituencies--students, faculty, administrators, alumni, trustees, and in some circumstances the local community. The resulting report is reviewed by the appropriate accrediting commission and serves as the basis for evaluation by a site-visit team from the accrediting group. . . . Public as well as educational needs must be served simultaneously in determining and fostering standards of quality and integrity in the institutions and such specialized programs as they offer.
Accreditation, conducted through non-governmental institutional and specialized agencies, provides a major means for meeting those needs.[3]

Although formal accreditation procedures play an important role in higher education, many university administrators do not view such procedures as an adequate means of assessing program quality. Other efforts are being made by universities to evaluate their programs in graduate education. The Educational Testing Service, with the sponsorship of the Council of Graduate Schools in the United States and the Graduate Record Examinations Board, has recently developed a set of procedures to assist institutions in evaluating their own graduate programs.[4]

While reviews at the institutional (or state) level have proven useful in assessing the relative strengths and weaknesses of individual programs, they have not provided the information required for making national comparisons of graduate programs. Several attempts have been made at such comparisons. The most widely used of these have been the studies by Keniston (1959), Cartter (1966), and Roose and Andersen (1970). All three studies covered a broad range of disciplines in engineering, the humanities, and the sciences and were based on the opinions of knowledgeable individuals in the program areas covered. Keniston[5] surveyed the department chairmen at 25 leading institutions. The Cartter[6] and Roose-Andersen[7] studies compiled ratings from much larger groups of faculty peers. The stated motivation for these studies was to increase knowledge concerning the quality of graduate education:

A number of reasons can be advanced for undertaking such a study. The diversity of the American system of higher education has properly been regarded by both the professional educator and the layman as a great source of strength, since it permits flexibility and adaptability and encourages experimentation and competing solutions to common problems. Yet diversity also poses problems. . . . Diversity can be a costly luxury if it is accompanied by ignorance. . . . Just as consumer knowledge and honest advertising are requisite if a competitive economy is to work satisfactorily, so an improved knowledge of opportunities and of quality is desirable if a diverse educational system is to work effectively.[8]

Although the program ratings from the Cartter and Roose-Andersen studies are highly correlated, some substantial differences in successive ratings can be detected for a small number of programs--suggesting changes in the programs or in the perception of the programs.

[3] Council on Postsecondary Accreditation, The Balance Wheel for Accreditation, Washington, D.C., July 1981, pp. 2-3.
For the past decade the Roose-Andersen ratings have generally been regarded as the best available source of information on the quality of doctoral programs. Although the ratings are now more than 10 years out of date and have been criticized on a variety of grounds, they are still used extensively by individuals within the academic community and by those in federal and state agencies. A frequently cited criticism of the Cartter and Roose-Andersen studies is their exclusive reliance upon reputational measurement.

The ACE rankings are but a small part of all the evaluative processes, but they are also the most public, and they are clearly based on the narrow assumptions and elitist structures that so dominate the present direction of higher education in the United States. As long as our most prestigious source of information about post-secondary education is a vague popularity contest, the resultant ignorance will continue to provide a cover for the repetitious aping of a single model. . . . All the attempts to change higher education will ultimately be strangled by the "legitimate" evaluative processes that have already programmed a single set of responses from the start.[9]

A number of other criticisms have been leveled at reputational rankings of graduate programs.[10] First, such studies inherently reflect perceptions that may be several years out of date and do not take into account recent changes in a program. Second, the ratings of individual programs are likely to be influenced by the overall reputation of the university--i.e., an institutional "halo effect." Also, a disproportionately large fraction of the evaluators are graduates of and/or faculty members in the largest programs, which may bias the survey results. Finally, on the basis of such studies it may not be possible to differentiate among many of the lesser known programs in which relatively few faculty members have established national reputations in research.

[4] For a description of these procedures see M. J. Clark, Graduate Program Self-Assessment Service: Handbook for Users, Educational Testing Service, Princeton, New Jersey, 1980.
[5] H. Keniston, Graduate Study and Research in the Arts and Sciences at the University of Pennsylvania, University of Pennsylvania Press, Philadelphia, 1959.
[6] A. M. Cartter, An Assessment of Quality in Graduate Education, American Council on Education, Washington, D.C., 1966.
[7] K. D. Roose and C. J. Andersen, A Rating of Graduate Programs, American Council on Education, Washington, D.C., 1970.
[8] Cartter, p. 3.
Despite such criticisms several studies based on methodologies similar to that employed by Cartter and Roose and Andersen have been carried out during the past 10 years. Some of these studies evaluated post-baccalaureate programs in areas not covered in the two earlier reports--including business, religion, educational administration, and medicine. Others have focused exclusively on programs in particular disciplines within the sciences and humanities. A few attempts have been made to assess graduate programs in a broad range of disciplines, many of which were covered in the Roose-Andersen and Cartter ratings, but in the opinion of many each has serious deficiencies in the methods and procedures employed. In addition to such studies, a myriad of articles have been written on the assessment of graduate programs since the release of the Roose-Andersen report. With the heightening interest in these evaluations, many in the academic community have recognized the need to assess graduate programs, using other criteria in addition to peer judgment.

Though carefully done and useful in a number of ways, these ratings (Cartter and Roose-Andersen) have been criticized for their failure to reflect the complexity of graduate programs, their tendency to emphasize the traditional values that are highly related to program size and wealth, and their lack of timeliness or currency. Rather than repeat such ratings, many members of the graduate community have voiced a preference for developing ways to assess the quality of graduate programs that would be more comprehensive, sensitive to the different program purposes, and appropriate for use at any time by individual departments or universities.[11]

Several attempts have been made to go beyond the reputational assessment. Clark, Hartnett, and Baird, in a pilot study[12] of graduate programs in chemistry, history, and psychology, identified as many as 30 possible measures significant for assessing the quality of graduate education. Glower[13] has ranked engineering schools according to the total amount of research spending and the number of graduates listed in Who's Who in Engineering. House and Yeager[14] rated economics departments on the basis of the total number of pages published by full professors in 45 leading journals in this discipline. Other ratings based on faculty publication records have been compiled for graduate programs in a variety of disciplines, including political science, psychology, and sociology. These and other studies demonstrate the feasibility of a national assessment of graduate programs that is founded on more than reputational standing among faculty peers.

[9] Dolan, p. 81.
[10] For a discussion of these criticisms, see David S. Webster, "Methods of Assessing Quality," Change, October 1981, pp. 20-24.
[11] Clark, p. 1.
[12] M. J. Clark, R. T. Hartnett, and L. L. Baird, Assessing Dimensions of Quality in Doctoral Education: A Technical Report of a National Study in Three Fields, Educational Testing Service, Princeton, New Jersey, 1976.
[13] Donald D. Glower, "A Rational Method for Ranking Engineering Programs," Engineering Education, May 1980.
[14] Donald R. House and James H. Yeager, Jr., "The Distribution of Publication Success Within and Among Top Economics Departments: A Disaggregate View of Recent Evidence," Economic Inquiry, Vol. 16, No. 4, October 1978, pp. 593-598.

DEVELOPMENT OF STUDY PLANS

In September 1976 the Conference Board, with support from the Carnegie Corporation of New York and the Andrew W. Mellon Foundation, convened a three-day meeting to consider whether a study of programs in graduate education should be undertaken. The 40 invited participants at this meeting included academic administrators, faculty members, and agency and foundation officials,[15] who represented a variety of institutions, disciplines, and convictions. In these discussions there was considerable debate concerning whether the potential benefits of such a study outweighed the possible misrepresentations of the results. On the one hand, "a substantial majority of the Conference [participants believed] that the earlier assessments of graduate education have received wide and important use: by students and their advisors, by the institutions of higher education as aids to planning and the allocation of educational functions, as a check on unwarranted claims of excellence, and in social science research."[16] On the other hand, the conference participants recognized that a new study assessing the quality of graduate education "would be conducted and received in a very different atmosphere than were the earlier Cartter and Roose-Andersen reports. . . . Where ratings were previously used in deciding where to increase funds and how to balance expanding programs, they might now be used in deciding where to cut off funds and programs."

After an extended debate of these issues, it was the recommendation of this conference that a study with particular emphasis on the effectiveness of doctoral programs in educating research personnel be undertaken.
The recommendation was based principally on four considerations: (1) the importance of the study results to national and state bodies, (2) the desire to stimulate continuing emphasis on quality in graduate education, (3) the need for current evaluations that take into account the many changes that have occurred in programs since the Roose-Andersen study, and (4) the value of extending the range of measures used in evaluative studies of graduate programs. Although many participants expressed interest in an assessment of master's degree and professional degree programs, insurmountable problems prohibited the inclusion of these types of programs in this study.

Following this meeting a 13-member committee,[17] co-chaired by Gardner Lindzey and Harriet A. Zuckerman, was formed to develop a detailed plan for a study limited to research-doctorate programs and designed to improve upon the methodologies utilized in earlier studies. In its deliberations the planning committee carefully considered the criticisms of the Roose-Andersen study and other national assessments. Particular attention was paid to the feasibility of compiling a variety of specific measures (e.g., faculty publication records, quality of students, program resources) that were judged to be related to the quality of research-doctorate programs. Attention was also given to making improvements in the survey instrument and procedures used in the Cartter and Roose-Andersen studies. In September 1978 the planning group submitted a comprehensive report describing alternative strategies for an evaluation of the quality and effectiveness of research-doctorate programs.

The proposed study has its own distinctive features. It is characterized by a sharp focus and a multidimensional approach. (1) It will focus only on programs awarding research doctorates; other purposes of doctoral training are acknowledged to be important, but they are outside the scope of the work contemplated. (2) The multidimensional approach represents an explicit recognition of the limitations of studies that make assessments solely in terms of ratings of perceived quality provided by peers--the so-called reputational ratings. Consequently, a variety of quality-related measures will be employed in the proposed study and will be incorporated in the presentation of the results of the study.[18]

This report formed the basis for the decision by the Conference Board to embark on a national assessment of doctorate-level programs in the sciences, engineering, and the humanities. In June 1980 an 18-member committee was appointed to oversee the study.

[15] See Appendix G for a list of the participants in this conference.
[16] From a summary of the Woods Hole Conference (see Appendix G).
[17] See Appendix H for a list of members of the planning committee.
The committee,[19] made up of individuals from a diverse set of disciplines within the sciences, engineering, and the humanities, includes seven members who had been involved in the planning phase and several members who presently serve or have served as graduate deans at either public or private universities. During the first eight months the committee met three times to review plans for the study activities, make decisions on the selection of disciplines and programs to be covered, and design the survey instruments to be used. Early in the study an effort was made to solicit the views of presidents and graduate deans at more than 250 universities. Their suggestions were most helpful to the committee in drawing up final plans for the assessment. With the assistance of the Council of Graduate Schools in the United States, the committee and its staff have tried to keep the graduate deans informed about the progress being made in this study. The final section of this chapter describes the procedures followed in determining which research-doctorate programs were to be included in the assessment.

[18] National Research Council, A Plan to Study the Quality and Effectiveness of Research-Doctorate Programs, 1978 (unpublished report).
[19] See p. iii of this volume for a list of members of the study committee.

SELECTION OF DISCIPLINES AND PROGRAMS TO BE EVALUATED

One of the most difficult decisions made by the study committee was the selection of disciplines to be covered in the assessment. Early in the planning stage it was recognized that some important areas of graduate education would have to be left out of the study. Limited financial resources required that efforts be concentrated on a total of no more than about 30 disciplines in the biological sciences, engineering, humanities, mathematical and physical sciences, and social sciences. At its initial meeting the committee decided that the selection of disciplines within each of these five areas should be made primarily on the basis of the total number of doctorates awarded nationally in recent years.

At the time the study was undertaken, aggregate counts of doctoral degrees earned during the FY1976-78 period were available from two independent sources--the Educational Testing Service (ETS) and the National Research Council (NRC). Table 1.1 presents doctoral awards data for 10 disciplines within the mathematical and physical sciences. As alluded to in footnote 1 of the table, discrepancies between the ETS and NRC counts may be explained, in part, by differences in the data collection procedures. The ETS counts, derived from information provided by universities, have been categorized according to the discipline of the department/academic unit in which the degree was earned. The NRC counts were tabulated from the survey responses of FY1976-78 Ph.D. recipients, who had been asked to identify their fields of specialty.
Since separate totals for research doctorates in astronomy, atmospheric sciences, environmental sciences, and marine sciences were not available from the ETS manual, the committee made its selection of six disciplines primarily on the basis of the NRC data. In the case of computer sciences, some consideration was given to the fact that the ETS estimate was significantly greater than the NRC estimate.[20]

[20] See footnote 4 in Table 1.1.

TABLE 1.1 Number of Research Doctorates Awarded in the Mathematical and Physical Science Disciplines, FY1976-78

                                            Source of Data[1]
                                            ETS         NRC
Disciplines Included in the Assessment
  Chemistry                                 4,624       4,739
  Physics[2]                                3,139       3,033
  Mathematics                               1,985       1,848
  Geosciences[3]                            1,395       1,139
  Computer Sciences[4]                        728         456
  Statistics/Biostatistics[5]                 457         634
  Total                                    12,328      11,849
Disciplines Not Included in the Assessment
  Astronomy                                 N/A[6]        408
  Marine Sciences                           N/A           406
  Atmospheric Sciences                      N/A           246
  Environmental Sciences                    N/A           160
  Other Physical Sciences                   N/A           132
  Total                                                 1,352

[1] Data on FY1976-78 doctoral awards were derived from two independent sources: Educational Testing Service (ETS), Graduate Programs and Admissions Manual, 1979-81, and NRC's Survey of Earned Doctorates, 1976-78. Differences in field definitions account for discrepancies between the ETS and NRC data.
[2] Data from ETS include doctorates in astronomy and astrophysics.
[3] Data from ETS include doctorates in atmospheric sciences and oceanography.
[4] The ETS data may include some individuals from computer science departments who earned doctorates in the field of electrical engineering and consequently are not included in the NRC data.
[5] Data from ETS exclude doctorates in biostatistics.
[6] Not available.

The selection of the research-doctorate programs to be evaluated in each discipline was made in two stages. Programs meeting any of the following three criteria were initially nominated for inclusion in the study:

(1) more than a specified number (see below) of research doctorates awarded during the FY1976-78 period,
(2) more than one-third of that specified number of doctorates awarded in FY1979, or
(3) an average rating of 2.0 or higher in the Roose-Andersen rating of the scholarly quality of departmental faculty.

In each discipline the specified number of doctorates required for inclusion in the study was determined in such a way that the programs meeting this criterion accounted for at least 90 percent of the doctorates awarded in that discipline during the FY1976-78 period. In the mathematical and physical science disciplines, the following numbers of FY1976-78 doctoral awards were required to satisfy the first criterion (above):

Chemistry--13 or more doctorates
Computer Sciences--5 or more doctorates
Geosciences--7 or more doctorates
Mathematics--7 or more doctorates
Physics--10 or more doctorates
Statistics/Biostatistics--5 or more doctorates

A list of the nominated programs at each institution was then sent to a designated individual (usually the graduate dean) who had been appointed by the university president to serve as study coordinator for the institution. The coordinator was asked to review the list and eliminate any programs no longer offering research doctorates or not belonging in the designated discipline. The coordinator also was given an opportunity to nominate additional programs that he or she believed should be included in the study. Coordinators were asked to restrict their nominations to programs that they considered to be "of uncommon distinction" and that had awarded no fewer than two research doctorates during the past two years. In order to be eligible for inclusion, of course, programs had to belong in one of the disciplines covered in the study. If the university offered more than one research-doctorate program in a discipline, the coordinator was instructed to provide information on each of them so that these programs could be evaluated separately.[21]

The committee received excellent cooperation from the study coordinators at the universities. Of the 243 institutions that were identified as having one or more research-doctorate programs satisfying the criteria (listed earlier) for inclusion in the study, only 7 declined to participate in the study and another 8 failed to provide the program information requested within the three-month period allotted (despite several reminders).
None of these 15 institutions had doctoral programs that had received strong or distinguished reputational ratings in prior national studies. Since the information requested had not been provided, the committee decided not to include programs from these institutions in any aspect of the assessment. In each of the six chapters that follow, a list is given of the universities that met the criteria for inclusion in a particular discipline but that are not represented in the study.

As a result of nominations by institutional coordinators, some programs were added to the original list and others dropped. Table 1.2 reports the final coverage in each of the six mathematical and physical science disciplines. The number of programs evaluated varies considerably by discipline. A total of 145 chemistry programs have been included in the study; in computer sciences and statistics/biostatistics fewer than half this number have been included. Although the final determination of whether a program should be included in the assessment was left in the hands of the institutional coordinator, it is entirely possible that a few programs meeting the criteria for inclusion in the assessment were overlooked by the coordinators. During the course of the study only two such programs in the mathematical and physical sciences--one in mathematics and one in biostatistics--have been called to the attention of the committee.

[21] See Appendix A for the specific instructions given to the coordinators.

TABLE 1.2 Number of Programs Evaluated in Each Discipline and the Total FY1976-80 Doctoral Awards from These Programs

Discipline                    Programs    FY1976-80 Doctorates*
Chemistry                        145            7,304
Computer Sciences                 58            1,154
Geosciences                       91            1,747
Mathematics                      115            2,698
Physics                          123            4,271
Statistics/Biostatistics          64              906
TOTAL                            596           18,080

*The data on doctoral awards were provided by the study coordinator at each of the universities covered in the assessment.

In the chapter that follows, a detailed description is given of each of the measures used in the evaluation of research-doctorate programs in the mathematical and physical sciences. The description includes a discussion of the rationale for using the measure, the source from which data for that measure were derived, and any known limitations that would affect the interpretation of the data reported. The committee wishes to emphasize that there are limitations associated with each of the measures and that none of the measures should be regarded as a precise indicator of the quality of a program in educating scientists for careers in research. The reader is strongly urged to consider the descriptive material presented in Chapter II before attempting to interpret the program evaluations reported in subsequent chapters. In presenting a frank discussion of any shortcomings of each measure, the committee's intent is to reduce the possibility of misuse of the results from this assessment of research-doctorate programs.
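
The first-stage nomination rule described under SELECTION OF DISCIPLINES AND PROGRAMS TO BE EVALUATED can be sketched in a few lines of Python. This is only an illustration, not the committee's actual procedure. The names `Program`, `is_nominated`, and `coverage_threshold` are hypothetical. The sketch assumes that the figures listed for the first criterion ("13 or more doctorates" and so on) are minimum counts, and that criterion (2) compares FY1979 awards strictly against one-third of the same figure; the chapter's wording ("more than a specified number") leaves that reading open. The 90 percent rule is interpreted here as choosing the largest cutoff that still covers the required share, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Minimum FY1976-78 doctoral awards for the first criterion, as listed in the chapter.
THRESHOLDS = {
    "Chemistry": 13,
    "Computer Sciences": 5,
    "Geosciences": 7,
    "Mathematics": 7,
    "Physics": 10,
    "Statistics/Biostatistics": 5,
}


@dataclass
class Program:
    """Hypothetical record for one research-doctorate program."""
    discipline: str
    doctorates_fy1976_78: int
    doctorates_fy1979: int
    roose_andersen_rating: Optional[float] = None  # None if the program was unrated


def is_nominated(p: Program) -> bool:
    """Return True if the program meets any of the three nomination criteria."""
    n = THRESHOLDS[p.discipline]
    meets_volume = p.doctorates_fy1976_78 >= n   # criterion (1), read as "n or more"
    meets_recent = p.doctorates_fy1979 > n / 3   # criterion (2), assumed strict comparison
    meets_rating = (
        p.roose_andersen_rating is not None
        and p.roose_andersen_rating >= 2.0       # criterion (3)
    )
    return meets_volume or meets_recent or meets_rating


def coverage_threshold(counts: Sequence[int], share: float = 0.90) -> int:
    """Largest cutoff t such that programs with at least t doctorates
    account for at least `share` of all doctorates (assumed interpretation)."""
    total = sum(counts)
    best = 1
    for t in sorted(set(counts)):
        covered = sum(c for c in counts if c >= t)
        if covered >= share * total:
            best = t
    return best
```

With invented counts such as `[50, 30, 15, 3, 2]`, `coverage_threshold` selects a cutoff of 15, since the three largest programs together account for 95 of the 100 doctorates. A program that falls short on volume can still be nominated through recent output or through its earlier Roose-Andersen rating, which is why the three tests are combined with `or`.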

An Assessment of Research-Doctorate Programs in the United States: Mathematical and Physical Sciences (1982)

Chapter: I. Origins of Study and Selection of Programs
