The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
XII

Summary and Discussion

In the nine preceding chapters results are presented of the assessment of 522 research-doctorate programs in art history, classics, English language and literature, French language and literature, German language and literature, linguistics, music, philosophy, and Spanish language and literature. Included in each chapter are summary data describing the means and intercorrelations of the program measures in a particular discipline. In this chapter a comparison is made of the summary data reported in the nine disciplines. Also presented here are an analysis of the reliability (consistency) of the reputational survey ratings and an examination of some factors that might possibly have influenced the survey results. The chapter concludes with suggestions for improving studies of this kind--with particular attention given to the types of measures one would like to have available for an assessment of research-doctorate programs.

This chapter necessarily involves a detailed discussion of various statistics (means, standard deviations, correlation coefficients) describing the measures. Throughout, the reader should bear in mind that all these statistics and measures are necessarily imperfect attempts to describe the real quality of research-doctorate programs. Quality and differences in quality are real, but these differences cannot be subsumed completely under any one quantitative measure. For example, no single numerical ranking--by measure 08 or by any weighted average of measures--can rank the quality of different programs with precision. However, the evidence for reliability indicates considerable stability in the assessment of quality. For instance, a program that comes out in the first decile of a ranking is quite unlikely to "really" belong in the third decile, or vice versa.
If numerical ranks of programs were replaced by groupings (distinguished, strong, etc.), these groupings again would not fully capture actual differences in quality since there would likely be substantial ambiguity about the borderline between adjacent groups. Furthermore, any attempt at linear ordering (best, next best, . . .) is likely to be inaccurate. Programs of roughly comparable quality may be better in different ways, so that there simply is no one best--as will also be indicated in some of the numerical analyses. However, these difficulties of formulating ranks

should not hide the underlying reality of differences in quality or the importance of high quality for effective doctoral education.

SUMMARY OF THE RESULTS

Displayed in Table 12.1 are the numbers of programs evaluated (bottom line) and the mean values for each measure in the nine humanities disciplines.1 As can be seen, the mean values reported for individual measures vary considerably among disciplines. The pattern of means on each measure is summarized below, but the reader interested in a detailed comparison of the distribution of a measure may wish to refer to tables presented in the preceding chapters.2

Program Size (Measures 01-03). Based on the information provided to the committee by the study coordinator at each university, English programs had, on the average, the largest number of faculty members (31 in December 1980), followed by music (20). English programs graduated the most students (44 Ph.D. recipients in the FY1975-79 period) and had the largest enrollment (62 doctoral students in December 1980). In contrast, classics programs were reported to have an average of only 11 faculty members, 10 graduates, and 17 doctoral students.

Program Graduates (Measures 04-07). The mean fraction of FY1975-79 doctoral recipients who as graduate students had received some national fellowship or training grant support (measure 04) ranges from .12 for graduates of music programs to .36 for graduates in linguistics. With respect to the median number of years from first enrollment in a graduate program to receipt of the doctorate (measure 05), graduates in classics, linguistics, and philosophy typically earned their degrees more than a full year sooner than graduates in any other humanities discipline. In terms of employment status at graduation (measure 06), an average of 67 percent of the Ph.D.
recipients from art history programs reported that they had made firm job commitments by the time they had completed requirements for their degree, contrasted with 48 percent of the program graduates in French. A mean of 35 percent of the art history graduates reported that they had made firm commitments to take positions in Ph.D.-granting institutions (measure 07), while only 19 percent of those in French had made such plans.

Survey Results (Measures 08-11). Differences in the mean ratings derived from the reputational survey are small. In all nine disciplines the mean rating of scholarly quality of program faculty

1 See Table 2.1 for a description of each of the measures and the units in which values of a measure are reported.

2 The second table in each of the nine earlier chapters presents the standard deviation and decile values for each measure.

TABLE 12.1  Mean Values for Each Program Measure, by Discipline

                  Art                                        Linguis-         Philos-
                  History  Classics  English  French  German   tics    Music   ophy   Spanish

Program Size
  01                13        11        31      11       9      14      20      14      10
  02                18        10        44      15      13      19      26      18      13
  03                33        17        62      20      15      34      42      29      24

Program Graduates
  04               .32       .28       .20     .26     .28     .36     .12     .27     .24
  05               9.3       7.7       9.1     9.2     8.9     7.9    10.0     7.9     9.0
  06               .67       .58       .57     .48     .51     .62     .64     .57     .60
  07               .35       .32       .20     .19     .25     .28     .24     .25     .27

Survey Results
  08               2.7       2.9       2.5     2.6     2.9     2.8     2.8     2.6     2.7
  09               1.5       1.6       1.5     1.6     1.7     1.6     1.6     1.5     1.6
  10               1.1        .9       1.0     1.0      .9     1.1     1.1     1.1     1.0
  11               1.1       1.2        .9     1.0     1.2     1.2     1.0     1.1     1.1

University Library
  12                .7       1.0        .2      .5      .5      .8      .6      .3      .3

Total Programs      41        35       106      58      48      35      53      77      69

(measure 08) is slightly below 3.0 ("good"), and programs were judged to be, on the average, a bit below "moderately" effective (2.0) in educating research scholars/scientists (measure 09). In the opinions of the survey respondents, there has been "little or no change" (approximately 1.0 on measure 10) in the last five years in the overall quality of programs. The mean rating of an evaluator's familiarity with the work of program faculty (measure 11) is close to 1.0 ("some familiarity") in every discipline--about which more will be said later in this chapter.

University Library (Measure 12). Measure 12, based on a composite index of the sizes of the library at the university in which a program resides,3 is calculated on a scale from -2.0 to 3.0, with means ranging from .2 in English to .8 in linguistics and 1.0 in classics. These differences may be explained, in large part, by the number of programs evaluated in each discipline. In the disciplines with the fewest doctoral programs (classics and linguistics), the programs included are typically found in the larger institutions, which are likely to have high scores on the library size index. Ph.D. programs in English are found in a much broader spectrum of universities that includes the smaller institutions as well as the larger ones.

CORRELATIONS AMONG MEASURES

Relations among the program measures are of intrinsic interest and are relevant to the issue of validity of the measures as indices of the quality of a research-doctorate program. Measures that are logically related to program quality are expected to be related to each other. To the extent that they are, a stronger case might be made for the validity of each as a quality measure.

A reasonable index of the relationship between any two measures is the Pearson product-moment correlation coefficient. A table of correlation coefficients of all possible pairs of measures is presented in each of the nine preceding chapters.
This chapter presents selected correlations to determine the extent to which coefficients are comparable in the nine disciplines. Special attention is given to the correlations involving the number of FY1975-79 program graduates (measure 02) and the survey rating of the scholarly quality of program faculty (measure 08). These two measures have been selected because of their relatively high correlations with several other measures. Readers interested in correlations other than those presented in Tables 12.2 and 12.3 may refer to the third table in each of the preceding nine chapters.

3 The index, derived by the Association of Research Libraries, reflects a number of different measures, including number of volumes, fiscal expenditures, and other factors relevant to the size of a university library. See the description of this measure presented in Appendix D.
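For readers who wish to reproduce coefficients of this kind from raw program-level data, the Pearson product-moment correlation can be computed directly. The sketch below uses entirely hypothetical values for measures 02 and 08 (the study's program-level data are not reproduced here); only the formula itself comes from standard statistical practice.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical programs in one discipline: number of FY1975-79 graduates
# (measure 02) paired with mean faculty-quality rating (measure 08).
graduates = [44, 30, 25, 18, 12, 9, 6]
rating_08 = [4.6, 4.1, 3.8, 3.0, 2.6, 2.1, 1.8]

print(round(pearson_r(graduates, rating_08), 2))  # strongly positive
```

A coefficient near +1 indicates that larger programs tend to receive higher ratings, the pattern the tables below document across most of the nine disciplines.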

Correlations with Measure 02. Table 12.2 presents the correlations of measure 02 with each of the other measures used in the assessment. As might be expected, correlations of this measure with the other two measures of program size--number of faculty and doctoral student enrollment--are reasonably high in all nine disciplines. Of greater interest are the strong positive correlations in many disciplines between measure 02 and measures derived from either reputational survey ratings or university library size. The coefficients describing the relationship of measure 02 with measure 12 are greater than .40 in all disciplines except linguistics and music. This result is not surprising, of course, since one might expect the larger programs to be located in the larger universities, which are likely to have libraries of considerable size. The correlations of measure 02 with measures 08, 09, and 11 are even stronger in most disciplines. It is quite apparent that the programs that received high survey ratings and with which evaluators were more likely to be familiar were also ones that had larger numbers of graduates. Although the committee gave serious consideration to presenting an alternative set of survey measures that were adjusted for program size, a satisfactory algorithm for making such an adjustment was not found. In attempting such an adjustment on the basis of the regression of survey ratings on measures of program size, it was found that some exceptionally large programs appeared to be unfairly penalized and that some very small programs received unjustifiably high adjusted scores.

Correlations with Measure 08. Table 12.3 shows the correlation coefficients for measure 08, mean rating of the scholarly quality of program faculty, with each of the other variables. The correlations of measure 08 with measures of program size (01, 02, and 03) are significantly positive for all of the humanities disciplines except music.
Not surprisingly, the larger the program, the more likely its faculty is to be rated high in quality.

Correlations of measure 08 with measure 04, fraction of students with national fellowship awards, are .30 or higher in only four disciplines: English, linguistics, music, and Spanish. For programs in the biological and social sciences, the corresponding coefficients (reported in a subsequent volume) are found to be greater, typically in the range .40 to .70. The lower correlations in the humanities may be primarily explained by the smaller number of national fellowships available in these disciplines.

Correlations of rated faculty quality with measure 05, shortness of time from matriculation in graduate school to award of the doctorate, are positive in all nine humanities disciplines. Although the coefficients are not as high as those pertaining to program size (discussed above), they suggest that those programs producing graduates in shorter periods of time tended to receive higher survey ratings. This finding is surprising in view of the smaller correlations in these disciplines between measures of program size and shortness of time-to-Ph.D. It seems there is a tendency for programs that produce doctoral graduates in a shorter time span to have more highly rated faculty, and this tendency is relatively independent of the number of faculty.
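The size adjustment the committee considered and rejected can be sketched as an ordinary least-squares regression of the survey rating on a program-size measure, with the residual taken as the "size-adjusted" rating. The data and the simple linear specification below are hypothetical illustrations, not the committee's actual procedure; they merely show how extreme programs get distorted under such a scheme.

```python
def ols_fit(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# Hypothetical program-level values: measure 02 (graduates), measure 08 (rating).
graduates = [60, 44, 30, 18, 12, 9, 4]
rating_08 = [4.2, 4.6, 3.8, 3.0, 2.6, 2.1, 2.4]

a, b = ols_fit(graduates, rating_08)

# "Size-adjusted" rating = residual from the fitted line.  A very large program
# whose rating falls slightly below the line is pushed down, while a very small
# program slightly above it is pushed up -- the distortion at the extremes of
# program size that led the committee to drop this kind of adjustment.
adjusted = [r - (a + b * g) for g, r in zip(graduates, rating_08)]
for g, r, res in zip(graduates, rating_08, adjusted):
    print(f"size={g:3d}  rating={r:.1f}  adjusted={res:+.2f}")
```

Because the residuals are measured relative to a line fitted through all programs, any program whose size is atypical for its rating dominates the fit, which is one way to read the committee's observation about unfair penalties at both extremes.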

TABLE 12.2  Correlations of the Number of Program Graduates (Measure 02) with Other Measures, by Discipline

            Art                                        Linguis-         Philos-
            History  Classics  English  French  German   tics    Music   ophy   Spanish

Program Size
  01          .72       .63      .65     .40     .44     .57     .54     .36     .46
  03          .68       .58      .70     .67     .32     .74     .61     .50     .53

Program Graduates
  04         -.14      -.05      .01    -.14    -.24    -.10    -.02    -.02     .30
  05         -.22      -.03      .21     .06     .03    -.34    -.03     .13    -.27
  06          .33       .21      .02    -.11     .08     .03     .12     .19     .08
  07          .13       .25      .21     .07    -.05     .05    -.17     .28     .10

Survey Results
  08          .76       .66      .68     .64     .58     .50     .12     .42     .42
  09          .74       .72      .66     .67     .66     .53     .13     .45     .48
  10         -.06       .07      .19     .02     .12    -.30     .08    -.23    -.13
  11          .75       .61      .69     .63     .52     .49     .17     .43     .37

University Library
  12          .49       .44      .59     .51     .51     .12     .12     .49     .42

TABLE 12.3  Correlations of the Survey Ratings of Scholarly Quality of Program Faculty (Measure 08) with Other Measures, by Discipline

            Art                                        Linguis-         Philos-
            History  Classics  English  French  German   tics    Music   ophy   Spanish

Program Size
  01          .69       .81      .50     .62     .51      --    -.02     .38     .42
  02          .76       .66      .68     .64     .58     .50     .12     .42     .42
  03           --        --       --      --      --      --      --      --      --

Program Graduates
  04           --        --       --      --      --      --      --      --      --
  05           --        --       --      --      --      --      --      --      --
  06           --        --       --      --      --      --      --      --      --
  07          .08       .64      .54     .38     .50     .57     .28     .61     .15

Survey Results
  09           --        --       --      --      --      --      --      --      --
  10          .31       .28       --     .45     .60     .29     .16     .20     .28
  11           --        --       --      --      --      --      --      --      --

University Library
  12           --        --       --      --      --     .23      --      --      --

Correlations of ratings of faculty quality with measure 06, the fraction of program graduates with definite employment plans, are moderately high in linguistics, classics, and philosophy. In every discipline except art history, the correlation of measure 08 is higher with measure 07, the fraction of graduates having agreed to employment at a Ph.D.-granting institution. These coefficients are .50 or greater in classics, philosophy, linguistics, English, and German.

The correlations of measure 08 with measure 09, rated effectiveness of doctoral education, are uniformly very high, at or above .96 in every discipline. This finding is consistent with results from the Cartter and Roose-Andersen studies.4 The coefficients describing the relationship between measure 08 and measure 11, familiarity with the work of program faculty, are also very high, ranging from .93 to .98. In general, evaluators were more likely to have high regard for the quality of faculty in those programs with which they were most familiar. That the correlation coefficients are as large as observed may simply reflect the fact that "known" programs tend to be those that have earned strong reputations.

Correlations of ratings of faculty quality with measure 10, ratings of perceived improvement in program quality, are much smaller but still positive in all nine disciplines. The highest coefficients are found for programs in German (.60) and French (.45). One might have expected that a program judged to have improved in quality would have been somewhat more likely to receive high ratings on measure 08 than would a program judged to have declined--thereby imposing a small positive correlation between these two variables.

High correlations are also observed in most disciplines between measure 08 and measure 12 (university library size). With the exception of linguistics these coefficients are .50 or greater in all disciplines.
It should be noted that the correlations between measure 08 and measure 12 are generally noticeably higher in the humanities disciplines than they are in science and engineering disciplines.

Despite the appreciable correlations between reputational ratings of quality and program size measures, the functional relations between the two probably are complex. If there is a minimum size for a high-quality program, this size is likely to vary from discipline to discipline. Increases in size beyond the minimum may represent more high-quality faculty, or a greater proportion of inactive faculty, or a faculty with heavy teaching responsibilities. In attempting to select among these alternative interpretations, a single correlation coefficient provides insufficient guidance.

Nonetheless, certain similarities across disciplines may be seen in the correlations among the measures. High correlations consistently appear among measures 08, 09, and 11 from the reputational survey, and these measures also are prominently related to program size (measures 01, 02, and 03)--except in music--and to library size (measure 12)--except in linguistics. These results show that for most disciplines the

4 Roose and Andersen, p. 19.

reputational rating measures (08, 09, and 11) tend to be associated with program size and with another correlate of size: university library holdings. Also, for most disciplines the reputational measures 08, 09, and 11 tend to be positively related to shortness of time-to-Ph.D. (measure 05) and to employment prospects of program graduates (especially measure 07).

ANALYSIS OF THE SURVEY RESPONSE

Measures 08-11, derived from the reputational survey, may be of particular interest to many readers since measures of this type have been the most widely used (and frequently criticized) indices of quality of graduate education. In designing the survey instrument for this assessment the committee made several changes in the form that had been used in the Roose-Andersen study. The modifications served two purposes: to provide the evaluators a clearer understanding of the programs that they were asked to judge and to provide the committee with supplemental information for the analysis of the survey response.

One change was to restrict to 50 the number of programs that any individual was asked to evaluate--in art history, classics, German, and linguistics, evaluators were asked to consider all programs (except their own) since there were fewer than 50 in the total set being evaluated. Probably the most important change was the inclusion on the survey form of lists of names and ranks of individual faculty members involved in the research-doctorate programs to be evaluated, together with the number of doctoral degrees awarded in the previous five years. Ninety percent of the evaluators were sent forms with faculty names and numbers of degrees awarded; the remaining 10 percent were given forms without this information, so that an analysis could be made of the effect of this modification on survey results. Another change was the addition of a question concerning an evaluator's familiarity with each of the programs.
In addition to providing an index of program recognition (measure 11), the inclusion of this question permits a comparison between the ratings furnished by individuals who had considerable familiarity with a particular program and the ratings by those not as familiar with the program. Each evaluator was also asked to identify his or her own institution of highest degree and current field of specialization. This information enables us to compare, for each program, the ratings furnished by alumni of that institution with the ratings by other evaluators, as well as to examine differences in the ratings supplied by evaluators in certain specialty fields.

Before examining factors that may have influenced the survey results, some mention should be made of the distributions of responses to the four survey items and the reliability (consistency) of the ratings. As can be seen from Table 12.4, the response distribution for each survey item does not vary greatly from discipline to discipline. For example, in judging the scholarly quality of faculty (measure 08), survey respondents in each discipline rated between 6 and 11 percent of the programs as being "distinguished" and between 3 and 5 percent as "not sufficient for doctoral education." In evaluat-

TABLE 12.4  Distribution of Responses to Each Survey Item, by Discipline

                               Art                                        Linguis-         Philos-
Survey Measure        Total  History  Classics  English  French  German    tics    Music    ophy   Spanish

08 SCHOLARLY QUALITY OF PROGRAM FACULTY
Distinguished           8.1    11.0     11.0      6.9     6.9     8.5      9.6     9.3     7.6     6.5
Strong                 17.8    15.5     24.1     15.1    16.6    23.9     20.6    19.7    15.5    17.4
Good                   23.3    23.3     23.7     20.6    23.5    26.8     25.1    19.7    22.5    26.1
Adequate               18.6    17.4     17.1     17.5    20.3    17.8     18.8      --      --      --
Marginal                 --      --       --       --      --      --       --      --      --      --
Not Sufficient for
  Doctoral Education    4.3     5.4      3.2      4.9     3.7     3.3      3.5     4.5     5.0     3.7
Don't Know Well
  Enough to Evaluate   18.3    18.4     11.6     25.3    19.3    10.8     13.0    26.2    19.1    13.2
TOTAL                 100.0   100.0    100.0    100.0   100.0   100.0    100.0   100.0   100.0   100.0

09 EFFECTIVENESS OF PROGRAM IN EDUCATING RESEARCH SCHOLARS/SCIENTISTS
Extremely Effective     8.1    10.6      9.6      6.7     7.1    11.1     10.0     9.3     6.7     6.9
Reasonably Effective   31.3    31.2     36.6     26.4    32.9    40.3     32.9    29.1    24.8    36.3
Minimally Effective    17.1    19.5     19.9     13.9    16.7    18.9     16.3    18.0    14.6    20.8
Not Effective           4.8     6.5      5.2      4.2     4.0     4.6      4.5     5.9     5.9     3.6
Don't Know Well
  Enough to Evaluate   38.8    32.2     28.7     48.8    39.4    25.1     36.4    37.7    48.0    32.4
TOTAL                 100.0   100.0    100.0    100.0   100.0   100.0    100.0   100.0   100.0   100.0

10 CHANGE IN PROGRAM QUALITY IN LAST FIVE YEARS
Better                 11.2    14.9     10.9      8.3    11.3     9.5     16.6     9.0    11.8    11.9
Little or No Change    31.0    31.2     34.0     21.4    31.5    41.9     31.7    28.9    32.7    34.5
Poorer                  9.9     8.9     15.2      7.3    10.0    14.6     10.4     6.5     8.0    12.0
Don't Know Well
  Enough to Evaluate   47.9    45.0     39.9     63.0    47.2    33.9     41.3    55.6    47.6    41.5
TOTAL                 100.0   100.0    100.0    100.0   100.0   100.0    100.0   100.0   100.0   100.0

11 FAMILIARITY WITH WORK OF PROGRAM FACULTY
Considerable           30.2    33.2     39.7     20.4    28.1    36.8     37.9    26.5    30.7    33.0
Some                   43.8    40.4     43.7     44.2    44.6    45.8     41.3    41.6    42.6    46.9
Little or None         25.1    24.5     16.3     34.5    26.9    16.7     20.4    29.7    26.2    18.6
No Response              .9     1.9       .4       .9      .3      .7       .4     2.3      .5     1.4
TOTAL                 100.0   100.0    100.0    100.0   100.0   100.0    100.0   100.0   100.0   100.0

NOTE: For survey measures 08, 09, and 10 the "Don't Know" category includes a small number of cases for which the respondents provided no response to the survey item.

ing the effectiveness in educating research scholars, they rated 7 to 11 percent of the programs as being "extremely effective" and approximately 4 to 7 percent as "not effective."

Of particular interest in this table are the frequencies with which evaluators failed to provide responses to measures 08, 09, and 10. Approximately 18 percent of the total number of evaluations requested for measure 08 were not furnished because survey respondents in the humanities felt that they were not familiar enough with a particular program to evaluate it. The corresponding percentages of "don't know" responses for measures 09 and 10 are considerably larger--39 and 48 percent, respectively--suggesting that survey respondents found it more difficult (or were less willing) to judge program effectiveness and change than to judge the scholarly quality of program faculty.

The large fractions of "don't know" responses are a matter of some concern. However, given the broad coverage of research-doctorate programs, it is not surprising that faculty members would be unfamiliar with many of the less distinguished programs. As shown in Table 12.5, survey respondents in each discipline were much more likely to furnish evaluations for programs with high reputational standing than they were for programs of lesser distinction. For example, for humanities programs that received mean ratings of 4.0 or higher on measure 08, as many as 97 percent of the evaluations requested on measure 08 were provided; 89 and 79 percent, respectively, were provided on measures 09 and 10. In contrast, the corresponding response rates for programs with mean ratings below 2.0 are much lower--66, 43, and 32 percent response on measures 08, 09, and 10, respectively.

Of great importance to the interpretation of the survey results is the reliability of the response. How much confidence can one have in the reliability of a mean rating reported for a particular program?
In the second table in each of the preceding nine chapters, estimated standard errors associated with the mean ratings of every program are presented for all four survey items (measures 08-11). While there is some variation in the magnitude of the standard errors reported in every discipline, they rarely exceed .15 for any of the four measures and typically range from .05 to .10. For programs with higher mean ratings the estimated errors associated with these means are generally smaller--a finding consistent with the fact that survey respondents were more likely to furnish evaluations for programs with high reputational standing.

The "split-half" correlations5 presented in Table 12.6 give an indication of the overall reliability of the survey results in each discipline and for each measure. In the derivation of these correlations individual ratings of each program were randomly divided into two groups (A and B), and a separate mean rating was computed for each group. The last column in Table 12.6 reports the

5 For a discussion of the interpretation of "split-half" coefficients, see Robert L. Thorndike and Elizabeth Hagan, Measurement and Evaluation in Psychology and Education, John Wiley & Sons, New York, 1969, pp. 182-185.
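The split-half procedure described above can be imitated on simulated data. The sketch below generates hypothetical rater scores (the survey's actual ratings are not reproduced; the 60 programs, 40 raters, and noise level are all assumptions), splits each program's ratings at random into halves A and B, and correlates the two sets of half-sample means across programs.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

random.seed(12)

# Simulate 60 programs, each with a latent quality and 40 noisy 0-5 ratings.
half_a_means, half_b_means = [], []
for _ in range(60):
    quality = random.uniform(1.0, 5.0)
    ratings = [min(5.0, max(0.0, random.gauss(quality, 0.8))) for _ in range(40)]
    random.shuffle(ratings)                  # random division into groups A and B
    group_a, group_b = ratings[:20], ratings[20:]
    half_a_means.append(sum(group_a) / 20)
    half_b_means.append(sum(group_b) / 20)

split_half = pearson_r(half_a_means, half_b_means)
print(round(split_half, 2))  # high when mean ratings are stable across halves
```

When the rater noise is small relative to the true spread of program quality, the two half-sample means agree closely and the coefficient approaches 1.0, which is the sense in which the report's high split-half values indicate reliable mean ratings.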

FIGURE 12.2 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--30 programs in classics. (r = .89)

FIGURE 12.3 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--82 programs in English language & literature. (r = .91)

FIGURE 12.4 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--49 programs in French language & literature.

FIGURE 12.5 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--36 programs in German language & literature. (r = .91)

FIGURE 12.6 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--26 programs in linguistics.

FIGURE 12.7 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--34 programs in music. (r = .94)

FIGURE 12.8 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--58 programs in philosophy. (r = .86)

FIGURE 12.9 Mean rating of scholarly quality of faculty (measure 08) versus mean rating of faculty in the Roose-Andersen study--52 programs in Spanish language & literature. (r = .86)

English, German, and music the correlation coefficients are greater than .90. The lowest coefficient found is for programs in linguistics (.78). The extraordinarily high correlations found in most of these disciplines may suggest to some readers that reputational standings of programs have changed very little in the last decade. However, differences are apparent for some institutions. Also, one must keep in mind that the correlations are based on the reputational ratings of only three-fourths of the programs evaluated in this assessment in these disciplines and do not take into account the emergence of many new programs that did not exist or were too small to be rated in the Roose-Andersen study.

FUTURE STUDIES

One of the most important objectives in undertaking this assessment was to test new measures not used extensively in past evaluations of graduate programs. Although the committee believes that it has been successful in this effort, much more needs to be done. First and foremost, studies of this kind should be extended to cover other types of programs and other disciplines not included in this effort. As a consequence of budgetary limitations, the committee had to restrict its study to 32 disciplines, selected on the basis of the number of doctorates awarded in each. Among those omitted were programs in Russian, which was included in the Roose-Andersen study; a multidimensional assessment of research-doctorate programs in this and many other important disciplines would be of value. Consideration should also be given to embarking on evaluations of programs offering other types of graduate and professional degrees. As a matter of fact, plans for including master's-degree programs in this assessment were originally contemplated, but because of a lack of available information about the resources and graduates of programs at the master's level, it was decided to focus on programs leading to the research doctorate.
Perhaps the most debated issue the committee has had to address concerned which measures should be reported in this assessment. In fact, there is still disagreement among some of its members about the relative merits of certain measures, and the committee fully recognizes a need for more reliable and valid indices of the quality of graduate programs. First on a list of needs is more precise and meaningful information about the product of research-doctorate programs--the graduates. For example, what fraction of the program graduates have gone on to be productive scholars--either in the academic setting or outside the university environs? What fraction have gone on to become outstanding scholars--as measured by receipt of major prizes, membership in academies, and other such distinctions? How do program graduates compare with regard to their publication records? Also desired might be measures of the quality of the students applying for admittance to a graduate program (e.g., Graduate Record Examination scores, undergraduate grade point averages). If reliable data of this sort were made available, they might provide a useful index of the

reputational standings of programs, from the perspective of graduate students.

A number of alternative measures relevant to the quality of program faculty were considered by the committee but not included in the assessment because of the associated difficulties and costs of compiling the necessary data. For example, what fraction of the program faculty were invited to present papers at national meetings? What fraction had been elected to prestigious organizations/groups in their field? What fraction had received senior fellowships and other awards of distinction? In addition, it would be highly desirable to compile information about research awards received by faculty members in humanities programs.
