Suggested Citation:"Appendix A: Selected Student Evaluation Instruments." National Research Council. 2003. Evaluating and Improving Undergraduate Teaching in Science, Technology, Engineering, and Mathematics. Washington, DC: The National Academies Press. doi: 10.17226/10024.

Appendix A
Selected Student Evaluation Instruments

TYPES OF STUDENT EVALUATION INSTRUMENTS

Current Students: End-of-Course Questionnaires

Questionnaires administered at the end of the term have long been widely used to elicit students’ opinions about individual courses or instructors (Seldin, 1998). Studies on the reliability and validity of these types of student ratings have been undertaken for more than 70 years (Centra, 1993). Students are in a unique position to comment on their satisfaction with a course and the impact of the instruction on their own learning. However, they are not subject matter experts, and therefore are not in a position to make judgments about the currency or accuracy of course content. In addition, research has shown that ratings by students are sometimes influenced by their level of motivation for taking the course, attitude toward the course or the instructor, and needs or contextual variables (e.g., whether the course is required). Findings from research on the use of student questionnaires suggest that when these instruments are used, the results should be compared with data from student questionnaires in similar courses.

Those who design or use data from student questionnaires must be careful to distinguish instruments that ask students to evaluate courses from those that ask them to evaluate the instruction or the instructor. Forms are often constructed to ask students to rate various aspects of a course and then to provide a rating for the professor’s performance. Use of such data for evaluating teaching effectiveness becomes problematic if most of the questions asked of students focus on components of the course itself, such as the usefulness of the textbook or amount of material covered.

Current Students: Interviews

Interviewing students can provide rich, in-depth information about their responses to courses and instructors. When used appropriately, such interviews are usually either highly structured (following a specific set of questions and protocol), semistructured (with a few general items), or unstructured (e.g., “Tell me about this class”). Interviews can probe details and explore aspects of a course and the instructor’s role in it in ways that written questionnaires cannot. However, interviewing sufficient numbers of students to obtain an accurate picture of the instructor’s teaching and interpreting the results can require a great deal of time, rendering this approach somewhat impractical. Research also indicates that interviews are most helpful when they are used to provide feedback for improving teaching rather than for summative evaluation. Information garnered from interviews also can be more helpful to the instructor when the interviewing is done by an instructional improvement specialist, if available, or a trusted colleague (Centra, 1993).

Current Students: Measures of Learning

An extremely useful and increasingly common approach to evaluating teaching effectiveness is to measure students’ knowledge or skills at the beginning of a course or unit of the course and again after some body of material has been covered in class. Instructors can then observe and quantify the amount of improvement and draw inferences about the instructor’s effectiveness in helping students learn the subject matter. For measures of student learning to be considered valid and reliable, however, considerable effort is required to develop pre- and post-learning tests that actually measure the kind of learning desired. In addition, changes observed in students’ learning and performance cannot be attributed solely to the effectiveness of an individual instructor. Many factors, including students’ ability and motivation to learn and even their health status when taking either examination, can also influence the outcomes.
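The pre/post comparison described above can be illustrated with a minimal sketch. It assumes each student has one score on a pre-test and one on a post-test on the same scale; the function name `learning_gains` and the sample scores are hypothetical and do not come from the report. The sketch only quantifies the observed change. As the paragraph above cautions, that change cannot be attributed solely to the instructor.

```python
from statistics import mean


def learning_gains(pre_scores, post_scores):
    """Summarize paired pre/post test scores (hypothetical helper).

    Both lists must be in the same student order and on the same scale.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired per student")
    # Raw gain for each student: post-test score minus pre-test score.
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return {
        "mean_pre": mean(pre_scores),
        "mean_post": mean(post_scores),
        "mean_gain": mean(gains),
        # Count of students whose score went up at all.
        "improved": sum(1 for g in gains if g > 0),
    }


# Illustrative scores for three students (made-up data).
result = learning_gains([40, 55, 70], [60, 65, 70])
print(result)
```

A raw difference is the simplest possible summary; it says nothing about whether the two tests actually measure the kind of learning desired, which is the harder problem the text identifies.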

Indirect measures of student learning can be obtained through questionnaires that ask students to assess their own achievement (e.g., “How much have you learned from this course?”). Some research (e.g., Pike, 1995) has shown that students’ answers to such questions are correlated with their performance on end-of-course tests.

Another useful approach is for the instructor to evaluate student learning throughout the term. Instructors can use the information obtained from these regular assessments of student learning to improve their teaching and make midcourse corrections in the approaches they are using. Faculty members can thus conduct their own classroom research, gathering measures of student learning to improve their teaching (Brookfield, 1995; National Institute for Science Education, 2001b). An instructor’s use of such approaches, the range of test instruments employed (e.g., short-answer and essay questions, computer simulations, and laboratory-based problems, in addition to multiple-choice and similar kinds of questions) and the ways in which the instructor responds to indicators of student learning can be useful measures of teaching effectiveness.

Instructors also can benefit from knowing whether students who have taken their courses have mastered concepts and skills that will be needed for subsequent, higher level courses. Thus, questions about specific concepts the students will have been expected to learn can be included in pre/post-testing. Alternatively, as part of their evaluation of program effectiveness, academic departments can develop assessment instruments that can be used to examine whether students have learned well the knowledge and skills they need to move through a vertically structured departmental curriculum.

GUIDELINES FOR THE USE OF STUDENT EVALUATIONS

Having examined the research literature and practices in several different types of institutions of higher education, the committee offers here guidelines for the use of student evaluations, particularly in making decisions about a faculty member’s professional life. Centra (1993: especially 89–93) offers a detailed discussion of the issues involved; the suggestions offered below are based in part on that analysis.

Make clear to faculty and students how results of student evaluations will be used. Faculty members, administrators, and students need to understand both how the results will be used and who will have access to them.

Use student evaluation as only one piece of relevant information from several sources. Because student evaluations represent student views only, other sources of information (colleagues, self-reports, evidence of student learning) must be considered. Student evaluations are relatively easy to obtain, but that should not result in giving them undue weight. Note that when multiple sources of evaluation data are used, consensus must be reached on how each source will be weighted when making decisions about teaching effectiveness.

Use several sets of evaluation results. For personnel decisions, a pattern of evaluation results derived from different courses taught over more than one semester should be used. Using results from five or more classes is generally best. Also, the results of student evaluations should be compared with a historical record for that class or type of class, if such data are available.
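As a rough illustration of the guideline above, the following sketch checks whether a set of evaluation records covers five or more classes taught over more than one semester. The record layout (dictionaries with `course` and `semester` keys) and the function name `meets_pattern_guideline` are assumptions made for this example, not part of the report.

```python
def meets_pattern_guideline(records, min_classes=5):
    """Return True if the records span enough classes and semesters.

    `records` is a list of dicts, one per class section evaluated,
    each with a "semester" key (hypothetical layout).
    """
    semesters = {r["semester"] for r in records}
    # "Five or more classes" and "more than one semester" per the text.
    return len(records) >= min_classes and len(semesters) > 1


# Made-up history: five sections across two semesters.
history = [
    {"course": "Course A", "semester": "Fall"},
    {"course": "Course B", "semester": "Fall"},
    {"course": "Course A", "semester": "Spring"},
    {"course": "Course C", "semester": "Spring"},
    {"course": "Course B", "semester": "Spring"},
]
print(meets_pattern_guideline(history))
```

The check is deliberately mechanical; the text treats five classes as "generally best" rather than as a hard cutoff, so `min_classes` is left adjustable.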

Have a sufficient number of students evaluate each course. Averaging responses from a sufficient number of students minimizes the effects of a few divergent opinions. Reliability estimates (see Chapter 4 for a definition of reliability as used in psychometrics) are excellent for classes of 25 students or more. In classes with fewer students, it is critical to examine patterns of student responses across a number of classes. Reliability estimates for classes of 15 or more are at an acceptable level. For very large classes, a representative or random sample of students totaling 25 or more can be selected to complete the form. An effort should be made to encourage at least 60 percent of enrolled students to participate in the evaluation, and at least 15–25 questionnaires are needed for results to be considered reliable. If the class has fewer than 10 students, it is best not to summarize the data. For sufficiently large sample sizes, means and standard deviations are used most frequently to summarize data.
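The thresholds in this guideline can be collected into a small sketch. It assumes the cutoffs (10, 15, 25) are applied to the number of returned questionnaires rather than to enrollment, which the text leaves somewhat open. The label for 10 to 14 responses is this example's own reading, since the report does not name that range, and the function name `summarize_ratings` is hypothetical.

```python
from statistics import mean, stdev


def summarize_ratings(ratings, enrolled):
    """Classify one class's returned questionnaires (hypothetical helper).

    `ratings` holds one numeric rating per returned questionnaire;
    `enrolled` is the number of students enrolled in the class.
    """
    n = len(ratings)
    response_rate = n / enrolled if enrolled else 0.0
    if n < 10:
        level = "too few to summarize"
    elif n < 15:
        level = "below acceptable"  # assumption: range not named in the text
    elif n < 25:
        level = "acceptable"
    else:
        level = "excellent"
    summary = {
        "n": n,
        "response_rate": response_rate,
        "meets_60_percent": response_rate >= 0.60,
        "reliability": level,
        "mean": None,
        "sd": None,
    }
    # The text advises against summarizing data for fewer than 10 students.
    if n >= 10:
        summary["mean"] = mean(ratings)
        summary["sd"] = stdev(ratings)
    return summary


# Made-up ratings: 30 questionnaires returned from 40 enrolled students.
print(summarize_ratings([3, 4, 5] * 10, enrolled=40))
```

For classes that fall below the 25-response level, the text's advice is to look at patterns across a number of classes rather than rely on any single summary.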

Consider some course characteristics in interpretations. While any single course variable may not have a great effect, a combination (e.g., small classes, course subject area) could affect a teacher’s mean rating.

Use comparative data. Comparisons among instructors within an institution or, better yet, across a large number of similar institutions can help in interpreting results by minimizing the effects of any skewed distributions.

Do not overestimate small differences. Because student evaluations typically are quantified, there may be a tendency to assign them a precision they do not possess or warrant. A 10-percentile difference between instructors generally does not represent a practical distinction.

For personnel decisions, emphasize global evaluations and estimates of learning. Overall ratings of instruction or of a course tend to correlate highly with measured student achievement, more highly than ratings dealing with different teaching styles and presentation methods. Students’ estimates of their own learning also can be useful and reasonably accurate means of assessing this aspect of teaching effectiveness.

Use standardized procedures for administering forms in class. When results may be used in personnel decisions, standardized procedures are necessary to minimize possible biasing effects. These procedures include having the instructor leave the room and providing consistent information to students about how the data will be used. Departments and institutions should also develop policies to ensure uniform procedures for distributing, collecting, and analyzing standardized forms. Normally, forms are completed anonymously in class. Some schools also require that students either return their evaluation forms to an administrative office individually or give them to a student in the class who is assigned to deliver them. An ideal approach is to use special staff, such as those from the teaching and learning center, to administer and collect rating forms. Another possibility is to use department secretarial staff. Use of student volunteers is least desirable.

Student evaluations are most commonly completed at the end of the course and prior to final exams or grades. They can also be distributed at midsemester to assist in instructional improvement. Another approach is to administer the final examination early and then require students to attend a session where they receive their graded examination and are asked to complete the evaluation form. This approach allows students to review the instructor’s comments on their final examination, making the examination a more important component of the overall learning experience in the course. Having this information and perspective allows students to offer a more complete evaluation of the course. It is important to note, however, that employing this technique may well result in an instructor’s receiving lower evaluations than instructors who distribute the evaluations before administering the final examination. This difference in approaches should be considered in any summative evaluation of a faculty member’s teaching.

Expect those being evaluated to respond to evaluation results. Faculty should have the opportunity to discuss with their department chair or others involved in personnel decisions any circumstances they believe may have affected student evaluations of their teaching. They also should be asked to describe in writing what they were trying to accomplish in the course and how their teaching methods suited those objectives (e.g., Hutchings, 1998). Their written comments should be placed in their official dossier or wherever the student ratings are kept. It also is important to keep in mind that traditional student rating forms often do not reflect an instructor’s effectiveness in less traditional teaching or testing environments.

Limit the use of rating forms. The use of student rating forms may reach a point of diminishing returns. If they are overused, neither students nor instructors will give them the level of attention required for fair evaluation of teaching or continued professional development by the faculty member in question.