This study focuses on tests with high stakes for individual students. The committee recognizes that accountability for students is related in important ways to accountability for educators, schools, and school districts. Indeed, the use of tests for accountability of educators, schools, and school districts has significant consequences for individual students, for example, by changing the quality of instruction or affecting school management and budgets. Such indirect effects of large-scale assessment are worth studying in their own right. By focusing on the congressional interest in high-stakes decisions about individual students, this report does not address accountability at those other levels, apart from the issue of participation of all students in large-scale assessments.
The use of tests in decisions about student tracking, promotion, and graduation is intended to serve educational policy goals, such as setting high standards for student learning, raising student achievement-levels, ensuring equal educational opportunity, fostering parental involvement in student learning, and increasing public support for the schools. The committee recognizes that test use may have negative consequences for individual students even while serving important social or educational policy purposes. The development of a comprehensive testing policy should therefore be sensitive to the balance among the individual and collective benefits and costs of various uses of tests.
Determining whether high-stakes testing of students produces better overall educational outcomes requires that its potential benefits be weighed against its potential unintended negative consequences. Thus, the value of tests should also be weighed against the use of other information in making high-stakes decisions about students. Tracking, promotion, and graduation decisions will be made with or without tests.
The committee adopted three principal criteria, developed from earlier work by the National Research Council, for determining whether a test use is appropriate:
measurement validity—whether a test is valid for a particular purpose, and whether it accurately measures the test taker's knowledge in the content area being tested;