exploring across programs, such as the challenges related to technical quality (e.g., reliability, fairness, and validity), as discussed in Chapter 2. Tests with innovative characteristics (like any tests) send signals to educators, students, and parents about the learning that is most valued in the system, and in many cases innovative testing has led to changes in practice. Testing also has costs, including a burden in both time and resources, which are likely to be different for different innovative assessments. Testing also provokes reactions from stakeholders, particularly politicians.
In 1990, performance and other kinds of alternative assessments were popular in the states, with 24 of them using, developing, or exploring possibilities for applying one of these approaches (Stecher and Hamilton, 2009). Today they are much less prevalent. States have moved away from these approaches, primarily for political and budget reasons, but a look at several of the most prominent examples highlights some lessons, as Brian Stecher explained in a synopsis. Individuals who had experience with several of the programs added their perspectives.
Vermont was a pioneer in innovative assessment, having implemented a portfolio-based program in writing and mathematics in 1991 (Stecher and Hamilton, 2009). The program was designed both to provide achievement data that would permit comparison of schools and districts and to encourage instructional improvements. Teachers and students in grades 4 and 8 collected work to represent specific accomplishments, and these portfolios were complemented by a paper-and-pencil test.
Early evaluations raised concerns about scoring reliability and the validity of the portfolio as an indicator of school quality (Koretz et al., 1996). After efforts to standardize scoring rubrics and criteria for selecting student work, reliability improved, but evaluators concluded that the scores were not accurate enough to support judgments about school quality.
The researchers confirmed that teachers altered their practice: for example, they focused more on problem solving in mathematics. Many schools began using portfolios in several subjects because they found them useful. However, some critics observed that teachers did not uniformly demonstrate a clear understanding of the intended criteria for selecting student work, and others commented that teachers began overemphasizing the specific strategies that were included on the standardized rubrics. Costs were high—$13 per student just for scoring. The program was discontinued in the late 1990s, primarily because of concerns about the technical quality of the scores.