rely only on multiple-choice items, and could provide annual report cards for schools.
The governor’s leadership was critical in marshalling the support of business and political leaders in the state, in Ferrara’s view. Because the Maryland governor appoints members of the state board of education, who, in turn, appoint the superintendent of public instruction, the result was “a team working together on education reform.” This team was responding to shifting expectations for education nationally, as well as a sense that state and district policy makers and the public were demanding assurances of the benefits of their investment in education. Ferrara recalled that the initial implementation went fairly smoothly, owing in part to concerted efforts by the state superintendent and others to communicate clearly with the districts about the goals for the program and how it would work and to solicit their input.
Most school districts were enthusiastic supporters, but several challenges complicated the implementation. The standards focused on broad themes and conceptual understanding, and it was not easy for test developers to design tasks that would target those domains in a way that was feasible and reliable. The way the domains were described in the standards led to complaints that the assessment did not do enough to assess students’ factual knowledge. The schedule was also exceedingly rapid, with just 11 months between initial planning and the first administration. The assessment burden was great: 9 hours over 5 days for 3rd graders, for example. There were also major logistical challenges posed by the manipulatives needed for the large number of hands-on items.
The manipulatives not only presented logistical challenges but also revealed a bigger one. For example, teachers who had not even been teaching science were asked to lead students through an assessment that included experiments. Teachers were also being asked more generally to change their instruction. The program involved teachers in every phase (task development, scoring, etc.), and Ferrara believes that this was one of the most important ingredients in its early success. As discussed above, there is evidence that teachers changed their practice in response to MSPAP (Koretz et al., 1996; Lane, 1999). Nevertheless, many people in the state began to oppose the program. Criticisms of the content and concern about the lack of individual student scores were the most prominent complaints, and the passage of the No Child Left Behind (NCLB) Act in 2002 made the latter concern urgent. The test was discontinued in that year.
For Ferrara, several key lessons can be learned from the history of MSPAP:
Involving stakeholders in every phase of the process was very valuable, both improving the quality of the program and building political acceptance.