interest for this paper. The next two sections discuss conducting research in this area: one covers the type of research that has been done, and the other covers the complexity of conducting research on accountability systems. The accountability part concludes with a section discussing issues and concerns related to researching accountability systems. The third part is on assessment and begins by defining assessment in general as applied in science. This is followed by a section that outlines recent changes in thinking about assessment, including the vision for assessment in the NSES and AAAS Benchmarks. The third and fourth sections present research on the relationship between standards, including but not limited to the NSES and AAAS Benchmarks, and assessments. The third section discusses alignment between standards and assessments, an important procedure for judging the relationship between the two. The fourth section reviews research on the influence of assessment on teachers’ practices and student learning. The fourth part of the paper presents our conclusions and identifies needed research.


A number of initiatives have shaped education over the last decade, both before the NSES and AAAS Benchmarks were written and after they were published. Over this time, accountability emerged as a dominant strategy employed by states and districts to improve education. Since the early 1990s, all 50 states have been engaged in developing education initiatives related to high standards and measurement of student performance that focus accountability on student outcomes. These efforts were spurred early in the decade by concerns about increasingly low student performance, the failure of Title I to close the achievement gap for educationally disadvantaged students, an emphasis on basic skills and low expectations, and a focus on inputs and compliance rather than on academic outcomes. The Improving America’s Schools Act of 1994 (IASA) galvanized state efforts to develop new accountability systems that were meant to address these problems (Goertz, Duffy, and LeFloch, 2001). Over the rest of the decade, states took the lead in fashioning accountability and assessment systems that were based on standards and designed to provide information on student performance outcomes and school progress in addressing learning for all students.

Over the 1990s, all but one state adopted state curriculum standards in an effort to increase educational quality. Had states known of the national standards, these documents would likely have been important factors in outlining what students should know and be able to do to be competent in science and other content areas in a world undergoing significant social, economic, and technological changes. But most of the states were engaged in developing standards prior to the release of the NSES or the publication of the AAAS Benchmarks (Blank and Pechman, 1995). As a consequence, some states left out or put less emphasis on prominent topics included in these policy documents, including the nature of science, history of science, science as inquiry, science and society, and science applications.

Prior to the publication of the NSES and the AAAS Benchmarks, a number of authors were emphasizing the need for alternative forms of assessment and higher expectations for student learning in science (Resnick, 1993; Wiggins, 1989; Forseth, 1992; Baron, 1991; Doran, Reynolds, Camplin, and Hejaily, 1992; Hoffman and Stage, 1993; Hein, 1991). Counter to these recommendations, the use of standardized, norm-referenced, fill-in-the-blank assessments has increased over the last decade, while the number of large-scale assessments incorporating open-ended activities that would reveal more of students’ underlying thinking has remained the same. Much of this has occurred since the publication of the NSES.

Very little research has specifically examined the influence of the NSES or the AAAS Benchmarks on assessment and accountability, or, in turn, the relation of science assessments or accountability to teachers’ classroom practices. An increasing amount of research is being conducted on large-scale reform in education that frequently incorporates data or information on assessments and accountability. However, much of this research focuses on mathematics and language arts rather than on science. The research that does exist is not extensive, which makes it impossible to establish a causal link between the NSES and the AAAS Benchmarks on the one hand and assessment and accountability practices on the other. At best, research provides a description of practices that are compatible with the view of science education advanced in these standards.

Much of the existing literature addressing assessment and accountability consists of historical analyses,

