There was a clear trend toward greater accountability and use of assessment over the 1990s. Interestingly, the increase in assessment came early in the decade, before the crescendo in state and district accountability systems. By the end of the decade, 46 states had content standards in science, but fewer than half had them for all three grade ranges. Two-thirds of the states had state assessments in science, but there was some evidence that the number of states using alternative forms of assessment more aligned with the national standards, such as performance assessment, actually declined about the time the NSES document was released. Information on the importance states gave to student performance on science assessments in accountability systems was unavailable in the literature reviewed. Most accountability systems held schools accountable for student performance and directed consequences to low-performing schools. Of the one-half of the states that attached a moderate to high level of stakes to student performance on assessments, almost all also distributed the consequences among students, teachers, schools, and districts, a desirable trait in an accountability system. It is likely that assessment and accountability in science will continue to be given less emphasis under the new federal legislation, “No Child Left Behind,” which does not require states to assess in science until the 2007-2008 school year.

Determining the influence of the NSES and AAAS Benchmarks on assessments and accountability systems is confounded by a number of other initiatives and developments that coincided with the publication of these documents. The assessment practices and targets for assessments portrayed in the NSES and Benchmarks are compatible with current understandings about how students learn and how this learning can be measured. Assessment practices such as using multiple measures or having students write about their understandings are consistent with both teaching for understanding and teaching for inquiry as described in the NSES. Even though a clear link could not be made between the assessment practices used by states and districts and the NSES and the Benchmarks, the research does provide convincing evidence that assessment practices influence both teachers’ practices and subsequent student learning. An increase in formative assessment produces learning gains. This is significant because the emphasis in the NSES and the Benchmarks on teaching for understanding requires assessments that are integral to instruction and continuous, as formative assessment implies. In states that have given high importance to assessment scores, teachers do change their practices somewhat, but not completely, to include more test-like activities in their teaching. However, not all state assessments are fully aligned with state standards, indicating that students of teachers who just “teach for the test” will likely fall short of the full expectations expressed in the standards.

The research review did not directly establish that the NSES and AAAS Benchmarks have influenced accountability and assessment systems. If this link could be established, the existing evidence that assessment and accountability systems influence teachers’ classroom practices and student learning would connect these documents to what happens in classrooms. Our review of the literature and the type of research used in this area did reveal some inadequacies in the available research. What is missing and needed is a comprehensive study of the policies of all 50 states that would reveal the linkages among science standards, science assessment, and science accountability. This study should include systematic analyses of the alignment between state standards and the NSES and Benchmarks. Such a study would provide the missing link by establishing what influence the national science standards documents have had on the state standards. Research also is needed to describe and analyze the full science assessment system being used in states, districts, schools, and classrooms. Such an analysis would describe the full range of content being assessed, to what depth the content is assessed, at what level within the system the content is assessed, and how the information is applied to further learning. It would attend to the different attributes of assessments, including what questions are asked, what responses are elicited, how student responses are scored, how the scores are interpreted, and what is reported. We also did not find any studies related to college placement examinations, another area for further research.

Accountability systems have not stabilized and are still undergoing significant change. These systems also are extremely complex. It is not surprising that definitive research has not been done on how accountability and assessment systems fully work and how these systems are influenced by documents such as the NSES and AAAS Benchmarks. What is clear is the increasing importance of these policy components in education. Science educators who are most interested in curriculum and content can no longer afford to ignore the policy arena. Research that bridges and illuminates the relationship between content standards and policy is essential.

