Systems for State Science Assessment
cannot serve all purposes equally well, there need not be a one-to-one correspondence between the number of tests and the number of purposes, provided the state is cognizant of the trade-offs inherent in using an assessment to serve multiple purposes. Evidence that an assessment is valid for one purpose is insufficient to establish the validity of its use for another purpose (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999, p. 17). Some evidence exists (Niemi, 1996; Baker, 1997; Baker, Abedi, Linn, and Niemi, 1996) that tests can be designed to yield useful information for various purposes at different levels of the system when the results are reported in different ways. Baker (2003) suggests that system-oriented measures can be turned to instructional improvement purposes in this way. This would be possible if evidence were collected to support the validity of each purpose, or if the different purposes were addressed by aggregating and reporting results in different ways.
The Nebraska STARS (Buckendahl, Impara, and Plake, 2002) and the Maine MeCAS2 programs have used this approach. In these programs, results of local assessments whose primary purpose is to support teaching and learning in the classroom are combined with each other and with state-level assessments to support judgments about achievement of the state standards. These judgments are useful to both teachers and policy makers. The programs are built on a strong foundation of professional development that supports teachers in developing technically sound assessments. In each of these states, considerable attention has been paid to establishing the validity of both classroom and district portions of the assessment system for each intended purpose. However, concerns about the comparability of information across districts remain, and further research and experience will be needed to determine how well such strategies will work for different purposes.
At What Level the Assessment Is Administered
Another aspect of selecting suitable assessment approaches is the level at which the assessment should be administered to maximize its usefulness and provide results that support the desired inferences. In a system of assessments, with many ways to implement an assessment strategy, decisions should be based on both the construct to be measured and the level at which the most accurate picture of student learning can be obtained. For example, as discussed in Chapter 3, if a detailed picture of students’ abilities to conduct a scientific investigation is needed, this information may be captured best by the teacher while students are actively engaged in inquiry