Supporting the Design, Implementation, and Evaluation of State Science Assessment Systems
In this report the committee concludes that a coherent assessment system composed of multiple measures of student achievement is necessary for meeting No Child Left Behind (NCLB) requirements. Moreover, we conclude that such a system is the most effective means for providing decision makers at all levels of the education system with the information they need to support high-quality science education. Throughout this volume we have laid out our reasoning in reaching these conclusions and have provided ideas, not for the creation of an ideal system (which does not exist), but for the creation of systems that change and adapt over time in response to state priorities and circumstances.
As an aid to states in developing, implementing, and supporting assessment systems of high quality as well as for monitoring existing ones, the committee included throughout this report a series of questions to states. These questions represent the committee’s advice on the issues that should be attended to as state science assessment systems are developed and put into practice. The questions, which are recapitulated in Box 9-1, are not intended to be answered with a simple yes or no, but rather to serve as yardsticks against which states can measure their efforts. They also serve as a reminder of the importance of thinking systemically about the design of assessment systems. The committee’s overarching recommendation to states is that they think carefully about these questions and consider how their systems address the issues each one raises.
BOX 9-1
Question 2-1: Does the state take a systems approach to assessment? That is, are assessments at various levels of the system (classroom, school district, state) coherent with each other and built around shared goals for science education and the student learning outcomes described in the state standards?
Question 2-2: Does the state have in place mechanisms for maintaining coherence among its standards, assessments, curricula, and instructional practices? For example, does the state have in place a regular cycle for reviewing and revising curriculum materials, instructional practices, and assessments to ensure that they are coherent with each other and with the state science standards, and that they adhere to the principles of learning and teaching outlined in this report? Does the state conduct studies to formally monitor and evaluate the alignment between its standards and assessments?
Question 3-1: Does the state’s science assessment system target the knowledge, skills, and habits of mind that are necessary for science literacy? For example, does it include items, tasks, or tests that require students to describe, explain, and predict natural phenomena based on scientific principles, laws, and theories; understand articles about science; distinguish questions that can be answered scientifically from those that cannot; evaluate the quality of information on the basis of its source; pose and evaluate arguments based on evidence; and apply conclusions appropriately?
Question 3-2: Does the state’s science assessment system reflect current scientific knowledge and understanding? For example, does the state have in place mechanisms to ensure that all of the measures that make up the assessment system are scientifically accurate?
Question 3-3: Does the state’s science assessment system measure students’ understanding and ability to apply important scientific content knowledge and scientific practices and processes? For example, does it include a focus on assessing students’ understanding of the big ideas of science as opposed to recall of isolated facts, formulas, and procedures?
Question 3-4: Has the state conducted an independent review of its content standards to ensure that they articulate both the skills and the content knowledge students need to achieve science literacy?
Question 3-5: Does the state’s science assessment system reflect contemporary understandings of how people learn science?
Question 3-6: Is the state’s science assessment system consistent with the nature of scientific inquiry and practice as it is outlined in the state standards? For example, are opportunities built into the assessment system to assess students’ abilities to conduct extended scientific investigations, if such abilities are included in the state’s science standards?
Question 4-1: Have the state’s science standards been elaborated to provide explicit guidance to teachers, curriculum developers, and the state testing contractors about the skills and knowledge that are required by the state standards?
Question 4-2: Have the state’s science standards been reviewed by an independent body to ensure that they are reasonable in scope, accurate, clear, and attainable; reflect the current state of scientific knowledge; focus on ideas of significance; and reflect current understanding of the ways students learn science?
Question 4-3: Does the state have in place a regular cycle (preferably no longer than 8 to 10 years) for reviewing and revising its standards, during which time is allowed for development of new standards as needed; implementation of those standards; and then evaluation by a panel of experts to inform the next iteration of review and revision? Has the state set aside resources and developed both long- and short-term strategies for this to occur?
Question 5-1: Have research and expert professional judgment about the ways in which children learn science been considered in the design of the state’s science assessments?
Question 5-2: Have the science assessments and tasks been created to shed light on how well and to what degree students are progressing over time toward more expert understanding?
Question 6-1: Has the state brought together important stakeholders and required experts to develop and/or revise its science assessment system so that it reflects a shared vision of science education?
Question 6-2: Does the state have a written master plan for its science assessment system that specifies which types of assessments are to be used for which purposes, how frequently the different assessments will be administered, who will develop them, who will administer them, at what level of the education system they will be administered, and how the results will be scored, reconciled, and reported?
Question 6-3: Has the state developed both long- and short-term strategies for ensuring that resources are available for assessment development and revision? As part of this process, has consideration been given to strategies such as doing a little bit each year, purchasing curriculum materials that include quality assessments, collaborating with other states that have similar standards to develop assessments or item banks, or developing an assessment system that uses existing personnel and assessment opportunities to assess aspects of science learning that might otherwise be too expensive to assess?
Question 6-4: Is the state’s assessment system plan closely aligned with the complete array of its science standards, reflecting the breadth and depth of the science content knowledge, scientific skills and understandings, and cognitive demands that are articulated in the standards?
Question 6-5: Does the state have, and use the support of, both technical and content specific advisory committees to provide advice and guidance on the design, implementation, and ongoing monitoring and evaluation of the assessment system? Do these advisory committees make recommendations to improve particular aspects of the assessment system, and does the state have in place a plan for considering and responding to their suggestions?
Question 6-6: Has consideration been given in designing the assessment system to the nature of the score reports and to the intended inferences that the assessment information will be used to support?
Question 6-7: Have the state and its contractors developed strategies to ensure that reports of assessment results are accessible, relevant, and meaningful to the targeted audiences and that they are provided in a timely manner?
Question 6-8: Do assessment reports include information on the precision of scores and on the accuracy with which the scores can be used to classify students by performance levels? Do they include information about and examples of the appropriate and inappropriate use of the scores and about the kinds of inferences that can and cannot be supported by the results?
Question 6-9: Do the state’s teachers, school administrators, and policy makers have ongoing opportunities to build their understanding of current assessment practices and expand their skills in using and interpreting assessment results?
Question 6-10: Do school, district, and state education administrative personnel possess sufficient assessment competence to use assessment information accurately and to communicate it effectively to interested stakeholders?
Question 6-11: Do school, district, and state educational administrative personnel have sufficient resources to collect, store, manage, and analyze the data collected through the assessment system?
Question 6-12: Do the state, school districts, and schools include science educators in every step of the assessment process (from the design of the assessments to data collection to the use and interpretation of the results), thereby providing ongoing opportunities for individuals at each of these levels to build their understanding of current assessment practices and expand their skills in using and interpreting assessment results?
Question 6-13: Do the state’s teacher licensing regulations for certification and recertification require that all candidates demonstrate assessment competence at a level commensurate with their area of certification?
Question 6-14: Does the state require as part of its certification and recertification standards that all teachers of science possess knowledge of the subjects they teach as well as the knowledge necessary to teach science well?
Question 7-1: Is the state’s science assessment system constructed to provide information on students’ opportunity to learn what is needed to meet the state’s goals for science learning? Does the state continually monitor and periodically evaluate its education system to ensure that sufficient opportunity to learn is being maintained for all students?
Question 7-2: Are all components of the state’s assessment system designed to make them accessible to the widest range of students, and to support valid interpretations about their performance? Does the development process for each component include consideration of ways to minimize challenges unrelated to the construct being measured?
Question 7-3: Does the state’s science assessment system include alternative assessments that can be used to assess the science achievement of students with significant cognitive disabilities?
Question 7-4: Has the state set aside resources for making improvements in its science education system to remedy the inequities or inadequacies that may be revealed by assessment and evaluation data? Has it also set aside resources to promulgate exemplary practices that may be revealed by assessment results?
Question 7-5: Does the state monitor the assessment system’s effect on the recruitment and retention of high-quality teachers?
Question 8-1: Does the state make use of multiple sources of information to continually monitor the effects of the science assessment system on science learning and teaching in the state?
Question 8-2: Does the state formally evaluate all aspects of its science assessment system, including development, administration, implementation, reporting, use, and both short- and long-term intended and unintended effects? Do the evaluations address the integration of the components of the system and address the major purposes the assessments are intended to serve? Do they include appropriate procedures and incentives? Do they include multiple indicators, such as technical quality, utility, and impact?
Question 8-3: Does the state monitor and evaluate the interactions between its science assessment system and the assessment systems for other disciplines? Does the evaluation address both the intended and unintended effects of the science assessment system on the state’s overall goals for K–12 education? Are the content standards, achievement standards, and assessments evaluated together to assure they work together as a coherent system?
In asking states to think about the issues raised by these questions, we recognize that we are asking them to rethink long-held assumptions about science assessment and that we are doing so in the face of a research base that we found to have significant limitations. While current understanding can serve as the foundation for the initial design of assessment systems, more knowledge is needed about the design of these systems as well as the underlying fundamental properties of learning and measurement on which they should be founded.
States working alone cannot accomplish all that is needed to make the design and implementation of effective science assessment systems a reality. Therefore, in this chapter we outline ways in which others can help states in their efforts to create coherent assessment systems. We urge scientists, science educators, cognitive scientists, and educational measurement experts to propose and conduct research on ways in which assessment systems can be designed, implemented, and monitored effectively. We also call on federal funding agencies and others, including professional disciplinary societies, to support this research with funding and expertise and to contribute to the dissemination of findings that can lead to improvements in state science assessment systems. Further, we ask states that have had experience in designing and implementing assessment systems to contribute to these efforts by sharing their experiences with other states. Education policy organizations can assist with these efforts by providing structured opportunities for this sharing to occur.
The committee also calls on institutions of higher education to do their part in supporting high-quality science education and assessment systems. Teachers and others need to understand how children learn science and how assessment can be used to obtain useful information about student competence. Both the initial preparation of teachers and their ongoing professional development should include opportunities to develop a deep understanding of how students learn as well as how to guide students at different levels of understanding. No assessment system, no matter how thoughtfully designed, can function as intended unless all who are responsible for developing assessments and interpreting the results are well prepared.
IMPROVING THE KNOWLEDGE BASE
The committee concluded that the measurement of science achievement would be improved if state assessment systems were founded on research regarding the developmental nature of science learning. However, as has been noted throughout this report, research addressing the nature of student learning in individual science domains is far from complete. While it is possible for states to begin by using well-reasoned conceptions of how students’ understanding of science develops over time, research that can confirm or enhance these conceptions is critical if the system is to function as intended. Thus, our first recommendations to those other than states call for such research to be conducted and for funding agencies to support researchers and states in this endeavor.
Recommendation 1: Funding agencies should support research on both: (1) the ways in which students’ understanding of the fundamental concepts of science develops over time with instruction, and (2) the ways in which students represent their understanding of these ideas as they develop greater expertise.
Recommendation 2: To assist states in their efforts to make more effective use of assessment results for improving curriculum and instruction and for diagnosing student needs relative to reaching the standards, funding agencies should support research on the ways in which tests could be designed to produce more useful subscores and the ways in which those subscores could be used effectively by teachers and others.
MULTIPLE APPROACHES AND UP-TO-DATE MEASURES
NCLB requires that state science assessments be aligned with state content and achievement standards and that they include multiple up-to-date measures of student achievement that are valid and reliable for the purposes for which they will be used. To meet these requirements, states will need assistance both with developing and validating new forms of assessment and with better incorporating and aligning all aspects of their assessment and education systems, while meeting standards for technical quality when systems of assessment are involved.
For example, in an assessment system, assessments and combinations of them would need to be reliable and valid for every level and every purpose for which they are used. Current strategies for thinking about technical quality are not focused on thinking about systems of assessment. New methodologies for judging these concepts across different tests and across different levels (e.g., classroom, school, school district, and state) are needed. Similarly, strategies for conducting alignment studies among multiple components of an assessment system and the state standards are needed. Such strategies need to focus on the collective alignment of all the tests and tasks that constitute the assessment system, yet researchers are still struggling with ways to conduct such studies for a single assessment.
Recommendation 3: Research on the design and validation of science assessment systems should be conducted. Among the subjects investigated should be strategies for using classroom assessments for accountability purposes and instruction and procedures for determining the alignment, reliability, accuracy, and validity of assessment systems composed of multiple measures. Federal funding agencies and others should support these research efforts.
THE ASSESSMENT OF SCIENTIFIC INQUIRY
Most state science standards recommend that students understand and develop appropriate skills related to scientific inquiry, yet, as we have discussed, many state science assessment systems do not adequately target these skills. Under these circumstances the requirements for alignment between standards and assessments cannot be met. States need assistance with developing valid, reliable, and cost-effective ways to include the assessment of inquiry in their science assessment systems.
Recommendation 4: To support the inclusion of assessment tasks focused on scientific inquiry and investigations in state assessment systems, the U.S. Department of Education, science educators, scientists, and educational measurement experts should help states address issues related to the development, validation, and implementation of such tasks.
NCLB requires that states include all students in their assessment systems and hold all students accountable for attaining challenging standards. Meeting this requirement has increased the importance of the accommodations that are provided to students with disabilities and those with limited English language proficiency. A previous National Research Council committee found that the means for determining which accommodations are suitable under particular circumstances, and for determining whether scores obtained under accommodated conditions are comparable to those obtained without accommodations, are not well documented. States need help in developing policies and practices related to including these students in their assessments.
Recommendation 5: As an aid to the states in developing science assessments to meet NCLB requirements, federal funding agencies and others should sponsor research on the implications of including students with disabilities and those with limited English proficiency in the science assessment system. Research is needed both on identifying appropriate accommodations and on the validity of inferences that can be drawn from test results obtained under accommodated and nonaccommodated conditions. Research is also needed to support the development of instructional and assessment models, based on learning progressions, for students with severe disabilities.
BUILDING PROFESSIONAL CAPACITY
NCLB requirements place a premium on high-quality science teaching, and the committee agrees that this is an essential element in improving science achievement. There is strong evidence that good assessment practices can support student success, but teachers need at least a minimum level of assessment literacy to make effective use of assessments and assessment results. We have already suggested that states provide ongoing professional development opportunities for educators, including participation in all aspects of assessment development and implementation, as a way to build their assessment competence. However, such opportunities are not enough.
Post-secondary institutions that prepare science teachers and state licensing agencies must play a role by assuring that teachers and school administrators enter education with a firm foundation in assessment competence. The committee concluded that if states require new teachers to demonstrate assessment competence as a condition for their certification to teach, then teacher preparation programs will include it. In Chapter 6 of this report we conclude that states should include assessment competence as a requirement for state teacher certification, and here we recommend that institutions of higher education do their part to support states in their efforts.
Recommendation 6: Post-secondary institutions that prepare science teachers should require that preservice science teachers have appropriate knowledge and skills regarding effective science assessment practices. Such knowledge includes the use of assessment results in promoting student learning and making decisions about instruction, developing and using sound assessments, and understanding the limitations of various types of assessment practices and results. Accomplishing this requires that preservice teachers have a deep understanding of the science they teach.
The linchpin of NCLB is the development and implementation of assessments to measure student attainment of high-quality standards, but, as we have discussed, the U.S. Department of Education has not provided guidance on what such standards should look like or how they should be organized. Since standards drive the entire system on which NCLB is built, standards of poor quality will affect every aspect of science education. The committee concludes that the U.S. Department of Education should take a more active role in monitoring the quality of state science standards.
Recommendation 7: The U.S. Department of Education should require that states have an independent body evaluate their academic science standards and submit evidence of their quality as part of the required peer review process. The evaluation should not focus on the specific content that states choose to include, but rather on the degree to which the standards are clear, concrete, and complete; are rigorous and scientifically correct; embody a clear conceptual framework; reflect sound models of the way students learn science; are reasonable in scope; and describe performance expectations for students in clear and specific terms. The results of the evaluation should be made public.
SETTING ACHIEVEMENT STANDARDS
In a standards-based system, both content and achievement standards play a critical part in the educational system. Achievement standards provide targets for instruction and assessment and help students to know what is expected of them so that they can adjust their learning strategies to meet expectations. However, the most consistent finding from the research literature on standard setting is that different methods lead to different results. It is nonetheless incumbent on states to ensure that the methods they use to set standards are defensible. Setting achievement standards for a system of assessment is even more challenging than for a single assessment, and states will need help in this regard. The committee calls on the educational measurement community to conduct research on standard-setting methods that can be used in conjunction with assessment systems. We also urge the U.S. Department of Education to require that states evaluate the methods they currently use.
Recommendation 8: Research on the development of standard-setting strategies that could be used to establish achievement levels when results from multiple assessments are involved should be conducted. Federal funding agencies and others should support this important research.
Recommendation 9: The U.S. Department of Education should require that states have an independent external body evaluate the process they use to develop and set achievement levels. This evaluation should be conducted as early in the development process as possible.
ASSESSMENT SYSTEMS—A PRIORITY
The committee urges states to use NCLB as an opportunity to make science both an educational priority and a responsibility shared by all. At the same time, we urge the federal government and the other bodies mentioned above to take their responsibilities seriously and to join states in considering this an opportunity to bring about substantial improvements in science assessment and student learning.
Recommendation 10: Federal agencies and others should support, with funding and expertise, the development and pilot testing of model assessment systems in order to assist states in their efforts to create such systems.