The charge to this committee was to develop a plan for assessment that will reinforce and complement the dramatic changes to science education proposed in A Framework for K-12 Science Education: Practices, Crosscutting Concepts, and Core Ideas (National Research Council, 2012a, hereafter referred to as “the framework”) and the Next Generation Science Standards: For States, By States (NGSS Lead States, 2013). We have emphasized throughout this report that both of these documents provide an opportunity to rethink the possibilities for using assessment to support learning. We recognize that changes of this order are extremely challenging, and our charge directed us specifically to discuss the feasibility and costs of our recommendations.
The guidance for developing a science assessment system discussed in Chapter 6 is based on the premise that states will need to tailor their plans to their own circumstances and needs. However, there are four major issues that will be important to implementation in any context. This chapter discusses these issues:
- The development of a new assessment system will need to be undertaken gradually and phased in over time.
- To be successful, a science assessment system will have to thoughtfully and consistently reflect the challenge of ensuring equity in the opportunity that students from diverse backgrounds have to demonstrate their knowledge and abilities. Meeting this challenge will require clear understanding of the opportunities all students have had to learn science and to be fairly assessed, in the new ways called for by the framework.
- Technology will play a critical role in the implementation of any assessment system that is aligned with the framework and the Next Generation Science Standards (NGSS).
- Every choice made in implementing a system will entail costs, benefits, and tradeoffs among them, all of which will require careful analysis.
In this report, we have presented examples of tasks that assess the three-dimensional science learning represented by the NGSS performance expectations, and examples of assessment strategies that can incorporate these tasks. We believe these examples will prove valuable to those who have the responsibility to plan and design new state science assessment systems, but they are only examples. Implementing new assessment systems will require substantial changes to current systems. Thus, state leaders and educators will need to be both patient and creative as they implement changes over time. They need to understand and plan for the development and implementation of new systems in stages, over a span of years.
A number of innovative assessment programs floundered in the 1990s in part because they were implemented far too rapidly (perhaps to meet political exigencies). In many cases, their developers were not given sufficient time to implement what were major changes or to make modifications as they learned from experience (McDonnell, 2004). Some veterans of these experiences have cited this as a key factor in the lack of sustainability of many such efforts (see National Research Council, 2010).
A new assessment system has to evolve alongside other elements that are changing. It will take time for the changes to curriculum, instruction, professional development, and the other components of science education envisioned in the framework and the NGSS to be developed and implemented. New modes of assessment will need to be coordinated with those other changes, both because what is needed has to be embedded in some way in curriculum and instruction and because there is little value in assessing students on material and kinds of learning that they have not had the opportunity to learn. Moreover, assessing knowledge through the application of practices is relatively new, particularly in the context of externally mandated assessments. States that adopt new science assessment systems will need time to further develop and test new types of tasks and technology and gather evidence of their efficacy and validity in measuring three-dimensional learning. These changes will also need to be accompanied by extensive changes in teacher professional development, at both the entry and continuing levels. Although these are all major changes, we note that many of them mirror those being proposed for assessment of English language arts and mathematics through the Race to the Top Assessment Program consortia.
As we emphasized in the discussion of our charge, striking the right balance with new assessments designed to measure rapidly changing curricula and instructional practices while also meeting a range of competing priorities will be challenging, and will require consideration of tradeoffs. Changes in curriculum, instruction, student performance expectations, and professional development will need to be carefully coordinated and then introduced and implemented in stages across grade levels. States will need to carefully plan and develop their own models for implementation. For example, some may want to begin at the kindergarten level and move upward by grade levels; others may choose another starting level, such as the beginning of middle school, and move upward (or downward) by grade levels. It is important to recognize that, in order to meet the performance expectations in the NGSS, students in higher grades will need to have had the necessary foundation in their earlier grades. States will need to expect and address these sorts of gaps, as they are currently doing with the Common Core State Standards in English language arts and mathematics.
It will be up to each state to determine the best way to gradually adapt their curricula. In many places, schools or districts have reduced the amount of science instruction offered in recent years, particularly in the early grades, in response to the accountability demands of the No Child Left Behind Act (NCLB) (see Center on Education Policy, 2007; Dorph et al., 2011; Griffith and Scharmann, 2008). Those jurisdictions will need to reintroduce science in the early grades—and review and revise the policies that have limited the time available for science—if they are to effectively implement the new standards. Frequently, schools that serve the most disadvantaged student populations are those in which the opportunity to learn science has been most reduced (Center on Education Policy, 2007; Dorph et al., 2011; Rennie Center for Education Research and Policy, 2008). Even in schools and districts that have maintained strong science programs at all grade levels, neither students nor teachers may have had experience with instruction that involves applying the practices as envisioned in the new framework and NGSS.
The cost of materials will also be a factor in the implementation of new approaches to science education, particularly at the elementary level. Many school districts in the United States use kit-based curriculum materials at the elementary levels, such as Full Option Science Systems (FOSS) and Science and Technology for Children, which were developed in the early 1990s and aligned to the benchmarks of the American Association for the Advancement of Science (1993, 2009) or to the National Science Education Standards (National Research Council, 1996). When combined with teacher training, these science kits have been valuable in the delivery of guided-inquiry instruction, but the materials will have to be revised and resequenced to align with the NGSS (Young and Lee, 2005). Developing the needed materials represents a significant investment for school districts.
Many states are already implementing the Common Core State Standards for English language arts and mathematics, which emphasize engaging students in classroom discourse across the disciplines. The new framework and the NGSS reflect the intention to integrate that approach with science learning: the integration will also take time and patience, especially in the many schools and districts in which there is little precedent on which to build.
Thus, states will need to both make some immediate changes and initiate a longer-term evolution of assessment strategies. Policy makers and educators will need to balance shorter- and longer-term assessment goals and to consider the effects of their goals and plans on each of the critical actors in teaching and assessment (e.g., the federal government, states, districts, schools, principals, teachers, parents, and students). Each component of the science education system—including instruction, curriculum and instructional materials, teacher education and professional development programs, assessment development, research, and education policy—will need to be adapted to an overall plan in a coordinated fashion. In terms of policy orientation, we emphasize again that a developmental path that is “bottom up” (i.e., grounded in the classroom), rather than “top down” (i.e., grounded in such external needs as accountability or teacher evaluation), is most likely to yield the evidence of student learning needed to support learning that is aligned with the framework’s goals.
Although accountability is an important function of an assessment system, we believe that placing the initial focus on assessments that are as close as possible to the point of instruction will be the best way to identify successful strategies for teaching and assessing three-dimensional science learning. These strategies can then be the basis for the work of developing assessments at other levels, including external assessments that will be useful for purposes beyond the classroom. We recognize that we are calling on state and federal policy makers to change their thinking about accountability: to rethink questions about who should be held accountable for what and what kinds of evidence are most valuable for that task. States may have to temporarily forgo some accountability information if the new system is to have a chance to evolve as it needs to. Because this is a marked change, states that begin this approach will be breaking significant new ground, and there will be much to be learned from their experiences.
Continuing to use existing assessments will not support the changes desired in instruction, and thus interim solutions will be needed that can, simultaneously, satisfy federally mandated testing requirements and allow the space for change in classroom practice. Adapting new state assessment systems will require a lengthy transition period, just as the implementation of the NGSS in curriculum and instruction will require a gradual and strategic approach. A gradual approach will ease the transition process and strengthen the resulting system, both by allowing time for development and phasing in of curriculum materials aligned to the framework and by allowing all participants to gain familiarity and experience with new curricula and new kinds of instruction that address the three dimensions of the NGSS. Ideally, the transition period would be 5 years or more. We realize, however, that many states will face political pressures for much shorter timelines for implementation.
A fundamental component of the framework’s vision for science education is that all students can attain its learning goals. The framework and the NGSS both stress that this goal can only be reached if all students have the opportunity to learn in the new ways recommended in those documents. Achieving equity in the opportunity to learn science will be the responsibility of the entire system, but the assessment system can play a critical role by providing fair and accurate measures of the learning of all students. As we have noted, however, it will be challenging to strike the optimal balance in assessing students who are disadvantaged and students whose cultural and linguistic backgrounds may significantly influence their learning experiences in schools.
The K-12 student population in the United States is rapidly growing more diverse culturally, linguistically, and in other ways (Frey, 2011). The 2010 U.S. census showed that while 36 percent of the total population are minorities, 45 percent of those who are younger than 19 are minorities (U.S. Census Bureau, 2012), and non-Asian minority students are significantly more likely to live in poverty than white or Asian students (Lee et al., 2013). The number of students who are considered limited English proficient doubled between 1993 and 2007, to 11 percent (Lee et al., 2013). Under any circumstances, assessing the learning of a very diverse student population requires attention to what those students have had the opportunity to learn and to the needs, perspectives, and modes of communication they bring to the classroom and to any assessment experience.
In the context of the recasting of science education called for by the framework and the NGSS, these issues of equity and fairness are particularly pressing. We argue in this report for a significantly broadened understanding of what assessment is and how it can be used to match an expanded conception of science learning. The framework and the NGSS stress the importance of such practices as analyzing and interpreting data, constructing explanations, and using evidence to defend an argument. Thus, the assessments we recommend present opportunities for students to engage in these practices. The implications for the equity of an assessment are complex, especially since there is still work to be done in devising the means of providing equitable opportunity to learn by participating in scientific practices that require significant discourse and writing.
Fairness is not a new concern in assessment. It can be described in terms of lack of bias in the assessment instrument, equitable treatment of test takers, and opportunity to learn tested material (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999). It is important to note, however, that the presence of performance gaps among population groups does not necessarily signal that assessments are biased, unfair, or inequitable. Performance gaps on assessments may also signal important differences in achievement and learning among population groups, differences that will need to be addressed through improved teaching, instruction, and access to appropriate and adequate resources. A test that makes use of performance-based tasks may indeed reveal differences among groups that did not show up in tests that use other types of formats. NGSS-aligned assessments could be valuable tools for identifying those students who are not receiving NGSS-aligned instruction.
The changes to science education called for in the framework and the NGSS highlight the ways in which equity is integral to the definition of excellence. The framework stresses the importance of inclusive instructional strategies designed to engage students with diverse interests and backgrounds and points out that these principles should carry over into assessment design as well. It also notes that effective assessment must allow for the diverse ways in which students may express their developing understanding (National Research Council, 2012a, pp. 283, 290). The NGSS devotes an appendix to the discussion of “All Standards, All Students.” It notes the importance of non-Western contributions to science and engineering and articulates three strategies for reaching diverse students in the classroom, which also apply to assessment (NGSS Lead States, 2013, Appendix D, p. 30):
- Value and respect the experiences that all students bring from their backgrounds (e.g., homes and communities).
- Articulate students’ background knowledge (e.g., cultural or linguistic knowledge) with disciplinary knowledge.
- Offer sufficient school resources to support student learning.
These principles offer a valuable addition to the well-established psychometric approaches to fairness in testing, such as statistical procedures to flag test questions that perform differently with different groups of students and may thus not measure all students’ capability accurately (see e.g., American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999; Educational Testing Service, 2002; Joint Committee on Testing Practices, 2004). The principles are grounded in recent research that uses sociocultural perspectives to explore the relationships between individual learners and the environments in which they learn to identify some subtle but pervasive fairness issues (Moss et al., 2008). Although that research was primarily focused on different aspects of instruction and assessment, the authors have expanded the concept of opportunity to learn. In this view, opportunity to learn is a matter not only of what content has been taught and what resources were available, but also of (1) whether students’ educational environments are sufficiently accessible and engaging that they can take advantage of the opportunities they have, (2) how they are taught, and (3) the degree to which the teacher was prepared to work with diverse student populations.
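The statistical flagging procedures mentioned above can be illustrated with a small sketch. The report does not name a specific procedure; the example below assumes the Mantel-Haenszel common odds ratio, one widely used approach to differential item functioning, and the function name, group labels, and record layout are hypothetical choices made for illustration only. Students are matched on total score, and within each score level the odds of answering the item correctly are compared between a reference group and a focal group. A value near 1 suggests the item behaves similarly for matched students in both groups.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Estimate the Mantel-Haenszel common odds ratio for one test item.

    `records` is an iterable of (group, total_score, correct) tuples, where
    group is "reference" or "focal", total_score is the matching variable,
    and correct is 1 (right) or 0 (wrong). This layout is an assumption
    made for this sketch, not a format taken from the report.
    """
    # Per score level: [ref correct, ref wrong, focal correct, focal wrong]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, total_score, correct in records:
        offset = 0 if group == "reference" else 2
        strata[total_score][offset + (0 if correct else 1)] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d  # always > 0: a stratum exists only if it has a record
        numerator += a * d / n
        denominator += b * c / n

    # The ratio is undefined when no stratum contributes to the denominator.
    return None if denominator == 0 else numerator / denominator


# Made-up data: at score level 1 the item favors the reference group;
# at score level 2 both groups perform identically.
example = (
    [("reference", 1, 1)] * 3 + [("reference", 1, 0)] * 1
    + [("focal", 1, 1)] * 1 + [("focal", 1, 0)] * 3
    + [("reference", 2, 1)] * 2 + [("reference", 2, 0)] * 2
    + [("focal", 2, 1)] * 2 + [("focal", 2, 0)] * 2
)
ratio = mantel_haenszel_odds_ratio(example)
```

A flag raised by a procedure of this kind is a prompt for expert review of the item, not proof of bias; as noted earlier in this chapter, group differences may also reflect real differences in opportunity to learn.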
This research highlights the importance of respect for and responsiveness to diverse students’ needs and perspectives. All students bring their own ways of thinking about the world when they come to school, based on their experiences, culture, and language (National Research Council, 2007). Their science learning will be most successful if curriculum, instruction, and assessments draw on and connect with these experiences and are accessible to students linguistically and culturally (Rosebery et al., 2010; Rosebery and Warren, 2008; Warren et al., 2001, 2005). It will not be easy for educators to keep this critical perspective in view while they are adapting to the significant changes called for by the framework and the NGSS. Moreover, given the current patterns of teacher experience and qualifications, it is likely that students in the most advantaged circumstances will be the first to experience science instruction that is guided by the framework and thus be prepared to succeed on new assessments. As states and districts begin to change their curricula and instruction and to adopt new assessments, they will need to pay careful attention to the ways in which students’ experiences may vary by school and for different cultural groups. The information provided by new generations of assessments will only be meaningful to the extent that it reflects understanding of students’ opportunities to learn in the new ways called for by the framework and educators find ways to elicit and make use of the diversity of students’ interests and experiences. Monitoring of opportunity to learn, as we recommend (see Chapter 6), will thus be a critical aspect of any assessment system.
Because the language of science is specialized, language is a particular issue for the design of science assessments. To some extent, any content assessment will also be an assessment of the test takers’ proficiency in the language used for testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999). Both native English speakers and English-language learners who are unfamiliar with scientific terminology and various aspects of academic language may have difficulty demonstrating their knowledge of the material being tested if they have not also been taught to use these scientific modes of expression. Some researchers have suggested that performance tasks that involve hands-on activities are more accessible to students who are not proficient in English, but such tasks may still present complex linguistic challenges, and this issue should be considered in test design (Shaw et al., 2010).
We note that strategic use of technology may help to diminish these challenges. For example, technology can be used to provide flexible accommodations (such as translating, defining, or reading aloud words or phrases used in the assessment prompt, or offering variable print size) that allow students to more readily demonstrate their knowledge of the science being tested. One model for this approach is ONPAR (Obtaining Necessary Parity through Academic Rigor), a web resource for mathematics and science assessments that uses technology to minimize language and reading requirements and provide other modifications that make them accessible to all students.1 However, more such examples are needed if the inclusive and comprehensive vision of the framework and the NGSS is to be realized.
Researchers who study English-language learners also stress the importance of a number of strategies for engaging those students, and they note that these strategies can be beneficial for all students. For example, techniques used in literacy instruction can be used in the context of science learning. These strategies promote comprehension and help students build vocabulary so they can learn content at high levels while their language skills are developing (Lee, 2012; Lee et al., 2013).
Research illustrates ways in which attention to equity has been put into practice in developing assessments. One approach is known as universal test design, in which consideration of possible ways assessment format or structure might limit the performance of students is incorporated into every stage of assessment design and development (Thompson et al., 2002).2 The concept of cultural validity has also been important. This idea takes the finding that “culture influences the ways in which people construct knowledge and create meaning from experience” (Solano-Flores and Nelson-Barber, 2001, p. 1) and applies it to both assessment design and development and to interpretation of assessment results (see also Basterra et al., 2011). Another approach is to provide specialized training for the people who will score the responses of culturally and linguistically diverse students to open-ended items (see Kopriva, 2008; Kopriva and Sexton, 1999).
Although building equity into assessment systems aligned with the framework and the NGSS poses challenges, it also presents opportunities. Equity in opportunity to learn is integral to the definition of excellence in those documents. Since significant research and development will be needed to support the implementation of the science assessment systems that are aligned with the framework and the NGSS, there is a significant opportunity for research and development on innovative assessment approaches and tasks that exemplify a view of excellence that is blended with the goals of equity. Much remains to be done: the new approaches called for in science education and in assessment should reflect the needs of an increasingly diverse student population. It will be important for those responsible for the design and development of science assessments to take appropriate steps to ensure that tasks are as accessible and fair to diverse student populations as possible. Individuals with expertise in the cultures, languages, and ethnicities of the student populations should be participants in assessment development and the interpretation and reporting of results.
2For more information, see Universally Designed Assessments from the National Center on Educational Outcomes, available at http://www.cehd.umn.edu/NCEO/TopicAreas/UnivDesign/UnivDesignTopic.htm [June 2013].
We do not expect that any new approaches could, by themselves, eliminate inequity in science education. As we note earlier in this chapter, new assessments may very well reveal significant differences among groups of students, particularly because more advantaged schools and districts may implement the NGSS earlier and more effectively than less advantaged ones, at least in the early years. It will be important for test developers and researchers to fully explore any performance differences that become evident and to examine the factors that might contribute to them. This type of research will require that the appropriate types of data be collected. These data should include information on the material, human, and social resources available to support student learning, such as the indicators of opportunities to learn that we discuss in Chapter 6. Such studies might entail multivariate and hierarchical analyses of the assessment results so that factors influencing test scores can be better interpreted.3
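As one illustration of the hierarchical analyses suggested above, the sketch below estimates how much of the variation in scores lies between schools rather than among students within the same school. It is a simplified stand-in for a full multilevel model: it assumes a balanced design (the same number of students in every school) and uses the standard one-way random-effects analysis-of-variance estimator. The function name and the dictionary layout are hypothetical, and the scores are made up.

```python
from statistics import mean


def between_school_share(scores_by_school):
    """Estimate the share of score variance that lies between schools.

    `scores_by_school` maps a school label to a list of student scores.
    Assumes at least two schools, each with the same number (>= 2) of
    students. Returns a value in [0, 1] (the intraclass correlation).
    """
    groups = list(scores_by_school.values())
    m = len(groups)
    k = len(groups[0]) if groups else 0
    if m < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 schools with equal group sizes >= 2")

    grand_mean = mean(s for g in groups for s in g)
    school_means = [mean(g) for g in groups]

    # Mean square between schools and mean square within schools.
    ms_between = k * sum((sm - grand_mean) ** 2 for sm in school_means) / (m - 1)
    ms_within = sum(
        (s - sm) ** 2 for g, sm in zip(groups, school_means) for s in g
    ) / (m * (k - 1))

    # Method-of-moments estimate of the between-school variance component,
    # truncated at zero because sampling noise can make it negative.
    var_between = max((ms_between - ms_within) / k, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


share = between_school_share({"School A": [1, 3], "School B": [5, 7]})
```

A large between-school share would indicate that school-level factors, such as the indicators of opportunity to learn, deserve closer examination. Consistent with the caution in the accompanying footnote, a decomposition of this kind describes sources of variance and does not establish causes.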
Information and communications technology will be an essential component of a system for science assessment, as noted in the examples discussed throughout this report. Established and emerging technologies that facilitate the storage and sharing of information, audio and visual representation, and many other functions that are integral to the practice of science are already widely used in science instruction. As we have discussed, computer-based simulations allow students to engage in investigations that would otherwise be too costly, unsafe, or impractical. Simulations can also shorten the time needed to gather and display data (e.g., using computer-linked probes, removing repetitive steps through data spreadsheets and the application of algorithms) and give students access to externally generated datasets they can analyze and use as evidence in making arguments.
As we discuss in Chapter 5, technology enhances the options for designing assessment tasks that embody three-dimensional science learning. Technology can also support flexible accommodations that may allow English-language learners or students with disabilities to demonstrate their knowledge and skills. Students’ use of these options can be included as part of the data that are recorded and analyzed and used for future design purposes.4
3These types of studies would not be attempts to do causal modeling, but a serious examination of sources of variance that might influence science scores, especially when the scores are being used to make judgments about students and/or their teachers.
Technology-based assessment in science is a fast-evolving area in which both the kinds of tasks that can be presented to students and the interface through which students interact with these tasks are changing. There are many interesting examples, but they do not yet comprise a fully evaluated set of strategies, so there are still questions to be answered about how technology-based tasks function. For example, tasks may ask students to manipulate variables in a simulation and interpret their observations or present data and data analysis tools for students to use in performing the given task. Students’ familiarity and comfort with such simulations or tools will likely influence their ability to respond in the time allowed, regardless of their knowledge and skills. Therefore, it will be essential to ensure that students have experience with technology in the course of instruction, not just in the context of assessments. They need to gain familiarity with the interfaces and the requisite tools as part of their regular instruction before they are assessed using those tools, particularly when high stakes are attached to the assessment results. Moreover, the development of technology-based assessments needs to include extensive pilot testing so that students’ reactions to the technology can be fully explored.5
The charge to the committee included a discussion of the costs associated with our recommendations. Cost will clearly be an important constraint on implementing our recommendations and will influence the designs that states adopt. We strongly recommend that states adopt their new systems gradually and strategically, in phases, and doing so will be a key to managing costs. And as we discuss throughout the report, new and existing technologies offer possibilities for achieving assessment goals at costs lower than for other assessments, including performance tasks. At the same time, much of what we recommend involves significant change and innovation, which will require substantial time, planning, and investment.
4We do not advocate that these data be used for the purpose of scaling the scores of students who make use of accommodations.
5One option for such pilot testing would be to develop an open-source database of simulations with a common interface style that can be used in both instruction and assessment, though this option would require a significant research and development effort. Another option would be to develop such resources as part of curriculum materials and give students the option of choosing assessment items that use the interface and simulation tools that match the curriculum that was used in their classrooms.
There is no simple way to generate estimates of what it might cost a state to transform its science assessment systems because each state will have a different starting point, a different combination of objectives and resources, and a different pace of change. The approach we recommend also means that assessments will be organically embedded in the science education system in a way that is fundamentally different from how assessments are currently understood and developed. An important advantage of the approach we recommend is that many assessment-related activities—such as task development and scoring moderation sessions in which teachers collaborate—will have benefits beyond their assessment function. Determining what portion of such an activity should be viewed as a new assessment cost, what portion replaces an older function, and what portion could fairly be treated as part of some other set of costs (e.g., professional development) may not be straightforward. It is possible to make some guesses, however, about ways in which the costs may be affected, and we see both significant potential savings and areas for which significant resources will be needed, particularly in the initial development phases.
Developing the design and implementation plan for the evolution to new assessment systems will require significant resources. The design and development of tasks of the kind we have described may be significantly more resource intensive than the design and development of traditional assessment tasks (such as tests composed of multiple-choice items), particularly in the early phases. And as we note above, research and experimentation will be needed over a period of years to complete the work of elaborating on the ideas reflected in the framework and the NGSS. There will also be ongoing costs associated with the administration and scoring of performance-based tasks.
A number of steps can be taken to help defray these costs. State collaboratives, such as the Race to the Top Assessment Program consortia for developing English language arts and mathematics assessments or the New England Common Assessment Program consortium for developing science assessments, can help to reduce development costs. Scoring costs may be reduced by using teachers as scorers (which also benefits their professional development) and by making use of automated scoring to the extent possible.6 Integrating classroom-embedded assessment into the system provides teacher-scored input, but the associated monitoring and moderating systems do have direct costs.
6For a detailed analysis of costs associated with constructed-response and performance-based tasks, see Topol et al. (2010, 2013). Available: https://edpolicy.stanford.edu/sites/default/files/publications/getting-higher-quality-assessments-evaluating-costs-benefits-and-investmentstrategies.pdf [August 2013].
We expect that costs will be most intense at the beginning of the process: as research and practice support increasing experience with the development of new kinds of tasks, the process will become easier and less costly. Each state, either on its own or in collaboration with other states, will have to build banks of tasks as well as institutional capacity and expertise.
Implementation of the NGSS will also bring states a number of advantages that have cost-saving implications. Because the NGSS will be implemented nationwide, states will be able to collaborate and to share resources, successful strategies, and professional development opportunities. This multistate approach is in stark contrast to the current approach, in which states have had distinct and separate science standards and have had to develop programs and systems to support science education in their states in relative isolation, often at significant cost and without the benefit of being able to build on successful models from other states.
The NGSS will also allow states to pilot professional development models in diverse and culturally varied environments, which could then be useful in other states or regions that have similar demographic characteristics.8 The ways in which states and school districts will be able to learn from one another and share successful models to support the systems of science education offer not only potentially substantial economies, but also an unparalleled opportunity to advance teaching and learning for all children.
7It is a common mistake to see assessment as separate from the process of instruction rather than as an integral component of good instructional practice. Well-designed tasks and situations that probe students’ three-dimensional science knowledge are opportunities for both student learning and student assessment. A substantial body of evidence shows that providing assessment opportunities in which students can reveal what they have learned and understood (to themselves, their peers, and their teachers) is far more beneficial to achievement than simply repeating the same content (Pashler et al., 2007; Hinze et al., 2013).
8At least one such network to facilitate such interstate collaboration and mutual support is already operating. The Council of State Science Supervisors has organized meetings of BCSSE (Building Capacity for State Science Education) that included teams from more than 40 states in
Throughout the report we discuss and offer examples of practical ways to assess the deep and broad performance expectations outlined in the framework and the NGSS. However, we acknowledge the challenge of this new approach to assessment and building assessment systems. Implementing the recommended new approaches will require substantial changes, and it will take time. For the changes to be fully realized, all parts of the education system—including curriculum, instruction, assessment, and professional development—will need time to evolve. Thus, a key message is that each step needs to be taken with deliberation.
RECOMMENDATION 7-1 States should develop and implement new assessment systems gradually over time, beginning with what is both necessary and possible in the short term for instructional support and system monitoring while also establishing long-term goals to implement a fully integrated, technologically enhanced, coherent system of assessments.
RECOMMENDATION 7-2 Because externally developed assessments cannot, by design, assess the full range and breadth of the performance expectations in the Next Generation Science Standards (NGSS), they will have to focus on selected aspects of the NGSS (reflected as particular performance expectations or some other logical grouping structure). States should publicly reveal these assessment targets at least 1 year or more in advance of the assessment to allow teachers and students adequate opportunity to prepare.
As we discuss in Chapter 4, effective implementation of a new assessment system will require resources for professional development. Science instruction and
an ongoing collaboration about implementation issues for the NGSS and other new state standards for science, including but not limited to issues of assessment. Funding and resources to continue this networking will be an important investment to foster efficient learning from others in this multistate effort.
assessment cannot be successfully adapted to the new vision of science education without this element.
RECOMMENDATION 7-3 It is critically important that states include adequate time and material resources in their plans for professional development to properly prepare and guide teachers, curriculum and assessment developers, and others in adapting their work to the vision of the framework and the Next Generation Science Standards.
RECOMMENDATION 7-4 State and district leaders who commission assessment development should ensure that the plans address the changes called for by the framework and the Next Generation Science Standards. They should build into their commissions adequate provision for the substantial amounts of time, effort, and refinement that are needed to develop and implement such assessments, thus reflecting awareness that multiple cycles of design-based research will be necessary.
A fundamental component of the framework’s vision for science education is that all students can attain its learning goals. The framework and the NGSS both stress that this goal can be reached only if all students have the opportunity to learn in the new ways recommended by those documents. Assessments will play a critical role in achieving this goal if they are designed to yield fair and accurate measures of the learning of all students. Careful attention to the diversity of the nation’s student population will be essential in designing new science assessments.
RECOMMENDATION 7-5 Policy makers and other officials who are responsible for the design and development of science assessments should consider the multiple dimensions of diversity—including, but not limited to, culture, language, ethnicity, gender, and disability—so that the formats and presentation of tasks are as accessible and fair to diverse student populations as possible. Individuals with expertise in these areas should be integral participants in assessment development and in the interpretation and reporting of results.
As we discuss above, new assessments may reveal performance differences among groups of students, in part because more advantaged schools and districts might implement the NGSS earlier and more effectively than less advantaged ones. Data will need to be collected to support studies of any such performance differences.
RECOMMENDATION 7-6 Because assessment results cannot be fully understood in the absence of information about opportunities to learn what is tested, states should collect relevant indicators about opportunity to learn—including material, human, and social resources available to support student learning—to contextualize and validate the inferences drawn from the assessment results.
Information and communications technology will be an essential component of assessment systems designed to measure science learning as envisioned in the framework and the NGSS. Technology enhances options for designing assessment tasks that embody three-dimensional science learning, as well as strategies for making them more accessible to students with disabilities and English-language learners.
RECOMMENDATION 7-7 States should support the use of existing and emerging technologies in designing and implementing a science assessment system that meets the goals of the framework and the Next Generation Science Standards. New technologies hold particular promise for supporting the assessment of three-dimensional science learning, and for streamlining the processes of assessment administration, scoring, and reporting.
As the framework makes clear, assessment is a key element in the process of educational change and improvement. Done well, it can reliably measure what scientists, educators, and parents want students to know and be able to do, and it can help educators create the learning environments that support the attainment of those objectives. Done poorly, it will send the wrong message about what students know and can do, and it will skew the teaching and learning processes.
For K-12 science assessment, the framework and the NGSS provide an opportunity to rethink and redesign assessments so that they more closely align with a vision of science proficiency in which the practices of scientific reasoning are deeply connected with the understanding and application of disciplinary core ideas and crosscutting concepts. Defining in detail the nature of that understanding and developing valid ways to assess it present a substantial challenge for designing assessments. That challenge has begun to be met, as shown in the examples of such assessments, and there are tools, methods, and technologies now available to build on the work that has been done. If states, districts, researchers, and parents invest time and other resources in the effort, new science assessments that are well integrated with curriculum and instruction can be developed.