5
Designing Science Assessments

In this report the committee has stressed the importance of considering the assessment system as a whole. However, as was discussed in Chapter 2, the success of a system depends heavily on the nature and quality of the elements that comprise it, in this case, the items, strategies, tasks, situations, or observations that are used to gather evidence of student learning and the methods used to interpret the meaning of students’ performance on those measures.

In keeping with the committee’s conclusion that science education and assessment should be based on a foundation of how students’ understanding of science develops over time with competent instruction, we have taken a developmental approach to science assessment. This approach considers that science learning is not simply a process of acquiring more knowledge and skills, but rather a process of progressing toward greater levels of competence as new knowledge is linked to existing knowledge, and as new understandings build on and replace earlier, naïve conceptions.

This chapter begins with a brief overview of the principal influences on the committee’s thinking about assessment. It concludes with a summary of the work of two design teams that used the strategies and tools outlined in this report to develop assessment frameworks around two scientific ideas: atomic-molecular theory and the concepts underlying evolutionary biology and natural selection.

The chapter does not offer a comprehensive examination of test design, nor a how-to manual for building a test; a number of excellent books provide that kind of information (see, for example, Downing and Haladyna, in press; Irvine and Kyllonen, 2002). Rather, the purpose of this chapter is to help those concerned with the design of science assessments to conceptualize the process in ways that



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

Systems for State Science Assessment

may be somewhat different from their current thinking. The committee emphasizes that in reshaping their approaches to assessment design states should, at all times, adhere to the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and the National Council on Measurement in Education, 1999).

DEVELOPMENTAL APPROACH TO ASSESSMENT

A developmental approach to assessment is the process of monitoring students’ progress through an area of learning over time so that decisions can be made about the best ways to facilitate their further learning. It involves knowing what students know now, and what they need to know in order to progress. This approach to assessment uses a learning progression (see Chapter 3), or some other continuum, to provide a frame of reference for monitoring students’ progress over time.¹

Box 5-1 is an example of a science progress map, a continuum that describes in broad strokes a possible path for the development of science understanding over the course of 13 years of education. It can also be used for tracking and reporting students’ progress in ways that are similar to those used by physicians or parents for tracking changes in height and weight over time (see Box 5-2).

Box 5-3 illustrates another conception of a progress map for science learning. The chart that accompanies it describes expectations for student attainment at each level along the continuum in four domains of science subject matter: Earth and Beyond (EB); Energy and Change (EC); Life and Living (LL); and Natural and Processed Materials (NPM). The creators of this learning progression (and the committee) emphasize that any conception of a learning continuum is always hypothetical and should be continuously verified and refined by empirical research and the experiences of master teachers who observe the progress of actual students.
¹ These may also be referred to as progress variables, progress maps, developmental progress maps, or strands.

BOX 5-1 Science Progress Map

Level 5: Interprets experimental data involving several variables. Interrelates information represented in text, graphs, figures, diagrams. Makes predictions based on data and observations. Demonstrates a growing understanding of more advanced scientific knowledge and concepts (e.g., calorie, chemical change).
Level 4: Demonstrates an understanding of intermediate scientific facts and principles and applies this in designing experiments and interpreting data. Interprets figures and diagrams used to convey scientific information. Infers relationships and draws conclusions by applying facts and principles, especially from the physical sciences.
Level 3: Has a grasp of experimental procedures used in science, such as designing experiments, controlling variables, and using equipment. Identifies the best conclusions drawn from data on a graph and the best explanation for observed phenomena. Understands some concepts in a variety of science content areas, including the Life, Physical, Earth, and Space Sciences.
Level 2: Exhibits a growing knowledge in the Life Sciences, particularly human biological systems, and applies some basic principles from the Physical Sciences, including force. Also displays a beginning understanding of some of the basic methods of reasoning used in science, including classification, and interpretation of statements.
Level 1: Knows some general scientific facts of the type that can be learned from everyday experiences. For example, exhibits some rudimentary knowledge concerning the environment and animals.

SOURCE: LaPointe, Mead, and Phillips (1989). Reprinted by permission of the Educational Testing Service.

A developmental approach implies the use of multiple sources of information, gathered in a variety of contexts, that can help shed light on student progress over time. These approaches can take a variety of forms ranging from large-scale externally developed and administered tests to informal classroom observations and conversations, or any of the many strategies described throughout this report. Some of the measures could be standardized and thus provide comparable information about student achievement that could be used for accountability purposes; others might only be useful to a student and his or her classroom teacher.

A developmental approach provides a framework for thinking about what to assess and when particular constructs might be assessed, and how evidence of understanding would differ as students gain more content knowledge, higher-order and more complex thinking skills, and greater depth of understanding about the concepts and how they can be applied in a variety of contexts.
For example, kinetic molecular theory is a big idea that does not usually appear in state standards or assessments until high school. However, important concepts that are essential to understanding this theory should develop earlier. Champagne et al. (National Assessment Governing Board, 2004)² provide the following illustration of how early understandings underpin more sophisticated ways of understanding big ideas.

Children observe water “disappearing” from a pan being heated on the stove and water droplets “appearing” on the outside of glasses of ice water. They notice the relationships between warm and cold and the behavior of water. They develop models of water, warmth, and cold that they use to make sense of their observations. They reason that the water on the outside of the glass came from inside the glass. But their reasoning is challenged by the observation that droplets don’t form on a glass of water that is room temperature. Does the water really disappear? If so, where did the water droplets come from when a cover is put on the pot, and why doesn’t the water continue disappearing when the cover is on? These observations, models of matter, warmth and cold, are foundations of the sophisticated understandings of kinetic-molecular theory. Water is composed of molecules, they are in motion, and some have sufficient energy to escape from the surface of the water. This model of matter allows us to explain the observation that water evaporates from open containers. Understanding temperature as a measure of the average kinetic energy of the molecules provides a model for explaining why the rate at which water evaporates is temperature dependent. The higher the temperature of water the greater is the rate of evaporation.

This simple description illustrates that at different points along the learning continuum the understandings and skills that need to be addressed through instruction and assessed are fundamentally different.

² Available at http://www.nagb.org/release/iss_paper11_22_04.doc.

BOX 5-2 Details for a Progress Map

Level 5: Interprets experimental data involving several variables. Interrelates information represented in text, graphs, figures, diagrams. Makes predictions based on data and observations. Demonstrates a growing understanding of more advanced scientific knowledge and concepts (e.g., calorie, chemical change).
Level 4: Demonstrates an understanding of intermediate scientific facts and principles and applies this in designing experiments and interpreting data. Interprets figures and diagrams used to convey scientific information. Infers relationships and draws conclusions by applying facts and principles, especially from the physical sciences.
Level 3: Has a grasp of experimental procedures used in science, such as designing experiments, controlling variables, and using equipment. Identifies the best conclusions drawn from data on a graph and the best explanation for observed phenomena. Understands some concepts in a variety of science content areas, including the Life, Physical, Earth, and Space Sciences.
Level 2: Exhibits a growing knowledge in the Life Sciences, particularly human biological systems, and applies some basic principles from the Physical Sciences, including force. Also displays a beginning understanding of some of the basic methods of reasoning used in science, including classification, and interpretation of statements.
Level 1: Knows some general scientific facts of the type that can be learned from everyday experiences. For example, exhibits some rudimentary knowledge concerning the environment and animals.

NOTE: This represents a learning progression for science literacy over 13 years of instruction. The arrow on the left indicates increasing expertise. The center of the progression provides a general description of the kinds of understandings and practices that students at each level would demonstrate. To be of use for assessment development, these descriptions must be broken down more specifically.

SOURCE: LaPointe et al. (1989). Reprinted by permission of the Educational Testing Service.

INFLUENCES ON THE COMMITTEE’S THINKING

The committee drew on a variety of sources in thinking about the design of developmental science assessments, including the work of the design teams described in Chapter 2 and those described below.
We also reviewed work conducted by a variety of others interested in this type of assessment (Wiggins and McTighe, 1998; CAESL, 2005; Wilson, 2005; Wilson and Sloane, 2000; Wilson and Draney, 2004), the work of the Australian Council for Educational Research (Masters and Forster, 1996), and the work that guided the creation of the strand maps included in the Atlas of Science Literacy (AAAS, 2001).³

³ See Figure 4-1.

BOX 5-3 Elaborated Progress Map for Energy and Change

Progress Map for Science Learning

Elaborated Framework: Science > Earth and Beyond, Energy and Change, Life and Living, Natural and Processed Materials

Levels: F = Foundation; 1 to 8 = Level 1 to Level 8.

Earth and Beyond (EB)
Students understand how the physical environment on Earth and its position in the universe impact on the way we live.
The student:
EB F: Attends and responds to local environmental features.
EB 1: Understands that easily observable environmental features, including the sun and moon, may influence life.
EB 2: Understands how some changes in the observable environment, including the sky, influence life.
EB 3: Understands changes and patterns in different environments and space, and relates them to resource use.
EB 4: Understands processes that can help explain and predict interactions and changes in physical systems and environments.
EB 5: Understands models and concepts that explain Earth and space systems and that resource use is related to the geological and environmental history of the Earth and universe.
EB 6: Understands how concepts and principles are used to explain geological and environmental change in the Earth and large-scale systems in the universe.
EB 7: Uses concepts and theories in relating molecular and microscopic processes and structures to macroscopic effects within and between Earth and space systems and understands that these systems are dynamic.
EB 8: Uses concepts, models, and theories to understand holistic effects and implications involving cycles of change or equilibrium within Earth and space systems.

Energy and Change (EC)
Students understand the scientific concept of energy and explain that energy is vital to our existence and to our quality of life.
The student:
EC F: Demonstrates an awareness that energy is present in daily life.
EC 1: Understands that energy is required for different purposes in life.
EC 2: Understands ways that energy is transferred and that people use different types of energy for different purposes.
EC 3: Understands patterns of energy use and some types of energy transfer.
EC 4: Understands that energy interacts differently with different substances and that this can affect the use and transfer of energy.
EC 5: Understands models and concepts that are used to explain the transfer and transformation of energy in an energy interaction.
EC 6: Understands the principles and concepts used to explain the transfer and transformation of energy that occurs in energy systems.
EC 7: Understands the relationships between components of an energy transfer and transformation system and predicts the effects of change.
EC 8: Applies conceptual and theoretical frameworks to evaluate relationships between components of an energy system and to systems as a whole.

Life and Living (LL)
Students understand their own biology and that of other living things, and recognize the interdependence of life.
The student:
LL F: Recognizes their personal features and communicates basic needs.
LL 1: Understands that people are living things, have features, and change over time.
LL 2: Understands that needs, features, and functions of living things are related and change over time.
LL 3: Understands that living things have features that form systems which determine their interaction with the environment.
LL 4: Understands that systems can interact and that such interactions can lead to change.
LL 5: Understands the models and concepts that are used to explain the processes that connect systems and lead to change.
LL 6: Understands the concepts and principles used to explain the effects of change on systems of living things.
LL 7: Uses concepts and ideas and understands theories in relating structures and life functions to survival within and between systems.
LL 8: Applies their understanding of concepts, models, and theories to interpret holistic systems and the processes involved in the equilibrium and survival of these systems.

Natural and Processed Materials (NPM)
Students understand that the structure of materials determines their properties and that the processing of raw materials results in new materials with different properties and uses.
The student:
NPM F: Explores and responds to materials and their properties.
NPM 1: Understands that different materials are used in life and that materials can change.
NPM 2: Understands that materials have different uses and different properties and undergo different changes.
NPM 3: Understands that properties, changes, and uses of materials are related.
NPM 4: Understands that properties, changes, and uses of materials are related to their particulate structure.
NPM 5: Understands the models and concepts that are used to explain properties from their microscopic structure.
NPM 6: Understands the concepts and principles used to explain physical and chemical change in systems and families of chemical reactions.
NPM 7: Uses interrelated concepts to explain and predict chemical processes and relationships between materials and families of materials. They use atomic and symbolic concepts in their explanations of macroscopic evidence.
NPM 8: Chooses appropriate theoretical concepts and principles and uses them to conceptualize a framework or holistic understanding in order to explain properties, relationships, and changes to materials.

SOURCE: Western Australia Curriculum Council. Reprinted by permission.

The Assessment Triangle

Measurement specialists describe assessment as a process of reasoning from evidence, of using a representative performance to infer a wider set of skills or knowledge. The process of collecting evidence to support inferences about what students know is fundamental to all assessments, from classroom quizzes, standardized achievement tests, or computerized tutoring programs, to the conversations students have with their teachers as they work through an experiment (Mislevy, 1996). The NRC’s Committee on the Cognitive Foundations of Assessment portrayed this process of reasoning from evidence in the form of what it called the assessment triangle (NRC 2001b, pp. 44–51) (see Figure 5-1).

The triangle rests on cognition, a “theory or set of beliefs about how students represent knowledge and develop competence in a subject domain” (NRC, 2001b, p. 44). In other words, the design of the assessment begins with specific understanding not only of which knowledge and skills are to be assessed but also of how understanding develops in the domain of interest. This element of the triangle links assessment to the findings about learning discussed in Chapter 2. In measurement terminology, the aspects of cognition and learning that are the targets for the assessment are referred to as the construct.

A second corner of the triangle is observation, the kinds of tasks that students would be asked to perform that could yield evidence about what they know and can do. The design and selection of the tasks need to be tightly linked to the specific inferences about student learning that the assessment is meant to support. It is important to note here that although there are a variety of questions that some kinds of assessment could answer, an explicit definition of the questions about which information is needed must play a part in the design of the tasks.

The third corner of the triangle is interpretation, the methods and tools used to reason from the observations that have been collected.
The method used for a large-scale standardized test might be a statistical model, while for a classroom assessment it could be a less formal, more practical method of drawing conclusions about student understanding based on the teacher’s experience. This vertex of the triangle also may be referred to as the measurement model.

FIGURE 5-1 The assessment triangle.

The purpose of presenting these three elements in the form of a triangle is to emphasize that they are interrelated. In the context of any assessment, each must make sense in terms of the other two for the assessment to produce sound and meaningful results. For example, the questions that dictate the nature of the tasks students are asked to perform should grow logically from an understanding of the ways learning and understanding develop in the domain being assessed. Interpretation of the evidence produced should, in turn, supply insights into students’ progress that match up with those same understandings. Thus, the process of designing an assessment is one in which specific decisions should be considered in light of each of these three elements.

From Concept to Implementation

The assessment triangle is a concept that describes the nature of assessment, but it needs elaboration to be useful for constructing measures. Using the triangle as a foundation, several different researchers have developed processes for assessment development that take into account the logic that underlies the assessment triangle. These approaches can be used to create any type of assessment, from a classroom assessment to a large-scale state testing program. They are included here to illustrate the importance of using a systematic approach to assessment design in which consideration is given from the outset to what is to be measured, what would constitute evidence of student competencies, and how to make sense of the results. A systematic process stands in contrast to what the committee found as a typical strategy for assessment design.
These more common approaches tend to focus on the creation of “good items” in isolation from all other important facets of design.

Evidence-Centered Assessment Design

Mislevy and colleagues (see, for example, Almond, Steinberg, and Mislevy, 2002; Mislevy, Steinberg, and Almond, 2002; and Steinberg et al., 2003) have developed and used an approach, evidence-centered assessment design (ECD), for the construction of educational assessment that is based on evidentiary argument. The general form of the argument that underlies ECD (and the assessment triangle discussed above as well as the Wilson construct mapping process discussed below) was outlined by Messick (1994, p. 17):

A construct-centered approach would begin by asking what complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by

M   Mass. Student knows that floating depends on having less mass.
V   Volume. Student knows that floating depends on having more volume.
    To progress to the next level, student needs to recognize that changing EITHER mass OR volume will affect whether an object sinks or floats.
UF  Unconventional Feature. Student thinks that floating depends on an unconventional feature, such as shape, surface area, or hollowness.
    To progress to the next level, student needs to rethink their ideas in terms of mass and/or volume. For example, hollow objects have a lot of volume but not a lot of mass.
OT  Off Target. Student does not attend to any property or feature to explain floating.
    To progress to the next level, student needs to focus on some property or feature of the object in order to explain why it sinks or floats.
NR  No Response. Student did not attempt to answer.
    To progress to the next level, student needs to respond to the question.
X   Unscorable. Student gave a response, but it cannot be interpreted for scoring.

SOURCE: http://www.caesl.org/conference/Progress_Guides.pdf. Reprinted by permission of the Center for Assessment and Evaluation of Student Learning.

In large-scale assessment programs it is typical for state personnel to decide on the measurement model that will be used, in consultation with the test development contractor. Most often it will be either classical test theory or one of the IRT models. Other models are available (see, for example, Chapter 4 of NRC [2001b] for a recent survey), although these have mainly been confined to research studies rather than large-scale applications. The decision about which measurement model to use is generally based on information provided by the state about the inferences it wants to support with test results, and on the model the contractor typically uses for accomplishing similar goals.
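
To make the reference to IRT models concrete, the following is a minimal sketch of the simplest member of that family, the one-parameter logistic (Rasch) model, in which the probability of a correct response depends only on the difference between a student's ability and an item's difficulty. The report does not single out this model; the function name, the logit scale, and the ability and difficulty values below are assumptions made purely for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    Both arguments are assumed to lie on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical abilities and a hypothetical item difficulty of 0.0,
# chosen only to show how the probability changes with ability.
for ability in (-1.0, 0.0, 1.0):
    p = rasch_probability(ability, difficulty=0.0)
    print(f"ability={ability:+.1f}  P(correct)={p:.2f}")
```

In this sketch a student whose ability equals the item's difficulty has a probability of exactly 0.5 of answering correctly, and the probability rises as ability exceeds difficulty. Operational testing programs typically estimate abilities and difficulties jointly from response data, which this example does not attempt.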

BOX 5-9 Delaware Scoring Rubric for Chemical Tests

Question I: Your mixture is made with three chemicals that you have worked with in this unit. You may not have the same mixture as your neighbor. Using two or more senses, observe your unknown mixture. List at least three physical properties you observed. Do not taste the mixture.

This question measures students’ ability to observe and record the physical properties of a mixture.

Criterion for a complete response: Identifies and records three different physical properties using two or more senses, e.g., feels soft, like a powder, bumpy, white, has crystals, etc.

Code  Response

Complete response
20    Response meets criterion above.
21    Lists three properties and includes one specific substance, e.g., sugar.

Partially correct response
10    Records two different physical properties using one or more senses.
11    Records two different physical properties using one or more senses, plus adds the name of a chemical (substance).

Incorrect response
70    Records one physical property.
71    Identifies a substance (sugar) rather than any properties.
79    Any other incorrect response.

Non response
90    Crossed out, erased, illegible, incomplete, or impossible to interpret.
99    BLANK

SOURCE: http://www.scienceassessment.org/pdfxls/chemicaltest/oldpdfs/A6.18.pdf.

EVALUATING THE COGNITIVE VALIDITY OF ASSESSMENT

Educators, policy makers, students, and the public want to know that the inferences that are drawn from the results of science tests are justified. To address the cognitive validity of science achievement tests, Shavelson and colleagues (Ayala, Yin, Shavelson, and Vanides, 2002; Ruiz-Primo, Shavelson, Li, and Schultz, 2001) have developed a strategy for analyzing science tests to ascertain what they are measuring. The same process can be used to analyze state standards and to compare what an assessment is measuring with a state’s goals for student learning.
At the heart of the process is a heuristic framework for conceptualizing the construct of science achievement as comprising four different but overlapping types of knowledge. The knowledge types are:

Declarative knowledge is knowing what: for example, knowledge of facts, definitions, or rules.
Procedural knowledge is knowing how: for example, knowing how to solve an equation, perform a test to identify an acid or base, design a study, or identify the steps involved in other kinds of tasks.
Schematic knowledge is knowing why (for example, why objects sink or float, or why the seasons change) and includes principles or other mental models that can be used to analyze or explain a set of findings.
Strategic knowledge is knowing how and when to apply one’s knowledge in a new situation or when assimilating new information: for example, developing problem-solving strategies, setting goals, and monitoring one’s own thinking in approaching a new task or situation.

Using an examinee–test interaction perspective to explain how students bring and apply their knowledge to answer test questions, the researchers developed a method for logically analyzing test items and linking them to the achievement framework of knowledge types (Li, 2001). Each item on a test goes through a series of analyses designed to ascertain whether the item will elicit responses that are consistent with what the assessment is intended to measure and whether those responses can be interpreted to support the inferences that the assessor hopes to draw from the results.

Li, Shavelson, and colleagues (Li, 2001; Shavelson and Li, 2001; Shavelson et al., 2004) applied this framework in analyzing the science portions of the Third International Mathematics and Science Study-Repeat (TIMSS-R) (Population 2) and the Delaware Student Testing Program. They found that both tests were heavily weighted on declarative knowledge (almost 60 percent). The remaining items were split between procedural and schematic knowledge.
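
The bookkeeping behind this kind of analysis can be sketched in a few lines: once each item has been assigned a knowledge type by a reviewer, the shares are tallied and can be set beside the shares found in a set of standards. This is only an illustrative sketch, not the researchers' procedure. The function names are hypothetical, and the item and standards classifications are invented for the example rather than taken from the studies cited.

```python
from collections import Counter

# The four knowledge types named in the achievement framework.
KNOWLEDGE_TYPES = ("declarative", "procedural", "schematic", "strategic")


def knowledge_profile(item_types):
    """Return the share of items assigned to each knowledge type."""
    unknown = set(item_types) - set(KNOWLEDGE_TYPES)
    if unknown:
        raise ValueError(f"unrecognized knowledge types: {sorted(unknown)}")
    if not item_types:
        raise ValueError("at least one classified item is required")
    counts = Counter(item_types)
    return {k: counts[k] / len(item_types) for k in KNOWLEDGE_TYPES}


def profile_gap(test_profile, standards_profile):
    """Test share minus standards share for each knowledge type."""
    return {k: test_profile[k] - standards_profile[k] for k in KNOWLEDGE_TYPES}


# Invented classifications for a ten-item test (not data from the studies).
test_items = ["declarative"] * 6 + ["procedural"] * 2 + ["schematic"] * 2
# Invented classifications for ten standards statements.
standards = ["declarative"] * 3 + ["procedural"] * 2 + ["schematic"] * 4 + ["strategic"]

test_profile = knowledge_profile(test_items)
gaps = profile_gap(test_profile, knowledge_profile(standards))
for k in KNOWLEDGE_TYPES:
    print(f"{k:<12} test={test_profile[k]:.0%}  gap={gaps[k]:+.0%}")
```

A positive gap marks a knowledge type that the test emphasizes more than the standards do, and a negative gap marks one that is underrepresented on the test. The judgment that assigns each item to a knowledge type is the substantive part of the method and is not captured here.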
The researchers also analyzed the Delaware science content standards using the achievement framework and found that the state standards were more heavily weighted toward schematic knowledge than was the assessment, indicating that the assessment did not adequately represent the cognitive priorities contained in the state standards. These findings led to changes in the state testing program and the development of a strong curriculum-connected assessment system for improvement of student learning to supplement the state test and provide additional information on students’ science achievement (personal communication, Rachel Wood).

BUILDING DEVELOPMENTAL ASSESSMENT AROUND LEARNING

The committee commissioned two design teams that included scientists, science educators, and experts with knowledge of how children learn science to suggest ways of using research on children’s learning to develop large-scale assessments at the national and state levels, and classroom assessments that were coherent with them. The teams were asked to consider the ways in which tools and strategies drawn from research on children’s learning could be used to develop new approaches to elaborating standards and to designing and interpreting assessments.

Each team was asked to lay out a learning progression for an important theory or big idea in the natural sciences. The learning progression was to be based on experimental studies, cognitive theory, and logical analysis of the concepts, principles, and theory. The teams were asked to consider ways in which the learning progression could be used to construct strategies for assessing students’ understanding of the foundations for the theory, as well as their understanding of the theory itself. The assessment strategies (if they developed them) were to be developmental, that is, to test students’ progressively more complex understanding of the various layers of the theory’s foundation in a sequence in which cognitive science suggests it reasonably can be expected to develop. The work of these two groups is summarized below. Copies of their papers can be obtained at http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html.

Implications of Research on Children’s Learning for Assessment: Matter and Atomic-Molecular Theory⁹

This team used research on children’s learning about the nature of matter and materials, how matter and materials change, and the atomic structure of matter¹⁰ to illustrate a process for developing assessments that reflect research on how students learn and develop understanding of these scientific concepts.
Their first step was to organize the key concepts of atomic-molecular theory around six big ideas that form two major clusters: the first two form a macroscopic level cluster and the last four form an atomic-molecular level cluster (Box 5-10 provides further detail on these concepts). The atomic-molecular theory elaborates on the macroscopic big ideas studied earlier and provides deeper explanatory accounts of macroscopic properties and phenomena.

Using research on children’s learning, the team identified pathways (learning progressions) that would trace the path that children might follow as instruction helps them move from naïve ideas to more sophisticated understanding of atomic-molecular theory. The group noted that research points to the challenges inherent in moving through the progressions, as they involve macroscopic understandings of materials and substances as well as nanoscopic understandings of atoms and molecules. Box 5-11 contains these progressions as they were conceived by this design team.

9   Paper prepared for the committee by Carol Smith, Marianne Wiser, Andy Anderson, Joe Krajcik, and Brian Coppola (2004).

10   These ideas are represented in both the Benchmarks for Science Literacy (AAAS, 1993) and National Science Education Standards (NRC, 1996).

BOX 5-10
Atomic-Molecular Theory

Children’s ability to appreciate the power of the atomic theory requires a number of related understandings about the nature of matter and material kinds, how matter and materials change, and the atomic structure of matter. These understandings are detailed in the standards documents. Smith et al. (2004) organize them around six big ideas that form two major clusters: the first two form a macroscopic level cluster and the last four form an atomic-molecular level cluster. The first cluster is introduced in the earliest grades and elaborated throughout schooling. The second is introduced in middle school and elaborated throughout middle school and high school. The atomic-molecular theory elaborates on the macroscopic big ideas studied earlier and provides deeper explanatory accounts of macroscopic properties and phenomena.

Six Big Ideas of Atomic-Molecular Theory That Form Two Major Clusters

M1. Macroscopic properties: We can learn about the objects and materials that constitute the world through measurement, classification, and description according to their properties.

M2. Macroscopic conservation: Matter can be transformed, but not created or destroyed, through physical and chemical processes.

AM1. Atomic-molecular theory: All matter that we encounter on Earth is made of fewer than 100 kinds of atoms, which are commonly bonded together in molecules and networks.

AM2. Atomic-molecular explanation of materials: The properties of materials are determined by the nature, arrangement, and motion of the atoms and molecules of which they are made.

AM3. Atomic-molecular explanation of transformations: Changes in matter involve both changes and underlying continuities in atoms and molecules.

AM4. Distinguishing data from atomic-molecular explanations: The properties of and changes in atoms and molecules have to be distinguished from the macroscopic properties and phenomena they account for.

SOURCE: Smith et al. (2004).

The team offers the following caveats about this progression. First, learning progressions are not inevitable and there is no one correct order: as children learn, many changes are taking place simultaneously in multiple, interconnected ways, not necessarily in the constrained and ordered way depicted in a learning progression. Second, any learning progression is inferential or hypothetical, as there are no long-term studies of actual children learning a particular concept, and describing students’ reasoning is difficult because different researchers have used different methods and conceptual frameworks.

BOX 5-11
The Concepts and Foundational Ideas Associated with Atomic-Molecular Theory Illustrate a Possible Learning Progression

Experiences with a wider range of materials and phenomena. Children extend the range of their experiences with materials, properties of materials, and changes in materials. New experiences often help them to see the limitations of their earlier ideas and to accept new ideas that account for a wider range of phenomena.

Increasing sophistication in describing, measuring, and classifying materials. Children learn about the limits of their sense impressions and master the use of a wider range of instruments to measure and classify properties of materials and changes in materials. They become aware of properties of materials that are not revealed by casual observation and learn to measure them. They also become aware of the composition of many materials, understanding that even homogeneous materials are mixtures of substances, including different elements and compounds.

Development of causal accounts focusing on matter and mass. Children move from explanations of changes as events caused by conditions or circumstances to explanations that focus on mechanisms of change and on tracing substances through changes. They come to appreciate that mass is a fundamental measure of the amount of matter, so that changes in mass must be accounted for in terms of matter entering or leaving a system. They learn that gases are forms of matter like solids and liquids; thus gases have mass and can be used to account for otherwise unexplainable mass changes.

Increasing theoretical depth. Children develop accounts of properties of matter and changes in matter that make increased use of hidden mechanisms and atomic-molecular theory. They are increasingly able to make use of all six big ideas (listed above) and to develop accounts that coordinate four different levels of description:

Impressions or perceptual appearances (what we see and feel) are related to

Measurable properties or variables (mass, volume, density, temperature, pressure, etc.), which are related to

Constituent materials and chemical substances, and finally to

The atoms and molecules of which those substances are composed.

Throughout elementary school, students are working to coordinate the first two levels as they develop a sound macroscopic understanding of matter and materials based on careful measurement. From middle school onward they are coordinating all four levels as they develop an understanding of the atomic-molecular theory and its broad explanatory power.

Understanding the nature and uses of scientific evidence and theories. Children learn to distinguish between data and models or theories, which can be used to account for many different observations and experiences. They become increasingly able to develop and criticize arguments that involve coordinated use of data and theories. They also become increasingly sophisticated in their understanding of sources of uncertainty and their ability to use conditional and hypothetical reasoning.

SOURCE: Smith et al. (2004).

For designing assessments to tap into students’ progress along this learning progression, the team suggested a three-stage process:

Codify the big ideas into learning performances: types of tasks or activities suitable for classroom settings through which students can demonstrate their understanding of big ideas and scientific practices.

Use the learning performances to develop clusters of assessment tasks or items, including both traditional and nontraditional items that are (a) connected to principles in the standards and (b) analyzable with psychometric tools.

Use research on children’s learning as a basis for interpretation of student responses, explaining how responses reveal students’ thinking with respect to big ideas and learning progressions.

In creating examples to illustrate their process, the team laid out its reasoning at each step in the development process (from national standards to elaborated standards to learning performances to assessment items and interpretations) and about the contributions that research on children’s learning can make at each step.
In doing so they illustrate why they believe that classroom and large-scale assessments developed using these methods will have three important qualities that are missing from most current assessments:

Clear principles for content coverage. Because the assessments are organized around big ideas embodied in key scientific practices and content, their organization and relationship to themes in the curriculum will be clear. Rather than sampling randomly or arbitrarily from a large number of individual standards, assessments developed using these methods can predictably include items that assess students’ understanding of the big ideas and scientific practices.

Clear relationships between standards and assessment items. Because the reasoning and methods used at each stage of the development process are explicit, the interpretation of standards and the relationships between standards and assessment items are clear and thus easy to examine.

Providing insights into students’ thinking. The assessments and their results will help teachers to understand and respond to their students’ thinking. For this purpose, the interpretation of student responses is critically important, and reliable interpretations require a research base. Thus, developing items that reveal students’ thinking is far easier for matter and atomic-molecular theory than it is for other topics with less extensive research bases.

While this group demonstrates the key role that research on learning can play in the design of high-quality science assessments, they note that for assessors whose primary concern is evaluation and accountability, these qualities may not seem as important as some other qualities, such as efficiency and reliability. They conclude, however, that assessments with these qualities are essential for the long-term improvement of science assessment.

Evolutionary Biology¹¹

While the importance of incorporating research findings about student learning into assessment development is widely recognized, research in many areas of science learning is incomplete. The design team that addressed evolutionary biology argued, however, that waiting for research to close all of the gaps would be unwarranted. To illustrate why waiting may not be necessary, the team developed an approach for producing inferences about student learning that applies a contemporary view of assessment and exploits learning theory.
Their approach is to use learning theory to identify more clearly what should be assessed and what tasks or conditions could provide evidence about students’ understanding, so that inferences about students’ knowledge are well founded. The approach has three components.

First, in a standards-based education system, assessment developers rely on standards to define what students should know (the constructs), yet standards often obscure the important disciplinary concepts and practices that are inherent in them. To remedy this, the team suggests that a central conceptual structure be developed around the big ideas contained in the standards as a means to clarify what it is important to assess. Many individual standards may relate to the same big idea, so focusing on big ideas is a means of condensing standards. Ideally, a big idea is revisited throughout schooling, so that a student’s knowledge is progressively refined and elaborated. This practice potentially simplifies the alignment between curriculum and assessment because both are tied to the same set of constructs. The team also advocates that big ideas be chosen with prospective pathways of development firmly in mind. They note that these pathways are sometimes available from research on learning but typically also draw on the opinions of master teachers, as well as some inspired guesswork, to bridge gaps in the research base.

Second, standards are aligned with the big ideas, so that they can be considered in the context of more central ideas. This practice is another means of pruning standards, and it is a way to develop coherence among individual standards.

Third, standards are elaborated as learning performances. As described earlier, learning performances describe specific cognitive processes and associated practices that are linked to achieving particular standards, and thus help to guide the selection of situations for gathering evidence of understanding as well as clues as to what the evidence means.

The team illustrates its approach by developing a cartography of big ideas and associated learning performances for evolutionary biology for the first eight years of schooling. The cartography traces the development of six related big ideas that support students’ understanding of evolution. The first and most important is diversity: Why is life so diverse? The other core concepts play a supporting role: (a) ecology, (b) structure-function, (c) variation, (d) change, and (e) geologic processes.

11   Paper prepared for the committee by Kefyn Catley, Brian Reiser, and Rich Lehrer (2005).
In addition to these disciplinary constructs, two essential habits of mind are included: mathematical tools that support reasoning about these big ideas, and forms of reasoning that are often employed in studies of evolution, especially model-based reasoning and comparative analysis. At each of three grade bands (K–2, 3–5, and 6–8), standards developed by the National Research Council (1996) and American Association for the Advancement of Science (1993) are elaborated to encompass learning performances. As schooling progresses, these learning performances reflect increasing coordination and connectivity among the big ideas. For example, diversity is at first simply treated as an extant quality of the living world but, over years of schooling, is explained by recourse to concepts developing as students learn about structure-function, variation, change, ecology, and geology.

The team chose this topic because of its critical and unifying role in the biological sciences and because learning about evolution requires synthesis and coordination among a network of related concepts and practices, ranging from genetics and ecology to geology, so that understanding evolution is likely to emerge across years of schooling. Thus, learning about evolution will be progressive and involve coordination among otherwise discrete disciplines (by contrast, one could learn about ecology or geology without considering their roles in evolution). Unlike other areas in science education, evolution has not been thoroughly researched. The domain presents significant challenges for those who wish to describe the pathways through which learning in this area might develop and that could guide assessment. Thus, evolution served as a test-bed for the approach.

CONCLUSIONS

Designing high-quality science assessments is an important goal, but a difficult one to achieve. As discussed in Chapter 3, science assessments must target the knowledge, skills, and habits of mind that are necessary for science literacy, and must reflect current scientific knowledge and understanding in ways that are accurate and consistent with the ways in which scientists understand the world. They must assess students’ understanding of science as a content domain and their understanding of science as an approach. They must also provide evidence that students can apply their knowledge appropriately and that they are building on their existing knowledge and skills in ways that will lead to more complete understanding of the key principles and big ideas of science.

Adding to the challenge, competence in science is multifaceted and does not follow a singular path. Competency in science develops more like an ecological succession, with changes taking place simultaneously in multiple interconnected ways. Science assessment must address these complexities while also meeting professional technical standards for reliability, validity, and fairness for the purposes for which the results will be used.

The committee therefore concludes that the goal of developing high-quality science assessments will be achieved only through the combined efforts of scientists, science educators, developmental and cognitive psychologists, experts on learning, and educational measurement specialists working collaboratively rather than separately. The experience of the design teams described in this chapter and multiple findings of other NRC committees (NRC, 1996, 2001b, 2002) support this conclusion.
Commercial test contractors do not generally have the advantage of these diverse perspectives as they create assessment tools for states. It is for this reason that we suggest in the next chapter that states create their own content-specific advisory boards to assist state personnel who are assigned to work with the contractors. These bodies can advise states on the appropriateness of assessment strategies and the quality and accuracy of the items and tasks included on any externally developed tests.

QUESTIONS FOR STATES

This chapter has described ways of thinking about the design of science assessments that can be applied to assessments at all levels of the system. We offer the following questions to guide states in evaluating their approaches to the development of science assessments:

Question 5-1: Have research and expert professional judgment about the ways in which students learn science been considered in the design of the state’s science assessments?

Question 5-2: Have the science assessments and tasks been created to shed light on how well and to what degree students are progressing over time toward more expert understanding?