6
Research Needs

THEORY AND GOALS

There is wide acceptance of the value of a system in which assessments measure student progress in meeting education standards and the test results are used to hold students, schools, educators, and jurisdictions to account for their performance. But, as Lorrie Shepard described in her closing remarks, two very different theories of action have been put forward regarding the way such a system will actually bring about improvements. Neither the differences between them nor the implications of adopting one or the other have been widely recognized, she added. One, the incentives theory, is that given sufficient motivation, teachers and other school personnel will develop ways to improve instruction. This perspective was the basis for the Elementary and Secondary Education Act of 1994, which required states to establish standards and assessments.

The other approach, which Shepard called the coherent capacity building theory, posited that an additional step, beyond establishing clear expectations and the motivation to meet them, was needed. Educators would also need the capacity, in the form of professional development and other supports, to improve their teaching in order for the accountability measures to have the desired effect (see, e.g., National Research Council, 1995). Shepard believes that the incentives theory is dominant, and that capacity building has consequently been neglected.

Similar imprecision is evident in the possible interpretations of some of the top reform goals of the present moment, Shepard suggested, including



BEST PRACTICES FOR STATE ASSESSMENT SYSTEMS, PART I

• reforming assessments using conceptually rich tasks,
• integrating 21st-century skills and academic content,
• creating coherence between large-scale and classroom assessments, and
• using data to improve classroom instruction.

For example, treating the first two bullets as distinct enterprises makes little sense, given that a large body of research indicates the importance of teaching content and higher-order thinking skills together. Shepard believes that policy makers do not completely understand that effective teaching relies on a model for how learning proceeds, in which cognitive skills and the knowledge of when and how to use them develop together with content knowledge and understanding of how to generalize from it. She cautioned that, without this theory of learning, policy makers are likely to accept current modes of assessment. They may believe, for example, that narrowing the curriculum is necessary because basic reading and mathematics skills are so important. They may not be aware that excessive drill on worksheets that resemble summative tests does not give students the opportunity to understand the context and purpose for what they are learning, which would enhance their skill development (see Elmore, 2003; Blanc et al., 2010; Bulkley et al., 2010; Olah, Lawrence, and Riggan, 2010). Similarly, although policy makers are in favor of data-driven decision making, Shepard believes, many educators lack the substantive expertise to interpret the available data and use it to make meaningful changes in their practice.

During the workshop discussions, many presenters drew attention to the churning that affects education policy because of shifts in political goals and personnel at the state level. Given that reality, coherence will have to come at a lower level, Shepard argued.
The United States does not have a common curriculum, she suggested, because it has no tradition of relying on subject-matter experts in many decisions about education. Psychometricians and policy makers have typically taken the lead in the development of assessments, for example; subject-matter experts have generally been involved in some way, but are not usually asked to oversee the development of frameworks, item development, and the interpretation of results. Now, however, the interests of subject-matter experts and cognitive researchers who have been developing models of student learning within particular disciplines have converged, and this convergence offers the possibility of a coherence that could withstand the inevitable fluctuations in political interests. However, the practical application of this way of thinking about learning is not yet widely understood, Shepard observed.

Thus, for Shepard, the opportunity of the present moment is to take the first steps in inventing and implementing the necessary innovations. It is not practical to expect that any one state or consortium could develop an ideal system for all grades and subject areas on the first try, so the focus should be on incremental improvements. She thinks that each consortium grant award should be focused on the development of a system of “next-generation, high-quality” classroom and summative assessments for one manageable area (say, just for mathematics, grades 4 through 8).

She noted that Lauren Resnick had proposed a way of implementing innovative approaches incrementally. Resnick suggested that content-based “modules” that incorporate both a rich curriculum and associated assessments could be adopted one by one and incrementally incorporated into an existing full curriculum. In the near term, this would leave existing assessments unchanged, but, over time, the accumulating body of new modules would eventually lead to a completely transformed system, in which accountability information could be drawn from the assessment components of the innovative curriculum modules. This approach would allow educators to proceed gradually, as the research to support the development of such modules grows, and also to sidestep many of the political and practical challenges that have hampered past programs.

Shepard also emphasized the importance of considering curriculum along with new and improved assessment models. She cautioned that establishing higher standards does not mean only setting cut-points at a higher level, but also incorporating material of a substantively different character into assessments. If this is done without corresponding changes to curriculum and instruction, the result will be predictable: students will not be likely to succeed on the new assessment.
In the end, after all, the purpose of the improvements is to “change the character of what we teach and then make those opportunities available to all students and make sure that the assessment can track any changes over time.”

Shepard closed by reminding everyone that “to truly transform learning opportunities in classrooms in ways that research indicates are possible, it will be necessary to remove policy structures, especially low-level tests that misdirect effort; provide coherent curricula consistent with ambitious reforms; and take seriously the need for capacity-building at every level of the education system.”

RESEARCH PRIORITIES

Shepard and other discussants were asked to reflect on their highest priorities for research that would support progress in developing and implementing innovative assessments. Many of the ideas overlapped, and they fell into a few categories: measurement, content, teaching and learning, and experimentation.

Measurement

Many participants emphasized the need for psychometric models that were developed generations ago to be updated in light of recent research on learning and cognition. New ways of thinking about what should be measured and what sort of information would be useful to educators have been put forward, as described in Chapter 2, and it is clear that current psychometric models do not fit them well. These cognitive models illustrate, for example, the importance of each of the stages that students go through in learning complex material. This idea implies that teachers (and students) need information about students’ developing understanding of concepts and facts and how they fit into a larger intellectual structure. Yet educational measurement has tended to focus on one-dimensional rankings according to students’ mastery of specific knowledge and skills at a given time. The goals of traditional psychometrics remain important, but perhaps need to be expanded.

Means of establishing the validity of new kinds of assessments for new kinds of uses are needed. Discussants pointed to the need for a strategy for making sure that the information an assessment provides is being used to good effect and a strategy for checking the links in a proposed learning trajectory, to be sure each stage in the progression is reasonable and well supported.

The capacity to compare results across assessments is already being stretched, and the introduction of more innovative modes of assessment may present challenges that cannot be solved with current procedures. But the policy demand for comparative information suggests a need for new thinking about the precise questions that are important and the kinds of information that can provide satisfactory answers. Other fields, one participant noted, have grappled with similar issues. In medicine, for example, simulations are used in credentialing assessments, despite the lack of procedures for equating precisely across assessments. It would be worthwhile to explore the decisions that the medical profession made and their outcomes. It may be, for example, that the technical standards for modes of assessment could vary somewhat, according to the intended purpose to which the results will be put.
A final thought offered on measurement was that the measurement community should be conducting the sort of basic research that addresses not only immediate problems, but also the challenges and technological changes that are likely to emerge a decade from now. Some participants responded that the capacity of the testing profession is already being stretched and that there is little time for this kind of thinking, while others stressed the importance of addressing that problem.

Content

The measurement community may need to catch up with advances in cognitive research, but the overall picture of what students should learn is hardly complete. Deeper cognitive analysis of the content to be taught and assessed is still needed. Detailed learning trajectories have been put forward in a few areas of science and mathematics, but much remains to be done. Understanding of the barriers to advancing along a trajectory, and of the efficacy of different approaches to teaching students to overcome those barriers, is in the beginning stages, and other subjects remain far behind science. Without a much broader base of research on these questions, the progress in developing innovative assessments will be hampered. Policy makers are currently working from hypothesized trajectories of how learning in reading, English language arts, and mathematics progresses from kindergarten through grade 12. These need to be elaborated, and the field needs a plan for gathering data about the validity of the common core standards that are based on them and for improving the descriptions of the trajectories.

Teaching and Learning

An important theme of the workshop was the intimate relationship between models of measurement and models of teaching and learning. If assessments are to play the valuable role in education that many envision, they must not only align with what is known about how students think and learn, but also provide meaningful information that educators can use. And if educators are to play their part, their preparation and professional development must encompass this new thinking about assessment and the means to use it. Research is needed to support these changes.

Teachers also have much to contribute to evolving thinking about teaching and assessment. Involving them in the research will be critical to ensuring that new kinds of assessment data can really improve instruction. It is not only data that teachers need, though, some participants pointed out. Also important is their capacity to reflect on and evaluate not only their own practice and capacity to adapt, but also the value of the innovations they are asked to try. Working individually, in small groups, as whole departments, or even as schools, teachers can provide a check on such questions as the practical application of theoretical learning trajectories.

Experimentation

Several speakers noted that there is no one optimal assessment system waiting to be discovered. A range of international models offer promising possibilities and should be explored in greater detail.
The development of state consortia offers the opportunity for the education community to articulate a variety of different models and the theories that underlie them and to work out a variety of ways of addressing key system goals.

The idea that educators and policy makers should experiment on students may have negative connotations, but many participants also spoke about the critical importance of taking innovation step by step and learning from each step. In no other field, one participant pointed out, would policy makers overlook the importance of research and development to something as important as redesigning the assessment system. Ideally, the process would begin with a clear picture of the questions that need answers and the development of a strategy for researching those questions and testing hypotheses. State consortia, individual states, districts, schools, teachers, and students can all contribute to the design of new aspects of assessment systems and the important work of trying them out and collecting information about what works well and what does not.

More typical, however, has been a model in which a whole new assessment system is created and presented to the public as ready to be implemented statewide. The big risk in such an approach is that implementation problems could doom an idea with valuable potential before it had a chance to be fully implemented or that individual valuable features of the approach would be thrown out along with features that did not work.

Several participants suggested that retaining some or all of the elements of existing assessment systems, while gradually incorporating new elements, would allow for both the development of political and public acceptance and the flexibility to benefit from experience. An incremental approach may also make it possible to address different aspects of a system in a way that would be too radical to attempt for the whole. Whether the innovations are new instructional units based on core standards, in which assessment is embedded; revised curricula that better map the learning trajectories in new standards; new formats and designs for summative assessments; or some other innovation, it should be possible to gradually construct a coherent system that meets the needs for both accountability and instructional guidance.