Common Standards for K-12 Education?: Considering the Evidence - Summary of a Workshop Series

2 Considering the Status Quo

Before considering options for integrating the most successful elements of standards-based systems into a common system, the committee wished to develop an accurate picture of the way the standards already in place are operating. A look at the views of policy makers in several states provided the basis for an overview of the evolution of the role of standards in the states and highlighted some of the key challenges that remain. Analysis of the content covered in state standards and of the performance expectations that states define provides a detailed picture of the extent of variation across the states.

VIEWS FROM THE STATES

The views of state-level policy makers were the first focus. Diane Massell (2008) analyzed for the committee a series of interviews with education policy makers in five states: California, Florida, Massachusetts, North Dakota, and Texas. The purpose of Massell’s interviews was to solicit opinions from a range of experienced policy makers who have been engaged in standards-based education reform, the catchall term for measures that states have taken to improve instruction and learning by organizing both policy and practice around clear, measurable standards. Massell and her colleagues hoped to trace both common themes and insights and possible differences, as well as to flag views that may be developing in response to current events. The five states were chosen to reflect both geographical diversity and diversity of experience with
standards. California, which initiated its standards approach during the 1980s, was an early adopter, for example, whereas North Dakota adopted standards in response to federal requirements in 1994. The 21 interview subjects included officials or education aides from governors’ offices, members of state boards of education, state legislators, and state education agency officials. Massell began with a brief overview of the evolution of standards-based reforms in the states. She highlighted the current, unprecedented degree of public engagement in the specifics of implementing standards-based systems, particularly attention on the curriculum and instruction that make them concrete. She described standards-based reform as having had the effect of “opening Pandora’s box,” because it resulted in a new transparency with regard to curriculum and instruction. Massell was borrowing a phrase from a 1950s report that described districts as reluctant to allow the public to involve itself with potentially divisive questions about what and how children should be taught. Although the minimum competency movement of the 1970s—as well as lawsuits in a number of states intended to force states to equalize school funding—increased focus on schools’ accountability to states with regard to what students actually learn, the achievement bar was set relatively low, Massell explained. The standards-based reform movement that developed in response to A Nation at Risk (National Commission on Excellence in Education, 1983) expanded the role of standards, emphasizing rigorous requirements for high school graduation. 
As national organizations, such as the National Council of Teachers of Mathematics (NCTM), as well as individual states began to put forward more detailed statements of what students should be expected to know and be able to do, the concept of systemic reform, suggested by Smith and O’Day (1991), sharpened the focus on how standards might lead to the desired learning. The logic of systemic reform was that the primary elements of an educational system—such as curriculum, instruction, teacher preparation, professional development, and assessment—must all be aligned to carefully developed content and performance standards in order for those standards to affect teaching and learning. In this view, educators would still retain significant flexibility in meeting expectations but be held accountable for the results. In 1994 the reauthorization of the Elementary and Secondary Education Act made standards-based reform the official national approach to public schooling by requiring states to set challenging standards aligned to assessments and accountability measures (Massell, 2008). The testing requirements imposed by the No Child Left Behind Act (NCLB) of 2001 built on that commitment, requiring states to (1) publish challenging academic content standards in English/language arts and mathematics for
each of grades 3 through 8 and one secondary grade, as well as standards for science in three grades, and to (2) assess students in these grades and subjects annually and to hold schools accountable for the results (http://www.ed.gov/policy/elsec/leg/esea02/index.html [April 2008]). Those requirements—and the consequences imposed by the law for failing to meet them—have meant that parents and others have a significantly increased interest in the precise content of standards, curriculum, the tests used to measure proficiency, and the material covered in classrooms. Massell’s interviews found intense differences of opinion related to standards. Her interview subjects reported disagreement about how rigorous academic and performance standards are and should be, about whether measures that sharpen accountability also lead to an unacceptable narrowing of the curriculum, and about the fairness of accountability sanctions. Despite tension around a number of issues, Massell noted that the leaders she and her colleagues interviewed generally take standards-based reform and accountability for granted, viewing this approach as a “central framework guiding state education policy and practice.” Even the leaders from North Dakota, where standards were adopted largely under federal duress, viewed this approach as a part of the landscape that is not likely to change. The other four states had made a stronger commitment to standards, and the leaders from those states described them in such terms as “even more central over time” and “integral” to policy initiatives. 
Massell observed that opening issues related to curriculum and instruction to public discussion has not had the effect of killing reform, as some may have feared, and the result has been “a surprising degree of agreement regarding the meaning and purpose of education.” The North Dakota respondents were more muted than the others, however. They were less likely to see standards as “central” to policy and tended to describe the effects of standards on classroom practice as marginal. Moreover, respondents from all five states reported that the focus on standards remains variable across and within both states and districts, as do their effects on instruction and learning.

Massell explained that the interviewers asked state education leaders for their impressions regarding several aspects of standards-based reform, such as its impact on practice, learning opportunities, the quality of education, and resources. The leaders’ responses to these issues generated an array of reactions from workshop discussants and participants.

Equity

The effects of standards-based accountability systems on achievement gaps and equality of opportunity for disadvantaged students were the first
specific topic discussed in the interviews. In general, Massell reported, the state leaders believe that standards-based reform has led to:

- greater awareness of and attention to the academic performance of disadvantaged students;
- the expectation that all students will meet rigorous standards;
- reductions in achievement gaps;
- a more uniform education system (within states); and
- instruction that is tailored to the needs of individual students.

They generally agreed that increased awareness of the performance of all groups may be the most widely recognized accomplishment of standards-based reform, and particularly of the NCLB legislation. Yet both the interview subjects and the workshop participants recognized the challenges of increasing equity in education and the limitations of what has been accomplished. The gaps have not been eliminated, and most agreed that reductions thus far have been fairly modest. Massell noted that according to a study by the Center on Education Policy (2007), gaps in most states remain substantial despite reductions, and some states have seen no reductions. Urban schools—those with the largest proportions of disadvantaged students—are the least likely to be meeting NCLB performance targets. Discussant Brian Stecher reinforced the concern that improvement has been modest, pointing out that “under the threat of severe sanctions from No Child Left Behind, there is an unknown amount of inflation in test scores, and what we see in terms of gap closing on state tests is not always replicated in other low-stakes assessments.” Many participants viewed the challenge of providing a truly equitable education for disadvantaged students as a central purpose of standards-based reform. 
Capacity

The interview subjects viewed the states’ capacity to carry out all the improvements envisioned in standards-based reform as the most significant challenge to improving equity and achieving its other goals, and workshop participants were quick to agree. The reforms have stretched state agencies and districts significantly during a period in which most have been losing personnel and resources. Massell noted that Massachusetts had 325 full-time staff when its reforms were enacted into law in 1993, although it had employed 990 just 13 years earlier. Smaller staffs have been responsible for developing new standards and aligning curriculum, instruction, and assessments to them. Other technical challenges, such as measuring the progress of English language learners in a valid manner, have increased the difficulty of implementing the intended reforms.
NCLB required support of Title I schools (those serving specified percentages of low-income children) in specific ways. As growing numbers of schools and districts fall short of the NCLB performance targets, the strains on personnel are increasing. Fully 25 percent of schools across the country fell short of adequate yearly progress targets in 2004-2005, and the numbers have been increasing since then, although Massell noted that that figure masks significant variation across states. For example, Florida and Alabama reported that as many as 67 percent of their schools and 90 percent of their districts would fall short in 2008. Moreover, many states project that a cascading number of schools will be identified as underperforming in the coming years, as the law’s 2014 deadline for having 100 percent of students perform at proficient levels draws closer. Capacity is critical to making a standards-based system perform as it is intended to. One necessary component of the strategy is data analysis, since, ideally, thoughtful analysis of timely data will guide teachers as they plan instruction; administrators as they plan teaching assignments, professional development, and many other aspects of their schools; and district and state staff as they make decisions about key questions, such as curriculum planning and resource allocation. Yet as Stecher and other discussants pointed out, teachers, administrators, and policy makers frequently lack either the training or the time—or both—to use the data they receive wisely. Few teachers have been adequately trained to use data to make improvements in instruction, and the annual testing data that are the most typical product of accountability systems are not particularly useful for that purpose. 
More broadly, a number of participants stressed that standards-based accountability models provide a structure for identifying problems, but they do not directly address the challenges of bringing about better instruction. There is a risk that the standards-based reform model, and all of the testing and other time- and resource-intensive activities that are associated with it, may distract educators from one of the central challenges of reform: figuring out how to address the needs of disadvantaged students. As discussant Lynn Olson put it, one benefit of common standards could be to “force us to confront gross inequities,” although educators and the public have known for decades that disadvantaged students are not doing well.

Quality

Building on the capacity issues, participants also discussed the gaps between the ideal model and reality. Olson noted that in the evaluation of state standards recently published by Education Week, not one state earned a top score on each of the criteria used, and many scored very
poorly in a number of areas (Editorial Projects in Education, 2008). Stecher expanded on this point, arguing that very high standards are needed for the standards themselves. Because everything—including curriculum, textbooks, development of assessments, language for reporting results to the public—flows from the standards, they need not only to be clearly written and concise, but also to reflect current understanding of how children learn and their conceptual development. They also need to provide guidance about the performance criteria for determining whether students have mastered particular standards and guidance about the relative importance of the different elements included. In practice, as the Quality Counts (Editorial Projects in Education, 2008) and other evaluations attest, state standards are not yet meeting those kinds of criteria. In the absence of the guidance that standards should provide, the default source for guidance is the assessment system. As Stecher put it: “We may be drifting toward assessment-based reform, rather than standards-based reform.” Yet the standards themselves may be the best developed aspect of the evolving reform systems. Participants called attention to persistent concerns about the nature, rigor, and quality of the assessments used in many states and about the narrowing effects they can have on curriculum and instruction. For example, few states systematically provide for extensive formative assessments that teachers could use to tailor instruction to individual students’ needs. These kinds of concerns, many noted, suggest the potential benefits to states of greater uniformity among them. States could much more easily take advantage of one another’s knowledge and experience and avoid duplication of effort if they were applying consistent frameworks. 
This point was reinforced by questions about whether the multiple-standards model has yielded the consistency that was hoped for even within states. Researchers and policy makers from several states suggested that there is far more variation in both content and performance standards in practice than may be evident in states’ written plans. As discussant Rae Ann Kelsch explained: “People are very reluctant to give up control.” Although she spoke on the basis of the experience in North Dakota, which has not embraced standards wholeheartedly, others echoed her view. Standards-based systems have provided a model and a unifying conception of the purpose of education, “but very different goals can exist under the same banner,” as one participant put it. Discussant Scott Montgomery said that the problem lies in changing the entire system, not just in unifying the standards, so for him common standards would not necessarily bring the changes that he believes are needed.
VARIABILITY IN CONTENT AND PERFORMANCE STANDARDS AND ASSESSMENTS

Similarities and differences among states’ content and performance standards are key to understanding the extent to which any move toward more common standards would have a substantive impact on current standards-based systems. An examination of states’ approaches was organized around three framing questions:

- How and to what extent do K-12 state content standards in English/language arts, mathematics, and science at key grades vary?
- How and to what extent do K-12 state performance standards in English/language arts, mathematics, and science at key grades vary?
- How and to what extent does the implementation of K-12 state content and performance standards in multiple academic subjects in classrooms vary?

Analysis of both content and performance standards provided the foundation for an extensive discussion of these questions. Andrew Porter and his colleagues described a very detailed review of 31 states’ standards in the three subjects, with a focus on grades 4 and 8, which was developed for the workshop. Michael Petrilli described an analysis conducted by the Fordham Foundation and the Northwest Evaluation Association to compare proficiency standards across states, and Peggy Carr described the results of an analysis by the National Center for Education Statistics of the relationship between proficiency standards for state assessments and those of the National Assessment of Educational Progress.[1]

Content Standards

Porter and his colleagues addressed the first question by analyzing state content standards in English/language arts/reading, mathematics, and science for grades K-8 (Porter, Polikoff, and Smithson, 2008). Their analysis was based on a conceptual framework for considering the primary influences on teachers’ instructional practices. 
[1] The term “proficiency standards” refers to the level of performance identified on a particular test as the minimum that qualifies as “proficient.” Thus, it is a type of performance standard.

Their hypothesis was that teachers are most strongly influenced by standards policies that have five characteristics:
- They are specific in their messages to teachers about what they are to teach.
- They are consistent (aligned) among themselves so that teachers receive a coherent message.
- They have authority, in that they are developed and promoted by experts, are officially adopted by the state, are consistent with standards practice, and are promoted by charismatic individuals—meaning individuals who provide leadership and motivate those who must implement the standards.
- They have power, in that compliance with them is rewarded, whereas failure to comply is sanctioned.
- They have stability, in that they are kept in place over time.

The team also based their analysis on a methodology they developed for describing in detail what it is that teachers teach, which they call a three-dimensional content language. Although this methodology actually predated the standards-based reform movement, it has proved a useful tool for examining the content of state standards documents. Porter described the method as a way of producing a visual representation of the relative coverage of various elements of a particular field that is similar to a topographical map of a geographical region.

Using the content language, Porter and colleagues have divided each subject into general areas. For example, in mathematics there are 16 general areas (e.g., operations, measurement, basic algebra), and each of these can be further subdivided into between 9 and 14 more specific topics—for a total of 217. Apart from the subtopics that make up each field, the language also distinguishes levels of cognitive demand, which are also somewhat different for each subject. There are eight cognitive levels for mathematics: memorize, perform procedures, demonstrate understanding, conjecture, generalize, prove, solve novel problems, and make connections. 
Thus, Porter explained, content is defined as the intersection of these two dimensions. Using this tool one can determine, for example, not just whether or not linear equations are covered, but also whether students will be expected to memorize one, solve one, or use one to solve a story problem. To apply this language analysis to a state’s standards, trained analysts review and code the most specific available description of the standard for a particular subject and grade level. Each standard is analyzed by three to five analysts, and items that are difficult to code are flagged and discussed. The codes are entered into cells as proportions, with 0 representing no emphasis and 1 representing a very strong one. The cells are used to build the visual display that illustrates both the degree of focus on the various topics and the cognitive demand. Porter and colleagues drew on data from 31 states, although not all
provided data for every subject and grade level. The team also analyzed the national science standards and those of the NCTM. They focused on grades 4 and 8, with the goal of highlighting the degree to which states’ standards showed overlap or conceptual progressions between those two grades. Having entered the codes, they were able to conduct a variety of analyses and comparisons. For any pair of states for which they had data, the alignment for a particular standard can be calculated. Using averages of these results, they were also able to calculate alignment across and within grade levels.

FIGURE 2-1 Coarse-grained content maps for English/language arts/reading for grade 4. SOURCE: Porter, Polikoff, and Smithson (2008, Figure 1).

Figure 2-1 presents a pair of coarse-grained content maps showing the results for the two states that are most aligned in English/language arts/reading for fourth grade. It is clear from the figure that content areas such as vocabulary, comprehension, and language study (the darkest areas) are strongly emphasized in both Ohio and Indiana and that neither state places any emphasis on phonemic awareness. These content areas, which
show up as darkest for both states, also line up along the column labeled “explain.” This indicates that both states target the level of cognitive demand the researchers labeled as “explain” for these content areas. In other words, whichever areas show up as dark in the maps for both of the states being compared are ones that are emphasized in both states.

FIGURE 2-2 Fine-grained content maps for comprehension, English/language arts/reading for grade 4. SOURCE: Porter, Polikoff, and Smithson (2008, Figure 22).

One of the areas that showed up as strongly aligned for these two states, comprehension at the “explain” level of challenge, is shown in greater detail in the two fine-grained content maps in Figure 2-2. The subcategories of this instructional area—which include recognizing the meaning of words from the context; understanding of phrase, sentence, and paragraph; and the like—are mapped in the same way, according to degree of emphasis and cognitive level. This pair of maps shows, for example, that both Ohio and Indiana place strong emphasis on such strategies as activating prior
knowledge and that both associate the same degree of cognitive demand with this strand of the standard for reading comprehension.

The team also looked at the degree of alignment among 14 states in English/language arts/reading for grades 4 and 8, as shown in Table 2-1. The index runs from 0 to 1.00 with 1.00 representing perfect alignment. The results show significant variation, from lows, such as the .07 alignment between Maine and Wisconsin for grade 8, to highs, such as .47 between Ohio and California for grade 8. The researchers also examined the degree of alignment among the states’ standards and the national science standards and the NCTM standards.

Porter and colleagues (2008) have produced a voluminous body of data: 90 figures, 10 tables, and an appendix; this material and further detail about their analysis can be found in their paper.[2] Porter explained that although it would be much easier if these data could be simplified, he and his colleagues could find no substitute for the fine-grained analysis for answering the questions at hand. Nevertheless, some key general points were evident. First, they found little evidence to support the hypothesis that there is a de facto national curriculum. The degree of variability they found across states, and between state and national standards, does not support that hypothesis. In fact, they found that the alignment of topic coverage within states from grade to grade (the degree of overlap in what is in the standards for each grade, as students ideally progress) is generally greater than the degree of alignment across states in the material they cover at particular grades. 
The repetition, Porter suggested, sends students the message: “Don’t you dare learn this the first time we teach it; otherwise you are going to be bored out of your skull in the subsequent grades.” Porter and colleagues did find some indication that a few core areas are covered more consistently across states than the overall alignment data would show—or a small de facto common core curriculum. However, they also concluded that states’ content standards are in general not focused on a few big ideas. Although the states vary in this as well, overall their standards do not demonstrate the clear focus and discipline that many have advocated.

State Assessments

Assessing the extent to which the performance standards that states set come close to defining a de facto common standard for proficiency was the impetus behind another study, described by Michael Petrilli.

[2] The material is available at http://www7.nationalacademies.org/cfe/State_Standards_Workshop_1_Agenda.html [May 2008].
TABLE 2-1 State-to-State Alignment, 4th and 8th Grade Standards for English/Language Arts/Reading

Grade 4
                              CA    DE    ID    IN    KS    ME    MN    MT    NH    OH    OK    OR    VT    WI
CA Stnds Gr.4 (2005)          1.00  0.14  0.26  0.40  0.12  0.11  0.38  0.23  0.09  0.30  0.43  0.33  0.31  0.32
DE GLEs Gr.4 (2006)           0.14  1.00  0.09  0.16  0.13  0.11  0.17  0.19  0.11  0.16  0.17  0.19  0.15  0.16
ID Stnds Gr.4 (2006)          0.26  0.09  1.00  0.28  0.13  0.11  0.30  0.14  0.15  0.22  0.31  0.29  0.27  0.26
IN Stnds Gr.4 (2007)          0.40  0.16  0.28  1.00  0.14  0.18  0.38  0.17  0.19  0.48  0.34  0.42  0.35  0.30
KS Stnds Gr.4 (2005)          0.12  0.13  0.13  0.14  1.00  0.21  0.19  0.30  0.24  0.14  0.20  0.22  0.18  0.13
ME GLEs Gr.4 (2005)           0.11  0.11  0.11  0.18  0.21  1.00  0.14  0.19  0.23  0.18  0.16  0.18  0.17  0.11
MN ELAR Stnds. Gr.4 (2005)    0.38  0.17  0.30  0.38  0.19  0.14  1.00  0.28  0.18  0.33  0.45  0.42  0.42  0.30
MT R Stnds Gr.4 (2005)        0.23  0.19  0.14  0.17  0.30  0.19  0.28  1.00  0.19  0.19  0.25  0.25  0.22  0.22
NH Reading GLEs Gr.4 (2005)   0.09  0.11  0.15  0.19  0.24  0.23  0.18  0.19  1.00  0.13  0.12  0.18  0.17  0.13
OH Indctrs Gr.4 (2007)        0.30  0.16  0.22  0.48  0.14  0.18  0.33  0.19  0.13  1.00  0.29  0.39  0.31  0.30
OK Stnds Gr.4 (2005)          0.43  0.17  0.31  0.34  0.20  0.16  0.45  0.25  0.12  0.29  1.00  0.37  0.34  0.32
OR Stnds Gr.4 (2007)          0.33  0.19  0.29  0.42  0.22  0.18  0.42  0.25  0.18  0.39  0.37  1.00  0.39  0.28
VT GLEs Gr.4 (2006)           0.31  0.15  0.27  0.35  0.18  0.17  0.42  0.22  0.17  0.31  0.34  0.39  1.00  0.27
WI Stnds Gr.4 (2003)          0.32  0.16  0.26  0.30  0.13  0.11  0.30  0.22  0.13  0.30  0.32  0.28  0.27  1.00

Grade 8
                              CA    DE    ID    IN    KS    ME    MN    MT    NH    OH    OK    OR    VT    WI
CA Stnds Gr.8 (2005)          1.00  0.28  0.31  0.35  0.39  0.12  0.46  0.22  0.21  0.47  0.13  0.43  0.39  0.38
DE GLEs Gr.8 (2005)           0.28  1.00  0.26  0.25  0.24  0.24  0.30  0.32  0.26  0.23  0.15  0.29  0.27  0.18
ID Stnds Gr.8 (2006)          0.31  0.26  1.00  0.29  0.31  0.13  0.34  0.13  0.18  0.32  0.11  0.40  0.31  0.26
IN Stnds Gr.8 (2006)          0.35  0.25  0.29  1.00  0.29  0.09  0.34  0.15  0.17  0.28  0.12  0.34  0.28  0.24
KS Stnds Gr.8 (2003)          0.39  0.24  0.31  0.29  1.00  0.16  0.37  0.24  0.24  0.38  0.15  0.38  0.39  0.24
ME GLEs Gr.8 (2005)           0.12  0.24  0.13  0.09  0.16  1.00  0.12  0.24  0.24  0.10  0.15  0.18  0.13  0.07
MN ELAR Stnds. Gr.8 (2005)    0.46  0.30  0.34  0.34  0.37  0.12  1.00  0.23  0.29  0.47  0.23  0.48  0.41  0.32
MT R Stnds Gr.8 (2005)        0.22  0.32  0.13  0.15  0.24  0.24  0.23  1.00  0.23  0.18  0.19  0.20  0.19  0.15
NH Reading GLEs Gr.8 (2005)   0.21  0.26  0.18  0.17  0.24  0.24  0.29  0.23  1.00  0.20  0.22  0.24  0.22  0.10
OH Indctrs Gr.8 (2005)        0.47  0.23  0.32  0.28  0.38  0.10  0.47  0.18  0.20  1.00  0.12  0.40  0.38  0.33
OK Stnds Gr.8 (2007)          0.13  0.15  0.11  0.12  0.15  0.15  0.23  0.19  0.22  0.12  1.00  0.19  0.18  0.08
OR Stnds Gr.8 (2007)          0.43  0.29  0.40  0.34  0.38  0.18  0.48  0.20  0.24  0.40  0.19  1.00  0.43  0.30
VT GLEs Gr.8 (2006)           0.39  0.27  0.31  0.28  0.39  0.13  0.41  0.19  0.22  0.38  0.18  0.43  1.00  0.22
WI Stnds Gr.8 (2003)          0.38  0.18  0.26  0.24  0.24  0.07  0.32  0.15  0.10  0.33  0.08  0.30  0.22  1.00

SOURCE: Porter, Polikoff, and Smithson (2008, Table 1).
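The kind of pairwise value reported in Table 2-1 can be illustrated with a small sketch. The summary does not give the formula the researchers used, so the code below assumes the alignment index Porter has published elsewhere: one minus half the sum of absolute differences between two sets of cell proportions, where each cell is a topic paired with a level of cognitive demand. The function names, the two "states," and all of the codes except the "explain" level are hypothetical and are not data from the study.

```python
from collections import Counter


def content_matrix(coded_objectives):
    """Convert (topic, cognitive_demand) codes into cell proportions.

    Assumption: each coded objective counts equally, so the proportions
    across all cells sum to 1.
    """
    counts = Counter(coded_objectives)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def average_matrices(matrices):
    """Average the cell proportions produced by several analysts."""
    cells = set().union(*matrices)
    return {c: sum(m.get(c, 0.0) for m in matrices) / len(matrices) for c in cells}


def alignment_index(a, b):
    """Assumed index: 1 - (sum of |a - b| over all cells) / 2.

    Returns 1.0 for identical matrices and 0.0 for matrices with no
    cell in common.
    """
    cells = set(a) | set(b)
    return 1.0 - sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in cells) / 2.0


# Hypothetical codes for two invented states (illustration only).
state_x = content_matrix([
    ("comprehension", "explain"),
    ("comprehension", "explain"),
    ("vocabulary", "recall"),
    ("phonics", "recall"),
])
state_y = content_matrix([
    ("comprehension", "explain"),
    ("vocabulary", "recall"),
    ("vocabulary", "explain"),
    ("writing", "generate"),
])

print(round(alignment_index(state_x, state_y), 2))  # prints 0.5
```

Under this assumed formula, two states can both cover "comprehension" and still score well below 1.00 if they distribute their emphasis differently or pair the topic with different cognitive demands, which is consistent with the wide spread of values in the table.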
This study, conducted jointly by the Fordham Foundation and the Northwest Evaluation Association (NWEA), was designed to address three questions:

- Where are states setting the proficiency bar, and can their approaches to setting cut scores be compared in a fair way?
- Given the pressure to bring 100 percent of students to proficient levels, are states lowering their standards over time in order to meet that goal?
- Are cut scores relatively consistent in terms of difficulty level across grades?

Fordham and the NWEA decided to collaborate to conduct this study because the NWEA has developed a computer-adaptive assessment system that is used by many districts. The test is used primarily for diagnostic testing, so that districts can pinpoint the performance levels of individual students in different areas that are covered in the relevant state standards. Thus, the assessment is designed to be as well aligned as possible to the content standards of the states in which it is used. The NWEA maintains a large pool of items, and it constructs tests for districts by using the items that are closest to the standards for which that district is responsible. Because all the items are pegged to a common scale, the NWEA is able to make some comparisons across states.

With this resource, the researchers were able to estimate where on the NWEA scale a given state is setting its cut score. In many cases, that calculation was available for two times, 2003 and 2006, and thus they were able to consider the possibility that the cut scores had changed over time in those states. They had data for a total of 830,000 students in 26 states who had taken both the NWEA assessment and their own state exam. 
The researchers’ primary finding was that there is enormous variability in the level of difficulty of states’ tests—a range from approximately the 6th percentile (94 percent would pass) to approximately the 77th percentile (23 percent would pass). These findings are shown in Figure 2-3. To illustrate the kinds of differences these numbers represent, Petrilli provided two sample grade 4 items from the NWEA assessment, each with a difficulty level at the cut score of one of the states. For the Wisconsin cut score, which they had calculated at the 16th percentile on the NWEA scale, the sample item asked students to select from a group of sentences the one that “tells a fact, not an opinion.” To represent the comparable cut score for Massachusetts, calculated at the 65th percentile, Petrilli showed an item that asked students to read a complex, difficult passage (excerpted from a work by Leo Tolstoy) and to pick from a list of factual statements the one that is actually found in the passage.
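The summary does not say how the researchers placed each state's cut score on the NWEA scale. As a rough illustration only, the sketch below assumes a simple percentile-matching approach: among students who took both tests, find the common-scale score below which the same share of students fall as failed the state test. The function names and the idea of a per-student pass/fail flag are hypothetical and are not taken from the study.

```python
from bisect import bisect_left


def pass_rate_from_percentile(cut_percentile):
    """Percent of students expected to pass when the cut score sits at
    the given percentile of the common scale."""
    return 100.0 - cut_percentile


def cut_score_on_common_scale(common_scores, state_passed):
    """Estimate where a state's cut score falls on a common scale.

    ASSUMPTION: simple percentile matching, not the study's documented method.
    common_scores: common-scale score for each linked student.
    state_passed:  True/False per student, whether the student met the
                   state's proficiency bar.
    Returns (estimated cut score, percentile of that score).
    """
    if len(common_scores) != len(state_passed) or not common_scores:
        raise ValueError("need one pass/fail flag per score")
    ordered = sorted(common_scores)
    n = len(ordered)
    n_failed = sum(1 for passed in state_passed if not passed)
    # The cut sits just above the n_failed lowest common-scale scores.
    index = min(n_failed, n - 1)
    cut = ordered[index]
    # Percentile = share of students scoring strictly below the cut.
    percentile = 100.0 * bisect_left(ordered, cut) / n
    return cut, percentile
```

With the figures reported above, `pass_rate_from_percentile(6)` gives 94.0 and `pass_rate_from_percentile(77)` gives 23.0, matching the range described in the text.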
FIGURE 2-3 Difference in difficulty of state tests. SOURCE: Petrilli (2008).
Petrilli believes the implications of this degree of difference are profound. If, as many people believe, the high stakes attached to state tests mean that teachers focus the bulk of their attention on the students who are just below the proficient level, to get them over that bar, then teachers in Wisconsin will be targeting their instruction at a very low level in comparison to those in Massachusetts.
This analytical approach also made it possible to compare the cut scores that states set for mathematics and for reading, at least in terms of percentiles. Doing so is useful, Petrilli explained, because test results that seem to demonstrate that students are doing better in one subject than another may actually demonstrate that the level of mastery needed to score at the proficient level is quite different for each subject.
With regard to the second question, whether states are engaged in a so-called race to the bottom, the researchers were surprised to find that this does not seem to be the case. Rather, Petrilli characterized the trend as a “walk to the middle.” Most states kept their cut scores relatively consistent across the time period studied, but the states that began with the highest standards had moderated their standards somewhat, while those with the lowest standards had raised theirs. He cautioned, however, that because they were working with percentiles, the change over time could be explained either by intentional shifts in cut scores or by changes in students’ actual achievement levels.
In terms of the last question, the vertical alignment of state standards, the analysis showed that they are not well calibrated, grade level to grade level. In the majority of states, the elementary standards are set significantly lower than the middle school standards.
When this is the case, students may have no trouble with the grade 3 test, proceed normally through subsequent grades, and then stumble on the grade 8 test. The aggregate results may inaccurately indicate a problem with middle school instruction, in comparison to elementary school instruction. Moreover, if standards are not aligned vertically, the test results will not be good indicators of students’ growth over time. Petrilli drew three conclusions from the research. First, state performance standards need “an overhaul.” If the goal is for standards to progress cumulatively from kindergarten through grade 12, states should begin with rigorous high school graduation requirements and work backward to develop vertically aligned standards. Second, Petrilli believes that the objective of bringing 100 percent of students to proficiency has become a perverse incentive that has the effect of lowering achievement. Finally, in responding to the workshop theme, he said that discussion of national standards should continue—that such discussion would have the effect of creating consistent objectives for students across the states.
Performance Standards
Another way to examine the question of how much performance expectations vary across states is to use the National Assessment of Educational Progress (NAEP). NCLB requires not only that states report progress on their own assessments, but also that they participate in NAEP so that it can serve as a common yardstick for comparing students’ proficiency across states. The results of these comparisons indicate striking discrepancies between the performance required for proficiency on state assessments and what is required for proficiency on NAEP assessments. These results have received significant public attention, and, as presenter Peggy Carr explained, the National Center for Education Statistics (NCES) recognized the need for a more precise methodology with which to make these comparisons (see National Center for Education Statistics, 2007).
Figure 2-4 illustrates the discrepancies between the proficiency levels that states use to report their adequate yearly progress and the NAEP proficiency levels, in terms of percentages of students meeting the standard. This information is useful to provide a snapshot, Carr explained. Since each state is asked to use NAEP as a benchmark, the comparison between each state and NAEP is valid. However, comparing states just by using the percentage meeting the standard is less useful. Consequently, NCES staff used an equipercentile equating method to align the distributions of pairs of scales, the NAEP scale and that of each of the states. In other words, they used results from schools that had participated in NAEP to calculate what they called a NAEP-equivalent score on the state assessment. Having done that for each state, they could then compare the NAEP-equivalent score of any state with that of any other. What the comparison shows is the relative degree of challenge of a state’s standards using the NAEP scale as the common yardstick. Figure 2-5 shows how the comparison works for two hypothetical states.
FIGURE 2-4 A comparison of state proficiency and NAEP standards. SOURCE: Carr (2008).
FIGURE 2-5 Methodology for comparing state proficiency standards. SOURCE: Carr (2008).
The results of this analysis were quite similar to the results of the Fordham/NWEA analysis. Generally, the researchers found that states’ proficiency levels varied significantly and that the majority map onto the “below basic” range on the NAEP scale, although the distribution varied by subject and grade. The results for mathematics are shown in Figures 2-6 and 2-7.
FIGURE 2-6 A comparison of proficiency standards in grade 4 mathematics. SOURCE: Carr (2008).
FIGURE 2-7 A comparison of proficiency standards in grade 8 mathematics. SOURCE: Carr (2008).
The researchers also looked at the correlation between the proportion of students that a state reports as meeting its proficiency standards and the NAEP-equivalent score. They found that the correlation was negative: that is, the higher the proportion of students that a state reports as passing its own standards, the less challenging that state’s standards are. The researchers also found that the position of a state’s adequate yearly progress standards on the NAEP scale bears little relationship to that state’s performance on the NAEP assessment. In other words, students’ performance on NAEP cannot be predicted from the relative difficulty of the state’s own standards.
Carr’s conclusions from these results were similar to Petrilli’s. To illustrate their significance, she highlighted the results for three contiguous states: Georgia, North Carolina, and South Carolina. Students in these three states all perform at about the same level on the NAEP reading assessment, but the states have set very different standards for their students. An example of the practical effect of this discrepancy is that a student who moves from North Carolina to South Carolina might go from being viewed as a proficient reader to being placed in a remedial class. Her closing point was that state assessments vary widely in both content and design, and that states may attach different meanings to the label “proficient.” In the context of NAEP, proficiency is defined as “competency over challenging subject matter,” whereas states generally define proficiency as grade-level performance.
Participants and discussants had a range of comments about the variation that was described, factors that may contribute to it, and its implications. Discussant Barbara Reys drew on her experiences co-chairing the standards development process for mathematics in Missouri to highlight some of the practical challenges of working toward common standards. Apart from the requirements of states that prize their autonomy, she noted the limitations of existing national standards, which may not be grade specific and lack other critical details.
She was not surprised at the finding that many states’ standards do not align with national ones because “it’s really the decisions about what you want to focus on at particular grades that are the tough ones.” Reys also showed some results from an analysis she had conducted of the consistency of K-8 mathematics standards. Her findings echoed those already presented. She found dramatic variation from state to state in the grade placement of particular concepts. The critical finding was again that a given learning expectation might be found in the grade 1 standards in one state and in the grade 3 standards in another state. These differences create a significant complication for textbook publishers who want to serve multiple states. From Reys’s analysis, only 4 of 108 possible learning expectations for fourth graders were common across 10 states, suggesting that a textbook publisher might choose to incorporate all 108 of them. Since the content of textbooks has a significant effect on teachers’ instructional plans, this lack of overlap becomes a self-reinforcing pressure against curricular focus. At the same time, however, textbooks are a potential tool for increasing uniformity because they are so influential.
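The kind of overlap count Reys reported can be illustrated with a small sketch. The state names, the expectation labels, and the helper name `common_expectations` are all hypothetical; the example only shows how a pool of expectations shrinks to a small common core once several states are compared, and it is not Reys's actual procedure or data.

```python
def common_expectations(standards_by_state):
    """Return (expectations shared by every state, all expectations seen)."""
    sets = [set(items) for items in standards_by_state.values()]
    if not sets:
        return set(), set()
    shared = set.intersection(*sets)  # present in every state's list
    pooled = set.union(*sets)         # present in at least one state's list
    return shared, pooled


# Hypothetical grade 4 expectations for three made-up states.
standards = {
    "State A": ["add fractions", "read bar graphs", "multiply 2-digit numbers"],
    "State B": ["multiply 2-digit numbers", "classify angles", "read bar graphs"],
    "State C": ["multiply 2-digit numbers", "estimate sums", "add fractions"],
}

shared, pooled = common_expectations(standards)
print(f"{len(shared)} of {len(pooled)} expectations are common to all states")
```

A publisher serving all of these states would need to cover the pooled set, not the shared one, which is the pressure against curricular focus described above.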
Discussant William Schmidt characterized the variation among state standards as “enormous.” He believes that both mathematics and science standards display “the maximum possible variation at every combination of grade level and topic.” He suggested that this is particularly bad for mathematics because that subject has an inherent logic, so that it is essential that students learn concepts in a particular order if they are to develop sound mathematical thinking. The problem, he said, is that because so few standards establish coherence and vertical alignment in mathematics goals, the result is a mishmash, with many concepts being introduced far too early and then repeated over and over in subsequent grades. Ironically, he explained, the topics that get the least coverage tend to be the most important: the deeper topics that build conceptual understanding.
Schmidt has also observed that district standards vary as much as those of states. Moreover, he suggested, variation at the classroom level, in terms of what teachers are actually covering with their students, far outpaces the variation at the district and state levels. For Schmidt, this variation, which permeates the entire education system, is “the Achilles heel of the No Child Left Behind Act.” Based on his analysis, he argued that the degree of variation in the opportunities children have to learn makes it inevitable that many will be left behind.
Discussant Peter McWalters offered a perspective from Rhode Island, which has coordinated its standards development with two other states, New Hampshire and Vermont.3 Although the presentation suggested a number of questions for this consortium of states to ponder as they work to improve their standards, he labeled the effort a success and added that he would be happy to see a national model.
He noted that NCLB had been the impetus for the efforts of the New England states because none of the three has a testing infrastructure and all are too small to produce what is required on their own. They were also fortunate in that none of them has regulatory requirements, such as state-mandated standards, that would present a barrier to the states working collectively. However, McWalters identified what he sees as a major stumbling block to a national approach to standards: that “no state would trust the feds after our experience with the beginning of No Child Left Behind…. There is zero trust.” He also supported points made earlier regarding states’ capacity to change in the ways that are needed. For him, the biggest challenge is to find ways to serve diverse students with diverse needs. To do that successfully, teachers will need a command of their subjects, both the content and the pedagogy, that is “way beyond what we currently have.”
3 The three-state consortium, the New England Common Assessment Program, is discussed in greater detail in Chapter 7.
PARADOXES
Committee chair Lorraine McDonnell reflected that the discussion of standards as they are currently operating, and the findings regarding variation across the states, yielded two significant paradoxes. The first is that, although standards are very well institutionalized across the country, with very few voices challenging their value as an organizing framework for reform, it is also the case that standards-based reform means different things to different people. The term in some ways disguises deep-seated differences about both priorities and strategies for achieving education goals.
The second paradox is that, although there is little ostensible disagreement about the standards-based approach, there is a wide gap between the theoretical model and the reality of standards-based accountability systems in practice. The theoretical model of an aligned system is compelling as a strategy for meeting the needs of diverse students. Yet in practice, states and districts have lacked the capacity, resources, and, perhaps in some cases, the knowledge or the will to put all the essential elements into place. Participants described legislators and other policy makers who have viewed the development of a new core curriculum or the raising of high school graduation standards as all that is required to pursue standards-based reform. Disputes over the significance of testing results, and the effects the reporting of these results can have, have further clouded the discussion. The result is a situation in which the core concept (the goal of defining standards and holding educators accountable for the results) seems to be constant across states, but in which the execution of this goal yields starkly different results.