3
Analyzing State Standards

Similarities and differences among states’ content and performance standards are key to understanding the extent to which any move toward more common standards would have a substantive impact on current standards-based systems. An examination of states’ approaches was organized around three framing questions:

  1. How and to what extent do K-12 state content standards in English/language arts, mathematics, and science at key grades vary?

  2. How and to what extent do K-12 state performance standards in English/language arts, mathematics, and science at key grades vary?

  3. How and to what extent does the implementation of K-12 state content and performance standards in multiple academic subjects in classrooms vary?

Analysis of both content and performance standards provided the foundation for an extensive discussion of these questions. Andrew Porter and his colleagues described a very detailed review of 31 states’ standards in the three subjects, with a focus on grades 4 and 8, which was developed for the workshop. Michael Petrilli described an analysis conducted by the Fordham Foundation and the Northwest Evaluation Association (NWEA) to compare proficiency standards across states. Peggy Carr closed the session with a description of the results of an analysis by the National Center






for Education Statistics (NCES) of the relationship between proficiency standards for state assessments and those of the National Assessment of Educational Progress (NAEP).¹

¹ The term “proficiency standards” refers to the level of performance identified on a particular test as the minimum that qualifies as “proficient.” Thus, it is a type of performance standard.

VARIABILITY IN STATE CONTENT STANDARDS

Porter and his colleagues addressed the first question by analyzing state content standards in English/language arts/reading, mathematics, and science for grades K-8 (Porter, Polikoff, and Smithson, 2008). Their analysis was based on a conceptual framework for considering the primary influences on teachers’ instructional practices. Their hypothesis is that teachers are most strongly influenced by standards policies that have five characteristics:

  1. They are specific in their messages to teachers about what they are to teach.

  2. They are consistent (aligned) among themselves so that teachers receive a coherent message.

  3. They have authority, in that they are developed and promoted by experts, are officially adopted by the state, are consistent with standards practice, and are promoted by charismatic individuals (meaning individuals who provide leadership and motivate those who must implement the standards).

  4. They have power, in that compliance with them is rewarded while failure to comply is sanctioned.

  5. They have stability, in that they are kept in place over time.

The team also based their analysis on a methodology that Porter and his colleagues had developed for describing in detail what it is that teachers teach, which they have called a three-dimensional content language. Although this methodology actually predated the standards-based reform movement, it has proved a useful tool for examining the content of state standards documents.

Porter described the method as a way of producing a visual representation of the relative coverage of various elements of a particular field that is similar to a topographical map of a geographical region. Using the content language, Porter and his colleagues have divided each subject into general areas. For example, in mathematics there are 16 general areas (e.g., operations, measurement, basic algebra), and each of these can be further subdivided into between 9 and 14 more specific topics, for a total of 217. Apart from the subtopics that make up each field, the language also distinguishes levels of cognitive demand, which are also somewhat different for each subject. There are eight cognitive levels for mathematics: memorize; perform procedures; demonstrate understanding; conjecture; generalize; prove; solve novel problems; and make connections. Thus, Porter explained, content is defined as the intersection of these two dimensions. Using this tool, for example, one can determine not just whether or not linear equations are covered, but also whether students will be expected to memorize one, solve one, or use one to solve a story problem.

To apply this language analysis to a state’s standards, trained analysts review and code the most specific available description of the standard for a particular subject and grade level. Each standard is analyzed by three to five analysts, and items that are difficult to code are flagged and discussed. The codes are entered into cells as proportions, with 0 representing no emphasis and 1 representing a very strong focus. The cells are used to build the visual display that illustrates both the degree of focus on the various topics and the cognitive demand.

Porter and his colleagues drew on data from 31 states, though not all provided data for every subject and grade level. The team also analyzed the national science standards and those of the National Council of Teachers of Mathematics (NCTM). They focused on grades 4 and 8, with the goal of highlighting the degree to which states’ standards showed overlap or conceptual progressions between those two grades. Having entered the codes, they were able to conduct a variety of analyses and comparisons. For any pair of states for which they had data, the alignment for a particular standard can be calculated.
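The chapter does not state the formula used to calculate alignment. As a rough illustration only, the sketch below assumes the index usually attributed to Porter's earlier work: one minus half the sum of absolute differences between corresponding cell proportions, so that 1 means identical emphasis and 0 means no overlap. The function names and the two small matrices are hypothetical; rows stand for topics and columns for levels of cognitive demand.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def to_proportions(cells: Matrix) -> list[list[float]]:
    """Rescale a topic-by-cognitive-demand matrix so that all cells sum to 1."""
    total = sum(sum(row) for row in cells)
    if total <= 0:
        raise ValueError("matrix must contain at least one positive cell")
    return [[cell / total for cell in row] for row in cells]


def alignment_index(a: Matrix, b: Matrix) -> float:
    """Assumed index: 1 - (sum of absolute cell differences) / 2."""
    pa, pb = to_proportions(a), to_proportions(b)
    same_shape = len(pa) == len(pb) and all(
        len(ra) == len(rb) for ra, rb in zip(pa, pb)
    )
    if not same_shape:
        raise ValueError("matrices must have the same shape")
    diff = sum(abs(x - y) for ra, rb in zip(pa, pb) for x, y in zip(ra, rb))
    return 1.0 - diff / 2.0


# Hypothetical coding results for two states (not data from the study).
state_a = [[4, 2, 0], [2, 2, 0]]
state_b = [[2, 2, 0], [2, 2, 2]]

print(round(alignment_index(state_a, state_b), 2))
```

Under this assumed definition, two states that place all of their emphasis in different cells score 0, and a state compared with itself scores 1, which matches the 0-to-1 range of the alignment values reported in the chapter.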
Using averages of these results, they were also able to calculate alignment across and within grade levels. From the example in Figure 3-1, which shows the results for the two states that are most aligned in English/language arts/reading for 4th grade, it is clear that content areas such as vocabulary, comprehension, and language study (the darkest areas) are strongly emphasized in both Ohio and Indiana and that neither state places any emphasis on phonemic awareness. Moreover, both states clearly focused on the capacity to explain as their target level of cognitive demand. Figure 3-2 shows the mapping for one of the areas of strong alignment, comprehension, by its subcategories.

FIGURE 3-1 Coarse-grained content maps: English/language arts/reading for grade 4 (alignment = .48). SOURCE: Porter, Polikoff, and Smithson (2008, Figure 1).

FIGURE 3-2 Fine-grained content maps for comprehension: English/language arts/reading for grade 4. SOURCE: Porter, Polikoff, and Smithson (2008, Figure 22).

The team also looked at degrees of variance, as shown in Table 3-1, which depicts the degree of alignment among 14 states in English/language arts/reading for grades 4 and 8. The results show significant variation, from lows such as the 0.07 correlation between Maine and Wisconsin for grade 8, to highs such as 0.47 between Ohio and California for grade 8. Tables 3-2 and 3-3 show the degree of alignment among the states’ standards and the national science standards (5 states), and the NCTM standards (10 states).

TABLE 3-1 State-to-State Alignment, 4th and 8th Grade Standards for English/Language Arts/Reading (ELAR). SOURCE: Porter, Polikoff, and Smithson (2008, Table 1).

TABLE 3-2 Alignment Among States for the National Science Standards for Grades 1-8. SOURCE: Porter, Polikoff, and Smithson (2008, Table 6).

In all, Porter and his colleagues have produced a voluminous body of data: 90 figures, 10 tables, and an appendix.² Porter explained that although it would be much easier if these data could be simplified, he and his colleagues could find no substitute for the fine-grained analysis for answering the questions at hand.

² The material is available at http://www7.nationalacademies.org/cfe/State_Standards_Workshop_1_Agenda.html [May 2008].

Nevertheless, some key general points were evident. First, they found little evidence to support the hypothesis that there is a de facto national curriculum. The degree of variability they found across states, and between state and national standards, does not support that hypothesis. In fact, they found that the alignment of topic coverage within states from grade to grade (the degree of overlap in what is in the standards for each grade, as students ostensibly progress) is generally greater than the degree of alignment across states in the material they cover at particular grades. The repetition, Porter suggested, sends students the message: “Don’t you dare learn this the first time we teach it; otherwise you are going to be bored out of your skull in the subsequent grades.”

Porter and his colleagues did find some indication that there are a few core areas that are covered more consistently across states than the overall alignment data would show, that is, a small de facto common core curriculum. However, Porter and his colleagues also concluded that states’ content standards are in general not focused on a few big ideas. Though the states vary in this as well, overall their standards do not demonstrate the clear focus and discipline that many have advocated.

VARIABILITY IN STATE ASSESSMENTS

Assessing the extent to which the performance standards that states set come close to defining a de facto common standard for proficiency was the impetus behind another study, described by Michael Petrilli. This study, conducted jointly by the Fordham Foundation and NWEA, was designed to address three questions:

  1. Where are states setting the proficiency bar, and can states’ approaches to setting cut scores be compared in a fair way?

  2. Given the pressure to bring 100 percent of students to proficient levels, are states lowering their standards over time in order to meet that goal?

  3. Are cut scores relatively consistent in terms of difficulty level across grades?

Fordham and the NWEA decided to collaborate to conduct this study because the NWEA develops a computer-adaptive assessment system that is used by many districts. The test is used primarily for diagnostic testing so that districts can pinpoint the performance levels of individual students in different areas that are covered in the relevant state standards. Thus, the assessment is designed to be as well aligned as possible to the content standards of the states in which it is used. NWEA maintains a large pool of items, and they construct tests for districts by using the items that are closest to the standards for which that district is responsible. Because all the items are pegged to a common scale, NWEA is able to make some comparisons across states.

TABLE 3-3 Alignment Among States on Mathematics Standards for Grades 1-8. SOURCE: Porter, Polikoff, and Smithson (2008, Table 7).

With this resource, the researchers were able to estimate where on the NWEA scale a given state is setting its cut score. In many cases they had that calculation for two points in time, 2003 and 2006, and were thus able to consider the possibility that the cut scores had changed over time in those states. They had data for a total of 830,000 students in 26 states who had taken both NWEA’s assessment and their own state exam.

The researchers’ primary finding was that there is enormous variability in the level of difficulty of states’ tests: a range from approximately the 6th percentile (94 percent would pass) to approximately the 77th percentile (23 percent would pass). These findings are shown in Figure 3-3. To illustrate the kinds of differences these numbers represent, Petrilli provided two sample 4th-grade items from the NWEA assessment, each with a difficulty level at the cut score of one of the states. For the Wisconsin cut score, which they had calculated at the 16th percentile on the NWEA scale, the sample item asked students to select from a group of sentences the one that “tells a fact, not an opinion.” To represent the comparable cut score for Massachusetts, calculated at the 65th percentile, Petrilli showed an item that asked students to read a complex, difficult passage (excerpted from a work by Leo Tolstoy) and to pick from a list of factual statements the one that is actually found in the passage.

Petrilli believes the implications of this degree of difference are profound. If, as many people believe, the high stakes attached to state tests mean that teachers focus the bulk of their attention on the students who are just below the proficient level, to get them over that bar, then teachers in Wisconsin will be targeting their instruction at a very low level in comparison with those in Massachusetts.

This analytical approach also made it possible to compare the cut scores that states set for math and for reading, at least in terms of percentiles. Doing so is useful, Petrilli explained, because test results that seem to demonstrate that students are doing better in one subject than the other may actually demonstrate that the level of mastery needed to score at the proficient level is quite different for each subject.

With regard to the second question, whether states are engaged in a so-called race to the bottom, the researchers were surprised to find that this does not seem to be the case. Rather, Petrilli characterized the trend as a “walk to the middle.” Most states had kept their cut scores relatively consistent across the time period studied, but the states that began with the highest standards had moderated their standards somewhat, while those with the lowest standards had raised theirs. He cautioned, however, that because they were working with percentiles, the change over time could be explained either by intentional shifts in cut scores or by changes in students’ actual achievement levels.
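Petrilli's caution about percentiles can be made concrete with a small calculation. The sketch below is illustrative only: the scale scores are invented rather than NWEA data, and the helper names are hypothetical. It expresses a cut score as the percentile of students scoring below it, and shows that an unchanged cut score falls to a lower percentile when the whole cohort's achievement rises.

```python
def percentile_of_cut(scores, cut):
    """Percent of students scoring strictly below the cut score."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < cut)
    return 100.0 * below / len(scores)


def percent_passing(scores, cut):
    """Percent of students at or above the cut score."""
    return 100.0 - percentile_of_cut(scores, cut)


# Hypothetical scale scores for two cohorts; the second is uniformly 10 points higher.
cohort_2003 = [180, 190, 195, 200, 205, 210, 215, 220, 225, 230]
cohort_2006 = [s + 10 for s in cohort_2003]
cut = 200  # the state's cut score on the common scale, unchanged between years

for year, cohort in (("2003", cohort_2003), ("2006", cohort_2006)):
    print(year, percentile_of_cut(cohort, cut), percent_passing(cohort, cut))
```

In this toy case the cut score sits at a lower percentile in the later year even though the state changed nothing, which is why a percentile shift alone cannot separate a relaxed standard from improved achievement.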

FIGURE 3-3 Difference in difficulty of state tests. Panels: 3rd Grade Reading Cut Scores; 8th Grade Math Cut Scores. SOURCE: Petrilli (2008).

In terms of the last question, the vertical alignment of state standards, the analysis showed that they are not well calibrated, grade level to grade level. In the majority of states, the elementary standards are set significantly lower than the middle school standards. Where this is the case, students may have no trouble with the 3rd-grade test, proceed normally through subsequent grades, and then stumble on the 8th-grade test. The aggregate results may inaccurately indicate a problem with middle school instruction, in comparison with elementary school instruction. Moreover, if standards are not aligned vertically, the test results will not be good indicators of students’ growth over time.

Petrilli drew three conclusions from the research. First, state performance standards need “an overhaul.” If the goal is for standards to progress cumulatively from kindergarten through 12th grade, states should begin with rigorous high school graduation requirements and work backward to develop vertically aligned standards. Second, Petrilli believes that the objective of bringing 100 percent of students to proficiency has become a perverse incentive that has the effect of lowering achievement. Finally, in responding to the workshop theme, he said that discussion of common standards should continue, and that such discussion would have the effect of creating consistent objectives for students across the states.

VARIABILITY IN STATE PERFORMANCE STANDARDS

NAEP is also used as a common yardstick for comparing students’ proficiency across states. The No Child Left Behind Act requires not only that states report progress on their own assessments, but also that they participate in NAEP so that comparisons can be made. The results of such comparisons indicate striking discrepancies between the performance required for proficiency on state assessments and what is required for proficiency on NAEP assessments. These results have received significant public attention and, as presenter Peggy Carr explained, NCES recognized the need for a more precise methodology with which to make these comparisons (see National Center for Education Statistics, 2007).

Figure 3-4 illustrates the discrepancies between the proficiency levels states use to report their adequate yearly progress and the NAEP proficiency levels, in terms of percentages of students meeting the standard. This information is useful to provide a snapshot, Carr explained. Since each state is asked to use NAEP as a benchmark, the comparison between each state and NAEP is valid. However, comparing states just by using the percentage meeting the standard is less useful. Consequently, NCES staff used an equipercentile equating method to align the distributions of pairs of scales, the NAEP scale and that of each of the states. In other words, they used results from schools that had participated in NAEP to calculate what they called a NAEP-equivalent score on the state assessment. Having done that for each state, they could then compare the NAEP-equivalent score of any state with that of any other. What the comparison shows is the relative degree of challenge of a state’s standards using the NAEP scale as the common yardstick. Figure 3-5 shows how the comparison works for two sample states.

FIGURE 3-4 A comparison of state proficiency and NAEP standards (percent at or above NAEP Proficient versus percent meeting AYP standard). SOURCE: Carr (2008).

The results of this analysis were quite similar to the results of the Fordham-NWEA analysis. Generally, the researchers found that states’ proficiency levels varied significantly and that the majority map onto the “below basic” range on the NAEP scale, though the distribution varied by subject and grade. The results for mathematics are shown in Figures 3-6 and 3-7.

The researchers also looked at the correlation between the proportion of students that a state reports as meeting its proficiency standards and the NAEP-equivalent score. They found that the correlation was negative: that is, the higher the number of students that a state reports are passing its own standards, the less challenging are that state’s standards. The researchers also found that the position of a state’s adequate yearly progress standards on the NAEP scale bears little relationship to that state’s performance on the NAEP assessment. In other words, students’ performance on NAEP cannot be predicted from the relative difficulty of the state’s own standards.
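The idea behind a NAEP-equivalent score can be sketched in a few lines. This is a deliberately simplified illustration, not the NCES procedure, which works from school-level results and sampling weights. It assumes that the NAEP-equivalent of a state's standard is the NAEP score reached or exceeded by the same percentage of students that the state reports as meeting its own standard. The function name and the score list are hypothetical.

```python
def naep_equivalent(naep_scores, pct_meeting_state_standard):
    """NAEP score that the given percentage of students reach or exceed."""
    if not naep_scores:
        raise ValueError("naep_scores must not be empty")
    if not 0 < pct_meeting_state_standard <= 100:
        raise ValueError("percentage must be in (0, 100]")
    ordered = sorted(naep_scores, reverse=True)  # highest score first
    n_meeting = round(len(ordered) * pct_meeting_state_standard / 100)
    n_meeting = max(1, min(len(ordered), n_meeting))
    # The lowest score among the top n_meeting students marks the standard.
    return ordered[n_meeting - 1]


# Hypothetical NAEP scores, used for two states with identical NAEP performance.
naep_scores = [200, 210, 220, 230, 240, 250, 260, 270, 280, 290]

lenient = naep_equivalent(naep_scores, 80)  # state reports 80 percent proficient
strict = naep_equivalent(naep_scores, 30)   # state reports 30 percent proficient
print(lenient, strict)
```

With identical NAEP performance, the state reporting the higher pass rate maps to the lower NAEP-equivalent score, which is the direction of the negative correlation described above.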

FIGURE 3-5 Methodology for comparing state proficiency standards. SOURCE: Carr (2008).

FIGURE 3-6 A comparison of proficiency standards in grade 4 mathematics. SOURCE: Carr (2008).

FIGURE 3-7 A comparison of proficiency standards in grade 8 mathematics. SOURCE: Carr (2008).

Carr’s conclusions from these results were similar to Petrilli’s. To illustrate their significance, she highlighted the results for three contiguous states, Georgia, North Carolina, and South Carolina. Students in these three states all perform at about the same level on the NAEP reading assessment, but the states have set very different standards for their students. An example of the practical effect of this discrepancy is that a student who moves from North Carolina to South Carolina might go from being viewed as a proficient reader to being placed in a remedial class.

Carr closed by noting that state assessments vary widely in both content and design, and that states may attach different meanings to the label “proficient.” In the context of NAEP, proficiency is defined as “competency over challenging subject matter”; in contrast, states generally define proficiency as on-grade-level performance.

Discussant Barbara Reys drew on her experiences cochairing the standards development process for mathematics in Missouri to highlight some of the practical challenges of working toward common standards. Apart from the requirements of states that prize their autonomy, she noted the limitations of existing national standards, which may not be grade specific and lack other critical details. She was not surprised at the finding that many states’ standards do not align with national ones because “it’s really the decisions about what you want to focus on at particular grades that are the tough ones.”

Reys also showed some results from an analysis of consistency she had conducted of K-8 mathematics standards. Her findings echoed those already presented. She found dramatic variation from state to state in the grade placement of particular concepts. The critical finding was again that a given learning expectation might be found in the first grade standards in one state and in the third grade standards in another state.

These differences create a significant complication for textbook publishers who want to serve multiple states. From Reys’ analysis, only 4 of 108 possible learning expectations for 4th graders were common across ten states, suggesting that a textbook publisher might choose to incorporate all 108 of them. Since the content of textbooks has a significant effect on teachers’ instructional plans, this lack of overlap becomes a self-reinforcing pressure against curricular focus. At the same time, however, textbooks are a potential tool for increasing uniformity because they are so influential.

Discussant William Schmidt characterized the variation among state standards as “enormous.” He believes that both math and science standards display “the maximum possible variation at every combination of grade level and topic.” He suggested that this is particularly bad for mathematics because that subject has an inherent logic, so that it is essential that students learn concepts in a particular order if they are to develop sound mathematical thinking. The problem, he said, is that because so few standards establish coherence and vertical alignment in mathematics goals, the result is a mishmash, with many concepts being introduced far too early and then repeated over and over in subsequent grades. Ironically, he explained, the topics that get the least coverage tend to be the most important: the deeper topics that build conceptual understanding.

Schmidt has also observed that district standards vary as much as those of states. Moreover, he suggested, variation at the classroom level, in terms of what teachers are actually covering with their students, far outpaces the variation at the district and state levels. For Schmidt, this variation, which permeates the entire education system, is “the Achilles heel of the No Child Left Behind Act.” Based on his analysis, he argued that the degree of variation in the opportunities children have to learn makes it inevitable that many will be left behind.

Discussant Peter McWalters offered a perspective from Rhode Island, which has coordinated its standards development with three other states, Maine, New Hampshire, and Vermont. Although the presentations suggested a number of questions for this consortium of states to ponder as they work to improve their standards, he labeled the effort a success and added that he would be happy to see a national model. He noted that NCLB had been the impetus for the efforts of the New England states because none of the four has a testing infrastructure and all are too small to produce what is required on their own. They were also fortunate in that none of them has regulatory roadblocks, such as state-mandated standards, so working collectively was relatively easy.

However, McWalters identified what he sees as a major stumbling block to a national approach to standards, that “no state would trust the feds after our experience with the beginning of No Child Left Behind. . . . There is zero trust.” He also supported points made earlier regarding states’ capacity to change in the ways that are needed. For him, the biggest challenge is in finding ways to serve diverse students with diverse needs. To do that successfully, teachers will need a command of their subjects, the content and the pedagogy, that is “way beyond what we currently have.”