2 Considering the Status Quo

Before considering options for integrating the most successful elements of standards-based systems into a common system, the committee wished to develop an accurate picture of the way the standards already in place are operating. A look at the views of policy makers in several states provided the basis for an overview of the evolution of the role of standards in the states and highlighted some of the key challenges that remain. Analysis of the content covered in state standards and of the performance expectations that states define provides a detailed picture of the extent of variation across the states.

VIEWS FROM THE STATES

The views of state-level policy makers were the first focus. Diane Massell (2008) analyzed for the committee a series of interviews with education policy makers in five states: California, Florida, Massachusetts, North Dakota, and Texas. The purpose of Massell's interviews was to solicit opinions from a range of experienced policy makers who have been engaged in standards-based education reform, the catchall term for measures that states have taken to improve instruction and learning by organizing both policy and practice around clear, measurable standards. Massell and her colleagues hoped to trace both common themes and insights and possible differences, as well as to flag views that may be developing in response to current events. The five states were chosen to reflect both geographical diversity and diversity of experience with
standards. California, which initiated its standards approach during the 1980s, was an early adopter, for example, whereas North Dakota adopted standards in response to federal requirements in 1994. The 21 interview subjects included officials or education aides from governors' offices, members of state boards of education, state legislators, and state education agency officials.

Massell began with a brief overview of the evolution of standards-based reforms in the states. She highlighted the current, unprecedented degree of public engagement in the specifics of implementing standards-based systems, particularly attention to the curriculum and instruction that make them concrete. She described standards-based reform as having had the effect of "opening Pandora's box," because it resulted in a new transparency with regard to curriculum and instruction. Massell was borrowing a phrase from a 1950s report that described districts as reluctant to allow the public to involve itself with potentially divisive questions about what and how children should be taught.

Although the minimum competency movement of the 1970s (as well as lawsuits in a number of states intended to force states to equalize school funding) increased focus on schools' accountability to states with regard to what students actually learn, the achievement bar was set relatively low, Massell explained. The standards-based reform movement that developed in response to A Nation at Risk (National Commission on Excellence in Education, 1983) expanded the role of standards, emphasizing rigorous requirements for high school graduation.

[Running head: COMMON STANDARDS FOR K-12 EDUCATION?]
As national organizations, such as the National Council of Teachers of Mathematics (NCTM), as well as individual states began to put forward more detailed statements of what students should be expected to know and be able to do, the concept of systemic reform, suggested by Smith and O'Day (1991), sharpened the focus on how standards might lead to the desired learning. The logic of systemic reform was that the primary elements of an educational system (such as curriculum, instruction, teacher preparation, professional development, and assessment) must all be aligned to carefully developed content and performance standards in order for those standards to affect teaching and learning. In this view, educators would still retain significant flexibility in meeting expectations but be held accountable for the results.

In 1994 the reauthorization of the Elementary and Secondary Education Act made standards-based reform the official national approach to public schooling by requiring states to set challenging standards aligned to assessments and accountability measures (Massell, 2008). The testing requirements imposed by the No Child Left Behind Act (NCLB) of 2001 built on that commitment, requiring states to (1) publish challenging academic content standards in English/language arts and mathematics for
each of grades 3 through 8 and one secondary grade, as well as standards for science in three grades, and to (2) assess students in these grades and subjects annually and to hold schools accountable for the results (http://www.ed.gov/policy/elsec/leg/esea02/index.html [April 2008]). Those requirements, and the consequences imposed by the law for failing to meet them, have meant that parents and others have a significantly increased interest in the precise content of standards, curriculum, the tests used to measure proficiency, and the material covered in classrooms.

Massell's interviews found intense differences of opinion related to standards. Her interview subjects reported disagreement about how rigorous academic and performance standards are and should be, about whether measures that sharpen accountability also lead to an unacceptable narrowing of the curriculum, and about the fairness of accountability sanctions.

Despite tension around a number of issues, Massell noted that the leaders she and her colleagues interviewed generally take standards-based reform and accountability for granted, viewing this approach as a "central framework guiding state education policy and practice." Even the leaders from North Dakota, where standards were adopted largely under federal duress, viewed this approach as a part of the landscape that is not likely to change. The other four states had made a stronger commitment to standards, and the leaders from those states described them in such terms as "even more central over time" and "integral" to policy initiatives. Massell observed that opening issues related to curriculum and instruction to public discussion has not had the effect of killing reform, as some may have feared, and the result has been "a surprising degree of agreement regarding the meaning and purpose of education."

The North Dakota respondents were more muted than the others, however.
They were less likely to see standards as "central" to policy and tended to describe the effects of standards on classroom practice as marginal. Moreover, respondents from all five states reported that the focus on standards remains variable across and within both states and districts, as do their effects on instruction and learning.

Massell explained that the interviewers asked state education leaders for their impressions regarding several aspects of standards-based reform, such as its impact on practice, learning opportunities, the quality of education, and resources. The leaders' responses to these issues generated an array of reactions from workshop discussants and participants.

Equity

The effects of standards-based accountability systems on achievement gaps and equality of opportunity for disadvantaged students were the first
specific topic discussed in the interviews. In general, Massell reported, the state leaders believe that standards-based reform has led to:

• greater awareness of and attention to the academic performance of disadvantaged students;
• the expectation that all students will meet rigorous standards;
• reductions in achievement gaps;
• a more uniform education system (within states); and
• instruction that is tailored to the needs of individual students.

They generally agreed that increased awareness of the performance of all groups may be the most widely recognized accomplishment of standards-based reform, and particularly of the NCLB legislation.

Yet both the interview subjects and the workshop participants recognized the challenges of increasing equity in education and the limitations of what has been accomplished. The gaps have not been eliminated, and most agreed that reductions thus far have been fairly modest. Massell noted that according to a study by the Center on Education Policy (2007), gaps in most states remain substantial despite reductions, and some states have seen no reductions. Urban schools, those with the largest proportions of disadvantaged students, are the least likely to be meeting NCLB performance targets. Discussant Brian Stecher reinforced the concern that improvement has been modest, pointing out that "under the threat of severe sanctions from No Child Left Behind, there is an unknown amount of inflation in test scores, and what we see in terms of gap closing on state tests is not always replicated in other low-stakes assessments." Many participants viewed the challenge of providing a truly equitable education for disadvantaged students as a central purpose of standards-based reform.
Capacity

The interview subjects viewed the states' capacity to carry out all the improvements envisioned in standards-based reform as the most significant challenge to improving equity and achieving its other goals, and workshop participants were quick to agree. The reforms have stretched state agencies and districts significantly during a period in which most have been losing personnel and resources. Massell noted that Massachusetts had 325 full-time staff when its reforms were enacted into law in 1993, although it had employed 990 just 13 years earlier. Smaller staffs have been responsible for developing new standards and aligning curriculum, instruction, and assessments to them. Other technical challenges, such as measuring the progress of English language learners in a valid manner, have increased the difficulty of implementing the intended reforms.
NCLB required support of Title I schools (those serving specified percentages of low-income children) in specific ways. As growing numbers of schools and districts fall short of the NCLB performance targets, the strains on personnel are increasing. Fully 25 percent of schools across the country fell short of adequate yearly progress targets in 2004-2005, and the numbers have been increasing since then, although Massell noted that that figure masks significant variation across states. For example, Florida and Alabama reported that as many as 67 percent of their schools and 90 percent of their districts would fall short in 2008. Moreover, many states project that a cascading number of schools will be identified as underperforming in the coming years, as the law's 2014 deadline for having 100 percent of students perform at proficient levels draws closer.

Capacity is critical to making a standards-based system perform as it is intended to. One necessary component of the strategy is data analysis, since, ideally, thoughtful analysis of timely data will guide teachers as they plan instruction; administrators as they plan teaching assignments, professional development, and many other aspects of their schools; and district and state staff as they make decisions about key questions, such as curriculum planning and resource allocation. Yet as Stecher and other discussants pointed out, teachers, administrators, and policy makers frequently lack either the training or the time, or both, to use the data they receive wisely. Few teachers have been adequately trained to use data to make improvements in instruction, and the annual testing data that are the most typical product of accountability systems are not particularly useful for that purpose.
More broadly, a number of participants stressed that standards-based accountability models provide a structure for identifying problems, but they do not directly address the challenges of bringing about better instruction. There is a risk that the standards-based reform model, and all of the testing and other time- and resource-intensive activities that are associated with it, may distract educators from one of the central challenges of reform: figuring out how to address the needs of disadvantaged students. As discussant Lynn Olson put it, one benefit of common standards could be to "force us to confront gross inequities," although educators and the public have known for decades that disadvantaged students are not doing well.

Quality

Building on the capacity issues, participants also discussed the gaps between the ideal model and reality. Olson noted that in the evaluation of state standards recently published by Education Week, not one state earned a top score on each of the criteria used, and many scored very
poorly in a number of areas (Editorial Projects in Education, 2008). Stecher expanded on this point, arguing that very high standards are needed for the standards themselves. Because everything (including curriculum, textbooks, development of assessments, and language for reporting results to the public) flows from the standards, they need not only to be clearly written and concise, but also to reflect current understanding of how children learn and their conceptual development. They also need to provide guidance about the performance criteria for determining whether students have mastered particular standards and guidance about the relative importance of the different elements included.

In practice, as the Quality Counts (Editorial Projects in Education, 2008) and other evaluations attest, state standards are not yet meeting those kinds of criteria. In the absence of the guidance that standards should provide, the default source for guidance is the assessment system. As Stecher put it: "We may be drifting toward assessment-based reform, rather than standards-based reform."

Yet the standards themselves may be the best developed aspect of the evolving reform systems. Participants called attention to persistent concerns about the nature, rigor, and quality of the assessments used in many states and about the narrowing effects they can have on curriculum and instruction. For example, few states systematically provide for extensive formative assessments that teachers could use to tailor instruction to individual students' needs. These kinds of concerns, many noted, suggest the potential benefits to states of greater uniformity among them. States could much more easily take advantage of one another's knowledge and experience and avoid duplication of effort if they were applying consistent frameworks.
This point was reinforced by questions about whether the multiple-standards model has yielded the consistency that was hoped for even within states. Researchers and policy makers from several states suggested that there is far more variation in both content and performance standards in practice than may be evident in states' written plans. As discussant Rae Ann Kelsch explained: "People are very reluctant to give up control." Although she spoke on the basis of the experience in North Dakota, which has not embraced standards wholeheartedly, others echoed her view. Standards-based systems have provided a model and a unifying conception of the purpose of education, "but very different goals can exist under the same banner," as one participant put it. Discussant Scott Montgomery said that the problem lies in changing the entire system, not just in unifying the standards, so for him common standards would not necessarily bring the changes that he believes are needed.
VARIABILITY IN CONTENT AND PERFORMANCE STANDARDS AND ASSESSMENTS

Similarities and differences among states' content and performance standards are key to understanding the extent to which any move toward more common standards would have a substantive impact on current standards-based systems. An examination of states' approaches was organized around three framing questions:

1. How and to what extent do K-12 state content standards in English/language arts, mathematics, and science at key grades vary?
2. How and to what extent do K-12 state performance standards in English/language arts, mathematics, and science at key grades vary?
3. How and to what extent does the implementation of K-12 state content and performance standards in multiple academic subjects in classrooms vary?

Analysis of both content and performance standards provided the foundation for an extensive discussion of these questions. Andrew Porter and his colleagues described a very detailed review of 31 states' standards in the three subjects, with a focus on grades 4 and 8, which was developed for the workshop. Michael Petrilli described an analysis conducted by the Fordham Foundation and the Northwest Evaluation Association to compare proficiency standards across states, and Peggy Carr described the results of an analysis by the National Center for Education Statistics of the relationship between proficiency standards for state assessments and those of the National Assessment of Educational Progress.

Content Standards

Porter and his colleagues addressed the first question by analyzing state content standards in English/language arts/reading, mathematics, and science for grades K-8 (Porter, Polikoff, and Smithson, 2008). Their analysis was based on a conceptual framework for considering the primary influences on teachers' instructional practices.
Footnote: The term "proficiency standards" refers to the level of performance identified on a particular test as the minimum that qualifies as "proficient." Thus, it is a type of performance standard.

Their hypothesis was that teachers are most strongly influenced by standards policies that have five characteristics:
1. They are specific in their messages to teachers about what they are to teach.
2. They are consistent (aligned) among themselves so that teachers receive a coherent message.
3. They have authority, in that they are developed and promoted by experts, are officially adopted by the state, are consistent with standards practice, and are promoted by charismatic individuals, meaning individuals who provide leadership and motivate those who must implement the standards.
4. They have power, in that compliance with them is rewarded, whereas failure to comply is sanctioned.
5. They have stability, in that they are kept in place over time.

The team also based their analysis on a methodology they developed for describing in detail what it is that teachers teach, which they call a three-dimensional content language. Although this methodology actually predated the standards-based reform movement, it has proved a useful tool for examining the content of state standards documents.

Porter described the method as a way of producing a visual representation of the relative coverage of various elements of a particular field that is similar to a topographical map of a geographical region. Using the content language, Porter and colleagues have divided each subject into general areas. For example, in mathematics there are 16 general areas (e.g., operations, measurement, basic algebra), and each of these can be further subdivided into between 9 and 14 more specific topics, for a total of 217. Apart from the subtopics that make up each field, the language also distinguishes levels of cognitive demand, which are also somewhat different for each subject. There are eight cognitive levels for mathematics: memorize, perform procedures, demonstrate understanding, conjecture, generalize, prove, solve novel problems, and make connections. Thus, Porter explained, content is defined as the intersection of these two dimensions.
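The idea of content as the intersection of topic and cognitive demand can be illustrated with a small data structure. The following is a minimal sketch, not the researchers' software: the function name `content_grid`, the topic labels, and the choice to normalize cell counts so that they sum to 1 are all assumptions made for illustration. Only the eight cognitive levels come from the text.

```python
from collections import Counter

# The eight cognitive-demand levels for mathematics, as listed in the text.
COGNITIVE_LEVELS = [
    "memorize", "perform procedures", "demonstrate understanding",
    "conjecture", "generalize", "prove", "solve novel problems",
    "make connections",
]


def content_grid(coded_standards):
    """Map (topic, cognitive_level) codes to cell proportions.

    Assumption for this sketch: each coded standard counts once, and the
    counts are normalized so that all cells together sum to 1.
    """
    for _, level in coded_standards:
        if level not in COGNITIVE_LEVELS:
            raise ValueError(f"unknown cognitive level: {level}")
    counts = Counter(coded_standards)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical codes for four standards; the topic names are illustrative.
codes = [
    ("linear equations", "memorize"),
    ("linear equations", "solve novel problems"),
    ("linear equations", "solve novel problems"),
    ("measurement", "perform procedures"),
]
grid = content_grid(codes)
```

Here the same topic appears in two different cells because it is paired with two different levels of cognitive demand, which is the distinction the content language is designed to capture.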
Using this tool one can determine, for example, not just whether or not linear equations are covered, but also whether students will be expected to memorize one, solve one, or use one to solve a story problem.

To apply this language analysis to a state's standards, trained analysts review and code the most specific available description of the standard for a particular subject and grade level. Each standard is analyzed by three to five analysts, and items that are difficult to code are flagged and discussed. The codes are entered into cells as proportions, with 0 representing no emphasis and 1 representing a very strong one. The cells are used to build the visual display that illustrates both the degree of focus on the various topics and the cognitive demand.

Porter and colleagues drew on data from 31 states, although not all
provided data for every subject and grade level. The team also analyzed the national science standards and those of the NCTM. They focused on grades 4 and 8, with the goal of highlighting the degree to which states' standards showed overlap or conceptual progressions between those two grades. Having entered the codes, they were able to conduct a variety of analyses and comparisons.

For any pair of states for which they had data, the alignment for a particular standard can be calculated. Using averages of these results, they were also able to calculate alignment across and within grade levels. Figure 2-1 presents a pair of coarse-grained content maps showing the results for the two states that are most aligned in English/language arts/reading for fourth grade. It is clear from the figure that content areas such as vocabulary, comprehension, and language study (the darkest areas) are strongly emphasized in both Ohio and Indiana and that neither state places any emphasis on phonemic awareness.

FIGURE 2-1 Coarse-grained content maps for English/language arts/reading for grade 4 (Alignment = .48). SOURCE: Porter, Polikoff, and Smithson (2008, Figure 1).

These content areas, which
show up as darkest for both states, also line up along the column labeled "explain." This indicates that both states target the level of cognitive demand the researchers labeled as "explain" for these content areas. In other words, whichever areas show up as dark in the maps for both of the states being compared are ones that are emphasized in both states. One of the areas that showed up as strongly aligned for these two states, comprehension at the "explain" level of challenge, is shown in greater detail in the two fine-grained content maps in Figure 2-2. The subcategories of this instructional area (which include recognizing the meaning of words from the context; understanding of phrase, sentence, and paragraph; and the like) are mapped in the same way, according to degree of emphasis and cognitive level.

FIGURE 2-2 Fine-grained content maps for comprehension, English/language arts/reading for grade 4. SOURCE: Porter, Polikoff, and Smithson (2008, Figure 22).

This pair of maps shows, for example, that both Ohio and Indiana place strong emphasis on such strategies as activating prior
knowledge and that both associate the same degree of cognitive demand with this strand of the standard for reading comprehension.

The team also looked at the degree of alignment among 14 states in English/language arts/reading for grades 4 and 8, as shown in Table 2-1. The index runs from 0 to 1.00, with 1.00 representing perfect alignment. The results show significant variation, from lows, such as the .07 alignment between Maine and Wisconsin for grade 8, to highs, such as .47 between Ohio and California for grade 8. The researchers also examined the degree of alignment among the states' standards and the national science standards and the NCTM standards.

Porter and colleagues (2008) have produced a voluminous body of data: 90 figures, 10 tables, and an appendix; this material and further detail about their analysis can be found in their paper. Porter explained that although it would be much easier if these data could be simplified, he and his colleagues could find no substitute for the fine-grained analysis for answering the questions at hand.

Nevertheless, some key general points were evident. First, they found little evidence to support the hypothesis that there is a de facto national curriculum. The degree of variability they found across states, and between state and national standards, does not support that hypothesis. In fact, they found that the alignment of topic coverage within states from grade to grade (the degree of overlap in what is in the standards for each grade, as students ideally progress) is generally greater than the degree of alignment across states in the material they cover at particular grades.
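This chapter does not give a formula for the alignment index. As a minimal sketch, assuming the form commonly attributed to Porter (one minus half the sum of absolute cell-by-cell differences between two grids of proportions that each sum to 1), the calculation could look like the following. The function name and the two small grids are hypothetical and do not reproduce any state's actual data.

```python
def alignment_index(grid_a, grid_b):
    """Return 1 - (sum of |a - b| over all cells) / 2.

    Assumes each grid maps (topic, cognitive_level) cells to proportions
    that sum to 1; a cell missing from a grid counts as 0 emphasis.
    """
    cells = set(grid_a) | set(grid_b)
    total_diff = sum(
        abs(grid_a.get(cell, 0.0) - grid_b.get(cell, 0.0)) for cell in cells
    )
    return 1.0 - total_diff / 2.0


# Hypothetical grids: the states share one cell and differ on the other.
state_a = {("comprehension", "explain"): 0.5, ("vocabulary", "recall"): 0.5}
state_b = {("comprehension", "explain"): 0.5, ("phonics", "recall"): 0.5}

index_ab = alignment_index(state_a, state_b)
index_aa = alignment_index(state_a, state_a)
```

Under this assumed definition, identical grids score 1.00 and grids with no emphasis in common score 0, which matches the 0 to 1.00 range described above.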
The repetition, Porter suggested, sends students the message: "Don't you dare learn this the first time we teach it; otherwise you are going to be bored out of your skull in the subsequent grades."

Porter and colleagues did find some indication that a few core areas are covered more consistently across states than the overall alignment data would show, which suggests a small de facto common core curriculum. However, they also concluded that states' content standards are in general not focused on a few big ideas. Although the states vary in this as well, overall their standards do not demonstrate the clear focus and discipline that many have advocated.

State Assessments

Assessing the extent to which the performance standards that states set come close to defining a de facto common standard for proficiency was the impetus behind another study, described by Michael Petrilli.

Footnote: The material is available at http://www7.nationalacademies.org/cfe/State_Standards_Workshop_1_Agenda.html [May 2008].
TABLE 2-1 State-to-State Alignment, 4th and 8th Grade Standards for English/Language Arts/Reading. SOURCE: Porter, Polikoff, and Smithson (2008, Table 1).
This study, conducted jointly by the Fordham Foundation and the Northwest Evaluation Association (NWEA), was designed to address three questions:

1. Where are states setting the proficiency bar, and can their approaches to setting cut scores be compared in a fair way?
2. Given the pressure to bring 100 percent of students to proficient levels, are states lowering their standards over time in order to meet that goal?
3. Are cut scores relatively consistent in terms of difficulty level across grades?

Fordham and the NWEA decided to collaborate to conduct this study because the NWEA has developed a computer-adaptive assessment system that is used by many districts. The test is used primarily for diagnostic testing, so that districts can pinpoint the performance levels of individual students in different areas that are covered in the relevant state standards. Thus, the assessment is designed to be as well aligned as possible to the content standards of the states in which it is used. The NWEA maintains a large pool of items, and it constructs tests for districts by using the items that are closest to the standards for which that district is responsible. Because all the items are pegged to a common scale, the NWEA is able to make some comparisons across states.

With this resource, the researchers were able to estimate where on the NWEA scale a given state is setting its cut score. In many cases, that calculation was available for two times, 2003 and 2006, and thus they were able to consider the possibility that the cut scores had changed over time in those states. They had data for a total of 830,000 students in 26 states who had taken both the NWEA assessment and their own state exam.
The researchers' primary finding was that there is enormous variability in the level of difficulty of states' tests: a range from approximately the 6th percentile (94 percent would pass) to approximately the 77th percentile (23 percent would pass). These findings are shown in Figure 2-3.

To illustrate the kinds of differences these numbers represent, Petrilli provided two sample grade 4 items from the NWEA assessment, each with a difficulty level at the cut score of one of the states. For the Wisconsin cut score, which they had calculated at the 16th percentile on the NWEA scale, the sample item asked students to select from a group of sentences the one that "tells a fact, not an opinion." To represent the comparable cut score for Massachusetts, calculated at the 65th percentile, Petrilli showed an item that asked students to read a complex, difficult passage (excerpted from a work by Leo Tolstoy) and to pick from a list of factual statements the one that is actually found in the passage.
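The relationship between a cut score's percentile and the share of students who would pass can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the helper names and the list of scale scores are hypothetical, the percentile rank is taken as the percent of scores strictly below the cut, and nothing here reproduces the Fordham/NWEA procedure for locating a state's cut score.

```python
from bisect import bisect_left


def percentile_rank(scores, cut):
    """Percent of scores that fall strictly below the cut score."""
    ordered = sorted(scores)
    return 100.0 * bisect_left(ordered, cut) / len(ordered)


def percent_passing(percentile):
    """Percent of students at or above a cut set at the given percentile."""
    return 100.0 - percentile


# Hypothetical common-scale scores for 100 students: 1, 2, ..., 100.
scores = list(range(1, 101))
rank = percentile_rank(scores, 17)   # 16 of the 100 scores are below 17
passing = percent_passing(rank)      # share of students at or above the cut
```

Applied to the figures in the text, a cut at the 6th percentile corresponds to 94 percent passing and a cut at the 77th percentile to 23 percent passing.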
FIGURE 2-3 Difference in difficulty of state tests (panels: 3rd grade reading cut scores; 8th grade math cut scores). SOURCE: Petrilli (2008).
Petrilli believes the implications of this degree of difference are profound. If, as many people believe, the high stakes attached to state tests mean that teachers focus the bulk of their attention on the students who are just below the proficient level, to get them over that bar, then teachers in Wisconsin will be targeting their instruction at a very low level in comparison to those in Massachusetts. This analytical approach also made it possible to compare the cut scores that states set for mathematics and for reading, at least in terms of percentiles. Doing so is useful, Petrilli explained, because test results that seem to demonstrate that students are doing better in one subject than another may actually demonstrate that the level of mastery needed to score at the proficient level is quite different for each subject.

With regard to the second question, whether states are engaged in a so-called race to the bottom, the researchers were surprised to find that this does not seem to be the case. Rather, Petrilli characterized the trend as a "walk to the middle." Most states kept their cut scores relatively consistent across the time period studied, but the states that began with the highest standards had moderated their standards somewhat, while those with the lowest standards had raised theirs. He cautioned, however, that because they were working with percentiles, the change over time could be explained either by intentional shifts in cut scores or by changes in students' actual achievement levels.

In terms of the last question, the vertical alignment of state standards, the analysis showed that they are not well calibrated, grade level to grade level. In the majority of states, the elementary standards are set significantly lower than the middle school standards. When this is the case, students may have no trouble with the grade 3 test, proceed normally through subsequent grades, and then stumble on the grade 8 test.
The aggregate results may inaccurately indicate a problem with middle school instruction, in comparison to elementary school instruction. Moreover, if standards are not aligned vertically, the test results will not be good indicators of students' growth over time.

Petrilli drew three conclusions from the research. First, state performance standards need "an overhaul." If the goal is for standards to progress cumulatively from kindergarten through grade 12, states should begin with rigorous high school graduation requirements and work backward to develop vertically aligned standards. Second, Petrilli believes that the objective of bringing 100 percent of students to proficiency has become a perverse incentive that has the effect of lowering achievement. Finally, in responding to the workshop theme, he said that discussion of national standards should continue, and that such discussion would have the effect of creating consistent objectives for students across the states.
Performance Standards

Another way to examine the question of how much performance expectations vary across states is to use the National Assessment of Educational Progress (NAEP). NCLB requires not only that states report progress on their own assessments, but also that they participate in NAEP so that it can serve as a common yardstick for comparing students' proficiency across states. The results of these comparisons indicate striking discrepancies between the performance required for proficiency on state assessments and what is required for proficiency on NAEP assessments. These results have received significant public attention, and, as presenter Peggy Carr explained, the National Center for Education Statistics (NCES) recognized the need for a more precise methodology with which to make these comparisons (see National Center for Education Statistics, 2007).

Figure 2-4 illustrates the discrepancies between the proficiency levels that states use to report their adequate yearly progress and the NAEP proficiency levels, in terms of percentages of students meeting the standard. This information is useful to provide a snapshot, Carr explained. Since each state is asked to use NAEP as a benchmark, the comparison between each state and NAEP is valid. However, comparing states just by using the percentage meeting the standard is less useful.

FIGURE 2-4 A comparison of state proficiency and NAEP standards (series shown: percent meeting AYP standard; percent at or above NAEP Proficient). SOURCE: Carr (2008).

Consequently,
NCES staff used an equipercentile equating method to align the distributions of pairs of scales, the NAEP scale and that of each of the states. In other words, they used results from schools that had participated in NAEP to calculate what they called a NAEP-equivalent score on the state assessment. Having done that for each state, they could then compare the NAEP-equivalent score of any state with that of any other. What the comparison shows is the relative degree of challenge of a state's standards using the NAEP scale as the common yardstick. Figure 2-5 shows how the comparison works for two hypothetical states.

FIGURE 2-5 Methodology for comparing state proficiency standards. SOURCE: Carr (2008).

The results of this analysis were quite similar to the results of the Fordham/NWEA analysis. Generally, the researchers found that states' proficiency levels varied significantly and that the majority map onto the "below basic" range on the NAEP scale, although the distribution varied by subject and grade. The results for mathematics are shown in Figures 2-6 and 2-7.

FIGURE 2-6 A comparison of proficiency standards in grade 4 mathematics. SOURCE: Carr (2008).

FIGURE 2-7 A comparison of proficiency standards in grade 8 mathematics. SOURCE: Carr (2008).

The researchers also looked at the correlation between the proportion of students that a state reports as meeting its proficiency standards and the NAEP-equivalent score. They found that the correlation was negative: that is, the higher the number of students that a state reports are passing its own standards, the less challenging are that state's standards. The researchers also found that the position of a state's adequate yearly
progress standards on the NAEP scale bears little relationship to that state's performance on the NAEP assessment. In other words, students' performance on NAEP cannot be predicted from the relative difficulty of the state's own standards.

Carr's conclusions from these results were similar to Petrilli's. To illustrate their significance, she highlighted the results for three contiguous states, Georgia, North Carolina, and South Carolina. Students in these three states all perform at about the same level on the NAEP reading assessment, but the states have set very different standards for their students. An example of the practical effect of this discrepancy is that a student who moves from North Carolina to South Carolina might go from being viewed as a proficient reader to being placed in a remedial class.

Her closing point was that state assessments vary widely in both content and design, and that states may attach different meanings to the label "proficient." In the context of NAEP, proficiency is defined as "competency over challenging subject matter," whereas states generally define proficiency as grade-level performance.

Participants and discussants had a range of comments about the variation that was described, factors that may contribute to it, and its implications. Discussant Barbara Reys drew on her experiences co-chairing the standards development process for mathematics in Missouri to highlight some of the practical challenges of working toward common standards. Apart from the requirements of states that prize their autonomy, she noted the limitations of existing national standards, which may not be grade specific and lack other critical details.
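As a rough illustration of the NAEP-equivalent score described earlier in this section, the basic idea of equipercentile linking can be sketched in a few lines of Python. This is a simplified sketch and not the NCES procedure itself: it assumes a single pooled list of NAEP scores for one state's sample and a single reported percentage of students meeting the state standard. The function names and the score data are hypothetical.

```python
def percentile(sorted_scores, q):
    """Return the q-th percentile (0-100) of an ascending list,
    using linear interpolation between neighboring values."""
    if not sorted_scores:
        raise ValueError("need at least one score")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    pos = (len(sorted_scores) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_scores) - 1)
    frac = pos - lower
    return sorted_scores[lower] + frac * (sorted_scores[upper] - sorted_scores[lower])


def naep_equivalent_score(naep_scores, pct_meeting_state_standard):
    """Map a state's proficiency cut onto the NAEP scale (simplified).

    If p percent of students meet the state standard, the mapped cut is
    the NAEP score that the same p percent of students score at or above,
    which is the (100 - p)th percentile of the NAEP scores.
    """
    ordered = sorted(naep_scores)
    return percentile(ordered, 100 - pct_meeting_state_standard)


if __name__ == "__main__":
    # Hypothetical NAEP scores for one state's sample (not real data).
    sample = list(range(200, 301))
    # Hypothetical passing rates: a lenient state cut and a strict one.
    lenient = naep_equivalent_score(sample, 70)
    strict = naep_equivalent_score(sample, 30)
    print("lenient cut maps to", lenient)
    print("strict cut maps to", strict)
```

In this sketch, a higher reported passing percentage maps to a lower point on the NAEP scale by construction, which is the sense in which a state reporting many proficient students can be said to have a less challenging standard.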
Reys was not surprised at the finding that many states' standards do not align with national ones because "it's really the decisions about what you want to focus on at particular grades that are the tough ones."

She also showed some results from an analysis she had conducted of the consistency of K-8 mathematics standards. Her findings echoed those already presented. She found dramatic variation from state to state in the grade placement of particular concepts. The critical finding was again that a given learning expectation might be found in the grade 1 standards in one state and in the grade 3 standards in another state.

These differences create a significant complication for textbook publishers who want to serve multiple states. From Reys's analysis, only 4 of 108 possible learning expectations for fourth graders were common across 10 states, suggesting that a textbook publisher might choose to incorporate all 108 of them. Since the content of textbooks has a significant effect on teachers' instructional plans, this lack of overlap becomes a self-reinforcing pressure against curricular focus. At the same time, however, textbooks are a potential tool for increasing uniformity because they are so influential.
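The kind of overlap count reported from Reys's analysis can be illustrated with a small Python sketch. The state labels and learning expectations below are invented for illustration and are not drawn from her data, and the function name is hypothetical.

```python
def overlap_summary(standards_by_state):
    """Return (common, all_listed) for a mapping of state -> expectations.

    common:     expectations that appear in every state's standards
    all_listed: expectations that appear in at least one state's standards
    """
    sets = [set(expectations) for expectations in standards_by_state.values()]
    if not sets:
        return set(), set()
    common = set.intersection(*sets)
    all_listed = set.union(*sets)
    return common, all_listed


if __name__ == "__main__":
    # Hypothetical grade 4 expectations for three made-up states.
    grade4 = {
        "State A": {"add fractions", "read bar graphs", "multiply two-digit numbers"},
        "State B": {"multiply two-digit numbers", "identify right angles"},
        "State C": {"multiply two-digit numbers", "read bar graphs"},
    }
    common, all_listed = overlap_summary(grade4)
    print(len(common), "of", len(all_listed), "expectations are common to all states")
```

A publisher that wants to satisfy every state has to cover the full set of listed expectations, while only the shared subset is common ground, which is the tension the analysis points to.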
Discussant William Schmidt characterized the variation among state standards as "enormous." He believes that both mathematics and science standards display "the maximum possible variation at every combination of grade level and topic." He suggested that this is particularly bad for mathematics because that subject has an inherent logic, so that it is essential that students learn concepts in a particular order if they are to develop sound mathematical thinking. The problem, he said, is that because so few standards establish coherence and vertical alignment in mathematics goals, the result is a mishmash, with many concepts being introduced far too early and then repeated over and over in subsequent grades. Ironically, he explained, the topics that get the least coverage tend to be the most important: the deeper topics that build conceptual understanding.

Schmidt has also observed that district standards vary as much as those of states. Moreover, he suggested, variation at the classroom level, in terms of what teachers are actually covering with their students, far outpaces the variation at the district and state levels. For Schmidt, this variation, which permeates the entire education system, is "the Achilles heel of the No Child Left Behind Act." Based on his analysis, he argued that the degree of variation in the opportunities children have to learn makes it inevitable that many will be left behind.

Discussant Peter McWalters offered a perspective from Rhode Island, which has coordinated its standards development with two other states, New Hampshire and Vermont. Although the presentation suggested a number of questions for this consortium of states to ponder as they work to improve their standards, he labeled the effort a success and added that he would be happy to see a national model.
He noted that NCLB had been the impetus for the efforts of the New England states because none of the three has a testing infrastructure and all are too small to produce what is required on their own. They were also fortunate in that none of them has regulatory requirements, such as state-mandated standards, that would present a barrier to the states working collectively.

However, McWalters identified what he sees as a major stumbling block to a national approach to standards: that "no state would trust the feds after our experience with the beginning of No Child Left Behind. . . . There is zero trust." He also supported points made earlier regarding states' capacity to change in the ways that are needed. For him, the biggest challenge is to find ways to serve diverse students with diverse needs. To do that successfully, teachers will need a command of their subjects, the content and the pedagogy, that is "way beyond what we currently have."

Note: The three-state consortium, the New England Common Assessment Program, is discussed in greater detail in Chapter 7.
PARADOXES

Committee chair Lorraine McDonnell reflected that the discussion of standards as they are currently operating, and the findings regarding variation across the states, yielded two significant paradoxes. The first is that, although standards are very well institutionalized across the country, with very few voices challenging their value as an organizing framework for reform, it is also the case that standards-based reform means different things to different people. The term in some ways disguises deep-seated differences about both priorities and strategies for achieving education goals.

The second paradox is that, although there is little ostensible disagreement about the standards-based approach, there is a wide gap between the theoretical model and the reality of standards-based accountability systems in practice. The theoretical model of an aligned system is compelling as a strategy for meeting the needs of diverse students. Yet in practice, states and districts have lacked the capacity, resources, and, perhaps in some cases, the knowledge or the will to put all the essential elements into place. Participants described legislators and other policy makers who have viewed the development of a new core curriculum or the raising of high school graduation standards as all that is required to pursue standards-based reform. Disputes over the significance of testing results, and the effects the reporting of these results can have, have further clouded the discussion.

The result is a situation in which the core concept, the goal of defining standards and holding educators accountable for the results, seems to be constant across states, but in which the execution of this goal yields starkly different results.