Read "Learning from TIMSS: Results of the Third International Mathematics and Science Study, Summary of a Symposium" at NAP.edu


Suggested Citation:"Policy Issues." National Research Council. 1997. Learning from TIMSS: Results of the Third International Mathematics and Science Study, Summary of a Symposium. Washington, DC: The National Academies Press. doi: 10.17226/5937.



groups because the context in which each participating student has learned science and mathematics is different. However, clear evidence that a particular intervention had a particular result is not necessary to make the data useful. Data from each of the components can be used to enrich understanding of education and, more significantly, to identify promising connections that can then be further explored.

POLICY ISSUES

The many policy issues raised by the initial findings from TIMSS were on the minds of presenters and participants alike throughout the symposium, and many of them were raised more than once in different contexts. The issues raised can perhaps most easily be summarized as four basic messages that were drawn from what was known about TIMSS at the time of the symposium.

Understanding Differences Among Countries

The TIMSS results clearly highlight the importance of understanding differences among countries. This issue, while seemingly obvious in that the purpose of TIMSS is to compare the educational structure and performance of participating nations, was manifested in two particular ways at the workshop that will be of interest to policy makers.

The first of these was primarily addressed by Mike Atkin and Paul Black, whose paper summarized some of the results of a 13-country study, the Innovations in Science, Mathematics, and Technology Education Project, sponsored by the Organization for Economic Co-operation and Development (OECD), for which they collected case studies of innovative approaches to mathematics and science education (Atkin and Black, 1997).6 From this work they concluded that while every single participating nation (including those that performed well on TIMSS) is decidedly dissatisfied with the status of its own approach to mathematics and science education, not all nations share the same motivations for seeking improvement.
Many, particularly those facing high unemployment, share with the United States an overriding concern with preparing young people for the labor market and using a focus on excellence in mathematics and science as a means of improving productivity and fostering economic growth. Others were motivated by quite different concerns, such as the state of adolescent health, or the need to address environmental deterioration (Atkin and Black, 1997:5). According to Atkin and Black, Japan is primarily motivated by the concern that its students are not sufficiently creative, despite measurably high achievement, and their reforms have been generally targeted toward fostering innovative problem-solving skills and encouraging real-world applications for mathematics and science education.

6 The 13 countries involved in the study were Australia (Tasmania), Austria, Canada (British Columbia and Ontario), France, Germany, Ireland, Japan, Netherlands, Norway, the United Kingdom (Scotland), Spain, Switzerland, and the United States.

This point was echoed by Jan de Lange, who cautioned that to an observer from abroad, the United States' virtual obsession with economic competition, particularly with Japan, is, to say the least, puzzling. He reminded participants that somewhat loftier goals for education (the proposition that "it makes people richer intellectually and culturally and prepares them for an increasingly complex society," for example) have a practical application (de Lange, 1997:7). Such goals, he argued, can enhance the development of intellectually rich academic standards that are appropriate to their context. He suggested that the heavy emphasis on standardized test scores in the United States has distorted both curricula and expectations for student learning. Atkin and Black made a similar comment, noting that "there is no substitute for hard argument within each country, to formulate the standards of high quality that it values and to work out the policies that can help achieve those standards" (Atkin and Black, 1997:16).

Atkin and Black stressed that their experience with the OECD study makes clear that the TIMSS results are a snapshot taken at a fixed point in time, a snapshot of student performance and of educational systems that are in near-constant flux. Their point, that the TIMSS results must be seen as a baseline against which changes in education can be marked, was shared by symposium presenter Richard Elmore, who demonstrated a second reason that the context for each country's performance is so crucial.
Elmore's focus was on the role TIMSS plays in the education policy environment in the United States, and his argument was that the study provides a unique opportunity in this country because of the time at which it was done (Elmore, 1997). This, he argued, is a time when the proposition that imposing formal standards for students, teachers, and schools has real potential for improving U.S. schools has achieved an almost unprecedented level of agreement among concerned groups. Consequently, he maintained, the data produced by TIMSS, which include detailed information about classroom practice, curriculum, teacher preparation, and many other contextual factors, should provide support for education leaders who want to take standards the crucial step forward, into classroom practice.

Elmore structured his argument around a description of the U.S. political system as being characterized by both pluralism and dispersed control. Tying these characteristics to our education system, Elmore pointed out that the system is pluralistic in the sense that any constituency that is able to muster a critical mass of support can have an impact on education policy. He argued that the structure of education governance in the United States is neither centralized nor, though it is often so described, localized. Elmore prefers to describe control over education governance as "dispersed": depending on the power of interested constituencies, influence can be wielded at any level. Though central controls are not prevalent, he noted, the federal government intervened with force in support of school desegregation during the 1960s. More typical are situations in which constituencies with differing views seek in their own ways to influence policy decisions made at various levels, and the outcome is determined largely by political clout. It is because of this possibly unique system that the current apparent consensus over the value of education standards is so remarkable, said Elmore.

Typically, Elmore argued, the dual effects of pluralism and dispersed control have helped to ensure that most "policy talk" is carried on at an abstract level and has little impact on the day-to-day negotiations about specific decisions. (Elmore credited Tyack and Cuban, 1995, for this point.) TIMSS presents the novel possibility that policy prescription could move into the "instructional core," as he put it, by influencing decisions about "what gets taught to whom." TIMSS was designed to investigate the links between achievement and contextual factors and was based on the conviction that classroom decisions and other contextual variables have significant effects on student learning. For this reason, Elmore argued, it should provide real support for policy decisions that truly confront what are for him the two key issues for the success of the standards movement, capacity and incentives.

Elmore formulated what he described as a new principle, "reciprocity of capacity and accountability," to explain his conception of how standards-based reform ought to proceed. His concern is that holding schools accountable for student performance is tremendously risky (Elmore, 1997:15):

    Race, social class, and home environment are the strongest predictors of education performance for students.
    Rewarding and punishing schools based on their performance under these circumstances means rewarding and punishing them, in effect, for the students they serve. Worse yet, adjusting rewards and punishments for student background means that certain schools will be allowed to continue to have lower expectations for their students than other schools, thus defeating the main purpose of standards-based reform.

Acknowledging that this is, as he put it, "a horrendously difficult problem," Elmore maintained that TIMSS can play a valuable role in focusing discussion on the issue. The study strongly emphasizes the connection between student learning and the many influences on teachers and schools that affect it. Consequently, it supports his argument that identifying and providing the supports necessary to enable students, teachers, and schools to meet established standards will be crucial to the success of standards-based reform.

Jan de Lange had a somewhat different perspective on the same issue. He noted that "there is no mechanism that steers innovation in the United States." He added that although the United States spends more money than any other country in the world on research about mathematics education, to an outsider it does not seem that this research has provided as much benefit to students and teachers as it should have. Because so many decisions about school governance are made at the local level, and because, in his words, "the school board people are not always, let me put it gently, experts in education," he believes they are not particularly likely to be aware of, or persuaded by, education research. De Lange and Elmore shared a conviction that for improvement to occur, the gap between research and theory, on one hand, and practice, on the other, must be bridged.

Finally, de Lange reiterated the point that understanding of the contexts that influence education within each country is indispensable. He called for a focus on variations of performance within nations as well as those between them. Citing the vast differences that have been revealed (through the International Assessment of Educational Progress) between the performance of students in Iowa, North Dakota, and Minnesota and that of students in Alabama and Louisiana, for example, he remarked that "this gives at least a suspicion that we cannot blame textbooks or curriculum alone." He maintained that this variation in performance ought to be "unacceptable" (de Lange, 1997:10).

Support for Teachers

Although discussions throughout the symposium touched on issues that revealed potential conflicts of various sorts, two basic points of agreement emerged clearly. Perhaps clearest was a ringing endorsement for the idea that teachers in the United States require far more support than they are currently getting if they are to effect the desired improvements.

Jan de Lange remarked that he had "never seen teachers working under [such] bad conditions . . . as American teachers" and deemed it "remarkable that we still end up in the middle" under these circumstances.
He cited their few opportunities for professional development, their low status, and the incoherence of the system in which they function as just a few among the many problems they face. Mary Lindquist followed up by noting that in her experience working with teachers, what they want most is "the time to do the things that they think they should be doing."

Atkin and Black addressed the role of teachers from a different angle. One of the conclusions they drew from the OECD project was that the absolute dominance that university-based scientists and mathematicians have had over the content of K-12 instruction is declining. Teachers in particular, they noted, are gaining new influence in determining what should be taught, at least in some areas. However, as they put it, "change creates turbulence" (Atkin and Black, 1997:11). For teachers to exercise this influence comfortably, Atkin and Black explained, they need opportunities for collaborating with their peers, and for upgrading and maintaining their own subject knowledge. They called attention to some revealing data from TIMSS showing that U.S. science teachers average significantly fewer hours per week devoted to both professional reading and development and to lesson planning than did teachers in such high-scoring countries as Japan, Hungary, and Singapore (Atkin and Black, 1997:13).

Elmore also addressed the urgency of attending to what teachers need in order to do their jobs well. He noted that "the work day of most teachers is organized in a way that allows them virtually no time to engage in any sustained learning about how to do their work differently," and that "most professionals learn new practices by working with other professionals, in close proximity to the details of practice, and by making their clients pay for the surplus time required to retool and renew themselves" (Elmore, 1997:13). He views it as critical that teachers be given similar opportunities at the same time they are required to meet new standards. He also noted how ill-suited most existing standards documents are for helping teachers make immediate decisions about what and how to teach. To be useful to teachers, he argued, these documents need to take account of the lesson time teachers actually have and to be "drastically pared, simplified, and operationalized in the form of lesson plans, materials, and practical ideas about teaching practice" (Elmore, 1997:12).

In general, participants and presenters clearly seemed to agree that while teachers need to be held to high standards themselves and to significantly raise their expectations for U.S. students, they need to be supported in doing so with concrete and well-planned allocations of time and training.

Secondary Analyses of TIMSS Data

The other basic point of agreement at the symposium was that, despite numerous cautions and criticisms, the TIMSS data are extremely valuable and can serve as the platform from which a wide variety of secondary analyses can take off.
The bulk of the specific suggestions for valuable secondary analyses based on TIMSS data came from Edward Haertel, who had been asked to discuss the issue at the symposium. He began with the premise that linking single variables to achievement would likely be unprofitable. "The answers to all such questions," he wrote, "are likely to be equivocal, with many factors each being found to matter a little" (Haertel, 1997:5). For example, he explained, "more than two hours per day of television viewing may be associated with lower achievement, but it does not follow that [students'] watching less television will cause achievement to rise." He also noted that it may be far easier to use TIMSS data to identify factors that have no apparent effect than to calibrate the relative effects of those that are influential.

Haertel's suggestion for approaches to more fine-grained analyses of the data is to break them down in various ways. By exploring subsets of test questions, or items, he explained, it should be possible to begin addressing in detail some questions with significant policy implications. Clusters of items could be defined in a variety of ways, for example, by mathematics or science topics or by the type of task the item calls for. Alternatively, clusters of students could be defined by demographic factors, by exposure to particular material, or by school characteristics. Another approach would be to select subsets of items by statistical characteristics and then try to determine whether they share any features. Generally, looking at targeted portions of the data could provide answers to specific questions about the relative effects of various factors on achievement.

Haertel pointed out that scores varied far more within individual nations than they did between nations, and he said that gaining understanding of reasons for this would be extremely useful. The United States, he noted, has the third greatest variation in scores of the 41 nations that participated in the middle-school (Population 2) portion of TIMSS.7 One constructive response to that fact, he argued, would be to try to learn from the exceptions, to ask: "Where do the poor learn as much as the wealthy? . . . Where are classes large and resources meager but achievement still high?"

Haertel encouraged observers who are not psychometricians to participate in the formulation of questions to be addressed using the TIMSS data. He suggested four examples of areas of policy interest that could be explored, while acknowledging that there are many others:

· What are the patterns of gender differences in mathematics and science achievement in different nations?
· How does the variability in educational opportunity and outcomes within the United States compare with that within other nations?
· How widely are new ideas about mathematics curriculum and instruction being implemented?
· Do new approaches in instruction, school governance, or other areas, seem to lead to distinctive patterns of student achievement?
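
Haertel's observation that scores vary far more within nations than between them can be checked roughly against the figures in the chapter's footnote on score variation: a standard deviation of 106 for U.S. science scores and approximately 50 for the national averages. The sketch below is only an illustration under simplifying assumptions. It treats the two reported standard deviations as directly comparable and ignores the survey's sampling design, and the function name `variance_ratio` is hypothetical, not part of any TIMSS tooling.

```python
# Figures quoted in the chapter's footnote on score variation
# (science scores, middle-school Population 2).
us_within_sd = 106.0       # standard deviation of U.S. science scores
between_nation_sd = 50.0   # approximate standard deviation of the national averages


def variance_ratio(within_sd: float, between_sd: float) -> float:
    """Ratio of within-nation variance to between-nation variance.

    Hypothetical helper for illustration; variances are the squared
    standard deviations, so the ratio is (within_sd / between_sd) ** 2.
    """
    return (within_sd ** 2) / (between_sd ** 2)


ratio = variance_ratio(us_within_sd, between_nation_sd)
print(f"within/between variance ratio for the U.S.: {ratio:.2f}")
```

Under these assumptions the spread of scores inside the United States is several times the spread of the national averages, which is the sense in which differences within a nation dwarf differences between nations.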
In general, Haertel suggested, the cross-national comparisons made possible by TIMSS are "sources of hypotheses of what to look for within the United States." Specific hypotheses cannot be tested using TIMSS data alone, he noted; the national populations are not comparable, so evidence of success with a particular approach in one place cannot be transferred to another. Haertel offered a reminder that TIMSS is not an instrument for comparing the results of educational "experiments" conducted in "laboratories" around the world, but a comparative observational study. "The most powerful uses of TIMSS," he explained, "may be to show us the range of the possible."

7 Among the participating nations, the standard deviation ranges from 72 to 111; the standard deviation for the U.S. science scores is 106. The standard deviation of the national averages is approximately 50.

Limitations of TIMSS

Symposium presenters were perhaps most outspoken in describing some of the "yellow lights" they wanted to hold up about ways in which the TIMSS results might be used or misused. Foremost among these concerns was that the study and its results are complex and that it is very tempting to oversimplify them in talking about their implications. Participants emphasized their concern, for example, that results from one of the three-country studies of middle-school students might easily be misconstrued as explaining achievement results for the 41 countries that tested that population, or those at the other two age levels.

Another danger of oversimplification was supplied by Atkin and Black, who noted that the practices the education community considers desirable are by no means always characteristic of the countries who performed well. "If . . . the cost of high scores is to incur or exacerbate weaknesses on other important criteria," they explained, "then there [would be] some difficult decisions to be made" (Atkin and Black, 1997:14).8

Many presenters and participants also pointed out that the education community actually knows very little about some of the high-performing countries. Since Singapore performed so well, they argued, the next step is to learn more about how that country actually educates its children, rather than to blindly imitate what is already known or, worse, assumed.

A second concern that was expressed by several participants is that TIMSS, although an exemplary assessment by many criteria, is in no way suited for use as a benchmark of world-class performance. As has been noted, the framework on which the achievement results are based covers only the content which the 45 participating countries could agree merited assessment.
It does not represent anyone's idea of a valid program of instruction in itself. It is not correct, as Richard Elmore emphasized, that "since the TIMSS study embodies standards that somehow these standards have some sort of authoritative standing as a consequence of having been connected up with very fine state-of-the-art empirical research." Moreover, as Jan de Lange and others made clear, the testing instrument, which had to be both affordable and understandable in countries all over the world, was capable of measuring only a limited universe of material. It was not designed to assess many of the skills identified in current mathematics and science standards, for example, because these cannot realistically be assessed in a large-scale assessment format.

8 Their point is reinforced by the fourth-grade results, released after the symposium, in which U.S. students at that level ranked considerably higher relative to their international counterparts than did U.S. eighth-graders. Clearly, policy prescriptions designed to make the U.S. system more like those of particular high-performing countries look even less sensible in light of this difference between the grades.

Michael Huberman offered another perspective on the notion of TIMSS as an international benchmark. "There seems to be a Zeitgeist permeating the study," he wrote (Huberman, 1997:7). His suggestion was that the U.S. NCTM standards had a heavy influence on the content framework, and that a policy perspective supportive of national curricula and of a "back to the basics" approach to standards seemed to lie behind some of the decisions about the structure of TIMSS. His concern was that these unexamined assumptions have guided the study itself and will guide interpretation and application of the findings.

Finally, many of the presentations offered reminders that the TIMSS results are not yet fully digested, and are by no means conclusive. Taking as an example the question of a national curriculum, it is clear that many perspectives are coexisting under the tent of TIMSS. The conclusion drawn by Bill Schmidt, based on his study of curricula and texts, is clearly that U.S. students don't perform as well as they could because their instruction is neither coherent nor consistent. While none of the other TIMSS researchers made causal claims as specific, it is clear that other plausible explanations deserve exploration. The preliminary findings from both of the qualitative studies presented at the symposium, for example, highlight compelling observations about classroom practice and contextual factors that might have large effects on student learning.

Atkin and Black clearly took issue with Schmidt's claim that a lack of curricular coherence accounts for the performance of U.S. students, noting that, "there is no strong evidence from the TIMSS data that the existence or absence of a nationally prescribed curriculum leads to improved performance" (Atkin and Black, 1997:15).
They noted that "although eight of the top ten countries [in science] all have national curricula, so do eight of the bottom ten," and the results are similar for mathematics (Atkin and Black, 1997:15). Paul Black concluded his remarks with a gloomy scenario for the United States related to this point. "My nightmare," he explained, "is that an incorrect conclusion from the TIMSS data is you need a firm national curriculum [and that] you need regular testing. It has got to be affordable; therefore it will be short; and we have got to do this quickly." Reminding participants that his own country, Great Britain, has recently instituted a national curriculum, Black argued that that experience had yielded little improvement and had damaged teacher morale.

Elmore implicitly addressed Schmidt's call for coherence in America's curricula by arguing, in effect, that it is not politically realistic. Noting that "the temporary bi-partisan consensus on goals and standards that followed from the Charlottesville summit [on education issues] concealed, it turns out, a deep and roiling suspicion of anything 'national' or 'federal' in matters of curriculum and student learning"
