of the three cultures did. Lois Peak of NCES later made the point
that the written questionnaires used carefully chosen language to ask
about homework because staff were aware of the issue. Neverthe-
less, Stevenson maintained that far more sense could be made of such
an issue through observation and interview than through a question-
naire.
Another example that Stevenson addressed, which had already
been raised several times during the day, was that of "juku," the
after-school classes attended by many Japanese students. Stevenson's
point was that "juku" is a very vague term that refers not only to
intense academic classes, but also to craft classes, sports, and other
structured social activities. Many U.S. observers have made the claim
that the Japanese students' superior performance can be explained by
their attendance at juku because they have assumed that it provided
students with rigorous training for college entrance exams and would
compensate for any weaknesses in the schools' academic programs.
Stevenson claimed that a deeper understanding of the cultural context
reveals that this is not true, or at least that it is a seriously oversim-
plified portrayal.
Stevenson described a few other findings from the study:
· The role of the school principal in Japan is very different
from that of one in the United States. In Japan, committees of teach-
ers have primary responsibility for running the school; the principal
serves primarily to "execute" the committee's decisions.
· Classifications of student ability come at different times in
the three countries. In the United States, the urge to assist children
who need it often leads to tracking decisions as early as kindergarten.
In Germany, a formal decision is made at the end of fourth grade. In
Japan such evaluations are made much later.
· The Japanese curriculum is "a set of broad guidelines of the
kinds of things that should be accomplished at each grade level."
Teachers are then given considerable latitude to develop specific ex-
pectations for different children. In Germany, Stevenson found, the
situation is more similar to that of the United States in that each state
is empowered to adopt its own guidelines. The German states are,
however, required to meet broad national guidelines.
To provide a sense of the flavor of some of the material the study
produced, Stevenson read extended quotations from several teachers.
He closed by remarking that "it is these kinds of . . . vivid, vital
responses that we think give a meaning to a case study . . . that is
very difficult to come up with in any other way."
CRITIQUES AND METHODOLOGICAL ISSUES
Lynn Paine, one of the session moderators, expressed a key issue
facing the participants when she pointed out that they had been shown
RESULTS OF THE THIRD INTERNATIONAL MATHEMATICS AND SCIENCE STUDY
graphs of international achievement scores for thousands of students
in the morning and a videotape of "one classroom, one teacher, a
small number of students" in the afternoon. "How," she asked, "do
we somehow bring those together?" Her question reflected not de-
spair but a sense that the challenge presented by TIMSS is a new one.
As was repeatedly pointed out, TIMSS includes data drawn from dif-
ferent samples and by means of different methods; moreover, the two
three-country studies were added to the original TIMSS design (at the
urging of the United States), and there is no detailed blueprint for
fitting these elements together.
Clearly TIMSS offers risks as well as possibilities. As one of the
symposium paper authors, Michael Huberman (1997:1), wrote: "Such
a study could run the risk of the centipede, marching off in several
directions at once." The results available so far suggest that different,
and possibly conflicting, conclusions might be supported by different
parts of the study. Moreover, because the qualitative studies are inno-
vations, neither means of verifying their results nor standards for evaluating
their methods are readily available. This section explores questions
raised about aspects of the study and the larger issue of linking its
components.
Linking the Components of TIMSS
A certain amount of ambiguity may be an inevitable outcome of a
study so large and complex. Theoretical or political concerns may
drive observers to focus more closely on either the implications for
curriculum raised by Schmidt's work or the concerns about teacher
preparation raised by Stevenson, for example, given that the study
itself was not designed to indicate which finding deserves more weight.
For purely practical reasons, few observers may have both the time
and skill to truly digest all that TIMSS has to offer. This point need
not diminish the usefulness of the study's component parts, but it will
surely affect attempts to integrate them.
Nevertheless, the study components each make a contribution to
answering core questions about teaching and learning in mathematics
or science, and they should be considered as a package. At the time
of the symposium, the first TIMSS reports had just been released, and
it will not be until some time in 1998 that the last of the reports
documenting the primary analysis for each of the study components
will be released. Links among the components of the study were not
really forged during this first stage. However, the ways in which
these links are forged once the primary analyses are completed will be
crucial, and symposium participants stressed the importance of estab-
lishing a clear linking framework. Several key points about the links
were made at the symposium:
· For the components of this study to be effectively linked, relationships
among different research disciplines will need to be established.
LEARNING FROM TIMSS:
Scholarly communities that are not accustomed to working
with one another's data will need to collaborate in innovative ways to
make the best use of the findings from TIMSS.
· What happens with TIMSS will be a model for the future.
Lois Peak reported that NCES is considering using videotapes in
future studies, but she noted that using this powerful tool in valid
ways is not a straightforward task. Given the initial reaction to what
is known about the qualitative studies and the publicity they have
received, it is likely that other researchers are already considering
applying these methods in other contexts. The education community
has a considerable appetite for rich data about teaching and learning,
but, as many at the symposium pointed out, these new kinds of data
can easily be misused.
· Simplistic understandings of TIMSS may be misleading. Until
the links are forged and subjected to rigorous scholarly scrutiny,
there is a danger that observers will use "common sense" to link the
data from the various components of TIMSS, perhaps yielding mis-
leading results. Observers who do not pay close attention might
easily miss the fine points in this complex study (the fact that some
data comes from only 3 nations and some comes from 41 or 26, for
example) and draw erroneous conclusions about explanations for
achievement results.5 There are obviously many other differences among
the study's components that are salient to any analysis that draws on
more than one.
The Achievement Study
As has been noted, many presenters marveled at the magnitude of
what TIMSS accomplished. One described it as "a researcher's trea-
sure trove," and many noted that analyses using the data could easily
occupy the research community for many years. However, since the
achievement component of TIMSS is the base on which the study
rests, it is worth noting that several presenters expressed caveats about
it. Jan de Lange, noting that multiple-choice items have been out-
lawed in his country, The Netherlands, argued that the TIMSS items
are primarily useful for testing low-level knowledge and do not nec-
essarily represent anyone's idea of a desirable curriculum. In their
paper, Atkin and Black (1997) expressed a similar concern, noting,
for example, that a total of 11 multiple-choice and 3 free-response
items were used to test the middle school population's knowledge of
5. Population 2 students in six nations were surveyed in the Survey of Mathematics
and Science Opportunities. The topic trace mapping components of the curriculum
study covered 46 nations, and that study's survey of teachers covered Population 2
students in three nations. The videotape study and case studies each involved only
Population 2 students in Germany, Japan, and the United States. Finally, as noted,
the achievement results were reported for Population 1 students in 26 countries,
Population 2 students in 41 countries, and Population 3 students in 21 countries.
the portion of the test domain identified as "Environmental Issues and
the Nature of Science." First, they argued, from this "small number
of questions the results can hardly be a substantial basis for firm
conclusions." They also noted that these 11 questions cover two
distinct content areas, whose relationship to one another is not
explained in the framework (Atkin and Black, 1997:12-13).
Others made similar comments, but most, including de Lange as
well as Atkin and Black, acknowledged that it would likely not have
been possible to conduct the assessment at all without using methods
that are both efficient and well established. Nevertheless, participants
noted how easy it is for observers to lose sight of exactly what was
assessed as the results are disseminated and applied in various con-
texts.
The Curriculum Study
A number of participants raised questions about the curriculum
study, primarily focusing on the conclusions Schmidt drew from his
findings. For example, several questions focused on what TIMSS
suggests about the ways that control over education systems might
interact with achievement. In response to Schmidt's argument that
U.S. students' relatively low performance is the result of an incoher-
ent curriculum, Atkin and Black made reference to results indicating
that TIMSS does not reveal a clear correspondence between centrally
controlled, and, by implication, coherent, education systems and achieve-
ment. Schmidt responded by noting that even a very focused curricu-
lum may not be implemented in the classroom in a coherent manner.
Others raised questions about whether the available means of measuring
and comparing curricula were truly sophisticated enough to support
the detailed comparisons that have been made. Still others pursued
this point from a different angle, questioning whether the impact
of the structure of curricula and textbooks can be isolated as a factor,
separate from the ways they are translated into classroom instruction.
Schmidt argued that it can, though he noted that U.S. curricula and
textbooks may not be functioning as they are intended to. For ex-
ample, he explained, textbook publishers have made rational market-
ing decisions in choosing to reflect a variety of curricula in their
books. Their intention has been that teachers will use only the mate-
rial that is relevant to the curricula they are following. Schmidt's
point was that if the system is not working, only systemic changes
can effectively improve student performance. "The problem," he main-
tained, "is in the curriculum policy area, and the only way it can be
addressed is . . . as a nation."
The Qualitative Studies
Another issue presented by TIMSS is that both of the qualitative
studies took existing methods and "ratcheted them up," in the words
of one participant, to new levels of both scale and sophistication.
Before even addressing the links among them and the achievement
and curriculum data, observers have begun to assess these studies
themselves. Not surprisingly, because of its novelty, the videotape
study dominated the discussion.
Michael Huberman raised several important issues. He offered a
general critique of the study's theoretical underpinning (see below,
"Policy Issues"), but he also raised some specific questions about the
methods of the videotape study. First, he pointed out, although the
videotape certainly provides a far more detailed picture of the class-
room than questionnaire data could possibly have done, the picture is
still far from complete. Students and school culture, for example,
contribute a great deal to the nature of a classroom lesson and have
considerable influence on teachers' decisions, both large and small.
A videotaped lesson, Huberman argued, is not easy to interpret in the
absence of knowledge of its context. An understanding of what oc-
curred during the days preceding and following the lesson that was
videotaped might significantly alter an observer's interpretation of
the lesson.
A related issue for Huberman was that the videotapes provide a
very "teacher-centered" vision of the lesson. They cannot reveal how
students have perceived the lesson. Researchers coded teacher re-
sponses for "helpfulness" as part of their analysis, for example, al-
though they had no means of knowing whether students had per-
ceived that they had been helped by the interaction in question.
The coding was also an issue for Huberman for another reason.
What, he wondered, is the value of collecting data as rich as these
videotapes, and then immediately coding it and reducing it to statistics
that can be put into tables? Moreover, he asked, is there not a
danger in the "irresistible analytic convenience" of the software? Might
not the software's power in counting the frequency with which certain
behaviors occurred have "tricked" researchers into "unearthing
'themes' or 'patterns'" that were not actually there (Huberman, 1997:14)?
Huberman also raised questions about the sampling for the study.
Pointing out that the sampling was not random, Huberman noted in
particular that the three types of schools in Germany, the Hauptschule,
the Realschule, and the Gymnasium, which differ in significant ways,
were not represented proportionally. He also raised a question about
how the high refusal rate (almost 50 percent) among schools that
were asked to participate might have affected the outcome. Although
the study included a record number of classrooms, it nevertheless
runs the risk of seeming to be no more than an unusually rich collec-
tion of persuasive anecdotes.
Huberman also noted that the effect of the cameras on the teach-
ers and students who were filmed could not be known. Stigler had
addressed that issue in his presentation because it had been an impor-
tant concern for his team. Their conclusion was that while teachers'
and students' awareness of the camera may have affected their behavior
in a variety of ways, it is not likely that teachers could actually
change their teaching in fundamental ways likely to alter the study's
results. If they could, Stigler joked, the installation of cameras in
classrooms would be a simple means of improving teaching.
A final set of questions Huberman raised concerned the fact that
the study filmed a single 50-minute lesson in each of 231 classrooms.
Huberman wondered whether filming a series of lessons in a smaller
number of classrooms might have yielded more useful results. Many
of the coding categories, he noted, were efforts to capture "activities
or processes that play out over time," such as building on complex
concepts or establishing links with content covered previously, that
cannot easily be evaluated in the context of a single lesson (Huberman,
1997:12).
Although Huberman's primary contribution was to raise questions
about the study, he nevertheless described it as an extremely impres-
sive effort. Symposium participants did not have sufficient time to
wrestle with all of the questions, or to resolve any of them, but they
did refer to many of them in various contexts. After watching two
excerpts from the videotapes, participants also raised another concern.
As was discussed above, many at the symposium had enthusiastic
reactions to the videotapes and launched eagerly into discussions of
what the lessons shown demonstrated. But as Lois Peak pointed out,
the powerful reactions people had illustrate the risk that the video-
tapes could be misused: because they are so much richer and more
compelling than written descriptions, viewers may feel a sense of
certainty about impressions based on them that is unwarranted.
This richness is, of course, their virtue as well. Lynn Paine cited
as an example of this something she observed in the two lessons that
were shown. Both could be described as decidedly teacher directed,
but their ways of being so were dramatically different. In the U.S.
lesson, she pointed out, the teacher was evidently perceived as the
sole source of both information and ideas; students in the class did not
look at others who were speaking, or seem to engage as a team. In
contrast, the Japanese teacher had clearly planned the lesson around
the idea that different students would come up with different valid
means of solving problems. He showed that he intended the students
to learn from one another as well as from him, even though he re-
tained control of the discussion.
Part of Paine's point was that this sort of insight is valuable re-
gardless of how representative a particular lesson or behavior might
be. In a larger sense, this point applies to many aspects of TIMSS.
While forging links among the components will be extremely impor-
tant, the separate sets of data can be of significant value on their own
to both policy makers and others who are seeking to evaluate policies
and strategies, and to practitioners who are seeking insights or inspi-
rations. TIMSS is not a research project designed to test pre-existing
hypotheses, as Edward Haertel pointed out; its results cannot be used
to conclusively prove or disprove assertions. It provides no control