Methodological Advances in Cross-National Surveys of Educational Achievement

Part III: Making Inferences
8 The Measurement of Opportunity to Learn1

Robert E. Floden*

* Robert Floden is professor of teacher education, and of measurement and quantitative methods, in the College of Education at Michigan State University.

Sometimes it seems as though, in the United States at least, the attention to student opportunity to learn (OTL) is even greater than the attention to achievement results. For the Third International Mathematics and Science Study (TIMSS), the finding that the U.S. curriculum is “a mile wide and an inch deep” may be better remembered than whether U.S. students performed relatively better on fourth-grade mathematics or on eighth-grade science. To be sure, the interest in what students have a chance to learn is motivated by a presumed link to achievement, but it is nonetheless striking how prominent OTL has become. As McDonnell says, OTL is one of a small set of generative concepts that “has changed how researchers, educators, and policy makers think about the determinants of student learning” (McDonnell, 1995, p. 305). Over more than three decades of international comparative studies, OTL has come to occupy a greater part of data collection, analysis, and reporting, at least in studies of mathematics and science learning. The weight of evidence in those studies has shown a positive association between OTL and student achievement, adding to interest in ways to use OTL data to deepen understanding of the relationships between schooling and student learning. In the broader realm of education research, its use has been extended to frame questions about the learning opportunities for others in education systems, including teachers, administrators,
and policy makers. In education policy, the concept is used to frame questions about quality of schooling, equal treatment, and fairness of high-stakes accountability. It seems certain to play a continuing part in international studies, with a shift toward use as an analytic tool now that the general facts of connections to achievement and large between-country variations have been repeatedly documented. Given its importance, it is worth considering what hopes have been attached to OTL, how it has been measured, how it has actually been used, and what might be done to improve its measurement and productive use. This chapter will address these several areas by looking at the role of OTL in international comparative studies and at its use in selected U.S. studies of teaching, learning, and education policy. Most attention will fall on studies of mathematics and science learning because those are the content areas where the use of OTL has been most prominent, in part because it has seemed easier to conceptualize and measure OTL in those subject areas.

WHAT IS OTL?

The most quoted definition of OTL comes from Husen’s report of the First International Mathematics Study (FIMS): “whether or not . . . students have had the opportunity to study a particular topic or learn how to solve a particular type of problem presented by the test” (Husen, 1967a, pp. 162-163, cited in Burstein, 1993). (The formulation, with its mention of both “topic” and “problem presented by the test,” hints at some of the ambiguity found both in definition and in measurement of OTL.) Husen notes that OTL is one of the factors that may influence test performance, asserting that “If they have not had such an opportunity, they might in some cases transfer learning from related topics to produce a solution, but certainly their chance of responding correctly to the test item would be reduced” (Husen, 1967a, pp.
162-163, cited in Burstein, 1993). The conviction that opportunity to learn is an important determinant of learning was incorporated in Carroll’s (1963) seminal model of school learning, which also extended the idea of opportunity from a simple “whether or not” dichotomy to a continuum, expressed as amount of time allowed for learning. By treating other key factors, including aptitude and ability as well as opportunity to learn, as variables expressed in the metric of time, Carroll’s model created a new platform for the study of learning. One important consequence was that the question became no longer “What can this student learn?” but “How long will it take this student to learn?” Questions of instructional improvement have, as a result, been reshaped to give greater prominence to how much time each student is given to work on topics to be mastered. In the United States, this new
view of aptitude contributed to the shift from identifying which students could learn advanced content to working from the premise that all students could, given sufficient time, learn such content. That shift supports the interest in opportunity to learn as a potentially modifiable characteristic of schooling that could significantly affect student learning. Carroll posits that the degree of student learning is a function of five factors:

Aptitude—the amount of time an individual needs to learn a given task under optimal instructional conditions.

Ability—[a multiplicative factor representing the student’s ability] to understand instruction.

Perseverance—the amount of time the individual is willing to engage actively in learning.

Opportunity to learn—the time allowed for learning.

Quality of instruction—the degree to which instruction is presented so as not to require additional time for mastery beyond that required by the aptitude of the learner. (Model as presented in Borg, 1980, pp. 34-35)

Of these factors, the first three are characteristics of the student; the last two are external to the child, under direct control of the teacher, but potentially influenced by other aspects of the education system. The model specifies the functional form of the relationship, starting with the general formulation that the degree of learning is a function of the ratio between time spent learning and time needed to learn:

degree of learning = f(time spent learning / time needed to learn)

The model elaborates on this function, expressing “time spent learning” as the product of “opportunity to learn” and “perseverance,” and “time needed to learn” as a product of “aptitude,” “quality of instruction,” and “ability to understand instruction” (see Figure 8-1). (The last three variables are scaled counterintuitively, so that a low numerical value is associated with what one would typically think of as “high” aptitude or “high” quality of instruction.)
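Carroll's ratio formulation can be sketched in a few lines of Python. The variable names, the use of hours as the unit, and the cap at 1.0 (a student cannot learn more than the whole task) are illustrative assumptions for this sketch, not part of Carroll's original notation:

```python
# A minimal sketch of Carroll's (1963) model as summarized in the text.
# Names, units (hours), and the cap at 1.0 are illustrative assumptions.

def time_needed(aptitude_hours: float,
                instruction_factor: float,
                ability_factor: float) -> float:
    """Time needed to learn, inflated beyond the aptitude baseline.

    aptitude_hours: time the student needs under optimal instruction.
    instruction_factor, ability_factor: multipliers >= 1.0; higher values
    stand for poorer instruction or more difficulty understanding it
    (the counterintuitive scaling noted in the text).
    """
    return aptitude_hours * instruction_factor * ability_factor

def degree_of_learning(opportunity_hours: float,
                       perseverance: float,
                       needed_hours: float) -> float:
    """Ratio of time spent learning to time needed, capped at 1.0.

    Time spent = opportunity to learn (time allowed) x perseverance
    (the fraction of allowed time the student actively engages).
    """
    time_spent = opportunity_hours * perseverance
    return min(1.0, time_spent / needed_hours)

# Example: 10 hours needed under optimal conditions, slightly degraded
# instruction and comprehension; 15 hours allowed, 80 percent engagement.
needed = time_needed(10.0, 1.2, 1.1)          # 13.2 hours needed
learned = degree_of_learning(15.0, 0.8, needed)  # 12 hours spent / 13.2
```

The sketch makes visible the model's key consequence noted in the text: holding aptitude fixed, learning can be raised either by allowing more time (opportunity) or by improving instruction so less time is needed.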
Carroll’s model, with its general emphasis on the importance of instructional time, was elaborated by Bloom (1976) and Wiley and Harnischfeger (1974). The concept of OTL has been differentiated so that it involves more than a simple time metric. One set of distinctions has separated the intentions for what students will study from the degree to which students actually encounter the content to be mastered. Imagine a
progression that starts at a distance from the student—say with a national policy maker—and goes through successive steps, nearer and nearer to the student, ending with content to which the student actually attends. At each step in this chain, a form of OTL exists if the content is present to some degree and does not exist if the content is absent. Studies could attempt to measure the degree of any of these types of OTL:

To what extent is the topic emphasized in the national curriculum? In the state curriculum? In the district curriculum? In the school curriculum?

How much time does the teacher plan to spend teaching the content to this class?

How much time does the teacher actually spend teaching the topic?

How much of that time is the student present?

To what degree does the student engage in the corresponding instructional activities? (In Carroll’s model, the latter may be part of “perseverance.”)

FIGURE 8-1 The Carroll model. SOURCE: Berliner (1990). Reprinted with permission of Teachers College Press.

For each level, common sense and, in some cases, empirical evidence suggest that OTL will be related to whether or how well students learn the content. International comparative studies, and the International Association for the Evaluation of Educational Achievement (IEA) studies in particular, have divided this chain of opportunities into two segments, or “faces” of the curriculum: the intended curriculum and the implemented curriculum. (A third “face” of the curriculum, the “attained curriculum,” is what students learned. That represents learning itself, rather than an opportunity to learn.) For each link in this chain, a study could measure opportunity to learn as a simple presence or absence or as having some degree of emphasis, usually measured by an amount of time intended to be, or
actually, devoted to the topic. At the level of national goals, for example, one could record whether or not a topic was included, or could record some measure of the relative emphasis given to a topic by noting how many other topics are mentioned at the same level of generality, by examining how many items on a national assessment are devoted to the topic, or by constructing some other measure of relative importance. For the implemented curriculum, emphasis on a topic could be measured by the amount of time spent on the topic (probably the most common measure), by counting the number of textbook pages read on the topic, by asking the teacher about emphasis given to the topic, and so on. Early studies that used time metrics for opportunity to learn2 looked at several ways of deciding what time to count. These studies looked for formulations that would be highly predictive of student achievement and that could be used to make recommendations for changes in teaching policy and practice. Wiley and Harnischfeger (1974) began by looking at rough measures of the amount of time allocated in the school day, finding a strong relationship between the number of hours scheduled in a school year and student achievement. The Beginning Teacher Evaluation Study (BTES) (Berliner, Fisher, Filby, & Marliave, 1978) also found that allocated time, in this case the time allocated by individual teachers, was related to student achievement. To obtain an even stronger connection to student achievement, the investigators refined the conception of opportunity to learn, adding information about student engagement in instructional tasks and about the content and difficulty of the instructional tasks to their measurement instruments. Following the work of Bloom (1976), Berliner and his colleagues argued that student achievement would be more accurately predicted by shifting from allocated time to “engaged” time.
That is, students are more likely to learn if they not only have time that is supposed to be devoted to learning content, but also are paying attention during that time, if they are “engaged.” Pushing the conception even further, they argued that the student should not only be engaged, but should be engaged in some task that is relevant to the content to be learned. That is, the opportunity that counts is one in which the student is paying attention, and paying attention to material related to the intended learning. Finally, the research group studied what level of difficulty was most related to student learning, asking whether it was more productive for students to work on tasks where the chance of successfully completing the task was high, moderate, or low. They found that student achievement was most highly associated with a high success rate.3 Therefore, the version of opportunity to learn that they found to be tied most closely to student learning is what they dubbed “Academic Learning Time” (ALT), defined as “the amount of time a student spends engaged in an academic task [related to the intended learning] that s/he can perform with high success” (Fisher et al., 1980, p. 8).

Measuring this succession of conceptions of OTL—allocated time, engaged time, time on task, Academic Learning Time—requires increasing amounts of data collection. Allocated time can be measured by asking teachers to report their intentions, through interview, questionnaire, or log. Measuring engaged time requires an estimate of the proportion of allocated time that students were actually paying attention.4 Measuring time on task requires a judgment about the topical relevance of what is capturing the students’ attention. Measuring Academic Learning Time requires an estimate of the degree to which students are completing the tasks successfully.

In their study of the influence of schooling on learning to read, Barr and Dreeben (1983) supplemented data on amount of time spent with data on the number of vocabulary words and phonics concepts students studied. Building on their own analysis of the literature growing out of Carroll’s model, they investigated how the social organization of schools, especially the placement of students into reading groups, worked to influence learning, both directly and through the mediating factor of content coverage by instructional groups. They argue that Carroll’s model is a model for individual learning, rather than school learning, in the sense that it describes learning as a function of factors as they influence the student, without reference to how those factors are produced within the social settings of the school or classroom.

For large-scale international comparative studies, these conceptions of OTL suggest a continuum of tradeoffs in study design. Each conception of OTL has shown some connection to student achievement.
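The successive narrowings just described—allocated time, engaged time, time on task, ALT—can be sketched as a chain of proportions. The numbers below are invented for illustration; they are not BTES estimates:

```python
# Hypothetical chain from allocated time down to Academic Learning Time
# (ALT). Each rate is the fraction of time surviving the next narrowing;
# all values are invented for illustration, not taken from BTES.

allocated_minutes = 50.0    # time the teacher allocated to the topic
engagement_rate = 0.75      # fraction of allocated time students attend
relevance_rate = 0.90       # fraction of engaged time on relevant tasks
high_success_rate = 0.60    # fraction of on-task time at high success

engaged_minutes = allocated_minutes * engagement_rate   # engaged time
time_on_task = engaged_minutes * relevance_rate         # time on task
alt_minutes = time_on_task * high_success_rate          # ALT
```

The chain makes the measurement burden concrete: each successive quantity needs one more observed rate, which is why allocated time can come from a questionnaire while ALT requires classroom observation.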
The progression of conceptions from system intention to individual time spent on a topic moves successively closer to the experiences that seem most likely to influence student learning. But the problems of cost and feasibility also increase with the progression. The model of learning suggests that the link will be stronger if OTL is measured closer to the student; policy makers, however, are more likely to have control of the opportunities more distant from students. Questionnaires have been used in international studies to gather information on allocated time, tied to specific content. Such questionnaires give information on time allocation and the nature of the task, but shed little light on student engagement in the tasks, or on students’ success rate. Ball and her colleagues’ pilot study of teacher logs (Ball, Camburn, Correnti, Phelps, & Wallace, 1999) raises questions about teachers’ ability to estimate the degree of student engagement. (Nearly a century ago, Dewey [1904/1965] claimed that it is difficult for anyone to determine when a student is paying attention. The difficulty probably increases with
the age of students.) Teachers might be able to report on the students’ success rate at a task, but that success rate probably varies across tasks, suggesting that measurement at a single time (e.g., with a questionnaire) would be unlikely to capture difficulty for the school year as a whole. Given the ambiguity about whether engagement is a part of OTL or of a student’s perseverance, it is probably best for large-scale international studies to leave student engagement out of the measurement of OTL. It should be clear that OTL is a concept that can have a variety of specific interpretations, each consistent with the general conception of students having had the opportunity to study or learn the topic or type of problem. Past international studies have chosen to include measurements of more than one of these conceptions, which may be associated with one another, yet remain conceptually distinct.

WHY IS OTL IMPORTANT?

For international comparative work, OTL is significant in two ways: as an explanation of differences in achievement and as a cross-national variable of interest in its own right. In the first case, scholars and policy makers wish to take OTL into account or “adjust” for it in interpreting differences in achievement, within or across countries. If a country’s low performance on a subarea of geometry, for example, is associated with little opportunity for students to learn the content of that subarea, there is no need to hunt for an explanation of the low score in teaching technique or poorly designed curriculum materials: The students did not know the content because they had never been taught it. In the second case, scholars and policy makers take an interest in which topics are included in a country’s curriculum (as implemented at a particular grade level, with a particular population) and which are excluded or given minimal attention.
Policy makers in a country might, for example, be interested to see that some countries have included algebra content for all students in middle school, contrary to a belief in their own country that such content is appropriate for only a select group, or only for older students. The language of the FIMS reports suggests that the reason for asking whether students had an opportunity to learn content was to determine whether the tests used would be “appropriate” for the students: “Teachers assisting in the IEA investigation were asked to indicate to what extent the test items were appropriate for their students. This information is based on the perception of the teacher as to the appropriateness of the items” (Husen, 1967b, p. 163). The implication suggested is that if a student had not had the opportunity to learn material, testing the student on the material would be inappropriate, in the sense that the student could not be expected to answer the questions. At the level of a country, taking
results for content that students had not had the opportunity to learn at face value also would be inappropriate. Information about OTL has value because it gives a way of deciding whether it is appropriate to look at national achievement results for particular content. In the same spirit, reports on the Second International Mathematics Study (SIMS) warn against comparing the performance of two countries unless both countries had given students the opportunity to learn the content.

It is interesting to note that students in Belgium (Flemish), France, and Luxembourg were among those who had their poorest performance on the geometry subtest. In those three systems, the study of geometry constitutes a significant portion of the mathematics curriculum, and these results are an indication of the lack-of-fit between the geometry curriculum in those systems and geometry as defined by the set of items used in this study. These findings underscore the importance of interpreting these achievement results cautiously. They are a valid basis for drawing comparisons only insofar as the items which defined the subtests are equally appropriate to the curricula of the countries being compared. (Robitaille & Garden, 1989, p. 123)

Although some scholars deny that comparative studies should be taken as some sort of “cognitive Olympics,” many news reports treat them as such.5 Information on OTL provides a basis for deciding whether a country’s poor performance should be attributed to a decision not to compete. Adjustments for OTL are also of interest for those scholars who see comparative research as a search for insights into processes of teaching and learning, rather than a way to determine winners and losers.
Because of the variation in national education systems, information on the achievement in other countries can be a source of ideas for how teaching processes, school organization, and other aspects of the education system affect student achievement. Comparative research can help countries learn from the experiences of others. To the extent that research is able to untangle the various influences on student achievement, it can help in developing models of teaching and learning that can be drawn on in various national contexts, avoiding some of the pitfalls that come from simply trying to copy the education practices of countries with high student achievement.

The issue . . . is not borrowing versus understanding. Borrowing is likely to take place. The question is whether it will take place with or without understanding. . . . Understanding . . . is a prerequisite to borrowing with satisfactory results. (Schwille & Burstein, 1987, p. 607)
OTL can be an important determinant of student achievement. If OTL is not taken into account, its effects may be mistakenly attributed to some other attribute of the education system. A general rule in developing and testing models of schooling is that misspecification of the model, such as omitting an important variable like OTL, can lead to mistaken estimates of the effects of other factors. In addition to its uses for understanding achievement results and their links to education systems, OTL is of interest in its own right. One of the insights from early comparative studies was a picture of the commonalities and differences in what students in varying countries had the opportunity to learn. As one of the SIMS reports puts it: “A major finding of this volume is that while there is a common body of mathematics that comprises a significant part of the school curriculum for the two SIMS target populations . . . , there is substantial variation from system to system in the mathematics content of the curriculum” (Travers & Westbury, 1989, p. 203). An understanding of similarities and differences across countries gives each nation a context for considering the learning opportunities it offers. A look at the within-country variation in OTL also provides a basis for considering current practice and possible alternatives. What variation in OTL occurs across geographic regions in a country? Across social classes? Between boys and girls? The variation found in other countries is a basis for reflecting on the variation in one’s own country.

HOW HAS OTL BEEN MEASURED IN INTERNATIONAL COMPARISONS?

I will focus on FIMS, SIMS, and TIMSS, the three international comparative studies in which OTL has played the most significant role. The First International Mathematics Study (Husen, 1967a, 1967b) included the definition of OTL quoted earlier, as “whether or not . . .
students have had the opportunity to study a particular topic or learn how to solve a particular type of problem presented by the test.” The FIMS report describes the questions asked about OTL as “based on the perception of the teacher as to the appropriateness of the items” (Husen, 1967a, p. 163). The choice of the word “appropriate” suggests that the intent in measuring OTL was to prevent interpreting low scores due to lack of OTL as indicative of some deficiency in teachers or students. If the item was not taught to students, then it would be “inappropriate” for those students. The actual question put to teachers to measure OTL was as follows: To have information available concerning the appropriateness of each item for your students, you are now asked to rate the questions as to
ing, but the importance of out-of-school learning will vary by content area and by country.

In most industrialized countries the path to early literacy begins in the home and continues through formal instruction in school. There is an interdependence between the two sources, since formal instruction gains its full effectiveness on the foundation established and maintained by parents and family members. In many developing countries, however, high rates of parental illiteracy make it impossible for parents to enter directly into the process of helping their children learn how to read. In these societies, instruction in reading depends primarily on what the child encounters in school. (Stevenson, Lee, & Schweingruber, 1999, p. 251)

Such differences in family-based opportunities to learn may account for some of the well-documented associations between family background (including social class, income, levels of mother’s and father’s education) and achievement. Such associations have been found within countries, in the United States most famously in the Coleman report (Coleman et al., 1966). Effects of family background on achievement also have been found in cross-national studies such as SIMS. “Just like its IEA predecessors, the SIMS results of the analyses of the effects of background characteristics on achievement (status) at either pretest or posttest occasions showed the strong relationships of such variables as the pupil’s mother’s education, father’s education, mother’s occupation, and father’s occupation with cognitive outcomes,” note Kifer and Burstein (1992, p. 329). “The immediate evidence and external evidence agree in attributing more variation in student achievement to the family background than to school factors. The reason is not far to seek. It is that parents vary much more than schools,” adds Peaker (1975, p.
22).12 Such family background variables are like OTL in that they may provide an explanation for achievement differences that otherwise might be attributed to differences in the education system. To understand the connections between schooling and achievement, some of the effects of family background can be statistically “controlled,” either by including measures of the background variables in statistical models used to estimate links between schooling and achievement or by including measures of prior achievement, which would themselves be highly associated with family background. SIMS investigators concluded that including prior achievement was a necessary approach for controlling both the effects of experiences prior to the school year and the effects of differences in curriculum based on those prior experiences (Schmidt & Burstein, 1992). They also found that including prior achievement in the analysis—looking at learning across the
year rather than merely at achievement—greatly reduced the influence of such background variables. “The background characteristics of students are not strongly related to growth because the pretest removes an unknown but large portion of the relationship between those characteristics and the posttest,” note Kifer and Burstein (1992, p. 340, emphasis added). The strategy of focusing on achievement gains as a way of controlling for differences in background factors seems implicit in the arguments for the importance of curriculum in the TIMSS publication, Facing the Consequences (Schmidt et al., 1999). Using items that appear on tests at more than one level, and taking advantage of the fact that tested populations include students at more than one grade level, Schmidt and his colleagues are able to estimate gains in achievement across grades. While acknowledging that gains across grades could be due in part to “life experience” (i.e., opportunities to learn outside school), they argue that the associations of the gains with curriculum content at the corresponding grade levels indicate that curriculum, that is, OTL within schools, is an important factor in learning the content of these items.

The general purport of all specific items discussed above is that, while developmental and life experience factors may be involved in accounting for achievement changes, curricular factors undoubtedly are. . . . The main evidentiary value of examining these link items is that their differences rule out explanations based on factors such as maturation, life experience, or some general measure of mathematics or science achievement. (Schmidt et al., 1999, pp. 158-159)

In summary, connections between OTL and student achievement typically are conceived as within-school phenomena.
Students do, however, sometimes have other opportunities to learn outside the classroom, opportunities that are often linked to differences in family background. The significance of these outside opportunities will vary by content area: Families differ substantially in the opportunities young children have to acquire basic literacy; families likely will vary much less in the opportunities they provide for learning how to compute the perimeter of a rectangle (because such knowledge is less likely to be part of the ordinary lives of any families). Thus, evidence about the connection between OTL and student achievement may be easier to interpret for topics that are more “academic,” that is, more distinctively school knowledge. Although differences in how “academic” that content is will vary by school subject (e.g., chemistry is more academic than reading), looking at individual items can make it easier to identify content that is unlikely to be learned outside school. Looking at measures of gain, rather than status, is another way to simplify interpretation of the effects of school OTL.
WHAT HAS RESEARCH SHOWN ABOUT THE STRENGTH OF OTL EFFECTS?

Empirical support for the influence of time spent engaged in learning on student achievement was provided by the Beginning Teacher Evaluation Study (BTES), which used a combination of teacher logs and classroom observations to record how much time a sample of elementary school teachers “allocated to reading and mathematics curriculum content categories (e.g., decoding consonant blends, inferential comprehension, addition and subtraction with no regrouping, mathematics speed tests, etc.)” (Fisher et al., 1980). (Classroom observations also were used to estimate the fraction of allocated time that students actually engaged in the learning opportunities.) BTES found substantial differences in the time teachers allocated (i.e., in OTL) and found statistically significant associations between time allocated and student achievement. These are samples of the early empirical evidence that if students spend more time working on a topic, they will learn more about the topic. Or, conversely, and perhaps more important in the context of international comparative studies, if students spend little or no time working on a topic, they will learn little about it. As the quote from Husen (p. 232) suggests, the exceptions come either when the student is able to transfer learning from another topic or when the student spends time outside of school on the topic (e.g., learning from parents or independent reading, even though the topic is not studied as part of formal education). The BTES research found positive associations between student achievement and each of these measures of OTL. For the full set of Academic Learning Time variables, the effects on student achievement were statistically significant for some, but not all, of the specific content areas tested in grades two and five reading and mathematics.
Overall, about a third of the statistical tests were significant at the 0.10 level. The magnitudes of the effects are indicated by the residual variance explained by the ALT variables, after the effect of prior achievement has been taken into account. Those residual effects ranged in magnitude from 0.01 to 0.30, with an average on the order of 0.10 (Borg, 1980, p. 67).

Barr and Dreeben (1983) found a high correlation (0.93) between the number of basal vocabulary words covered and a test of vocabulary knowledge, accounting for 86 percent of the variance on that test. The correlation was also high with a broader test of reading at the end of first grade (0.75) and even with a reading test given a year later (0.71). For phonics, the correlations with content coverage were somewhat lower: 0.62 with a test of phonics knowledge, 0.57 with first-grade achievement, and 0.51 with second-grade achievement. Barr and Dreeben’s attention to the social organization of schooling led them to examine whether the correlation between coverage and
achievement comes because students with higher aptitude cover more content. They found that groups with higher mean aptitude do cover more content, but that the correlation between individual student aptitude and achievement was close to zero. Thus the investigators conclude that student aptitude affects how much content students cover by affecting assignment to group, but that, given assignment to group, it is content coverage, not aptitude, that is the major determinant of learning, especially for vocabulary.

In FIMS, the within-country relationship between OTL and achievement was positive, but varied. As Husen (1967a, pp. 167-168) wrote, “There was a small but statistically significant positive correlation between the scores and the teachers’ ratings of opportunity to learn the topics. There was, however, much variation between countries and between populations within countries in the size of these coefficients.” The small magnitude of the relationship in some countries may have been due to limited variation in OTL within those countries, that is, to the uniformity of the curriculum in those countries. The between-country association between OTL and student achievement, however, was substantial, with correlations of 0.4 to 0.8 for the different populations. “In other words, students have scored higher marks in countries where the tests have been considered by the teachers to be more appropriate to the experience of their students” (p. 168).

For SIMS, the Westbury-Baker exchange mentioned earlier shows that the relationship between OTL and achievement was, at least in Westbury’s initial analysis, strong enough to explain all of the Japan-U.S. differences in achievement. Thus OTL can be a powerful explanatory variable when looking at particular, fairly narrow comparisons.
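Squaring a correlation gives the proportion of variance explained, which puts the coefficients reported in this section on a common footing. A quick arithmetic check, using figures quoted in the text above:

```python
# r^2 (the coefficient of determination) is the share of variance explained.
correlations = {
    "basal words vs. vocabulary test (Barr & Dreeben)": 0.93,
    "basal words vs. end-of-first-grade reading": 0.75,
    "FIMS between-country association, low end": 0.4,
    "FIMS between-country association, high end": 0.8,
}
for label, r in correlations.items():
    print(f"{label}: r = {r:.2f} -> variance explained = {r * r:.0%}")
```

The 0.93 vocabulary correlation thus corresponds to the 86 percent figure cited above, while even the high end of the FIMS between-country range (0.8) leaves roughly a third of the variance unexplained.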
The more general analysis of OTL data for SIMS, however, did not yield results striking enough to be featured in the study’s general conclusions. An overall OTL variable was included in a broader search for patterns in the data (Schmidt & Kifer, 1989). The within-country analysis used a hierarchical model with predictor variables that included student gender, language of the home, family help, hours of mathematics homework, proportion of the class in the top one-third nationally, class size, school size, and teacher’s age. The model was estimated for each country by achievement topic area (i.e., arithmetic, algebra, geometry, measurement, statistics, total). The importance of each variable was summarized in a table reporting the number of statistically significant betas for each topic area across the 20 countries. What is striking about the table (Schmidt & Kifer, 1989, p. 217) is that none of the between-class or between-school variables has more than 4 (out of a possible 20) significant betas for any topic area. The OTL variable has one significant beta across countries for each of the five subtests and one for the total test.
Among the between-class or between-school variables, that puts OTL below class size, proportion of the class in the top third nationally, and class hours of mathematics per week, and about on a par with teacher age. This is, at best, a modest effect within countries. That is consistent with the FIMS results, where within-country associations between OTL and achievement were positive, but weak to modest. Perhaps these modest associations were due, as in the case of FIMS, to relatively small within-country variation in OTL in most countries. (The United States is unusual in the extent of its within-country OTL variation.) For FIMS, between-country associations were substantial. Beyond the Westbury-Baker exchange, I did not locate any reports of the between-country OTL-achievement associations from SIMS. As noted, the TIMSS reports document the cross-country variation in the content of curricular materials, but I did not locate reports on the variation in teachers’ reported OTL.13 Given the important role that national curriculum differences have played in policy discussions, the TIMSS curriculum analysis deserves further attention by scholars within and outside the community that has carried out the TIMSS studies.

CONCLUSION

The sequence of international comparative studies of mathematics and science has given increasing attention to variation in OTL, both in data collection and in reporting. OTL information has obvious importance for the interpretation of achievement differences within and across countries. OTL’s link to curricular intentions has become a major spur for discussions of individual countries’ curricula, at least in the United States. The positive association between OTL and student learning has been documented in a number of studies, although the measured association between OTL and achievement or achievement gains has been quite varied.
Some analyses show a weak connection; Westbury’s analysis produced a dramatic effect of taking OTL into account. Some of the variation in strength of association can be attributed to the amount of variation in OTL; some can be attributed to reliance on teachers’ memories of content coverage; some may be due to learning outside school or to transfer of learning from one topic to another. The number of reports making extensive use of the currently collected OTL data seems small. The complexities of analysis are likely daunting. The dramatic results in Westbury and Baker should make further analyses attractive. Their focus on a single pair of countries undoubtedly contributed to their ability to find clear (though somewhat contradictory) results.
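The first of these explanations, limited variation in OTL, is the classic restriction-of-range effect: the same underlying relationship yields a smaller observed correlation when the predictor barely varies across classrooms. A small simulation sketch (all numbers hypothetical) makes the point:

```python
import random

random.seed(2)

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def observed_r(otl_spread, n=1000):
    # Identical underlying effect in both conditions (one achievement point
    # per period of OTL, noise sd 5); only the spread of OTL differs.
    otl = [random.gauss(50, otl_spread) for _ in range(n)]
    achievement = [o + random.gauss(0, 5) for o in otl]
    return pearson_r(otl, achievement)

print(f"varied OTL across classrooms (sd 10): r = {observed_r(10):.2f}")
print(f"near-uniform OTL (sd 1):              r = {observed_r(1):.2f}")
```

A country with a highly uniform curriculum can therefore show a near-zero within-country OTL coefficient even when OTL matters a great deal, which is one reason the between-country associations were so much larger.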
What does this review suggest for future comparative studies? The importance of continuing to collect some OTL information is evident. Now, as before, interpretation of achievement or learning comparisons requires some understanding of what learning was intended and how much these intentions were realized, by nations and by teachers. Looking at specific topics, rather than general content areas, is more likely to yield measurable differences in OTL and stronger relationships to achievement. But finer grained work also places greater burdens on respondents and analysts. Choices about what sort of OTL data to collect, and how, must be considered in light of the questions to be answered.

Surely no further evidence is needed to support the conclusion that OTL has some positive effect on achievement. Attention now should turn to understanding the processes of teaching and learning, using information on OTL to help construct analytic models that can do a better job of identifying the separate, though perhaps interacting, effects of student aptitude, perseverance, quality of instruction, and opportunity to learn. The work done to date with OTL suggests several principles that should be used in continuing to refine measures:

The questions asked should allow for analyses that separate OTL from related variables, such as teachers’ judgments about whether students have learned (as opposed to having had the opportunity to learn) and student perseverance in working on a topic. Questions should also allow for separation of information about classroom processes from information about content coverage. The Carroll model separates OTL from quality of instruction; keeping these distinct is important for building more accurate models of the influences on student learning.
Moreover, methodological studies suggest that teachers are less reliable in giving retrospective information about time spent using different methods of instruction than they are in giving information about time spent on specific topics.

Information about opportunities to learn in grades other than those tested is important for understanding a system’s curriculum and for understanding connections between school practices and student learning. Information can be collected from teachers in the tested grades, but their reports may contain inaccuracies because of lack of communication within the system or because of changes in the curriculum over time. Information about student achievement in the prior year can give a better indication of prior learning. In past studies, some countries have seen such longitudinal designs as difficult to implement, but the added information they provide strongly suggests that they should be considered.
Information also should be gathered about opportunities to learn topics outside the classrooms in the tested subject. The interpretation of achievement results will be influenced by whether students had opportunities to learn content outside school or in other subject areas (e.g., learning how to write reports in social studies classes or learning about measurement in science classes).

Development of measures of OTL deserves the same care given to development of cognitive achievement measures. The reliability of OTL measures can be used as the basis for successive refinements, just as it is in the selection and revision of test items. Ambiguous wordings of content descriptions should be detected and eliminated. In mathematics, the wording and organization of topic categories have benefited from decades of research; topic categories in other subject areas would benefit from similar cycles of testing and revision. Studies of the associations between student learning and various OTL variables can be used to determine what combinations and functional forms have the greatest predictive validity. Further work could be done to investigate the construct validity of OTL measures.

In short, researchers and policy makers have come to agree that OTL should be an important part of international comparative studies of achievement, in large part because OTL has been shown to have a link to student learning. In future studies, attention should shift from producing further evidence to support the existence of that link, toward using measures of OTL to build better models of the sources of student learning. Such models should give insight into the reasons students receive differing opportunities to learn and the reasons that student learning varies, given the same opportunities to learn.
To make that shift, investments should be made, as they have begun to be in mathematics, in improving OTL measures so that they can better support fruitful research. Carroll’s model of school learning remains a helpful general framework for such research, but it should be supplemented to include the factors—system policies, classroom organization, teacher knowledge—that influence the opportunity to learn and the quality of instruction students experience.

NOTES

1. This paper has benefited substantially from comments on an earlier draft by Andy Porter, by other members of BICSE, and by Jack Schwille.

2. The word “early” is not quite accurate. Studies of time allocations in schools go back at least to the early 1900s. Although these studies did not use the language of “opportunity to learn,” they were motivated by the thought that allocations of time were a determinant of student achievement. For a sketch of this distant history, see Borg (1980) or Berliner (1990).
3. The idea that student achievement will be highest when academic tasks are relatively easy for students (i.e., that students have a high success rate with them) seems sensible in some respects, but may depend on the subject matter involved and the focus of the achievement test. Berliner and his colleagues were looking at effects on the learning of basic skills. For higher level content, easy academic tasks might be less productive. Recent discussions of standards often use the phrase “challenging content,” which seems inconsistent with the assignment of easy tasks. A basic principle in the Japanese mathematics lessons captured in the TIMSS videotapes seems to be that lessons should be structured around difficult problems.

4. “Engaged time” also may be seen as a part of Carroll’s “perseverance,” rather than of OTL. How this should be treated in a research study depends on the purpose of the analysis. Engagement might be thought of as a function of individual student characteristics, or as another aspect of instruction that the educational system, through the teacher, might affect.

5. Schwille and Burstein (1987, p. 606) describe tensions within the research community itself between seeing international studies as between-country comparisons and seeing them as a set of within-country studies. They note, however, that IEA researchers often have denied interest in the “cognitive Olympics,” given the difficulties of analysis that must take into account so many differences among countries, differences that are arguably important as control or explanatory variables.

6. In the first science study, the OTL questions were asked collectively of all teachers at the tested grade level in a school. That obscured any differences in OTL among teachers within a school. The problem was corrected in the mathematics studies.

7. Porter and his colleagues have combined an instrument for measuring instructional practices with one measuring instructional content in mathematics and science into a package they call the Surveys of Enacted Curriculum (SEC) (Porter & Smithson, 2001). This package is being used in several large-scale studies (Blank, Kim, & Smithson, 2000; Council of Chief State School Officers, 2000; Ware, Richardson, & Kim, 2000). The performance of the package in these studies should be useful in deciding how it might be adapted and refined for future research.

8. This empirical result appears to be at odds with interpretations of TIMSS results suggesting that Japanese teachers’ treatment of a small number of mathematics and science topics in great depth leads to high overall levels of achievement.

9. Porter and Smithson (2001) offer a useful discussion of the role of studies of alignment between standards and assessments in studying the effects of policies on student learning.

10. For the “how often” question, the conversion was: never = 0 days; a few times a year = 5 days; once or twice a month = 14 days; once or twice a week = 55 days; nearly every day = 129 days; daily = 184 days. For the “for how many minutes” question, the conversion was: none = 0 minutes; a few minutes in the period = 5 minutes; less than half = 15 minutes; about half = 25 minutes; more than half = 37.5 minutes; almost all of the period = 50 minutes.

11. TIMSS staff told me that the OTL information will be an important part of more detailed analyses presented in reports in preparation at the time this chapter was going to press.

12. Jack Schwille has pointed out to me that Peaker’s attribution of more variance to family background than to schools is based on analyses that enter all family background variables first. When, as is often the case, family background variables are correlated with school variables, it is difficult to determine how the shared variance should be apportioned.
In any case, the effects of differences in family background on achievement consistently have been shown to be substantial.
13. These may appear in TIMSS reports that were in preparation at the time this paper went to press.

REFERENCES

Baker, D. P. (1993a). Compared to Japan, the U.S. is a low achiever . . . really: New evidence and comment on Westbury. Educational Researcher, 22(3), 18-20.

Baker, D. P. (1993b). A rejoinder. Educational Researcher, 22(3), 25-26.

Ball, D. L., Camburn, E., Correnti, R., Phelps, G., & Wallace, R. (1999). New tools for research on instruction and instructional policy: A Web-based teacher log (Document W-99-2). Seattle: University of Washington, Center for the Study of Teaching and Policy.

Barr, R., & Dreeben, R. (1983). How schools work. Chicago: University of Chicago Press.

Berliner, D. C. (1990). What’s all the fuss about instructional time? In M. Ben-Peretz & R. Bromme (Eds.), The nature of time in school (pp. 3-35). New York: Teachers College Press.

Berliner, D. C., Fisher, C. W., Filby, N., & Marliave, R. (1978). Executive summary of Beginning Teacher Evaluation Study. San Francisco: Far West Regional Laboratory for Educational Research and Development.

Blank, R. K., Kim, J. J., & Smithson, J. L. (2000). Survey results of urban school classroom practices in mathematics and science: 1999 report. Norwood, MA: Systemic Research.

Bloom, B. S. (1976). Human characteristics and school learning. New York: McGraw-Hill.

Borg, W. R. (1980). Time and school learning. In C. Denham & A. Lieberman (Eds.), Time to learn (pp. 33-72). Washington, DC: U.S. Department of Health, Education, and Welfare, National Institute of Education.

Burstein, L. (1993). Prologue: Studying learning, growth, and instruction cross-nationally: Lessons learned about why and why not engage in cross-national studies. In L. Burstein (Ed.), The IEA Study of Mathematics III: Student growth and classroom processes. New York: Pergamon Press.

Burstein, L., McDonnell, L. M., Van Winkle, J., Ormseth, T., Mirocha, J., & Guitton, G. (1995). Validating national curriculum indicators. Santa Monica, CA: RAND.

Carroll, J. (1963). A model for school learning. Teachers College Record, 64, 723-733.

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., & York, R. L. (1966). Equality of educational opportunity. Washington, DC: U.S. Department of Health, Education, and Welfare.

Council of Chief State School Officers. (2000). Using data on enacted curriculum in mathematics and science: Sample results from a study of classroom practices and subject content. Washington, DC: Author.

Dewey, J. (1904/1965). The relation of theory to practice in education. In M. L. Borrowman (Ed.), Teacher education in America: A documentary history (pp. 140-171). New York: Teachers College Press.

Fisher, C. W., Berliner, D. C., Filby, N. N., Marliave, R., Cahn, L. S., & Dishaw, M. M. (1980). Teaching behaviors, academic learning time, and student achievement: An overview. In C. Denham & A. Lieberman (Eds.), Time to learn (pp. 7-32). Washington, DC: U.S. Department of Health, Education, and Welfare, National Institute of Education.

Flanders, J. (1994). Student opportunities in grade 8 mathematics: Textbook coverage of the SIMS test. In I. Westbury, C. A. Ethington, L. A. Sosniak, & D. P. Baker (Eds.), In search of more effective mathematics education (pp. 61-93). Norwood, NJ: Ablex.

Gamoran, A., Porter, A. C., Smithson, J., & White, P. A. (1997). Upgrading high school mathematics instruction: Improving learning opportunities for low-achieving, low income youth. Educational Evaluation and Policy Analysis, 19(4), 325-338.

Husen, T. (Ed.). (1967a). International Study of Achievement in Mathematics: A comparison of twelve countries (Vol. I). New York: John Wiley & Sons.
Husen, T. (Ed.). (1967b). International Study of Achievement in Mathematics: A comparison of twelve countries (Vol. II). New York: John Wiley & Sons.

Kifer, E., & Burstein, L. (1992). Concluding thoughts: What we know, what it means. In L. Burstein (Ed.), The IEA Study of Mathematics III: Student growth and classroom processes (pp. 329-341). New York: Pergamon Press.

Knapp, M., with others. (1995). Teaching for meaning in high-poverty classrooms. New York: Teachers College Press.

Knapp, M. S., & Marder, C. (1992). Academic challenge for the children of poverty, Vol. 2: Study design and technical notes. Washington, DC: U.S. Department of Education, Planning and Evaluation Service.

Mayer, D. P. (1999). Measuring instructional practice: Can policymakers trust survey data? Educational Evaluation and Policy Analysis, 21(1), 29-45.

McDonnell, L. M. (1995). Opportunity to learn as a research concept and a policy instrument. Educational Evaluation and Policy Analysis, 17(3), 305-322.

National Council of Teachers of Mathematics. (1991). Professional standards for teaching mathematics. Reston, VA: Author.

Peaker, G. F. (1975). An empirical study of education in twenty-one systems: A technical report (Vol. 8). Stockholm: Almqvist and Wiksell International.

Porter, A., Floden, R., Freeman, D., Schmidt, W., & Schwille, J. (1988). Content determinants in elementary school mathematics. In D. A. Grouws, T. J. Cooney, & D. Jones (Eds.), Effective mathematics teaching (pp. 96-113). Reston, VA: National Council of Teachers of Mathematics.

Porter, A. C. (1998). The effects of upgrading policies on high school mathematics and science. In D. Ravitch (Ed.), Brookings papers on education policy (pp. 123-164). Washington, DC: Brookings Institution Press.

Porter, A. C., & Smithson, J. L. (2001). Are content standards being implemented in the classroom? A methodology and some tentative answers. In S. H. Fuhrman (Ed.), From the capitol to the classroom: Standards-based reform in the states. One Hundredth Yearbook of the National Society for the Study of Education, Part II (pp. 60-80). Chicago: University of Chicago Press.

Robitaille, D. F. (1989). Students’ achievements: Population A. In D. F. Robitaille & R. A. Garden (Eds.), The IEA Study of Mathematics II: Contexts and outcomes of school mathematics (pp. 102-125). New York: Pergamon Press.

Robitaille, D. F., & Garden, R. A. (1989). The IEA Study of Mathematics II: Contexts and outcomes of school mathematics. Oxford, England: Pergamon Press.

Schmidt, W. H., & Burstein, L. (1992). Concomitants of growth in mathematics achievement during the Population A school year. In L. Burstein (Ed.), The IEA Study of Mathematics III: Student growth and classroom processes (pp. 309-327). New York: Pergamon Press.

Schmidt, W. H., & Kifer, E. (1989). Exploring relationships across Population A systems: A search for patterns. In D. F. Robitaille & R. A. Garden (Eds.), The IEA Study of Mathematics II: Contexts and outcomes of school mathematics (pp. 209-231). New York: Pergamon Press.

Schmidt, W. H., McKnight, C. C., Cogan, L. S., Jakwerth, P. M., & Houang, R. T. (1999). Facing the consequences: Using TIMSS for a closer look at U.S. mathematics and science education. Boston: Kluwer Academic Press.

Schmidt, W. H., McKnight, C. C., & Raizen, S. A. (1997). A splintered vision: An investigation of U.S. science and mathematics education. Boston: Kluwer Academic Press.

Schmidt, W. H., McKnight, C. C., Valverde, G. A., Houang, R. T., & Wiley, D. E. (1997). Many visions, many aims, Vol. 1: A cross-national investigation of curricular intentions in school mathematics. Boston: Kluwer Academic Press.
Schwille, J., & Burstein, L. (1987). The necessity of trade-offs and coalition building in cross-national research: A critique of Theisen, Achola, and Boakari. Comparative Education Review, 31, 602-611.

Smithson, J. L., & Porter, A. C. (1994). Measuring classroom practice: Lessons learned from the efforts to describe the enacted curriculum — The Reform Up Close Study. Madison: University of Wisconsin-Madison, Consortium for Policy Research in Education.

Stevenson, H. W., Lee, S., & Schweingruber, H. (1999). Home influences on early literacy. In D. A. Wagner, R. L. Venezky, & B. V. Street (Eds.), Literacy: An international handbook (pp. 251-257). Boulder, CO: Westview Press.

Travers, K. J., & Westbury, I. (1989). The IEA Study of Mathematics I: Analysis of mathematics curriculum. Oxford, England: Pergamon Press.

Ware, M., Richardson, L., & Kim, J. (2000). What matters in urban school reform. Norwood, MA: Systemic Research.

Westbury, I. (1992). Comparing American and Japanese achievement: Is the United States really a low achiever? Educational Researcher, 21(5), 18-24.

Westbury, I. (1993). American and Japanese achievement . . . again: A response to Baker. Educational Researcher, 22(3), 21-25.

Wiley, D. E., & Harnischfeger, A. (1974). Explosion of a myth: Quantity of schooling and exposure to instruction, major educational vehicles. Educational Researcher, 3(4), 7-12.