Research and the Improvement of Education
Few Americans would question the proposition that research has been a potent force for improved medical care in the twentieth century. Few would deny that research has played an equally powerful role in the emergence of modern agriculture. When it comes to education, however, research enjoys no such flattering reputation. Whether or not the judgment is justified, research in education is more likely to be dismissed as trivial or irrelevant than it is to be considered a fundamental ingredient in understanding how children learn and in improving how they are taught.
One example of this low regard is the very small portion of federal research and development funding that goes to education: slightly more than $350 million of $64.1 billion in fiscal 1991, or about one half of 1 percent. In comparison, the federal government spends 3 times as much on research and development activities for agriculture, 21 times as much for space research and development, and 30 times as much for research related to health. If relative funding levels are any indication, Congress is clearly not convinced that federal support of research can benefit public education in the way it has benefited the nation's health and agriculture. Members of Congress are not alone in their generally low regard for research as an integral part of a robust system of education: teachers commonly indicate that they do not use research and do not see its connection to what they do on a daily basis in the classroom (Louis et al., 1984).
There are many reasons for the undistinguished reputation of research in education, only some of which are well founded. Part of the cause can be found in the practical orientation of teacher education. Schools of education generally do not prepare the nation's future teachers to value disciplined inquiry or even, at a more mundane level, to keep track of relevant research. Once teachers are on the job, their working conditions do not encourage them to study the research literature. No matter how enlightening research may be, it cannot contribute to improvements in education if it is not understood, used intelligently, and refined in the context of local experience.
This situation is aggravated by the national tendency to want quick solutions to problems—even if they have been generations in the making. Much of the public discussion of education research has a distinctly utilitarian cast: it assumes that researchers conduct studies, their findings are translated into products or programs for use in the schools, and education is improved. This view is at once too narrow and too grandiose. It implies that the only valuable research is research that can be directly translated into classroom practice, a view that gives short shrift to much research. And it encourages unrealistic expectations about what research can—or should be able to—accomplish.
The effects of research on educational practice are seldom straightforward and quick. As in other fields, there are few definitive studies, but rather a gradual accretion of knowledge drawn from overlapping studies in many fields of study, conducted over a long period of time, punctuated by an occasional breakthrough. In physics and chemistry, as well as social and behavioral science, decades of basic research provide the seed bed for new approaches and methods. Improvement in education will occur only if all participants—parents, students, teachers, the public, and policy makers—are willing to make strong intellectual commitments to work together using new insights, approaches, and techniques to improve education.
The undistinguished reputation of education research is also partly attributable to some of the work. Some of the research has been methodologically weak, and there have been trivial studies, an infatuation with jargon, and a tendency toward fads with a consequent fragmentation of effort. The committee, however, does not share the widespread negative judgments about the contributions of research to the reform of education. Our review of research-based programs to improve teaching, strengthen curricula, restructure institutions of learning, and assess and monitor the progress in U.S. schools has convinced us not only that research can improve education, but also that it has been demonstrably useful.
This chapter provides a brief introduction to a few of the contributions from research and development for education. Some of the work has been funded by the Office of Educational Research and Improvement, some has been funded by the National Institute of Child Health and Human Development and the National Science Foundation, and some by other federal agencies. We first discuss just one stream of basic research, cognitive science, and how it has informed understanding of the teaching and learning of mathematics and reading. We next introduce seven innovative curricula and teaching approaches, several of which are based partly on the findings from cognitive science, and then two school restructuring processes, based on other research from the social sciences. The third section describes some of the major efforts to monitor the status of American schools and teachers and the achievement of students. Finally, we discuss some of the ways in which Congress and congressional agencies have used education research.
There is much research and development that we have not covered, such as work on the social and cultural contexts of school and learning, school finance, the economics of education, administrative and organizational studies, classroom observational studies, curriculum analysis, and studies of postsecondary education. Our exclusion of these lines of work is no reflection on their importance but rather a result of the committee's limited time and our decision to cover fewer topics in greater depth. For a broader introduction to the field, see the Encyclopedia of Educational Research (Alkin, 1992).
BASIC RESEARCH IN COGNITIVE SCIENCE
Research has enriched knowledge of learning and teaching in many ways. One of these is knowledge about early development of thinking, reading, and mathematics skills. A number of the basic theories of the development of cognitive processes presented in this section informed the design of programs discussed below. In some cases, the findings of cognitive researchers have reinforced traditional practices used to assist children in acquiring reading and mathematics skills. For instance, the practice of reading with a child and discussing the story has been shown to build cognitive skills of summarizing, clarifying, predicting, and questioning. But just as often, cognitive researchers working in areas such as artificial intelligence and expert systems have suggested new approaches to teaching.
For many years the principles espoused by B.F. Skinner dominated human experimental psychology. His approach was based on determining the relationships between observable stimuli and observable responses, with little consideration for what went on in between. Since the late 1960s, however, the emphasis has shifted to the study of cognitive processes, modeling what the mind knows and how it knows—an approach that is more compatible with providing guidance for teaching and learning. According to Resnick (1987a:7):
The process of making explicit the abilities formerly left to the intuitions of gifted learners and teachers is precisely what we need to establish a scientific foundation for the new agenda of extending thinking and reasoning abilities to all segments of the population.
Cognitive scientists, including researchers in psychology, computer science, linguistics, neuroscience, and anthropology, have differentiated and expanded understanding of how thought and knowledge develop and interact. The notion of schemata, first discussed by Bartlett in 1932, has reemerged as a principal concept. A schema is a mental framework for acquiring and organizing new knowledge and skill and interpreting new experience; it also contains both the elements of knowledge and the rules for relating the elements. The development of expertise involves more than the acquisition of new knowledge; it involves the remodeling of one's prior perspective. According to cognitive theorists, individuals have several schemata, each of which may result in a different interpretation of an event.
Thinking skills are sets of strategies for analysis and self-regulation that build on prior knowledge and experience and generate increasingly complex frameworks for understanding (Chipman et al., 1985; Glaser, 1984; Resnick, 1989). Some aspects of thinking are common across domains; others are quite specialized and domain specific (Benton and Kiewra, 1987). And thinking is influenced by social support, shared experience, and role models (Brown and Palincsar, 1989; Rogoff and Lave, 1984).
In the past it was believed that young children were essentially empty vessels to be filled with knowledge and that, when they faced unfamiliar problems, their errors were the result of random guessing. Work in cognitive science has since shown that many errors made by children in the first grade of school are based on the consistent application of incorrect rules (Brown and VanLehn, 1982; Fisher and Bullock, 1984). With this new understanding of learning processes, cognitive scientists began to explore and categorize faulty rules, looking for the principles underlying the errors made in different learning tasks. The results of this work have provided the ability to identify the cause of children's errors and to design instructional strategies to eliminate them.
One approach to supporting cognitive development in young children is guided intervention, a collaborative process based on shared experiences and understanding (Vygotsky, 1978). In this approach, children develop thinking and subject-related skills through guided, social contact with adults. The adult models a behavior that is slightly beyond the child's current capabilities, coaches the child in the behavior, and guides him or her in reflecting on the new experience for purposes of mastering the behavior. In this way, the child acquires not only the new skill but also the adult's understanding of the skill.
Another important line of research in cognitive science is modeling the knowledge structures and judgments of experts and novices and then comparing the two as a way to understand the nature of expertise and the training needed to turn novices into experts. For example, Chi et al. (1981) have examined the differences in the knowledge structures and problem approaches of expert and novice physicists to better understand how the acquisition of knowledge and rules affects problem-solving strategies. The schemata and algorithms used by experts can be studied by using such methods as cognitive task analysis or think-aloud protocols (Newell and Simon, 1972). According to Glaser et al. (1991), models representing stages in the progression from novice to expert skill would be useful in guiding the learning process.
The principles of cognitive science have provided important guidance to the developers of many promising programs on curriculum design and teaching approaches. Two examples are Cognitively Guided Instruction and Reciprocal Teaching (described below).
Recent research on mathematics learning shows that preschool children develop mathematical concepts that they apply to a variety of practical situations. Not surprisingly, many of their concepts and algorithms are incorrect. In a careful study of the processes used by children in multidigit subtraction, Brown and VanLehn (1983) found that many errors were systematic and consistent and could be traced to erroneous variations in procedures known as "bugs". For example, when the top digit is smaller than the bottom digit, children often subtract the top digit from the bottom digit instead of borrowing. At the time of their article, Brown and VanLehn had found 88 primitive bugs and 300 combined bugs based on children's flawed hypotheses.
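The subtraction error described above can be made concrete with a small sketch. The function names `subtract_with_bug` and `consistent_with_bug` are hypothetical and are not taken from Brown and VanLehn's work; the code models only the single bug mentioned in the text (subtracting the smaller digit from the larger in every column instead of borrowing) and assumes non-negative whole numbers with the larger number on top.

```python
def subtract_with_bug(top: int, bottom: int) -> int:
    """Subtract column by column, always taking the smaller digit from the
    larger one, so that borrowing never happens (illustrative model only)."""
    if not 0 <= bottom <= top:
        raise ValueError("this sketch assumes 0 <= bottom <= top")
    # Work from the ones column leftward.
    top_digits = [int(d) for d in str(top)][::-1]
    bottom_digits = [int(d) for d in str(bottom)][::-1]
    # Pad the shorter number with zeros so the columns line up.
    bottom_digits += [0] * (len(top_digits) - len(bottom_digits))
    result_digits = [abs(t - b) for t, b in zip(top_digits, bottom_digits)]
    return int("".join(str(d) for d in reversed(result_digits)))


def consistent_with_bug(problems, answers) -> bool:
    """Return True if every answer matches what the buggy rule would produce."""
    return all(subtract_with_bug(t, b) == a for (t, b), a in zip(problems, answers))


print(subtract_with_bug(52, 17))   # 45, whereas the correct answer is 35
print(subtract_with_bug(58, 23))   # 35, correct because no column needs borrowing
```

A child who answers 45 for 52 - 17 and 212 for 305 - 117 is not guessing: both answers follow from the same faulty rule, which is what makes the error diagnosable.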
Many children experience difficulty in learning mathematics in school because they and their teachers do not understand the relationship between the rules and algorithms taught in school and the children's own, independently developed mathematical intuitions. Understanding the rules followed by children as they make errors can be useful in diagnosing specific learning problems and in developing effective instructional strategies. For instance, researchers at the Learning Research and Development Center are currently working on a reasoning-based mathematics program designed to help children build on their intuitions, showing them how to correct and extend them. At the Wisconsin Center for Educational Research, Fennema et al. (1989) developed a taxonomy of word problems in addition and subtraction. The taxonomy helps teachers identify the mathematical concepts a student understands and those that must be mastered to solve given problems correctly.
Several lines of research have contributed to understanding how to teach beginning reading. One central stream is the decades of work on alphabetic coding, phoneme awareness, and word recognition. Adams (1990), a researcher at the OERI's Center for the Study of Reading, provides a detailed analysis of this work. She describes the reading system, based on four processors: the orthographic processor perceives the sequence of letters in the text; the phonological processor maps the letters onto their spoken equivalents; the meaning processor generates meaning from words; and the context processor constructs an ongoing understanding of the text. Experimental research on eye movements and fixations of skilled readers provides important insights into how each of these processors is used and how they interact as a reader moves from print to meaning.
Phoneme awareness has been shown to be a prerequisite for mapping alphabetic symbols to sound, and alphabetic mapping is believed to be necessary for learning to identify words. Chall (1983) found that the two best predictors of early reading achievement are letter knowledge and the ability to discriminate phonemes. Findings reported by Adams (1990) and Vellutino (1991) suggest that strategies based on the direct teaching of letter-sound combinations to facilitate the generative use of sounds for word decoding are necessary but not sufficient conditions for learning to read: meaning-based strategies are also important for comprehension.
Skilled reading requires mastering the basic processes of letter and word recognition to the point that they are automatic. When reading is not fluent, comprehension has been found to be deficient, perhaps because "less than optimum facility in word identification drains off cognitive resources that would normally be diverted to comprehension processes, thereby impeding these processes" (Vellutino, 1991:438).
In 1972 a small group of distinguished social scientists was assembled by OERI's predecessor, the National Institute of Education (NIE), to examine written and oral communication from the standpoint of information processing theory. The group concluded (Miller, 1973): "NIE should actively support efforts to understand the cognitive processes involved in acquiring basic reading skills and the cognitive processes involved in comprehending linguistic messages." More than 100 researchers were subsequently involved in a consensus-building process to plan an appropriate research program. In response to the ensuing request for field-initiated proposals, 100 were received and approximately 25 were funded. Funding was also provided to the Center for the Study of Reading and the Center for the Study of Writing.
This work and that supported by other federal agencies substantially expanded understanding of cognitive aspects of learning to read. In a major summary of the work 10 years later, Anderson et al. (1985) indicate that constructing meaning from written text is central to the reading process and that this involves selecting and using knowledge about people, places, and things, as well as developing the skills of summarizing, clarifying, and predicting. Skilled readers use prior knowledge about the topic and about the syntax of language to fill in gaps, integrate different pieces of information, and infer meaning. Consequently, skilled readers are strategic: They assess their prior knowledge on a topic; adjust their approach on the basis of the complexity of the text, their familiarity with the topic, and their purpose for reading; and monitor their comprehension.
It has long been known that parents reading to young children is useful for intellectual, social, and emotional development. In addition, however, research has shown that the intellectual stimulation is enhanced if the parent engages the child in discussions of the stories. Asking children to recall facts, provide descriptions, and reflect on the experiences in the stories introduces them to reading as a constructive and strategic process (Anderson et al., 1985; Dole et al., 1991). Another new research finding involves writing. It used to be thought that young children could not learn to write meaningful text until they mastered basic reading skills. However, when researchers tried to teach writing and reading simultaneously to first and second graders, they were not only successful, but they also serendipitously found that writing instruction accelerated the acquisition of reading skills (Graves, 1983).
Research on reading and writing has contributed to the development of innovative programs, such as Reciprocal Teaching, Reading Recovery, and Success For All, and has informed parents, teachers, and policy makers through a series of widely distributed publications. A popular summary of the research work, Becoming a Nation of Readers (Anderson et al., 1985), was published by the Center for the Study of Reading and sold 250,000 copies. In addition, OERI published nearly a million copies of three companion booklets: one for parents, one for teachers, and one for principals.
CURRICULUM DEVELOPMENT AND IMPROVED TEACHING APPROACHES
The committee's informal search for examples of programs designed to improve curriculum was not comprehensive—programs suggested by committee members and a review of Educational Programs That Work (National Dissemination Study Group, 1990) produced a list of 30 candidates. From those we selected seven that appeared to have at least moderately credible evaluation data, that had evidence of at least moderate impact, and that illuminated a variety of approaches. We did not assess whether the examples are the most effective or efficient programs available for a given purpose.
The programs selected for discussion are striking in their variety, but all provide evidence that the translation of research to improved teaching and learning is a lengthy and expensive process that often requires numerous iterations of research, development, and refinement. Two of the programs are still in the early stages of research, but all are the product of 4 or more years of work, and several have been in various stages of development or dissemination for more than 10 years.
In-service teacher education and development is part of each program. For most of the programs, initial training is for less than 1 week, but Cognitively Guided Instruction requires a 4-week seminar, and Reading Recovery requires a full year of training.
Federal funding for research and development portions of the programs ranged from $330,000 to $8 million. Many of the innovative programs were also supported by state and private funds and by the in-kind contributions of school districts that participated in their development and demonstration.
The seven programs reviewed in this section present examples of promising programs in four areas: higher order thinking skills, reading, mathematics, and generic instructional approaches for use in any subject area. The first program, Project IMPACT, integrates instruction in thinking skills into the curriculum as part of each subject-matter course. The next two programs are designed to enhance reading instruction, and they draw on research in both reading theory and cognitive science. One of the programs, Reading Recovery, is described in detail because it offers an example of a fully developed program for the poorest readers in the first grade that has been demonstrated to have a continuing positive effect over at least the next two grades. The other reading program, Reciprocal Teaching, demonstrates Vygotsky's model of guided intervention and provides a direct test of the theory of the centrality of comprehension monitoring in strategic reading. Reciprocal Teaching also offers an example of an extensive basic research program designed to study metacognition that has also begun to benefit educational practice.
The two mathematics programs are the Comprehensive School Mathematics Program (CSMP) and Cognitively Guided Instruction. CSMP offers an interesting example of a field-initiated effort based on new concepts of instruction that was subsequently developed by a laboratory of NIE. The idea of including thinking skills in the instruction of mathematics is now becoming more widespread. Cognitively Guided Instruction provides an example of how teachers can learn to diagnose a child's level of comprehension and create instruction that builds directly on that level. The program draws on research examining the mathematical concepts of preschool children conducted at the University of Wisconsin and the Learning Research and Development Center.
The final two examples, computer-assisted instruction (CAI) and cooperative learning, are illustrative of successful, widely disseminated approaches that are used for instruction in all subject areas. Both approaches have a history of at least 25 years of research, development, refinement, and application, and both have been studied extensively by several researchers. These long-term efforts illustrate the usefulness of research in identifying and confirming innovations that can make a positive difference in education. Student Team Learning, a cooperative learning program created by researchers at the Johns Hopkins Center for Social Organization of the Schools, is an example of a program based on prior research in social and organizational psychology and modified by the developers' own subsequent research and evaluations.
All of the programs described in this section used research designs in which the performance of students in an experimental group was compared with national norms or with the performance of students in a control group. However, the committee was generally disappointed in the quality of the evaluations available from these programs, which exhibited several problems. Part of the gains reported for some of the evaluations may be due to a statistical artifact known as regression toward the mean, a problem that arises when students are selected for a remedial treatment because they scored below average on an achievement test: part of any low score reflects chance, so the group's scores on a retest tend to move back toward the average even without treatment. The most commonly used criterion for program success was a statistically significant increase in performance on a standardized achievement test. Although these tests are useful, they often provide a relatively narrow indication of student achievement. In addition, few of the programs have collected follow-up data to determine whether initial effects were maintained after students left the program. Except for computer-assisted instruction and cooperative learning, the evaluation data are in many cases only from carefully managed demonstrations and not from subsequent dissemination sites, where full adoption of the program is not assured. Moreover, with the same two exceptions, all of the evidence for effectiveness has been provided by program developers. Lastly, the reports of evaluations were often less thorough than we would have liked.
Despite these limitations, the programs have been subject to two or more evaluations, often using different approaches, and the results were similar. This increased our confidence in the findings. Nevertheless, because of the important limitations in the evaluations, most of the programs and approaches described should be considered promising, rather than of proven effectiveness.
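The regression artifact mentioned among the limitations above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any program's data: each simulated student has a stable ability plus independent test-day noise on two tests, no treatment is given, and the lowest 20 percent on the pretest are "selected" as a remedial group. The function name, sample size, and noise levels are all hypothetical.

```python
import random
import statistics


def simulate_regression_to_mean(n_students=10_000, cutoff_fraction=0.2, seed=0):
    """Return (pretest mean, posttest mean) for the lowest pretest scorers
    when no treatment at all is applied between the two tests."""
    rng = random.Random(seed)
    # Stable ability plus independent measurement noise on each test.
    ability = [rng.gauss(0, 1) for _ in range(n_students)]
    pretest = [a + rng.gauss(0, 1) for a in ability]
    posttest = [a + rng.gauss(0, 1) for a in ability]
    # Select the students with the lowest pretest scores.
    order = sorted(range(n_students), key=lambda i: pretest[i])
    selected = order[: int(n_students * cutoff_fraction)]
    pre_mean = statistics.mean([pretest[i] for i in selected])
    post_mean = statistics.mean([posttest[i] for i in selected])
    return pre_mean, post_mean


pre_mean, post_mean = simulate_regression_to_mean()
print(f"selected group, pretest mean:  {pre_mean:.2f}")
print(f"selected group, posttest mean: {post_mean:.2f}")
```

In this setup the selected group's posttest mean is noticeably higher than its pretest mean even though nothing was done for the students, which is why a comparison group chosen by the same rule is needed to separate a real program effect from the artifact.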
Project IMPACT was designed to improve the critical thinking of students by incorporating instruction in thinking skills into the regular course work of reading, mathematics, and science. Development of this field-initiated program began in 1979 with remedial students in grades 6–9; it has since been expanded to include all grades from kindergarten through high school. The premise of the program is that all students with intellectual potential at or near the normal range will benefit from instruction in higher level thinking skills.
The IMPACT program provides curriculum materials and training. The first section of the materials is designed to assist curriculum planners in analyzing existing courses for embedded instruction on thinking skills: classifying, seeing cause-and-effect relationships, making generalizations, forming predictions, and making assumptions. The second section of the program materials is designed to help teachers fill in identified gaps in thinking skills instruction. These materials contain lesson plans for classifying and categorizing information, ordering and setting priorities for information, formulating effective questions, and various reasoning exercises (Winocur, 1987).
Training is available at two levels. At the first level, teachers participate in a 3-day in-service workshop in which they are provided with strategies for introducing and guiding the process of critical thinking within the context of various subject areas. Special instruction is provided in effective use of Project IMPACT curriculum materials. The second level of training is designed to prepare graduates of the first level of training to act as trainers and disseminators in their local school districts.
Testing and Evaluation
The original evaluation study was conducted in 1983 and involved four school districts. All student subjects were in the seventh, eighth, or ninth grades and were enrolled in remedial reading or mathematics classes. The students selected to participate in both the experimental and control groups were those who failed to pass their district proficiency test and scored at or below the 37th percentile in reading comprehension on the Comprehensive Test of Basic Skills. The 426 students in remedial IMPACT classes were taught by IMPACT-trained teachers; the 352 students in remedial control classes were taught by regular teachers. Standardized tests of basic skills and the Cornell Critical Thinking Test were used as measures of students' gains. Pretests were administered in September and post-tests in February; all testing was conducted by classroom teachers. The results show that students in the Project IMPACT classes significantly outperformed students in control classes in thinking skills, mathematics, and reading at all three grade levels. According to the analysis, the magnitude of the gains for IMPACT students was 1 standard deviation on both the Comprehensive Test of Basic Skills and the Cornell Critical Thinking Test, compared with an average gain of 0.1 standard deviations for students in control classes (Winocur, 1983).
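Gains reported in standard deviation units, as in the paragraph above, can be computed along the following lines. This is a minimal sketch: the source does not say which standard deviation the analysis used, so dividing by the pretest standard deviation is an assumption, the function name `standardized_gain` is hypothetical, and the scores are invented for illustration only.

```python
import statistics


def standardized_gain(pretest, posttest):
    """Mean pretest-to-posttest gain, expressed in pretest standard deviations."""
    gain = statistics.mean(posttest) - statistics.mean(pretest)
    return gain / statistics.stdev(pretest)


# Invented scores, not data from the IMPACT evaluation.
pretest = [40, 45, 50, 55, 60]
treated_posttest = [48, 53, 58, 63, 68]   # every score rises by 8 points
control_posttest = [41, 46, 51, 56, 61]   # every score rises by 1 point

print(f"treated gain: {standardized_gain(pretest, treated_posttest):.2f} SD")
print(f"control gain: {standardized_gain(pretest, control_posttest):.2f} SD")
```

Expressing gains this way puts results from different tests on a common scale, which is what allows a gain on a basic skills test to be compared with a gain on a critical thinking test.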
Results obtained in a later study by the same investigator showed that 83 students in seventh grade classes taught with IMPACT significantly improved their reading comprehension between the pretest and the post-test (as measured by the Comprehensive Test of Basic Skills): the mean performance of these students increased from the 40th percentile to the 51st percentile of the norm group, which was a nationally representative sample of students (Winocur, 1987).
Project IMPACT offers an approach that can be embedded into most curricula and used as part of most teaching methods. As a result, the program appears to have widespread utility. Since 1983 the National Diffusion Network has helped disseminate this program. According to IMPACT staff, there have been approximately 6,500 adoptions across all grades.
Reading Recovery is an early intervention program designed to assist the lowest 20 percent of readers in the first grade, as determined by a special battery of diagnostic tests developed by Marie Clay (1985). The goal of this field-initiated program is to teach these children reading skills that are comparable to those of the average students in their class. Current data show that 86 percent of the children in Reading Recovery have successfully met that goal within 12 to 20 weeks.
The Reading Recovery program is designed to be supplementary to regular reading instruction in the classroom. A child in the program leaves the classroom to work one-to-one with a specially trained teacher for 30 minutes every day. Within a lesson, reading and writing activities are integrated, based on the idea that development in one skill area supports advancement in the other. Program materials include several hundred small books graded at 20 levels of difficulty. The easier levels of these books contain illustrations on every page and are written in predictable language patterns compatible with children's ability to understand.
The approach is to encourage children to read by building on the knowledge and skills they already have. In the early stages of the program, a teacher works very closely with a child, examining each word and each sentence in a small story for recognition, pronunciation, and meaning. Throughout the story reading, the child and the teacher discuss what the characters might do next. As the child progresses through the lessons, he or she is supported by the teacher in the development of strategies that good readers appear to acquire naturally, such as summarizing, clarifying, and predicting. Teachers stress meaning cues and comprehension in both reading and writing activities. Children are considered ready to leave the program when they have demonstrated the ability to use these strategies on their own and when they have reached the average reading level of their class.
A critical component of the Reading Recovery program is teacher preparation. Teacher leaders go through a year-long program at a university training site and then return to their local regions to provide a full year of training for other teachers. During training the teachers learn how to prepare lesson plans and administer the program, how to create diagnostic summary reports, and how to assess student progress. They also practice working with a child behind a one-way glass while being observed by the other teachers in the program. As a teacher works with a child, the observers comment and discuss the process. Teachers also keep records on every aspect of the process. Throughout the training year, teachers work with students in their schools using the Reading Recovery methods.
Research and Development
Marie Clay, who developed Reading Recovery in New Zealand in the 1970s, spent several years reviewing theories of reading and studying the reading behaviors of young children. As a result of her field-initiated work, she found that children use phonological awareness, syntax cues, and their knowledge of subject matter when extracting meaning from text. Her approach to teaching reading immerses learners in high-interest, authentic literary tasks instead of drills and exercises, and teachers coach students in using all three strategies for extracting meaning.
Reading Recovery also incorporates a number of principles from Vygotsky's work on learning through social interaction and from Piaget's work in the genetic or historical reconstructive method. Essentially, these developmental theories suggest that instruction be provided as guided reinvention—a process that offers a structure for a teacher to share activity with a child in a way that the growth of the child is maximized. As the child gains competence, new levels of knowledge are jointly explored. Reading Recovery adopts these principles in its interactive lessons and in the adjustments made from lesson to lesson on the basis of the progress of the child. When a child successfully completes the program, he or she has internalized the skills necessary to continue to learn to read alone. According to Clay and Cazden (1990:207):
The end point of early instruction has been reached when children have a self improving system: they learn more about reading every time they read, independent of instruction. When they read texts of appropriate difficulty for their present skills, they use a set of mental operations, strategies in their heads, that are just adequate for more difficult bits of the text. In this process they engage in reading work, deliberate efforts to solve new problems with familiar information and procedures. They are working with theories of the world and theories about written language, testing them and changing them as they engage in reading and writing activities.
The first field studies of the Reading Recovery program were initiated in 1978; they were designed to answer questions about the program's impact on students as well as the influence of various school characteristics on the effectiveness of the program. Early successes in New Zealand led to interest from researchers in the United States. In 1983 Pinnell and Huck transferred the program to Ohio State, where several additional field tests were conducted. Between 1984 and 1986, $1.5 million was invested in the program's development by state and local funding sources in Ohio.
Testing and Evaluation
In the first full year of implementation in Ohio (1985–1986), urban school students in the lowest 20 percent of their first grade classes, as determined by Clay's diagnostic test battery, were randomly assigned either to Reading Recovery (133 students) or to a control group that received a commonly used remedial reading curriculum (51 students). After 15.7 weeks the results appeared impressive: Reading Recovery students performed significantly better than control students, and 73 percent of the children reached average levels of achievement in reading for their respective first grade classes. In addition, the developers found that Reading Recovery students made an average normal curve equivalent gain on the Comprehensive Test of Basic Skills of 8.6 for the school year, compared with -0.2 for students in the control group. At the end of the first grade, Reading Recovery students (the 73 percent who had successfully completed the program) were reading and writing at or above the average level of their classmates; when entering the third grade, they were reading at a 3.1 grade level.
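For readers unfamiliar with the normal curve equivalent (NCE) scale used above, the sketch below shows the standard conversion from NCE scores to percentile ranks. NCEs are a normal-curve rescaling with a mean of 50 and a standard deviation of about 21.06, chosen so that NCEs of 1, 50, and 99 coincide with the same percentile ranks. The starting score of 30 is a hypothetical value chosen only for illustration; the study reports the average gains (8.6 and -0.2), not the starting levels.

```python
from statistics import NormalDist

NCE_SD = 21.06  # standard scaling constant for the NCE scale


def nce_to_percentile(nce: float) -> float:
    """Convert a normal curve equivalent score to a percentile rank."""
    z = (nce - 50.0) / NCE_SD
    return 100.0 * NormalDist().cdf(z)


# Hypothetical starting point (not from the study), with the reported gains.
start_nce = 30.0
for label, gain in [("Reading Recovery", 8.6), ("control", -0.2)]:
    before = nce_to_percentile(start_nce)
    after = nce_to_percentile(start_nce + gain)
    print(f"{label}: percentile {before:.1f} -> {after:.1f}")
```

Because the NCE scale is equal-interval and the percentile scale is not, the same NCE gain corresponds to different percentile changes depending on where a student starts.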
Because Reading Recovery is labor intensive and thus very expensive, developers have been exploring ways to reduce the amount of time a teacher spends with each child. In a recent study supported by the MacArthur Foundation, Pinnell et al. (1991) compared four reading instruction methods with one another and with a control group. The four methods were (1) regular Reading Recovery, (2) Reading Recovery with teachers who received a shortened training course, (3) a one-to-one practice model (Direct Instruction Skills Plan), and (4) group lessons by a Reading Recovery teacher. The sample included 324 students in 10 school districts. According to Pinnell et al. (1991:1):
Regular Reading Recovery was the only group for which the mean treatment effect was significant on all four measures [of reading ability] at the conclusion of the field experiment and was also the only treatment indicating lasting effects. Results of this study indicate that one-to-one instruction is a necessary but not sufficient factor in Reading Recovery's success. Quantitative results and the qualitative analysis of videotapes indicate that Reading Recovery training is a powerful influence on teachers and makes a difference in student success.
Over the past decade Reading Recovery has spread to 33 states and two sites in Canada. In 1991 there were 84 teacher leaders, 1,906 teachers, and 12,902 children involved in the program. According to the developers, approximately 86 percent of those students completed the program successfully in 12–20 weeks, demonstrating reading skills at the average level for their class.
A critical element in the success of this program is the central quality control over the program provided by the staff at Ohio State University. Not only is the staff responsible for ensuring effective training, they also analyze the results for every student enrolled in Reading Recovery. Based on the results to date, the program continues to be effective from entry to the third grade. OERI's National Diffusion Network certified Reading Recovery as an effective program in 1987 (see Chapter 3) and has helped disseminate it since that time. A key goal for the future is to develop Reading Recovery as a group instructional program to reduce operating costs.
Reciprocal Teaching is a 10-year, field-initiated program of basic research funded by the National Institute of Child Health and Human Development to test the theory that the skills that define "comprehension monitoring" are central in strategic reading. The comprehension monitoring activities selected for study include summarizing (self-review), questioning, clarifying, and predicting. The approach is to instruct students in the use of these skills by encouraging them to participate in guided activity before they are asked to perform independently (see Brown and Palincsar, 1989; Palincsar and Brown, 1984). "In these teaching situations the novice carries out simple aspects of the task while observing and learning from an expert, who serves as a model for higher level involvement" (Palincsar and Brown, 1984:123). Initially, research on Reciprocal Teaching focused on reading and listening; more recently, it has been extended to include mathematics (Campione et al., 1988).
In Reciprocal Teaching of comprehension, a teacher and a student take turns leading a discussion concerning sections of the text: the task includes clarifying complex sections, asking questions, making predictions, and generating summaries. Initially the teacher models the activities, and the students are encouraged to work at whatever level they can; the teacher then provides guidance at the appropriate level for each student. In the beginning, students have a great deal of difficulty with the process of becoming a leader. However, as Reciprocal Teaching progresses, with the teacher providing directed feedback and guidance, the students become much more competent and comfortable.
Testing and Evaluation
The first two experiments on Reciprocal Teaching were conducted with seventh grade students who could read but were at least 2 years behind on standardized scores of reading comprehension (Palincsar and Brown, 1984). In the first study, some students received the Reciprocal Teaching approach from the program developers, and other students were assigned to one of three comparison groups. Program developers worked with students in pairs in the Reciprocal Teaching group, giving them an overview of the lesson, having them read a passage silently, asking one to take the role of teacher, and as the lesson progressed, providing corrective and supportive feedback. Students were told that the strategies they were learning were general and would help them understand as they read. Each day students took three unassisted assessments (before, during, and after training) in which they read a passage and answered ten questions.
The results provide impressive support for the efficacy of Reciprocal Teaching. Average performance in the comparison groups did not improve and remained at around 40 percent comprehension. In the Reciprocal Teaching treatment, students became more like the adult model: they began to use their own words, and main idea summaries became more and more frequent. All six students in the program reached a stable level within 12 days of instruction, and five of the six were operating at a level of 70–80 percent comprehension. Moreover, five of the six students improved their classroom comprehension on other tasks. These improvements were still in place 8 weeks after the Reciprocal Teaching program.
In the second study, regular classroom teachers provided Reciprocal Teaching to three seventh grade and one eighth grade reading class (ranging in size from four to seven students). The student selection criteria and the materials and procedures for Reciprocal Teaching were the same as those used in the first experiment. After 15 days of Reciprocal Teaching, students were demonstrating comprehension of 75–80 percent on daily assessments—up from 40–50 percent at the beginning of the intervention. By the 25th day, many were at 100 percent comprehension, and these levels were maintained on the 8-week posttest. As with the first experiment, there was also evidence that the comprehension skills transferred to other subjects such as mathematics. Moreover, combined results from the two studies show that students receiving Reciprocal Teaching gained an average of 20 months in comprehension in comparison with a 1-month average gain by control students; Reciprocal Teaching students also improved their percentile rankings by more than 40 points in social studies and science in comparison with a randomly selected sample of all seventh graders in the schools where the experiments were conducted.
In 1987 Palincsar et al. reported on the results of a study using peer tutors as Reciprocal Teaching instructors. In addition to providing another demonstration of the effectiveness of Reciprocal Teaching, this study examined the effectiveness of instructional chains—one group of individuals is taught an activity and then becomes responsible for teaching the activity to others. The nine peer tutors, selected from developmental reading classroom students, were taught by teachers for 10 days; they then taught one or two other students in their class for 12 days. Over the course of their training and tutoring, the peer tutors' comprehension rose from an average of 72 percent correct to an average of 87 percent correct, and the tutees' comprehension rose from an average of 53 percent to an average of 77 percent correct.
In another study using Reciprocal Teaching to develop listening skills, Brown and Palincsar (1989) collected data on 17 first grade teachers, 132 experimental children whose comprehension abilities were severely impaired, and 66 children of comparable ability in a control group. After 20 days of Reciprocal Teaching, 78 percent of the students showed consistent gains in comprehension (either reaching a criterion of 70 percent correct or improving comprehension by at least 20 percentage points); in comparison, only 28 percent of students in the control group showed such gains.
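The gain criterion used in this study can be written as a simple rule. The sketch below is only an illustration: the function name and the sample scores are hypothetical, and the source does not specify how the before and after scores were measured or averaged.

```python
def shows_consistent_gain(pre_pct: float, post_pct: float) -> bool:
    """Criterion from the text: 70 percent correct, or a gain of at least 20 points."""
    return post_pct >= 70.0 or (post_pct - pre_pct) >= 20.0


# Hypothetical (pre, post) comprehension scores, in percent correct.
students = [(40, 72), (30, 55), (50, 60), (45, 64)]
met = [shows_consistent_gain(pre, post) for pre, post in students]
share = 100.0 * sum(met) / len(met)
print(f"{share:.0f} percent of students met the criterion")
```

Note that the rule counts a student who moves from 30 to 55 percent as showing a gain even though the 70 percent threshold is not reached, which is why the two conditions are joined by "or".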
According to program researchers, Reciprocal Teaching instruction has been conducted with approximately 50 teachers and 1,000 students under highly controlled experimental conditions. The method has begun to spread as a teaching strategy, but it has never been formally disseminated or evaluated in the nonexperimental dissemination sites.
The Comprehensive School Mathematics Program
The Comprehensive School Mathematics Program (CSMP) was initially designed to develop thinking skills as part of mathematics instruction for children of all ability levels in grades K-6. Specifically, the program addresses: (1) the need to expand the definition of basic skills beyond computation; (2) the need for problem solving to be the focus of mathematics; (3) the need for developing such skills as reasoning, analyzing, estimating, and inferring; and (4) the need to increase emphasis on numeration and number sense, patterns, probability, logic, geometry, algorithmic thinking, and mathematical connections. Two important features of CSMP are that students are taught through interrelated experiences and problems that are appropriate to their natural instincts and level of understanding. When CSMP was developed, however, the notion of integrating thinking skills with content differed dramatically from the prevailing emphasis on drill and practice. CSMP is one of the few mathematics programs that conforms to many of the elements of the recently developed Curriculum and Evaluation Standards for School Mathematics (National Council of Teachers of Mathematics, 1989).
The program uses a "pedagogy of situations"—gamelike problem situations and story settings to teach both content and processes. Specifically, content is presented as an extension of a child's everyday and fantasy experiences. Three special languages are used: the language of strings (notion of sets), the language of arrows (notion of relations and functions), and the language of the Papy Minicomputer, which models the positional structure of the Western system of numeration. A key feature of CSMP is the sequencing of the curriculum in a spiraling form: "each student spirals through repeated exposures to the content, building interlocking experiences of increasing sophistication" (Heidema, 1991).
Research and Development
According to Claire Heidema, the initial planning for CSMP began with mathematician Bert Kaufman in 1966; he brought the program to the Central Midwestern Regional Educational Laboratory (CEMREL)—a laboratory supported by NIE—in 1970. Kaufman, an active researcher in mathematics education, was joined by Belgian mathematician Frederique Papy in 1972. Papy provided the fundamental concept of using situations as a basis for instruction.
CSMP was field tested and revised over a 5-year period. In the first year, instructional materials were used by CSMP staff in both public and parochial school classes. The second year was devoted to a local pilot test in which revised materials were used in about ten regular classrooms in St. Louis. During the third and fourth years, an extended pilot trial version (based on revisions from the local pilot test) was evaluated in a national network of cooperating schools. In this test, CSMP classes were compared with non-CSMP classes, and the materials were revised again. In the fifth year, the material revisions from the extended pilot trials were prepared for publication. Throughout the 5-year process, an independent unit of CEMREL conducted evaluations of the program. Altogether, the Department of Education provided $8 million for development of the program.
Testing and Evaluation
CSMP developers claim that students using the curriculum are better able to apply the mathematics they have learned to new problem situations and perform as well in traditional basic computational skills. Two types of studies have been used to compare CSMP students with non-CSMP students. In the first type, the same teachers taught two courses—the first year they taught the regular mathematics course and the second year they taught the CSMP curriculum—and the students' scores were compared. In the second type, matched groups of students were taught CSMP or the regular mathematics curriculum by different teachers in the same year. In the spring of each year the students were tested using a standardized mathematics test and a specially designed test to measure thinking skills—MANS, Mathematics Applied to Novel Situations. MANS was developed by researchers at CEMREL; it tests skills in estimation, mental arithmetic, representations of numbers, number patterns and relationships, word problems, and production of multiple answers.
The research sample included grades 2–6 in nine school districts (more than 300 students): six of the districts used the different teacher model, and three used the same teacher model for data collection. Prior to the experiment, each class (both experimental and control) was given the Gates MacGinitie Vocabulary Test; CSMP students scored slightly higher on average than non-CSMP students (between 1 and 2 items correct out of 45).
When the two groups were compared on the MANS test, CSMP students performed consistently and significantly better than students in the regular curriculum on all scales except "producing multiple answers," for which there was no difference. These data support the claim that CSMP students are better able than comparable non-CSMP students to apply the mathematics they have learned to new problem situations. When the two groups were compared on a standardized test of basic mathematics, there were no differences, except for the second grade, in which CSMP students performed better than the control group. More recently, the sample has been expanded to over 30 school districts, and the findings confirm those obtained in the earlier studies.
CSMP is currently being disseminated by the OERI-supported Mid-continent Educational Laboratory (McREL) and the National Diffusion Network. The program has been adopted in more than 125 school districts in 34 states, Washington, D.C., Puerto Rico, and Canada.
Cognitively Guided Instruction
Cognitively Guided Instruction (CGI) is a program developed at the University of Wisconsin to help teachers understand how students think about mathematics and then to use this knowledge in making instructional decisions in classroom activities. Current project research activities are focused on students in kindergarten and the first grade. Teachers in the program are given training in problem types (a taxonomy of addition and subtraction problems graded in difficulty), in children's early cognitions of mathematics, and in how to build on what children do naturally to reach an understanding of symbols and principles.
The teachers use this knowledge and these skills with their existing curriculum materials to assist their students in gaining correct mathematical concepts. In a CGI classroom, teachers work interactively with the whole class, asking all children to participate by giving their solutions to interesting, everyday problems that represent problem types in the addition-subtraction word problem taxonomy developed by program researchers. Teachers begin with the easiest problem types and work towards the more difficult ones. Teachers encourage students to find alternative ways of solving a given problem as a basis for building understanding.
CGI grew out of two principal lines of research. One focused on creating a taxonomy of addition and subtraction word problems and developing a detailed understanding of the development of preschool children's conceptions of addition and subtraction (Carpenter et al., 1989). The second line of research focused on teachers' beliefs about students' abilities, on teaching behaviors, and on how various types of teacher behavior relate to student achievement. One of the basic strategies of CGI was to modify both teachers' attitudes and instructional behavior.
Testing and Evaluation
An evaluation of CGI was conducted with 40 first grade teachers assigned randomly either to an experimental or a control group (Carpenter et al., 1989). The 20 experimental teachers participated in a 4-week summer workshop in which they were provided with information about the CGI approach to teaching and learning. During the workshop, teachers worked on designing their own programs of instruction on the basis of the principles discussed. In addition, all teachers participating in the workshop were given readings on the problem taxonomy and on research studies describing children's solutions to addition and subtraction word problems.
The evaluation was conducted over 1 year. Throughout that school year, project researchers systematically observed and measured classroom teachers' knowledge and beliefs and their students' learning. The results suggest that CGI teachers taught problem solving significantly more, and number facts significantly less, than teachers in non-CGI classes. In addition, CGI teachers encouraged students to use a variety of problem solving strategies, and they listened to the students explain their processes significantly more than did control teachers (Carpenter and Fennema, 1992). Even though CGI teachers spent about half as much time teaching number facts as other teachers, CGI students exceeded non-CGI students in number fact knowledge, problem solving, and reported confidence in their problem-solving abilities.
During the experiment, six detailed case studies were conducted to learn how teachers gained an understanding of their students and how they used this knowledge to build on their students' informal knowledge. In most cases, assessments were an ongoing part of the instruction—the teachers continually asked students to describe their solutions to a given problem and to discuss the process they used to arrive at the solution. The problem taxonomy was particularly useful for organizing the problems and processes used by children in solving each problem type. The taxonomy gave the teachers some direction on what questions to ask and what to listen for in the students' solutions. The children learned through interaction with the teacher and through listening to the solutions presented by other children: this is a common thread with Reciprocal Teaching (Palincsar and Brown, 1984).
According to Becker (1990), by the end of the 1980s there were more than 2.5 million microcomputers in the schools (approximately 1 per classroom), and many new applications, such as hypertext and advanced graphics, are being developed. Computers have been used in education for several purposes. Niemiec et al. (1987) present the following taxonomy:
Computer-managed instruction (CMI): the computer serves a clerical function; it assesses student progress toward curriculum goals, indicates needed instruction, and tracks progress.
Computer-aided drill and practice: the student interacts directly with the computer in learning and recalling factual information. Drill and practice supplement the curriculum by providing students with additional practice on lower order learning skills.
Computer-aided tutorials: the computer works in an interactive mode with students by presenting concepts and providing feedback and direction; the software reinforces correct responses and assists in correcting errors.
Computer-aided problem solving: the computer is used by students as a tool for deriving information and conducting analyses needed to solve a problem.
Testing and Evaluation
Over the years, thousands of articles evaluating or discussing computer-assisted instruction (CAI) and its implications for educational practice have been published. Some of these articles—Colorado (1988); Kulik et al. (1985); Kulik and Kulik (1986, 1987); and Niemiec et al. (1987)—provide reviews of many studies. In their 1987 article, Kulik and Kulik summarize findings from 199 studies of CAI used primarily as a supplement to regular classroom instruction at all levels from elementary school through college. Although these studies include a wide variety of computer-assisted approaches and instructional settings, the overall results indicate that CAI generally increased student performance and decreased learning time. More specifically:
For all levels of schooling taken together, CAI students' performance was 11 percentile points higher than that of students not using CAI, and their instructional time was 32 percent less.
The average performance of students in elementary school using CAI was 18 percentile points higher than the average performance of students in control groups (the average effect size was 0.47 standard deviations).
Low-aptitude students were more favorably affected by computer-delivered instruction than high-aptitude students.
There was no significant difference between tutorial programs and drill and practice programs in terms of their effect on student performance.
Students were more positive towards computers after they had used them as part of the instructional process (effect size of 0.33).
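The percentile figures and effect sizes in this list are two views of the same kind of result. Assuming normally distributed scores (an assumption of this sketch, not a statement about how the reviewers computed their figures), an effect size of d standard deviations places the average CAI student at the percentile given by the normal cumulative distribution function evaluated at d. The function names below are illustrative.

```python
from statistics import NormalDist


def percentile_gain(effect_size: float) -> float:
    """Percentile points above the 50th for the average treated student."""
    return 100.0 * NormalDist().cdf(effect_size) - 50.0


def effect_size_for_gain(points: float) -> float:
    """Inverse: the effect size implied by a given percentile-point gain."""
    return NormalDist().inv_cdf((50.0 + points) / 100.0)


# Elementary school effect size reported in the text.
print(f"0.47 SD -> {percentile_gain(0.47):.1f} percentile points")
# Effect size implied by the 11-point gain reported for all levels combined.
print(f"11 points -> {effect_size_for_gain(11.0):.2f} SD")
```

Under this assumption, 0.47 standard deviations corresponds to roughly 18 percentile points, which matches the figure reported for elementary school students.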
Another study (Levin et al., 1987) compared the cost-effectiveness of CAI with cross-age tutoring (peer tutors from the upper grades), reduced class size, and increased instructional time. The results show that peer tutoring and CAI were equally cost-effective, and both were superior to the other two approaches.
Cooperative learning is an approach that encourages learning as a social process and facilitates the building of learning communities in the schools (E. Cohen, 1986; Johnson and Johnson, 1990; Slavin, 1990). One example of a successful cooperative learning program is Student Team Learning, developed by Slavin and his associates at Johns Hopkins' Center for Social Organization of the Schools, which was supported by OERI and NIE from 1967 to 1985. Student Team Learning, designed primarily for elementary education, includes three programs—Student Teams Achievement Division, Teams-Games-Tournament, and Cooperative Integrated Reading and Composition. All of these programs involve students working together on common topics. Students are scored individually on the basis of the amount of improvement they make from one test or graded exercise to the next. Individual scores are combined to obtain group scores, which are used to determine group rewards.
In Student Teams Achievement Division, students are assigned to four- or five-member teams made up of high-, average-, and low-performing students, males and females, with different ethnic backgrounds. After a weekly topic has been presented by the teacher, students work in their teams, studying worksheets as individuals or in pairs, quizzing one another, and holding group discussions to learn the material. Students understand that they are not finished studying until they are sure that all the team members have mastered the topic. When the teams have completed their preparation, each individual is tested and scored—and teams earn recognition based on the improvement made by all students. Teams-Games-Tournament is the same except that instead of taking quizzes, students are drawn from their Student Teams Achievement Division teams to play games and show their academic mastery of a particular subject matter in tournaments held each week. Students from different teams who have demonstrated comparable performance in the past are pitted against each other in groups of three. In Cooperative Integrated Reading and Composition, students work on basic reading activities, comprehension, and writing in cooperative groups similar to Student Teams Achievement Division teams.
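The scoring scheme described here, individual improvement rolled up into a team score, can be sketched as follows. This is an illustration under stated assumptions rather than the program's actual scoring rules: the source does not say how improvement is converted to points or how individual scores are combined, so the sketch takes the raw difference between consecutive quiz scores (floored at zero) and averages it over the team. All names and scores are hypothetical.

```python
from statistics import mean


def improvement_points(previous: float, current: float) -> float:
    """Assumed rule: points equal the raw gain over the previous quiz, never negative."""
    return max(0.0, current - previous)


def team_score(members: dict[str, tuple[float, float]]) -> float:
    """Assumed rule: the team score is the mean of its members' improvement points."""
    return mean(improvement_points(prev, cur) for prev, cur in members.values())


# Hypothetical mixed-ability teams: name -> (previous quiz, current quiz).
teams = {
    "Team A": {"Ana": (90, 92), "Ben": (70, 80), "Cal": (50, 65), "Dee": (60, 58)},
    "Team B": {"Eli": (85, 85), "Fay": (75, 78), "Gus": (55, 60), "Hal": (40, 52)},
}
scores = {name: team_score(members) for name, members in teams.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f} improvement points")
```

Because points depend on improvement rather than absolute level, a low-performing student who gains 15 points contributes more to the team than a high performer who gains 2, which gives every team member a stake in every other member's progress.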
Research and Development
Student Team Learning was designed to evaluate the effects of heterogeneous groupings, cooperative tasks, and group rewards on student learning. Slavin and his associates have conducted at least 35 studies on activities that are currently incorporated into Student Team Learning (Slavin, 1986). Research began in 1972 and is still continuing. In 1975 Teams-Games-Tournament was certified as effective by OERI's National Diffusion Network for dissemination (see Chapter 3); in 1978 Student Teams Achievement Division was added; and in 1988 Cooperative Integrated Reading and Composition was accepted.
Testing and Evaluation
In 1983 Slavin conducted a "best evidence synthesis" of 42 relatively high-quality field experiments of cooperative learning. For purposes of analysis, the studies were grouped into four categories: group study and group reward for learning (25 studies); group study but no group reward (9 studies); task specialization and group reward for learning (1 study); and task specialization but no group reward (6 studies). All three programs in Student Team Learning fall into the first category.
The results showed that the experimental treatment with the most positive effects was the combination of group study and group reward: 22 of the 25 studies showed students performing significantly better under this condition than under control group conditions. In the other three categories of treatment, only 4 of 15 studies showed statistically significant results for the experimental treatment. Slavin's (1990) conclusion is that "cooperative learning methods that use specific group rewards based on group members' individual learning consistently increase achievement more than control methods."
The work of the Johns Hopkins Center is an excellent example of mixing research, development, evaluation, refinement, and persistence in the pursuit of better education. Its two-decade program of work along a specific line of inquiry and development is not uncommon in the natural sciences, but it is rare in education research and development.
Research provides important insights into the processes involved in school change. School restructuring goes beyond the adoption of innovative curricula or teaching methods: it calls for a fundamental rethinking of the process of schooling. According to Smith and O'Day (1990:2):
In this "new" conception, the school building becomes the basic unit of change and school educators (teachers and principals) are not only the agents but also the initiators, designers, and directors of change efforts. In addition to an emphasis on process, student outcomes are also key in this new approach. The principle underlying many of the second wave themes—from school-site management to teacher professionalism to parental choice—is the notion that if school personnel are held accountable for producing change and meeting outcome objectives, they will expend both their professional knowledge and their creative energies to finding the most effective ways possible to do so.
Two examples of promising field-initiated, school restructuring projects are James Comer's School Development Program and the Outcomes Driven Development Model (ODDM) created by the Johnson City Central School District in New York. Both are aimed at coordinated change in the organization and operation of schools, in the beliefs and behaviors of staff and parents, and in the design and delivery of instruction. Like the projects discussed in the previous section, the evaluations of these programs are limited, and thus we consider them promising but not proven. Neither program is an all-encompassing prescription for a school, but, rather, a restructuring process that establishes administrative mechanisms and a climate for cooperation and change. Specific changes can vary moderately from one school to another under the School Development Program, and they will vary substantially when using ODDM.
School Development Program
The School Development Program was initiated in 1968 in two New Haven elementary schools as a joint effort between the Yale Child Study Center and the New Haven School System. According to Comer (1980), the program's hypothesis is that:
the application of social and behavioral science principles to every aspect of a school program will improve the climate of relationships among all involved and will facilitate significant academic and social growth of students.
Psychiatrists, psychologists, and social workers at the Yale center drew on their knowledge of child development and organizational change to develop and implement the program. By 1988 Comer reported that the program had been adopted throughout the New Haven School System and in 150 schools in 16 other school districts. The program is currently being extended to dozens of urban schools in New Jersey (Schmidt, 1991).
The School Development Program was initially supported by the Ford Foundation as one of several cooperative projects between universities and public school systems. The first two schools were located in low socioeconomic neighborhoods and served a student population that was 98 percent African American. Records indicated that at the beginning of the project these students were lowest in academic achievement in the city, and there were reports of serious attendance and behavior problems.
The basis of the program is to actively involve school administrators, teachers, parents, and mental health specialists in creating a secure and accepting environment for student learning. Most of the underlying concepts for the program are drawn from developmental and social psychology and are used to educate administrators, teachers, and parents in how to assist children in emotional, social, and academic growth.
As described by Comer (1980), the School Development Program was designed to include: (1) a steering committee composed of school administrators, teachers, parents, and representatives from the Yale Child Study Center; (2) a pupil personnel team composed of mental health workers and speech and hearing therapists; and (3) three school committees—curriculum, personnel, and evaluation—each of which included the principal, the social worker from the mental health team, teachers, and elected representatives from the parents' group. The work of these groups was supported in part by workshops, an extended-day program in which teachers learned more about child development and behavior, and a program for parents to participate as teacher aides in the classroom. Small stipends were provided for parents and for teachers in the extended-day program.
Throughout the program's development and implementation, the focus was on involving the parents, encouraging participation among all interested parties, and working to create an understanding of children's emotional, social, and academic needs. The mental health team worked with children and taught their teachers how to respond appropriately to disruptive or antisocial behavior. A special program, the discovery room, included a variety of tools, toys, and material for students to use individually, in pairs, or in groups, and was staffed by an understanding and accepting teacher who helped the children work through their fears, anxieties, and anger. Many of the children in these schools were not emotionally or socially prepared for school: they came from insecure family environments that did not encourage cognitive or emotional growth. As a result, it was necessary for the school to provide an environment for this growth. In addition, to create more stability for the children, they were assigned to the same teacher for 2 years.
Administrators, parents, and teachers worked together on all aspects of the schools' programs. Curriculum changes and the introduction of special innovations, such as the discovery room, were contingent on the approval of parents. According to Comer (1980, 1988), parental participation was critical to program success. The first year of the project was marked by dissension and lack of parental support, but by the second and third years there was growing cooperation and participation. In the second year, teachers developed a set of guidelines for parents to use when observing a class: what to look for, what sorts of questions to ask, etc.
Testing and Evaluation
Data reported by Comer (1988) show marked improvement in the mathematics and reading scores of fourth graders attending the first two New Haven schools; no performance data have been provided on students in other grades. In 1969 the fourth grade students in these schools were functioning slightly below a third grade level; 10 years later they were performing at grade level; and by 1984 they were scoring 2 years above grade level. Moreover, school attendance, at all grade levels, improved to second highest in the city, and student conflict was reported as minimal.
The program was further evaluated by developers in a 1987 study using a randomly selected sample of 306 African American students in grades 3–5. Of the total sample, 176 students were attending seven School Development Program schools around the country, 91 were attending four control schools (comparable schools not using the program), and 39 were in three special schools. The results show significant gains in reading scores, as measured by classroom grades, for students in program schools but not for students in control schools; there were no significant gains in mathematics for either group. Children in program schools were significantly more positive toward their classroom environment after the program was introduced, and absenteeism declined significantly (Comer et al., 1989).
In another study, students in ten predominantly African American schools in Prince George's County (Maryland) using Comer's program showed significantly higher percentile gains on the California Achievement Test than those reported for the district as a whole (Comer, 1988). Between 1985 and 1987 third grade students in Comer's program schools gained 18 percentile points in mathematics, 17 in language, and 9 in reading; throughout the district, students gained 10 percentile points in mathematics, 8 in language, and 5 in reading. At the fourth grade level, students in the program schools gained more than 20 percentile points in mathematics, 12 in language, and 7 in reading, compared with district-wide gains of 11, 7, and 4 percentile points, respectively. The Prince George's County school district is the fifteenth largest in the country and has 105,000 students, 62 percent of whom are African American.
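The size of the program schools' advantage over the district can be read off by simple subtraction. The following is a minimal sketch of that arithmetic, using only the percentile-point gains quoted above; the variable and function names are illustrative and not part of the original study.

```python
# Percentile-point gains, 1985-1987, as quoted in the text (Comer, 1988).
# Each entry is (gain in program schools, gain district-wide).
third_grade = {"mathematics": (18, 10), "language": (17, 8), "reading": (9, 5)}

# Fourth grade mathematics is omitted here because the text reports the
# program-school gain only as "more than 20" (district gain: 11).
fourth_grade = {"language": (12, 7), "reading": (7, 4)}


def gain_difference(gains):
    """Return program gain minus district gain, in percentile points."""
    return {subject: program - district
            for subject, (program, district) in gains.items()}


third_diff = gain_difference(third_grade)
fourth_diff = gain_difference(fourth_grade)

print("Third grade advantage:", third_diff)
print("Fourth grade advantage:", fourth_diff)
```

These differences describe the reported gains only; on their own they do not establish that the program caused them.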
Outcomes Driven Development Model
Development of the Outcomes Driven Development Model (ODDM) began in 1971 in response to the dissatisfaction of administrators and teachers with student performance in New York's Johnson City Central School District. ODDM is a procedural model involving the direct participation of teachers, administrators, and boards of education in restructuring all aspects of a school to achieve a specified set of outcomes.
ODDM provides a set of procedures for aligning all facets of a school or district with a goal of excellence in student achievement and for guiding teachers and administrators in using the best research available for these purposes—research on instructional practices, curriculum design, school climate, change theory, and school management. One of the first steps involves changing the belief systems of teachers and administrators concerning student capabilities. ODDM creates conditions in which all teachers and administrators can participate in decisions that influence the direction of the organization. Restructuring also involves changes in the administrative supports, the classroom supports, and the community supports: ODDM provides a blueprint for making these changes.
All adopters are required to follow an eight-phase training plan over a 2-year period, starting with the development of a leadership team and ending with diffusion; the training is provided by Johnson City Central School District staff and other certified trainers. All adopters are also required to carry out the full ODDM process. Since the first work at Johnson City, the program has been expanded to serve high schools.
Testing and Evaluation
For the developers, the original goal was to have at least 75 percent of the students achieve scores of 6 months above grade level in reading and mathematics by the end of the eighth grade. In 1976 the percentage of eighth grade students at or above this level was 44 percent for reading and 53 percent for mathematics; 7 years later the percentages were 75 percent for reading and 80 percent for mathematics. In Utah, five districts have used ODDM for 3 or more years; four of them have data showing dramatic improvements, using pretest and posttest performance on the Comprehensive Test of Basic Skills. One district showed average increases of two and three grade levels by fifth and sixth grade students; a second showed math score increases of 1.5, 2.0, and 3.0 grade levels by students in the third, fourth, and fifth grades, respectively; a third raised the average reading percentile approximately 5 points and the average math percentile by 10–20 points; and the fourth raised reading and language arts percentiles by approximately 10 points and mathematics percentiles by 15–20 points. A particularly encouraging aspect of these findings is that the effects appear to be cumulative—probably because of the systematic redirection of instruction throughout all the grade levels in a school or district.
ODDM has been actively disseminated through the National Diffusion Network since 1985. Future plans for the project involve additional evaluation and continuing dissemination.
MONITORING THE STATE OF PUBLIC EDUCATION
One important function of education research is to inform policy makers about the course of education. In this regard, the federal government, principally through the National Center for Education Statistics (NCES), has long played a major role through large-scale collection of data on the condition of education in this country. NCES produces data on the demographic, financial, physical, and performance characteristics of U.S. school systems. In this section we discuss some of the most widely used databases and reports prepared by NCES, including statistical compilations describing current conditions, longitudinal studies, national assessments of student performance, and international comparisons.
The principal descriptive databases currently prepared by NCES include the Common Core of Data, the Schools and Staffing Survey, the Integrated Postsecondary Education Data System, the National Postsecondary Student Aid Study, and the Survey of Earned Doctorates Awarded in the United States.
The Common Core of Data (CCD) on state education agencies, local education agencies, and public schools is collected annually. It includes enrollments by grade, the racial and ethnic composition of the enrollments, the percentage of students eligible for the free lunch program, number of handicapped students, number of graduates, number of teachers and other staff, teacher salaries, and an array of other financial data. The Schools and Staffing Survey (SASS) collects more detailed information on school staff and workplace conditions from a sample of public and private schools.
The Integrated Postsecondary Education Data System (IPEDS), which superseded the Higher Education General Information Survey (HEGIS), collects data on types of programs offered; tuition; full- and part-time enrollment; racial and ethnic characteristics of students; age distributions of students; degrees completed by race, ethnicity and gender; full-time equivalent staffing; rank, tenure, and salaries of staff; and various other financial data. The National Postsecondary Student Aid Study (NPSAS) provides the most comprehensive nationwide data on how students and their families pay for postsecondary education. The Survey of Earned Doctorates Awarded in the United States has been conducted each year since the 1920s.
Drawing on these databases, NCES regularly publishes three major compendia of statistics: Condition of Education, Digest of Education Statistics, and Projections of Education Statistics. NCES also publishes numerous annual and biennial reports from the individual databases, and special reports on specific topics. The reported statistics are used by Congress, federal agencies, state and local education agencies, professional associations of educators, researchers, businesses that sell goods and services to schools, the media, and officials in other nations. They are used to project future enrollments at each grade, assess the supply and demand of teachers, profile the teacher work force, portray school climate and working conditions, describe various characteristics of the educational programs, compare public and private schools, examine educational attainments, and monitor expenditures.
Longitudinal studies provide important information about the changes in the behavior and performance of a cohort of students over time. NCES has supported three particularly notable large-scale longitudinal studies of students: the 1972 National Longitudinal Study, the 1980 High School and Beyond Study, and the 1988 National Educational Longitudinal Study. These studies have been "valuable in basic scientific and policy research, and they have occasionally been useful in monitoring trends in educational transitions" (Hauser, 1991:3).
The 1972 National Longitudinal Study (NLS-72) was designed to describe the transition of students from high school to college and then into the work force. The sample was composed of 16,683 high school seniors around the country. The data that were collected included demographic and school characteristics, courses taken, academic achievement, and current status of and future plans for both education and work. Follow-up data were collected in 1973, 1974, 1976, 1979, and 1986 to determine changes in education, work, and marital status. NLS-72 has provided the basis for hundreds of studies. One example is Student Progress in College: NLS-72 Postsecondary Education Transcript Study (Knepper, 1989), which examined patterns of student progress through the postsecondary education system. Although the majority of 1972 high school seniors went on to postsecondary education, most did not follow the traditional pattern of completing college within 4 years. Students who attended 4-year schools entered sooner after high school graduation than those who began other forms of postsecondary education. Generally, men took longer than women to complete each level of education. Transfer students generally took an average of 8 months more to complete a B.A. degree than nontransfer students.
High School and Beyond was designed to provide additional data on the school-to-work transition, particularly in light of concerns raised by increases in school dropout rates and decreases in academic achievement. In 1980 a sample of 28,000 sophomores and 30,000 seniors was drawn from a highly stratified national probability sample of 1,100 secondary schools. The longitudinal design included follow-up surveys of large subsets of the two cohorts in 1982, 1984, and 1986, with another follow-up of the sophomore cohort scheduled for 1992. The High School and Beyond study expanded the data collection categories used in NLS-72 to include parents' attitudes and financial planning, teachers' observations, and students' scores on a variety of achievement tests. The findings from the first follow-up study of sophomores provided important information concerning school dropout problems and the factors that influence student aspirations during the last 2 years of high school (National Center for Education Statistics, 1991c). Comparisons between the results of NLS-72 and High School and Beyond provide insight into the effects of social and cultural changes on the attitudes and performance of high school students.
The 1988 National Educational Longitudinal Study (NELS-88) was designed to examine the major factors contributing to student achievement, persistence in school, and participation in postsecondary education. The study will follow a cohort of 25,000 eighth graders for a decade, and it will collect more data on family, school, and classroom characteristics than NLS-72 and High School and Beyond. Tests of reading, mathematics, science, and social science skills were administered in 1988 and 1990 and will be administered again in 1992. Although most of the questions being addressed by NELS-88 cannot be answered for several years, the first round of data provides some important baseline information concerning the eighth grade students in the sample: 44 percent of African Americans, 39 percent of Hispanics, and 36 percent of Native Americans have two or more background characteristics that put them at risk for school failure. The risk factors are single-parent family, family income less than $15,000 annually, child home alone more than 3 hours per day, parents without a high school diploma, a sibling who has dropped out of school, and limited English proficiency. These data also reveal that eighth graders spend an average of 21.4 hours per week watching television, 5.6 hours doing homework, and 2 hours doing outside reading.
National Assessment of Educational Progress
Development of the National Assessment of Educational Progress (NAEP) was initiated with federal funds in the late 1960s to measure the achievement of elementary and secondary school students in science, mathematics, reading, writing, citizenship, history, geography, social studies, art, music, and literature. NAEP is now conducted in even-numbered years using samples of students in the fourth, eighth, and twelfth grades. Currently, reading and mathematics are assessed every 2 years, writing and science every 4 years, and the other subjects on an irregular basis.
NAEP is both a product and a source of education research and development. The matrix sampling of test items, the analysis of potential items for bias, and the scale scores used in reporting findings are the result of complex psychometric work accomplished over the past three decades by psychologists, statisticians, and education researchers. Without this work, NAEP would have been more time-consuming for students to take, more expensive to administer, and less accurate as a measure of achievement.
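The idea behind matrix sampling can be illustrated with a minimal sketch. In this simplified picture, the item pool is divided into blocks and each student answers only a few blocks, so the whole pool is covered across the sample while no single student takes every item. The block size, the rotation scheme, and all names below are hypothetical and chosen for illustration; they are not NAEP's actual design.

```python
def make_blocks(items, block_size):
    """Split the item pool into consecutive blocks of block_size items."""
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def assign_booklets(students, blocks, blocks_per_booklet):
    """Give each student a booklet of adjacent blocks, rotating the start.

    Rotating the starting block spreads every block across the sample.
    """
    n_blocks = len(blocks)
    booklets = {}
    for index, student in enumerate(students):
        chosen = [blocks[(index + k) % n_blocks]
                  for k in range(blocks_per_booklet)]
        booklets[student] = [item for block in chosen for item in block]
    return booklets


# Hypothetical pool of 12 items in 4 blocks of 3; 8 students, 2 blocks each.
items = [f"item{i:02d}" for i in range(1, 13)]
blocks = make_blocks(items, 3)
students = [f"student{i}" for i in range(1, 9)]
booklets = assign_booklets(students, blocks, 2)

# Each student sees half the pool, yet the sample covers every item.
covered = {item for booklet in booklets.values() for item in booklet}
print(len(booklets["student1"]), "items per student;",
      len(covered), "items covered overall")
```

In this toy setup each student answers 6 of the 12 items, which is the sense in which such a design shortens testing time for individual students.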
NAEP has revealed much about students' knowledge and how their knowledge has changed over time. In 1990, only 74 percent of twelfth graders knew how many hours equal 150 minutes; only 49 percent knew what percent 7 is of 175; and only 15 percent knew how much a $1,000 deposit would be worth after 6 months if it earned interest of 1 percent per month on the initial amount (National Assessment Governing Board, 1991). Trend data indicate that reading and mathematics skills have increased only slightly over the past two decades while science achievement declined and then rose to about the 1970 levels (National Center for Education Statistics, 1991d). One of the lesser known trends is that the reading and mathematics scores of African American and Hispanic students have been rising faster than the national average. The first decade of NAEP data partly provoked the 1980s school reform efforts (National Research Council, 1986); the second decade of data has prompted policy makers to rethink their reform strategies because the effects of those reforms on NAEP scores have been less than they had hoped.
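For reference, the three NAEP items quoted above can be worked through directly. The sketch below is only an illustration of the arithmetic. The function names are invented here, and the interest item is treated as simple interest because the text specifies that interest is earned "on the initial amount."

```python
def minutes_to_hours(minutes):
    """Convert minutes to hours."""
    return minutes / 60


def percent_of(part, whole):
    """Return what percent `part` is of `whole`."""
    return 100 * part / whole


def simple_interest_balance(principal, monthly_rate_percent, months):
    """Balance when interest accrues each month on the initial amount only."""
    interest = principal * monthly_rate_percent * months / 100
    return principal + interest


hours = minutes_to_hours(150)                    # 150 minutes in hours
percent = percent_of(7, 175)                     # 7 as a percent of 175
balance = simple_interest_balance(1000, 1, 6)    # $1,000 at 1% per month

print(hours, percent, balance)  # prints: 2.5 4.0 1060.0
```

The answers are 2.5 hours, 4 percent, and $1,060.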
Despite its prominence and use, NAEP is the target of controversy and misunderstanding. There is debate over which knowledge and skills should be assessed and how they should be measured. Some observers are concerned that students do not try to do their best because the scores have no personal consequences. And many people do not realize that changes in scores, or lack thereof, reflect changes in the student populations and social conditions as well as changes in school instruction. Even if the quality of instruction in schools improved, scores might not change if, for instance, the level of student motivation and family support declined at the same time. Conversely, increases in scores are not necessarily evidence that school instruction has improved.
International studies are helpful in interpreting the educational achievement of U.S. students. NCES has been a major contributor of resources and data in coordinating comparisons among countries. For example, NCES has contributed to the 1982 Second International Mathematics Study, the 1984 International Education Association (IEA) Writing Study, the 1985 Second International Science Study, the 1988 International Assessment of Educational Progress, and the 1991 IEA Reading Literacy Study, and it is currently funding the planning of performance assessments for the upcoming Third International Mathematics and Science Study (TIMSS).
Although each of the studies has some limitations, they have repeatedly found that American students' academic skills lag behind those of students in many other nations. For instance, the Second International Mathematics Study found that U.S. eighth graders were slightly above average in arithmetic calculation, but well below average in problem solving. The results of international studies of mathematics and science achievement were cited prominently in A Nation at Risk (National Commission on Excellence in Education, 1983), a publication that helped precipitate the school reform efforts of the 1980s. These results also inspired one of the National Goals for Education: "By the year 2000 U.S. students will be first in the world in science and mathematics achievement."
Several of the above studies, particularly the more recent ones, have collected some data on characteristics of the schooling provided in each country. These data permit analysis of the relationships between school achievement and such factors as curricula, amount of time spent on school work, teacher development, classroom size, parental involvement, and other factors. Other international studies focus intensively on comparisons of curriculum and teaching practices in different countries. For example, it has been found that American mathematics textbooks often develop ideas by progressing slowly through a hierarchy of learning tasks, while textbooks used in several Asian countries immerse students in more demanding problems from the beginning (Fuson et al., 1988). The recently developed and widely hailed standards for mathematics teaching in this country endorse the latter practice (National Council of Teachers of Mathematics, 1989).
CONGRESSIONAL USE OF EDUCATION RESEARCH
There are four agencies that serve as conduits for education research and development information to Congress. They are the Congressional Research Service (CRS), the General Accounting Office (GAO), the Congressional Budget Office (CBO), and the Office of Technology Assessment (OTA). The committee interviewed 17 of the education specialists in the first three agencies and reviewed the OTA publications that deal with educational matters. Of the specialists, 14 indicated that they often use data or reports from NCES; 13 often use the bibliographic and document retrieval system of OERI's Educational Resources Information Center (ERIC). Only 3 of the specialists said they often use the studies of the OERI laboratories and centers (see Chapter 3) or consult with their staff, but another 9 do so occasionally. Several of the education specialists also mentioned using the results of education research and development from the National Science Foundation (NSF), the Department of Health and Human Services, and the National Institute of Mental Health.
All four congressional agencies have used reports from NCES's High School and Beyond study, and one staffer described it as a "treasure trove." Several staff members mentioned that the federal government is usually the only source of longitudinal studies, and several reported using NAEP data and reports. Staff at one congressional agency found that the work they were planning was already under way at OERI centers and laboratories. And when staff at a congressional agency were working on testing issues they used data from OERI's Center on Assessment and Evaluation and "consulted regularly" with its director. Another agency staffer, with a Ph.D. in economics, described the work of the former Center on Education and Employment as "impressive high quality research."
An examination of OTA's reports covering education reveals use of a broad range of OERI work, including NAEP, High School and Beyond, the National Longitudinal Survey, the Digest of Education Statistics, the Condition of Education, NCES bulletins, and reports from the laboratories and centers. NSF's work is also frequently cited.
Several staff in the congressional agencies noted a marked improvement in NCES's services since 1987. Suggestions for additional change included more timely collection and delivery of data, release of confidential data with appropriate safeguards, more longitudinal studies with broad scope, consultation between NCES and congressional agencies when planning data collection efforts, and easy access to the ongoing work of the centers and laboratories.
One example of an education research study that directly influenced policy is the National Assessment of Vocational Education, completed in 1989 (Wirt et al., 1989). According to the counsel for the House Education and Labor Committee, Congress was much influenced by the study when drafting the Perkins Vocational and Applied Technology Act. This legislation focused federal support for vocational education on districts that have the highest proportion of poor families, that integrate academic and vocational instruction, and that operate effective programs and produce the desired results. The legislation also encouraged an easing of state regulatory burdens and the transfer of more authority over these programs to the local level (Jennings, 1991).
Why did this study have a major impact on Congress? At least four factors seem to have contributed: the study was mandated by Congress to help inform its reauthorization of the Perkins Act; the researchers consulted with Congress on the design of the study; interim findings were released as they became available; and the researchers responded with additional analyses that were requested by Congress. In short, the researchers worked closely with Congress to make the study meet its needs.
The programs discussed in this chapter are examples of how research has (1) expanded fundamental understanding of child development, learning, and teaching; (2) pointed the way to the discovery of effective elements of curriculum, instruction, and school organization; (3) provided a basis for evaluating worthwhile innovations; and (4) monitored the progress of reform efforts, assessing whether the reforms have been implemented as planned and are having the intended effects.
Findings from basic research have significantly broadened knowledge concerning the underlying processes of acquiring, organizing, storing, and retrieving information. Moreover, researchers studying early cognitive development have provided important insights into preschool children's understanding of mathematical and language concepts, thus giving developers of educational programs important guidelines for enhancing instruction. Two of the programs, Reciprocal Teaching and Cognitively Guided Instruction, were designed as basic research on theories developed by cognitive scientists, but they have also led to some practical guides for classroom instruction. Several of the other programs also drew heavily on one or more of the concepts developed by cognitive scientists. The focus in this chapter has been on the contributions of cognitive science to learning and teaching, but many other disciplines have made important contributions to the process of education: some of this research is at a detailed physiological level; other research concerns the functioning of groups and organizations.
Our review shows that the effects of research on educational practice are for the most part indirect and slow. The programs examined in this chapter range in development time from 4 to more than 20 years and are based on decades of research work. Some of the programs are heavily grounded in basic research; all of them have made use of principles drawn from basic research. The three most highly developed programs—Reading Recovery, Comprehensive School Mathematics Program, and Student Team Learning—have had the advantages of continuous support and the work of dedicated researchers since the initial stages of research, and all have been under development for longer than 20 years.
One important shortcoming of most of these innovative programs is the lack of follow-up assessment to determine if the gains measured at the end of the treatment are maintained over time. Another shortcoming is the lack of evaluations at dissemination sites where the programs are no longer under the direct control of the developers. Once a program is disseminated, evaluation activities are usually reduced or discontinued because of logistical difficulties and the lack of funding. A third shortcoming of most of the programs is the limited number of adoptions—without adoptions, programs cannot have impact.
Both the laboratories and centers that are supported by OERI and individual scholars have been responsible for producing promising research and innovative programs. The Learning Research and Development Center and the Wisconsin Center for Educational Research have been leaders in research on cognitive science and its contributions to education; the Center for the Study of Reading has been a leader in reading research (and produced Becoming a Nation of Readers); the Central Midwestern Regional Laboratory designed the Comprehensive School Mathematics Program, and the Midcontinent Regional Educational Laboratory disseminated it; and the Johns Hopkins' centers have developed and expanded Student Team Learning.
Field-initiated research and development has also been a rich source of ideas and has provided a wide range of promising programs for schools. Most of the innovative programs we reviewed were built on a research base accumulated largely from the field-initiated work of individual social and behavioral scientists. In addition, many of the innovative programs were developed by individuals working on their own initiative. For example, the use of computers in the schools was introduced and tested through numerous field-initiated efforts; Reading Recovery is based on the research of Marie Clay, who then proceeded to develop this powerful intervention; IMPACT was a field-initiated effort aimed at integrating instruction in thinking and reasoning into content courses; and the developers of both the School Development Program and the Outcomes Driven Development Model combined prior basic research and their own practical experience to create comprehensive approaches to school restructuring.
Research has also informed policy makers and the public about the status of the U.S. education system, its problems, and the progress toward reaching the nation's goals in education. The major activities of the National Center for Education Statistics and the data it collects and reports are invaluable to congressional agency staff and to social science researchers in general.
Our review demonstrates that successful education research and development requires a sustained investment of time and money for research, development, and dissemination. We conclude that no one mechanism for the support of research should dominate federal grant-making policy. A vigorous program for support of field-initiated research is as important as the support of laboratories, centers, and other such institutions. Furthermore, no one discipline should be given priority. Advances in education have been built on research in the cognitive sciences, psychology, sociology, anthropology, organizational behavior, and clinical work in and outside of classrooms.