Suggested Citation:"3 The Development of a Guide for Evaluating Instructional Materials." National Research Council. 1999. Selecting Instructional Materials: A Guide for K-12 Science. Washington, DC: The National Academies Press. doi: 10.17226/9607.

3 The Development of a Guide for Evaluating Instructional Materials

This chapter documents the general principles and rationale for the Committee's decisions and describes the process used to develop an evaluation tool and a guide for the tool's use. The process was designed as an investigation that began with a review of the evaluation tools developed by others (see Chapter 2). The Committee then established a set of working principles to use in designing its prototype evaluation tool. Potential users field tested the prototype to provide information to guide the Committee in making revisions. The chart below outlines the process.

Developmental Milestones	Desired Results
Examine existing review tools.	Determine need for and attributes of a tool for school district use.
Practice using various tools.	Create a common base of review experience.
Establish general principles.	Develop basis for designing the tool.
Design a prototype review tool.	Implement desirable attributes identified to date.
Use the prototype to review the materials to be used in first field tests.	Test prototype usability and generate review data to compare to field test results.
Field test at three sites, keeping type of participants, tool, directions, and quality of facilitation as constant as possible.	Gather information to be used in revising the tool.
Revise the tool and draft a guide of sequential steps and recommended processes.	Determine what information and resources would be most helpful to facilitators of the review, selection, and approval processes.
Field test at three sites with diverse needs and participants, varying the training approaches.	Gather data to be used in revising the guide and revise the tool as needed.
Conduct two focus groups.	Complete both the tool and the guide.

THE COMMITTEE'S PRELIMINARY REVIEW OF MATERIALS

The Committee reviewed a selection of science instructional materials in order to reach consensus on the challenges of developing a tool for evaluation and establish a framework for discourse. This exercise provided the context for designing early versions of the evaluation tool for use in initial field tests. The varied professional experiences of the Committee members (science teachers, scientists, science supervisors, and science curriculum designers) provided a rich mix of ideas. The discussion focused on the best way to obtain data on what students would learn in a classroom where the teacher uses the instructional material in question.

Committee members examined a range of life sciences materials. They discussed how to focus reviewer attention on the alignment of the material with content standards and on how well the material would support student learning of that content. For example, simply checking off whether a particular standard is "covered" does not provide useful information for making judgments about the likelihood of students learning the science content embodied in that standard. Judging the quality of the instructional design needs to be tied to the question of what students are likely to learn if the particular materials are used. It became clear that to obtain informative evaluations, reviewers must first identify the set of science education standards for which the instructional material is to be examined and then evaluate it standard by standard. In addition, it would be important to obtain information on the extent of professional development required to achieve effective teaching with the materials and the cost of this teacher education process.

The Next Step: Designing a Prototype Evaluation Tool

After its study of current selection practices, an investigation of other efforts to develop evaluation tools, and some practical experience in carrying out evaluations, the Committee designed its process for developing and testing an evaluation tool. It formulated a shared set of principles on which the tool would be based, including a goal of fulfilling needs not met by other organizations' efforts. The Committee then constructed a prototype tool and subjected it to an iterative process that cycled experiences from field tests and focus groups back to the Committee to inform the modifications made in subsequent drafts.

GENERAL PRINCIPLES

The Committee established the following general principles as the basis for its design of a prototype evaluation tool.

1. The evaluation tool should fulfill needs not met by other instruments. The Committee identified unmet needs from its analysis of the review tools available for instructional materials. We found, for example, that the National Science Foundation's (NSF) Framework for Review is designed for materials that cover programs of a year or more of classroom work, and that it addresses only briefly the question of whether the materials under review are likely to lead to student learning and understanding (NSF, 1997). The latter question was therefore selected for emphasis in our prototype tool. The National Science Resources Center's (NSRC) Evaluation Criteria for Science Curriculum Materials (NSRC, 1998) does not ask reviewers to evaluate materials against specific Standards or Benchmarks, which the Committee deems necessary. The Project 2061 review tools require highly trained evaluators and weeks of effort (Roseman et al., 1997), and they are not feasible for many local school districts with limited time, funds, and expertise. Moreover, none of these tools articulates a process that encompasses both evaluation and selection.

2. The evaluation tool should assume that a set of standards and a curriculum program or framework will inform the work of evaluators in appraising the effectiveness of instructional materials. Evaluation of science instructional materials is a formidable task. A set of standards and a curriculum program or framework documents the school district's expectations for science education and serves as an important reference for the evaluation. Moreover, the existence of such policies implies an established community support for the science education program, which in turn can promote acceptance of recommended instructional materials.

Since education in the United States is controlled at the local level, in many instances evaluators will need to use their local or state standards rather than the Standards or Benchmarks. The Committee at first considered producing a tool that encouraged selection of material aligned with national standards. However, it realized that a tool that is applicable to local standards would be more widely used and would foster the understanding of standards and encourage their use. The Committee therefore resolved to make a flexible tool that could be used with any standards and in many situations, including the review of a whole instructional program, a series of units, or individual units of instruction.

3. An evaluation process should require reviewers to provide evidence to support their judgments about the potential of the instructional materials to result in student learning. Other review tools designed for use in limited time periods commonly use a checklist of items for consideration, a numerical scale, and weighted averages of the numerical evaluations. Use of such tools can result in a superficial evaluation of a set of materials that may identify the content standards covered, but fail to indicate whether the coverage will help teachers foster student learning and understanding. The Committee concluded that a rigorous evaluation process must continually challenge reviewers to identify evidence of the materials' potential effectiveness for this important purpose.

4. Evaluators will more likely provide critical and well-thought-out judgments if they are asked to make a narrative response to evaluation questions or criteria, rather than make selections on a checklist. When asked to construct a narrative response, an evaluator has to develop a cogent and supportable statement. This requires more careful thought than simply checking items on a list. By their very nature, narrative responses help build understanding on the part of an evaluator and can, therefore, serve as professional development. In addition, narrative responses give evaluators more latitude to assess materials in the context of local goals and needs and allow the evaluators (teachers and scientists alike) to contribute their own knowledge and experience to the task. The Committee concluded that the tool should require evaluators to provide their professional judgment as narrative responses and thereby encourage a critical analysis of the materials.

5. An effective evaluation process must include one or more scientists on the review teams. Published science instructional materials are not always scientifically sound or up to date. Moreover, some materials do not consistently reflect an understanding of what is and what is not important in a particular scientific discipline. The Committee found, in its examination of instructional materials, many cases where materials contained detailed information of little relevance, extensive unnecessary vocabulary, and only cursory treatment of the essential concepts. Scientists on the review team are helpful in judging the accuracy of the science presented in the material and the importance of the information for understanding essential concepts.

6. An evaluation instrument needs to serve diverse communities, each one of which has its own needs. Since an evaluation instrument for instructional materials will be used by different groups for a variety of purposes, no single model can be assumed. In most cases, school district evaluation groups will use it; however, individual schools and statewide evaluation groups will also use the tool. These evaluation groups will have varying resources, and the students being taught will differ with respect to language proficiencies, economic status, abilities and disabilities, and home support resources. Therefore, the Committee resolved to design a tool that is adaptable.

7. Tension exists between the need for well-informed, in-depth analyses of instructional materials and the real limitations of time and other resources. The National Science Teachers Association surveyed some 10% of its members just before the release of the Standards, in January 1996, to ascertain their perceptions of the barriers to implementation of these national standards (Wheeler, 1999a). Two major impediments were identified: lack of time and lack of other resources. The Committee resolved to develop a tool that recognizes the real limitations faced by the evaluators of instructional materials.

8. Many evaluators (including teachers, administrators, parents, and scientists) using the tool will be unfamiliar with current research on learning. Curriculum decisions are not always informed by research on learning, but rather by what feels comfortable to teachers, what seems to work, or what is expected (Orpwood, 1998). Once teachers have completed their formal education and begun to teach in the classroom, access to research publications and the time to review them are a challenge. Most scientists also lack the time and interest to delve deeply into education research. In addition, the typical professional development workshops for teachers rarely devote time to in-depth study of current research on learning (Loucks-Horsley et al., 1996). Therefore, the evaluation coordinator should be strongly encouraged to provide references and resources for research on learning, including the Standards (NRC, 1996), Benchmarks (AAAS, 1993), and more recent studies (NRC, 1999a, b).

9. It is more important to evaluate materials in depth against a few relevant standards than superficially against all standards. The pressures of limited time and funds can drive an evaluation team to inspect instructional materials superficially against all relevant standards. The Committee concluded that if time and funds are limited, it is preferable for the team to select a small number of high-priority standards for an in-depth examination.

10. The review and selection processes should be closely connected even when reviewers are not members of the selection committee. In some school districts, one team evaluates instructional materials and reports to another group that is responsible for final approval and selection. In others, one team is responsible for both evaluating and selecting instructional materials. Considerations such as cost, the local district's ability to refurbish materials, and political acceptability (e.g., attitudes about teaching evolution) may play a role in the final selections. The Committee concluded that in all instances it is important that final selections be based primarily on a standards-based review. It is therefore important that one or more of the members of the evaluation team be on the selection committee.

PROTOTYPE TOOL AND FIRST ROUND OF FIELD TESTS

The Committee's initial prototype tool was designed to include the following characteristics:

reliance on the professional judgment of reviewers;
substantiation of review ratings by cited evidence;
ability to be completed in a reasonable amount of time;
focus on the extent to which the instructional materials matched a standard; and
consideration of scientific inquiry as content and as a way of teaching and learning.

The Committee members tested the prototype themselves. To begin, they participated in a preliminary review of sample materials and compared their results with one another. After scanning materials on middle school environmental science from seven publishers, they chose three that represented various types for closer review. One was an eight-week kit-based unit, another was a section from a textbook, and the third was chosen because the Committee agreed that it was likely to get a low rating because of inadequate and inaccurate content coverage and an absence of attention to scientific inquiry. Each Committee member used the first draft of the prototype tool to review one of these instructional materials. The results were useful in giving each member the experience of trying a standards-based review and a context with which to assess the results of the field tests.

First Round of Field Tests

The goal of the first field test was to investigate the levels of expertise of typical reviewers and their reactions to the prototype tool. The prototype tool was used by three sets of teachers and program administrators interested in instructional materials review. Each set reviewed the same middle school environmental science materials considered by the Committee. One test involved leaders from four states cooperating in a rural systemic initiative supported by the NSF. The second test included members of a statewide professional development program for science. The third test was conducted by school district leaders in a state-led science education program. The tests took place in different parts of the country.

No training was provided for any of these field tests, and the Committee's facilitator (the study director), who conducted the tests, did not coach the reviewers. The reviewers used standards of their choice, and both local and national standards were used.

In general, the reviews from the field were less critical than those of the Committee members. In particular, the materials that were included in the field test sample because of their obvious inadequacies were deemed mediocre, rather than poor or unacceptable. The Committee members had registered concern about the lack of attention to scientific inquiry in the reviewed textbook, but inquiry was largely ignored in the field reviews. In almost half of the field reviews, it was unclear whether the reviewer had used a standard as the basis for the review in spite of written instructions to do so. Thus, the reviewers seemed to misunderstand the main focus of the review tool. When a reviewer failed to cite standards, it was unclear whether the reason was frustration with the tool, a lack of knowledge of the standards, or some other reason.

Some reviews were insightful while others were shallow. The most perceptive were produced by those individuals with a high level of classroom experience and a deep knowledge of standards.

The quality of evidence presented by reviewers to back up their judgments was very uneven. The Committee members had not included much prompting or structure in the prototype tool, in the hope that the field reviewers would apply their personal expertise to construct compelling evidence-based arguments.

Most of the field reviewers made positive comments about their participation in the review. They indicated that they considered the process to be a professional growth experience and showed by their hard work and attention that they found the endeavor worthwhile. A more detailed analysis of the results of the first field test and the Committee's follow-up decisions about the next draft of the tool are summarized below.

Committee's Analysis of and Response to the First Round of Field Tests

The time needed to review one unit of instructional materials with the prototype tool was about 3-4 hours. Reviewers indicated that the time requirement was too long to meet their needs at the district level in a realistic way.

Committee's response: Further streamlining of the review tool.

Many reviewers did not base their review on one or more standards, in spite of explicit instructions to do so.

Committee's response: Revision of the format and editing to make the use of standards unavoidable. Introduce training as a preliminary to the use of the tool.

All three sets of reviewers rejected a review criterion that required publishers to supply data on the materials' effectiveness based on field tests or other research findings. They considered the criterion unlikely to produce useful information.

Committee's response: The Committee decided to eliminate this criterion in the interest of keeping the process as streamlined as possible. However, this decision was not made easily, since the Committee members were also interested in emphasizing that evidence of effectiveness should be required of the developers and publishers of instructional materials.

The ratings did not strongly match those predicted by the Committee. The field reviewers were more likely to identify strengths than weaknesses. They recorded recommendations that were uncritical and unlikely to be of much help in sorting and selecting from a number of choices.

Committee's response: Clarify the criteria and add specific recommendations for reviewer training.

The degree to which the instructional materials involved students in scientific inquiry did not appear to be an important criterion for the reviewers, although it is an essential standard in the Standards.

Committee's response: Strengthen this important criterion and add recommendations for reviewer training.

Consideration of the cost of the materials, an element in the prototype tool, seemed to confuse reviewers, required extensive research, and did not contribute to the evaluation.

Committee's response: This consideration was moved to a new selection phase of the tool. It was not deleted because it will be an important final consideration.

In one of the three field-test groups, most of the reviewers had previous experience in instructional materials review and strongly suggested the use of a rubric for each criterion. In the education profession a rubric is a scale that includes a detailed definition of each rating level for each criterion.

Committee's response: Refrain from recommending the use of rubrics in order to remain flexible in meeting local needs and to encourage and honor the individual judgments of reviewers.

The experiences with all three review groups indicated that training of the evaluators would be required in order to assure a reference to standards as an integral part of the process, to include the consideration of inquiry-based learning as an important feature of instructional materials, and to encourage the exercise of individual, independent judgment.

Committee's response: Prepare a training guide to accompany the tool.

The field test exposed the separate procedures used for evaluation and selection in some school districts. The prototype tool blurred the different considerations and people involved in these two processes.

Committee's response: Redesign the tool so that evaluation and selection can be carried out by either the same group or two different groups.

Discussions with the field-test groups revealed that although there had been considerable work by others to develop evaluation tools for science instructional materials, no one had undertaken the task of guiding states and districts for the purpose of carrying out a standards-based selection process for these materials.

Committee's response: Include in the training guide advice on organizing and carrying out evaluation and selection, designed for the school district facilitators of these processes.

SECOND ROUND OF FIELD TESTS USING THE MODIFIED TOOL

The Committee modified the prototype tool according to the elements listed above and added a guide that included the requisite training for reviewers. The modified tool was used in a second round of field tests. This included discussion meetings and review activities at three new sites, described below. During this round of field tests, the groups could choose the materials to be reviewed and the Committee's facilitator (the study director) experimented with training methods. Therefore, each field test in this round had unique features. This testing provided an opportunity to learn more about how the tool could be used in a variety of situations, allowed an evaluation of the addition of training to the procedure, and informed subsequent revisions to the guide.

Site One

The first test site of the second round was based on a one-day meeting of four groups, each consisting of two teachers and two scientists. A district science coordinator convened the groups to consider whether an elementary science unit currently in use in the district was aligned with state science education standards. The teachers in the group had taught the unit and were therefore familiar with the materials. The scientists also knew the materials because they had assisted the teachers one-on-one in understanding the materials and using them in the classroom.

The reviewers spent much of the time discussing the standards that they had been assigned to consider. All four groups emphasized that the standards against which the materials were to be judged were overly broad. Three of the four groups completed a review of the materials under consideration, but the time available was insufficient to thoroughly document reviewer work.

Site Two

At the second test site, a State Systemic Initiative coordinator brought together some 30 reviewers (including science and math educators, scientists, and mathematicians) from across the state for a day and a half to learn how to review and select science and math instructional materials.

Before the review began, the facilitator opened the training by discussing examples of review comments that cited evidence either effectively or ineffectively. Subsequently the reviewers were asked to generate their own definitions for the review criteria specified by the tool. Reaching consensus on these definitions took nearly two hours.

The reviewers then divided into 8 groups and, using 18 standards, conducted a mock review of one middle school science unit that was in use in a number of school districts in the state. Because the unit did not meet the two content standards, several reviewers expressed concern that the standards-based review would undermine the use of the unit that had been chosen by their school district. Expressing satisfaction with the process as a whole, the reviewers said they viewed the process as one they could use to select instructional materials, despite concerns about the time involved.

Site Three

At this site, a group of nine research scientists and four teachers reviewed a popular advanced placement biology textbook. The outreach director of a university program served as facilitator. No reviewer training was provided. The standards, review documents, and instructional materials were mailed to each participant in advance of the meeting. Each reviewer was instructed to spend no more than five hours reviewing the high school text.

The group discussion revealed some confusion about the purpose of the task. One reviewer asked, "Are we reviewing the materials, the instrument, or the standard?" Over half of the submitted review forms did not mention the standard used. Interestingly, all the scientists judged the materials as having completely met the standards, while all the teachers stated that the materials met the standards incompletely.

Committee's Response to the Second Round of Field Tests

As a result of the second round of field tests, the guide was modified and amplified as described in the following paragraphs. Experience with the diverse review situations in which the tool was used suggested that the guide should include straightforward practical advice regarding its use.

The guide was modified to make the references to standards more prominent and more frequent. For example, Form 1 directs reviewers and the facilitator to identify standards that should be top priority. Form 2 requires the full text of the standard to be entered. These simple processes help ensure that a reviewer attends to the standards when documenting a review. Furthermore, the summary judgment of the reviewer must be expressed as an opinion about the extent to which students are likely to achieve the standard. Toward the same end, each step of the suggested review process reiterates the overall goal of increasing student achievement by applying the standards.

Scientists participated at each site during the second round of field testing. In each case they contributed a point of view that complemented that of the educators, which underscored the importance of scientists to a thorough evaluation. The most significant contribution of scientists is attention to the accuracy, completeness, and presentation of the content. Participating scientists described their experiences as valuable and enlightening.

The second round of field testing demonstrated that reviewer training can improve the quality of the review by leading reviewers to provide more extensive and convincing evidence. For example, at one site the reviewers, before beginning their own process, were shown examples of poor responses, then better responses, and then very good responses. The examples used are included in Chapter 5, "Resources for Training." Training also proved useful in defining the review criteria. Reviewers at one site found that generating definitions of each of the criteria as a group was useful, and the group's reviews were more comprehensive than those of any other field test. A sample agenda for generating these definitions is found in Chapter 5, "Resources for Training."

The training and review process described in the guide is as streamlined as possible and will require at least two days of training, followed by one hour of deliberation and writing for each standard used. Nevertheless, every field test produced some participant objections about the length of the process. The Committee is satisfied that the process presented here has been designed with this concern in mind and cannot be shortened without sacrificing the intent and validity of the review process. The Committee hopes that experience with a standards-based review will convince both the reviewers and the teachers and students who use the materials that a careful review is worth the time invested. Local facilitators of this process are encouraged to develop creative strategies to join forces and share both resources and results to lessen the individual costs for a thorough review.

It is realistic to expect that the guide can be used successfully in a variety of circumstances. The review process described in the guide contains recommendations that have been constructed to highlight only the principles and main tasks of each step. Specifics are left to the professional judgment of the facilitator and reviewers, because nearly every situation will have unique features. Suggestions for some specific situations have been included in "Constraints and Cautions" sections in Chapter 4.

LESSONS LEARNED

The development process described in detail above provided Committee members with experiences and evidence concerning the need for a new kind of review instrument and the impact of myriad local concerns. A summary of the lessons learned may be useful in developing the capacity of the science education community to recognize and use effective instructional materials.

Training is essential if the evaluations are to be valid and useful. Field tests were carried out both with and without prior training. The sophistication and depth of the evaluations carried out after training were significantly improved compared to those obtained when training was omitted. In part this is because the tool asks the evaluators to exercise independent judgment without the guidance of detailed questions and check-off boxes for responses. This approach was not familiar to most evaluators, and they therefore benefited from training, including a group "mock" evaluation, before they began their work. The requirement to exercise independent judgment and provide a narrative explaining the evidence for the judgment was challenging to participants in the field trials. Frequently, there was a request for more specific questions and accompanying boxes for checking off responses. The Committee responded positively to a few of these requests in subsequent versions of the tool; however, the Committee concluded that the challenge to evaluators in the final tool is a useful one for fostering understanding of standards and for developing the capacity to carry out thoughtful evaluations.

As already noted, many teachers are unfamiliar with pertinent modern learning research. Training sessions need to include explication of the most significant aspects of this research. This can be accomplished by reference to the Standards and Benchmarks, supplemented by more recent work, such as How People Learn (NRC, 1999b).

The field trials demonstrated that many evaluation team members are not sufficiently familiar with the applicable standards to carry out the review tasks without training. Moreover, some members of evaluation teams are not inclined to refer consistently to the standards, preferring to make judgments based on their own views of what should be included in the instructional materials. Training must therefore include a description of the applicable standards, the way they were developed, and why it is important to base evaluations on the standards. The goal of this training is to assure that all evaluators accept the applicable standards as the basis for their judgments.

Another lesson learned from the field trials concerns the priorities given to different aspects of the review materials. In the absence of training, some reviewers set no priorities among the several criteria being considered. There was, in some instances, resistance to the idea that the quality of the scientific content and pedagogical approach must take priority over all other criteria (e.g., quality of pictures and diagrams, teacher aids, cost, or applicability to a bilingual school setting). In such cases, the relative quality of the materials became secondary. Apparently, current practice does not always give precedence to these two critical matters.

Participants in the field trials consistently found that the time required to complete the review was too long. This was true even though the Committee was attentive to this issue in the earliest version of the tool, and at each iteration attempted to streamline the process. It was common for the review of an individual material to take between two and four hours, even when both the pertinent grade level and relevant standards were restricted. In an actual evaluation process, for example, six different materials might be under consideration, requiring between 12 and 24 hours of work. To this, the time required for training and for follow-up discussions by the evaluation team must be added. Subsequently, evaluation of another set of materials may be required for a different grade level or a different set of standards. The total time required is a difficult assignment for classroom teachers and working scientists, except perhaps when the task is carried out during vacation time. In that case, compensation should be provided (Tyson-Bernstein, 1988). This is a serious issue because a thorough, thoughtful review with reference to standards is, by its nature, a lengthy process. The Committee considered some strategies to help ameliorate this problem. The most promising strategies included limiting the review materials to materials judged acceptable by the NSRC (NSRC, 1996, 1998) or Project 2061 (AAAS, forthcoming a, b, c); setting aside materials that are plainly inadequate; or selecting a limited number of materials to be reviewed based on information acquired from other states or school districts. However, any such narrowing of the field to be reviewed should be employed with caution. Considering the magnitude of the instructional materials investment and the societal costs of failure to educate students successfully, adequate resources, including time, to accomplish the selection of the best possible instructional materials must be provided. Developing the capacity of the reviewers and paying attention to local standards for student learning are responsibilities that are too important to be evaded.

The field-test teams' comments underscore the diversity of opinion, experience, goals, and standards that exist in the 50 states and the thousands of school systems. Moreover, comments and reactions to the tool were different depending on where in the K-12 years the instructional materials were designed to be used.

OBSERVATIONS

This report should be considered a beginning. The Center for Science, Mathematics, and Engineering Education plans to continue the work begun by this Committee by disseminating this report and encouraging its use. It is expected that wide application will reveal additional desirable modifications to the guide and tool. The Committee envisions that the tool will be regularly revised in response to experience and ongoing learning research.

The Committee recognized an inherent difficulty in trying to determine whether a particular instructional material is "good." The definition of "good" must include an assessment of the match between the instructional material and the applicable standards, learning goals, and pedagogical approaches. The critical question is whether the material will increase the likelihood that students will attain the knowledge defined by the standards and goals. That is, will the material be effective? Here the Committee found itself on uncertain ground, and evaluation teams will have similar experiences. There is no adequate body of research on this topic. There is, of course, a literature that evaluates pedagogical approaches and what children are capable of learning and understanding at different ages (NRC, 1999b). But on the question of the specific attributes of effective materials, little is known.

Conventional analysis of teaching effectiveness is based primarily on student performance on standardized tests. As already described, such tests often fail to adequately assess understanding of scientific concepts and knowledge about specific aspects of the natural world (CPRE, 1996). Moreover, most assessments evaluate the effectiveness of a student's entire learning experience; they do not separate what students learn from instructional materials and the teaching centered on those materials from what they learn through their own activities and experiences and from their parents. There is no substantial body of research that tries to evaluate the effectiveness of particular instructional materials as a separate variable in the total learning experience. The one reasonably well-documented example of such a study evaluates a sixth grade unit on "Matter and Molecules" (Lee, Eichinger, Anderson, Berkheimer, and Blakeslee, 1993). In the absence of a substantial body of research, the use of tools such as the one described in this report will depend to some extent on the experiences that evaluators bring to the review and selection processes. Classroom experience, while informative, cannot, for many reasons, be considered definitive or unbiased. The Committee urges that extensive research on the effectiveness of instructional materials be promoted in the near future.