types for closer review. One was an eight-week kit-based unit, another was a section from a textbook, and the third was chosen because the Committee agreed that its inadequate and inaccurate content coverage and its lack of attention to scientific inquiry would likely earn it a low rating. Each Committee member used the first draft of the prototype tool to review one of these instructional materials. The results were useful in giving each member the experience of trying a standards-based review and a context in which to assess the results of the field tests.
The goal of the first field test was to investigate the levels of expertise of typical reviewers and their reactions to the prototype tool. The prototype tool was used by three sets of teachers and program administrators interested in instructional materials review. Each set reviewed the same middle school environmental science materials considered by the Committee. One test involved leaders from four states cooperating in a rural systemic initiative supported by the NSF. The second test included members of a statewide professional development program for science. The third test was conducted by school district leaders in a state-led science education program. The tests took place in different parts of the country.
No training was provided for any of these field tests, and the Committee's facilitator (the study director), who conducted the tests, did not coach the reviewers. The reviewers chose which standards to use, and both local and national standards were applied.
In general, the reviews from the field were less critical than those of the Committee members. In particular, the materials that were included in the field test sample because of their obvious inadequacies were deemed mediocre, rather than poor or unacceptable. The Committee members had registered concern about the lack of attention to scientific inquiry in the reviewed textbook, but inquiry was largely ignored in the field reviews. In almost half of the field reviews, it was unclear whether the reviewer had used a standard as the basis for the review, in spite of written instructions to do so. Thus, the reviewers seemed to misunderstand the main focus of the review tool. When a reviewer failed to cite standards, it was unclear whether the cause was frustration with the tool, a lack of knowledge of the standards, or something else.
Some reviews were insightful, while others were shallow. The most perceptive were produced by individuals with extensive classroom experience and a deep knowledge of standards.