On the practical side, all existing alignment procedures are based on judgment. Educators look at assessments and standards and try to decide whether a given task (or set of tasks) seems to demand the knowledge and skills described in a given standard or set of standards. Such judgments are difficult and time-consuming to make, and different approaches to the process yield different results. Thus, while alignment is widely regarded as essential in a standards-based system, few are satisfied with current means of measuring it.

A number of researchers have developed practical procedures for judging alignment. Although they differ in specifics, each of the procedures restricts the scope of the comparison by focusing on a small number of key dimensions, and each provides operational definitions and training to improve the reliability of judgments by raters. Overall, many researchers who study alignment have concluded that the state tests they examined were less challenging and narrower in content than the corresponding standards. In a paper prepared for the committee, Rothman (2003) summarizes and analyzes six recent alignment studies: Norman L. Webb’s studies of alignment between standards and tests in mathematics and science (Webb, 1997, 1999, 2001); Karen K. Wixson’s studies of alignment between standards and tests in elementary reading (Wixson, Fisk, Dutro, and McDaniel, 2002); Andrew C. Porter’s tools for measuring the content of standards, tests, and instructional materials (Porter, 2002); Achieve’s studies of alignment of standards and tests (www.achieve.org); the Buros Center for Testing’s study of alignment between commercially available tests and state standards (Impara, 2001; Plake, Buckendahl, and Impara, 2004); and studies of standards, textbooks, and textbook tests by the American Association for the Advancement of Science’s Project 2061 (2002).

Rothman concludes (p. 16), “Although the six methods differ widely in their criteria for alignment and the procedures used to gauge alignment, they share the conclusion that, with a few exceptions, standards and tests are generally not well aligned. This conclusion contrasts with the results from studies by states and publishers, which typically show a higher degree of alignment.”

Further research on alignment is clearly needed. Determining the key dimensions that characterize alignment and examining the validity of methods that are used to set standards for alignment are two issues that should be given high priority by states. Practical procedures need to be developed to improve the reliability of ratings and to reduce the time burden associated with alignment studies. However, these shortcomings should not deter states from making immediate and concerted efforts to bring assessments in line with standards.

We note here that the creation of an assessment system may create additional challenges for alignment studies, although a systems approach could improve the overall alignment between standards and assessments. The designers of a science assessment system select the tests and tasks that constitute the system to align collectively with the breadth and depth of state science content standards, to address program monitoring and evaluation needs, and to provide evidence of stu-

