The education system generally fails to distinguish the requirements of formative, summative, and program evaluation assessments. What is needed is not only greater sophistication in designing assessments to better serve specific purposes, but also coordination within and between the levels of assessment. A well-designed assessment system would allow for the bidirectional flow of information among the levels.
Current large-scale standardized tests used by most states to assess academic achievement fall short in important respects (National Research Council, 2001c). The models of learning and measurement underlying such tests are generally shallow, raising doubts about the quality of the evidence they can provide about student learning or the impact of instructional programs. If SERP is to support student learning and provide reliable measures of program effectiveness, it must undertake research and development on informative, coordinated assessment systems. The program affords a unique opportunity to pursue this work because it will involve projects and individuals who are concerned with the range of assessment purposes. Research and development initiatives on assessment appear in the agenda for all three subjects. This appendix discusses the common elements of those initiatives.
Regardless of their purpose, high-quality assessment instruments depend on the same three components: (1) theories and data about content-based cognition that indicate the knowledge and skills that should be tested, (2) tasks and observations that can provide information on whether the student has mastered the knowledge and skills of interest, and (3) qualitative and quantitative techniques for scoring responses that fairly capture the differences in knowledge and skill among the students being tested. Research and development on each of the three components is needed if assessments are to provide reliable indicators of student achievement. For example, researchers have developed sophisticated models of student cognition in various areas of the curriculum, but in many cases this has