of results from accommodated assessments be evaluated. For this reason, the committee examined the available documentation of the constructs to be assessed and the validity evidence laid out for NAEP assessments.
The committee concludes that the validation argument for NAEP in general is not as well articulated as it should be. NAEP officials have not explicitly described the kinds of inferences they believe their data should support, and we found insufficient evidence to support the validity of inferences made from accommodated NAEP scores. While arguments in support of the validity of accommodated administrations of NAEP are discussed in some NAEP materials, more extensive and systematic investigation of the validity of inferences made from these scores is needed. At the same time, as has been noted, existing research does not provide definitive evidence about which procedures will, in general, produce the most valid estimates of performance for students with disabilities and English language learners.
The committee presents a model for evaluating the validity of inferences made from accommodated assessments, based in part on the evidence-centered design approach that has been developed by Hansen, Mislevy, and Steinberg (Hansen and Steinberg, 2004; Hansen et al., 2003; see also Mislevy et al., 2003). This model offers a means of disentangling the potential explanations for observed performance on an assessment and using this analysis to discern the effects of accommodations on the validity of inferences to be based on the observed performance. This approach provides a first step in laying out validity arguments to be investigated through empirical research.
We make three recommendations regarding validity research on accommodations. Although these recommendations are specific to NAEP, we strongly urge the sponsors of state and other large-scale assessment programs to consider them as well.
Recommendation 6-1: NAEP officials should identify the inferences that they intend to be made from NAEP assessment results and clearly articulate the validation arguments in support of those inferences.
Recommendation 6-2: NAEP officials should embark on a research agenda that is guided by the claims and counterclaims for intended uses of results in the validation argument they have articulated. This research should apply a variety of approaches and types of evidence, such as analyses of test content, test-takers’ cognitive processes, criterion-related evidence, and other studies deemed appropriate.
Recommendation 6-3: NAEP officials should conduct empirical research to specifically evaluate the extent to which the validation argument that underlies each NAEP assessment and the inferences the assessment was designed to support are affected by the use of particular accommodations.