are aggregated to summary scores, and other key elements such as the performance levels.16 In the present context, the analysis would need to focus on the levels set by the state to define when a student is “English proficient.” This approach would compare the performance levels in terms of what students are expected to know and be able to do in order to be considered “English proficient,” to evaluate the extent to which the states require similar skills. The approach might compare the performance levels with one another across the states. Alternatively, it might involve establishing an a priori definition of English proficiency and evaluating each state’s performance levels in relation to this definition. The a priori definition might be determined by the DoEd or by an independent expert panel.

To date, no qualitative crosswalk studies or statistical linking studies have been conducted for any of the ELP assessments we reviewed. “Bridging” studies have been done that predicted performance on the ACCESS from performance on other ELP tests (the studies by Kenyon, 2006a-2006d, mentioned earlier), but these studies were restricted to the kinds of assessments in place prior to NCLB (e.g., IPT, LAS, LPTS, and MAC II). It is important to point out that this situation is not unique to the ELP tests. The content standards, tests, and performance standards that states use for other aspects of NCLB (e.g., the reading and mathematics achievement tests) also vary from state to state, and scores are not comparable across states. Furthermore, it is important to note that the ELP tests were not designed from the outset to yield comparable results across states. The development effort would likely have taken a much different focus had cross-state comparability been the original intent. It is always difficult to attach a new use to test results when the test has not been designed from the outset for that purpose.

CONCLUSION 3-1 Although the English language proficiency assessments that we reviewed share common features and many states use the same test, the level of performance that defines when a student is considered to be “English proficient” is set by each state. There is no empirical evidence that has been collected to evaluate the comparability of these levels across the states.

In closing, however, we point out that results from the ELP test are not the sole basis for decisions to classify ELL students. Even if the ELP tests were linked and their scores placed on the same scale, there would still be differences among the states in their procedures and criteria for classifying students. We take up these issues in the next chapter.


16 Crosswalk analyses are sometimes used for alignment studies to evaluate the extent to which test items are aligned with content standards (see, e.g., [December 2010]). Crosswalk analyses have also been conducted in a variety of other settings (see, e.g., [December 2010]).

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.