have negative consequences for some students even while serving important social or educational policy purposes. Some might be willing to accept, for example, that a strict rule linking promotion to a certain test score will harm rather than help some students, if that policy leads to increased public confidence and support for the schools. The committee takes no position on the wisdom of such a trade-off, but it is our view that policymakers should fully understand what is at stake and who is most likely to be harmed.
The Congress also asked the National Academy of Sciences to consider whether "existing and new tests adequately assess student reading and mathematics comprehension in the form most likely to yield accurate information regarding student achievement of reading and mathematics skills." This could refer to a wide range of issues, including, for example, the balance of multiple-choice and constructed-response items, the use of student portfolios, the length and timing of the test, the availability of calculators or manipulatives, and the language of administration. However, in considering test form, the committee has chosen to focus on the needs of English-language learners and students with disabilities, in part because these students may be particularly vulnerable to the negative consequences of large-scale assessments. (In the literature, English-language learners have been known as "limited-English-proficient students." We adopt the current nomenclature in referring to this group.) We consider, for these students, in what form and manner a test is most likely to measure accurately a student's achievement of reading and mathematics skills.
Two policy objectives are key for these special populations. One is to increase their participation in large-scale assessments, so that school systems can be held accountable for their educational progress. The other is to test each such student in a manner that accommodates a disability or limited English proficiency to the extent that either is unrelated to the subject matter being tested, while still maintaining the validity and comparability of test results among all students. These objectives are in tension and thus present serious technical and operational challenges to test developers and users.
The remainder of Part I provides a broad review of the background and context of large-scale standardized achievement testing with high stakes for individual students. Chapter 2 reviews the policy context and