Of the nearly 46 million students enrolled in grades K-12 in U.S. public schools during the 2000-2001 school year, some 11.5 percent or nearly 5.5 million were classified as having some kind of disability (U.S. Department of Education, 2002). In addition, nearly 4.6 million or 9.6 percent were identified as English language learners (National Clearinghouse for English Language Acquisition & Language Instruction Educational Programs, 2004). The educational needs of these students vary considerably, as do the strategies for meeting them that are in place in school districts around the country.
The past several decades have seen a significant increase both in the numbers of such students enrolled in U.S. public schools and in attention to their needs, as well as a corresponding demand for information about their academic progress. Agreement has been growing that educational assessments should, whenever possible, include students with disabilities and English language learners so that data can be collected about their progress in school. Legislation has also made their inclusion mandatory. These two groups of students are often discussed together, but the differences between them have important implications for assessment. Nevertheless, many of the assessment issues that arise for these two groups of students are similar, and we have addressed them together in this report.
To meet the need to include students from these populations, accommodations are increasingly being used in large-scale assessments, both state assessments and the National Assessment of Educational Progress (NAEP). Drawing on the definition provided in the APA/AERA/NCME Standards for Educational and Psychological Testing (American Educational Research Association et al., 1999, p. 101), we define an accommodation as the general term for any action taken in response to a determination that an individual’s disability or lack of English language proficiency requires a departure from established testing protocol. An accommodation may involve a change in the characteristics of specific assessment tasks (e.g., simplified language, native language translation, large font, Braille) or in administrative procedures (e.g., additional time, oral reading of instructions, access to specific equipment). More detailed discussions of accommodations for students with disabilities and English language learners and issues related to their use appear later in the report.
Although the definition is relatively straightforward, identifying students to be included, determining which accommodations are appropriate, and ensuring that scores from accommodated assessments can be interpreted in the same way as scores from regular assessments turn out to be highly complex and problematic issues. To a degree that may surprise those who have not considered the question, the procedures for including both of these groups of students in testing, as well as for providing them with testing accommodations, are far from uniform around the country. Furthermore, research on the effects of various accommodations on performance, as well as on the validity of the inferences made on the basis of scores from accommodated assessments, is inconclusive.
It was in this context that, in 1996, the National Assessment Governing Board (NAGB) and the National Center for Education Statistics (NCES), the groups responsible for developing and implementing policy for NAEP, revised NAEP’s policies for including students with disabilities and English language learners in the assessment. They made the changes, the primary effect of which was to include more students in testing, in recognition of changing regulations regarding the testing of these two groups and because of increased appreciation of the value of testing these students. NAEP’s sponsors were guided by the importance of maintaining the integrity of NAEP data despite these policy changes, as well as by the importance of keeping NAEP’s policies and procedures in accord with those used in other large-scale testing programs administered by states.
In brief, the new policies call for the inclusion of most students with disabilities and most students who have been designated as limited English proficient, and for the exclusion, in general, only of those who cannot meaningfully participate with accommodations approved for NAEP. Under the old policies, far fewer students in these two categories had been included in testing.
We note here that several terms are used to refer to students who are not yet fluent in English, and these may reflect somewhat different understandings of these students and their needs. Although NAEP materials currently use the term LEP (limited English proficient), the committee prefers the more widely used term English language learners, which emphasizes these students’ developing English proficiency rather than their limitations.
Two significant challenges faced NAEP’s sponsors as they revised their policies and procedures. First, the policies and procedures used by states, districts, and schools vary with respect to which students are classified as having disabilities and being English language learners. These variations in policies and procedures affect decisions about (1) who is included in the assessment, (2) who receives accommodations, (3) what accommodations are allowed and provided, and (4) which students’ scores are included in reports. The second major challenge lay in the lack of clear guidance from the available research base regarding the effects of accommodations on test performance. While a considerable body of research exists, findings from these studies are both inconsistent and generally inconclusive (Sireci et al., 2003). The available research is discussed in Chapter 5.
POLICY AND PRACTICE REGARDING INCLUSION AND ACCOMMODATION
The variation in state policies for handling the assessment of students with disabilities and English language learners is particularly relevant for NAEP. NAEP officials identify the sample of students to be included in the assessment at each participating school, but they must rely on school-level staff to make the decisions about which of the selected students can meaningfully participate and which cannot. That is, selected students whom the local education agency has classified as students with disabilities or English language learners may be excluded from NAEP if school-level staff judge that they cannot meaningfully participate or if they require testing accommodations that NAEP does not permit. It is therefore the local education agency that makes the ultimate decisions about which students will participate in NAEP and which accommodations they may be given, using the guidelines provided by NAEP officials combined with their knowledge of the students.
Students are selected to participate in NAEP on the basis of a complex sampling scheme designed to ensure that a nationally representative subset of students is assessed. Variability in state, district, and school policies and procedures for determining which students are considered to have disabilities or to be English language learners, which of these students can meaningfully be assessed, and which accommodations they will receive affects the outcome of the sampling.
This variability has implications for the interpretation of NAEP results. First, local decisions about which students will be included will affect the specific samples that are obtained. Second, the accommodations with which students are provided will affect the conditions under which scores are obtained. As a consequence, a given state’s results are affected by these locally made decisions, which may be based on criteria that vary from school to school within a state. Third, national NAEP results, in which scores are aggregated across states, are also affected by these locally made decisions. Finally, a key objective for NAEP is to characterize the achievement of the school-age population in the United States, yet the extent to which NAEP results are representative of the entire population depends on these locally made decisions.
EFFECTS OF ACCOMMODATIONS ON PERFORMANCE AND ON THE INTERPRETATION OF SCORES
The interpretation of NAEP results is further complicated by the fact that the effects of accommodations on test performance are not well understood. Although considerable research has been conducted, a number of questions remain:
Do commonly used accommodations yield scores that are comparable to those obtained when accommodations are not used? Do they over- or undercorrect for the impediment for which they are designed to compensate?
Do commonly used accommodations alter the construct being tested?
What methods should be used for evaluating the effects of a particular accommodation on the validity of test results?
Research on the effects of accommodations has been conducted in different ways, and some of it has yielded intriguing results. The committee commissioned a critical review of this literature, which is discussed in greater detail in Chapter 5. From that review as well as its own observations, the committee notes that research premises and methodologies have varied, stark differences among researchers remain, and little consensus has emerged.
Many researchers, for example, have focused on comparisons of score gains associated with taking the assessment under standard and accommodated conditions. Many studies use a quasi-experimental design in which the target group (e.g., students with disabilities or English language learners) and the comparison group (e.g., nondisabled students or native English speakers) take an assessment with and without accommodations. If scores increase under the accommodated condition for the target group but not for the comparison group, the accommodation is considered to be working as intended.
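The logic of this design can be sketched as a simple calculation. The Python sketch below is illustrative only: the function names and all scores are hypothetical, invented for this example, and are not drawn from any study reviewed by the committee. It computes the mean score gain under accommodation for each group and then the difference between the two gains.

```python
from statistics import mean


def accommodation_gain(standard_scores, accommodated_scores):
    """Mean score change from the standard to the accommodated condition."""
    return mean(accommodated_scores) - mean(standard_scores)


def differential_gain(target, comparison):
    """Target-group gain minus comparison-group gain.

    Each argument is a dict with 'standard' and 'accommodated' score lists.
    """
    target_gain = accommodation_gain(target["standard"], target["accommodated"])
    comparison_gain = accommodation_gain(
        comparison["standard"], comparison["accommodated"]
    )
    return target_gain - comparison_gain


# Hypothetical scores, invented purely for illustration.
target_group = {"standard": [40, 45, 50], "accommodated": [50, 55, 60]}
comparison_group = {"standard": [60, 65, 70], "accommodated": [60, 66, 69]}

target_gain = accommodation_gain(
    target_group["standard"], target_group["accommodated"]
)
comparison_gain = accommodation_gain(
    comparison_group["standard"], comparison_group["accommodated"]
)
gain_difference = differential_gain(target_group, comparison_group)
```

With these invented numbers the target group gains while the comparison group does not, which is the pattern this design would read as the accommodation working as intended. An actual study would also need a test of statistical significance, which this sketch omits.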
Other researchers (National Research Council, 2002a, pp. 74-75) have challenged the underlying premise of this research design; that is, they do not agree that such results constitute adequate evidence that an accommodation is working as intended. These critics argue that there may be a confound between the construct being evaluated and the accommodation. Performance on the construct may depend on skills other than those the assessment is intended to measure. Accommodations may assist all examinees with these skills and consequently help general education students as well as those with identified special needs. These critics argue for different ways of evaluating accommodations, and the committee agrees that alternative methodologies should be used. This point is addressed in greater detail in Chapters 5 and 6.
These questions about the validity of interpretations of accommodated scores are of considerable importance for NAEP, and they are equally important for state assessment programs. At the program level, all large-scale testing programs must develop policies about which accommodations should be allowed and which should not be allowed, given the content and skills being assessed. Likewise, at the individual level, educators must determine which accommodations are appropriate given an individual’s needs and the content and skills to be assessed. These decisions should be guided by a clear statement about the inferences to be based on the particular test results and by research findings. In our judgment the available research has not yet yielded the guidance needed to make these decisions; goals for this research are addressed in Chapters 5 and 6.
THE COMMITTEE’S APPROACH
The Board on Testing and Assessment of the National Research Council has for some time been concerned about the issues surrounding the inclusion of students with disabilities and English language learners in large-scale assessments and the effects of accommodations on test performance and the interpretation of scores. In November 2001 the board held a workshop on reporting and interpreting test results for students with disabilities and English language learners who receive test accommodations. That workshop, which was designed to investigate the implications of NAEP’s policies regarding the reporting of results for these two groups, made clear that a more comprehensive look at both the variability in inclusion and accommodation policies and the available research into the effects of accommodations was urgently needed (National Research Council, 2002a).
Thus the National Research Council convened the Committee on Participation of English Language Learners and Students with Disabilities in NAEP and Other Large-Scale Assessments to study these issues. The committee was asked to build on the information learned at the November 2001 workshop (National Research Council, 2002a) and other earlier work in this area (e.g., National Research Council, 1997a, 1997b, 1999a, 1999b, 2000a, 2000b, 2002a). The committee had two primary objectives: (1) to identify what is known about how inclusion and accommodation decisions are currently made and (2) to synthesize recent research about the effects of accommodations on academic test performance and the interpretation of scores. The 2001 workshop included discussions and presentations about states’ and NAEP’s policies for making inclusion and participation decisions as well as presentations by several individuals who have conducted extensive research in this area (e.g., Jamal Abedi, Stephen Elliott, Laura Hamilton, John Mazzeo, and Gerald Tindal). The workshop report presented summaries of the studies discussed by these authors. In preparation for the committee’s work, a critical review of the literature, focusing on studies conducted between January 1990 and December 2002, was commissioned to build on and extend the research summaries in the workshop report. The authors of the review were Stephen Sireci, associate professor of education and co-director for the center for educational assessment, Stanley Scarpati, associate professor of special education, and Shuhong Li, graduate student in research and evaluation methods, all with the University of Massachusetts at Amherst. This review and critique of the literature assisted the committee with its review and synthesis of research findings. In meeting both aspects of its charge, the committee draws conclusions and makes recommendations about the implications of this information for NAEP policies and the interpretation of NAEP data, as well as for the policies of state assessment programs and the interpretation of their data.
The committee collected information about these issues in several ways. It held two meetings at which presentations were made by a variety of experts. At the first, which focused on policies and procedures, Martha Thurlow of the National Center on Educational Outcomes and Charlene Rivera of the Center for Equity and Excellence in Education at George Washington University made presentations on state policies regarding students with disabilities and English language learners, respectively. John Olson of the Council of Chief State School Officers (CCSSO) presented data collected by CCSSO on state policies. Arnold Goldstein of NCES and Jim Carlson of NAGB discussed NAEP policies on accommodation and research conducted on the effects of accommodations on NAEP results. Also at that meeting Stephen Sireci presented the literature review he and his colleagues had conducted (see Sireci et al., 2003) and received feedback from the committee, which was used in preparing the final version of the paper.
At a second meeting, the committee focused on relevant research into the validity of accommodated assessment results. Mary Crovo of NAGB made a presentation about constructs assessed on NAEP reading and mathematics assessments. Eric Hansen of Educational Testing Service presented a second paper prepared for the committee on his plan for an “evidence-centered design approach” to determining allowable accommodations. The committee also heard presentations by Wayne Camara of the College Board and Robert Ziomek of ACT, Inc., on studies of the effects of accommodations on other large-scale assessments (the ACT and the SAT) and on the sampling methodology for students with disabilities and English language learners in NAEP. We reviewed materials made available by NAGB regarding inclusion and accommodation policies and procedures and reports on the participation of students with disabilities and English language learners in the assessment. While the report writing process was under way, several new NAEP reports became available, and we note their relevance to some of our recommendations.
This report of the committee’s findings is designed to be of use not only to those who develop policies for NAEP, administer it, or use its results, but also to others interested in the data that large-scale testing can provide about the performance of two groups of students whose educational needs are gaining increased recognition. The committee hopes that this report will be useful for NAGB as it strives to make NAEP more inclusive of students with special needs and to provide results that are more representative of the entire population of school-age children in the United States.
The committee also intends for the report to be useful to states, districts, and schools as they attempt to comply with the terms of the No Child Left Behind Act of 2001. This legislation mandates that states include all students in statewide accountability programs and that they disaggregate assessment results for students with disabilities and English language learners. It holds states accountable for demonstrating that students in these groups are making continuous academic progress. The legislation places considerable demands on state and local testing programs to produce a far greater volume of data than they have previously; because such serious decisions are to be based on test results, the importance of their reliability and validity is greater than ever. Understanding how both inclusion and accommodation decisions are implemented at the local level, as well as the effects of accommodations on test performance, will be key to understanding the meaning of test results for these groups of students.
GUIDE TO THE REPORT
The structure of the report corresponds to the two aspects of the committee’s charge. We first deal with the questions of which students are included in testing and the ways in which they are tested. We describe policies, procedures, and practices for identifying, classifying, and including students with disabilities and English language learners, as well as the kinds of accommodations these students are offered. We then address the meaning of scores from accommodated assessments, including what is known about the effects of accommodations on performance in large-scale assessments and the nature of validation arguments and the kinds of evidence that are needed to support inferences made from scores.
Chapter 2 provides background information on students with disabilities and English language learners, on NAEP and other large-scale assessments, and on the issues surrounding the inclusion of these students in testing and the accommodations they need. Chapter 3 discusses the impact of policies currently followed with regard to both including and accommodating these students. Chapter 4 discusses the sampling procedures that are the basis for all NAEP reports on the performance of students with disabilities and English language learners and the factors that complicate the sampling of these groups. Chapter 5 describes the available research on the ways in which the validity of inferences based on test results is affected by accommodations and provides a recommendation for further research in this area. Chapter 6 discusses the kinds of validation arguments that should be articulated for NAEP and for other large-scale assessments. Chapter 7 provides an overview of the primary implications of the committee’s findings and recommendations both for NAEP and for the states. The committee’s findings and recommendations are presented at the end of the chapters that discuss the evidence on which they are based.