Reporting Test Results for Students with Disabilities and English-Language Learners: Summary of a Workshop

1
Introduction

OVERVIEW OF THE NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS

As mandated by Congress in 1969, the National Assessment of Educational Progress (NAEP) surveys the educational accomplishments of students in the United States. The assessment monitors changes in achievement, providing a measure of students’ learning at critical points in their school experience (U.S. Department of Education [DoEd], 1999). Results from the assessment inform national and state policy makers about student performance, thereby playing an integral role in evaluating the conditions and progress of the nation’s educational system.

NAEP includes two distinct assessment programs, referred to as “long-term trend NAEP” (or “trend NAEP”) and “main NAEP,” with different instrumentation, sampling, administration, and reporting practices (DoEd, 1999). Long-term trend NAEP is a collection of test items in reading, mathematics, and science that have been administered many times over the last three decades. As the name implies, long-term trend NAEP is designed to document changes in academic performance over time. It is administered to nationally representative samples of 9-, 13-, and 17-year-olds (DoEd, 1999).

Main NAEP test items reflect current thinking about what students know and can do in the NAEP subject areas. They are based on recently developed content and skill outlines in reading, writing, mathematics, science, U.S. history, world history, geography, civics, the arts, and foreign languages. Main NAEP assessments use the latest advances in assessment methodology. Typically, two subjects are tested at each biennial administration. Main NAEP results are also used to track short-term changes in performance.

Main NAEP has two components: national NAEP and state NAEP. National NAEP tests nationally representative samples of students in grades four, eight, and twelve. In most subjects, NAEP is administered two, three, or four times during a 12-year period. State NAEP assessments are administered to representative samples of students in states that elect to participate. State NAEP uses the same large-scale assessment materials as national NAEP. It is administered to grades four and eight in reading, writing, mathematics, and science (although not always in both grades in each of these subjects).

NAEP differs fundamentally from many other testing programs in that its objective is to obtain accurate measures of academic achievement for groups of students rather than for individuals. To achieve this goal, NAEP uses innovative sampling, scaling, and analytic procedures. NAEP’s current practice is to use a scale of 0 to 500 to summarize performance on the assessments. NAEP reports scores on this scale in a given subject area for the nation as a whole, for individual states, and for population subsets based on demographic and background characteristics. Results are tabulated over time to provide both long-term and short-term trend information.

In addition to scale scores, NAEP uses achievement levels to summarize performance. The percentage of students at or above each achievement level is reported. The National Assessment Governing Board (NAGB) has established, by policy, definitions for three levels of student achievement: basic, proficient, and advanced (DoEd, 1999).
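The achievement-level reporting described above can be illustrated with a minimal sketch. This is a hypothetical example, not NAEP’s actual procedure: the cut points and scores below are invented, and real NAEP estimation involves sampling weights and complex scaling that this sketch omits.

```python
# Hypothetical illustration of achievement-level reporting: given scale
# scores (0-500) and cut points for each level, report the percentage of
# students at or above each level. Cut points are invented, not actual
# NAEP values.

CUT_POINTS = {"basic": 262, "proficient": 299, "advanced": 333}  # invented

def percent_at_or_above(scores, cut_points):
    """Return {level: percent of scores at or above that level's cut point}."""
    n = len(scores)
    return {
        level: 100.0 * sum(1 for s in scores if s >= cut) / n
        for level, cut in cut_points.items()
    }

scores = [250, 270, 285, 301, 310, 340, 260, 295]  # invented sample
report = percent_at_or_above(scores, CUT_POINTS)
# e.g. report["proficient"] is the percent of students at or above proficient
```

Because the levels are nested (every score at or above advanced is also at or above proficient and basic), the reported percentages are cumulative rather than mutually exclusive shares.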
The achievement levels describe the range of performance NAGB believes should be demonstrated at each grade.

Uses for NAEP Results

NAEP is intended to serve as a monitor of educational progress of students in the United States. Although NAEP results receive a fair amount of public attention, they have typically not been used for high-stakes purposes, such as for making decisions about placement, promotion, or retention. Surveys and other analyses reveal that NAEP results are used for the following purposes (National Research Council [NRC], 1999, p. 27):
- to describe the status of the educational system,
- to describe student performance by demographic group,
- to identify the knowledge and skills over which students have (or do not have) mastery,
- to support judgments about the adequacy of observed performance,
- to argue the success or failure of instructional content and strategies,
- to discuss relationships between achievement and school and family variables,
- to reinforce the call for high academic standards and educational reform, and
- to argue for system and school accountability.

The ways NAEP results are used are likely to change, however, as a result of the legislation that, at the time of this workshop, was still pending in Congress (and has since been enacted into law). At the workshop, Thomas Toch, guest scholar at the Brookings Institution, described the proposed legislation. This legislation calls for annual testing of third through eighth graders in mathematics and reading, with test results used to determine rewards or corrective actions for schools, school districts, and states.

The education plan contains an adequate yearly progress element, which in effect requires that schools, school districts, and states set standards and report annual progress for students in four groups: racial/ethnic minorities, economically disadvantaged students, English-language learners, and students with disabilities. If students in each of those four groups do not make sufficient progress each year toward the state’s standards, the schools, school districts, and states would be subject to corrective action. The ultimate objective is for 100 percent of the students in each of these four groups to achieve state standards for proficiency within 12 years. Schools that accomplish this goal would be eligible for financial rewards.
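The adequate-yearly-progress logic described above can be sketched in a few lines. Everything here is illustrative, not the statutory formula: the linear trajectory toward 100 percent, the baseline, and the percent-proficient figures are all invented assumptions.

```python
# Hypothetical sketch of an adequate-yearly-progress check: each student
# group must meet an annual proficiency target rising toward 100 percent
# over 12 years. The linear path and all numbers are invented for
# illustration, not the actual statutory formula.

GROUPS = ["racial/ethnic minorities", "economically disadvantaged",
          "English-language learners", "students with disabilities"]

def annual_target(baseline_pct, year, horizon=12):
    """Linear path from the baseline to 100 percent proficient by `horizon`."""
    return baseline_pct + (100.0 - baseline_pct) * year / horizon

def groups_needing_corrective_action(results, baseline_pct, year):
    """Return groups whose percent-proficient falls below this year's target."""
    target = annual_target(baseline_pct, year)
    return [g for g in GROUPS if results.get(g, 0.0) < target]

# Year 3 of 12, starting from an invented 40 percent baseline
# (so this year's target is 55 percent proficient).
results = {"racial/ethnic minorities": 60.0,
           "economically disadvantaged": 50.0,
           "English-language learners": 58.0,
           "students with disabilities": 45.0}
flagged = groups_needing_corrective_action(results, baseline_pct=40.0, year=3)
```

The key point the sketch makes concrete is that progress is judged group by group: a school can miss the target for a single group, even with a high overall average, and still be flagged.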
Corrective actions for schools that do not show progress include the following: their students may be allowed to attend different public schools; the state may take over school operations; and/or the schools may be subject to other forms of restructuring.

At the time of the workshop, the proposed legislation called for comparisons to be made between state assessment results and an external test in order to encourage states to establish high standards and use high-quality tests. The Senate version of the bill, which was the one that passed, called for NAEP to fill this benchmarking role. The language was modified in the final version of the legislation, and it does not actually call for such benchmarking. The law does, however, mandate state participation in biennial NAEP assessments of fourth and eighth grade reading and mathematics, and it is expected that NAEP will serve as a benchmark for state assessments (Taylor, 2002). It was within this context (a general expectation that the proposed legislation would be adopted and that such comparisons would be required) that the workshop took place.

Including and Accommodating Students with Special Needs

Accommodations are provided to test takers with special needs in order to remove disability-related barriers to performance. The goal is to provide accommodations that compensate for a student’s specific disability but do not alter the attributes measured by the assessment or give an unfair advantage to the accommodated student. Accommodations are intended to correct for the disability so that scores from an accommodated assessment measure the same attributes as scores from an assessment administered without accommodations to individuals without disabilities (NRC, 1997; Shepard, Taylor, and Betebenner, 1998; Koretz and Hamilton, 2000). However, there are no hard and fast rules for what constitutes an appropriate accommodation for a given student’s special needs. Hence, there is always a risk that the accommodation over- or under-corrects in a way that distorts performance.

In 1996, NAEP began piloting testing procedures for including and accommodating students with special needs in the assessment. At the same time, a research plan was implemented to investigate the impact of the policy changes on the participation of special needs students in NAEP and to examine the effects on performance of testing with accommodations.
Research has continued with subsequent assessments, and inclusion and accommodation policies are now a permanent aspect of the program. Currently, NAEP’s stewards (National Assessment Governing Board members and staff, as well as National Center for Education Statistics staff members) are addressing issues related to reporting the results from accommodated administrations. Beginning in 2002, NAEP will report aggregated data that combine results for those who receive accommodations and those who take the test under standard procedures. Since accommodations were not allowed prior to 1996, there is some concern about the comparability of pre-1996 data to future data. That is, what effects will the new policies have on the interpretation of trends (long term as well as those based on main NAEP)?

Considerable research has been conducted on the effects of accommodations on performance on tests other than NAEP. One objective for the workshop was to learn more about the findings from this research and to consider the extent to which they generalize to NAEP. Of particular interest was research on the comparability of scores from accommodated and nonaccommodated administrations and the extent to which they can be considered to measure similar constructs.

In addition, through their efforts to comply with existing legislation (such as the Americans with Disabilities Act, the Individuals with Disabilities Education Act, and Title I), states have accumulated a good deal of experience with including and accommodating students with special needs and reporting their results. Another objective for the workshop was to learn about states’ experiences in enacting their reporting policies. NAEP’s stewards believed that such information would be useful as they formulate reporting policies for NAEP. Of particular interest were questions such as: What data do states include in their reports? Under what conditions are results for accommodated and nonaccommodated test takers aggregated for reporting? For what categories of students do states report disaggregated results? What, if any, complications have arisen in connection with preparing aggregated or disaggregated data? And what have been the effects of inclusion and accommodation on trend data reported for the state assessment? The fact that the new legislation is expected to require comparisons between state assessment and NAEP results makes these reporting issues especially relevant.
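The aggregation question at the center of these reporting decisions can be made concrete with a minimal sketch: combining accommodated and standard-administration results into one reported figure versus reporting them disaggregated. All counts and scores below are invented for illustration; actual NAEP estimates involve sampling weights and variance estimation that this sketch ignores.

```python
# Hypothetical sketch of aggregated vs. disaggregated reporting for
# accommodated and standard test administrations. All numbers invented.

def combined_mean(groups):
    """Weighted mean across subgroups, given (n_students, mean_score) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n

standard = (900, 288.0)      # (number of students, mean scale score) - invented
accommodated = (100, 270.0)  # invented

# Aggregated reporting: one combined figure.
aggregate = combined_mean([standard, accommodated])

# Disaggregated reporting: each administration condition shown separately.
disaggregated = {"standard": standard[1], "accommodated": accommodated[1]}
```

The sketch shows why the choice matters for trend interpretation: the combined mean sits below the standard-only mean whenever accommodated students score lower on average, so a pre-1996 figure (standard administrations only) is not directly comparable to a post-2002 aggregate.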
OVERVIEW OF WORKSHOP

Officials with the National Center for Education Statistics asked the NRC’s Board on Testing and Assessment (BOTA) to convene a workshop to assist them with their decision making about reporting results for accommodated test takers. BOTA is well positioned to assist with these questions, since it has already conducted two evaluations of NAEP programs (NRC, 1999, 2001) and two studies on testing students with special needs (NRC, 1997, 2000).

The workshop brought together representatives from state assessment offices, individuals familiar with testing students with disabilities and English-language learners, and measurement experts to discuss the policy and technical considerations associated with testing students with special needs. The daylong workshop included four panels that explored the following issues:

- What inclusion and accommodation policies are in effect in state testing programs?
- What data do states report for excluded students, included and accommodated students, and students tested under standard testing conditions?
- How are data aggregated and disaggregated for reporting purposes?
- How do states report trend data for accommodated students and for those tested under standard testing conditions?
- What issues have states encountered as they make decisions about reporting results for accommodated test takers?
- What does the research suggest about the effects of accommodations on test performance for English-language learners and students with disabilities?
- What does the research suggest about the validity of scores from accommodated administrations?
- What does the research suggest about the comparability of scores from standard and accommodated administrations?

The first panel of workshop speakers laid out the policy and legal context for including and accommodating students with special needs in large-scale testing. Arthur Coleman, with Nixon Peabody LLP, and Thomas Toch, guest scholar with the Brookings Institution, addressed these issues. In addition, Peggy Carr, associate commissioner of education at the National Center for Education Statistics, and Jim Carlson, assistant director for psychometrics at the National Assessment Governing Board (NAGB), provided background information on NAEP’s policies.

The second panel addressed state policies on accommodations and reporting results for students with disabilities and English-language learners.
Speakers included Martha Thurlow, director of the National Center on Educational Outcomes at the University of Minnesota, and Laura Golden and Lynne Sacks, researchers at George Washington University’s Center for Equity and Excellence in Education (CEEE), who highlighted findings from their surveys of states’ policies. In addition, representatives from two state offices of assessment—Scott Trimble (Kentucky) and Phyllis Stolp (Texas)—spoke about the policies of their respective states.
Panel three consisted of researchers who have investigated the effects of accommodations on test performance. John Mazzeo, executive director of the Educational Testing Service’s School and College Services, spoke about research conducted on NAEP. Other speakers included Stephen Elliott, professor at the University of Wisconsin; Gerald Tindal, professor at the University of Oregon; Jamal Abedi, adjunct professor at the UCLA Graduate School of Education and director of technical projects at the National Center for Research on Evaluation, Standards, and Student Testing (CRESST); and Laura Hamilton, behavioral scientist with the RAND Corporation.

The final panel consisted of four discussants who were asked to summarize and synthesize the ideas presented during the workshop and to highlight issues in need of further exploration and research. Panel speakers included Eugene Johnson, chief psychometrician at the American Institutes for Research; David Malouf, educational research analyst at DoEd’s Office of Special Education Programs; Richard Durán, professor at the University of California at Santa Barbara; and Margaret Goertz, co-director of the Consortium for Policy Research in Education.

OVERVIEW OF THIS REPORT

Chapter 2 provides background information on NAEP’s policies for including and accommodating students with special needs and gives an overview of the research plan first implemented with the 1996 assessment. Chapter 3 summarizes information provided by Arthur Coleman on federal requirements for including and accommodating students with disabilities and English-language learners in large-scale assessment. Chapter 4 presents the findings from surveys of states’ policies for including, accommodating, and reporting results for students with special needs.
First-hand accounts of policies and experiences with reporting results for accommodated test takers in Texas and Kentucky appear in Chapter 5. Chapter 6 highlights the main points made by the speakers in the third panel, who discussed findings from research on the effects of accommodations on NAEP and on other tests. Chapter 7 concludes the report with a summary of discussants’ remarks.