Read "High Stakes: Testing for Tracking, Promotion, and Graduation" at NAP.edu

Page 188 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

8
Students with Disabilities

Recent federal and state laws mandate the inclusion of all students in large-scale assessments, even students with special learning and language needs. The federal legislation includes Goals 2000 and Title I of the Improving America's Schools Act of 1994. Several other federal statutes and regulations, including the Individuals with Disabilities Education Act (IDEA), Section 504 of the Rehabilitation Act of 1973, and Title II of the Americans with Disabilities Act (ADA), play a prominent role in determining how students with disabilities will participate in general education, including large-scale assessment programs.

The committee has benefited enormously from recent work in this area by another committee of the Board on Testing and Assessment. The Committee on Goals 2000 and the Inclusion of Students with Disabilities examined in detail the implications of standards-based reforms for students with disabilities. That committee's report, Educating One and All: Students with Disabilities and Standards-Based Reform (National Research Council, 1997), discusses systems of accountability and assessment and ways to increase the participation of children with disabilities. We draw on this earlier report throughout this chapter.

The 1997 amendments to the IDEA include several new or expanded assessment provisions likely to increase dramatically the participation of students with disabilities in large-scale assessments. They link decisions about participation to a student's individualized education program (IEP),

Page 189 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

which is required by law to be developed for each student with a disability. The 1997 amendments provide that:

As a condition of eligibility, states must have policies and procedures to ensure that students with disabilities are included in general state- and district-wide assessment programs, with appropriate accommodations when necessary.
Effective July 1, 1998, IEPs must include a statement of any individual modifications in the administration of state- or district-wide assessments of student achievement that are needed in order for the student to participate in such assessments. If the IEP team—which includes parents, the teacher, and others concerned with the child's education—determines that the child will not participate in a particular state-wide or district-wide assessment of student achievement (or part thereof), then the IEP must include a statement of why that assessment is not appropriate for the student and how the student will be assessed.
For the students whose IEPs specify that they should be excluded from regular assessments, the state must ensure development of guidelines for their participation in alternate assessments, developing and conducting alternate assessments no later than July 1, 2000.
States must have recording policies and procedures in place that ensure proper reporting of information regarding the performance of students with disabilities on large-scale assessments.

These changes in the law were designed to benefit children with disabilities by promoting high expectations commensurate with their needs and providing a means of holding school systems accountable for attending to those needs. Because about 50 percent of students with disabilities have been excluded from state- and district-wide assessments in the past, there has been a shortage of key indicators of success for many of these children, including performance on assessments, dropout rates, graduation rates, and regular reports to the public on progress toward meeting goals for their educational improvement. Many school systems have therefore not established meaningful educational goals for children who, it is now clear, can achieve at higher levels than society has historically assumed. Changes in the new law also aim to improve reporting to parents and teachers of students with disabilities (and the students themselves) with respect to the progress they are making toward achievement of these goals. The 1997 IDEA amendments also contemplate that state

Page 190 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

performance goals and indicators will help determine needs for personnel training and development (U.S. Department of Education, October 1997).

Students with Disabilities: Who Qualifies?

Describing appropriate assessment policies for students with disabilities is complicated by the wide variation in the characteristics of the 5 million students—10 percent of the school-age population—who qualify for special education services under the IDEA. Cognitive, physical, sensory, and behavioral disabilities are covered.

Educating One and All describes the diversity of this population and their school experiences (National Research Council, 1997):

Although 13 disabilities are mentioned in the federal IDEA and defined in the regulations, 4 categories of disability account for about 90 percent of all special education students: speech or language impairment, serious emotional disturbance, mental retardation, and specific learning disabilities. The category of learning disabilities, by itself, accounts for more than half of all students identified for special education.
Identification and classification practices vary widely from place to place. Inconsistencies are particularly common in distinguishing students with mild cognitive disabilities, such as learning disabilities and mild mental retardation, from students who are low achieving but may not have a specific disability. Identification practices and disability definitions vary so greatly, in fact, that a student who is identified in one of these categories in one school district may not be so identified in another (Shepard, 1989). Prevalence rates vary widely across jurisdictions (U.S. Department of Education, 1995). Practices for identifying students with disabilities also vary over time and thus can affect estimates of trends in prevalence rates. The responses of schools to financial incentives (such as increased funding based on the numbers of students who are classified as having a disability) also vary over time and can affect these trends.
Students with disabilities also vary in terms of their educational experience. Over the past 20 years, students with disabilities have been participating to an increasing extent in general education classrooms and curricula. For example, as of 1993–1994, almost 75 percent of special

Page 191 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

education students spent at least 40 percent of their school day in regular classrooms (U.S. Department of Education, 1996). Others receive specialized instruction and individualized curricula for some part of their school day or in the area of their disability. For a small number of students, mostly those with severe cognitive impairments, the focus of a predominantly academic general education curriculum is not consistent with their life goals. They experience a highly individualized curriculum that emphasizes independence and self-sufficiency.

Identification for Special Education

To be eligible for special education services under the IDEA, a student must first be found to meet the criteria for at least 1 of 13 recognized disabilities (or the counterpart categories in state law). The second criterion is that the student must be found to require special educational services. A student who needs special education is also entitled to receive related services—that is, other services that enable him or her to benefit from special education. States and school districts must provide each such student with a free, appropriate public education.

This means that not all students with disabilities are eligible to receive services under the IDEA. They must demonstrate educational need. Even those who do not qualify for special education, however, are protected by Section 504 of the Rehabilitation Act of 1973 and by the ADA (1990), which entitle them to reasonable accommodations in school activities to permit them to overcome impairments in one or more major life activities. The number of students in this category is not known.¹ Nevertheless, the legal rights accorded them could affect large-scale assessment programs, especially those with high stakes for individual students, by increasing the number of students receiving accommodations and raising questions about the extent to which the scores of students with disabilities should be aggregated with other students' scores.

Educating One and All describes the provisions of the IDEA designed to ensure the fair, nondiscriminatory use of tests in identifying students who qualify for special education services (National Research Council, 1997:69–70):

¹	Data on children with disabilities are collected for IDEA-eligible children and thus do not include other students with disabilities.

Page 192 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

The IDEA is explicit and detailed about testing and assessment procedures used to qualify students for special education. A number of legislative provisions are designed to protect students and ensure the fair, nondiscriminatory use of tests. These provisions stipulate that decisions about children must be based on more than a single test, that tests must be validated for the purpose for which they are used, that students must be assessed in all areas related to a suspected disability, and that evaluations must be made by a multidisciplinary team. Children are generally tested in one-to-one situations with various school professionals (e.g., school psychologist, an occupational therapist, a speech and language therapist) on tests that can be individually adapted to match the child's level.

Such highly individualized testing, conducted for the purpose of diagnosis or instructional planning, differs considerably from the large-scale, group-administered assessments of achievement that are the primary focus of this report. The Congress nonetheless expressed serious concern about racial disproportions in special education when it reauthorized the IDEA in 1997.²

IEP Process

Testing thus plays a critical role in determining who qualifies for special education services, but traditionally accountability in special education has not relied mainly on assessment. Rather, it has centered on the individualized education program, an essentially private document that lays out the educational goals and curriculum of an individual student. Each IEP is designed to reflect one child's capabilities and to specify the services necessary for the child to benefit from that curriculum. IEPs thus vary considerably from student to student and have varying degrees of relationship with the general curriculum. For example, one IEP may call for a sign-language interpreter to enable a deaf student to participate fully in the general education curriculum; another may establish a set of instructional objectives that focus on the goal of independent living—telling time, personal hygiene, and basic safety skills.

By law the IEP also serves as a device for monitoring a student's progress. Classroom-based assessment, teacher judgment, and other measures that are sensitive to small and specific changes are typically used for

²	As part of this legislation, the Congress requested a National Research Council study to examine issues related to the overrepresentation of minority children in special education.

Page 193 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

this purpose. Large-scale assessments have not often been used, because of their emphasis on broad content domains rather than on the specific skills that are usually represented in IEP goals.

In many cases, the IEP is used in making high-stakes decisions about students with disabilities. The IEP team often considers assessment information in conjunction with a student's IEP goals and the progress made toward meeting them when making decisions on placement or retention in grade. In several states, where special education students may be exempt from state or local graduation requirements, completion of IEP goals is a sufficient condition for receipt of a high school diploma or its equivalent (Thurlow et al., 1996). Box 8-1 provides additional detail on state practices with regard to high school graduation and students in special education.

Participation in Large-Scale Assessment Programs

For a variety of reasons, many students with disabilities have traditionally been exempted from large-scale achievement tests. Educators and parents have sometimes been confused about the availability of test modifications or accommodations, or they have been concerned about subjecting these children to the stress of testing. Officials sometimes have "excused" children with disabilities from a testing requirement in an effort to raise their schools' average scores. Other concerns include the potential mismatch between test content and student curricula and difficulties in administering certain tests to students with severe disabilities. In any case, exempting these students from assessments, and thus from system accountability measures, has meant that there is less incentive to enhance their educational programs and improve their performance (National Research Council, 1997:152–153). Exclusion from testing may also communicate the message that students are not capable of meeting the expectations represented by the test. Most parents and teachers of students with disabilities say they want these students to meet the same high standards set for the general population (Thurlow et al., 1998).

There are a number of other reasons for including students with disabilities in assessment systems. A more accurate picture of aggregate student performance is produced when all students are included (Vanderwood et al., 1998); comparisons of test results among schools or districts will not be valid if participation rates of students with disabilities

Page 194 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

BOX 8-1 High School Graduation

Educating One and All (National Research Council, 1997) reported that, among states with high school exit exams in 1995, five exempted special education students with IEPs from such tests if course requirements were met; this could be done even through alternative special education courses (Thurlow et al., 1995a). Some analysts have suggested that such policies may inflate special education referrals (Allington and McGill-Franzen, 1992) or lead families to move their children to schools in "easy" graduation states (Thurlow et al., 1995a). A related issue is that students with disabilities who do not graduate are entitled to remain in school and receive special education and related services until the age of 21 or 22—a costly proposition. Receiving a diploma, which terminates IDEA services, constitutes a change in placement requiring parental consent, and some parents may prefer that their children continue to receive services rather than graduate.

Another option practiced in some states is to give students a modified diploma or certificate upon successful completion of IEP goals. This alternative is reserved in some places for students with the most profound disabilities. The various practices reflect differing opinions about how best to meet the needs of students with disabilities. Some argue that differentiated diplomas stigmatize students; others feel that giving a standard diploma to these students devalues the credential and corrupts the educational process (DeStefano and Metzer, 1991). Research evidence on these questions is generally missing.

Educating One and All emphasizes that, because a high school diploma is the minimum requirement for many jobs, graduation testing of students with disabilities raises serious concerns. If graduation standards are increased, more students—including those with disabilities—may well be denied diplomas. The 1997 report makes the following recommendations:

If students receive alternative credentials to the standard high school diploma, parents need to understand the different diplomas and the implications of decisions to modify curriculum and assessments for the type of diploma their child will receive.

Before attaching significant stakes to the performance of individual students, those students should be given an opportunity to learn the skills and knowledge expected of them.

The report also calls for research on the effects of different kinds of high school credentials on employment and other post-school outcomes, as well as research to develop meaningful alternative credentials that can credibly convey the nature of a student's accomplishments and capabilities.

Page 195 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

vary from one place to the next. Individual scores also provide important information to students, their parents, and their teachers. In addition, education reforms and the allocation of resources and extra services are increasingly driven by these test results; if students with disabilities were not included, then the resulting reforms would be less likely to meet their needs. Finally, recent federal legislation has mandated that students with disabilities be included in large-scale assessments and that accommodations and alternate assessments be provided when necessary.

Surveys have indicated that the participation of students with disabilities in statewide assessments has generally been minimal, but with extreme variation from one state to another—from 0 to 100 percent (Erickson et al., 1995; Gronna et al., 1998; McGrew et al., 1992; Shriner and Thurlow, 1992). Forty-three states had written guidelines for participation at the time that Educating One and All was written. IEP teams helped make the decision in most states, but only about half of those with guidelines required that participation decisions be documented in the IEP (Erickson and Thurlow, 1996). Research suggests that the criteria for making decisions about participation have also varied widely from district to district, even in states with written guidelines (DeStefano, 1998).

One of the ways to increase the participation of students with disabilities in large-scale assessments is to offer accommodations. Four broad categories of accommodations are currently in use (Thurlow et al., 1993):

(1)	Changes in presentation: for example, braille forms for visually impaired students, taped versions for students with reading disabilities.

(2)	Changes in response mode: use of a scribe or amanuensis (someone who writes answers for the examinee), computer assistance on tests not otherwise administered by computer.

(3)	Changes in timing: extra time within a given test session, the division of a session into smaller time blocks.

(4)	Changes in setting: administration in small groups or alone, in a separate room. Some students with disabilities may take the test in a standard setting with some form of physical accommodation (e.g., a special desk) but with no other change (National Research Council, 1997:159).

Written guidelines on the use of accommodations vary considerably from state to state. States take different approaches regarding what accommodations they allow or prohibit. A change that is explicitly permitted

Page 196 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

in one state might be prohibited in another (Thurlow et al., 1995b). Little research exists on the impact of specific accommodations on the validity of achievement tests (Thurlow et al., 1995c).

Most students with disabilities have only mild impairments. These students can participate in large-scale assessments, although some will require accommodations. A small percentage of students have disabilities severe enough to require a different assessment, because their curriculum does not match the content of the common test. Section 504 of the Rehabilitation Act of 1973, the ADA, the IDEA, Goals 2000, and Title I all require states and school districts to provide different assessments for these children.

Educating One and All reported that only six states currently offer students with disabilities alternatives to the common assessment (Bond et al., 1996), and that no research exists on "either the ability of alternate assessments to measure students' educational progress validly or to encourage greater accountability for students with disabilities. We do know, however, that the design of alternate assessments poses all the same technical challenges as the development of valid accommodations for the common assessment" (National Research Council, 1997:175).

Psychometrics of Accommodations

Educating One and All reviews the existing evidence on the reliability and validity of test use for students with disabilities, including the logic of providing accommodations, and summarizes the findings as follows (National Research Council, 1997):

Traditionally, standardization (of content, administrative conditions, scoring, and other features) has been used to make the results of assessments comparable in meaning from one test-taker to the next. For some students with disabilities, however, a standard assessment may yield scores that are not comparable in meaning to those obtained by other students because the disability itself biases the score. In many cases, students with disabilities would get a lower score than they should because the disability introduces construct-irrelevant variance, variations in the scores unrelated to the construct purportedly measured. Therefore, "in the case of students with disabilities, some aspects of standardization are breached in the interest of reducing sources of irrelevant difficulty that might otherwise lower scores artificially" (Willingham, 1988:12).

Accommodations are intended to correct for distortions in a student's true competence caused by a disability unrelated to the construct being measured….

Page 197 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

The risk of accommodations is that they may provide the wrong correction. They may provide too weak a correction, leaving the scores of individuals with disabilities lower than they should be, or they may provide an irrelevant correction or an excessive one, distorting or changing scores further and undermining rather than enhancing validity. This risk is explicitly recognized in the guidelines provided by some state education agencies for avoiding these errors, although their guidance is sometimes very general and limited. For example, Maryland's Requirements and Guidelines for Exemptions, Excuses, and Accommodations for Maryland Statewide Assessment Programs (Maryland State Department of Education, 1995) says that "accommodations must not invalidate the assessment for which they are granted" (p. 2, emphasis in the original). However, the only guidance it provides for meeting this standard is a single pair of examples (p. 3):

Addressing the issue of validity involves an examination of the purpose of test and the specific skills to be measured. For example, if an objective of the writing test is to measure handwriting ability, that objective would be substantially altered by allowing a student to dictate his/her response. On the other hand, if a writing objective stated that the student was to communicate thoughts or ideas, handwriting might be viewed as only incidental to achieving the objective. In the latter case, allowing the use of a dictated response probably would not appreciably change the measurement of the objective.

Unfortunately, many cases will be far less clear than this, and accommodations may not succeed in increasing validity even when they seem clear and logical on their face (pp. 173, 176–177).

<><><><><><><><><><><><>

Many approaches to the assessment of individuals with disabilities, particularly assessment accommodations, assume that disabilities are not directly related to the construct tested. Case law indicates that rights to accommodations do not apply when the disability is directly related to the construct tested (see Phillips, 1994). In other words, a student with a reading disability might be allowed help with reading (the accommodation) on a mathematics test, since reading is not in the construct being measured, but would not be allowed help with reading on a reading test, since the disability is directly related to the construct of reading.

However, the groups of students with clearly identifiable disabilities (such as motor impairments) that are largely unrelated to the constructs being tested constitute a small number of the identified population of students with disabilities. Most students with disabilities have cognitive impairments that presumably are related to at least some of the constructs tested.

Relationship between disabilities and assessed constructs have important implications for the validity of inferences based on test scores. For example, if a new assessment includes communication skills as an important part

Page 198 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

of the domain of mathematics, then, to score well in mathematics, students would need to be able to read and write reasonably well…. On such an assessment, it is possible that students with reading disabilities might score worse than their proficiency in other aspects of mathematics would warrant, but providing them with accommodations such as the reading of questions or the scribing of answers is likely to undermine the validity of inferences to the broader, more complex domain of mathematics (pp. 170–171).

Legal and Professional Standards³

Federal statutes and regulations on educating students with disabilities, including the IDEA, Section 504, and the ADA, require that tests and other evaluation materials be validated for the specific purpose for which they are used. All three also require that, when a test is given to a student with impaired sensory, manual, or speaking skills, the results accurately reflect the child's achievement level or whatever other factors the test purports to measure, rather than reflecting the student's disabilities.

These statutes and regulations also require accommodations. Both Section 504 and the ADA prohibit discrimination on the basis of disability. People with disabilities are guaranteed access to programs and services as effective as those provided to their peers without disabilities. The ADA further requires that public entities make "reasonable modification" in policies, practices, and procedures when "necessary to avoid discrimination on the basis of disability, unless the public entity can demonstrate that making the modifications would fundamentally alter the nature of the service, program, or activity" (28 CFR 35.130(b)(7)). In other words, alternate forms or accommodations in testing are required, provided that the content being tested is the same.

Distinctions among the various purposes of assessments become critical in light of these legal rights. Some assessments, for example, are designed mainly for the accountability of schools and school systems. Others are an integral part of learning, instruction, and curriculum. Some tests are used for making high-stakes decisions about individual students, including tracking, promotion or retention in grade, and awarding of a high school diploma or certificate of mastery. Each use raises its own set of legal issues with different implications. As a general rule, the greater the potential harm to students, the greater the protection to which they

³	The legal discussion in this section is drawn from Educating One and All.

Page 199 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

are entitled, and the more vulnerable the assessment is to legal challenge (National Research Council, 1997:186–187).

The Standards for Educational and Psychological Testing (American Educational Research Association et al., 1985) state that any claims made for a test cannot be generalized to a version of the test that has been altered significantly. The Standards continue: "When tests are administered to people with handicapping conditions, particularly those handicaps that affect cognitive functioning, a relevant question is whether the modified test measures the same constructs" (cited in Phillips, 1993:381).

Scarcity of Research Evidence

The committee that wrote Educating One and All concluded that "research on the validity of scores from accommodated assessments is limited, and little of it is directly applicable to the assessments that are central to standards-based reform. Much of the available evidence pertains to college admissions tests and other postsecondary tests (e.g., Wightman, 1993; Willingham et al., 1988)" (National Research Council, 1997:179). From the research reviewed, that committee went on to conclude:

Different disabilities can cause different distortions in test scores. Predicting how the type of disability will affect test scores is difficult, in part because of the ambiguity of the disability classifications. Research is needed about the relationship of specific disabilities to test score performance in different subject areas (pp. 177–178).
Some accommodations may inflate scores for some students. Raising scores, however, is not the purpose of accommodations and it is inappropriate to use them merely to raise scores (pp. 179–182).
A need for additional testing time should not be assumed. The effects on test scores of providing additional time warrant empirical investigation (pp. 180–181).
Although individuals with disabilities are entitled to reasonable accommodations that do not alter the content being tested, current knowledge and testing technology are not sufficient to allow the design of such accommodations (p. 193).

Recent studies have examined teachers' perceptions of accommodations and their likelihood of use. Gajria et al. (1994) found that teachers were more likely to use modifications involving changes in test design

Page 200 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

(e.g., large print, braille, response format) than those involving changes in administrative procedures (e.g., extra time, individual administration). Jayanthi and colleagues (1996) reported that most of the 401 general educators they surveyed believed it was unfair to provide testing adaptations only for students with identified disabilities, and that the adaptations teachers considered most helpful (simplified wording for items, individual help with directions) were not easy to make. Schumm and Vaughn (1991) also found that teachers believed all types of accommodations were more desirable than feasible. In that study, teachers favored accommodations that addressed motivational issues over those requiring curricular or environmental adaptations.

Clearly, more research on the validity of scores from accommodated testing is needed—in particular, research tailored directly to the particular assessments and inferences central to high-stakes decision making. In the interim, the existing research, although limited and based largely on different populations and types of assessments, suggests caution. The effects of accommodation cannot be assumed; they may be quite different from what an a priori logical analysis would predict. In addition, policies on the need to "flag" or mark score reports when accommodations are provided raise vexing legal, policy, and ethical questions (see Box 8-2).

The committee that wrote Educating One and All made the following broad recommendation to guide policymakers attempting to include more students with disabilities in standards-based assessment programs (p. 204):

Assessment accommodations should be provided, but they should be used only to offset the impact of disabilities unrelated to the knowledge and skills being measured. They should be justified on a case-by-case basis, but individual decisions should be guided by a uniform set of criteria.

A number of research studies are under way, including efforts to include more students with disabilities in the National Assessment of Educational Progress.⁴ A recent report from the National Center for Education Statistics describes many of these efforts (Olson and Goldstein, 1997); they are summarized in the appendix to this chapter.

Another problem raised in Educating One and All, which may make scores on large-scale assessments hard to interpret for some students with

⁴	More complete discussion of the issues of including students with disabilities in the National Assessment of Educational Progress (NAEP) can be found in Grading the Nation's Report Card: Evaluating NAEP and Transforming the Assessment of Educational Progress (National Research Council, 1999).

Page 201 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

Box 8-2 Test Score Flagging

Flagging is a concern when a nonstandard administration of an assessment—for example, providing accommodations such as extra time or a reader—may have compromised the validity of inferences based on the student's score. Flagging warns the user that the meaning of the score is uncertain.

Flagged scores are typically not accompanied by any descriptive detail about the individual, or even the nature of accommodations offered. Therefore, flagging may not really help users to interpret scores more appropriately. It does, however, confront them with a decision: Should the score be ignored or discounted because of the possibility that accommodations have created unknown distortions? In the case of scores reported for individual students, flagging identifies the individual as having a disability, raising concerns about confidentiality and possible stigma.

When testing technology is able to ensure that accommodations do not confound the measurement of underlying constructs, score notations will be unnecessary. Until then, however, flagging should be used only with the understanding that the need to protect the public and policymakers from misleading information must be weighed against the equally important need to protect student confidentiality and prevent discriminatory uses of testing information.

SOURCE: National Research Council (1997).

disabilities, is the way in which performance levels are set. A number of new large-scale assessments typically use only a few performance levels, wherein the lowest level is high relative to the average distribution of performance. Consequently, very little information is provided about modest gains by the lowest-performing students, including some students with disabilities. This kind of reporting rubric may also signal that modest improvements are not important unless they bring students above the performance standard. To enable participation of students with disabilities, high-stakes tests should represent performance accurately at all points across a rather broad continuum. This not only implies breadth in terms of difficulty and the content assessed, but also requires that reporting methods provide sufficient and adequate information about all levels of student performance.

New assessment systems are relying heavily on performance assessments, which may decrease the reliability of information about low-achieving

Page 202 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

students, including some with disabilities. All such assessments should be designed to be informative about the achievement of all students. In particular, task selection and scoring criteria should be designed to accommodate varying levels of performance. These reliability concerns are magnified when high-stakes decisions will be based on individual test score results.

Promising Approaches in Test Design⁵

Research and development in the field of educational testing is continually experimenting with new modes, formats, and technologies. Continued development of new forms of test construction, such as new ways of constructing test items and using computers, may hold promise for accommodating the needs of students with disabilities in large-scale assessment programs.

Item response theory (IRT), which is rapidly displacing classical test theory as the basis for modern test construction, is one promising development. IRT models describe "what happens when an examinee meets an item" (Wainer and Mislevy, 1990:66). They are based on the notion that students' performance on a test should reflect one latent trait or ability and that a mathematical model should be able to predict performance on individual test items on the basis of that trait.⁶

To use IRT modeling in test construction and scoring, test items are first administered to a large sample of respondents. Based on these data, a model is derived that predicts whether a given item will be answered correctly by a given individual on the basis of estimates of the difficulty of the item and the skill of the individual. A good model yields information about the difficulty of items for individuals with differing levels of skill. Items for which the model does not fit—that is, for which students' estimated mastery does not predict performance well—are discarded. This information is later used to score tests given to actual examinees.

Item response theory offers potential for including students with disabilities

⁵	This section is taken from pp. 182–183 of Educating One and All (National Research Council, 1997).
⁶	Most IRT models are predicated on the notion that a test is unidimensional and that scores should therefore reflect a single latent trait. Recently, however, IRT models have been extended to multidimensional domains as well.

Page 203 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

abilities in large-scale assessments. First, in many cases, assessments based on IRT allow for everyone's scores to be placed on a common scale, even though different students have been given different items. Given the wide range of performance levels among students, including students with disabilities, it is unlikely that the same set of items will be appropriate for everyone. Second, IRT makes it possible to assess changes in the reliability of scores as a function of a student's skill in what the assessment is measuring. Thus it is possible to identify an assessment that may not be reliable for low-scoring students with disabilities, even though it is reliable for high-scoring students. Third, IRT provides sophisticated methods for identifying items that are biased for students with disabilities.

Computerized testing also holds promise (Bennett, 1995). One of the accommodations most often given to students with disabilities is extra time. But, as noted earlier, extra time may undermine the validity of scores. Computer-based "adaptive" assessments allow students with a wide range of skills to be tested at a reasonable level of reliability and in a shorter amount of time by adapting items individually. This makes it possible to "give more time to everyone." Computer-based adaptive tests can be shorter than traditional tests but still comply with measurement principles. The need for accommodation in test administration is reduced, thereby circumventing the validity problems.

Finally, computer-based tests may allow students with disabilities to participate in simulated hands-on assessments through adaptive input devices, such as a light pen mounted on a head strap. These can replace assessments requiring manual movements that are impossible for some students. However, as Baxter and Shavelson (1994) have shown, computerized simulations of hands-on tasks can yield results surprisingly unlike those generated by the original tasks, so this approach will require careful evaluation.

The committee's findings and recommendations about students with disabilities are reported in Chapter 12.

Page 204 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

Appendix

Including Students with Disabilities in Large-Scale Assessments: Summary of Current and Ongoing Research Activities

This appendix is a catalogue of the projects summarized and described in more detail in Olson and Goldstein (1997).

National Center for Education Statistics (NCES)

Working Conference on Guidelines for Inclusion of Students with Disabilities and Accommodations in Large-Scale Assessment Programs . This report (National Center on Educational Outcomes, 1994a) summarizes a conference at which a variety of issues relating to the participation of students with disabilities in large-scale assessments and guidelines for including and accommodating students were discussed. Many recommendations made to NCES were incorporated into National Assessment of Educational Progress (NAEP) procedures for upcoming assessments.
Working Paper on Assessing Students with Disabilities and Limited English Proficiency (Houser, 1995).

National Academy of Education (NAE)

Several studies were conducted for the National Academy of Education by the American Institutes for Research to evaluate the NAEP Trial State Assessment. NAE's evaluation of the 1992 trial state assessment (National Academy of Education, 1993) examined the exclusion of students, and the evaluation of the 1994 assessment (National Academy of Education, 1996) examined the accessibility of English-language learners and those with individual education programs who were excluded.

National Center on Educational Outcomes

This center has issued a collection of reports that include recommendations for the inclusion and accommodation of students with disabilities in state assessments and guidelines for states' use of assessments (National Center on Educational Outcomes, 1994b, 1995a, 1995b, 1996).

Office of Educational Research and Improvement (OERI)

Grants from OERI have helped fund a number of state projects aimed at creating better assessments, providing appropriate accommodations, and modifying assessments for use with students with disabilities or English-language learners.

Page 205 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

Office of Special Education and Rehabilitation Services

The Division of Innovation and Development (DID) of the Office of Special Education Programs (OSEP) funds a number of research projects on students with disabilities, including some related to assessment.

Council of Chief State School Officers State Collaboration on Assessment and Student Standards (SCASS)

SCASS Consortium on Assessing Special Education Students. In this consortium, states share methods and criteria for accommodating special education students in large-scale assessments and plan a research program to develop criteria and procedures to assess the performance of all students.
SCASS Consortium on Technical Guidelines for Performance Assessment (TGPA). This consortium focuses on designing and implementing research to foster the development of sound performance assessments. The SCASS TGPA "Study 6—The Impact of Adapting Assessments for Special Education Students" examines issues of inclusion.

National Center for Research on Evaluation, Standards and Student Testing (CRESST)

The research being conducted at CRESST focused on assessment of students with disabilities addresses validity issues and the quality of measuring students' performance. Specifically, these studies examine the characteristics of the students, the level of difficulty of assessments for students with disabilities, the types of accommodations that seem reasonable, and the validity of results from accommodated assessments.

Educational Testing Service (ETS)

A book issued in 1988 by ETS (Willingham et al.) contains the findings of several studies that focus on measurement and validity issues in the testing of students with disabilities. These studies examined population characteristics, test performance, use of accommodations, admissions decisions, psychometric characteristics, and effects on validity.

National Academy of Sciences/National Research Council

Several studies have been conducted by the NAS/NRC. The Committee on Goals 2000 and the Inclusion of Students with Disabilities issued a report that examines policy and legislative background on special

Page 206 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

education and standards-based reform and implications for research, practice, and policy (National Research Council, 1997). The Committee on the Evaluation of National and State Assessments of Educational Progress reviews NAEP in general and also evaluates the participation of English-language learners and students with disabilities (National Research Council, 1999). Two earlier studies are still relevant: Placing Children in Special Education: A Strategy for Equity (National Research Council, 1982b) and Ability Testing for Handicapped People: Dilemma for Government, Science, and the Public (National Research Council, 1982a).

NAEP Validity Studies (NVS) Expert Panel

The NVS panel planned a study on the validity of testing accommodations and their impact on student performance. "A Proposed Study Design to Examine the Impact of Accommodations of the Performance of Students with Disabilities in NAEP: The Impact of Increased Testing Time on the Performance of Disabled and Non-Disabled Students" is a draft proposal for a study of the impact of extended testing time on students with and without disabilities and across different content areas, as well as of student perceptions of their need for accommodations.

Joint Committee on Testing Practices (JCTP)

Project on Assessing Individuals with Disabilities. Currently, a working group of JCTP is compiling information about what is helpful to test users (e.g., assessment specialists, educators, counselors) and what is useful to policymakers about the assessment of students with disabilities, about interpreting scores from assessments, and about making educational and career decisions.

References

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education 1985 Standards for Educational and Psychological Testing. Washington, DC: American Psychological Association.

Allington, R.L., and A. McGill-Franzen 1992 Unintended effects of educational reform in New York. Educational Policy 4:397–414.

Baxter, G.P., and R.J. Shavelson 1994 Performance assessment. International Journal of Education Research 21(3):233–350.

Page 207 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

Bennett, R.E. 1995 Computer-based Testing for Examinees with Disabilities: On the Road to Generalized Accommodation. Princeton, NJ: Educational Testing Service.

Bond, L.A., D. Braskamp, and E.D. Roeber 1996 The Status of State Student Assessment Programs in the United States . Oak Brook, IL: North Central Regional Educational Laboratory and Council of Chief State School Officers.

DeStefano, L. 1998 Translating Classroom Accommodations to Accommodations in Large-Scale Assessment. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.

DeStefano, L., and D. Metzer 1991 High stakes testing and students with handicaps: An analysis of issues and policies. Pp. 267–288 in Advances in Program Evaluation 1, R.E. Stake, ed. Greenwich, CT: JAI Press.

Erickson, R.N., and M.L. Thurlow 1996 State Special Education Outcomes 1995. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

Erickson, R.N., M.L. Thurlow, and K. Thor 1995 1994 State Special Education Outcomes. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

Gajria, M., S.J. Salend, and M.A. Hemrick 1994 Teacher acceptability of testing modifications for mainstreamed students. Learning Disabilities Research and Practice 9(4):236–243.

Gronna, S.S., A.A. Jenkins, and S.A. Chin-Chance 1998 Who are we assessing? Determining state-wide participation rates for students with disabilities. Exceptional Children 64(3):407–418.

Hambleton, R.K., and H. Swaminathan 1985 Item Response Theory: Principles and Applications. Hingham, MA: Kluwer Boston, Inc.

Houser, J. 1995 Assessing Students with Disabilities and Limited English Proficiency . Working Paper Number 95-13. Washington, DC: National Center for Education Statistics, Policy and Review Branch, Data Development Division.

Jayanthi, M., M.H. Epstein, E.A Polloway, and W.D. Bursuck 1996 A national survey of general education teachers' perceptions of testing adaptations. The Journal of Special Education 30(1):99–115.

Koretz, D., S. Barron, K. Mitchell, and B. Stecher 1996a Assessment-Based Educational Reform: A Look at Two States (Kentucky and Maryland). Papers presented at the Annual Meeting of the American Educational Research Association, New York, NY.

1996b The Perceived Effects of the Kentucky Instructional Results Information System. Santa Monica, CA: RAND.

Maryland State Department of Education 1995 Requirements and Guidelines for Exemptions, Excuses, and Accommodations for Maryland Statewide Assessment Programs. Baltimore: Maryland State Department of Education.

Page 208 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

McGrew, K.S., M.L. Thurlow, J.G. Shriner, and A.N. Spiegel 1992 Inclusion of Students with Disabilities in National and State Data Collection Programs. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

National Academy of Education 1993 The Trial State Assessment: Prospects and Realities. Stanford, CA: National Academy of Education.

1996 Quality and Utility: The 1994 Trial State Assessment in Reading. Stanford, CA: National Academy of Education.

National Center on Educational Outcomes 1994a Making Decisions about the Inclusion of Students with Disabilities in Large-Scale Assessments. Synthesis Report 14. Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

1994b Recommendations for Making Decisions about the Participation of Students with Disabilities in Statewide Assessment Programs. Synthesis Report 15. Minneapolis, MN: National Center on Educational Outcomes.

1995a Compilation of State Guidelines for Including Students with Disabilities in Assessments. Synthesis Report 17. Minneapolis, MN: National Center on Educational Outcomes.

1995b Compilation of State Guidelines for Accommodations in Assessments for Students with Disabilities. Synthesis Report 18. Minneapolis, MN: National Center on Educational Outcomes.

1996 Assessment Guidelines that Maximize the Participation of Students with Disabilities in Large-Scale Assessments: Characteristics and Considerations. Synthesis Report 25. Minneapolis, MN: National Center on Educational Outcomes.

National Research Council 1982a Ability Testing of Handicapped People: Dilemma for Government, Science and the Public, S.W. Sherman and N.M. Robinson, eds. Panel on Testing of Handicapped People, Committee on Ability Testing, National Research Council. Washington, DC: National Academy Press.

1982b Placing Children in Special Education: A Strategy for Equity, K.A. Heller, W.H. Holtzman, and S. Messick, eds. Committee on Child Development Research and Public Policy. Washington, DC: National Academy Press.

1997 Educating One and All: Students with Disabilities and Standards-Based Reform, L.M. McDonnell, M.J. McLaughlin, and Patricia Morison, eds. Committee on Goals 2000 and the Inclusion of Students with Disabilities, Board on Testing and Assessment. Washington, DC: National Academy Press.

1999 Grading the Nation's Report Card: Evaluating NAEP and Transforming the Assessment of Educational Progress, J. Pellegrino, L. Jones, and K. Mitchell, eds. Committee on the Evaluation of National and State Assessments of Educational Progress, Board on Testing and Assessment. Washington, DC: National Academy Press.

Olson, J.F., and A.A. Goldstein 1997 The Inclusion of Students with Disabilities and Limited English Proficient Students in Large-scale Assessments: A Summary of Recent Progress . NCES 97-482. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement.

Page 209 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

Phillips, S.E. 1993 Testing accommodations for disabled students. Education Law Reporter 80:9–32.

1994 High-stakes testing accommodations. Validity versus disabled rights. Applied Measurement in Education 7(2):93–120.

1995 All Students, Same Test, Same Standards: What the New Title I Legislation Will Mean for the Educational Assessment of Special Education Students . Oak Brook, IL: North Central Regional Educational Laboratory.

Schumm, J.S., and S. Vaughn 1991 Making adaptations for mainstreamed students: General classroom teachers' perspectives. Remedial and Special Education 12(4):18–27.

Shepard, L.A. 1989 Identification of mild handicaps. Pp. 545–572 in Educational Measurement , 3rd Ed., R.L. Linn, ed. New York: American Council on Education and Macmillan Publishing Co.

Shriner, J.G., and M.L. Thurlow 1992 State Special Education Outcomes 1991. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

Thurlow, M.L., J.L. Elliott, and J.E. Ysseldyke 1998 Testing Students with Disabilities: Practical Strategies for Complying with District and State Requirements. Thousand Oaks, CA: Corwin Press.

Thurlow, M.L., J.E. Ysseldyke, and C.L. Anderson 1995a High School Graduation Requirements: What's Happening for Students with Disabilities? Synthesis Report 20. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

Thurlow, M.L., D.L. Scott, and J.E. Ysseldyke 1995b A Compilation of States' Guidelines for Accommodations in Assessments for Students with Disabilities. Synthesis Report 17. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

Thurlow, M.L., J.E. Ysseldyke, and B. Silverstein 1993 Testing Accommodations for Students with Disabilities: A Review of the Literature. Synthesis Report 4. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

1995c Testing Accommodations for Students with Disabilities. Remedial and Special Education 16(5):260–270.

Thurlow, M.L., R. Erickson, R. Spicuzza, et al. 1996 Accommodations for Students with Disabilities: Guidelines from States with Graduation Exams. Minnesota Report 5. Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.

U.S. Department of Education 1995 Seventeenth Annual Report to Congress on the Implementation of the Individuals with Disabilities Education Act. Washington, DC: Office of Special Education Programs.

1996 Eighteenth Annual Report to Congress on the Implementation of the Individuals with Disabilities Education Act. Washington, DC: Office of Special Education Programs.

Page 210 Cite

Suggested Citation:"8 Students with Disabilities." National Research Council. 1999. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: The National Academies Press. doi: 10.17226/6336.

×

1997 Nineteenth Annual Report to Congress on the Implementation of the Individuals with Disabilities Education Act. Washington, DC: Office of Special Education Programs.

Vanderwood, M., K. McGrew, and J.E. Ysseldyke 1998 Why we can't say much about students with disabilities during educational reform. Exceptional Children 64(3):359–370.

Wainer, H., and R.J. Mislevy 1990 Item response theory, item calibration and proficiency estimation. Pp. 65–102 in Computerized Adaptive Testing: A Primer, H. Wainer, ed. Mawah, NJ: Erlbaum.

Wightman, L.F. 1993 Test Takers with Disabilities: A Summary of Data from Special Administrations of the LSAT. Research Report 93-03. Newton, PA: Law School Admission Council.

Willingham, W.W. 1988 Introduction. Pp. 1–16 in Testing Handicapped People, W.W. Willingham, M. Ragosta, R.E. Bennett, H. Braun, D.A. Rock, and D.E. Powers, eds. Boston, MA: Allyn and Bacon.

Willingham, W.W., M. Ragosta, R.E. Bennett, H. Braun, D.A. Rock, and D.E. Powers, eds. 1988 Testing Handicapped People. Boston, MA: Allyn and Bacon.