Evaluation of the Voluntary National Tests, Year 2: INTERIM REPORT

4

Inclusion and Accommodation

In this chapter we first review the issue of inclusion and accommodation in the VNT, along with the findings and recommendations of the NRC during the year 1 evaluation of VNT development. We then review and assess the treatment of inclusion and accommodation in VNT development activities from fall 1998 through April 1999.

THE CHALLENGES OF INCLUSION AND ACCOMMODATION

In the November 1997 legislation that established the National Assessment Governing Board's responsibility for the development of the Voluntary National Tests, Congress required NAGB to make four determinations. The third of these is “whether the test development process and test items take into account the needs of disadvantaged, limited English proficient and disabled students” (P.L. 105-78:Sec. 307 (b)(3)). The same legislation called on the National Research Council “to evaluate whether the test items address the needs of disadvantaged, limited English proficient, and disabled students.”

There are two key challenges to testing students with disabilities or limited English proficiency. The first is to establish effective procedures for identifying and screening such students, so they can appropriately be included in assessment programs. Federal law and state and local policy increasingly demand participation of these special populations in all education activities, both as a means of establishing the needs and progress of individual students and for purposes of system accountability. The second challenge is to identify and provide necessary accommodations (e.g., large-print type, extended time) to students with special needs while keeping test validity comparable to that for the general population (see National Research Council, 1997, 1999c). That is, any accommodation should alter only the conditions of assessment without otherwise affecting the measurement of performance. This issue is growing in importance, as is the number of students with disabilities or with limited English proficiency. Students with disabilities are now 12.3 percent of all students in elementary and secondary school, and students with limited English proficiency are 5.5 percent of all students.








For the latter percentage, however, it is important to remember that in some local jurisdictions, the percentage is as high as 60 percent.

During the summer of 1998, NAGB approved a set of “principles” for the inclusion and accommodation of students with disabilities and limited English proficiency in the pilot test of the VNT (National Assessment Governing Board, 1998a). In essence, these principles repeated the guidelines for inclusion and accommodation that had been adopted for the National Assessment of Educational Progress (NAEP) in 1996, but which had not then substantially affected participation in NAEP. The substance of the 1996 policy changes had been a deliberate shift from determining whom to include in assessments (on the assumption that students with disabilities and limited English proficiency would normally be excluded) to determining who would not be included (on the assumption that all students would normally be included). Unfortunately, this change, which was evaluated with a split-sample design, had little influence on participation. There may be several reasons for the limited change in participation. One is simply the time needed to implement the change at the state and local education levels. The degree and breadth of the change and the types of accommodations may also be factors.

Earlier that year, NAGB had commissioned background reports on the inclusion and accommodation of students with disabilities and with limited English proficiency from the American Institutes for Research (AIR). As of the summer of 1998, there had as yet been little developmental work on the project to address the special needs of these populations.
The 584 participants in cognitive laboratory sessions included students with disabilities (32 in reading and 19 in math) and limited English proficiency (11 in reading and 12 in math), but their numbers were too small to provide substantial or reliable information about the potential participation of such students in the VNT (American Institutes for Research, 1998d). In a later document, American Institutes for Research (1998m:2) reports that “29 of the reading participants and 18 of the mathematics participants had some sort of special educational need other than gifted and talented. No systematic data were gathered, however, on the numbers of these students who were receiving special educational services guided by an individualized educational plan.”

In addition to the statement of principles, AIR had two planning documents: “Revised Inclusion and Accommodations Work Plan” (American Institutes for Research, 1998k), which was approved by NAGB at its May meeting, and “Background Paper Reviewing Laws and Regulations, Current Practice, and Research Relevant to Inclusion and Accommodations for Students with Disabilities” (American Institutes for Research, 1998b). AIR had also developed lists of organizations with particular interest in the educational and testing needs of students with disabilities or limited English proficiency for NAGB's use in holding public hearings about inclusion and accommodation (American Institutes for Research, 1998h, 1998i). At that time, a parallel background paper on inclusion and accommodation of students with limited English proficiency was being drafted for presentation to NAGB at its November 1998 meeting. No VNT development activities or plans had yet been specifically aimed at the needs of disadvantaged students or those with limited English proficiency, but such students did participate in one development activity, the cognitive labs.

YEAR 1 EVALUATION FINDINGS

In its year 1 report, the National Research Council (1999a:46) noted:

Because of the compressed schedule in the early phases of VNT development, along with the desire to achieve close correspondence between the VNT and NAEP, the NAGB plans and the AIR background paper on students with disabilities both focus on recent NAEP practices for inclusion and accommodation, rather than taking a broader, more proactive stance. We believe the federal government has an important leadership role to play in subsidizing and demonstrating valid efforts to include these populations . . . The procedures discussed in the draft documents are intended to increase participation and provide valid assessments for all students, but they essentially involve retrofitting established assessment instruments and procedures to special populations of students; another approach would be to design and develop assessments from the beginning that are accessible to and valid for all students.

From its beginning around 1970 and through the middle 1990s, NAEP assessments had been carried out without any kind of accommodation. Procedures for “inclusion” usually focused on the exclusion of some students from the assessments, rather than on universal participation. Since the mid-1990s, as the growth of special student populations and the importance of their participation in large-scale assessments have increasingly been recognized, NAEP has experimented with new, more inclusive participation and accommodation policies reflected in the NAGB principles for the VNT and the AIR planning documents. For example, in NAGB's draft principles, depending on the test and population in question, accommodations for a pilot test may include large-print booklets, extended time, small-group administration, one-on-one administration, a scribe or computer to record answers, reading a test aloud by an examiner, other format or equipment modifications, or a bilingual dictionary if it is normally allowed by the school. However, the success of these policies in increasing participation has not yet been established, nor have their effects on test performance and score comparability been validated.

The year 1 evaluation report also observed (National Research Council, 1999a:46):

Unless extensive development work is done with students with disabilities and with limited English proficiency, it would be unreasonable to expect that the VNT will be valid for use with these student populations. Both of these populations are heterogeneous, e.g., in primary language, level of proficiency in English, and specific type of disability. Moreover, they differ from the majority of students, not only in ways that affect test-taking directly (e.g., those that can be accommodated through additional time or assistive devices) but also in styles of learning and in levels of motivation or anxiety. Such differences are very likely to reduce the validity and comparability of test performance.

The Committee on Appropriate Test Use identified two important ways in which inclusion and accommodation can be improved (National Research Council, 1999c). First, the focus should be on inclusion and accommodation issues throughout item and test development, so a test is designed from the ground up to be accessible and comparable among special populations. For example, the report recommends oversampling of students with disabilities and with limited English proficiency in the course of pilot testing so there will be sufficient numbers of cases in major subgroups of these students to permit differential item functioning (DIF) analyses. Second, test developers should explore the use of new technologies (such as computer-based adaptive testing for students who need extra time) that show promise for substantially reducing or eliminating irrelevant performance differentials between students who require accommodation and other students. The report recognized, however, that development work of this kind is just beginning, and there are presently few exemplars of it.
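
The differential item functioning (DIF) analyses mentioned above ask whether students of comparable overall proficiency, but from different groups, have different chances of answering a given item correctly. As a rough illustration only, the sketch below computes the Mantel-Haenszel common odds ratio, one widely used DIF statistic. The report does not say which DIF method would be applied to the VNT, and the class name, function names, and counts here are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Stratum:
    """Counts for one item within one matched proficiency band (hypothetical layout)."""
    ref_correct: int      # reference-group students answering correctly
    ref_incorrect: int    # reference-group students answering incorrectly
    focal_correct: int    # focal-group (e.g., accommodated) students answering correctly
    focal_incorrect: int  # focal-group students answering incorrectly


def mantel_haenszel_odds_ratio(strata):
    """Return the Mantel-Haenszel common odds ratio across proficiency bands.

    A value near 1 suggests no DIF; a value above 1 means the item is
    relatively easier for the reference group at matched proficiency.
    """
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.ref_correct + s.ref_incorrect + s.focal_correct + s.focal_incorrect
        if n == 0:
            continue  # an empty band carries no information
        numerator += s.ref_correct * s.focal_incorrect / n
        denominator += s.ref_incorrect * s.focal_correct / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: too few focal-group responses")
    return numerator / denominator


def mh_d_dif(odds_ratio):
    """Convert the odds ratio to the delta-scale index, -2.35 * ln(odds ratio).

    Negative values indicate an item that is harder for the focal group.
    """
    return -2.35 * math.log(odds_ratio)


if __name__ == "__main__":
    # Hypothetical counts for a single item in three score bands.
    example = [
        Stratum(30, 30, 3, 5),
        Stratum(45, 15, 5, 3),
        Stratum(55, 5, 6, 1),
    ]
    alpha = mantel_haenszel_odds_ratio(example)
    print(f"MH odds ratio: {alpha:.2f}, MH D-DIF: {mh_d_dif(alpha):.2f}")
```

The example also hints at why subgroup sample sizes matter: when a band contains only a handful of focal-group students, a shift of one or two responses changes the estimate noticeably, which is the practical motivation for oversampling.
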

The NRC year 1 evaluation report concluded (National Research Council, 1999a:47):

The statement of principles and the AIR planning documents provide a limited basis for evaluation of provisions for inclusion and accommodation in the VNT (and no specific basis to address the quality of item development relative to the needs of those students) . . . . a major opportunity for improved large-scale assessment is being lost in NAGB's conservative approach to inclusion and accommodation in the VNT.

It thus recommended and explained (National Research Council, 1999a:47):

NAGB should accelerate its plans and schedule for inclusion and accommodation of students with disabilities and limited English proficiency in order to increase the participation of both those student populations and to increase the comparability of VNT performance among student populations . . . This recommendation requires prompt action because so much of the development work in the first round of the VNT has already been completed. We have already noted the modest attention to students with special needs in the cognitive laboratory sessions. In the pilot test, NAGB plans to identify students with disabilities and with limited English proficiency and with the types of accommodations that have been provided. However, there are no provisions in the design to ensure that there will be sufficient numbers of these students (such as students requiring specific types of accommodation) to support reliable DIF analyses. We think that it would be feasible to include larger numbers of such students in the pilot and field tests, for example, by increasing sampling fractions of such students within schools. Moreover, there appears to be no plan to translate the 8th-grade mathematics test into Spanish (or any other language), a decision that is likely to affect participation in the VNT by major school districts. There has been some discussion of a Spanish translation after the field test, but this would be too late for the item analyses needed to construct comparable English and Spanish forms.

INTERIM YEAR 2 FINDINGS

Since fall 1998, NAGB and AIR have continued information-gathering and planning activities related to inclusion and accommodation in the VNT. The committee finds that these activities have not fully implemented the earlier NRC recommendation for accelerated work on inclusion and accommodation.

In its November 1998 report, American Institutes for Research (1998m) reviewed steps taken during item development to consider the needs of special students. They included passage review, item editing, bias and sensitivity review, cognitive lab participation, and item analysis. Although these steps are appropriate and commendable, they do not focus specifically on the needs of students with disabilities or with limited English proficiency. We have already noted the small number of disabled students and English language learners in the first year of cognitive laboratories. In fact, AIR's March 1999 report (American Institutes for Research, 1999a) on lessons learned in the first-year cognitive laboratories makes no reference to inclusion or accommodation.
Earlier, in commenting on plans for item analysis, American Institutes for Research (1998m:2) reported that, “even with the planned oversampling of Hispanic students [in the pilot test], it was not anticipated that sufficient numbers of either students with disabilities or students with limited English proficiency would have been included to permit DIF analysis based on these characteristics.”

AIR also suggested two “enhancements” of item development activities during year 2 (American Institutes for Research, 1998m:3): (1) “to review items against plain language guidelines that have been developed under the sponsorship of the Goals 2000 program and elsewhere” and (2) to include larger numbers of students with disabilities and limited English proficiency in the second year of cognitive laboratories. The committee has been informed that both of these proposals have been adopted. In particular, an effort is being made to include students with disabilities and with limited English proficiency among the nine students in each item trial in this year's cognitive laboratories.

The AIR planning document also reports that, in connection with its exploration of reporting and test use, focus groups with parents and teachers included (American Institutes for Research, 1998m:5): “one composed of special education teachers, one comprising parents of special education students, and one involving teachers of students with limited English proficiency.” We have not seen a report of the findings or implications of these focus groups.

We have already mentioned the draft background paper of July 1998 on inclusion and accommodation of students with disabilities. By November 1998, that paper had been completed (American Institutes for Research, 1998a) along with a parallel report on students with limited English proficiency (American Institutes for Research, 1998c).
In the judgment of the committee, these two reports are thorough, competent, and cogent, and they would have provided an ample basis for additional development activities to address the needs of students with disabilities and with limited English proficiency.

The AIR planning document also reports (American Institutes for Research, 1998m:5):

Following the release of the NRC report on appropriate test use in September 1998, the board undertook a series of public hearings in which it invited citizens concerned with the education and assessment of students with disabilities or limited English proficiency to comment on the findings of the NRC report and the evolving plans for addressing the needs of these students in the VNT.

Invitations to the hearings were developed on the basis of lists of relevant groups and agencies that were assembled by AIR (American Institutes for Research, 1998h, 1998i). The hearings were held during October and November 1998 in several large cities: Washington, DC; Atlanta, GA; New York, NY; Chicago, IL; Austin, TX; and Los Angeles, CA. Witnesses were asked to respond to a series of questions about inclusion and accommodation in large-scale assessment, including the proposed VNT. For example, in the case of students with disabilities (National Assessment Governing Board, 1999a:1-3), there were eight queries:

(1) What should be the criteria for including a student with disabilities in standard administrations of large-scale tests?

(2) What, if any, should be the criteria for exempting a student with disabilities from standard administrations of large-scale tests? What examples exist of such criteria that have been empirically determined and/or validated?

(3) What accommodations in the administration of the test should be considered for students with disabilities and according to what criteria should those accommodations be provided? What adaptations (i.e., changes or special versions) to the test should be considered and under what circumstances should they be provided?

(4) What validation evidence is sufficient to conclude that results from testing with accommodations/adaptations are comparable to results from testing without accommodations/adaptations? Until such validity is established, what cautions about interpretation, if any, should accompany individual test results?

(5) The Voluntary National Test is intended to provide individual student results. What are the pros and cons of permitting aggregate results to be reported for students with disabilities within a school, district, or state?

(6) In setting performance standards on a test for interpreting test results, what if any considerations should be given with regard to implications for students with disabilities?

(7) What specific criteria should be considered in reviewing test items as they are being developed to optimize the inclusion of students with disabilities? What other specific considerations should be taken into account in test development?

(8) What other issues related to inclusion and accommodations in testing for students with disabilities should be considered by the Governing Board for the Voluntary National Tests in order to ensure fair and equitable test administration for all students?

Parallel questions were also posed with reference to students with limited English proficiency (see National Assessment Governing Board, 1999b). In addition, witnesses who addressed the testing of students with limited English proficiency were asked: “What are the technical and policy pros and cons of testing students with limited English proficiency in the student's native language and under what circumstance, if any, might this be considered as an appropriate adaptation in large-scale testing?” (National Assessment Governing Board, 1999b:3-5).

The testimony at these two series of hearings was recorded and summarized in the two reports (National Assessment Governing Board, 1999a, 1999b). While the hearings may have been useful in providing a platform for a number of interested groups and agencies (and in signaling NAGB's interest in the views of those groups and agencies), the committee does not find that the hearing summaries add substantial new information or ideas that could affect the design of the VNT.

AIR has repeatedly proposed a study, to be carried out in connection with the pilot test, of the effects of two particular accommodations (extended time and small-group administration) on VNT performance. These two accommodations are common, and, thus, the committee considers it important to understand their effects on test performance and validity. This proposal appeared in a proposed year 2 research plan, which was abandoned because of the congressional ban on pilot testing during 1999. The proposal was then modified for year 3 of development, in connection with the year 2000 pilot test. The modified proposal (American Institutes for Research, 1999e:8-9) states:

It is important to determine the impact that extra time has on student performance and whether only examinees with disabilities or limited English proficiency benefit from having extra time, or the degree to which the scores of other examinees also would improve with such accommodations.

In the case of small-group administration, the research proposal focuses on students with individual education plans (IEPs) and is designed to estimate the effects of extended time and small-group administration, but is not designed to compare the performance of students with IEPs to that of other students under the experimental conditions.

Each study will be based on a single pilot test form. The first, extended-time study is proposed for a supplementary sample of large schools, in order to minimize the cost of obtaining 150 observations in each cell of the design. The experimental design crosses the three student groups (students with disabilities, students with limited English proficiency, and students in neither group) by two times of administration: standard time (45 minutes) and twice the standard time (90 minutes). The second study is to use a combination of schools in the main pilot sample with those in a supplementary sample of schools.
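
Before turning to the cells of the second study, the extended-time design just described can be tallied in a few lines. This is only an arithmetic sketch: it assumes that every cell reaches exactly its target of 150 observations, and the constant and function names are hypothetical.

```python
from itertools import product

# Figures as described in the text; the identifiers themselves are illustrative.
STUDENT_GROUPS = (
    "students with disabilities",
    "students with limited English proficiency",
    "students in neither group",
)
TIME_LIMITS_MINUTES = (45, 90)  # standard time and twice the standard time
TARGET_PER_CELL = 150           # assumes each cell exactly meets its target


def extended_time_cells():
    """Enumerate every (student group, time limit) cell of the crossed design."""
    return list(product(STUDENT_GROUPS, TIME_LIMITS_MINUTES))


def students_per_group():
    """Each group appears once under each time limit."""
    return TARGET_PER_CELL * len(TIME_LIMITS_MINUTES)


def total_students():
    """Total sample implied by the design if all cells are filled."""
    return TARGET_PER_CELL * len(extended_time_cells())


if __name__ == "__main__":
    for group, minutes in extended_time_cells():
        print(f"{group}, {minutes} minutes: {TARGET_PER_CELL} students")
    print("per group:", students_per_group())
    print("total:", total_students())
```

Under that assumption the design has six cells and 300 students in each group, which is the per-group figure the committee cites when it questions the statistical power of subgroup analyses.
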
The second study has five cells, crossing standard administration, embedded small groups, and pull-out small groups with standard time limits and double time limits (excluding the double-time condition with standard administration). In both studies, it should be possible to assess the speededness of the tests, as well as differences in performance between groups of students and experimental conditions. The analysis of the first study is also proposed to include analyses of differential item functioning among students with disabilities, students with limited English proficiency, and students who are not in these special groups. Since there are only 300 students in each of the three groups (150 under each of the two time limits), the committee is concerned that these analyses may not have sufficient statistical power, relative to the design of the main pilot sample. However, the committee does agree that it will be valuable, first, to assess the influence of these two accommodations on VNT performance.

CONCLUSION AND RECOMMENDATIONS

After release of the NRC report on year 1 of VNT development, NAGB completed several activities related to inclusion and accommodation of students with disabilities and with limited English proficiency. They included preparation of two background papers, public hearings, and the summary of testimony from those hearings. However, these activities have not yet carried the development process much beyond its state in the summer of 1998, when the report found (National Research Council, 1999a:47) “a limited basis for evaluation of provisions for inclusion and accommodation in the VNT—and no specific basis to address the quality of item development relative to the needs of those students.” We applaud AIR's proposal to evaluate the effects of two common accommodations on VNT performance among students with disabilities and with limited English proficiency in the pilot test, but little else has been accomplished.
At present, no specific recommendations or actions appear to have been made or taken on the basis of the hearings on inclusion or accommodation. Specifically, with reference to recommendations of the NRC year 1 evaluation report: no change has been made in pilot test plans that would support reliable DIF analyses of the special groups of students in the pilot test, and no consideration has been given to the development of Spanish (and other-language) versions of the 8th-grade mathematics test. With respect to the latter, while language simplification methodology has been used in the test development process, little attention has been paid to other language issues regarding the VNT math test (e.g., whether Spanish and other-language versions should be developed or methods to reduce the reading level of mathematics items). Participation in the cognitive laboratories by students with disabilities and with limited English proficiency has been expanded, and the committee will be interested to learn how this information will be used in item and test development.

The committee is concerned about operational use of the principles of inclusion and accommodation that were adopted by the National Assessment Governing Board (1998a) for the pilot test in the summer of 1998. Given the modest subsequent development activity, the committee can only assume that these principles are likely to be applied in the field test and under operational conditions. However, the committee is concerned that it may be more difficult to provide these accommodations on a large scale than under the operational conditions of NAEP, where they are now standard procedure.

Recommendation 4.1: NAGB should accelerate its plans and schedule for inclusion and accommodation of students with disabilities and limited English proficiency in order to increase the participation of both those student populations and to increase the comparability of VNT performance among student populations.

We repeat this recommendation, which is unchanged from the NRC report on year 1 of VNT development.
Although the condensed schedule of test development during year 1 provided a substantial rationale for limited attention to issues of inclusion and accommodation, the extension of the development schedule for another year prior to pilot testing provided an extra window of time for more intensive consideration of these issues. That window is already half closed, and there are modest signs of progress: the effort to include more students in cognitive labs, the intent to apply guidelines for simplified language, and the AIR proposal to study effects of accommodation in the pilot test. We concur with the observation in the earlier NRC report (National Research Council, 1999a:46): “the federal government has an important leadership role to play in subsidizing and demonstrating valid efforts to include these populations.” We offer two additional recommendations to underscore the urgency of, and NAGB's need for, expanded inclusion and accommodation efforts for the VNT.

Recommendation 4.2: NAGB should consider one or more additional accommodations by expanding the pilot test that has been planned and will include small-group administration and expanded time accommodations. This expansion should focus on additional accommodations for English language learners, perhaps in the forms of both a Spanish version and the use of English-Spanish dictionaries for the mathematics test, and by examining group as well as item-by-item differences in performance through differential item functioning analyses.

Recommendation 4.3: NAGB should describe what additional accommodations are being considered as part of the VNT planning and detail how these potential accommodations can provide useful information to students who can be thus included in the VNT by means of these accommodations.

We think these issues should be addressed in a systematic manner, one that will gather valid, scientific evidence that can be used to improve inclusiveness and validity of testing for these groups. Our specific recommendations suggest a step-by-step approach to learning more about testing students with learning disabilities and students with limited English proficiency. They should provide new evidence that will improve testing guidelines or suggest additional lines of research to improve practice.