5
Student Assessment

This chapter addresses the issue of assessing the language proficiency and subject matter knowledge and skills of English-language learners.1 Assessment plays a central role in the education of English-language learners and bilingual children. Teachers generally use assessments to monitor language development in students' first or second language and track the quality of their day-to-day subject matter learning. In addition, "high stakes" assessments are used to place students in special programs and to provide information for accountability and policy analysis purposes.

1    The standards for assessing reading and writing developed by the International Reading Association and the National Council of Teachers of English, as well as those developed by Teachers of English to Speakers of Other Languages for assessing English proficiency, are consistent with and supportive of the model of assessment emerging from the review in this chapter.

Garcia and Pearson (1994:343-349) examine assessment for culturally diverse learners across a wide range of subject matters and test types. They highlight potential validity and reliability problems for English-language learners that result from the "mainstream bias" of formal testing, including a norming bias (small numbers of particular minorities included in probability samples, increasing the likelihood that minority group samples are unrepresentative), content bias (test content and procedures
reflecting the dominant culture's standards of language function and shared knowledge and behavior), and linguistic and cultural biases (factors that adversely affect the formal test performance of students from diverse linguistic and cultural backgrounds, including timed testing, difficulty with English vocabulary, and the near impossibility of determining what bilingual students know in their two languages). The ensuing discussion of assessment as applied to English-language learners and bilingual children inherently involves questions about the validity and reliability of assessments and their appropriateness for these children. It is also important to note that assessment practices have social and educational consequences that should be considered in an ongoing program of validity research (Messick, 1988).

FINDINGS

This section begins with two subsections that review issues involved in assessing language proficiency and those associated with the assessment of subject matter knowledge. The next two subsections examine uses of assessment that are unique to and those that extend beyond English-language learners. Issues associated with assessing special populations are then explored. The section ends with a discussion of standards-based reform and its implications for the design and conduct of student assessments.

Issues in Assessing Language Proficiency

In assessing the language proficiency of English-language learners, both discrete language skills (e.g., vocabulary and grammar) and more authentic and holistic uses of language should be assessed. The assessment of discrete language skills is a legitimate endeavor because each of these components is systemically related to authentic use of language (McLaughlin, 1984). However, many researchers in this area (Rivera, 1984; Wong Fillmore, 1982; Valdez Pierce and O'Malley, 1992) recommend assessment procedures that reflect tasks typical of classroom or real-life settings, such as oral interviews, story retellings, simulations, directed dialogues, incomplete story/topic prompts, picture cues, teacher observation checklists, student self-evaluations, and portfolios. Authentic assessments are both more difficult to administer and less objectively scored than traditional assessments, but they do reflect the important view that language proficiency is multifaceted and varies according to the task demands and content area domain (see Chapter 2).

Most measures used to assess English proficiency have measured decontextualized skills and set fairly low standards for language proficiency. Ultimately, English-language learners should be held to high standards for both English language and literacy, and should transition from being assessed with special measures of their increasing command of English to full participation in regularly administered assessments of English-language arts.

Issues in Assessing Subject Matter Knowledge

In this section, we examine the difficulties involved in incorporating English-language learners and bilingual children into subject matter assessments intended for their English-proficient peers. As noted in the Standards for Educational and Psychological Testing, every assessment is an assessment of language (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1985). This is even more so given the advent of performance assessments requiring extensive comprehension and production of language. Given that the English-language proficiency levels of students affect their performance on subject area assessments administered in English (Garcia, 1991; Alderman, 1981) and that recently developed assessments require high levels of English proficiency, assessments and assessment procedures appropriate for English-language learners are needed.

One strategy under active investigation is the use of native-language assessments. Approximately 75 percent of English-language learners come from Spanish-language backgrounds. For some of these students, it is realistic to develop native-language assessments. However, one must keep in mind the difficulties involved in developing native-language assessments that are equivalent to the English versions.
Such difficulties include problems of regional and dialect differences, nonequivalence of vocabulary difficulty between two languages, problems of incomplete language development and lack of literacy development in students' primary languages, and the extreme difficulty of defining a "bilingual" equating sample (each new definition of a bilingual sample requires statistical equivalence among groups). Minimally, back translation should be done to determine equivalent meaning, and ideally, psychometric validation should be undertaken as well, such as validating the translated version with empirical evidence using item response theory (Hambleton and Kanjee, 1994).

Another strategy to make assessments both comprehensible and conceptually appropriate for English-language learners might entail decreasing the English-language load through actual modification of the items or instructions. This would not be a straightforward task, however. While some experts recommend reducing nonessential details and simplifying grammatical structures (Short, 1991), others claim that simplifying the surface linguistic features will not necessarily make the text easier to understand (Saville-Troike, 1991). When Abedi et al. (1995) reduced the linguistic complexity of National Assessment of Educational Progress mathematics test items in English, they reported only a modest and statistically unreliable effect in favor of the modified items for students at lower levels of English proficiency.

Other strategies for incorporating English-language learners into assessments include extra time, small-group administration, flexible scheduling, reading of directions aloud, use of dictionaries, and administration of the assessment by a person familiar with the children's primary language and culture (Rivera, 1995). Additional possibilities include making test instructions more explicit and allowing English-language learners to display their knowledge using alternative forms of representation (e.g., showing math operations on numbers and knowledge of graphing in problem solving). However, almost no research has been conducted to determine the effectiveness of these techniques.

Another issue in assessment of subject matter knowledge for English-language learners is the errors that result from inaccurate and inconsistent scoring of open-ended or performance-based measures. There is evidence that scorers may pay attention to linguistic features of performance unrelated to the content of the assessment. Thus, scorers may inaccurately assign low scores for performance in which English expression (either oral or written) is weak even though understanding or mastery of skills is high.
This obviously confounds the accuracy of the score enormously.2

2    Interestingly, Lindholm (1994) found highly significant and positive correlations between standardized scores of Spanish reading achievement and teacher-rated reading rubric scores, as well as between the standardized reading scores and students' ratings of their reading competence, for native English-speaking and native Spanish-speaking students enrolled in a bilingual immersion program.

Absent training, different scorers probably will rate the same student work very differently. Some states also provide guidance to scorers on evaluating the work of English-language learners. Hafner (1995) reports that 10 percent of states give special training on evaluating the work of English-language
learners, and 10 percent give directions in their manuals. Some training entails the development of scoring rubrics and procedures for constructed response items that are sensitive to the language and cultural characteristics of English-language learners. The Council of Chief State School Officers recently developed a scorer's training manual (Wong Fillmore and Lara, 1996) for use by states and local education agencies as an aid in the scoring of English-language learners' answers to open-ended mathematics questions. This manual will be piloted in collaboration with the National Center for Education Statistics and the Educational Testing Service, using the work of English-language learners who participated in the 1996 National Assessment of Educational Progress math assessment, to see how well it prepares scorers to assess the work of those students accurately.

Assessment Purposes Unique to English-Language Learners

Assessment purposes unique to English-language learners are focused on determining when these students should be placed in and exited from special language services such as English as a second language and bilingual education programs. There is a great deal of variability across school districts in the way assessments are used for these purposes. This variability exists because many states, while providing guidance to districts on assessment procedures, allow them considerable flexibility in choosing assessment methods, assessment instruments (usually from a menu of state-approved instruments), and cutoff scores for eligibility and classification for those instruments (August and Lara, 1996).3

3    Of the 25 states that have assessment requirements for determining which language-minority students are of limited English proficiency, 22 specify English proficiency tests. Of these 22 states, 8 also specify achievement tests, and 3 specify English proficiency tests and below-average performance based on grades or classwork. When assessment is used for program placement, similar procedures are used. In the other states, it is up to individual districts to set these policies. In some states, native-language proficiency assessments are required (Arizona, Hawaii, Utah, California, Texas, New Jersey) or recommended.

The only information regarding methods for reclassifying students from language assistance programs (Cheung and Soloman, 1991) indicates that language tests are the most frequently used method (required in 36 percent of states, recommended in 30 percent), followed by content area tests (required in 34 percent of states, recommended in 11 percent). Other methods recommended for determining program exit include observations and interviews. About one-third of states reported having no state requirement regarding exit criteria.

Assessment Purposes That Extend Beyond English-Language Learners

The assessment policies discussed in this section are related to determining eligibility for federal assistance and monitoring student progress at the state and district levels.

Title I

Specific attention is given to Title I, which is by far the largest federal program serving English-language learners. Changes in the Title I legislation provide for the participation of all students, including English-language learners, in assessments to determine whether they are meeting performance standards, and for reasonable adaptations of these assessments to this end. According to the law, English-language learners are to be included in assessments to the extent practicable, in the language and form most likely to yield accurate and reliable information on what they know and can do, including their mastery of skills in target subject matter areas, not just English. The law now further requires that each state plan identify the languages other than English that are present in the participating student population and indicate the languages for which yearly student assessments are not available and are needed. States are required to make every effort to develop such assessments and may request assistance from the U.S. Department of Education if linguistically accessible assessment measures are needed (see August et al., 1995).

Assessment is particularly important for purposes of selecting students eligible for services in Title I targeted assistance programs (as opposed to Title I school-wide programs), whereby Title I services are made available to a subset of the students "on the basis of multiple, educationally related, objective criteria established by the local educational agency and supplemented by the school" (Section 1115). The current policy guidance provided by the U.S. Department of Education does not elaborate on how equitable selection might be accomplished for English-language learners, and leaves it up to local districts to select those eligible students "most in need of special services." In the absence of test modifications, including assessments conducted in the native language, as well as methods for determining how English-language learners compare with other students on educational needs, a large proportion of English-language learners may not be served through Title I.

State, District, and Classroom Assessments

States are in various stages of incorporating English-language learners into performance-based assessments and standardized achievement tests, measures they use to monitor student performance (August and Lara, 1996; Rivera, 1995). August and Lara (1996) found that only 5 states require English-language learners to take state-wide assessments required of other students;4 36 states exempt English-language learners from such assessments, although 22 of those states require these students to take the assessments after a given period of time (usually 1-3 years). Some states base their assessment decision on the proficiency level of their English-language learners; of these, a few leave it up to local districts to determine which students have enough English proficiency to participate in the state-wide assessments. Finally, some states use multiple criteria to excuse students from state-wide assessments, including number of years in English-speaking classrooms, language proficiency scores, school achievement, and teacher judgment.

States use a variety of approaches to assess students that have been exempted from the state-wide assessments. Hafner (1995) reports that 55 percent of states allow modifications in the administration of at least one of their assessments to incorporate English-language learners. The most common modifications are extra time (20 states), small-group administration (18 states), flexible scheduling (16 states), simplification of directions (14 states), use of dictionaries (13 states), and reading of questions aloud in English (12 states). Other accommodations include assessments in languages other than English, availability of both English and non-English versions of the same assessment items, division of assessments into shorter parts, and administration of the assessment by a person familiar with the children's primary language and culture (Rivera, 1995).
Clearly, classroom teachers also assess students to determine how well they are grasping coursework and to inform instructional practice (see Chapter 7). Innovations at the classroom level include an assessment process that is multiple-referenced. That is, it incorporates information about the students in a variety of contexts obtained from a variety of sources through a variety of procedures (Genesee and Hamayan, 1994). Navarette et al. (1990) describe innovative assessment procedures that include unstructured techniques (e.g., writing samples, homework, logs, games, debates, story telling) and structured techniques (e.g., criterion-referenced tests, cloze tests, structured interviews), as well as portfolios that include both of these techniques. In addition, students are assessed in their native language to better determine their academic achievement and ensure appropriate coursework (Genesee and Hamayan, 1994). Information on student background characteristics, such as literacy in the home, parents' educational backgrounds, and previous educational experiences, is collected and provides essential information that helps put assessment results in context.

4    In 3 of these states, however, English-language learners may be exempted under certain conditions.

Issues in Assessing Special Populations

Very Young Second-Language Learners

The assessment of young children's development in meaningful ways is already surrounded by a great deal of controversy and concern among the preschool education community because of the dearth of valid and reliable instruments for measuring all aspects of child development (Meisels, 1994). For these reasons, McLaughlin et al. (1995) recommend what they call "instructionally embedded assessment," in which teachers make a plan about what, when, and how to assess a child; collect information from a variety of sources, including observations, prompted responses, classroom products, and conversations with family members; develop a portfolio; write narrative summaries; meet with family and staff; and finally, use the information to inform curriculum development. And this is a recursive process that begins again once it has been completed for any individual child. An assessment system of this sort is, of course, extremely time-consuming and necessitates reform in several areas, including use of time, professional staff development, accountability, and relationships with parents. It may, however, be the only meaningful way teachers can assess very young second-language learners.

Children with Disabilities

The field still lacks instruments appropriate for assessing English-language learners with disabilities.
A practical strategy may be to train assessment personnel in appropriate procedures for this population, including acceptable modifications or alternatives, rather than awaiting the development of norm-referenced instruments appropriate for English-language learners. The literature does identify several promising practices for assessing English-language learners with disabilities that may be useful as well for inclusion of all English-language learners in local and state assessments. Durán (1989) recommends the use of dynamic assessment (e.g., Feuerstein's [1979] Learning Potential Assessment Device), which involves a test-train-test cycle during which a student's response to a criterion problem is evaluated, feedback is given to help improve performance, and the student is reassessed. Lewis (1991) recommends use of the Kaufman Assessment Battery for Children (KABC) because it separates the mental processing scores from the achievement scores, and includes a training component to ensure that the student understands the task. He suggests that this approach accommodates different cognitive processing styles, an advantage in assessing diverse cultural groups. Further, he claims that Feuerstein's dynamic assessment approach and the KABC are more advantageous than instruments like the Wechsler Intelligence Scale for Children-Revised (WISC-R) because they deemphasize factual information and learned content and focus instead on problem-solving tasks.

Because of the myriad of factors that must be considered in distinguishing linguistic and cultural differences from disabilities, ecological models of assessment are recommended so that learning problems will be examined in light of contextual variables affecting the teaching-learning process, including the interaction of teachers, students, curriculum, instructional variables, and so forth. Assessors must consider the student's native- and English-language skills, select appropriate measures for assessing skills across languages, and interpret outcomes considering factors such as the student's age and cultural and experiential background (Cloud, 1991).

Standards-Based Reform

The standards-based reform movement has major implications for English-language learners, especially in the area of assessment. Both Goals 2000 and the Improving America's Schools Act state explicitly that all students, including English-language learners, are expected to attain high standards. For example, program accountability provisions in both Title I and Title VII are framed around the need to demonstrate that students in these programs are meeting state and local performance standards for all students. The demonstration of results has been a particularly complex issue for English-language learners because of the unavailability of assessments suited to their needs, as discussed previously. Issues of validity and reliability in assessing the subject matter knowledge and skills of English-language learners were discussed earlier in this
chapter. Another assessment issue related to standards-based reform is how to define adequate yearly progress for English-language learners. The Title I law, for example, requires that adequate yearly progress be defined in a manner that "… is sufficient to achieve the goal of all children served under [this part] in meeting the State's proficient and advanced levels of performance, particularly economically disadvantaged and LEP students." Yearly progress as defined by the law pertains to the progress of districts and schools, measured by the aggregation of individual student scores on assessments aligned with performance standards. According to the law, the same high performance standards that are established for all students are the ultimate goal for English-language learners as well. On average, however, English-language learners (especially those with limited prior schooling) may take more time to meet these standards. Therefore, additional benchmarks might be developed for assessing the progress of these students toward meeting the standards. Moreover, because English-language learners are acquiring English-language skills and knowledge already possessed by students who arrive in school speaking English, additional content and performance standards in English-language arts may be appropriate. Recently, the Teachers of English to Speakers of Other Languages professional association has developed model content standards to guide the instruction and assessment of English skills and knowledge for such students (Teachers of English to Speakers of Other Languages, Inc., 1997). Another issue related to adequate yearly progress has to do with districts' obligation to determine whether schools served by Title I funds are progressing sufficiently toward enabling all children to meet the state's student performance standards. 
According to the law, adequate progress is defined as that which results in continuous and substantial yearly improvement of each district and school, sufficient to achieve the goal of having all children, particularly economically disadvantaged students and English-language learners, meet the state's proficient and advanced levels of performance. To determine whether English-language learners are meeting these standards, assessment results must be disaggregated by English proficiency status. Some states, such as Florida, Hawaii, Louisiana, Maine, Ohio, and Washington, do this already (August and Lara, 1996). However, research is needed to determine how best to accomplish this goal in statistically sound ways, especially in light of alternative assessment procedures used with English-language learners.

Because of the difficulties in assessing English-language learners, it may be important to assess their access to necessary resources and conditions, such as adequate and appropriate instruction. However, defining and assessing these conditions is a very difficult task. Although there has been substantial work in defining some conditions, such as content coverage and time on task for mainstream students (Carroll, 1958; Leinhardt, 1978), the research base for defining the most important and effective resources and conditions for English-language learners is very weak (see Chapter 7). Yet many English-language learners do find themselves in poor schools with few resources. A good start would be to define and assess essential resources (e.g., textbooks, course offerings, accessibility of information) while continuing research into other aspects of school life, such as effective school-wide and classroom attributes that result in students' social and academic success. In terms of enhancing opportunities to learn for English-language learners, another strategy would be to encourage the development and evaluation of methods to help school staff monitor progress in improving schooling through systematic attempts to compare their school's performance against certain quality indicators.5 This notion is further elaborated in Chapter 7.

5    California, for example, has a Program Quality Review System that relies on peer review. Additional benchmarks could include school-wide and classroom factors that are known to improve the performance of English-language learners.

IMPLICATIONS

Educational

To best inform instructional practice, an assessment process is recommended that incorporates information about students in a variety of contexts (i.e., home and school) obtained from a variety of sources (i.e., special language teachers and classroom teachers) through a variety of procedures (i.e., criterion-referenced tests, classroom observations, and portfolios). To assess English-language proficiency, both discrete language skills (e.g., vocabulary and grammar) and more authentic and holistic uses of language should be assessed. Because English-language learners are acquiring English-language skills and knowledge already possessed by students who arrive in school speaking English, additional content and performance standards in English-language arts may be appropriate. Ultimately, English-language learners should be held to high standards for both English language and literacy, and should transition from being assessed with special measures of their increasing command of English to full participation in regularly administered assessments of English-language arts.

Strategies under active investigation for incorporating English-language learners into assessments of subject-matter knowledge include the use of native-language assessments. Other strategies for making assessments both comprehensible and conceptually appropriate for English-language learners entail a decrease in the English-language load of the assessment through actual modification of the items or instructions, extra time, small-group administration, flexible scheduling, reading of directions aloud, use of dictionaries, and administration of the assessment by a person familiar with the children's primary language and culture. However, additional research is needed to determine the psychometric soundness of these techniques. Finally, because of inaccurate and inconsistent scoring of open-ended or performance-based measures of English-language learner subject matter knowledge, training is needed so that such scoring is reliable.

Because of the difficulties in assessing English-language learners, it may be important to assess their access to necessary resources and conditions, such as adequate and appropriate instruction. However, defining and assessing these conditions is a very difficult task. In terms of enhancing opportunities to learn for English-language learners, another strategy would be to encourage the development and evaluation of methods to help school staff monitor progress in improving schooling through systematic attempts to compare their school's performance against certain quality indicators.

Research

Research is needed to improve native-language and English-language proficiency assessments so they are consistent with research findings on first- and second-language acquisition and literacy development.
Research is also needed to determine the levels of proficiency in different aspects of English required for English-language learners to participate in English-only instruction. Along the same lines, there is a need to develop guidelines for determining when English-language learners are ready to take the same assessments as their English-proficient peers, and when versions of an assessment other than the "standard" English version should be administered. There is a need as well to develop psychometrically sound and practical assessments and assessment procedures that incorporate English-language learners into district and state assessment systems. In addition, research is needed to reduce inaccurate and inconsistent scoring of open-ended or performance-based measures of the work of English-language learners.

The field still lacks an array of instruments appropriate for assessing young English-language learners and those with disabilities. Several strategies are reviewed in this chapter, but they should be evaluated to determine whether they are psychometrically sound.

Finally, to incorporate English-language learners into standards-based reform, research is needed in the following areas: (1) whether it is possible to establish common standard benchmarks for subject matter knowledge and English proficiency for English-language learners within a valid theoretical framework, what those benchmarks might be, and how the benchmarks for English proficiency might be related to performance standards for English-language arts; (2) how it can be determined whether, in the context of school and district outcomes, English-language learners are making progress toward meeting proficient and advanced levels of performance; and (3) how opportunities to learn can be evaluated.

PROGRAM EVALUATION: KEY FINDINGS

The following key findings can be drawn from the literature on program evaluation:

The major national-level program evaluations suffer from design limitations; lack of documentation of study objectives, conceptual details, and procedures followed; poorly articulated goals; lack of fit between goals and research design; and excessive use of elaborate statistical designs to overcome shortcomings in research designs.

In general, more has been learned from reviews of smaller-scale evaluations, although these, too, have suffered from methodological limitations.

It is difficult to synthesize the program evaluations of bilingual education because of the extreme politicization of the process. Most consumers of research are not researchers who want to know the truth, but advocates who are convinced of the absolute correctness of their positions.

The beneficial effects of native-language instruction are clearly evident in programs that are labeled "bilingual education," but they also appear in some programs that are labeled "immersion."

There appear to be benefits of programs that are labeled "structured immersion," although a quantitative analysis of such programs is not yet available.

There is little value in conducting evaluations to determine which type of program is best. The key issue is not finding a program that works for all children and all localities, but rather finding a set of program components that works for the children in the community of interest, given that community's goals, demographics, and resources.